Multimodal Interaction Activity

Extending the Web to support multiple modes of interaction.

Introduction

The Mission

The Multimodal Interaction Activity seeks to extend the Web to allow users to dynamically select the most appropriate mode of interaction for their current needs, including any disabilities, whilst enabling developers to provide an effective user interface for whichever modes the user selects. Depending upon the device, users will be able to provide input via speech, handwriting, and keystrokes, with output presented via displays, pre-recorded and synthetic speech, audio, and tactile mechanisms such as mobile phone vibrators and Braille strips.

Multimodal interaction offers significant ease-of-use benefits over uni-modal interaction, for instance when hands-free operation is needed, for mobile devices with limited keypads, and for controlling other devices when a traditional desktop computer is unavailable to host the application user interface. Adoption is being driven by advances in embedded and network-based speech processing, which are creating opportunities for integrated multimodal Web browsers and for solutions that separate the handling of visual and aural modalities, for example by coupling a local XHTML user agent with a remote VoiceXML user agent.

Target Audience

The Multimodal Interaction Working Group (member only link) should be of interest to a range of organizations in different industry sectors:

Mobile
Multimodal applications are of particular interest for mobile devices. Speech offers a welcome means to interact with smaller devices, allowing one-handed and hands-free operation. Pen input enables handwriting, gestures, drawings and specialized notations. Users benefit from being able to choose which modalities they find convenient in any situation. The Working Group should be of interest to companies developing smart phones and personal digital assistants, as well as to those providing tools and technology to support the delivery of multimodal services to such devices.
Automotive Telematics
With the emergence of dashboard-integrated, high-resolution color displays for navigation, communication and entertainment services, W3C's work on open standards for multimodal interaction should be of interest to companies developing the next generation of in-car systems.
Multimodal interfaces in the office
Multimodal interaction has benefits for desktops, wall-mounted interactive displays, multi-function copiers, and other office equipment, offering a richer user experience and the chance to use speech and pens as alternatives to the mouse and keyboard. W3C's standardization work in this area should be of interest to companies developing client software and application authoring technologies who wish to ensure that the resulting standards meet their needs.
Multimodal interfaces in the home
In addition to desktop access to the Web, multimodal interfaces are expected to add value to remote control of home entertainment systems, as well as to other systems around the home. Companies involved in developing embedded systems and consumer electronics should be interested in W3C's work on multimodal interaction.

Current Situation

The Multimodal Interaction Working Group was launched in 2002 following a joint workshop between the W3C and the WAP Forum. The Working Group's initial focus was on use cases and requirements. This led to the publication of the W3C Multimodal Interaction Framework, and in turn to work on Extensible Multi-Modal Annotations (EMMA) and InkML, an XML language for ink traces. The Working Group has also worked on integration of composite multimodal input; dynamic adaptation to device configurations, user preferences and environmental conditions (now transferred to the Device Independence Activity); modality component interfaces; and a study of current approaches to interaction management. The Working Group has now been re-chartered through 31 January 2009 under the terms of the W3C Patent Policy (5 February 2004 Version). To promote the widest adoption of Web standards, W3C seeks to issue Recommendations that can be implemented, according to this policy, on a Royalty-Free basis. The Working Group is chaired by Deborah Dahl. The W3C Team Contact is Kazuyuki Ashimura.

We want to hear from you!

We are very interested in your comments and suggestions. If you have implemented multimodal interfaces, please share your experiences with us; we are particularly interested in reports on implementations and their usability for both end-users and application developers. We welcome comments on any of our published documents. If you have a proposal for a multimodal authoring language, please let us know. To subscribe to the discussion list, send an email to www-multimodal-request@w3.org with the word subscribe in the subject header. Previous discussion can be found in the public archive. To unsubscribe, send an email to www-multimodal-request@w3.org with the word unsubscribe in the subject header.

How to join the Working Group

If your organization is already a member of W3C, ask your W3C Advisory Committee Representative (member only link) to fill out the online registration form to confirm that your organization is prepared to commit the time and expense involved in participating in the group. You will be expected to attend all Working Group meetings (about 3 or 4 times a year) and to respond in a timely fashion to email requests. Further details about joining are available on the Working Group (member only link) page. Requirements for patent disclosures, as well as terms and conditions for licensing essential IPR, are given in the W3C Patent Policy.

More information about the W3C is available, as is information about joining W3C.

Patent Disclosures

W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent.

Revised publication target dates

Specification                              FPWD        LC          CR            PR              Rec
Multimodal Architecture and Interfaces     Completed   TBD         June 2008     December 2008   February 2009
EMMA                                       Completed   Completed   Completed     May 2008        TBD
InkML                                      Completed   Completed   2Q 2008       TBD             TBD

The EMMA Candidate Recommendation (CR) period ends on 14 April 2008.

Work in Progress

This section gives a brief summary of each of the major work items under development by the Multimodal Interaction Working Group. The suite of specifications is known as the W3C Multimodal Interaction Framework.

Current Work

The following are the current work items. Additional work is expected on topics described in section 4 of the charter, including multimodal authoring, modality component interfaces, composite multimodal input, and coordinated multimodal output.

Multimodal Architecture

A loosely coupled architecture for the Multimodal Interaction Framework that focuses on providing a general means for components to communicate with each other, plus basic infrastructure for application control and platform services. Work is continuing on how the architecture can be realized in terms of well-defined component interfaces and eventing models.
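As a purely illustrative sketch (the element names, attributes and namespace below are assumptions drawn from draft material and are not normative), an interaction manager might ask a voice modality component to start a dialog with a life-cycle event along these lines:

    <mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
      <!-- Illustrative only: an interaction manager asking a voice modality
           component to start; names and namespace are assumptions, not
           normative. -->
      <mmi:StartRequest mmi:Source="InteractionManager"
                        mmi:Target="VoiceComponent"
                        mmi:Context="context-42"
                        mmi:RequestID="request-1"/>
    </mmi:mmi>

In such a design the component would be expected to answer with a matching response event and to notify the interaction manager when the dialog completes.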

Extensible Multi-Modal Annotations (EMMA)

EMMA is being developed as a data exchange format for the interface between input processors and interaction management systems. It will define the means for recognizers to annotate application-specific data with information such as confidence scores, time stamps, input mode (e.g. keystrokes, speech or pen), alternative recognition hypotheses, and partial recognition results. EMMA is a target data format for the semantic interpretation specification being developed in the Voice Browser Activity, which describes annotations to speech grammars for extracting application-specific data as a result of speech recognition. EMMA supersedes earlier work on the natural language semantics markup language in the Voice Browser Activity.
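As an illustration only, an EMMA document annotating a spoken travel query might look like the following; the emma:* names follow the published drafts, but the application payload and all values are invented for this sketch:

    <emma:emma version="1.0"
               xmlns:emma="http://www.w3.org/2003/04/emma">
      <!-- Illustrative sketch: the application-specific payload (origin,
           destination) and all values are invented; the annotations show
           confidence, medium, mode and timestamps. -->
      <emma:interpretation id="interp1"
                           emma:medium="acoustic"
                           emma:mode="voice"
                           emma:confidence="0.75"
                           emma:start="1087995961542"
                           emma:end="1087995963542">
        <origin>Boston</origin>
        <destination>Denver</destination>
      </emma:interpretation>
    </emma:emma>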

InkML - an XML language for digital ink traces

This work item sets out to define an XML data exchange format for ink entered with an electronic pen or stylus as part of a multimodal system. This will enable the capture and server-side processing of handwriting, gestures, drawings, and specific notations for mathematics, music, chemistry and other fields, as well as supporting further research on this processing. The Ink subgroup maintains a separate public page devoted to W3C's work on pen and stylus input.
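For illustration, a minimal InkML document captures a single pen stroke as a trace of X Y coordinate pairs; the coordinate values here are invented:

    <ink xmlns="http://www.w3.org/2003/InkML">
      <!-- Illustrative sketch: one pen stroke recorded as a sequence of
           X Y coordinate pairs; the values are invented. -->
      <trace>
        10 0, 9 14, 8 28, 7 42, 6 56, 6 70, 8 84, 8 98, 8 112, 9 126
      </trace>
    </ink>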

Related Materials

For more details on other organizations see the Multimodal Interaction Charter.

W3C Team Contacts

Kazuyuki Ashimura <kazuyuki@w3.org> - Multimodal Interaction Activity Lead

Copyright © 2002-2007 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply. Your interactions with this site are in accordance with our public and Member privacy statements. This page was last updated on 18 March 2008.