INNOVATION GALLERY
Multimodal interfaces
March 2005
Multimodal interfaces: movement and voice to control machines
Progress made in recent years in the fields of movement and voice recognition now enables us to think of new ways to control the machines around us. Combining these two modes makes it possible to eliminate physical contact and thereby develop dialogue interfaces that are simpler and more natural. France Telecom researchers are therefore working on new tools that could simplify our relationship with machines and offer rich new applications, especially in areas such as co-working and video games.
What is it?
To act on our environment by means of an interface or a tool, we must make a movement: press a switch or a remote-control button, turn a steering wheel, use a keyboard or a mouse. In other words, we must be in physical contact with an object. The work of France Telecom researchers in the field of voice and, especially, movement recognition will soon make it possible to eliminate touch and to interact with our environment in a simpler way.

The idea is therefore to use voice and hands to control a machine. Since we have used both since early childhood, there is nothing to learn. Moreover, voice and hands are tools that we use every day: they are always available, they are not cumbersome, and we don't need to share them, because we all have our own! The user can, for instance, use his hands instead of the mouse to control his computer. By adding voice to the movements, he can control a variety of more complex actions. The two modes are therefore complementary. What France Telecom's R&D teams have in mind is to take the best of each in order to simplify man-machine relationships.

Just as our vocabulary is rich, our bodies have great plasticity, as the vast number of movements a person makes in the course of a day shows. Although they are progressing, artificial-sight technologies are not yet able to recognise the multitude of movements carried out in different environments and contexts.
Copyright France Télécom - 2005
On the other hand, the pointing gesture is already well recognised by artificial-sight technologies. It is very effective for spatial designation and navigation tasks. This could lead to more natural man-machine interfaces in which control of the environment is simplified: objects can be pointed at and moved, while voice commands carry out more complex actions.

Co-working
Interacting through voice and movement, from a distance and without touching anything, with large screens and large interaction volumes, facilitates group work in collaborative spaces. Besides removing the need for touch, remote interaction also frees the user from the dimensions of the interface. In front of a large display, several people can share the same view and therefore interact easily around the same application, wherever they are in the room. With an electronic projection screen, life-size representations become possible, creating a feeling of presence and immersion. Interaction between people also energises co-working, especially if they all have the same tools. By making use of broadband telecommunication networks, these collaborative virtual environments can be shared by several remote teams. In this case, artificial-sight technologies can reproduce each participant's movements and project them on the screen in the form of virtual clones or avatars.
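As a rough illustration of how pointing and voice might be combined on a shared screen, the sketch below lets a pointed position select the target while a spoken word supplies the action. Everything in it is an assumption made for illustration: the `ScreenObject` type, the normalised screen coordinates, the `max_dist` tolerance and the `fuse` function are hypothetical and do not describe France Telecom's actual software.

```python
from dataclasses import dataclass


@dataclass
class ScreenObject:
    """Hypothetical object shown on a shared screen (coordinates in 0..1)."""
    name: str
    x: float
    y: float


def pointed_object(objects, px, py, max_dist=0.1):
    """Return the object nearest to the pointed position.

    Returns None when no object lies within max_dist of (px, py).
    """
    best, best_d = None, max_dist
    for obj in objects:
        d = ((obj.x - px) ** 2 + (obj.y - py) ** 2) ** 0.5
        if d <= best_d:
            best, best_d = obj, d
    return best


def fuse(objects, pointing, spoken_command):
    """Pair a spoken command with the object currently pointed at."""
    target = pointed_object(objects, *pointing)
    if target is None:
        return None  # the gesture did not designate anything
    return (spoken_command, target.name)


objects = [ScreenObject("map", 0.2, 0.3), ScreenObject("chart", 0.8, 0.7)]
print(fuse(objects, (0.22, 0.31), "enlarge"))
```

The gesture only has to answer "which object?", a task the text describes as already well recognised, while the spoken word carries the more complex part of the request.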
New game possibilities
Artificial-sight technologies also open up new possibilities in the field of video games. Freed from the joystick, keyboard and mouse, the player can experience new sensations by becoming more immersed in the virtual game environment. He can, for instance, use his own body movements to animate a game character. This could even add a choreographic or athletic dimension to the performances that are usually required.
The second approach is, for the moment, the one that France Telecom has chosen for multimodal interfaces. The objective of this research is, of course, the recognition of continuous language, which would allow the user to express himself freely. In this case, the machine must detect not only the words but also the sentence structures. The grammar and parts of speech (subject, verb, adjectives, object, etc.) should be taken into account by the machine in order to fulfil the user's demands as closely as possible.
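To give a rough idea of what taking sentence structure into account might mean, the toy sketch below tags each word of a command with a part of speech and extracts the verb, the adjectives and the object. The `LEXICON` table and the `parse_command` function are hypothetical and invented for illustration; a real continuous-language recogniser would rely on far richer acoustic and language models than this word lookup.

```python
# Hypothetical toy lexicon mapping each known word to a part of speech.
LEXICON = {
    "open": "verb", "close": "verb", "move": "verb",
    "the": "article", "a": "article",
    "red": "adjective", "large": "adjective",
    "window": "noun", "folder": "noun", "map": "noun",
    "please": "filler",
}


def parse_command(sentence):
    """Extract verb, adjectives and object from a simple spoken command.

    Returns None when no verb or no object noun is found.
    """
    verb, adjectives, obj = None, [], None
    for raw in sentence.lower().split():
        word = raw.strip(".,!?")
        tag = LEXICON.get(word)
        if tag == "verb" and verb is None:
            verb = word
        elif tag == "adjective":
            adjectives.append(word)
        elif tag == "noun" and obj is None:
            obj = word
    if verb is None or obj is None:
        return None  # sentence structure not understood
    return {"verb": verb, "adjectives": adjectives, "object": obj}


print(parse_command("Please open the large red folder."))
```

Even in this simplified form, the adjectives narrow down which object is meant, which is the kind of information a pure keyword detector would discard.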
France Telecom researchers' know-how in this field led to the founding, in August 2000, of the start-up Telisma, which develops voice-recognition software for mass-market applications.
When?
Research in the field of artificial sight, or movement recognition, has only just started and a lot of progress remains to be made. However, applications based on movement reproduction, to animate a video game character for instance, have almost reached technological maturity. Speech-movement interfaces are currently being developed by France Telecom researchers, for instance as part of a multidisciplinary project on multimodality. As far as co-working is concerned, the researchers are taking part in more general programmes on telepresence and immersion. These projects could become reality in only a few years' time.

The TELIM (telepresence and immersion) project, launched in January 2005, combines all work related to collaborative interfaces and interpersonal communications. Its objectives are to study uses and disruptive uses in these fields, by looking at theoretical studies or prototypes developed within France Telecom's R&D Division. For instance, the augmented reality project SPIN-3D is a collaborative platform for visualising virtual objects in three dimensions. Also in this context, a project for steering an avatar (or synthesis agent) with the finger and eye is currently under development.

Also in the field of artificial sight, a project for surveillance through activity analysis, intended for people who are elderly or suffering from a handicap or memory loss, forms part, however, of a much longer-term endeavour. Movement-recognition technologies in fact still have a long way to go towards the fine analysis of movements and making analogies between them. This very forward-looking domain could nevertheless give rise to a new tool for remote personal assistance.