SmartKom: Multimodal Dialogs with Mobile Web Users
Prof. Wolfgang Wahlster
Cyber Assist International Symposium 2001, Tokyo, March 6, 2001
German Research Center for Artificial Intelligence, DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbruecken, Germany
phone: (+49 681) 302-5252/4162, fax: (+49 681) 302-5341, e-mail: wahlster@dfki.de, WWW: http://www.dfki.de/~wahlster

© W. Wahlster Code, Media and Modalities
CODE (systems of symbols): language, graphics, gesture, mimics (facial expressions)
MEDIA (physical information carriers) of the system: input channels, output channels, storage (HD drive, CD-ROM)
MODALITIES (human senses) of the user: visual, auditory, tactile, haptic

© W. Wahlster The SmartKom Consortium (SmartKom: Intuitive Multimodal Interaction)
Main contractor: DFKI Saarbrücken
Partners include: MediaInterface, European Media Lab, Univ. of Munich, Univ. of Stuttgart, Univ. of Erlangen
Partner sites: Saarbrücken, Aachen, Dresden, Berkeley, Stuttgart, Munich, Heidelberg, Ulm
Project budget: € 25 M
Project duration: 4 years

© W. Wahlster SmartKom: A Transportable and Transmutable Interface Agent
Kernel of the SmartKom Interface Agent: Media Analysis, Interaction Management, Application Management, Media Design
SmartKom-Public: A Multimodal Communication Booth
SmartKom-Mobile: A Handheld Communication Assistant
SmartKom-Home/Office: A Versatile Agent-based Interface

© W. Wahlster The Architecture of the SmartKom Agent (cf. Maybury/Wahlster 1998)
User(s)
Input Processing and Media Analysis: Language, Graphics, Gesture, Biometrics; Media Fusion
Interaction Management: Intention Recognition, Discourse Modeling, User Modeling, Presentation Design
Media Design and Output Rendering: Language, Graphics, Gesture, Animated Presentation Agent
Representation and Inference: User Model, Discourse Model, Domain Model, Task Model, Media Models
Application Interface: Information, Applications, People

© W. Wahlster SmartKom-Mobile: A Handheld Communication Assistant
Components: camera; GPS; microphone; loudspeaker; stylus-activated sketch pad; wearable compute server; docking station for car PC; biosensor for authentication and emotional feedback; GSM for telephone, fax and Internet connectivity

© W. Wahlster SmartKom-Public: A Multimodal Communication Booth
Components: smartcard/credit card for authentication and billing; docking station for PDA/notebook/camcorder; high-speed, broad-bandwidth Internet connectivity; high-resolution scanner; loudspeaker; room microphone; face-tracking camera; virtual touchscreen protected against vandalism; multipoint video conferencing

© W. Wahlster Integration of Speech and Gesture
Advantages:
For the sender: economic specification of referents. The description becomes shorter and may be underspecified.
For the recipient: fast recognition of referents. Speech processing and orientation in the intended direction are performed simultaneously.
Speech and gesture input disambiguate each other.
Disadvantages:
Employing gestures leads to an increase in elliptic utterances, which makes speech analysis more complex.
Multiple pointing gestures in one utterance may lead to reference problems.

Augmented Reality: Combining Speech, Gestures and Graphics for Mobile Web Access
Mobile Dialog with a Virtual Tourist Guide for the Heidelberg Castle
Location-adaptive Query Interpretation

© W. Wahlster Augmented Reality: Combining Speech, Gestures and Graphics for Mobile Web Access
Speech-based Access to 3D Virtual Views
Multimodal Output from a Digital Library and Speech-based Access to Internet Content

© W. Wahlster Modality-Specific Representation Languages as an Intermediate Representation before Media Fusion
M3L based on XML
Speech Input: Parsing; Semantic Representation Language
Gestures: Gesture Analysis, Gesture Generation; Gesture Description Language
Mimics: Mimics Analysis, Mimics Generation; Mimics Description Language
Ontologies: Knowledge Representation Language, Inference Component
DBMS/KBMS/WWW

© W. Wahlster SmartKom's Data Collection of Multimodal Dialogs
Labeled components of the recording setup: user; side-view camera; face-tracking camera with microphone; environmental noise; microphone array; screen; projected webpage; loudspeaker; bird's-eye camera; LCD beamer; SIVIT camera

© W. Wahlster ANVIL: Multi-Track Annotation of Video and Language
Annotation Tool for Multimodal Interaction
Annotation tracks: transliterated speech; rhetorical relations; theme-rheme; postures, gestures
http://www.dfki.de/~kipp/research/anvil.html...

© W. Wahlster Mobile Presentation Unit for SmartKom-Public
Equipment: 2 Sony DSR-PD100AP Video Cameras; LCD-Beamer ASK C5; SIVIT Gesture Recognition Unit; Microphones (Microphone Array); Speakers; 3 Dual Pentiums III, 500

© W. Wahlster Combination of Speech and Gesture in SmartKom
User: Which feature films are shown tonight on TV?
SmartKom: I show you a survey of tonight's TV films.
User: I can't find anything interesting. Then I'll go to the movies.
SmartKom: Here you see a programme listing of the movies shown in Heidelberg today.
User: This one I would like to see. Where is it shown?
SmartKom: On this map all movie theatres are highlighted that are showing "A Little Christmas Story".

© W. Wahlster M3L Integrates Three Language Families
Frame Languages: Object-oriented Modelling Primitives
Concept Languages/Terminological Logics: Formal Semantics; Subsumption, Inferences
Web Languages: XML and RDF Syntax

© W. Wahlster M3L Representation of the Multimodal Discourse Context
Blackboard with Presentation Context of the Previous Dialog Turn
[...] cinema_17a Europa 225 230 [...] 0.5542 0.1950 0.9892 0.7068 pid1234 [...]

© W. Wahlster M3L Representation of the Word Lattice Produced by the Speech Recognizer for “There [pointing gesture] I would like to get a reservation.”
2000-12-07T13:44:37.900Z shortPause [...] 5 7 gern 6.51343 PT0.57S PT0.84S 5 7 gerne 6.19579 PT0.57S PT0.84S [...]
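The lattice excerpt above can be read as a graph whose edges each carry a word hypothesis. The following is a minimal Python sketch of that reading, not the SmartKom implementation. It assumes that each record is (start node, end node, word, score, begin offset, end offset), that the score is a cost where lower is better, and that the offsets are ISO 8601 durations in seconds. The slide confirms none of these readings, and the names `Edge`, `parse_duration` and `best_path` are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    start: int      # source node of the lattice edge
    end: int        # target node of the lattice edge
    word: str       # word hypothesis on this edge
    score: float    # ASSUMPTION: a cost, lower is better
    begin_s: float  # ASSUMPTION: offset from utterance start, in seconds
    end_s: float


def parse_duration(text):
    """Parse a seconds-only ISO 8601 duration such as 'PT0.57S'."""
    match = re.fullmatch(r"PT(\d+(?:\.\d+)?)S", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return float(match.group(1))


def best_path(edges, source, sink):
    """Return (total_cost, words) of the cheapest path, or None if unreachable.

    ASSUMPTION: node numbers increase along every edge, so visiting edges
    sorted by start node processes the lattice in topological order.
    """
    best = {source: (0.0, [])}
    for edge in sorted(edges, key=lambda e: e.start):
        if edge.start not in best:
            continue  # this edge cannot be reached from the source
        cost, words = best[edge.start]
        candidate = (cost + edge.score, words + [edge.word])
        if edge.end not in best or candidate[0] < best[edge.end][0]:
            best[edge.end] = candidate
    return best.get(sink)


# The two competing hypotheses between nodes 5 and 7 from the slide.
edges = [
    Edge(5, 7, "gern", 6.51343, parse_duration("PT0.57S"), parse_duration("PT0.84S")),
    Edge(5, 7, "gerne", 6.19579, parse_duration("PT0.57S"), parse_duration("PT0.84S")),
]
print(best_path(edges, 5, 7))
```

Under the lower-is-better assumption the sketch prefers "gerne"; if the scores are in fact likelihoods, the comparison would have to be inverted.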

© W. Wahlster Gesture Recognition and Gesture Analysis
“There [pointing gesture] I would like to get a reservation.”
Gesture Lattice as Result of Gesture Recognition: 2000-12-07T14:45:03.125 PT0.040S 2000-12-07T14:45:03.125 PT0.040S 0.872641 0.477261 tarrying dynamic
Result of Gesture Analysis: [...] tarrying dynStructId30 1 dynStructId28 2 [...] cinema_17a Europa 225 230 [...]
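One plausible way to get from the recognized gesture coordinates to a referent such as cinema_17a is a hit test against the objects in the presentation context. The Python sketch below illustrates that idea only. It assumes that 0.872641 and 0.477261 are normalized (x, y) screen coordinates, and that the four values 0.5542 0.1950 0.9892 0.7068 from the discourse-context slide form a (left, top, right, bottom) bounding box for cinema_17a. Neither reading is stated on the slides, and `ScreenObject` and `resolve_pointing` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenObject:
    object_id: str
    left: float    # ASSUMPTION: normalized screen coordinates in [0, 1]
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom

    def area(self):
        return (self.right - self.left) * (self.bottom - self.top)


def resolve_pointing(x, y, objects):
    """Return the id of the smallest object containing (x, y), or None."""
    hits = [obj for obj in objects if obj.contains(x, y)]
    if not hits:
        return None
    # Prefer the smallest enclosing object when regions are nested.
    return min(hits, key=lambda obj: obj.area()).object_id


# Presentation context of the previous turn (bounding box reading is assumed).
presentation_context = [
    ScreenObject("cinema_17a", 0.5542, 0.1950, 0.9892, 0.7068),
]
referent = resolve_pointing(0.872641, 0.477261, presentation_context)
print(referent)
```

With these assumed readings the gesture point falls inside the box, which is consistent with cinema_17a appearing in the gesture analysis result.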

© W. Wahlster Language Analysis and Media Fusion
Turn8: “There [pointing gesture] I would like to get a reservation.”
[...] acoustic 60.95448 understanding 0.928571 reserve cinema_17a Europa [...]
acoustic: Confidence in the Speech Recognition Result
understanding: Confidence in the Speech Understanding Result
reserve: Planning Act
cinema_17a Europa: Object Reference
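The fusion step for this turn can be pictured as filling the open deictic reference of the speech hypothesis ("there") with the referent delivered by gesture analysis. The Python sketch below shows that idea under stated assumptions and is not the SmartKom fusion component. The class and field names are hypothetical, and the two confidence values are copied through unchanged because the slide does not say how they are combined.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SpeechHypothesis:
    act: str                    # planning act, e.g. "reserve"
    object_ref: Optional[str]   # None while "there" is still unresolved
    acoustic: float             # confidence in the speech recognition result
    understanding: float        # confidence in the speech understanding result


@dataclass(frozen=True)
class GestureHypothesis:
    referent_id: str            # object selected by gesture analysis
    referent_name: str


def fuse(speech, gesture):
    """Fill an unresolved deictic reference from the gesture hypothesis."""
    if speech.object_ref is not None:
        return speech  # speech alone already names the object
    if gesture is None:
        raise ValueError("deictic reference cannot be resolved without a gesture")
    return replace(speech, object_ref=gesture.referent_id)


# Values taken from the slides for turn 8.
speech = SpeechHypothesis("reserve", None, 60.95448, 0.928571)
gesture = GestureHypothesis("cinema_17a", "Europa")
fused = fuse(speech, gesture)
print(fused)
```

Returning a new hypothesis instead of mutating the input keeps the unfused speech result available, which matters when several gesture hypotheses compete for the same slot.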

© W. Wahlster Output Synchronization: Speech, Gesture, Graphics, Animation
11 declarative [...] eine 2.1539 2.2829 Übersicht 2.2829 3.2997 [...]
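The excerpt above pairs each synthesized word with two numbers, which look like start and end times. A minimal Python sketch of how such timings could drive a gesture of the presentation agent follows. It assumes the numbers are seconds from the start of the utterance. The 0.5 s preparation time and the names `WordTiming` and `gesture_start` are hypothetical and do not come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordTiming:
    word: str
    start_s: float  # ASSUMPTION: seconds from utterance start
    end_s: float


def gesture_start(timings, anchor_word, preparation_s):
    """Time at which a gesture must begin so that it peaks at the anchor word's onset."""
    for timing in timings:
        if timing.word == anchor_word:
            # Never schedule before the utterance itself starts.
            return max(0.0, timing.start_s - preparation_s)
    raise KeyError(anchor_word)


# Word timings from the slide.
timings = [
    WordTiming("eine", 2.1539, 2.2829),
    WordTiming("Übersicht", 2.2829, 3.2997),
]
start = gesture_start(timings, "Übersicht", preparation_s=0.5)
print(round(start, 4))
```

Anchoring the gesture to a word onset rather than to a fixed clock time keeps speech and animation aligned even when the synthesizer's speaking rate changes.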
