German Research Center for Artificial Intelligence DFKI GmbH Stuhlsatzenhausweg 3 66123 Saarbruecken, Germany phone: (+49 681) 302-5252/4162 fax: (+49.

Presentation transcript:

SmartKom: Multimodal Dialogs with Mobile Web Users. Prof. Wolfgang Wahlster, German Research Center for Artificial Intelligence, DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbruecken, Germany. Cyber Assist International Symposium 2001, Tokyo, March 6, 2001.

© W. Wahlster. Merging Various User Interface Paradigms: natural language dialog, graphical user interfaces, and gestural interaction merge into multimodal interaction.

Code, Media and Modalities (between system and user). Media (physical information carriers): input channels, output channels, storage (HD drive, CD-ROM). Modalities (human senses): visual, tactile, auditory, haptic. Code (systems of symbols): language, graphics, gesture, mimics (facial expressions).

SmartKom: Intuitive Multimodal Interaction. The SmartKom Consortium: main contractor DFKI Saarbrücken; partners shown include MediaInterface, European Media Lab, Univ. of Munich, Univ. of Stuttgart, and Univ. of Erlangen; sites shown: Saarbrücken, Aachen, Dresden, Berkeley, Stuttgart, Munich, Heidelberg, Ulm. Project budget: € 25 M. Project duration: 4 years.

SmartKom: A Transportable and Transmutable Interface Agent. Kernel of the SmartKom interface agent: Media Analysis, Interaction Management, Application Management, Media Design. Application scenarios: SmartKom-Home/Office (a versatile agent-based interface), SmartKom-Public (a multimodal communication booth), SmartKom-Mobile (a handheld communication assistant).

The Architecture of the SmartKom Agent (cf. Maybury/Wahlster 1998). Input processing (media analysis): language, graphics, gesture, biometrics, followed by media fusion. Interaction management: intention recognition, discourse modeling, user modeling, presentation design, application interface. Output rendering (media design): language, graphics, gesture, animated presentation agent. Representation and inference: user model, discourse model, domain model, task model, media models. The agent sits between the user(s) and information, applications, and people.

SmartKom-Mobile: A Handheld Communication Assistant. Components: camera; GPS; microphone; loudspeaker; stylus-activated sketch pad; wearable compute server; docking station for car PC; biosensor for authentication and emotional feedback; GSM for telephone, fax, and Internet connectivity.

SmartKom-Public: A Multimodal Communication Booth. Components: smartcard/credit card for authentication and billing; docking station for PDA/notebook/camcorder; high-speed, broad-bandwidth Internet connectivity; high-resolution scanner; loudspeaker; room microphone; face-tracking camera; virtual touchscreen protected against vandalism; multipoint video conferencing.

SmartKom-Home/Office: Versatile Agent-based Interface. Components: SpeechMike, virtual touchscreen, natural gesture recognition.

Integration of Speech and Gesture
Advantages:
- For the sender: economic specification of referents. The description becomes shorter and may be underspecified.
- For the recipient: fast recognition of referents. Speech processing and orientation in an intended direction are performed simultaneously.
- Speech and gesture input disambiguate each other.
Disadvantages:
- Employing gestures leads to an increase of elliptic utterances, so speech analysis becomes more complex.
- Multiple pointing gestures in one utterance may lead to reference problems.
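
The reference problem with multiple pointing gestures can be illustrated with a small sketch. It assumes that both words and gestures carry timestamps and that each deictic word is bound to the temporally closest gesture that is still unused. The utterance, the timestamps, the target names and the `resolve_deictics` helper are all hypothetical and are not taken from SmartKom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    time: float  # seconds from utterance start (hypothetical values)


@dataclass(frozen=True)
class Pointing:
    target: str  # identifier of the object pointed at (hypothetical)
    time: float  # seconds from utterance start


# Assumed set of deictic words; a real system would use linguistic analysis.
DEICTICS = {"this", "that", "here", "there"}


def resolve_deictics(words, gestures):
    """Bind each deictic word to the closest-in-time gesture not yet used."""
    remaining = list(gestures)
    bindings = {}
    for index, word in enumerate(words):
        if word.text.lower() not in DEICTICS or not remaining:
            continue
        nearest = min(remaining, key=lambda g: abs(g.time - word.time))
        remaining.remove(nearest)
        bindings[index] = nearest.target
    return bindings


# Hypothetical utterance "from here to there" with two pointing gestures.
words = [Word("from", 0.0), Word("here", 0.3), Word("to", 0.7), Word("there", 1.0)]
gestures = [Pointing("heidelberg_castle", 0.35), Pointing("main_station", 1.05)]
bindings = resolve_deictics(words, gestures)
```

With these inputs, the word at index 1 is bound to the first gesture and the word at index 3 to the second. When gestures arrive late or out of order, a purely temporal heuristic like this one can mis-assign referents, which is the kind of reference problem the slide mentions.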

XTRA: Interpretation of pointing gestures (eXpert TRAnslator, Wahlster et al. 1986)

Multimodal Input and Output in the SmartKom System

Unification-based Media Fusion: “MOVE THIS HERE” (source: Michael Johnston)
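
As a rough illustration of the idea behind unification-based fusion, the sketch below unifies two feature structures represented as nested dictionaries: one from the spoken command, with underspecified object and destination slots, and one from the gestures that fill those slots. This is a simplified textbook-style unification without variables or shared substructures, not the formalism used on the slide. All feature names, identifiers and coordinates are hypothetical.

```python
class UnificationFailure(Exception):
    """Raised when two atomic values clash."""


def unify(a, b):
    """Unify two feature structures given as nested dicts or atomic values."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            # Shared features must unify recursively; new features are added.
            result[key] = unify(result[key], value) if key in result else value
        return result
    if a == b:
        return a
    raise UnificationFailure(f"{a!r} does not unify with {b!r}")


# Speech contributes the command frame; "this" and "here" stay underspecified.
speech = {
    "type": "command",
    "action": "move",
    "object": {"type": "entity"},
    "destination": {"type": "location"},
}

# Gestures contribute the concrete referents (hypothetical values).
gesture = {
    "object": {"type": "entity", "id": "unit_7"},
    "destination": {"type": "location", "coords": (12, 34)},
}

fused = unify(speech, gesture)
```

Type features act as constraints here: a gesture that selects a location cannot fill the object slot, because `"entity"` and `"location"` fail to unify and `unify` raises `UnificationFailure`.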

Augmented Reality: Combining Speech, Gestures and Graphics for Mobile Web Access. Mobile dialog with a virtual tourist guide for the Heidelberg Castle; location-adaptive query interpretation.

Augmented Reality: Combining Speech, Gestures and Graphics for Mobile Web Access. Multimodal route description; mobile speech translation and multilingual information access.

Augmented Reality: Combining Speech, Gestures and Graphics for Mobile Web Access. Speech-based access to 3D virtual views; multimodal output from a digital library and speech-based access to Internet content.

Multimodal Input and Output in SmartKom. Input by the user and output by the presentation agent both use speech, gesture, and mimics.

Modality-Specific Representation Languages as an Intermediate Representation before Media Fusion. Speech input: parsing, semantic representation language. Gestures: gesture analysis and gesture generation, gesture description language. Mimics: mimics analysis and mimics generation, mimics description language. Shared components: ontologies; knowledge representation language with inference component; DBMS/KBMS/WWW. M3L is based on XML.

The SmartKom Control GUI

SmartKom's Data Collection of Multimodal Dialogs. Recording setup: user; side-view camera; bird's-eye camera; face-tracking camera with microphone; microphone array; environmental noise; loudspeaker; screen; projected webpage; LCD beamer; SIVIT camera.

ANVIL: Multi-Track Annotation of Video and Language. Annotation tool for multimodal interaction. Tracks: transliterated speech, rhetorical relations, theme-rheme, postures, gestures.

Mobile Presentation Unit for SmartKom-Public: 2 Sony DSR-PD100AP video cameras; LCD beamer ASK C5; SIVIT gesture recognition unit; microphones (microphone array); speakers; 3 dual Pentium III, 500.

Combination of Speech and Gesture in SmartKom
User: Which feature films are shown tonight on TV?
SmartKom: I show you a survey of tonight's TV films.
User: I can't find anything interesting. Then I'll go to the movies.
SmartKom: Here you see a programme listing of the movies shown in Heidelberg today.
User: This one I would like to see. Where is it shown?
SmartKom: On this map all movie theatres are highlighted, that are showing "A Little Christmas Story".

Three Levels of Mark-up Languages for the Web. Content : Structure : Form = 1 : n : m. For a WWW document: content in OIL/M3L, structure in XML, form in HTML.

M3L Integrates Three Language Families. Frame languages: object-oriented modelling primitives. Concept languages/terminological logics: formal semantics, subsumption, inferences. Web languages: XML and RDF syntax.

M3L Representation of the Multimodal Discourse Context: Blackboard with Presentation Context of the Previous Dialog Turn. M3L excerpt (element values only): [...] cinema_17a Europa [...] pid1234 [...]

M3L Representation of the Word Lattice Produced by the Speech Recognizer for “There [pointing gesture] I would like to get a reservation.” M3L excerpt (element values only): T13:44:37.900Z shortPause [...] 5 7 gern PT0.57S PT0.84S 5 7 gerne PT0.57S PT0.84S [...]
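
A word lattice of this kind can be modelled as a set of edges between numbered nodes, where several edges over the same span are competing hypotheses. The sketch below is one possible reading of the slide's values, not the M3L schema: it assumes that `5` and `7` are start and end nodes and that `PT0.57S` and `PT0.84S` are start and end offsets written as ISO 8601 durations. The `Edge` class, its field names and `parse_seconds` are hypothetical.

```python
import re
from collections import defaultdict
from dataclasses import dataclass

# Matches seconds-only ISO 8601 durations such as "PT0.57S".
_SECONDS = re.compile(r"^PT(\d+(?:\.\d+)?)S$")


def parse_seconds(duration):
    """Convert a seconds-only ISO 8601 duration string to a float."""
    match = _SECONDS.match(duration)
    if match is None:
        raise ValueError(f"unsupported duration: {duration!r}")
    return float(match.group(1))


@dataclass(frozen=True)
class Edge:
    start_node: int  # assumed meaning of the first number on the slide
    end_node: int    # assumed meaning of the second number on the slide
    word: str
    start: float     # seconds, assumed to be the edge's start offset
    end: float       # seconds, assumed to be the edge's end offset


# The two edges whose values appear on the slide.
raw_edges = [
    (5, 7, "gern", "PT0.57S", "PT0.84S"),
    (5, 7, "gerne", "PT0.57S", "PT0.84S"),
]
edges = [
    Edge(s, e, word, parse_seconds(t0), parse_seconds(t1))
    for s, e, word, t0, t1 in raw_edges
]

# Group competing word hypotheses by the node span they cover.
alternatives = defaultdict(list)
for edge in edges:
    alternatives[(edge.start_node, edge.end_node)].append(edge.word)
```

Under these assumptions, `alternatives[(5, 7)]` holds both `gern` and `gerne`, the two hypotheses the recognizer left open for later analysis stages to choose between.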

Gesture Recognition and Gesture Analysis for “There [pointing gesture] I would like to get a reservation.” Gesture lattice as result of gesture recognition (element values only): T14:45: PT0.040S T14:45: PT0.040S tarrying dynamic. Result of gesture analysis (element values only): [...] tarrying dynStructId30 1 dynStructId28 2 [...] cinema_17a Europa [...]

Language Analysis and Media Fusion. Turn 8: “There [pointing gesture] I would like to get a reservation.” M3L excerpt (element values only): [...] acoustic understanding reserve cinema_17a Europa [...] Callouts on the slide: confidence in the speech recognition result; confidence in the speech understanding result; planning act; object reference.

Result of the Action Planner: Presentation Tasks and Presentation Results. M3L excerpt (element values only): list add [...] 20:00 [...]

Input into the Language Generator. M3L excerpt (element values only): list Meine Braut, ihr Vater und ich Europa [...]

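Language Generation. M3L excerpt (element values only): [...] die Anfangszeiten ("the start times") [...] Generated sentence: "Auf der Übersicht sehen Sie die Anfangszeiten des Films Schmalspurganoven im Kino Europa" ("In the overview you see the start times of the film Schmalspurganoven at the Europa cinema").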

Output Synchronization: Speech, Gesture, Graphics, Animation. M3L excerpt (element values only): 11 declarative [...] eine Übersicht ("an overview") [...]