1 German Research Center for Artificial Intelligence, DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbruecken, Germany, phone: (+49 681) 302-5252/4162, fax: (+49 ...), WWW: http://www.dfki.de/~wahlster
Wolfgang Wahlster
SmartKom: Modality Fusion for a Mobile Companion based on Semantic Web Technologies
Cyber Assist Consortium, Second International Symposium - Information Environment for Mobile and Ubiquitous Computing Era - Tokyo, 25 March 2003

2 © W. Wahlster Multimodal UMTS Systems
Intelligent Interaction with Mobile Internet Services
Access to web content and web services anywhere and anytime
Access to corporate networks and virtual private networks from any device
Access to edutainment and infotainment services
Access to all messages (voice, e-mail, multimedia, MMS) from any single device
Personalization
Localization

3 © W. Wahlster SmartKom: A Highly Portable Multimodal Dialogue System
MM Dialogue Back-Bone
Application Layer: SmartKom-Mobile, SmartKom-Public, SmartKom-Home/Office
Mobile: Car and Pedestrian Navigation
Public: Cinema, Phone, Fax, Mail, Biometrics
Home: Consumer Electronics, EPG

4 © W. Wahlster A Demonstration of SmartKom’s Multimodal Interface for the Federal President of Germany Dr. Rau

5 © W. Wahlster SmartKom’s SDDP Interaction Metaphor SDDP = Situated Delegation-oriented Dialogue Paradigm Anthropomorphic Interface = Dialogue Partner User specifies goal delegates task cooperate on problems asks questions presents results Service 1 Service 2 Service 3 Web services Personalized Interaction Agent See: Wahlster et al. 2001, Eurospeech

6 © W. Wahlster SmartKom’s Use of Semantic Web Technology
Three Layers of Annotations (degree of personalized presentation)
Content: M3L (high)
Structure: XML (medium)
Layout: HTML (low)

7 © W. Wahlster SmartKom: Intuitive Multimodal Interaction
The SmartKom Consortium: MediaInterface, European Media Lab, Univ. of Munich, Univ. of Stuttgart, Univ. of Erlangen
Sites: Saarbrücken, Aachen, Dresden, Berkeley, Stuttgart, Munich, Heidelberg, Ulm
Main Contractor: DFKI Saarbrücken; Scientific Director: W. Wahlster
Project Budget: € 25.5 million, funded by BMBF (Dr. Reuse) and industry
Project Duration: 4 years (September 1999 – September 2003)

8 © W. Wahlster Outline of the Talk
1. The Markup Language Layer Model of SmartKom
2. Modality Fusion in SmartKom
3. The Role of the Semantic Web Language M3L
4. Providing Coherence in Multimodal Dialogs by Ontology-based Overlay
5. Conclusions

9 © W. Wahlster Personalization: Mapping Web Content Onto a Variety of Structures and Layouts
From the “one-size-fits-all” approach of static webpages to the “perfect personal fit” approach of adaptive webpages
Content: M3L
Structure: XML 1, XML 2, ..., XML n
Layout: HTML 11 ... HTML 1m, HTML 21 ... HTML 2o, HTML 31 ... HTML 3p

10 © W. Wahlster The Markup Language Layer Model of SmartKom
M3L: MultiModal Markup Language
OIL: Ontology Inference Layer
XMLS: eXtensible Markup Language Schema
RDFS: Resource Description Framework Schema
XML: eXtensible Markup Language
RDF: Resource Description Framework
HTML: Hypertext Markup Language

11 © W. Wahlster Spoken Dialogue Graphical User interfaces Gestural Interaction Multimodal Interaction SmartKom: Merging Various User Interface Paradigms Facial Expressions Biometrics

12 © W. Wahlster Multimodal Input and Output in SmartKom Fusion and Fission of Multiple Modalities Input by the User Output by the Presentation agent Speech Gesture Facial Expressions

13 © W. Wahlster Symbolic and Subsymbolic Fusion of Multiple Modes
Input: Speech Recognition, Gesture Recognition, Prosody Recognition, Facial Expression Recognition, Lip Reading
Subsymbolic Fusion: Neural Networks, Hidden Markov Models
Symbolic Fusion: Graph Unification, Bayesian Networks
Reference Resolution and Disambiguation
Modality-Free Semantic Representation

14 © W. Wahlster Personalized Interaction with WebTVs via SmartKom (DFKI with Sony, Philips, Siemens)
Example: Multimodal Access to Electronic Program Guides for TV
User: Switch on the TV.
Smartakus: Okay, the TV is on.
User: Which channels are presenting the latest news right now?
Smartakus: CNN and NTV are presenting news.
User: Please record this news channel on a videotape.
Smartakus: Okay, the VCR is now recording the selected program.

15 © W. Wahlster Using Facial Expression Recognition for Affective Personalization
Processing ironic or sarcastic comments
(1) Smartakus: Here you see the CNN program for tonight.
(2) User: That’s great. [facial expression icon]
(3) Smartakus: I’ll show you the program of another channel for tonight.
(2’) User: That’s great. [facial expression icon]
(3’) Smartakus: Which of these features do you want to see?

16 © W. Wahlster The SmartKom Demonstrator System Camera for Gestural Input Microphone Multimodal Control of TV-Set Multimodal Control of VCR/DVD Player Camera for Facial Analysis

17 © W. Wahlster Unification of Scored Hypothesis Graphs for Modality Fusion in SmartKom Word Hypothesis Graph with Acoustic Scores Clause and Sentence Boundaries with Prosodic Scores Scored Hypotheses about the User‘s Emotional State Gesture Hypothesis Graph with Scores of Potential Reference Objects Intention Recognizer Selection of Most Likely Interpretation Modality Fusion Mutual Disambiguation Reduction of Uncertainty Intention Hypotheses Graph
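The mutual disambiguation named on this slide can be illustrated with a minimal Python sketch. It is not SmartKom's implementation: the hypothesis records, the field names, and the choice to multiply scores are all assumptions made for illustration. Each modality supplies scored hypotheses, incompatible pairs are discarded, and the most likely joint interpretation is selected.

```python
from itertools import product

# Hypothetical scored hypotheses (illustrative values, not from SmartKom).
SPEECH_HYPS = [
    {"act": "record", "object_type": "movie", "score": 0.55},
    {"act": "record", "object_type": "channel", "score": 0.45},
]
GESTURE_HYPS = [
    {"object": "CNN", "object_type": "channel", "score": 0.9},
    {"object": "Terminator", "object_type": "movie", "score": 0.1},
]

def fuse(speech, gesture):
    """Combine one speech and one gesture hypothesis; None if they clash."""
    if speech["object_type"] != gesture["object_type"]:
        return None
    return {
        "act": speech["act"],
        "object": gesture["object"],
        "object_type": speech["object_type"],
        # Assumption: scores are combined by multiplication.
        "score": speech["score"] * gesture["score"],
    }

def best_interpretation(speech_hyps, gesture_hyps):
    """Select the highest-scoring compatible pair, or None if none exists."""
    fused = []
    for speech, gesture in product(speech_hyps, gesture_hyps):
        joint = fuse(speech, gesture)
        if joint is not None:
            fused.append(joint)
    return max(fused, key=lambda f: f["score"], default=None)
```

With these numbers the speech recognizer alone slightly prefers "movie", but the strong pointing gesture at a channel overturns that ranking, so the joint interpretation is less uncertain than either input on its own.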

18 © W. Wahlster SmartKom‘s Computational Mechanisms for Modality Fusion and Fission Modality Fusion Modality Fission Ontological Inferences Unification Overlay Operations Planning Constraint Propagation M3L: Modality-Free Semantic Representation

19 © W. Wahlster The Role of the Semantic Web Language M3L
M3L (Multimodal Markup Language) defines the data exchange formats used for communication between all modules of SmartKom
M3L is partitioned into 40 XML schema definitions covering SmartKom’s discourse domains
The XML schema event.xsd captures the semantic representation of concepts and processes in SmartKom’s multimodal dialogs

20 © W. Wahlster OIL2XSD: Using XSLT Stylesheets to Convert an OIL Ontology to an XML Schema

21 © W. Wahlster Using Ontologies to Extract Information from the Web
Mapping of Metadata
MyOnto-Movie: :title, :description, :actors
MyOnto-Person: :name, :birthday, :director
Film.de-Movie: :title, :description
Kinopolis.de-Movie: :name, :critics, :o-title, :main actor
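A metadata mapping of this kind can be sketched in Python as a per-source table of field correspondences. The slide names the fields but not which source field maps to which target slot, so the correspondences below (for example Kinopolis.de `:name` to MyOnto-Movie `:title`) are assumptions, and `to_my_onto_movie` is a hypothetical helper.

```python
# Assumed field correspondences from each source schema to MyOnto-Movie.
# The slide lists the field names only; the pairings are illustrative.
FIELD_MAPPINGS = {
    "Film.de": {"title": "title", "description": "description"},
    "Kinopolis.de": {
        "name": "title",
        "critics": "description",
        "main actor": "actors",
    },
}

def to_my_onto_movie(source, record):
    """Map a source record onto the MyOnto-Movie slots; unmapped fields are dropped."""
    mapping = FIELD_MAPPINGS[source]
    movie = {"title": None, "description": None, "actors": None}
    for src_field, value in record.items():
        target = mapping.get(src_field)
        if target is None:
            continue  # e.g. ":o-title" has no assumed counterpart here
        # A single main actor becomes a one-element actors list.
        movie[target] = [value] if target == "actors" else value
    return movie
```

Keeping the correspondences in data rather than in code means a new source site only needs a new entry in the table.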

22 © W. Wahlster M3L as a Meaning Representation Language for the User’s Input
I would like to send an e-mail to Koiti

23 © W. Wahlster Exploiting Ontological Knowledge to Understand and Answer the User’s Queries
Which movies with Schwarzenegger are shown on the Pro7 channel?
[M3L representation of the query; surviving values: T10:25:46, Schwarzenegger (name), Pro7]

24 © W. Wahlster SmartKom’s Multimodal Dialogue Back-Bone Communication Blackboards Data Flow Context Dependencies Analyzers External Services Modality Fusion Discourse Modeling Action Planning Modality Fission Generators Speech Gestures Facial Expressions Speech Graphics Gestures Dialogue Manager

25 © W. Wahlster SmartKom’s Three-Tiered Discourse Model
System: This [☞] is a list of films showing in Heidelberg.
User: Please reserve a ticket for the first one.
Modality Layer: VO 1, LO 2, LO 3, ..., LO 4, LO 5, LO 6, GO 1, ...
Discourse Layer: DO 1, DO 2, DO 3, DO 9, DO 10, DO 11, DO 12
Domain Layer: DomainObject 1, DomainObject 2
Word labels in the diagram: heidelberg, list, ticket, first, reserve
DO = Discourse Object, LO = Linguistic Object, GO = Gestural Object, VO = Visual Object
cf. M. Löckelt et al. 2002, N. Pfleger 2002

26 © W. Wahlster SmartKom’s Domain Model based on M3L
Used for communication in the back-bone
Frame-based ontology; representation as Typed Feature Structures in M3L (XML)
Application objects composed of subobjects
Slots: feature paths meaningful for the dialogue (entities that can be talked about or referred to); e.g. movie:director:lastName in a CinemaReservation object
Slots can recursively contain other slots
CinemaReservation: theater: MovieTheater, movie: Movie, reservationNumber: PositiveInteger
Movie: name: String, director: Person, cast: PersonList, yearOfProduction: PositiveInteger, …
MovieTheater: address: Address, seats: SeatStructure, …
Person: firstName: String, lastName: String, …
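The feature-path idea can be made concrete with a small Python sketch. Nested dictionaries stand in for typed feature structures here, which is an assumption for illustration (SmartKom represents them in M3L/XML), and `get_path` and the sample values are hypothetical.

```python
def get_path(tfs, path):
    """Follow a colon-separated feature path such as 'movie:director:lastName'.

    Returns None when any step of the path is missing.
    """
    node = tfs
    for feature in path.split(":"):
        if not isinstance(node, dict) or feature not in node:
            return None
        node = node[feature]
    return node

# Illustrative CinemaReservation object; all values are made up.
RESERVATION = {
    "type": "CinemaReservation",
    "theater": {"type": "MovieTheater", "address": {"type": "Address"}},
    "movie": {
        "type": "Movie",
        "name": "Example Movie",
        "director": {"type": "Person", "firstName": "Jane", "lastName": "Doe"},
    },
    "reservationNumber": 42,
}
```

Because slots can contain other slots, a path lookup is just a walk down the nesting, and a missing step signals that the entity has not been mentioned yet.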

27 © W. Wahlster Overlay Operations Using the Discourse Model
Augmentation and Validation
– compare with a number of previous discourse states: fill in consistent information, compute a score
– for each hypothesis-background pair: Overlay(covering, background)
Diagram labels: Covering, Background, Intention Hypothesis Lattice, Selected Augmented Hypothesis Sequence

28 © W. Wahlster The Overlay Operation Versus the Unification Operation
Nonmonotonic and noncommutative unification-like operation
Inherit (non-conflicting) background information
Two sources of conflicts:
– conflicting atomic values: overwrite background (old) with covering (new)
– type clash: assimilate background to the type of covering; recursion
Diagram labels: Unification, Overlay
cf. J. Alexandersson, T. Becker 2001

29 © W. Wahlster Example for Overlay
User: "What films are on TV tonight?"
System: [presents list of films]
User: "That’s a boring program, I’d rather go to the movies."
How do we inherit “tonight”?

30 © W. Wahlster Domain Model: A Type Hierarchy of TFS A named entertainment at some time A named TV program at some time on some channel A named Movie at some time at some cinema

31 © W. Wahlster Unification Simulation Films on TV tonight Fail – type clash
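The failure shown on this slide can be reproduced with a minimal Python sketch of unification over typed feature structures. The type names (`Entertainment`, `Broadcast`, `Performance`) are assumptions standing in for the three types described on the previous slide, and the structures are kept flat (atomic feature values only) to stay short.

```python
# Hypothetical single-inheritance hierarchy: child -> parent.
PARENT = {"Entertainment": None, "Broadcast": "Entertainment", "Performance": "Entertainment"}

def ancestors(t):
    """Return t followed by all of its supertypes, most specific first."""
    chain = []
    while t is not None:
        chain.append(t)
        t = PARENT[t]
    return chain

def unify_types(a, b):
    """Return the more specific type if one subsumes the other, else None."""
    if a in ancestors(b):
        return b
    if b in ancestors(a):
        return a
    return None  # type clash

def unify(x, y):
    """Unify two flat typed feature structures; None on any clash."""
    t = unify_types(x["type"], y["type"])
    if t is None:
        return None
    feats = dict(x["feats"])
    for key, value in y["feats"].items():
        if key in feats and feats[key] != value:
            return None  # conflicting atomic values
        feats[key] = value
    return {"type": t, "feats": feats}
```

Unifying "films on TV tonight" (a `Broadcast`) with "go to the movies" (a `Performance`) returns `None`, because neither type subsumes the other, so "tonight" cannot be inherited this way.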

32 © W. Wahlster Overlay Simulation
Covering: Go to the movies
Background: Films on TV tonight
Assimilation

33 © W. Wahlster "Formal" Definition of Overlay
Let
– co be covering
– bg be background
Step 1:
– Assimilate(co,bg) [type hierarchy diagram with T, bg, co]
Step 2:
– Overlay(co,assimilate(co,bg))
If co and bg are frames: recursion
If co is empty: use bg
If bg is empty: use co
If conflict: use co
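The two steps can be sketched in Python as follows. This is a reading of the slide under stated assumptions, not the authors' code: the hierarchy is single-inheritance with a common root, the type names and the `INTRODUCES` table (which features each type declares) are hypothetical, and assimilation is taken to mean lifting the background to the most specific common supertype and dropping features that supertype does not have.

```python
# Hypothetical hierarchy (child -> parent) and features each type introduces.
PARENT = {"Entertainment": None, "Broadcast": "Entertainment", "Performance": "Entertainment"}
INTRODUCES = {"Entertainment": {"name", "time"}, "Broadcast": {"channel"}, "Performance": {"cinema"}}

def ancestors(t):
    """Return t followed by all of its supertypes, most specific first."""
    chain = []
    while t is not None:
        chain.append(t)
        t = PARENT[t]
    return chain

def allowed_features(t):
    """Features available on type t, including inherited ones."""
    feats = set()
    for a in ancestors(t):
        feats |= INTRODUCES[a]
    return feats

def assimilate(co, bg):
    """Step 1: lift bg to the most specific supertype shared with co."""
    co_anc = ancestors(co["type"])
    lub = next(t for t in ancestors(bg["type"]) if t in co_anc)
    keep = allowed_features(lub)
    return {"type": lub, "feats": {k: v for k, v in bg["feats"].items() if k in keep}}

def is_frame(x):
    return isinstance(x, dict) and "type" in x and "feats" in x

def overlay(co, bg):
    """Step 2: lay covering over the assimilated background."""
    if co is None:
        return bg  # co is empty: use bg
    if bg is None:
        return co  # bg is empty: use co
    if is_frame(co) and is_frame(bg):
        bg = assimilate(co, bg)
        keys = set(co["feats"]) | set(bg["feats"])
        feats = {k: overlay(co["feats"].get(k), bg["feats"].get(k)) for k in keys}
        return {"type": co["type"], "feats": feats}
    return co  # conflict: use co
```

Overlaying "go to the movies" on "films on TV tonight" yields a `Performance` that keeps `time: tonight` and loses the channel. Swapping the arguments gives a different result, which shows the operation is noncommutative.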

34 © W. Wahlster Domain Models with Multiple Inheritance
Assimilate(co,bg)
– Compute the set of minimal upper bounds (MUB)
– Specialize the MUBs
– Unify the specialized MUBs
Overlay remains untouched
[type hierarchy diagram with T, MUB, co, bg]
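The first of these steps, computing the set of minimal upper bounds, can be sketched in Python. The hierarchy and type names below are hypothetical, and the sketch stops before the specialization and unification steps, which depend on details the slide does not give.

```python
# Hypothetical multiple-inheritance hierarchy: type -> direct supertypes.
SUPERTYPES = {
    "Top": [],
    "Timed": ["Top"],
    "Named": ["Top"],
    "Broadcast": ["Timed", "Named"],
    "Performance": ["Timed", "Named"],
}

def upper_bounds(t):
    """All supertypes of t, including t itself."""
    seen = {t}
    stack = [t]
    while stack:
        for s in SUPERTYPES[stack.pop()]:
            if s not in seen:
                seen.add(s)
                stack.append(s)
    return seen

def minimal_upper_bounds(a, b):
    """Common upper bounds of a and b that have no other common bound below them."""
    common = upper_bounds(a) & upper_bounds(b)
    return {
        t for t in common
        if not any(u != t and t in upper_bounds(u) for u in common)
    }
```

With single inheritance the result is always one type. With multiple inheritance, as in this example, two incomparable types (`Timed` and `Named`) can both be minimal, which is why the slide speaks of a set.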

35 © W. Wahlster Overlay - Scoring
Four fundamental scoring parameters:
– Number of features from Covering (co)
– Number of features from Background (bg)
– Number of type clashes (tc)
– Number of conflicting atomic values (cv)
Codomain [-1,1]
Higher score indicates better fit (score = 1 ⇔ overlay(c,b) = unify(c,b))
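The slide lists the parameters and the codomain but not the formula itself. One function with the stated properties, given here as an assumption rather than as the slide's definition, is score = (co + bg - (tc + cv)) / (co + bg + tc + cv). A short Python sketch:

```python
def overlay_score(co, bg, tc, cv):
    """Assumed scoring function over the four counts, with values in [-1, 1].

    co: features taken from the covering
    bg: features taken from the background
    tc: type clashes
    cv: conflicting atomic values
    """
    total = co + bg + tc + cv
    if total == 0:
        return 0.0  # assumption: two empty structures get a neutral score
    return (co + bg - (tc + cv)) / total
```

With no clashes and no conflicts the score is 1, the case in which overlay behaves like unification. When only conflicts are counted the score is -1.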

36 © W. Wahlster Example: Enrichment and Validation
Discourse context:
U4: What’s on TV tonight?
S5: [Displays a list of films] Here you see a list of films running tonight.
U6: That seems not very interesting, show me the cinema program.
Analysis of U4:

37 © W. Wahlster Example: Enrichment and Validation
Discourse context:
U4: What’s on TV tonight?
S5: [Displays a list of films] Here you see a list of films running tonight.
U6: That seems not very interesting, show me the cinema program.
Analysis of U6: Overlay(U6, U4)
Result: (Score: )

38 © W. Wahlster Example: Enrichment and Validation
Discourse context:
U4: What’s on TV tonight?
S5: [Displays a list of films] Here you see a list of films running tonight.
U6: That seems not very interesting, show me the cinema program.
Analysis of U6: Overlay(U6, U2)
Result: (Score: -1)

39 © W. Wahlster Animation of Scoring Parameters Background Covering Number of features from Covering (co) Number of features from Background (bg) Number of type clashes (tc) Number of conflicting atomic values (cv) Result:

40 © W. Wahlster The High-Level Control Flow of SmartKom

41 © W. Wahlster M3L Specification of a Presentation Task
[M3L markup; surviving values: EuroSport, T14:00:, T15:00:00, Sport News, sport, ..., leanForward, APGOAL3000, generatorAction, GraphicsAndSpeech]

42 © W. Wahlster SmartKom‘s Presentation Planner The Presentation Planner generates a Presentation Plan by applying a set of Presentation Strategies to the Presentation Goal. GlobalPresent PresentAddSmartakus DoLayout EvaluatePersonaNode Inform TryToPresentTVOverview ShowTVOverview SetLayoutData ShowTVOverview SetLayoutData PersonaAction SendScreenCommand Generation of Layout Smartakus Actions GenerateText... Speak cf. J. Müller, P. Poller, V. Tschernomas 2002

43 © W. Wahlster Salient Characteristics of SmartKom
Seamless integration and mutual disambiguation of multimodal input and output on semantic and pragmatic levels
Situated understanding of possibly imprecise, ambiguous, or incomplete multimodal input
Context-sensitive interpretation of dialog interaction on the basis of dynamic discourse and context models
Adaptive generation of coordinated, cohesive and coherent multimodal presentations
Semi- or fully automatic completion of user-delegated tasks through the integration of information services
Intuitive personification of the system through a presentation agent

44 © W. Wahlster Conclusions
Various types of unification, overlay, constraint processing, planning and ontological inferences are the fundamental processes involved in SmartKom’s modality fusion and fission components.
The key function of modality fusion is the reduction of the overall uncertainty and the mutual disambiguation of the various analysis results based on a three-tiered representation of multimodal discourse.
We have shown that a multimodal dialogue system must not only understand and represent the user’s input, but also its own multimodal output.

45 © W. Wahlster First International Conference on Perceptive & Multimodal User Interfaces (PMUI’03)
November 5-7, 2003, Delta Pinnacle Hotel, Vancouver, B.C., Canada
Conference Chair: Sharon Oviatt, Oregon Health & Science Univ., USA
Program Chairs: Wolfgang Wahlster, DFKI, Germany; Mark Maybury, MITRE, USA
PMUI’03 is sponsored by ACM, and will be co-located in Vancouver with ACM’s UIST’03. This meeting follows three successful Perceptive User Interface Workshops (with PUI’01 held in Florida) and three International Multimodal Interface Conferences initiated in Asia (with ICMI’02 held in Pittsburgh).

46 © W. Wahlster March 2003 ISBN x 9, 392 pp., 98 illus. $40.00/£26.95 (CLOTH) Edited by Dieter Fensel, James A. Hendler, Henry Lieberman and Wolfgang Wahlster Foreword by Tim Berners-Lee
