eNTERFACE ’08 Project 2 “Multimodal High Level Data Integration” Final Report August 29th, 2008.

1 eNTERFACE ’08 Project 2 “Multimodal High Level Data Integration” Final Report August 29th, 2008

2 Application challenges
- 2 users in their home/office environment
- unrestricted natural language
- free human behavior

3 Components integrated
[Diagram labels] Audio Stream; Video Stream; Speech Recognizer; Video Analyzer; Sound Waves; Syntactic Analyzer; Recognized String; Sequence of Images; Semantic Analyzer; Syntactic Triple; Knowledge Base; Fusion Mechanism; Human Behavior Analyzer; Movements Coordinates; Movements Meanings; Linguistic meanings; Advise People

4 [Diagram labels] Audio Stream; Video Stream; Sphinx-4; OpenCV; Sound Waves; C&C Parser; Recognized String; Sequence of Images; C&C Boxer; Syntax Analysis; Protégé; Jena; Semantic Validation; Fusion Mechanism; Human Behavior Analyzer; Movements Coordinates; Movements Meanings; Linguistic meanings; Advise People

5 Example Scenario
[Ronald] I want to call Nick. Nick mentioned that he attended a wine tasting course.
[Beto] It sounds interesting, I like wine.
[Ronald] Actually I plan to join the next class. He also mentioned a book about French wines, but I cannot recall the name of the author.
[Beto] Why don't you send a mail to Nick?
[Ronald] Maybe I can find a book about it in the library.
[Beto] Yes, you are right.
[Beto] Did you find it?
[Ronald] Yes, I did.

6 Hints for plan recognition by speech
Alerts: want, need, wish, require, going to, plan, look for, wonder, can, may, must, do you know, do we have, etc.
Stop-alerts:
- negation (I am not going to …)
- past tense (Yesterday I was going to …)
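The alert and stop-alert idea can be illustrated with a small keyword matcher. This is only a minimal sketch, not the project's implementation (the slides name Sphinx-4 and the C&C tools for the speech side): the alert list is copied from the slide, while the negation and past-tense marker sets, and the function name `find_alerts`, are assumptions made for illustration.

```python
import re

# Alert cues copied from the slide; the slide's list ends with "etc.", so it is not exhaustive.
ALERTS = ["want", "need", "wish", "require", "going to", "plan", "look for",
          "wonder", "can", "may", "must", "do you know", "do we have"]

# Assumed stop-alert markers (not from the slides): a crude stand-in for real
# negation and tense detection, which a parser would normally provide.
NEGATIONS = {"not", "never", "cannot", "no"}
PAST_MARKERS = {"was", "were", "did", "had", "yesterday"}


def find_alerts(sentence: str) -> list[str]:
    """Return the alert cues found in the sentence, or [] if a stop-alert applies."""
    text = sentence.lower().replace("\u2019", "'")  # normalise curly apostrophes
    tokens = re.findall(r"[a-z']+", text)
    # Match each cue on word boundaries so "may" does not fire on "maybe".
    hits = [cue for cue in ALERTS if re.search(rf"\b{re.escape(cue)}\b", text)]
    if not hits:
        return []
    negated = any(t in NEGATIONS or t.endswith("n't") for t in tokens)
    past = any(t in PAST_MARKERS for t in tokens)
    return [] if (negated or past) else hits


if __name__ == "__main__":
    for s in ["I want to call Nick.",
              "I am not going to call Nick.",
              "Yesterday I was going to call Nick."]:
        print(s, "->", find_alerts(s))
```

Suppressing the whole sentence as soon as one stop-alert marker appears is deliberately coarse; a parser-based version could scope negation and tense to the verb that carries the alert.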

7 “Maybe I can find a book about it in the library”
Ronald is moving towards the book shelves

8 Decision making
If (Ronald) [wants to send] {email to Nick} & (Ronald [is moving to] {the computer} | He [is close to] {the computer})
then open the mail client with the “to” field filled with nick@uclouvain.be
If (Ronald) [can] find {book} [about] {it} [in] {the library} & (Ronald [is moving to] {the library})
then advise: “There is a book about French wines on the first shelf.”
If (Ronald) [can] find {book} [about] {it} [in] {the library} & (Ronald [is moving to] {the computer})
then open a web search website and put the keyword in the search field.
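The three rules on this slide can be written as a small rule function that combines one linguistic meaning with one movement meaning. This is a sketch under stated assumptions rather than the project's fusion mechanism (the slides name Protégé and Jena for the knowledge base): the `Speech` and `Movement` record shapes, the exact strings compared, and the `decide` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Speech:
    """Linguistic meaning (hypothetical shape): who expressed which intent."""
    person: str
    verb: str
    obj: str


@dataclass(frozen=True)
class Movement:
    """Movement meaning (hypothetical shape): who relates how to which anchor object."""
    person: str
    relation: str
    target: str


def decide(speech: Speech, movement: Movement) -> Optional[str]:
    """Return the advice of the first matching rule from the slide, else None."""
    if speech.person != movement.person:
        return None  # both observations must concern the same person
    if speech.verb == "wants to send" and speech.obj == "email to Nick":
        if (movement.target == "the computer"
                and movement.relation in ("is moving to", "is close to")):
            return "open the mail client with the 'to' field filled with nick@uclouvain.be"
    if speech.verb == "can find" and speech.obj == "book":
        if movement.relation == "is moving to" and movement.target == "the library":
            return "There is a book about French wines on the first shelf."
        if movement.relation == "is moving to" and movement.target == "the computer":
            return "open a web search website and put the keyword in the search field"
    return None


if __name__ == "__main__":
    said = Speech("Ronald", "can find", "book")
    seen = Movement("Ronald", "is moving to", "the library")
    print(decide(said, seen))
```

The same utterance leads to different advice depending on where the speaker is heading, which is the point of fusing the two modalities before deciding.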

9 Achievements
- spatial relationships (based on the fixed “anchor” objects in the room)
- semantic fusion of events not coinciding in time
- good results in speaker identification: synchronisation between image and speech identification
- an open framework to manage fusion between two (our case) or more modalities was created during the project and will be enhanced further
- each component can run on a separate machine thanks to the distribution mechanism, which exchanges data over a TCP/IP network
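One simple way to fuse events that do not coincide in time is to pair each speech event with the nearest video event about the same person inside a tolerance window. The slides do not say how the project handled timing, so the sketch below is purely illustrative: the `Event` record, the `pair_events` function, and the 10-second default window are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One observation from a modality (hypothetical shape)."""
    modality: str  # "speech" or "video"
    person: str
    meaning: str
    t: float       # timestamp in seconds


def pair_events(events: list[Event], window: float = 10.0) -> list[tuple[Event, Event]]:
    """Pair each speech event with the closest-in-time video event of the same person.

    A pair is formed only if the two timestamps differ by at most `window` seconds
    (an assumed tolerance, not a value from the slides).
    """
    speech = sorted((e for e in events if e.modality == "speech"), key=lambda e: e.t)
    video = [e for e in events if e.modality == "video"]
    pairs = []
    for s in speech:
        candidates = [v for v in video
                      if v.person == s.person and abs(v.t - s.t) <= window]
        if candidates:
            nearest = min(candidates, key=lambda v: abs(v.t - s.t))
            pairs.append((s, nearest))
    return pairs


if __name__ == "__main__":
    observed = [
        Event("speech", "Ronald", "can find book in the library", 0.0),
        Event("video", "Ronald", "is moving to the library", 4.0),
        Event("video", "Beto", "is close to the computer", 1.0),
    ]
    for s, v in pair_events(observed):
        print(f"{s.person}: '{s.meaning}' + '{v.meaning}'")
```

A wider window tolerates longer gaps between saying and doing, at the cost of more spurious pairings.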

10 Future work
- implement effective learning
- efficient decision making even from information fragments
- spatial relationships relative to moving people
- 3D video analysis
- detection of orientation of the people in the scene
- eye gaze tracking
- recognition of various types of gestures
- dealing with natural language redundancy (repeating the same idea in different words)

11 Further development of results
- integration on the OpenInterface platform (openinterface.org)
- create an open-source community around the project to
  - gain ideas and contributions from outside
  - have new modalities to fuse
- create a website, a forum, a mailing list

