eNTERFACE 08 Project 2 “multimodal high-level data integration” Mid-term presentation August 19th, 2008.

Presentation transcript:

eNTERFACE 08 Project 2 “multimodal high-level data integration” Mid-term presentation August 19th, 2008

Team
Olga Vybornova (Université catholique de Louvain, UCL-TELE, Belgium)
Hildeberto Mendonça (Université catholique de Louvain, UCL-TELE, Belgium)
Ao Shen (University of Birmingham, UK)
Daniel Neiberg (TMH/CTT, KTH Royal Institute of Technology, Sweden)
David Antonio Gomez Jauregui (TELECOM and Management SudParis, France)

Project objectives
To augment and improve the previous work and look for new methods of data fusion.
To resolve the problem of distinguishing between data from different modalities that should be fused and data that should not be fused but analyzed separately, and to implement a technique for doing so.
To explore and employ a context-aware cognitive architecture for decision-making purposes.

Background - Multimodality
A set of variables describing states of the world (user's input, an object, an event, behavior, etc.) represented in different media and through different information channels.
Goal of data fusion: the result of the fusion (merging semantic content from multiple streams) should give an efficient joint interpretation of the multimodal behavior of the user(s), so as to provide effective and advanced interaction.

[Architecture diagram. Components: Speech Recognizer, Syntactic Analyzer, Semantic Analyzer, Video Analyzer, Human Behavior Analyzer, Knowledge Base, Fusion Mechanism. Data labels on the speech side: Audio Stream, Sound Waves, Recognized String, Syntactic Triple, Linguistic meanings. Data labels on the video side: Video Stream, Sequence of Images, Movements Coordinates, Movements Meanings. Output: Advise People.]

[Architecture diagram with the tools used. Components: Sphinx-4, OpenCV, C&C Tools Parser, C&C Tools Boxer, Protégé, Jena, Human Behavior Analyzer, Fusion Mechanism. Data labels on the speech side: Audio Stream, Sound Waves, Recognized String, Syntax Analysis, Semantic Validation, Linguistic meanings. Data labels on the video side: Video Stream, Sequence of Images, Movements Coordinates, Movements Meanings. Output: Advise People.]

Integration
All tools are integrated through socket communication, with the C++ and Java components interoperating normally.
The data interchange format is XML: verifiable, easy data identification, easy data compatibility, low cost of manipulation, and XML processing on demand.
Main issues: transparency, extensibility and customization.
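The socket-and-XML integration described on this slide can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual protocol: the element and attribute names (`message`, `modality`, `hypothesis`, `rank`), the 4-byte length prefix, and the use of Python in place of the C++ and Java components are all hypothetical.

```python
import socket
import struct
import xml.etree.ElementTree as ET


def build_message(modality, hypotheses):
    """Serialize one modality's output as a small XML document (hypothetical schema)."""
    root = ET.Element("message", {"modality": modality})
    for rank, text in enumerate(hypotheses, start=1):
        hyp = ET.SubElement(root, "hypothesis", {"rank": str(rank)})
        hyp.text = text
    return ET.tostring(root, encoding="utf-8")


def send_message(sock, payload):
    """Prefix the payload with its length so the receiver knows where it ends."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed before the message was complete")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Receive one length-prefixed XML message and parse it into an element tree."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return ET.fromstring(recv_exact(sock, length))


def demo():
    # socketpair() stands in for two separate components connected over TCP.
    sender, receiver = socket.socketpair()
    try:
        send_message(sender, build_message("speech", ["hello nick", "hello mick"]))
        root = recv_message(receiver)
        return root.get("modality"), [h.text for h in root.findall("hypothesis")]
    finally:
        sender.close()
        receiver.close()
```

Because the payload is plain XML, a component written in any language can parse it with a standard parser, which is one way to read the "easy data compatibility" point on the slide.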

Speech Recognition
Sphinx-4, integrated into the system
Fine-tuned for the maximum length of n-best lists
Two language models created:
Scenario-dependent: 3-grams, 150 words, 86.9% accuracy, speed 0.94 × real time
Wall Street Journal + scenarios: 3-grams, 5000 words, 68.6% accuracy, speed 3.19 × real time
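The slide reports accuracy and speed figures without saying how they were computed. The sketch below shows one common way to obtain such numbers, assuming that accuracy means word accuracy (one minus the word error rate, from a word-level edit distance) and that "× real time" means processing time divided by audio duration. The function names are hypothetical and are not part of Sphinx-4.

```python
def word_errors(reference, hypothesis):
    """Minimum number of word substitutions, deletions and insertions (Levenshtein)."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Word accuracy = 1 - (errors / number of reference words)."""
    n = len(reference.split())
    if n == 0:
        raise ValueError("reference transcript is empty")
    return 1.0 - word_errors(reference, hypothesis) / n


def real_time_factor(processing_seconds, audio_seconds):
    """Values below 1.0 mean the recognizer runs faster than real time."""
    return processing_seconds / audio_seconds
```

Under this reading, the 150-word model (0.94 × real time) keeps up with live audio, while the 5000-word model (3.19 × real time) takes roughly three times as long as the audio lasts.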

Speaker Identification
Standard GMM-based speaker identification system
Developed in Matlab
[Figure: results on a 2-person development set as a function of the number of Gaussians]
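The decision rule of a GMM-based speaker identification system can be sketched as below: each speaker has a Gaussian mixture model, and the speaker whose model assigns the highest log-likelihood to the observed feature frames is chosen. This is only an illustration in Python (the project's system was written in Matlab). It assumes diagonal covariances and already-trained models, omits training entirely, and uses made-up toy parameters and 2-dimensional features.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Sum over frames of log(sum_k w_k * N(x | mean_k, var_k))."""
    total = 0.0
    for x in frames:
        terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        top = max(terms)  # log-sum-exp trick for numerical stability
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total


def identify_speaker(frames, speaker_models):
    """Return the name of the model with the highest log-likelihood for the frames."""
    return max(
        speaker_models,
        key=lambda name: gmm_log_likelihood(frames, speaker_models[name]),
    )


# Hypothetical two-speaker models: lists of (weight, mean, variance) components.
MODELS = {
    "speaker_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "speaker_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}
```

In a real system the frames would be acoustic feature vectors and the number of Gaussians per model would be a tuning parameter, which is the quantity the slide's results are plotted against.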

Speech Recognition Output
yesterday i received an from nick
yesterday i received an from nick
yesterday i received an from nick to
yesterday i received an from nick for

Syntax and Semantics

Image Processing
OpenCV library (open source)
Motion history to calculate the motion direction
Template matching to identify objects in the scene
Gaussian probability distribution to model the color of clothes
Background subtraction to detect the foreground
Blob identification to track people in the scene
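Two of the steps listed on this slide, background subtraction and blob identification, can be illustrated on a tiny grayscale grid. The sketch below is pure Python and does not use the OpenCV calls the project relied on. The threshold, the minimum blob area, the 4-connectivity rule and the function names are all assumptions chosen for illustration.

```python
from collections import deque


def foreground_mask(frame, background, threshold=30):
    """Mark pixels whose intensity differs from the background by more than threshold."""
    return [
        [abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def find_blobs(mask, min_area=2):
    """Group 4-connected foreground pixels; return a bounding box and area per blob."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, pixels = deque([(y, x)]), []
            while queue:  # breadth-first flood fill of one connected region
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_area:  # drop isolated noise pixels
                ys = [p[0] for p in pixels]
                xs = [p[1] for p in pixels]
                blobs.append({"box": (min(xs), min(ys), max(xs), max(ys)),
                              "area": len(pixels)})
    return blobs
```

Tracking people across frames would then amount to matching each frame's blobs to those of the previous frame, for example by bounding-box overlap or by the clothing color model mentioned on the slide.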

Image Processing Output

Ontology
Restricted-domain ontology: structure and its instantiation
Pattern situations (semantic frames)
User profile: a priori collected information about users (preferences, social relationships, etc.) and dynamically obtained data
Protégé used to create and edit the ontology
Jena used to manage the ontology data

Project schedule
Overall progress: 65%
WP1: Workshop preparation (done)
WP2: Integration of multimodal components (done)
WP3: Multimodal fusion implementation (running)
WP4: Scenario implementation and reporting (to do)
Strategic changes to achieve the goal: everybody focuses on the fusion mechanism; improving the individual modalities gets lower priority; each risky task has an associated plan B that is less time-consuming but also less robust.

Next Steps
Integration of WordNet into the ontology
Rules to process human behavior
Mapping the semantic analysis to the ontology
Fusion mechanism