Institute of Informatics and Telecommunications – NCSR “Demokritos” Bootstrapping ontology evolution with multimedia information extraction C.D. Spyropoulos,


1 Institute of Informatics and Telecommunications – NCSR “Demokritos” Bootstrapping ontology evolution with multimedia information extraction C.D. Spyropoulos, G. Paliouras, V. Karkaletsis, D. Kosmopoulos, I. Pratikakis, S. Perantonis, B. Gatos

2 The facts
• STRP, IST-2004-2.4.7 “Semantic-based Knowledge and Content Systems”
• Start: March 1, 2006, End: February 28, 2009
• Budget: 5.075.678 Euro, Funding: 3.150.000 Euro
• Consortium
  – Inst. of Informatics & Telecommunications, NCSR “Demokritos” (SKEL & CIL), Greece (Coordinator)
  – Fraunhofer Institute for Media Communication (NetMedia), Germany
  – Dip. di Informatica e Comunicazione, University of Milano (ISLab), Italy
  – Inst. of Telematics and Informatics CERTH (IPL), Greece
  – Hamburg University of Technology (STS), Germany
  – Tele Atlas, The Netherlands
• More than 30 people already active in the project
• Project portal: http://www.boemie.org/

3 Objectives  Providing technology to represent and evolve domain-specific multimedia ontologies.  Moving from low-level, general-purpose, single-modality feature extraction towards semantic, multimedia analysis.  Robust and scalable ontology-driven multimedia content extraction through ontology evolution.

4 Approach  Driven by domain-specific multimedia ontologies, BOEMIE information extraction systems will be able to identify high-level semantic features in image, video, audio and text and fuse these features for optimal extraction.  The ontologies will be continuously populated and enriched using the extracted semantic content.  This is a bootstrapping process, since the enriched ontologies will in turn be used to drive the multimedia information extraction system.
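The bootstrapping loop described on this slide can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the BOEMIE system: the `Ontology` class, the `extract` and `bootstrap` functions, and the toy rule that a name co-occurring with a known instance is proposed as a new instance of the same concept.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy stand-in for a domain ontology: concept name -> known instance names."""
    instances: dict[str, set[str]] = field(default_factory=dict)


def extract(document: list[str], ontology: Ontology) -> dict[str, set[str]]:
    """Hypothetical ontology-driven extractor.

    A document is a list of already segmented names. If it mentions an
    instance the ontology knows for some concept, every other name in the
    document is proposed as a new instance of that concept.
    """
    names = set(document)
    found: dict[str, set[str]] = {}
    for concept, known in ontology.instances.items():
        if known & names:
            found[concept] = names - known
    return found


def bootstrap(ontology: Ontology, corpus: list[list[str]], max_rounds: int = 10) -> int:
    """Alternate extraction and population until nothing new is found.

    Returns the number of rounds that were run (including the final round
    in which no new instance was added).
    """
    for round_no in range(1, max_rounds + 1):
        added = 0
        for document in corpus:
            for concept, names in extract(document, ontology).items():
                new_names = names - ontology.instances[concept]
                ontology.instances[concept] |= new_names
                added += len(new_names)
        if added == 0:
            return round_no
    return max_rounds
```

Starting from a single seed instance, each round lets the extractor recognise documents it could not use in the previous round, which is the sense in which the enriched ontology "drives" further extraction. A real system would replace the co-occurrence rule with trained, modality-specific extractors and would validate proposals before adding them.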

5 The end user’s view  The user wants to see the marathon of the 2006 athletics world championship in Athens. She wants to retrieve images and video of participating athletes in previous marathons. –The system has extracted the participating athletes’ names from official Web sites. –It has also populated the marathon ontology with images and video of past events, relating them to the athletes through fusion with audio and text.

6 The end user’s view  The user also wants to select a good view of the event, by retrieving images and video, associated with landmarks of the city. –The system has identified landmarks in visual information about past marathons in Athens and has thus georeferenced the content. –Reasoning can associate the city landmarks with the event and the related content.

7 The service provider’s view
[Architecture diagram. Labels: Content Collection (crawlers, spiders, etc.); Multimedia Content; Semantics Extraction Toolkit (Visual Extraction Tools, Text Extraction Tools, Audio Extraction Tools, Information Fusion Tools); Semantics Extraction from visual content, from non-visual content and from fused content; Semantics Extraction Results; Ontology Evolution Toolkit (Learning Tools, Reasoning Engine, Matching Tools, Ontology Management Tool); Ontology Evolution: Initial Ontology, Population & Enrichment, Intermediate Ontology, Coordination, Evolved Ontology; Other Ontologies; Ontology Initialization and Content Management Tool; Events Database; Maps Database; Map Annotation Interface.]

8 The service provider’s view
Customize and use the system:
– Initialization: collecting, extending and merging ontologies for domains
– Training: collecting a training data set and using it to train the semantics extraction and ontology evolution tools
– Information gathering: continuous collection of content from various sources
– Semantics extraction: applying the trained tools to the incoming stream of content
– Ontology evolution: populating and enriching the ontologies using the results of the extraction task
– Information positioning: linking the extracted data to the map data

9 Semantics extraction  No single modality is powerful enough to support robust and large-scale extraction.  Emphasis on fusion of multiple modalities, using reasoning and handling uncertainty.  Contribution to the state of the art in visual content analysis, due to its richness and the difficulty of extracting semantics.  Non-visual content will provide supportive evidence, to improve precision.
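As a rough illustration of fusing uncertain evidence from several modalities, the sketch below combines per-modality confidence scores for the same hypothesis with a noisy-OR rule. The noisy-OR rule, the function name `fuse_noisy_or` and the modality labels are assumptions chosen for simplicity; the slides do not say which fusion method the project uses, and the rule treats the modalities as independent, which is a strong assumption.

```python
from math import prod


def fuse_noisy_or(confidences: dict[str, float]) -> float:
    """Combine per-modality confidences in [0, 1] for one hypothesis.

    Noisy-OR: the hypothesis is rejected only if every modality
    independently fails to support it, so
    fused = 1 - product(1 - p_i).
    """
    for modality, p in confidences.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"confidence for {modality!r} must be in [0, 1], got {p}")
    return 1.0 - prod(1.0 - p for p in confidences.values())


# Hypothetical scores for "this image shows athlete X":
# a moderately confident visual detector plus supporting caption text.
fused = fuse_noisy_or({"visual": 0.6, "text": 0.5})
```

With these inputs the fused score is higher than either single score, which mirrors the idea of non-visual content acting as supportive evidence for a visual hypothesis. A system aiming at precision would typically also require a minimum fused score before accepting the hypothesis.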

10 Multimedia semantic model  A multimedia ontology describes the structure of multimedia content and visual characteristics of content objects in terms of low-level features.  One or more domain ontologies, e.g. about athletics.  A geographic ontology, e.g. about landmarks.  An event ontology, e.g. about athletic events.  Potential contribution: –Uncertainty in concept descriptions. –Spatial and temporal relations.

11 Ontology evolution  Ontology population and enrichment, i.e., addition of concepts, relations, properties and instances.  Coordination of homogeneous ontologies (same domain) and heterogeneous ontologies (e.g. domain and multimedia ontologies).  Potential contribution: –Ontology population from multimedia content. –Combination of different types of reasoning for enrichment and coordination. –Matching, coordination and versioning of the integrated semantic model.
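The difference between enrichment (adding concepts) and population (adding instances) can be made concrete with a small data-structure sketch. The class and method names (`EvolvingOntology`, `enrich`, `populate`, `instances_of`) are hypothetical, and the only "reasoning" shown is inheritance of instances along a subclass hierarchy; real ontologies in this setting would be expressed in a description-logic language and handled by a reasoning engine.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    name: str
    parent: Optional[str] = None               # name of the super-concept, if any
    instances: set[str] = field(default_factory=set)


class EvolvingOntology:
    """Toy ontology supporting enrichment and population."""

    def __init__(self) -> None:
        self.concepts: dict[str, Concept] = {}

    def enrich(self, name: str, parent: Optional[str] = None) -> None:
        """Enrichment: add a new concept, optionally below an existing one."""
        if name in self.concepts:
            raise ValueError(f"concept {name!r} already exists")
        if parent is not None and parent not in self.concepts:
            raise KeyError(f"unknown parent concept {parent!r}")
        self.concepts[name] = Concept(name, parent)

    def populate(self, concept: str, instance: str) -> None:
        """Population: add an instance to an existing concept."""
        if concept not in self.concepts:
            raise KeyError(f"unknown concept {concept!r}")
        self.concepts[concept].instances.add(instance)

    def instances_of(self, name: str) -> set[str]:
        """Instances of a concept, including those of its sub-concepts."""
        result = set(self.concepts[name].instances)
        for child in self.concepts.values():
            if child.parent == name:
                result |= self.instances_of(child.name)
        return result
```

Because a parent must exist before a child is added and duplicate names are rejected, the hierarchy cannot contain cycles, so the recursive lookup always terminates.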

12 Open issues: semantics extraction  Annotating training data for image and video.  Segment-level and document-level annotation and tracking.  Modeling of modality-specific domain concepts.  Use of entities extracted by one modality in the analysis of another.  Synchronization of different modalities.  The role of the semantic model in fusion and in single-modality analysis.  Support for concept and relation discovery from visual content.  Scalability!

13 Open issues: semantic model  Do we need to go beyond description logics, e.g. because they cannot support temporal reasoning in event detection?  What type of uncertainty should be modeled, and how is it going to be incorporated?  Combination of ontologies and reasoning with specialized databases, e.g. geographic.  Identification of “detectable” concepts for various modalities.

14 Open issues: ontology evolution  Combination of different types of reasoning in ontology learning.  Incremental reasoning services to support evolution.  Evaluation of ontology enrichment.  Combination of evidence (e.g. from instances, lexical, etc.) for matching.  Comparison of ontology versions.  Minimization of human involvement!
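To make the idea of combining evidence for matching more tangible, the sketch below scores a pair of concepts from two ontologies using lexical evidence (name similarity) and instance evidence (overlap of their instance sets). The function names, the equal default weights and the choice of `difflib.SequenceMatcher` and Jaccard overlap are illustrative assumptions only; how to combine such evidence well is exactly the open issue listed on this slide.

```python
from difflib import SequenceMatcher


def lexical_similarity(name_a: str, name_b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()


def instance_overlap(instances_a: set[str], instances_b: set[str]) -> float:
    """Jaccard overlap of two instance sets; 0.0 when both are empty."""
    union = instances_a | instances_b
    if not union:
        return 0.0
    return len(instances_a & instances_b) / len(union)


def match_score(
    name_a: str,
    instances_a: set[str],
    name_b: str,
    instances_b: set[str],
    w_lexical: float = 0.5,
    w_instance: float = 0.5,
) -> float:
    """Weighted combination of lexical and instance evidence."""
    return (
        w_lexical * lexical_similarity(name_a, name_b)
        + w_instance * instance_overlap(instances_a, instances_b)
    )
```

A matcher built on such a score would compare every candidate pair and keep those above a threshold, leaving borderline pairs for a human to confirm; the amount of such confirmation is what "minimization of human involvement" aims to reduce.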

15 Open issues: system integration  Implementation of the bootstrapping process, integrating semantic extraction and ontology evolution, through the semantic model.  Crawling for content collection and content quality assessment.  Distributed storage and indexing.  Demonstration of added value for the end user!

16 BOEMIE workshop  BOEMIE 2006: Workshop on Ontology Evolution and Multimedia Information Extraction (http://www.boemie.org/boemie2006), October 6, 2006, Podebrady, Czech Republic, at EKAW 2006, the 15th International Conference on Knowledge Engineering and Knowledge Management (http://ekaw.vse.cz/)

