
1 Cross-Media Indexing in the Reveal-This System
Murat Yakici, Fabio Crestani
Dept. of Computer & Information Sciences, University of Strathclyde, Glasgow, UK
LREC – Workshop on Crossing media for Improved Information Access, Genova, Italy, 23 May 2006

2 Overview
Application Scenario
Reveal-This Project
Cross-Media Indexing
–Process Model
–Information Model
–Indexing Model
Evaluation
Future work

3 Application Scenario
[Diagram labels: Persistency, MPEG-7 Coded, …]
"How about the Armani jacket that I saw on Fashion TV?"

4 The Reveal-This (R-T) Project
Aims at developing content programming technology able to:
1. capture
2. semantically index
3. categorise
4. cross-link
… multiplatform, multimedia and multilingual digital content …

5 The Reveal-This Project (2)
… as well as provide the system user with:
1. semantic search
2. retrieval
3. summarisation
4. translation
… functionalities
Clearly an ambitious project!

6 Digital Content in R-T
Digital content is
–Distributed over different platforms
–Repurposed and delivered to diverse devices
–Available in a range of media types
–Rapidly consumed (on-demand provision)
Key issues
–Managing meta-data
–Managing data

7 Reveal-This Architecture

8 Cross-Media Indexing Component (CMIC)
Addresses the 2nd scientific objective and, in part, the 4th
Part of the Cross-Media Indexing and Analysis Subsystem
Media meta-data integration and indexing service

9 CMIC Overview
Builds relationships among concepts extracted from different processors
–such as video, speech and text analysis
How do we do it?

10 CMIC Overview (2)
Two interpretations of “media”
1. Different sources on the same topic
2. A single source, but different media types, on the same topic

11 CMIC Features
Transform source XML (feature) streams into an internal cross-media representation (MPEG-7) so that comparisons are made in the same similarity space
Capture relations across media
Cross-link media (within a document, as cross-media)
Augment digital objects with semantic information (MPEG-7)
Store meta-data and relations
Online indexing and retrieval
Support for various languages: English, Greek, French
Enable push and pull of events

12 CMIC Process Model

13 CMIC Information Model
Structures and patterns for representing digital content at any level
–What is the unit of information processing across the R-T system?
–Are there any common patterns out there?
–What are the emerging standards?
We adopted MPEG-7 as the information model

14 MPEG-7 Overview
Standard is finalised
“How to describe content”
Provides a diverse and large set of description elements
Tools for
–Multiplexing of descriptions
–Synchronization of descriptions with content

15 MPEG-7
How to describe content
–A set of description schemes (DS) and descriptors (D)
–A language to specify description schemes (Description Definition Language, DDL)
–A scheme for binary coding of the descriptions
Consider DSs as a library of descriptions
–Feel free to pick and use an appropriate subset of relevant DSs depending on your requirements.

16 Task of Cross-Media Indexing
[Diagram labels: Subtitles, Face 1, Face 2, Speaker 1, Speaker 2, Relevant Segment. And others…: Studio Setting, Transitions, Zoom in, Closed captions, Noise, Music, Transcription, …]

17 Semantic Indexing
[Diagram labels: Subtitles, Face 1, Face 2, Speaker 1, Speaker 2, Relevant Segment; Person 1, Person 2, Sea, Boat, Sailing]

18 Challenge
Signal and Semantic Gap problem
The task entails dealing with
–uncertainty
–imprecision
–inconsistency
from each single media analysis module

19 Indexing Models supported by CMIC
–Tf = term frequency
–Tf*Idf = gives less importance to a term if it appears in a high number of stories
–Modal Tf = the term frequency provided by each processing module (such as image analysis, text processing, etc.) is incorporated into the previous approaches
–Dempster-Shafer Multi-Evidence Approach
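The first three models can be illustrated with a minimal Python sketch. It is not the CMIC implementation: the function names, the natural-log Idf, and the per-modality weights in `modal_tf` are assumptions made for illustration, since the slides do not specify how the per-module frequencies are combined.

```python
import math
from collections import Counter

def tf(terms):
    """Raw term frequency: how often each term occurs in one story."""
    return Counter(terms)

def idf(stories):
    """Inverse document frequency over a list of stories (each a list of terms)."""
    n = len(stories)
    df = Counter()
    for terms in stories:
        df.update(set(terms))  # count each term once per story
    return {t: math.log(n / d) for t, d in df.items()}

def tf_idf(terms, idf_weights):
    """Tf*Idf: terms that occur in many stories get a lower weight."""
    return {t: f * idf_weights.get(t, 0.0) for t, f in tf(terms).items()}

def modal_tf(modal_terms, modality_weights=None):
    """Sum the term frequencies reported by each processing module.

    modal_terms maps a modality name to its list of extracted terms.
    modality_weights is a hypothetical per-modality weight (default 1.0).
    """
    combined = Counter()
    for modality, terms in modal_terms.items():
        w = 1.0 if modality_weights is None else modality_weights.get(modality, 1.0)
        for t, f in tf(terms).items():
            combined[t] += w * f
    return dict(combined)

# Two toy stories; "sea" occurs in both, so its Idf (and Tf*Idf) is zero.
stories = [["boat", "sea", "sailing", "boat"], ["jacket", "fashion", "sea"]]
weights = idf(stories)
scores = tf_idf(stories[0], weights)

# Per-modality terms for one story (modality names are illustrative).
combined = modal_tf({"speech": ["boat", "sea"], "image": ["boat", "person"]})
```

The combined counts from `modal_tf` could then be fed into the same Tf*Idf weighting, which is one reading of "incorporated into the previous approaches".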

20 Dempster-Shafer (1)
Dempster-Shafer combines two or more bodies of evidence defined within the same frame of discernment into one body of evidence.
Every modality individually gives support for a single story: a term’s existence in one modality is counted as evidence supporting the topical similarity hypothesis.
Each processing module is treated as a probability function, also called a Source of Evidence or Basic Probability Assignment (BPA).
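As a rough illustration of how two sources of evidence are merged, the sketch below applies Dempster's rule of combination to two BPAs over the same frame of discernment. The frame (`similar` / `not_similar`), the module names and all mass values are hypothetical; the slides do not give the BPAs used by CMIC.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule for two BPAs over the same frame of discernment.

    Each BPA maps a frozenset (focal element) to its mass; masses sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass falling on the empty set
    if 1.0 - conflict <= 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalise so the combined masses sum to 1 again.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

SIMILAR = frozenset({"similar"})
NOT_SIMILAR = frozenset({"not_similar"})
FRAME = SIMILAR | NOT_SIMILAR  # mass on the whole frame models ignorance

# Hypothetical evidence from two processing modules for one story.
speech = {SIMILAR: 0.6, FRAME: 0.4}
image = {SIMILAR: 0.5, NOT_SIMILAR: 0.2, FRAME: 0.3}

fused = combine(speech, image)
```

With these toy masses the conflict is 0.12, and the fused support for `similar` is 0.68 / 0.88, higher than either module gives on its own.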

21 Dempster-Shafer (2)
Combination
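The formula shown on this slide is not preserved in the transcript. For reference, the standard form of Dempster's rule of combination for two BPAs $m_1$ and $m_2$ over the same frame of discernment is given below; the symbols $A$, $B$, $C$ and $K$ are the usual textbook notation and are not taken from the slide.

```latex
m_{1,2}(\emptyset) = 0, \qquad
m_{1,2}(A) = \frac{1}{1-K} \sum_{B \cap C = A} m_1(B)\, m_2(C)
\quad (A \neq \emptyset), \qquad
K = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C)
```

Here $K$ measures the conflict between the two sources, and the rule is defined only when $K < 1$.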

22 Initial experience
Accuracy of processing units
–In general
Granularity of Topics and Categories
–Topics and Categories do not describe content at the same level
–Terms in Categories/Topics do not appear in the text (inconsistencies between values received)
Confidence scores & Ranking
Faces are detected but…
Indexing in other languages (Greek and French)
Indexing Model
Indexing time depends on the indexing model

23 Evaluation
Task-oriented user experiment.
–The users are given a test video sequence and then asked to detect and recognise faces, transcribe text and describe certain aspects. Their descriptions are regarded as confidence level 1.
–The tasks are introduced. The users try to find the information needed to accomplish each individual task and to find the most relevant segments.
–Compare this with CMIC’s output.
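The slides do not say how the comparison is scored. One common option, shown here purely as an assumption, is precision and recall of the segments CMIC returns against the segments the users judged relevant. The segment identifiers and the function name are hypothetical.

```python
def precision_recall(user_segments, system_segments):
    """Score system output against user-judged relevant segments."""
    user, system = set(user_segments), set(system_segments)
    hits = len(user & system)  # segments both the users and the system selected
    precision = hits / len(system) if system else 0.0
    recall = hits / len(user) if user else 0.0
    return precision, recall

# Hypothetical segment ids for one task.
user_relevant = ["s1", "s3", "s4"]
cmic_output = ["s1", "s2", "s3"]

p, r = precision_recall(user_relevant, cmic_output)
```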

24 Evaluation (2)
Data annotation by experts: relevance assessments
A three-hour multimedia test collection
Covering politics, travel and news
In English, French and Greek

25 Towards complex models
After we finish testing the simpler indexing models, we will explore more complex models:
–Bayesian Networks
–Kernel Canonical Correlation Analysis
–Gaussian Mixture Models

26 Future work
Research
–Add and improve indexing models
–Benchmark performance with current version on updated data sets
–Evaluation with annotated data set
Software engineering tasks
–Add management and administration services
–Use in push and pull services
–Adapt to user profile
–Integrate with other services (summarisation, etc.)

27 Thank you
Questions?

