Integrating Structure and Semantics into Audio-visual Documents
Tuesday 21st of October, 2003
Raphaël Troncy
2nd International Semantic Web Conference (ISWC2003)


10/21/2003 Raphaël Troncy - ISWC'2003
Description of the AV content
A three-step process:
– identification of the content creator and the content provider: Dublin Core metadata, VRA Core categories, …
– structural decomposition into video segments corresponding to the logical structure of the program: time codes, spatial coordinates
– semantic description of these segments: controlled vocabulary, thesaurus, free-text annotation
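
The three steps can be pictured as a small record structure. This is only a minimal sketch: the class names, field names and metadata keys are hypothetical and are not taken from the talk; the time codes reuse the values quoted in the example slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Segment:
    """One video segment (step 2) together with its semantic description (step 3)."""
    start_timecode: str                                  # position in the program
    duration: str                                        # length of the segment
    keywords: List[str] = field(default_factory=list)    # controlled vocabulary / thesaurus terms
    free_text: str = ""                                  # free-text annotation


@dataclass
class AVDescription:
    """Description of one AV document."""
    metadata: Dict[str, str]                             # step 1: identification (Dublin Core style keys)
    segments: List[Segment] = field(default_factory=list)


# Placeholder metadata values; only the time codes come from the slides.
description = AVDescription(
    metadata={"dc:creator": "(content creator)", "dc:publisher": "(content provider)"},
    segments=[
        Segment(
            start_timecode="18:43:56:00",
            duration="00:09:06:00",
            keywords=["interview", "Paris-Nice"],
            free_text="Second part of the interview of Sandy CASAR.",
        )
    ],
)
```

Keeping the segment list separate from the identification metadata mirrors the split between the bibliographic step and the structural and semantic steps.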

Description of the AV content
Segmentation (describe the logical structure)
– locate and date events
Description (describe the semantics of the content)
– type each segment with an AV genre
– type each segment with a general theme
– describe the scene (who, when, where, what, …)

Example
Q: Find all AV sequences of type interview with Sandy Casar and concerning the Paris-Nice cycling race
– noisy answer: there are other sports news in the sequence
– incomplete answer: the interview was broadcast in two parts and began in a previous sequence
– the query cannot be extended!
13 [Indoor Set: 6th part] at 18:43:56:00 - 00:09:06:00. – Eurosport In studio, the second part of the interview, from Nice, of Sandy CASAR by Jean René GODART about the Paris-Nice cycling race and a few sports news with pictures commented by Alexandre BOYON and Laurent PUYAT.
Q: Find all AV sequences of type dialog sequence with a rider and concerning any cycling race with several stages

Problems
– weak use of the logical structures
– descriptions are not made for reasoning
→ make the AV descriptions accessible to automated processes
Requirements:
– express models that constrain the logical structure (e.g. identify an interview inside a report of a sports magazine)
– represent the meaning contained in this structure (e.g. a cartoon is a fiction with no real characters)
– describe semantically the content of each sequence (e.g. the Prologue is always an individual time trial, numbered stage 0)
→ Which languages are the most suitable to perform all these tasks?

"Pure" documentary approaches
General bibliographic description languages (DC, VRA)
MPEG-7: the new multimedia description language?
– three components: D (Descriptors), DS (Description Schemes) and DDL (Description Definition Language)
– structure: segment = abstract unit defined by temporal localization or masks
– semantics: entity–attribute–relation model + thesaurus for structuring the knowledge (Classification Schemes)
– tools: Videto (ZGDV), Vizard (EU-IST Project), MovieTool (© Ricoh)
Extensions
– XML Schema: adds structure without semantics (TV Anytime, Mdéfi [Tran Thuong, 2003])
– Classification Schemes: very poor expressivity (COALA [Fatemi, 2003])

KR approaches: OWL+RDF
Definition of concepts and relations
– StudioProgram ≡ and ( HomogeneousProgram (all hasPart StudioSequence) )
Definition of axioms
– HomogeneousProgram ∩ HeterogeneousProgram = ∅
→ Problem: the structure of the document (i.e. the context) is lost!
→ Let us merge the two approaches!
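
To make the definition and the disjointness axiom concrete, here is a minimal sketch that checks both over a toy set of individuals. It is an illustration only: the `Individual` class and the sample data are hypothetical, and the check is closed-world (it looks only at the parts that are listed), which is not how an OWL reasoner works.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Individual:
    """Toy individual: a name, its asserted classes and its hasPart fillers."""
    name: str
    types: Set[str]
    has_part: List["Individual"] = field(default_factory=list)


def is_studio_program(x: Individual) -> bool:
    """StudioProgram: a HomogeneousProgram whose every part is a StudioSequence."""
    return "HomogeneousProgram" in x.types and all(
        "StudioSequence" in part.types for part in x.has_part
    )


def violates_disjointness(x: Individual) -> bool:
    """True if x belongs to both HomogeneousProgram and HeterogeneousProgram."""
    return {"HomogeneousProgram", "HeterogeneousProgram"} <= x.types


# Hypothetical sample data.
news = Individual(
    "news",
    {"HomogeneousProgram"},
    [Individual("s1", {"StudioSequence"}), Individual("s2", {"StudioSequence"})],
)
magazine = Individual(
    "magazine",
    {"HeterogeneousProgram"},
    [Individual("s3", {"StudioSequence"}), Individual("r1", {"Report"})],
)
```

Here `is_studio_program(news)` holds while `is_studio_program(magazine)` does not, since the magazine is not homogeneous and one of its parts is a Report.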

General architecture

The Audio-visual Ontology
Methodology of construction [Bachimont et al., EKAW’02]:
– Conceptualization: differential principles
– Formalization: formal definitions, axioms
– Operationalization: export into a KR language
AV domain:
– Production objects (program, sequence, AV genre), Properties (theme), Persons, Technical Process (shooting, recording, post-production), Signal descriptors (audio, video), etc.
Tools:
– Conceptualization: DOE [Bachimont et al., EKAW’02]
– Formalization: OilEd [Bechhofer, KI’01]
– Languages: DAML+OIL … OWL
DOE and ontologies are available at: http://opales.ina.fr/public/ontologies/

The Audio-visual Ontology

Generate XML Schema types
OWL → XML Schema (transformation: XSLT?)
– Class → Complex type
– Sub-class → Extension
– Restriction on properties → Element of the content model
– Union of classes → Choice in the content model
Some concepts (program, sequence) extend the MPEG-7 Segment type, hence the descriptions are MPEG-7 valid
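
The mapping above can be sketched with the Python standard library. This is a rough illustration rather than the transformation used in the talk (the slide itself only suggests XSLT): the `complex_type` helper, the `…Type` naming convention and the sample class names passed to it are assumptions.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def complex_type(name, base=None, properties=(), union_of=()):
    """Build an xs:complexType for one toy class description.

    class                    -> xs:complexType
    sub-class (base)         -> xs:extension of the base type
    restriction on property  -> xs:element in the content model
    union of classes         -> xs:choice in the content model
    """
    ctype = ET.Element(f"{{{XS}}}complexType", {"name": f"{name}Type"})
    parent = ctype
    if base is not None:
        content = ET.SubElement(ctype, f"{{{XS}}}complexContent")
        parent = ET.SubElement(content, f"{{{XS}}}extension", {"base": base})
    if properties or union_of:
        # A single model group keeps the generated type valid XML Schema.
        seq = ET.SubElement(parent, f"{{{XS}}}sequence")
        for prop, range_type in properties:
            ET.SubElement(seq, f"{{{XS}}}element", {"name": prop, "type": range_type})
        if union_of:
            choice = ET.SubElement(seq, f"{{{XS}}}choice")
            for member in union_of:
                ET.SubElement(
                    choice, f"{{{XS}}}element", {"name": member, "type": f"{member}Type"}
                )
    return ctype


# Hypothetical usage: a StudioProgram type extending a Program type.
studio = complex_type(
    "StudioProgram", base="ProgramType", properties=[("hasPart", "StudioSequenceType")]
)
xsd_text = ET.tostring(studio, encoding="unicode")
```

A type that should be MPEG-7 valid would use the MPEG-7 Segment type as its `base`; the exact qualified name depends on the schema prefix in use and is left out here.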

Build description schemes for the documents
Let us look at some sports magazines:
– construction of a simple schema based on StudioSequence, Report and Interview
– a Report contains some FilmClips of Broadcast Live Sports
The schema provides the description skeleton for several sports magazines:
– Téléfoot (soccer)
– VéloClub (cycling)
– 3 Partout (multisports)

SegmenTool [French project CHAPERON]

Instantiate a document content model
... T00:24:19 PT00H00M07S ...
KB: RDF triples
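
The step from a time-coded segment to triples can be sketched as follows. The markup around the two values did not survive in this transcript, so this sketch assumes that the first value is a start time point and the second a duration; the segment identifier and the `ex:` predicates are hypothetical.

```python
import re
from datetime import timedelta

TIME_POINT = re.compile(r"T(\d{2}):(\d{2}):(\d{2})")   # e.g. T00:24:19
DURATION = re.compile(r"PT(\d+)H(\d+)M(\d+)S")         # e.g. PT00H00M07S


def _parse(pattern, text):
    """Turn an hours/minutes/seconds string into a timedelta."""
    match = pattern.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported time value: {text!r}")
    hours, minutes, seconds = (int(group) for group in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def segment_triples(segment_id, start_text, duration_text):
    """Return (subject, predicate, object) triples giving the segment bounds in seconds."""
    start = _parse(TIME_POINT, start_text)
    end = start + _parse(DURATION, duration_text)
    return [
        (segment_id, "ex:startSeconds", int(start.total_seconds())),
        (segment_id, "ex:endSeconds", int(end.total_seconds())),
    ]


triples = segment_triples("ex:segment1", "T00:24:19", "PT00H00M07S")
```

Storing the bounds as plain seconds makes it easy to test whether one segment is contained in another, which is the kind of structural question the knowledge base is meant to answer.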

The Cycling Ontology

Knowledge base population
Cycling domain text + Base of facts

Implementation of the KB
Sesame: architecture for the storage of RDF triples [Broekstra, 2002]
– supports different query languages: RQL, RDQL and SeRQL
– implements the RDFS semantics (RDF-MT engine)
BOR: reasoner for the DAML+OIL language [Simov & Jordanov, 2002]
SeBOR: integration of the two systems, done in the On-To-Knowledge EU-IST Project
– enhanced inference services are provided
– close to what an OWL DL reasoner will perform
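
To give a feel for what "implementing the RDFS semantics" buys at query time, here is a minimal sketch of two RDFS entailment rules applied to a handful of triples. It is plain Python, not the Sesame or BOR API, and the `ex:` class hierarchy (an interview being a kind of dialog sequence) is an assumption made for the illustration.

```python
def rdfs_closure(triples):
    """Apply two RDFS rules until nothing new is derived:
    subClassOf is transitive, and instances inherit the super-classes of their class."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        sub = [(s, o) for s, p, o in inferred if p == "rdfs:subClassOf"]
        new = set()
        for a, b in sub:
            for c, d in sub:
                if b == c:
                    new.add((a, "rdfs:subClassOf", d))
            for s, p, o in inferred:
                if p == "rdf:type" and o == a:
                    new.add((s, "rdf:type", b))
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred


# Hypothetical facts: sequence 13 is described as an interview.
facts = {
    ("ex:Interview", "rdfs:subClassOf", "ex:DialogSequence"),
    ("ex:DialogSequence", "rdfs:subClassOf", "ex:Sequence"),
    ("ex:seq13", "rdf:type", "ex:Interview"),
}
closure = rdfs_closure(facts)
dialog_sequences = sorted(
    s for s, p, o in closure if p == "rdf:type" and o == "ex:DialogSequence"
)
```

With the closure computed, a query for dialog sequences also returns the sequence that was only annotated as an interview, which is the kind of query extension the example slide asks for.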

Sesame+BOR interface
Demo

Conclusion
General architecture for reasoning on descriptions of video documents:
– Modeling of 2 ontologies (methodology + DOE)
– Formalization of these ontologies (OilEd, OWL)
– Creation of document schemas (extended MPEG-7)
– Creation of instances of these schemas: the structure of the descriptions (SegmenTool + XSLT transformation for creating a base of RDF triples)
– Creation of a Knowledge Base of events related to cycling races and use of an adapted reasoner (Sesame + BOR, ©AIdministrator-NL & ©OntoText-BG)

Future work
Development integration
– provide a simple interface for querying both the structure and the content of the video
– watch the AV sequences corresponding to the RDF triples returned by SeBOR
Mid-term objectives
– scalability: test the system on a large base of videos annotated with real users
– use the future OWL reasoners
Long-term objectives
– use this architecture with another domain (other than cycling)
– will we simply have to build another ontology? what do we have to adapt?
