Linguistic and Semantic Information for the Semantic Web LREC 2004, ISO Working Group on the Representation of Multimodal Semantic Information A Multi-Layered,

1 Linguistic and Semantic Information for the Semantic Web
LREC 2004, ISO Working Group on the Representation of Multimodal Semantic Information
A Multi-Layered, XML-Based Approach to the Integration of Linguistic and Semantic Annotations
Thierry Declerck, Paul Buitelaar
University of the Saarland & DFKI GmbH, Saarbrücken, Germany
This presentation also includes slides and graphics taken from three presentations at EUROLAN 2003 in Bucharest, by P. Vossen (Wordnet, EuroWordNet, Global Wordnet), A. Lenci (Computational Lexicons and the Semantic Web) and Srini Narayanan (FrameNet Meets the Semantic Web). Also included are graphics by M. Fernández-López and A. Gómez-Pérez (UPM) from Deliverable 1.2 of the Esperonto Project.

2 Overview
Motivations
Semantic Web Applications of LT
- Annotation of Web Documents with Ontology-based Metadata (Knowledge Markup)
- Ontology Learning through Text Mining from Annotated Corpora
Integration of Annotations
- Use of Different Tools
- Use of Different Knowledge Sources

3 Overview
Objectives: Integration of…
… Linguistic and Semantic Annotations
- Linguistic: e.g. PoS, Lemma, Phrase Structure
- Semantic: e.g. Concepts, Relations, Events
… Annotations from Different Resources
- e.g. Different Domains
… Annotations in Different Formats
- e.g. from Different Tools

4 Knowledge Markup and Knowledge Extraction
[Diagram labels: Text/Speech/Image-Video; Text/Speech/Media Mining; Concepts, Relations, Events; Linguistic and Media Analysis; Linguistic, Low-level Image and Semantic Annotations]

5 Annotations Projects, Tools and Resources
Projects
- MuchMore: Cross-lingual Information Retrieval, Medical Domain
- Mumis: Content-based Multimedia Retrieval, Soccer Domain
Tools and Resources
- MuchMore: Integration of Shprot (TnT, Mmorph, Chunkie) with Semantic Tagging Tools (UMLS – Medical Semantic Resource, EuroWordNet)
- Mumis: Schug, Integration of SPPC with Rule-based Chunking and Shallow Dependency Analysis, Event Structure (Mumis Soccer Ontology)

6 Annotations MuchMore
[Diagram of the annotation structure. Labels: document, sentence, text, token, chunks, chunk, gramrels, gramrel, semrels, semrel, umlsterms, umlsterm, xrceterms, xrceterm, ewnterms, ewnterm, cui, msh, sense, id, from, to, offset, code, type, term1, term2, pref, tui, pos, lemma]

7 Annotations MuchMore: Linguistic
Balint syndrom is a combination of symptoms including simultanagnosia, a disorder of spatial and object-based attention, disturbed spatial perception and representation, and optic ataxia resulting from bilateral parieto-occipital lesions.
[Annotated excerpt, markup lost in extraction: Balint syndrom is a combination of symptoms... spatial perception and representation...]

8 Annotations MuchMore: Semantic
Balint syndrom is a combination of symptoms including simultanagnosia, a disorder of spatial and object-based attention, disturbed spatial perception and representation, and optic ataxia resulting from bilateral parieto-occipital lesions.

9 Annotations Mumis
[Diagram labels: Document, Sentence, Paragraph; PP, VG, NP, NE, AP, AdvP, Subord-Clause]

10 Annotations Mumis
[Diagram labels: AP, TYPE, STRUK, AP_AGR, STRING, AP_HEAD, W]

11 Annotations Mumis
[Diagram labels: VG, TYPE, VG_SUBCAT_STEM, STRING, KLAMMER, VG_STRG, SENT_STRING, VG_TYPE, VG_AGR, STRUK, VG_HEAD, ..., W]

12 Annotations Mumis
[Diagram labels: CLAUSE, CLAUSE_TYPE, CLAUSE_SUBJ, CLAUSE_PRED_STRG, CLAUSE_PRED_SUBCAT, CLAUSE_PRED_AGR, CLAUSE_VG_LIST, CLAUSE_NP_LIST, CLAUSE_PP_LIST, CLAUSE_PP_ADJUNKT, ..., SENT_STRING, TC, W, INFL, STRING, STEM, TYPE, POS]

13 Annotations Integration
Objectives
- Integrate Linguistic and Semantic Information from the MuchMore and Mumis Annotations, e.g.
  - Enrich MuchMore: Head/Complement of Chunks, Clauses
  - Enrich Mumis: EuroWordNet, Medical Ontology
Approach
- MuchMore uses Multilayered Annotation over Indexes ('standoff')
- Introduce Mumis Annotations as Additional Layers
Problems
- Integration of Overlapping Layers (i.e. Additional Attributes)
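The standoff approach named on this slide can be sketched roughly as follows. This is a minimal Python illustration, not the MuchMore or Mumis format: the class names (`Token`, `Span`, `Document`), the layer names and the sample labels are all hypothetical. The idea it shows is that the text is tokenized once and every annotation layer only points to token index ranges, so a further layer can be added without rewriting the existing ones.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Token:
    index: int   # position of the token in the sentence
    text: str


@dataclass(frozen=True)
class Span:
    start: int   # index of the first covered token (inclusive)
    end: int     # index of the last covered token (inclusive)
    label: str   # annotation value, e.g. a chunk type or a concept name


@dataclass
class Document:
    tokens: list[Token]
    layers: dict[str, list[Span]] = field(default_factory=dict)

    def add_layer(self, name: str, spans: list[Span]) -> None:
        """Attach a new annotation layer; existing layers stay untouched."""
        if name in self.layers:
            raise ValueError(f"layer {name!r} already exists")
        for span in spans:
            if not 0 <= span.start <= span.end < len(self.tokens):
                raise ValueError(f"span {span} is outside the token range")
        self.layers[name] = list(spans)

    def covering(self, index: int) -> dict[str, list[str]]:
        """Collect, per layer, the labels of all spans covering one token."""
        found: dict[str, list[str]] = {}
        for name, spans in self.layers.items():
            labels = [s.label for s in spans if s.start <= index <= s.end]
            if labels:
                found[name] = labels
        return found


words = "a disorder of spatial attention".split()
doc = Document([Token(i, w) for i, w in enumerate(words)])
# One linguistic layer and one semantic layer (all labels are invented).
doc.add_layer("chunks", [Span(0, 1, "NP"), Span(2, 4, "PP")])
doc.add_layer("terms", [Span(3, 4, "concept:spatial-attention")])
print(doc.covering(4))
```

Overlapping layers, the problem listed on the slide, appear in this sketch as several layers returning labels for the same token; how to reconcile them is left open here.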

14 Annotations Mumis: Linguistic
Industrie, Handel und Dienstleistungen werden in der ersten Liste aufgeführt, wobei die in Klammern gesetzten Zahlen auf die Mutterfirmen hinweisen.
(Industry, trade and services are mentioned in the first list, in which numbers within brackets point to parent companies.)
…

15 Annotations Mumis: Semantic
Ein Freistoss von Christian Ziege aus 25 Metern geht über das Tor.
(A 25-meter free-kick by Christian Ziege goes over the goal.)

16 Conclusions MuchMore and MUMIS
Work in Progress
- Development of Compatibility between the Formats
- Full Integration of the Formats
Possible Future Work
- Integration of the Formats on a more Abstract Level, i.e. by Use of Data Categories as Specified by ISO/TC37/SC4
- Separating Text Data from Annotation; Multiple Pointing to Annotations
- Extension to Multimedia

17 Esperonto: Overview
[Architecture diagram. Labels: Applications; Router Agent; XML, DAML, OIL, RDF(S); Certificate; Workbench; Maintenance; Multilinguality; Reengineering; Mapping; Ontology Repository; Service; Tagger/Wrapper; Web Server Provider; Dynamic Information Provider; Static Information Provider; Multimedia Data Provider; Multilingual NL Understanding; World Wide Web; Semantic Web; Visualization; Service Provider; SemASP; Multilingual NL Generation; Portal Agent; Router; Semantic indices, Concept instances]

18 Ontologies (Classification)
Categorization by Lassila and McGuinness [Lassila and McGuinness, 2001]

19 Ontologies (Classification)
Categorization by Van Heijst and colleagues [Van Heijst et al., 1997]

20 Knowledge Architecture

21 Esperonto Knowledge Architecture

22 Abstracting over Linguistic Information in Esperonto
Ontology_1: NP
- Head: N
- Mod: {Adj*, PP?}
- Spec: {Det? PossPron}
- Type: {RefNP, ProNP, DateNP, etc.}
Ontology_2: PP
- Head: Prep
- Type: {LocPP, DatePP, etc.}
- Comp: NP
Ontology_3: Grammatical Functions
- Subject, Object, Ind. Object, NP Adjunct, PP Adjunct, etc.
Ontology_4: Dependencies
- Head, Comp, Mod, Spec
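To make the NP and PP descriptions on this slide more concrete, here is a minimal Python sketch. It is an illustration under assumptions, not the Esperonto representation: the class and field names, the `dependencies` helper, and the choice of `LocPP` and `RefNP` for the sample phrase are all hypothetical. It models an NP with head, modifiers and specifier, a PP with a preposition head and an NP complement, and derives Comp, Spec and Mod relations from them.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional, Union


@dataclass
class NP:
    head: str                                      # Head: N
    np_type: str                                   # Type: RefNP, ProNP, DateNP, ...
    mods: list[str] = field(default_factory=list)  # Mod: adjectives only in this sketch
    spec: Optional[str] = None                     # Spec: determiner or possessive pronoun


@dataclass
class PP:
    head: str      # Head: Prep
    pp_type: str   # Type: LocPP, DatePP, ...
    comp: NP       # Comp: NP


def dependencies(phrase: Union[NP, PP]) -> Iterator[tuple[str, str, str]]:
    """Yield (relation, governor, dependent) triples for a phrase."""
    if isinstance(phrase, PP):
        yield ("Comp", phrase.head, phrase.comp.head)
        yield from dependencies(phrase.comp)
    else:
        if phrase.spec is not None:
            yield ("Spec", phrase.head, phrase.spec)
        for mod in phrase.mods:
            yield ("Mod", phrase.head, mod)


# "über das Tor" (over the goal); the type labels are assumed, not from the slides.
pp = PP(head="über", pp_type="LocPP",
        comp=NP(head="Tor", np_type="RefNP", spec="das"))
for triple in dependencies(pp):
    print(triple)
```

Grammatical functions (Ontology_3) are not modelled in this sketch; they would need clause-level information that a single phrase does not carry.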

23 From WordNet to EuroWordNet
[Diagram comparing two hierarchies.
Dutch Wordnet: voorwerp {object}, lepel {spoon}, werktuig {tool}, tas {bag}, bak {box}, blok {block}, lichaam {body}
Wordnet1.5: object, natural object (an object occurring naturally), artifact/artefact (a man-made object), instrumentality, container, device, implement, tool, instrument, bag, spoon, box, block, body]

24 Relations of EWN to Top-Level Ontologies
[Diagram labels:
Language-Neutral Ontology, ReferenceOntologyClasses: BOX; ContainerProduct; SolidTangibleThing
EuroWordNet Top-Ontology: Form: Cubic; Function: Contain; Origin: Artifact; Composition: Whole
Language-Specific Wordnets: WordNet1.5 (object, container, box); Dutch Wordnet (voorwerp, doos)]

25 Framenet: Events in Syntactic Context
- events
- artifacts, built objects
- natural kinds, parts and aggregates
- institutions, belief systems, practices
- space, time, location, motion
- etc.
Let us take a commercial transaction as an example of an event. The following (partial) word list shows lexical realizations of the event:
Verbs: pay, spend, cost, buy, sell, charge
Nouns: cost, price, payment
Adjectives: expensive, cheap
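The word list on this slide can be read as a small lexicon that links words to an event type. The following Python sketch illustrates that reading only; it does not use FrameNet data or any FrameNet API. The identifier `commercial_transaction`, the function names and the sample sentence are hypothetical, and matching is on exact lowercase word forms, so inflected forms such as "paid" are not found.

```python
FRAME = "commercial_transaction"  # hypothetical identifier for the event type

# Partial word list from the slide, grouped by part of speech.
LEXICAL_UNITS = {
    "verb": {"pay", "spend", "cost", "buy", "sell", "charge"},
    "noun": {"cost", "price", "payment"},
    "adjective": {"expensive", "cheap"},
}


def build_index(units: dict[str, set[str]]) -> dict[str, set[str]]:
    """Invert the lexicon: map each word to the parts of speech it is listed under."""
    index: dict[str, set[str]] = {}
    for pos, words in units.items():
        for word in words:
            index.setdefault(word, set()).add(pos)
    return index


INDEX = build_index(LEXICAL_UNITS)


def find_evokers(tokens: list[str]) -> list[tuple[int, str, list[str]]]:
    """Return (position, word, possible parts of speech) for each listed word."""
    hits = []
    for position, token in enumerate(tokens):
        word = token.lower()
        if word in INDEX:
            hits.append((position, word, sorted(INDEX[word])))
    return hits


sentence = "They sell cheap tickets at a fair price"
for hit in find_evokers(sentence.split()):
    print(FRAME, hit)
```

Note that "cost" is listed both as a verb and as a noun, so a lookup alone cannot decide its part of speech; that is where the syntactic context in the slide title comes in.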

26 Semantic and Domain Specific Information in the Simple/Parole Framework
[Diagram labels: semantic frame; semantic relations; ontology]

27 Combining Ontological and “Linguistic Ontology” (EWN, Parole/Simple)
[Diagram labels: Torschuss (shot on goal); abzieh; URL: DFB home page/glossary]

28 Current Work
- Including FrameNet for 3 Languages.
- Including new semantic classes for Adj., Adverbs, Polarity etc.
- New improved annotation schema for syntactic/semantic annotation.
- A declarative set of mapping rules Linguistic ↔ Ontology (domain ontologies).
- The Onto-LT framework (see paper by P. Buitelaar et al. at LREC).

