Annotation for the Semantic Web Yihong Ding A PhD Research Area Background Study.

1 Annotation for the Semantic Web
Yihong Ding
A PhD Research Area Background Study

2 Introduction
Current web is designed for humans
Semantic web (next-generation web) is designed for both humans and machines
Semantic annotation
–Disclose semantic meanings of web content
–Convert current HTML web pages to machine-understandable semantic web pages

3 Outline
Historical Review
Current Status
Related Research Fields
Future Challenges

4 Semantic Annotation in Ancient Ages
No evidence of when humans started to annotate text
about 350 BC
History of semantic annotation ≈ history of ontologies

5 The First Dream of Modern Semantic Annotation
July 1945, Vannevar Bush, As We May Think, The Atlantic Monthly
Bush's dream device
–humans could acquire information from the community (World Wide Web)
–humans could contribute their own ideas to the community (Web Annotation)

6 Web Annotation before 1999 [Heck et al., 1999]
Developing better user interfaces
Improving storage structures
Increasing annotation sharability
Example systems: ComMentor, Annotator TM, Third Voice, CritLink, CoNote, and Futplex

7 Semantic Labeling before 1999
Dublin Core Metadata Standard [http://dublincore.org/]
–A set of 15 elements encapsulates data: Title, Subject, Description, Creator, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights
Superimposed Information [Delcambre et al., 2001]
[Figure: a Superimposed Layer with marks over a Base Layer of Information Source 1, Information Source 2, …, Information Source n]
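
As a rough illustration of how the 15 Dublin Core elements can label a web page, the Python sketch below renders a record as HTML meta tags using the common "DC.element" naming convention. The helper name dc_meta_tags and the sample record are assumptions made for illustration; they do not come from the slides.

```python
from html import escape

# The 15 Dublin Core elements listed on the slide, in the same order.
DC_ELEMENTS = (
    "title", "subject", "description", "creator", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def dc_meta_tags(record):
    """Render a {element: value} record as HTML <meta> tags.

    Hypothetical helper: rejects keys that are not Dublin Core elements
    and emits tags in the canonical element order.
    """
    unknown = set(record) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    return [
        f'<meta name="DC.{name}" content="{escape(record[name], quote=True)}">'
        for name in DC_ELEMENTS
        if name in record
    ]


if __name__ == "__main__":
    sample = {"creator": "Yihong Ding", "title": "Annotation for the Semantic Web"}
    for tag in dc_meta_tags(sample):
        print(tag)
```

The labels describe the document as a whole (who wrote it, what it is called), which is what distinguishes this kind of semantic labeling from annotating individual values inside the page.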

8 Status of Current Web Semantic Annotation Studies
Interactive annotation
Automatic annotation

9 Interactive Annotation Systems
Lets humans interact through machine interfaces to annotate documents
Problems
–Inconsistency
–Error-proneness
–Lack of scalability
Values
–Easy to implement
–Suitable for small-scale tasks and experiments
–Helpful to build corpora for evaluations

10 Interactive Annotation Systems
Annotea [Kahan et al., 2001]
–W3C project
–An open RDF infrastructure for shared web annotations
SHOE (Simple HTML Ontology Extensions) [Heflin et al., 2000]
–University of Maryland, College Park
–Manual annotator using SHOE ontologies

11 Automatic Annotation Systems
Common feature: use of ontologies
Typical approaches
–Annotation with automatic ontology generation (1 system)
–Annotation with automatic information extraction (6 systems)

12 Annotation with Ontology Generation
SCORE (Semantic Content Organization and Retrieval Engine) [Sheth et al., 2002]
–Voquette (now acquired by Semagix Co.), University of Georgia

13 Annotation with Automatic IE
Ont-O-Mat [Handschuh et al., 2002]
–University of Karlsruhe, Germany
MnM [Vargas-Vera et al., 2002]
–The Open University, United Kingdom
Common features
–DAML+OIL ontologies
–Supervised adaptive learning with Lazy-NLP (Amilcare)
–Annotations stored inside web pages
Differences
–MnM allows multiple ontologies at one time
–MnM also stores annotations in a knowledge base
–Ont-O-Mat uses OntoBroker both as an annotation server and as a reasoning engine

14 Annotation with Automatic IE
KIM Platform [Kiryakov et al., 2004]
–Ontotext Lab., Sirma Group, a Canadian-Bulgarian joint venture
SemTag [Dill et al., 2003]
–IBM Almaden Research Center
Similar features
–Each uses one specially designed upper-level ontology: KIM ontology vs. TAP ontology
Specific features
–KIM uses an NLP tool (GATE) to extract information
–KIM stores annotations in a separate file
–SemTag uses inductive learning to extract information
–SemTag annotated 264 million web pages and generated approximately 434 million semantic tags

15 Annotation with Automatic IE
Stony Brook Annotator [Mukherjee et al., 2003]
–Stony Brook University
–Structural analysis of the DOM tree of HTML pages
–Drawbacks: taxonomic relationships only; no generic labeling algorithm disclosed
RoadRunner Labeller [Arlotta et al., 2003]
–Università di Roma Tre and Università della Basilicata
–Automatically assigns label names based on image recognition
–Drawbacks: semantic meaning of labels unknown; difficulty in associating labels with ontologies

16 Related Research Fields
Semantic Web
Information extraction
Ontology related topics
Conceptual modeling
Logic languages
Web services

17 Semantic Web
Weaving the Web [Berners-Lee 1999], birth of the Semantic Web
The Semantic Web [Berners-Lee et al., 2001]

18 Information Extraction [Laender et al., 2002]
1. Human-guided approaches
–Wrapper languages, modeling-based tools
–No annotation examples
–Too much human involvement
2. Non-ontology-based approaches
–HTML-aware tools: StonyBrook tool [Mukherjee et al., 2003], RoadRunner Labeller [Arlotta et al., 2003]
–NLP-based tools: Ont-O-Mat [Handschuh et al., 2002], MnM [Vargas-Vera et al., 2002], KIM platform [Kiryakov et al., 2004]
–ILP-based tools: SemTag [Dill et al., 2003]
–Require extra alignment between extraction categories in wrappers and concepts in ontologies
3. Ontology-based approaches
–Ontology-based tools: my proposal
–Do not require alignment; resilient to web page layouts
–Slow in execution time
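
To make the contrast with layout-dependent wrappers concrete, here is a minimal Python sketch of the ontology-based idea: each concept carries its own value recognizer, so extracted values are labeled with ontology concepts directly and no later alignment step is needed. The two concepts, their regular expressions, and the annotate function are toy assumptions for illustration; this is not the proposal's actual mechanism.

```python
import re

# Hypothetical toy "ontology": each concept owns a recognizer for its values.
CONCEPT_RECOGNIZERS = {
    "Price": re.compile(r"\$\d{1,3}(?:,\d{3})*"),
    "Year": re.compile(r"\b(?:19|20)\d{2}\b"),
}


def annotate(text):
    """Return (concept, value) pairs in order of appearance in the text.

    The recognizers look only at the values themselves, never at the
    surrounding markup, so the page layout does not affect the result.
    """
    found = []
    for concept, pattern in CONCEPT_RECOGNIZERS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), concept, match.group()))
    return [(concept, value) for _, concept, value in sorted(found)]


if __name__ == "__main__":
    print(annotate("1999 Honda Civic, $4,500"))
    print(annotate("<tr><td>$4,500</td><td>1999</td></tr>"))
```

Both calls find the same two values with the same concept labels even though one input is plain text and the other a table row. The cost is that every recognizer runs over every page, which is one way to read the "slow in execution time" drawback.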

19 Ontology Related Topics
Ontology languages [W3C, OWL]
–Knowledge representation and reasoning
Ontology generation [Ding et al., 2002a]
–Annotation domain specification
Ontology enrichment [Parekh et al., 2004]
–Expanding the annotation domain specification
Ontology population [Alani et al., 2003]
–Annotation result output
Ontology mapping and merging [Ding et al., 2002b]
–Large-scale annotation requires large-scale ontologies
–Small-scale ontologies are less expensive to build
–Ontology mapping creates the links among small-scale ontologies
–Ontology merging fuses small-scale ontologies into a large-scale ontology
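
The difference between mapping and merging can be sketched in a few lines of Python. In this toy model an ontology is just a dictionary from concept name to a set of property names, and concepts are mapped by case-insensitive label equality. Both the representation and the matching rule are simplifying assumptions for illustration, not the techniques of the cited work.

```python
def map_concepts(onto_a, onto_b):
    """Mapping: link concepts of two small ontologies without changing them.

    Returns (concept_in_a, concept_in_b) pairs whose labels match when
    case is ignored (a deliberately naive, hypothetical matching rule).
    """
    index_b = {name.lower(): name for name in onto_b}
    return [(a, index_b[a.lower()]) for a in onto_a if a.lower() in index_b]


def merge_ontologies(onto_a, onto_b, mapping):
    """Merging: fuse two ontologies into one, unifying mapped concepts."""
    merged = {name: set(props) for name, props in onto_a.items()}
    b_to_a = {b: a for a, b in mapping}
    for name, props in onto_b.items():
        target = b_to_a.get(name, name)  # mapped concepts reuse the name from onto_a
        merged.setdefault(target, set()).update(props)
    return merged


if __name__ == "__main__":
    cars = {"Person": {"name"}, "Car": {"make"}}
    dealers = {"person": {"email"}, "Dealer": {"address"}}
    links = map_concepts(cars, dealers)
    print(links)
    print(merge_ontologies(cars, dealers, links))
```

Mapping leaves both small ontologies intact and only records the links, while merging produces a single larger ontology in which the linked concepts are fused.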

20 Conceptual Modeling
Annotation requires knowledge modeling
Ontology is a type of conceptual modeling
ER Model [Chen 1976]
–The most influential conceptual model
–Influenced the OSM model, the basis of data-extraction ontologies

21 Logic Languages
Logic foundation provides reasoning and inference power for modeling languages
Examples
–First-order logic [Smullyan 1995]
–Description logics [Brachman et al., 1984]

22 Web Services
Web services are increasingly the typical application in semantic web scenarios
Two ways of aligning web services with semantic annotation
–Web service annotation [Brodie 2003]
–Semantic annotation web service

23 Summary and Future Challenges
Annotation for the semantic web
–Enable a machine-understandable web
–Support semantic searching
–Support global web services
–Still an unsolved problem
Main technical challenges
–Direct ontology-driven annotation mechanism
–Concept disambiguation
–Automatic domain ontology generation
–Scalability

