Towards the Semantic Web 6 Generating Ontologies for the Semantic Web: OntoBuilder R.H.P. Engels and T.Ch. Lech 이 은 정 2005. 2. 17.

1 Towards the Semantic Web 6 Generating Ontologies for the Semantic Web: OntoBuilder R.H.P. Engels and T.Ch. Lech 이 은 정 2005. 2. 17

2 1. The overall
• OntoBuilder: extraction of information from texts for building knowledge bases.
• Consists of two modules, OntoExtract and OntoWrapper.

3 1.1 The overall architecture
[Figure: architecture diagram. Recoverable labels: User, Knowledge Engineer, RDF Ferret, OntoShare, Spectacle, OntoEdit, OntoExtract, OntoWrapper, Sesame, OMM, LINRO, RQL, RDF, OIL-Core, OIL-Core ontology repository, Annotated Data Repository, Data Repository (external); sample annotation labels: tel, pers05731, about, par05car.]

4 1.2 OntoExtract and OntoWrapper (1/2)
• OntoExtract: semi-automatic ontology construction from unstructured information (natural language sources).
• OntoWrapper: semi-automatic ontology construction from semi-structured and structured information sources. It extracts information from places on specific sites (e.g. names, email addresses, telephone numbers).

5 1.2 OntoExtract and OntoWrapper (2/2)
• CORPORUM depends on a linguistic analysis of a given text, comprising normalization, tokenization and part-of-speech tagging.
• Relations between concepts are defined (e.g. subClassOf relations or InstanceOf relations).
• Through semantic analysis of a domain, the tool can automatically generate relations between words within that domain.
• Visualization of such semantic structures can then be used for navigation and browsing through document sets.

6 2. OntoExtract (1/3)
• OntoExtract supports the analysis of natural language texts and generates lightweight, domain-specific ontologies from these texts (utilizing already existing knowledge from a central data repository).
• OntoExtract is able to:
  - analyse natural language,
  - provide initial ontologies,
  - refine existing ontologies,
  - find relations between key terms in documents,
  - find instances of concepts within documents,
  - find classes and sub-class relationships.

7 2. OntoExtract (2/3)
• How OntoExtract currently works:
  - parses, tokenizes and analyses text,
  - generates nodes and relations between them,
  - enhances specific aspects of the discovered knowledge item using a background repository (containing general knowledge of the world, represented in Sesame), and
  - submits the final analysis results to the RDFS server Sesame.
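The processing steps on this slide can be sketched roughly as follows. This is a toy illustration, not the actual OntoExtract or CORPORUM implementation: the stop-word list, the sentence-level co-occurrence heuristic and the `InMemoryRepository` class (a stand-in for Sesame) are all assumptions made for the example, and the background-knowledge enhancement step is left out. The `weaklyRelatedTo` predicate name is borrowed from the figure on slide 8.

```python
import re
from itertools import combinations

# Hypothetical stop-word list; a real system would rely on linguistic analysis.
STOP_WORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "on", "the", "to", "with"}


def tokenize(sentence):
    """Lower-case a sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def extract_statements(text):
    """Return a set of (subject, predicate, object) triples.

    Nodes are the non-stop-word tokens; two nodes that occur in the same
    sentence are linked by a 'weaklyRelatedTo' relation.
    """
    statements = set()
    for sentence in re.split(r"[.!?]+", text):
        nodes = sorted({t for t in tokenize(sentence) if t not in STOP_WORDS})
        for left, right in combinations(nodes, 2):
            statements.add((left, "weaklyRelatedTo", right))
    return statements


class InMemoryRepository:
    """Assumed stand-in for a repository such as Sesame; keeps triples in a set."""

    def __init__(self):
        self.triples = set()

    def submit(self, statements):
        self.triples |= set(statements)


repository = InMemoryRepository()
repository.submit(
    extract_statements("Motorcycle holidays are fun. The motorcycle is black.")
)
```

Co-occurrence within a sentence is only one possible way to propose relations; it is used here because it keeps the sketch short.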

8 [Figure: example RDF graph spanning Sesame background knowledge and Sesame domain knowledge. Recoverable labels: rdf:Class, motorcycle, holidays, MC_001, rdf:type (twice), weaklyRelatedTo, hasColor "black", hasSize "long".]
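One plausible reading of the labels on this slide is a small set of subject-predicate-object statements. The sketch below encodes that reading as plain tuples. The edge directions, and which node each property attaches to, are assumptions, since the layout of the original diagram is not recoverable from the transcript. The helper function names are hypothetical.

```python
# Triples read off the slide labels. The edge directions are an assumption.
TRIPLES = [
    ("motorcycle", "rdf:type", "rdf:Class"),
    ("motorcycle", "weaklyRelatedTo", "holidays"),
    ("MC_001", "rdf:type", "motorcycle"),
    ("MC_001", "hasColor", "black"),
    ("MC_001", "hasSize", "long"),
]


def objects(subject, predicate, triples=TRIPLES):
    """All objects of statements with the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def instances_of(cls, triples=TRIPLES):
    """All subjects that are declared to be of type `cls`."""
    return [s for s, p, o in triples if p == "rdf:type" and o == cls]
```

Under this reading, `instances_of("motorcycle")` yields the instance `MC_001`, and `objects("MC_001", "hasColor")` yields its colour.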

9 3. OntoWrapper
• OntoWrapper:
  - deals with the analysis of structured pages,
  - allows the user to define XML/RDF templates, variables and rule sets to perform a structured analysis of a specific domain,
  - generates the merged output and sends it to the Sesame repository as data statements about specific pages.
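A rule set of this kind might look roughly like the sketch below, which maps variable names to regular expressions and emits statements about a page. It is a minimal illustration under stated assumptions: the `RULES` dictionary, the `wrap_page` function, the patterns and the example URI are all hypothetical and do not reflect OntoWrapper's actual template or rule format.

```python
import re

# Hypothetical rule set: variable name -> regular expression.
RULES = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "tel": re.compile(r"\+?\d[\d\s-]{6,}\d"),
}


def wrap_page(page_uri, page_text, rules=RULES):
    """Apply each rule to the page text and emit (page, property, value) statements."""
    statements = []
    for variable, pattern in rules.items():
        for match in pattern.finditer(page_text):
            statements.append((page_uri, variable, match.group(0)))
    return statements


page = "Contact: Jane Doe, jane.doe@example.org, tel +47 22 33 44 55"
statements = wrap_page("http://example.org/staff/jane", page)
```

The resulting tuples play the role of the data statements about a specific page. Sending them to a repository is left out of the sketch.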

10 4.1 Generating Semantic Structures (1/2)
• Generation of semantic knowledge in information extraction is based upon the results of parsing steps that can be of varying ‘analysis depth’.
• Levels of linguistic analysis:
  - Tokenization
  - Lexical/Morphological Analysis → POS tagging
  - Syntactic Analysis
  - Semantic/Pragmatic Analysis
  - Discourse Analysis
• CORPORUM’s lexical analysis includes text normalization, tokenization and POS tagging.
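The three lexical steps named on this slide (normalization, tokenization, POS tagging) can be illustrated with a small sketch. It is not CORPORUM's implementation. The tiny `LEXICON`, the tag names and the fallback tag `X` are assumptions for the example, and a real tagger would use a full lexicon or a trained model.

```python
import re
import unicodedata

# Hypothetical mini-lexicon, for illustration only.
LEXICON = {
    "the": "DET", "a": "DET",
    "tool": "NOUN", "relations": "NOUN", "words": "NOUN",
    "generates": "VERB",
    "between": "ADP",
}


def normalize(text):
    """Unicode-normalize, straighten curly quotes, collapse whitespace, lower-case."""
    text = unicodedata.normalize("NFKC", text)
    text = text.replace("\u2018", "'").replace("\u2019", "'")
    return re.sub(r"\s+", " ", text).strip().lower()


def tokenize(text):
    """Split into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def pos_tag(tokens, lexicon=LEXICON):
    """Tag punctuation as PUNCT, known words from the lexicon, everything else as X."""
    tagged = []
    for token in tokens:
        if re.fullmatch(r"[^\w\s]", token):
            tagged.append((token, "PUNCT"))
        else:
            tagged.append((token, lexicon.get(token, "X")))
    return tagged


tagged = pos_tag(tokenize(normalize("The  tool generates relations between words.")))
```

The later levels on the slide (syntactic, semantic/pragmatic and discourse analysis) would build on this tagged token stream.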

11 4.1 Generating Semantic Structures (2/2)
• In OntoExtract the initially analysed and annotated text is transformed into an internal representation that makes use of a variety of linguistic analysis steps to arrive at an initial interpretation of what is written.
• The representation contains the original text and its annotations, but also the resolutions performed on it.
• The semantic structures undergo a translation into a more formal representation.

12 4.2 Generating Ontologies from Textual Resources
• How can the translation from linguistics into formalisms be done properly?
  - Problem of representation level: what knowledge should be represented at the ontology level and what at the fact level (what represents an ‘instance’ and what a ‘concept’).
  - Problem of dealing with inheritance.
  - Problem of consistency between extracted ontologies and their truth within specific domains.
• Ontologies are extracted from single documents taken from the web (concepts are extracted and created). These concepts are set into relation with each other and augmented with properties, and the instances found are hooked up to them.
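The distinction between the ontology level and the fact level, and the role of inheritance, can be made concrete with a small sketch. The concept names beyond `motorcycle` and `MC_001` (which appear on slide 8), the single-parent hierarchy and both helper functions are assumptions chosen for illustration. The slides do not describe how OntoExtract itself resolves inheritance.

```python
# Ontology level: hypothetical subClassOf links (one parent per concept, for simplicity).
SUBCLASS_OF = {"motorcycle": "vehicle", "vehicle": "thing"}

# Fact level: hypothetical instance-to-concept links.
INSTANCE_OF = {"MC_001": "motorcycle"}


def superclasses(concept, subclass_of=SUBCLASS_OF):
    """Return all ancestors of a concept, nearest first, following subClassOf links."""
    ancestors = []
    seen = {concept}  # guards against cycles in extracted hierarchies
    current = subclass_of.get(concept)
    while current is not None and current not in seen:
        ancestors.append(current)
        seen.add(current)
        current = subclass_of.get(current)
    return ancestors


def classes_of(instance, instance_of=INSTANCE_OF):
    """Direct concept of an instance followed by every concept it inherits from."""
    direct = instance_of[instance]
    return [direct] + superclasses(direct)
```

The cycle guard matters for automatically extracted hierarchies, where an inconsistent extraction could otherwise send the traversal into a loop.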

13 4.3 Visualization and Navigation
• The exported semantic network structures can be run through a graph layout algorithm in order to generate visualizations (with the CCA viewer).
• Intercluster relationships are used to navigate from one cluster to another via relevant concepts.
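Navigation between clusters via shared concepts can be sketched as follows. This covers only the navigation idea, not the graph layout step or the CCA viewer. The cluster names, their contents and both functions are hypothetical, and treating a shared concept as an intercluster relationship is an assumption of this example.

```python
# Hypothetical clusters of concepts extracted from document sets.
CLUSTERS = {
    "travel": {"holidays", "motorcycle", "route"},
    "vehicles": {"motorcycle", "engine", "car"},
    "finance": {"loan", "car"},
}


def intercluster_links(clusters):
    """Map each pair of clusters to the concepts they share (non-empty pairs only)."""
    links = {}
    names = sorted(clusters)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            shared = clusters[first] & clusters[second]
            if shared:
                links[(first, second)] = shared
    return links


def neighbours(cluster, clusters=CLUSTERS):
    """Clusters reachable from `cluster`, with the concepts that lead there."""
    result = {}
    for (first, second), shared in intercluster_links(clusters).items():
        if cluster == first:
            result[second] = shared
        elif cluster == second:
            result[first] = shared
    return result
```

A viewer could offer the keys of `neighbours(...)` as jump targets and label each jump with the shared concepts.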

14 5. Issues in Using Automated Text Extraction for Ontology Building using IE on Web Resources
• The Internet poses an additional challenge: the multicultural background of the authors.
• Generated ontologies can be used as ‘seed ontologies’, automatically generated from a variety of user-defined documents.
