An Ontology Creation Methodology: A Phased Approach


1 An Ontology Creation Methodology: A Phased Approach
Jon Atle Gulla, Norwegian University of Science and Technology, Norway; Vijay Sugumaran, Oakland University, USA

2 Agenda
Ontology development
Traditional ontology learning
Limitations of ontology learning
A phased approach to ontology learning

3 The Challenge
How to develop large complex ontologies?
How to keep ontologies updated in dynamic domains?

4 Ontology Modeling vs. Learning
Traditional ontology engineering approach:
  Project: Form team of ontology and domain experts
  Ontology & domain experts: Collaborative manual modeling process
  Domain experts: Verify ontology against domain knowledge
  Ontology experts: Verify ontology against syntactic and semantic quality measures
  Expensive and time-consuming approach
  Stable domains assumed
Ontology learning approach:
  Domain experts: Find representative domain text
  Tool: Extract candidate classes, individuals and properties automatically from domain texts
  Ontology & domain experts: Verify candidate structures and complete ontology
  Can also be used to verify domain quality of existing ontology
  Cost-effective approach
  Not unproblematic in dynamic domains

6 Ontology Learning Basis
People communicate using domain-specific concepts
People document using domain-specific concepts
Ontology learning: Extract ontology structures from written documentation
Requirements:
  Documents representative of domain terminology
  Documents cover all the terminology
  Well-defined and consistent use of terminology in domain
Ontology discussions: Realm of ontology engineering
Ontology in use: Realm of ontology learning

7 Levels of Ontology Learning
Levels in order of increasing difficulty:
  Terms: sponsors, costs, charter
  Synonyms: (leader, manager, lead)
  Concepts: PROJECT
  Concept hierarchies: is_a(MANAGER, EMPLOYEE)
  Relations: FINANCE(ag:SPONSOR, go:PROJECT)
  Rules: ∀x,y (manager(x,y) → report(y,x))
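To make the two most difficult levels concrete, here is a minimal sketch that stores relation instances as triples and applies the slide's rule manager(x,y) → report(y,x) by simple forward inference. The individuals (alice, bob, carol) and the function name apply_manager_rule are assumptions made for illustration and do not come from the presentation.

```python
# Facts are (predicate, subject, object) triples; the individuals are hypothetical.
facts = {
    ("manager", "alice", "bob"),
    ("manager", "alice", "carol"),
}


def apply_manager_rule(known_facts):
    """Return the facts plus every report(y, x) implied by manager(x, y)."""
    derived = {
        ("report", y, x)
        for (predicate, x, y) in known_facts
        if predicate == "manager"
    }
    return known_facts | derived


if __name__ == "__main__":
    for fact in sorted(apply_manager_rule(facts)):
        print(fact)
```

Learning such a rule from text is the hard part; applying it once it is known is mechanical, which is why rules sit at the top of the difficulty scale.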

8 Ontology Learning Strategies
Term extraction: Linguistic analysis, Statistical analysis
Synonyms: Classification-based techniques, Distribution-based techniques
Concept formation: Structure recognition, Keyphrase generation, Instance learning
Concept hierarchy: Clustering, Lexico-syntactic patterns, Head-modifier approaches, Subsumption approaches
Relations: Association rules, Concept vectors
Rules: Structure recognition for meta-property recognition, Dependency trees and path similarities
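As an illustration of one strategy named above, lexico-syntactic patterns for concept hierarchies, the sketch below matches the common "X such as A, B and C" pattern and emits is_a candidates. The slide only names the technique, so the specific pattern, the function name such_as_candidates, and the sample sentence are assumptions. The sketch handles single-word terms only.

```python
import re

# "<hypernym> such as <hyponym>, <hyponym> and <hyponym>" with single-word terms.
SUCH_AS = re.compile(r"(\w+),? such as ((?:\w+(?:, | and | or )?)+)")


def such_as_candidates(text):
    """Return (hyponym, hypernym) pairs, i.e. is_a(hyponym, hypernym) candidates."""
    pairs = []
    for match in SUCH_AS.finditer(text):
        hypernym = match.group(1).lower()
        for hyponym in re.split(r", | and | or ", match.group(2)):
            pairs.append((hyponym.lower(), hypernym))
    return pairs


if __name__ == "__main__":
    sentence = "The service offers genres such as drama, thriller and comedy to users."
    for hyponym, hypernym in such_as_candidates(sentence):
        print(f"is_a({hyponym}, {hypernym})")
```

Candidates produced this way still need manual verification, since the pattern also fires on sentences that do not express a subclass relation.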

9 Ontology Learning Process
Figure (diagram labels): Domain text (PMBOK); automatic extraction of concept and relationship candidates; concept candidates (Scope management, WBS, Business need, Constituent components, Product description, ...); manual selection of candidates and completion of model; search ontology. Further labels: Abstract elements, Constraints, Properties, Rules, Reference set.

10 Ex 1. Learning Concept/Individual Candidates
Input sentence:
  Scope planning is the process of progressively elaborating and documenting the project work (project scope) that produces the product of the project.
POS tagging:
  Scope/NNP planning/NN is/VBZ the/DT process/NN of/IN progressively/RB elaborating/VBG and/CC documenting/VBG the/DT project/NN work/NN (/( project/NN scope/NN )/) that/WDT produces/VBZ the/DT product/NN of/IN the/DT project/NN ./.
Stopword removal (571 words), then lemmatization/stemming (POS tags not shown):
  Scope plan process progress elaborate document project work project scope produce product project
Select consecutive nouns as candidate phrases:
  {scope planning, process, project work, project scope, product, project}
Calculate tf.idf score for phrases:
  {(scope planning, ), (project scope, ), (product, ), (project work, ), (project, ), (process, )}
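The candidate selection and scoring steps of this example can be sketched in a few lines. The snippet below starts from the POS-tagged sentence on the slide, groups consecutive nouns into candidate phrases, and scores them with a plain tf.idf formula (relative term frequency times log(N/df)). It skips stopword removal and stemming, since noun selection works directly on the tags. The two extra documents in the corpus, the function names, and the exact tf.idf variant are assumptions; the slide does not show the scores or the formula used.

```python
import math
from collections import Counter

# POS-tagged sentence as shown on the slide (word/TAG tokens).
TAGGED = (
    "Scope/NNP planning/NN is/VBZ the/DT process/NN of/IN progressively/RB "
    "elaborating/VBG and/CC documenting/VBG the/DT project/NN work/NN (/( "
    "project/NN scope/NN )/) that/WDT produces/VBZ the/DT product/NN of/IN "
    "the/DT project/NN ./."
)


def noun_phrases(tagged_text):
    """Group maximal runs of consecutive noun tokens (tags starting with NN)."""
    phrases, run = [], []
    for token in tagged_text.split():
        word, _, tag = token.rpartition("/")
        if tag.startswith("NN"):
            run.append(word.lower())
        else:
            if run:
                phrases.append(" ".join(run))
            run = []
    if run:
        phrases.append(" ".join(run))
    return phrases


def tf_idf(doc_phrases, corpus):
    """Score phrases of one document; corpus must contain that document."""
    counts = Counter(doc_phrases)
    scores = {}
    for phrase, count in counts.items():
        df = sum(1 for doc in corpus if phrase in doc)
        scores[phrase] = (count / len(doc_phrases)) * math.log(len(corpus) / df)
    return scores


if __name__ == "__main__":
    candidates = noun_phrases(TAGGED)
    # The other two documents are hypothetical, added only to make idf meaningful.
    corpus = [candidates, ["project", "project schedule"], ["process", "cost"]]
    for phrase, score in sorted(tf_idf(candidates, corpus).items(), key=lambda kv: -kv[1]):
        print(f"{phrase}: {score:.3f}")
```

Phrases that also occur in other documents, such as "project", receive lower scores than phrases specific to this text, such as "scope planning".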

11 Classes Relevant to the Drama Genre
Data sources: IMDB, Wikipedia, Videoload
Keyphrase extraction technique
Noun phrases ranked according to various statistical measures

12 Ex 2. Learning Relationship Candidates
Figure (component labels): GATE, Lucene, Document, Paragraph, Tokenizer, Sentence splitter, Tagger, Lemmatizer, Light stemmer, Noun phrase extractor, Noun phrase indexer, Association rules miner, Association rules, Concept profiles, Concept similarity calculation, profile builder, Relationship merger
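The concept profile and similarity components can be illustrated with a small sketch. It assumes that a concept profile is a bag of terms co-occurring with the concept and that similarity is the cosine between two such profiles. Both are common choices, but the slide does not state which representation or measure the authors used, and the sample profiles are invented.

```python
import math
from collections import Counter


def cosine_similarity(profile_a, profile_b):
    """Cosine of the angle between two term-count profiles (0.0 if either is empty)."""
    shared = profile_a.keys() & profile_b.keys()
    dot = sum(profile_a[term] * profile_b[term] for term in shared)
    norm_a = math.sqrt(sum(v * v for v in profile_a.values()))
    norm_b = math.sqrt(sum(v * v for v in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical profiles: terms seen in the same paragraphs as each concept.
    actor = Counter({"movie": 3, "role": 2, "director": 1})
    actress = Counter({"movie": 2, "role": 2, "award": 1})
    soundtrack = Counter({"composer": 2, "album": 1})
    print(cosine_similarity(actor, actress))
    print(cosine_similarity(actor, soundtrack))
```

Concept pairs with high similarity become relationship candidates, which a later step can merge with the candidates found by association rule mining.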

13 Relationships Relevant to Drama Genre
Association rules on extracted concepts
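A minimal sketch of association rule mining over extracted concepts follows. Each transaction is the set of concepts found in one text unit, and a rule a → b is kept when its support and confidence reach the given thresholds. The sample transactions, the threshold values, and the restriction to single-concept rules are assumptions for illustration and are not taken from the project.

```python
from itertools import permutations


def association_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return (antecedent, consequent, support, confidence) for single-item rules."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in permutations(items, 2):
        count_a = sum(1 for t in transactions if a in t)
        count_both = sum(1 for t in transactions if a in t and b in t)
        support = count_both / n          # share of transactions containing a and b
        confidence = count_both / count_a  # share of a-transactions that also contain b
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


if __name__ == "__main__":
    # Hypothetical concept sets, one per paragraph.
    paragraphs = [
        {"actor", "drama", "director"},
        {"actor", "drama"},
        {"drama", "director"},
        {"actor", "comedy"},
    ]
    for a, b, support, confidence in association_rules(paragraphs):
        print(f"{a} -> {b} (support={support:.2f}, confidence={confidence:.2f})")
```

A surviving rule only says that two concepts tend to appear together; naming the relationship between them remains a manual step.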

14 Automatic OWL Generation
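As a rough sketch of what automatic OWL generation can look like, the function below writes verified class and subclass candidates as OWL in Turtle syntax. The slide does not say which serialization or tooling was used, so the Turtle output, the base IRI http://example.org/onto#, and the function name to_turtle are assumptions. The owl: and rdfs: namespace IRIs are the standard W3C ones.

```python
def to_turtle(classes, subclass_of, base_iri="http://example.org/onto#"):
    """Serialize class names and (child, parent) pairs as OWL classes in Turtle."""
    lines = [
        f"@prefix : <{base_iri}> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    for name in sorted(classes):
        lines.append(f":{name} a owl:Class .")
    for child, parent in sorted(subclass_of):
        lines.append(f":{child} rdfs:subClassOf :{parent} .")
    return "\n".join(lines)


if __name__ == "__main__":
    # Class names follow the is_a(MANAGER, EMPLOYEE) example from the levels slide.
    print(to_turtle({"Manager", "Employee"}, {("Manager", "Employee")}))
```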

16 Limitations of Ontology Learning
Different techniques produce different results
Different data sources produce different results
Loss of control over the process
Extensive verification of final ontology needed
New data hard to combine with old data

18 Ontology Learning for Entertainment Domain
Ontology evolution for Deutsche Telekom’s Videoload download service
What does Brangelina mean?
Should Pitt be Brad Pitt or Michael Pitt?
Actor vs. Schauspieler?
All movies of Brad Pitt?
Last movie of Pitt?

19 Ontology Learning Project
Duration: Nov 2007 – Nov 2009
Domain: movie download service
Ontology analysis and creation based on indexed noun phrases from movie documents
Ontology used for search and navigation on top of FAST search platform
Ontology learning challenges:
  Domain changes from one day to another
  No consistent domain terminology
  No professional domain terminology
  Multiple languages
  Movies about anything... unlimited domain
  Ontology needs to be up to date to support search

20 Ontology Workbench
3 phases that are carried out independently:
  Crawling into Lucene indices
  Supervised extraction of candidates
  Combining candidates into ontology structures

21 Interactive Ontology Development
Expandable indices
Subset of data source
Focus of analysis
List of techniques
Partial results
Stored results
Set operations for combining results
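Combining stored results with set operations can be sketched as follows. The function name combine, the operation names, and the sample candidate sets are assumptions; the slide only states that set operations are available in the workbench.

```python
def combine(results, operation):
    """Combine candidate sets from several extraction runs.

    "difference" keeps candidates of the first result that appear in no other.
    """
    sets = [set(r) for r in results]
    if operation == "union":
        return set.union(*sets)
    if operation == "intersection":
        return set.intersection(*sets)
    if operation == "difference":
        return sets[0].difference(*sets[1:])
    raise ValueError(f"unknown operation: {operation}")


if __name__ == "__main__":
    # Hypothetical stored results from two extraction techniques.
    tfidf_run = {"actor", "drama", "director"}
    keyphrase_run = {"drama", "director", "soundtrack"}
    print(combine([tfidf_run, keyphrase_run], "intersection"))
    print(combine([tfidf_run, keyphrase_run], "union"))
    print(combine([tfidf_run, keyphrase_run], "difference"))
```

Intersection favors precision by keeping only candidates that several techniques agree on, while union favors coverage.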

22 Thank you

