Jönköpings Tekniska Högskola. Semi-automatic Pattern-based Ontology Construction PhD Thesis – Framework and Preliminary Results Eva Blomqvist

1 Jönköpings Tekniska Högskola

2 Semi-automatic Pattern-based Ontology Construction PhD Thesis – Framework and Preliminary Results Eva Blomqvist (blev@jth.hj.se) http://www.ontology.se/eva.htm Main supervisor: Kurt Sandkuhl, Jönköping University (saku@jth.hj.se) Assistant supervisor: Henrik Eriksson, Linköping University 2008-04-28

3 Outline Presentation and introduction Motivation and research questions Related work Ontology patterns – our view Initial experiments – first iteration of PhD work Proposed framework – OntoCase Summary of results Open issues and future work

4 Presentation Jönköping, Sweden Jönköping University

5 Presentation Centre for Evolving IT in Networked Organisations www.hj.se/cenit Information Engineering Research Group infoeng.hj.se –Information Logistics Demand discovery and modelling Ontology-based information supply (3 PhD students, 1 post-doc and “half” an assistant professor) –Enterprise Modelling –Model-based Software Engineering

6 Presentation “Ontology group” –Manual methods for ontology construction, focus on SMEs (1 PhD student) –Semi-automatic pattern-based methods for ontology construction (1 PhD student) –Ontology matching (1 PhD student) –Ontology languages, translations, and rules (1 post-doc) –Ontologies for competence modelling –Enterprise application ontologies, ontologies for personalised and context-based information supply

7 Motivation Enterprise ontology construction Resource and time consuming More art than engineering? Problems –Companies do not have the resources and the time, the cost is too high –Misspent efforts, making simple mistakes Proposed solutions? –Well-specified and specialized manual methods –Reuse and reengineering –Semi-automatic methods (Ontology Learning)

8 Motivation (contd.) Observations… –Many ILOG applications do not require very complex ontologies –Many ILOG applications do not require “perfect” ontologies –The more help the ontology engineer can get the better, but what is really useful to the user? Issues in existing OL approaches –Low quality results, large and diverse –No traceability –No predefined requirements and no focussing –Few possibilities to include background knowledge

9 Related Work Reuse Ontology libraries and ontology search engines Ontology ranking schemes Modularisation To support reusability Ontology patterns Inspired by software patterns Initially manual use, as templates Problems –You need to know what you want, how to search for it, how to know when you’ve found it and how to use it!

10 Related Work Ontology Learning –Commonly based on text corpus input (OL from text) –Set of algorithms for extracting single “elements” –Tools exist: OntoLT, Text2Onto, OntoGen, Abraxas … Problems –Dependent on the quality and focus of input texts Only explicit information is extracted –High dependence on user intervention and expertise Tuning, setting variables, validating, post-processing… –Large, diverse and low quality result Does it help the user to get 500 unconnected terms?

11 Research questions What are ontology patterns? How can ontology patterns be used in semi-automatic ontology construction? What are the effects of pattern-usage on the resulting ontologies?

12 Ontology engineering patterns – Our view “…a set of ontological entities, structures or construction principles that recur, either exactly replicated or in an adapted form, within some set of ontologies or is envisioned to recur within some future set of ontologies” Characteristics –Experience-based or designed (mining or template-writing) –Logic structure or content –Level of abstraction and granularity

13 Ontology engineering patterns – Our view Abstraction levels –Application patterns (Albertsen & Blomqvist, 2007) –Architecture patterns –Design patterns –Syntactic patterns Granularity and scope –Single “elements” and their representation –Parts solving specific sub-problems –Complete ontology Pattern or module? - Not a clear distinction Patterns imply a certain level of consensus and reusability…

14 Ontology Design Patterns General ontology design patterns – examples from LOA-CNR Portal coming: www.ontologydesignpatterns.org Participation (http://wiki.loa-cnr.it/index.php/LoaWiki:DesignPatternDiagrams)

15 Ontology Design Patterns OntoCase domain-dependent ontology design pattern – example Requirements and product features (adapted from Data Model Pattern)

16 First attempt at pattern-based OL Pattern classification and characteristics (Blomqvist, 2005) Design pattern construction from existing pattern sources, like data model patterns according to (Hay, 1996) Initial testing of existing OL systems: text processing tailored for OL significantly improves precision and recall compared to “standard” methods (Blomqvist, 2007) Pattern selection and combination in the SEMCO-project (Blomqvist et al., 2006) –Parallel construction – manual and OL, comparison through ontology evaluation methods –Pattern-approach: more relations and better structure, mostly at intermediate abstraction levels, poor coverage of terms and lack of general abstractions

17 Lessons learned and current focus Important to bridge the abstraction gap between patterns and extracted terms and relations Composition must be based on knowledge about patterns, possibly also abstract top-structure Need for evaluation phase in construction Ways to construct/extract/propose new patterns Current focus –Refined pattern categories and characteristics (presented previously) –OntoCase framework and detailed methods for retrieval and reuse of patterns (following slides) –Effects on output (still to be evaluated thoroughly)

18 OntoCase - a framework inspired by case-based reasoning

19 OntoCase – Pattern Base Current Ontology design patterns and “candidates” Index structure – domain, name and concept labels Future work Architecture patterns and reference architectures Relations between patterns More meta-information describing patterns (including competency questions?)
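The index structure named on this slide (domain, name and concept labels) could be sketched roughly as follows. This is only an illustration under assumptions: the `Pattern` and `PatternIndex` classes, their fields, and the example pattern are hypothetical and are not taken from the OntoCase implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Pattern:
    # Hypothetical record for one ontology design pattern in the pattern base.
    name: str
    domain: str
    concept_labels: list = field(default_factory=list)


class PatternIndex:
    """Toy index over the three keys on the slide: domain, name, concept labels."""

    def __init__(self):
        self.by_name = {}                  # pattern name -> Pattern
        self.by_domain = defaultdict(set)  # domain -> pattern names
        self.by_label = defaultdict(set)   # concept label -> pattern names

    def add(self, pattern):
        self.by_name[pattern.name.lower()] = pattern
        self.by_domain[pattern.domain.lower()].add(pattern.name)
        for label in pattern.concept_labels:
            self.by_label[label.lower()].add(pattern.name)

    def lookup_label(self, label):
        # Case-insensitive lookup; returns pattern names in sorted order.
        return sorted(self.by_label.get(label.lower(), set()))


# Invented example content, for illustration only.
index = PatternIndex()
index.add(Pattern("Example product pattern", "manufacturing", ["Product", "Feature"]))
index.add(Pattern("Example party pattern", "general", ["Person", "Organisation"]))
```

A call such as `index.lookup_label("product")` then returns the names of all patterns that contain a concept with that label, which is the kind of lookup a later matching step would need.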

20 OntoCase Retrieval of Pattern Set Input Text corpus Pattern base Steps a) Construct representation of input (assumption: at a minimum contains a set of terms and unnamed binary relations with confidence information) b) Selection of architecture pattern/reference architecture and statement of competency questions c) Match input representation to patterns from pattern base d) Select appropriate set of patterns Output Input representation – terms and relations Set of selected patterns with matching results Selected architecture pattern/reference architecture Core issues: Bridge abstraction gap – use background knowledge Pattern ranking through term to concept matching, relation matching and quality assessment (Blomqvist, 2008)

21 Pattern ranking and selection Term & relation extraction (currently based on TextToOnto) Ranking based on –Concept coverage Direct term coverage (string matching) Indirect term coverage (head heuristic and WordNet) –Relation coverage Based on term matching results –“Usefulness/quality” of matched parts (intuition: estimated value of adding the parts to the resulting ontology) Density of concept “environment” Proximity of matched concepts Pattern selection based on total coverage over learnt input representation (learnt ontology)
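The concept-coverage part of the ranking can be illustrated with a small sketch. It assumes a simplified form of the ideas on this slide: direct coverage is exact, case-insensitive string matching; indirect coverage uses a head heuristic taken here to mean "the last word of a multi-word term"; WordNet lookup, relation coverage and the quality measures are left out. The function names and the `indirect_weight` parameter are hypothetical, not the scoring actually used in OntoCase.

```python
def head(term):
    # Assumed head heuristic: the last word of a multi-word English term.
    return term.lower().split()[-1]


def match_terms(terms, concept_labels):
    """Split extracted terms into direct and indirect matches against a pattern."""
    labels = {label.lower() for label in concept_labels}
    direct, indirect = set(), set()
    for term in terms:
        t = term.lower().strip()
        if not t:
            continue
        if t in labels:
            direct.add(t)       # exact string match with a concept label
        elif head(t) in labels:
            indirect.add(t)     # only the head word matches a concept label
    return direct, indirect


def coverage_score(terms, concept_labels, indirect_weight=0.5):
    # Fraction of extracted terms covered by the pattern; indirect matches
    # count less than direct ones (the weight is an arbitrary assumption).
    unique_terms = {t.lower().strip() for t in terms if t.strip()}
    if not unique_terms:
        return 0.0
    direct, indirect = match_terms(unique_terms, concept_labels)
    return (len(direct) + indirect_weight * len(indirect)) / len(unique_terms)


def rank_patterns(terms, patterns):
    """patterns maps a pattern name to its list of concept labels."""
    scored = [(coverage_score(terms, labels), name) for name, labels in patterns.items()]
    # Highest score first; ties broken alphabetically by pattern name.
    return [name for score, name in sorted(scored, key=lambda p: (-p[0], p[1]))]
```

For instance, with the extracted terms `["product", "software product", "customer"]`, a pattern with the concepts `Product` and `Feature` gets one direct and one indirect match, and so ranks above a pattern that only contains `Customer`.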

22 Evaluation – pattern ranking

Method               Fraction of top-10   Fraction of bottom-10   Avg. diff   Max diff
OntoCase             0.6                                          2.75        6
AktiveRank           0.4                  0.3                     3.42        7
Stringm. (exact)     0.2                                          3.25        7
Stringm. (inexact)   0.2                                          4.83        11

Classification –Reference – manual classification of 30 patterns –Measure – fraction of the top-10 results that were “correct” and fraction of bottom-10 results that were “correct”
Ranking –Reference – manual ranking of 12 patterns –Measure – average steps from “correct” position and maximum difference
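The ranking measure described on this slide (average number of steps from the “correct” position, and the maximum difference) can be written down as a short sketch. It assumes that both rankings contain exactly the same items without ties; the function name `rank_displacement` is hypothetical.

```python
def rank_displacement(reference, produced):
    """Return (average, maximum) position difference between two rankings.

    reference: items in the manually determined ("correct") order.
    produced:  the same items in the order given by the ranking method.
    """
    if sorted(reference) != sorted(produced):
        raise ValueError("both rankings must contain the same items")
    position = {item: i for i, item in enumerate(produced)}
    # Steps each item has moved away from its reference position.
    diffs = [abs(i - position[item]) for i, item in enumerate(reference)]
    return sum(diffs) / len(diffs), max(diffs)
```

With 12 ranked patterns the average is always a multiple of 1/12, and the largest possible maximum difference is 11 (an item moved from one end of the list to the other).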

23 OntoCase Pattern Reuse Input Input representation Set of selected patterns with matching information Selected architecture pattern/reference architecture Steps Iterate over selected patterns Specialise using matching information Adapt using matching information and heuristics Integrate in resulting ontology - pattern composition Output Initial ontology Core issues: Pattern composition = ontology merging? Use pattern background information, like origin and relations

24 Pattern specialisation, adaptation and composition Specialisation – “attach” input representation terms and relations –Direct term matches => synonyms –Indirect term matches => subclasses of the most specific concept –Relation matches => add relation label Adaptation –Default: keep only matched parts! But… preserve structure => heuristics Keep taxonomic structure Add also unmatched relations between added terms –Future work: Preserve pattern reference for tracking and manual evaluation Composition –Based on top-level ontologies and assumption of overlap –Based on extracted relations – add relations between added terms –Future work: pattern relations (specialisation, overlap) explicit in pattern base
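The specialisation rules on this slide can be illustrated with a minimal sketch. It assumes that the matching step has already decided, for each term, which pattern concept it matches and whether the match is direct or indirect (for an indirect match, the concept given is taken to be the most specific one). The `Concept` class and the `specialise` function are hypothetical and cover only terms, not relation labels.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    # Hypothetical, simplified pattern concept.
    label: str
    synonyms: set = field(default_factory=set)
    subclasses: list = field(default_factory=list)


def specialise(concepts, matches):
    """Attach extracted terms to pattern concepts.

    matches: iterable of (term, concept_label, kind) with kind being
    "direct" or "indirect", as produced by an earlier matching step.
    """
    by_label = {c.label.lower(): c for c in concepts}
    for term, concept_label, kind in matches:
        target = by_label[concept_label.lower()]
        if kind == "direct":
            # Direct term match => the term becomes a synonym of the concept.
            target.synonyms.add(term)
        elif kind == "indirect":
            # Indirect term match => the term becomes a new subclass.
            target.subclasses.append(Concept(term))
        else:
            raise ValueError(f"unknown match kind: {kind}")
    return concepts


product = Concept("Product")
specialise(
    [product],
    [("product", "Product", "direct"), ("software product", "Product", "indirect")],
)
```

After this call the `Product` concept carries the synonym "product" and has one new subclass, "software product".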

25 OntoCase Ontology Revision – Future Work Input Text corpus and input representation Initial ontology Steps Evaluate initial ontology (possibly with the help of the user) Extend/enrich ontology (additional external information sources) Attach elements from the input representation “Clean” ontology Remove redundancy Reduce inconsistency Transform to required ontology representation depending on user needs Output “Resulting” ontology

26 OntoCase Pattern Retain Phase – Future Work Input Resulting ontology (and information from the construction process) Pattern base Steps Extract feedback for the used patterns (Used/Unused parts) Extract possible variations of the used patterns (Changed parts) Extract new pattern candidates Through finding “modules” or strongly connected sub- ontologies Additional manual selection, generalisation and validation Output Revised patterns and pattern feedback Set of new pattern candidates

27 Summary - OntoCase Ontology engineering needs to be semi-automatic, but the quality of the ontologies output by existing OL systems is not sufficient, and a lot of user involvement is still needed. OntoCase gives an overall framework for semi-automatic ontology construction based on patterns, including the tasks to be performed in each phase. Patterns are a means of reusing experience and knowledge; we intend to show that patterns increase the quality of the output. OntoCase in total aims to: Further automate the OL process (compared to existing OL approaches using only text corpus input) Introduce knowledge reuse in OL through patterns Produce better quality output ontologies (than existing OL approaches)

28 Future work OntoCase Planned evaluations: –Redo the SEMCO case –Department ontology –Set of texts from the web Compare to: Text2Onto, OntoGen etc. (+ manual result) Two more phases! –Pattern-based evaluation and revision –Pattern extraction and refinement Adhere closer to CBR – competency questions and improved pattern base index Reference architectures for enterprise ontology

29 Future work Ontology Engineering Patterns Patterns to support evolution and maintenance of ontologies – change is a major issue! Patterns to support provenance and traceability – important to see why certain parts are included! User focus in Ontology Engineering and OL –What is really useful?? –What patterns are good and why?? => Experimentation!!

30 Future work – research ideas Applications of the enterprise application ontologies –Connect to the model-based software engineering –Realise the domain repository ideas How to construct ontologies for the Semantic Web? –What are the needs and wants of “real” applications and users on the Web? –How do OL and patterns help ontology engineers? What patterns are good and which are not? –“Personal” and evolving ontologies, constructed and evolving semi-automatically on the web using patterns and other reusable components (connection to web 2.0?)

31 Jönköpings Tekniska Högskola

