Jönköpings Tekniska Högskola. Semi-automatic Pattern-based Ontology Construction PhD Thesis – Framework and Preliminary Results Eva Blomqvist


Jönköpings Tekniska Högskola

Semi-automatic Pattern-based Ontology Construction PhD Thesis – Framework and Preliminary Results Eva Blomqvist Main supervisor: Kurt Sandkuhl, Jönköping University Assistant supervisor: Henrik Eriksson, Linköping University

Outline Presentation and introduction Motivation and research questions Related work Ontology patterns – our view Initial experiments – first iteration of PhD work Proposed framework – OntoCase Summary of results Open issues and future work

Presentation Jönköping, Sweden Jönköping University

Presentation Centre for Evolving IT in Networked Organisations Information Engineering Research Group infoeng.hj.se –Information Logistics Demand discovery and modelling Ontology-based information supply (3 PhD students, 1 post-doc and “half” an assistant professor) –Enterprise Modelling –Model-based Software Engineering

Presentation “Ontology group” –Manual methods for ontology construction, focus on SMEs (1 PhD student) –Semi-automatic pattern-based methods for ontology construction (1 PhD student) –Ontology matching (1 PhD student) –Ontology languages, translations, and rules (1 post-doc) –Ontologies for competence modelling –Enterprise application ontologies, ontologies for personalised and context-based information supply

Motivation Enterprise ontology construction Resource and time consuming More art than engineering? Problems –Companies do not have the resources and the time, the cost is too high –Misspent efforts, making simple mistakes Proposed solutions? –Well-specified and specialized manual methods –Reuse and reengineering –Semi-automatic methods (Ontology Learning)

Motivation (contd.) Observations… –Many ILOG applications do not require very complex ontologies –Many ILOG applications do not require “perfect” ontologies –The more help the ontology engineer can get the better, but what is really useful to the user? Issues in existing OL approaches –Low quality results, large and diverse –No traceability –No predefined requirements and no focussing –Few possibilities to include background knowledge

Related Work Reuse Ontology libraries and ontology search engines Ontology ranking schemes Modularisation To support reusability Ontology patterns Inspired by software patterns Initially manual use, as templates Problems –You need to know what you want, how to search for it, how to know when you’ve found it and how to use it!

Related Work Ontology Learning –Commonly based on text corpus input (OL from text) –Set of algorithms for extracting single “elements” –Tools exist: OntoLT, Text2Onto, OntoGen, Abraxas … Problems –Dependent on the quality and focus of input texts Only explicit information is extracted –High dependence on user intervention and expertise Tuning, setting variables, validating, post-processing… –Large, diverse and low quality result Does it help the user to get 500 unconnected terms?

Research questions What are ontology patterns? How can ontology patterns be used in semi-automatic ontology construction? What are the effects of pattern-usage on the resulting ontologies?

Ontology engineering patterns – Our view “…a set of ontological entities, structures or construction principles that recur, either exactly replicated or in an adapted form, within some set of ontologies or is envisioned to recur within some future set of ontologies” Characteristics –Experience-based or designed (mining or template-writing) –Logic structure or content –Level of abstraction and granularity

Ontology engineering patterns – Our view Abstraction levels –Application patterns (Albertsen & Blomqvist, 2007) –Architecture patterns –Design patterns –Syntactic patterns Granularity and scope –Single “elements” and their representation –Parts solving specific sub-problems –Complete ontology Pattern or module? - Not a clear distinction Patterns imply a certain level of consensus and reusability…

Ontology Design Patterns General ontology design patterns – examples from LOA-CNR Portal coming: Participation (LoaWiki:DesignPatternDiagrams)

Ontology Design Patterns OntoCase domain-dependent ontology design pattern – example Requirements and product features (adapted from Data Model Pattern)

First attempt at pattern-based OL Pattern classification and characteristics (Blomqvist, 2005) Design pattern construction from existing pattern sources, like data model patterns according to (Hay, 1996) Initial testing of existing OL systems: text processing tailored for OL significantly improves precision and recall compared to “standard” methods (Blomqvist, 2007) Pattern selection and combination in the SEMCO-project (Blomqvist et al., 2006) –Parallel construction – manual and OL, comparison through ontology evaluation methods –Pattern-approach: more relations and better structure, mostly at intermediate abstraction levels, poor coverage of terms and lack of general abstractions

Lessons learned and current focus Important to bridge the abstraction gap between patterns and extracted terms and relations Composition must be based on knowledge about patterns, possibly also abstract top-structure Need for evaluation phase in construction Ways to construct/extract/propose new patterns Current focus –Refined pattern categories and characteristics (presented previously) –OntoCase framework and detailed methods for retrieval and reuse of patterns (following slides) –Effects on output (still to be evaluated thoroughly)

OntoCase - a framework inspired by case-based reasoning

OntoCase – Pattern Base Current Ontology design patterns and “candidates” Index structure – domain, name and concept labels Future work Architecture patterns and reference architectures Relations between patterns More meta-information describing patterns (including competency questions?)

OntoCase Retrieval of Pattern Set
Input
–Text corpus
–Pattern base
Steps
a) Construct representation of input (assumption: at a minimum contains a set of terms and unnamed binary relations with confidence information)
b) Selection of architecture pattern/reference architecture and statement of competency questions
c) Match input representation to patterns from pattern base
d) Select appropriate set of patterns
Output
–Input representation – terms and relations
–Set of selected patterns with matching results
–Selected architecture pattern/reference architecture
Core issues:
–Bridge abstraction gap – use background knowledge
–Pattern ranking through term to concept matching, relation matching and quality assessment (Blomqvist, 2008)
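
Step (a) only assumes that the input representation holds terms and unnamed binary relations, each with confidence information. A minimal Python sketch of such a structure could look as follows. The class names, the field names and the `above` thresholding helper are hypothetical and introduced only for illustration; the slides do not prescribe any particular data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Term:
    label: str
    confidence: float  # assumed to lie in [0, 1]


@dataclass(frozen=True)
class Relation:
    # Unnamed binary relation between two term labels.
    source: str
    target: str
    confidence: float


@dataclass
class InputRepresentation:
    terms: list[Term] = field(default_factory=list)
    relations: list[Relation] = field(default_factory=list)

    def above(self, threshold: float) -> "InputRepresentation":
        """Hypothetical helper: drop low-confidence terms and relations.

        A relation is kept only if it is confident enough and both of
        its end points survive the term filter.
        """
        kept_terms = [t for t in self.terms if t.confidence >= threshold]
        labels = {t.label for t in kept_terms}
        kept_relations = [
            r for r in self.relations
            if r.confidence >= threshold
            and r.source in labels
            and r.target in labels
        ]
        return InputRepresentation(kept_terms, kept_relations)


rep = InputRepresentation(
    terms=[Term("product", 0.9), Term("feature", 0.8), Term("thing", 0.2)],
    relations=[Relation("product", "feature", 0.7),
               Relation("product", "thing", 0.9)],
)
filtered = rep.above(0.5)
```

Filtering on confidence before matching is one possible design choice; the matching step could equally well use the confidence values as weights.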

Pattern ranking and selection
Term & relation extraction (currently based on TextToOnto)
Ranking based on
–Concept coverage
  Direct term coverage (string matching)
  Indirect term coverage (head heuristic and WordNet)
–Relation coverage
  Based on term matching results
–“Usefulness/quality” of matched parts (intuition: estimated value of adding the parts to the resulting ontology)
  Density of concept “environment”
  Proximity of matched concepts
Pattern selection based on total coverage over learnt input representation (learnt ontology)
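
The two coverage measures can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the OntoCase implementation: direct matching is a case-insensitive exact string match, the head heuristic takes the last word of a compound term, the WordNet lookup and the usefulness/quality measures are left out, and all function names are hypothetical.

```python
def head(term):
    """Head heuristic (assumed form): last word of a compound term."""
    return term.lower().split()[-1]


def match_terms(terms, concepts):
    """Return (direct, indirect) dicts mapping concept label -> matched terms."""
    by_label = {c.lower(): c for c in concepts}
    direct, indirect = {}, {}
    for term in terms:
        key = term.lower().strip()
        if not key:
            continue
        if key in by_label:
            direct.setdefault(by_label[key], []).append(term)
        elif head(key) in by_label:
            indirect.setdefault(by_label[head(key)], []).append(term)
    return direct, indirect


def concept_coverage(terms, concepts):
    """Fraction of pattern concepts matched by at least one term."""
    direct, indirect = match_terms(terms, concepts)
    covered = set(direct) | set(indirect)
    return len(covered) / len(concepts) if concepts else 0.0


def relation_coverage(terms, extracted_relations, concepts, pattern_relations):
    """Fraction of pattern relations backed by an extracted relation
    between terms that matched the relation's two concepts."""
    direct, indirect = match_terms(terms, concepts)
    term_to_concept = {}
    for mapping in (direct, indirect):
        for concept, matched in mapping.items():
            for term in matched:
                term_to_concept[term] = concept
    # Lift extracted term-term relations to undirected concept pairs.
    lifted = {
        frozenset((term_to_concept[a], term_to_concept[b]))
        for a, b in extracted_relations
        if a in term_to_concept and b in term_to_concept
    }
    if not pattern_relations:
        return 0.0
    hits = sum(1 for a, b in pattern_relations if frozenset((a, b)) in lifted)
    return hits / len(pattern_relations)


concepts = ["Product", "Feature", "Requirement"]
pattern_relations = [("Product", "Feature"), ("Requirement", "Product")]
terms = ["product", "safety requirement", "customer"]
extracted = [("safety requirement", "product")]

cc = concept_coverage(terms, concepts)
rc = relation_coverage(terms, extracted, concepts, pattern_relations)
```

In this toy input, "product" is a direct match and "safety requirement" an indirect match via its head, so two of the three concepts and one of the two pattern relations are covered.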

Evaluation – pattern ranking
[Results table; values not recoverable from the transcript. Methods compared: OntoCase, AktiveRank, Stringm. (exact), Stringm. (inexact). Measures: Fraction of top-10, Fraction of bottom-10, Avg. diff, Max diff]
Classification
–Reference – manual classification of 30 patterns
–Measure – fraction of the top-10 results that were “correct” and fraction of bottom-10 results that were “correct”
Ranking
–Reference – manual ranking of 12 patterns
–Measure – average steps from “correct” position and maximum difference
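
The two kinds of measure can be written down compactly. The Python sketch below is one plausible reading of the slide, with hypothetical function names: "correct" is taken to mean membership in a reference set supplied by the caller, and "steps from correct position" is taken to be the absolute difference in list position. The data in the usage lines is invented purely to exercise the functions.

```python
def fraction_correct(ranked, expected, k=10, from_bottom=False):
    """Fraction of the top-k (or bottom-k) ranked items found in `expected`.

    `expected` is the set of items the manual reference classification
    places in that region of the ranking.
    """
    window = ranked[-k:] if from_bottom else ranked[:k]
    if not window:
        return 0.0
    return sum(1 for item in window if item in expected) / len(window)


def position_differences(ranked, reference):
    """Average and maximum distance between each item's position in
    `ranked` and its position in the manual `reference` ranking."""
    ref_pos = {item: i for i, item in enumerate(reference)}
    diffs = [abs(i - ref_pos[item])
             for i, item in enumerate(ranked) if item in ref_pos]
    if not diffs:
        return 0.0, 0
    return sum(diffs) / len(diffs), max(diffs)


# Invented toy data, not results from the thesis.
reference = ["a", "b", "c", "d"]
ranked = ["b", "a", "c", "d"]
avg_diff, max_diff = position_differences(ranked, reference)
top = fraction_correct(reference, {"a", "c"}, k=2)
bottom = fraction_correct(reference, {"d"}, k=2, from_bottom=True)
```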

OntoCase Pattern Reuse
Input
–Input representation
–Set of selected patterns with matching information
–Selected architecture pattern/reference architecture
Steps
–Iterate over selected patterns
  Specialise using matching information
  Adapt using matching information and heuristics
  Integrate in resulting ontology - pattern composition
Output
–Initial ontology
Core issues:
–Pattern composition = ontology merging?
–Use pattern background information, like origin and relations

Pattern specialisation, adaptation and composition
Specialisation – “attach” input representation terms and relations
–Direct term matches => synonyms
–Indirect term matches => subclasses of the most specific concept
–Relation matches => add relation label
Adaptation
–Default: keep only matched parts! But… preserve structure => heuristics
  Keep taxonomic structure
  Add also unmatched relations between added terms
–Future work: Preserve pattern reference for tracking and manual evaluation
Composition
–Based on top-level ontologies and assumption of overlap
–Based on extracted relations – add relations between added terms
–Future work: pattern relations (specialisation, overlap) explicit in pattern base
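
The specialisation rules and the default adaptation rule can be sketched in Python as below. This is an illustrative simplification with hypothetical names: the match dictionaries are assumed to map each pattern concept to the terms already assigned to it (so "most specific concept" has been resolved beforehand), relation labels and composition are left out, and "keep taxonomic structure" is read as keeping every ancestor of a matched concept.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    synonyms: set = field(default_factory=set)
    subclasses: set = field(default_factory=set)


def specialise(pattern_concepts, direct, indirect):
    """Attach matched terms: direct matches become synonyms,
    indirect matches become subclasses."""
    result = {}
    for name in pattern_concepts:
        concept = Concept(name)
        concept.synonyms.update(direct.get(name, []))
        concept.subclasses.update(indirect.get(name, []))
        result[name] = concept
    return result


def adapt(specialised, parent_of):
    """Keep only matched concepts, plus their ancestors so that the
    taxonomic structure of the pattern is preserved.

    `parent_of` maps a concept name to its superclass name (if any).
    """
    keep = {n for n, c in specialised.items() if c.synonyms or c.subclasses}
    for name in list(keep):
        parent = parent_of.get(name)
        while parent is not None and parent not in keep:
            keep.add(parent)
            parent = parent_of.get(parent)
    return {n: c for n, c in specialised.items() if n in keep}


pattern_concepts = ["Thing", "Product", "Feature", "Requirement"]
parent_of = {"Product": "Thing", "Feature": "Thing", "Requirement": "Thing"}
direct = {"Product": ["product"]}
indirect = {"Requirement": ["safety requirement"]}

adapted = adapt(specialise(pattern_concepts, direct, indirect), parent_of)
```

In this toy example the unmatched concept "Feature" is dropped, while "Thing" is kept only because it is the superclass of matched concepts.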

OntoCase Ontology Revision – Future Work
Input
–Text corpus and input representation
–Initial ontology
Steps
–Evaluate initial ontology (possibly with the help of the user)
–Extend/enrich ontology (additional external information sources)
  Attach elements from the input representation
–“Clean” ontology
  Remove redundancy
  Reduce inconsistency
–Transform to required ontology representation depending on user needs
Output
–“Resulting” ontology

OntoCase Pattern Retain Phase – Future Work
Input
–Resulting ontology (and information from the construction process)
–Pattern base
Steps
–Extract feedback for the used patterns (Used/Unused parts)
–Extract possible variations of the used patterns (Changed parts)
–Extract new pattern candidates
  Through finding “modules” or strongly connected sub-ontologies
  Additional manual selection, generalisation and validation
Output
–Revised patterns and pattern feedback
–Set of new pattern candidates
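
The slide does not say how “strongly connected sub-ontologies” would be found. One possible reading, used here only as a stand-in, is the graph-theoretic one: treat concepts as nodes and relations as directed edges, and propose every strongly connected component above a minimum size as a candidate. The Python sketch below uses Kosaraju's algorithm; the function names, the `min_size` parameter and the example graph are hypothetical.

```python
def strongly_connected_components(graph):
    """Kosaraju's algorithm on a dict: node -> iterable of successors."""
    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)

    # Pass 1: iterative DFS, recording nodes in order of completion.
    visited, order = set(), []
    for start in sorted(nodes):
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph.get(start, ())))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break
            else:
                order.append(node)
                stack.pop()

    # Reverse all edges.
    reverse = {n: [] for n in nodes}
    for src, successors in graph.items():
        for dst in successors:
            reverse[dst].append(src)

    # Pass 2: DFS on the reversed graph in decreasing completion order.
    assigned, components = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            component.add(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        components.append(component)
    return components


def pattern_candidates(graph, min_size=2):
    """Components with at least `min_size` concepts, as candidate modules."""
    return [c for c in strongly_connected_components(graph)
            if len(c) >= min_size]


ontology = {
    "Order": ["Product"],
    "Product": ["Supplier"],
    "Supplier": ["Order"],
    "Customer": ["Order"],
}
candidates = pattern_candidates(ontology)
```

Any candidates found this way would still go through the manual selection, generalisation and validation named on the slide.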

Summary - OntoCase
Ontology engineering needs to be semi-automatic, but the quality of the ontology output from existing OL systems is not sufficient, and a lot of user involvement is still needed
OntoCase gives an overall framework for semi-automatic ontology construction based on patterns, including the tasks to be performed in each phase
Patterns are a means of reusing experience and knowledge; we intend to show that patterns increase the quality of the output
OntoCase in total aims to
–Further automate the OL process (compared to existing OL approaches using only text corpus input)
–Introduce knowledge reuse in OL through patterns
–Produce better quality of the output ontologies (than existing OL approaches)

Future work OntoCase
Planned evaluations:
–Redo the SEMCO-case
–Department ontology
–Set of texts from the web
Compare to: Text2Onto, OntoGen etc. (+ manual result)
Two more phases!
–Pattern-based evaluation and revision
–Pattern extraction and refinement
Adhere closer to CBR – competency questions and improved pattern base index
Reference architectures for enterprise ontology

Future work Ontology Engineering Patterns Patterns to support evolution and maintenance of ontologies – change is a major issue! Patterns to support provenance and traceability – important to see why certain parts are included! User focus in Ontology Engineering and OL –What is really useful?? –What patterns are good and why?? => Experimentation!!

Future work – research ideas Applications of the enterprise application ontologies –Connect to the model-based software engineering –Realise the domain repository ideas How to construct ontologies for the Semantic Web? –What are the needs and wants of “real” applications and users on the Web? –How do OL and patterns help ontology engineers? What patterns are good and which are not? –“Personal” and evolving ontologies, constructed and evolving semi-automatically on the web using patterns and other reusable components (connection to web 2.0?)
