
1 Ontology Quality by Detection of Conflicts in Metadata
Budak I. Arpinar, Karthikeyan Giriloganathan, Boanerges Aleman-Meza
LSDIS lab, Computer Science, University of Georgia, USA
EON’2006, Edinburgh, Scotland, May 22, 2006 (co-located with WWW-2006)

2 Motivation
Ontologies with over 1 million entities are increasingly appearing: TAP, SWETO, GlycO, UniProt
Quality concerns:
- Entity disambiguation
- Which ontologies are available? (i.e., search & ranking)
- Inconsistency checking (i.e., in OWL)
- Conflict detection
[Slide footer: Searching and Ranking Documents based on Semantic Relationships, Boanerges Aleman-Meza, ICDE Ph.D. Workshop 2006]

3 … Motivation
“Representing, identifying, discovering, validating, and exploiting complex relationships are important issues related to realizing the full power of the Semantic Web, and can help close the gap between highly separated information retrieval and decision-making steps” [Sheth, Arpinar & Kashyap 2003]
“The Web is decentralized, allowing anyone to say anything. As a result, different viewpoints may be contradictory, or even false information may be provided. In order to prevent agents from combining incompatible data or from taking consistent data and evolving it into an inconsistent state, it is important that inconsistencies can be detected automatically” [W3C 2004]
“… these problems manifest themselves in various ways, including poor recall of available resources and inconsistency of search results. They arise due to errors, omissions and ambiguities in the metadata…” [Currier & Barton 2003]

4 Our Approach
Approach: detection of conflicting relationships
- or conflicts in sequences of relationships
How? User-defined rules are validated against a populated ontology
- These rules are domain-dependent
Goal: by detecting conflicting data, a user can take action to improve the quality of the ontology

5 Example of Conflict Identification
[Figure: graph over the entities John, Claura, Bill, and Mary, with edges labeled fatherOf, marriedTo, motherOf, fatherOf, marriedTo, and fatherinLawOf; one part of the graph is marked CONFLICT]

6 Few definitions, ‘simplification’
[Figure: the labels Williams, Chris, and RepublicanParty, connected by the relationships votedFor, memberOf, and supporterOf]
An RDF triple is a simplification
Basically, composing relationships
- Leading to simple relations, yet somewhat arbitrary

7 Statement Simplification
There could be simplifications of the form:
statement_1 ∧ statement_2 ∧ … ∧ statement_n → statement_t
In this case statement_t is a simplification
- this is dependent on expert knowledge
- this is not in the traditional reasoning approach
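A rule of this form can be sketched in Python as a list of triple patterns (the conjunction) plus one head pattern (the simplified statement). This is only an illustrative sketch, not the system's implementation: the `?x` variable syntax, the helper names `match` and `simplify`, and the specific rule are assumptions. The entity and relationship names come from the slide 6 figure, but which edge connects which entities is not preserved in the transcript, so the two facts below are guesses.

```python
def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`; return None on failure."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # A variable must keep the value it was first bound to.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def simplify(triples, body, head):
    """Apply one rule: body[0] AND body[1] AND ... -> head."""
    bindings = [{}]
    for pattern in body:
        bindings = [b2 for b in bindings for t in triples
                    if (b2 := match(pattern, t, b)) is not None]
    # Substitute the bound variables into the head pattern.
    return {tuple(b.get(term, term) for term in head) for b in bindings}


# Assumed facts and an assumed expert rule (hypothetical, for illustration).
facts = {
    ("Williams", "votedFor", "Chris"),
    ("Chris", "memberOf", "RepublicanParty"),
}
body = [("?x", "votedFor", "?y"), ("?y", "memberOf", "?z")]
head = ("?x", "supporterOf", "?z")

derived = simplify(facts, body, head)
print(derived)
```

The derived statement is the kind of "simple yet somewhat arbitrary" relation the slides describe: it holds only because an expert chose to write the rule that way.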

8 Statement Simplification
[Figure: graph over the classes Immigrant, FinancialOrganization, JudicialOrganization, BusinessOrganization, and Person, with the labels multipleDeposits, associated, owner, works, underInvestigation, MoneyLaundering, and suspected]

9 Using ‘simplification’ for detection of conflict
T: a set of triples
S: a function denoting the process of simplification
s: the result of simplification (S(T) → s)
U: constraints expressed in an ontology (e.g., the property ‘biologicalMother’ is unique)
E: constraints supplied by an expert (e.g., person(x) can never do action(y))
- Two sets of triples T1 and T2 are in conflict if their simplifications S(T1) → s1 and S(T2) → s2 are mutually non-agreeable
- Two simplifications s1 and s2 are mutually non-agreeable if, taken together, they are in violation of domain constraints
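The definition above can be read as a small procedure: simplify each set of triples, take the union of the results, and test that union against the constraints. The Python sketch below covers only the ontology constraints U, and only the 'unique property' kind mentioned on the slide; expert constraints E are left out. The function names, the `gaveBirthTo` property, and the toy simplification are all hypothetical and stand in for whatever S the expert defines.

```python
def toy_simplify(triples):
    """Hypothetical S: (m, gaveBirthTo, c) is simplified to (c, biologicalMother, m)."""
    return {(c, "biologicalMother", m) for m, p, c in triples if p == "gaveBirthTo"}


def violates_unique(statements, unique_properties):
    """True if a subject has two different values for a property declared unique."""
    seen = {}
    for subj, prop, obj in statements:
        if prop in unique_properties:
            if seen.setdefault((subj, prop), obj) != obj:
                return True
    return False


def in_conflict(t1, t2, simplify, unique_properties):
    """T1 and T2 conflict if s1 and s2, taken together, violate a constraint in U."""
    s1, s2 = simplify(t1), simplify(t2)
    return violates_unique(s1 | s2, unique_properties)


# Hypothetical data: two sources disagree about who gave birth to the same person.
T1 = {("Mary", "gaveBirthTo", "Bill")}
T2 = {("Anna", "gaveBirthTo", "Bill")}
U = {"biologicalMother"}

conflict = in_conflict(T1, T2, toy_simplify, U)
no_conflict = in_conflict(T1, T1, toy_simplify, U)
```

Neither T1 nor T2 violates the constraint on its own; the conflict only appears once their simplifications are combined, which is the point of the definition.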

10 Defining Rules for Simplification

11 Types of Conflicts
- Property Assertion
- Class Assertion
- Statement Assertion

12 Types of Conflicts: Property Assertion
Establish constraints on properties
- based on the semantics of their intended/expected use
- thus, subjective
Examples:
- ‘asymmetric’ constraint
- ‘disjoint’ constraint
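Both example constraints reduce to simple set checks over (subject, property, object) triples. The following Python sketch is one possible reading, not the authors' rule encoding: the function names are made up, and treating `fatherOf` as asymmetric and `marriedTo`/`fatherOf` as disjoint is an assumption chosen to match the family example on slide 5.

```python
def asymmetric_conflicts(triples, prop):
    """Pairs (a, b) where both a -prop-> b and b -prop-> a are asserted."""
    pairs = {(s, o) for s, p, o in triples if p == prop}
    # `s <= o` reports each offending pair once rather than in both directions.
    return {(s, o) for s, o in pairs if (o, s) in pairs and s <= o}


def disjoint_property_conflicts(triples, prop_a, prop_b):
    """Pairs (s, o) linked by both prop_a and prop_b, which are declared disjoint."""
    a = {(s, o) for s, p, o in triples if p == prop_a}
    b = {(s, o) for s, p, o in triples if p == prop_b}
    return a & b


# Hypothetical metadata containing one violation of each kind.
triples = {
    ("John", "fatherOf", "Bill"),
    ("Bill", "fatherOf", "John"),   # violates the asymmetric constraint
    ("John", "marriedTo", "Mary"),
    ("John", "fatherOf", "Mary"),   # violates the disjoint constraint
}

asym = asymmetric_conflicts(triples, "fatherOf")
disj = disjoint_property_conflicts(triples, "marriedTo", "fatherOf")
```

Because the constraints encode intended use rather than logical necessity, a different expert could reasonably declare a different set of properties asymmetric or disjoint.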

13 Types of Conflicts: Class Assertion
Establish constraints on classes
- based on the semantics of their intended/expected use
- also, subjective
Examples:
- ‘disjoint’ classes (schema or instances)
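At the instance level, a disjoint-class check amounts to finding instances typed with two classes that were declared disjoint. A minimal Python sketch follows; the function name, the instance names, and the choice of `Person` and `BusinessOrganization` as a disjoint pair are assumptions for illustration (the class names are borrowed from slide 8).

```python
def disjoint_class_conflicts(type_assertions, disjoint_pairs):
    """Return (instance, class_a, class_b) for instances typed with two disjoint classes."""
    classes_of = {}
    for instance, cls in type_assertions:
        classes_of.setdefault(instance, set()).add(cls)

    conflicts = set()
    for instance, classes in classes_of.items():
        for a, b in disjoint_pairs:
            if a in classes and b in classes:
                conflicts.add((instance, a, b))
    return conflicts


# Hypothetical type assertions: "acme" is typed as both a person and a business.
types = [
    ("acme", "BusinessOrganization"),
    ("acme", "Person"),
    ("john", "Person"),
]
disjoint = [("Person", "BusinessOrganization")]

class_conflicts = disjoint_class_conflicts(types, disjoint)
```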

14 Types of Conflicts: Statement Assertion
Stating that, under certain conditions, one or more statements are conflicting
Example: a person cannot be both a superior and a friend to “John” at the same time
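What separates this from a plain disjoint-property constraint is the condition: the rule applies only to statements about a particular entity. A short Python sketch of the slide's example, assuming hypothetical property names `superiorOf` and `friendOf` and a made-up function name:

```python
def conditional_conflicts(triples, target, prop_a, prop_b):
    """Subjects related to `target` by both prop_a and prop_b."""
    a = {s for s, p, o in triples if p == prop_a and o == target}
    b = {s for s, p, o in triples if p == prop_b and o == target}
    return a & b


# Hypothetical statements; only those about "John" fall under the rule.
triples = {
    ("Ann", "superiorOf", "John"),
    ("Ann", "friendOf", "John"),    # conflicts with the statement above
    ("Bob", "superiorOf", "Kate"),
    ("Bob", "friendOf", "Kate"),    # allowed: the rule is conditioned on "John"
}

offenders = conditional_conflicts(triples, "John", "superiorOf", "friendOf")
```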

15 System Architecture
[Figure: architecture diagram with components labeled Semantic Metadata Ontology, Relationship Ontology, JENA API, SERIALIZER, User Interface, RULES (RuleML), SIMPLIFICATION, CONFIDER API, MANDARAX API, Facts, and CONFLICT ENGINE]

16 Performance Evaluation
Tested with an ontology of 6K entities and 11K relationships
- subset of SWETO ontology
- domain of computer science publications
Sample conflict detection:
- the same paper must not be published in two different publication venues
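The sample rule can be approximated by grouping venue statements per paper and flagging any paper with more than one venue. The Python sketch below only illustrates the rule; the actual system expresses rules in RuleML and evaluates them with Mandarax. The property name `publishedIn` and all paper and venue identifiers are hypothetical, since the transcript does not give the SWETO property names.

```python
from collections import defaultdict


def multi_venue_papers(triples, prop="publishedIn"):
    """Map each paper asserted in more than one venue to its set of venues."""
    venues = defaultdict(set)
    for paper, p, venue in triples:
        if p == prop:
            venues[paper].add(venue)
    return {paper: vs for paper, vs in venues.items() if len(vs) > 1}


# Hypothetical publication metadata.
triples = [
    ("paper1", "publishedIn", "venueA"),
    ("paper1", "publishedIn", "venueB"),   # same paper, second venue: conflict
    ("paper2", "publishedIn", "venueA"),
    ("paper2", "author", "someAuthor"),    # unrelated property, ignored
]

flagged = multi_venue_papers(triples)
```

A flagged paper is not necessarily an error in the world; it may point to a failed entity disambiguation or an extraction mistake, which is why the result is handed to a user rather than fixed automatically.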

17 Conflict Identification Results

18 Statement Provenance

19 Performance Evaluation
- with an increasing number of conflicts (500 triples)

20 Conclusions and Discussion
Defined types of conflicts
Described a rule-based approach to identify the conflicts
Findings:
- Scalability limited by other tools (Mandarax)
- Applicable to refining extraction-based approaches for populating ontologies
- Very domain-dependent and subjective method

21 Comments, Questions, …

