Balancing Lexicographic and Ontological Considerations in Ontology Development First International Workshop on Ontological Analysis Trento, IT 16-20 July,


1 Balancing Lexicographic and Ontological Considerations in Ontology Development
First International Workshop on Ontological Analysis, Trento, IT, 16-20 July, 2012
Amanda Hicks, University at Buffalo, aellenhicks@gmail.com

2 Ontologies vs. Wordnets
Wordnets represent how we use language
– the word ‘cat’ in context
Ontologies represent what it is to be a cat
– e.g., whether being a cat is a rigid property

3 Overview of some ontologies

4 3 Layers of Ontologies
Upper: most abstract
Middle: intermediately abstract
Domain: specific to a domain or application

5 Domain Ontologies
are often developed by domain experts.
model highly specific, technical information.
are often for use in a particular community of researchers, technicians, etc.
Examples:
– Gene Ontology
– KYOTO domain ontology
– Protein Ontology

6 Middle Ontologies
are developed by ontologists or other information technologists.
model concepts that are often part of a normal spoken and written vocabulary and of an intermediate level of abstraction.
connect upper-level ontologies with the domain ontology.
Examples:
– KYOTO Middle
– Information Artifact Ontology

7 Upper Ontologies
are developed by ontologists.
model highly abstract concepts
– endurant vs. perdurant
– quality vs. substance
Because the axioms at this level will be inherited all the way down, we need to be really careful here!

8 Upper Ontologies, some examples
BFO - http://www.ifomis.org/bfo
SUMO - http://www.ontologyportal.org
DOLCE - http://www.loa.istc.cnr.it/DOLCE.html

9 BFO
is a relatively shallow top ontology
– 36 classes
– 6 layers deep
The BFO consortium coordinates many biomedical domain ontologies, users, and developers.

10 DOLCE
DOLCE-Lite
– 37 classes
– depth of 6
DOLCE-Lite Plus
– 208 classes
– depth of 13

11 The KYOTO Project
EU 7th Framework Programme project, 2007-2010
facilitates data mining and sharing from texts in the domain of ecology across seven languages
WWF & ECNC are domain users
www.kyoto-project.eu

12 The KYOTO Ontology
Three layers: Top, Middle, Domain
Seven wordnets mapped to KYOTO Ontology to facilitate data extraction and management
– English
– Spanish
– Basque
– Italian
– Dutch
– Japanese
– Chinese

13 The KYOTO Ontology
KYOTO 3 - three layers: Top, Middle, Domain
Wordnets mapped to KYOTO Ontology to facilitate data extraction and sound inference
– English
– Spanish
– Basque
– Italian
– Dutch
– Japanese
– Chinese
Use Protégé 4.0 or older. KYOTO is not written in OWL2.

14 KYOTO Top
Based on DOLCE-Lite Plus (DLP)
In DLP, qualities are modeled according to the kinds of entities that bear the quality.
– e.g., size is a physical quality since it inheres in a physical object
KYOTO Top extends the physical-quality hierarchy
– amount-of-matter-quality
– feature-quality
– physical-object-quality
Added quality types
– dispositional
– relational

15 KYOTO Top
KYOTO Top extends the role hierarchy.
Roles are arranged according to the kind of entity that bears that role.
– A physical-object-role is played by a physical object.
– In the domain layer, offspring is a subclass of organism-role, since organisms are the kinds of things that play the role of offspring.

16 KYOTO Middle
Includes:
Base Concepts (BCs) from WordNet
– nouns
Units of measurement, e.g., length, and other qualities
72 new perdurants (processes and states)
123 new endurant terms (objects and substances)
qualities that model adjectives

17 Base Concepts in KYOTO
Synsets from WordNet-3.0 (Fellbaum (1998))
– for each path from leaf to root: first node with at least 50 hyponyms
– roughly: cheap (and inadequate?) computational model for basic level concepts.
– CAREFUL: the set depends on the structure and coverage of WordNet, which is idiosyncratic
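
The selection rule on this slide (walk each leaf-to-root path and keep the first node with at least 50 hyponyms) can be sketched in Python. This is only an illustration under stated assumptions, not the KYOTO or Izquierdo et al. implementation: the toy hierarchy, the function name `base_concepts`, and the decision to count all transitive hyponyms are hypothetical choices made for the example.

```python
def base_concepts(hypernyms, threshold=50):
    """Pick, on every leaf-to-root path, the first node with >= threshold hyponyms.

    hypernyms maps each node to the list of its direct hypernyms (parents).
    Assumption: "hyponyms" is read as all transitive descendants.
    """
    # Invert the hypernym map to get direct hyponyms (children).
    children = {}
    for node, parents in hypernyms.items():
        children.setdefault(node, set())
        for parent in parents:
            children.setdefault(parent, set()).add(node)

    def descendants(node):
        seen, stack = set(), list(children[node])
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(children[current])
        return seen

    counts = {node: len(descendants(node)) for node in children}
    leaves = [node for node, kids in children.items() if not kids]

    selected = set()

    def climb(node):
        # Stop at the first node on this path that meets the threshold.
        if counts[node] >= threshold:
            selected.add(node)
            return
        for parent in hypernyms.get(node, []):
            climb(parent)

    for leaf in leaves:
        climb(leaf)
    return selected


# Hypothetical toy hierarchy, far smaller than WordNet.
toy = {
    "entity": [],
    "food": ["entity"],
    "baked_goods": ["food"],
    "cake": ["baked_goods"],
    "bread": ["baked_goods"],
    "fruit": ["food"],
    "apple": ["fruit"],
}
print(base_concepts(toy, threshold=2))
```

With such a small hierarchy the threshold has to be lowered for anything to be selected, which also illustrates the slide's caveat: the resulting set depends entirely on how the hierarchy happens to be structured.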

18 Base Concepts in KYOTO
BCs facilitate mapping wordnets onto the ontology in KYOTO.
WordNet is mapped onto the ontology via BCs.
BC equivalents in other languages are indirectly mapped onto the ontology via mappings to WordNet’s BCs.
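
The indirect mapping described here amounts to composing two lookups. The sketch below is a hypothetical illustration: the synset identifiers, the dictionary names, and the function `ontology_class` are invented for the example and do not reflect KYOTO's actual data format.

```python
# Hypothetical identifiers; real wordnet synset IDs look different.
bc_to_ontology = {          # English WordNet BC -> ontology class
    "food.n": "food",
    "book.n": "book",
}
equivalent_bc = {           # synset in another wordnet -> English WordNet BC
    "es:comida": "food.n",
    "nl:boek": "book.n",
}


def ontology_class(synset):
    """Resolve a synset to an ontology class, going through WordNet's BCs."""
    bc = equivalent_bc.get(synset, synset)  # English BCs map to themselves
    return bc_to_ontology.get(bc)           # None if no mapping is known


print(ontology_class("es:comida"))
print(ontology_class("food.n"))
```

Only the English BCs are linked to the ontology directly, so a change to that one mapping is picked up by every other language.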

19 Base Concepts in KYOTO
297 BCs from the noun hierarchy and 578 BCs from the verb hierarchy
– need work, in Domain layer
– group names such as verb_change still appear though not ontological (Izquierdo et al. (2007)).

20 Sample BCs in KYOTO’s Middle Layer
unit-of-measurement
number
color
change
book
message
food

21 BCs and KYOTO
In this case, the lexicon in conjunction with considerations of the application informed the population of the Middle and Domain layers of KYOTO.

22 KYOTO Domain

23 KYOTO Domain
Sample concepts from user scenarios
– fish family
– coast
– soil
– water
– breed
– biodiversity

24 The Lexicon & The Ontology

25 is-a

26 “is a”: The Problem
“Is-a” is ambiguous between individuals and subclasses. This can lead to confusion.
For example, species terms can be confused.
Kermit is-a_i Leptopelis vermiculatus.
Leptopelis vermiculatus is-a_c species.
Therefore, Kermit is a species.

27 “is-a”: The Rule
The Rule: Every property of a class belongs to every instance of that class.
Check for all inherited properties.
A species comprises many organisms that can successfully produce fertile offspring.
Does Kermit comprise many organisms that can successfully produce fertile offspring?

28 “is a”: KYOTO’s Solution
Model species terms twice!
Species in the sense of a group are modeled as physical pluralities. This Leptopelis vermiculatus is an instance, NOT a subclass.
‘Leptopelis vermiculatus’ can also refer to a class. This is a type of organism.
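
A small sketch can show why keeping the two readings apart blocks the bad inference. Everything here is a hypothetical toy model: the identifiers, the two dictionaries, and the helper functions are assumptions for illustration and are not taken from the KYOTO ontology files.

```python
# Toy model; all names are illustrative only.
subclass_of = {
    "leptopelis-vermiculatus": {"organism"},   # the class (type) reading
    "species": {"physical-plurality"},         # species as groups
}
instance_of = {
    "Kermit": {"leptopelis-vermiculatus"},
    "leptopelis-vermiculatus-group": {"species"},  # the group reading
}


def superclasses(cls):
    """All classes reachable from cls through subclass_of links."""
    seen, stack = set(), list(subclass_of.get(cls, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen


def types_of(individual):
    """Direct types of an individual plus everything they are subclasses of."""
    types = set()
    for cls in instance_of.get(individual, ()):
        types.add(cls)
        types |= superclasses(cls)
    return types


print(types_of("Kermit"))
print(types_of("leptopelis-vermiculatus-group"))
```

Because instantiation only propagates along subclass links, Kermit is classified as an organism but never as a species, while the group individual is a species and a physical plurality.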

29 Rigid & Non-Rigid Terms

30 Rigidity: The Problem
In ontologies and WordNet the subsumption relations are determined according to different criteria.
WordNet
– Hypernymy
– Based on psycholinguistic data; native language speakers agree with word-use.
Ontology
– Subclass
– Based on the extension of a term: every x is a y.

31 Transitivity of Subsumption
BE CAREFUL! WordNet’s hypernymy can lead to unsound inferences.
Conclusion (when ‘pet’ is read as subsuming ‘cat’): If every pet has an owner, then every cat has an owner.
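
The unsound inference can be reproduced with a few lines of Python. This is a minimal sketch under assumptions: the hierarchy below, in which "pet" sits above "cat", is a hypothetical lexical hierarchy used for illustration, and `inherited_axioms` is an invented helper, not part of any wordnet API.

```python
def inherited_axioms(term, hypernym_of, axioms):
    """Collect the axioms of a term and of everything that subsumes it."""
    collected = []
    current = term
    while current is not None:
        collected.extend(axioms.get(current, []))
        current = hypernym_of.get(current)
    return collected


# Assumed lexical hierarchy, treated naively as a subclass hierarchy.
hypernym_of = {"cat": "pet", "pet": "animal"}
axioms = {"pet": ["has an owner"]}

print(inherited_axioms("cat", hypernym_of, axioms))
```

Treating the lexical link as a subclass link makes every cat inherit "has an owner", which is exactly the conclusion the slide warns against.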

32 Rigidity: KYOTO’s Solution
Distinguish rigid and non-rigid terms in the wordnet.
– This distinction comes from OntoClean (Guarino and Welty)
Distinguish between roles and types in the ontology.
Map synsets to the ontology using different mapping relations.

33 Rigidity
“Cat” is a rigid concept. “Pet” is a non-rigid concept.
A concept is rigid if it is essential to all of its instances.
– Permanence: Fluffy is always a cat, not always a pet.
– Necessity: Fluffy cannot stop being a cat; Fluffy can stop being a pet.

34 The Rule of Thumb
(See Giancarlo’s slides for a more nuanced view.)
Non-rigid terms should not subsume rigid terms.
or
Roles should not subsume types.

35 A Jumbled Hierarchy
amount of matter
-R drug
+R antibiotic
+R chemical compound
+R oil
-R nutriment (a source of material to nourish the body)

36 Clean Hierarchies
amount-of-matter
  +R antibiotic
  +R chemical compound
  +R oil
substance-role (role played-by some amount-of-matter)
  -R drug
  -R nutriment
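
The rule of thumb (non-rigid terms should not subsume rigid terms) lends itself to a simple automated check. The sketch below is illustrative only: the +R/-R tags come from the slides, but the parent links in the "jumbled" hierarchy are an assumption (the slide's indentation was lost, and "antibiotic" is assumed to sit under "drug"), and `rigidity_violations` is a hypothetical helper.

```python
def rigidity_violations(parent_of, rigid):
    """Return (non-rigid ancestor, rigid term) pairs that break the rule."""
    found = []
    for term in parent_of:
        if rigid.get(term) is not True:
            continue
        ancestor = parent_of.get(term)
        while ancestor is not None:
            if rigid.get(ancestor) is False:
                found.append((ancestor, term))
            ancestor = parent_of.get(ancestor)
    return found


# +R / -R tags as given on the slides; untagged terms are left out.
rigid = {
    "drug": False,
    "antibiotic": True,
    "chemical compound": True,
    "oil": True,
    "nutriment": False,
}

# Assumed nesting for the jumbled hierarchy (antibiotic under drug).
jumbled = {
    "drug": "amount of matter",
    "antibiotic": "drug",
    "chemical compound": "amount of matter",
    "oil": "amount of matter",
    "nutriment": "amount of matter",
}

# Nesting of the clean hierarchies: types and roles are kept apart.
clean = {
    "antibiotic": "amount-of-matter",
    "chemical compound": "amount-of-matter",
    "oil": "amount-of-matter",
    "drug": "substance-role",
    "nutriment": "substance-role",
}

print(rigidity_violations(jumbled, rigid))
print(rigidity_violations(clean, rigid))
```

Under these assumptions the jumbled hierarchy yields one violation (drug over antibiotic), and moving the non-rigid terms under substance-role removes it.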

37 Mapping Synsets

38 Adjectives

39 Adjectives: General Strategy in KYOTO
Qualities are easily modeled according to the kinds of entities in which they inhere.
For example, amounts of matter are the kinds of things that have pH levels.

40 Adjectives: General Strategy in KYOTO
The values for specific qualities like pH levels are located in regions.

41 Adjectives: The Problem
pH levels are easy because
they are measurable, i.e., there are objective criteria.
they are confined to one kind of entity, namely, amounts of matter.

42 Adjectives: The Problem
How should we model concepts like “beneficial” or “important”?
Subjective component
Not necessarily “out there” in the world
Not typically quantifiable
Criteria are context dependent
Many kinds of entities can be beneficial or important.

43 Adjectives: KYOTO’s Solution
The middle layer has a region, evaluative-region, to accommodate adjectives like ‘beneficial’ or ‘worthless’.

44 Adjectives: KYOTO’s Solution
Concepts like “beneficial” and “important” are
not in the domain-specific layer, since they are general concepts.
not in the upper layer, since they are “subjective”.
not in a strictly realist ontology like BFO.
modeled orthogonally to “real” qualities.

45 Adjectives: KYOTO’s Solution
What kind of restriction can you write for length? long or 2m.
Indefinite qualities (e.g., long)
Definite qualities (e.g., 2m)
length q-located-in (length-measurement-unit or indefinite-quality-region)
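
The union in this restriction can be illustrated with a small membership check. This is a hedged sketch, not KYOTO's OWL encoding: the region names are taken from the slides, while the table `ALLOWED_REGIONS` and the function `satisfies_restriction` are hypothetical.

```python
# Regions in which a value of each quality may be located (q-located-in).
# Only "length" is given on the slide; the table itself is an assumption.
ALLOWED_REGIONS = {
    "length": {"length-measurement-unit", "indefinite-quality-region"},
}


def satisfies_restriction(quality, region):
    """True if a value located in `region` is admissible for `quality`."""
    return region in ALLOWED_REGIONS.get(quality, set())


# "2m" is a definite value; "long" is an indefinite one.
print(satisfies_restriction("length", "length-measurement-unit"))
print(satisfies_restriction("length", "indefinite-quality-region"))
print(satisfies_restriction("length", "evaluative-region"))
```

Allowing either region lets the same quality be filled by a measured value such as 2m or by a lexicalized adjective such as long, while values from unrelated regions are rejected.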

46 In Conclusion
Procurement - BCs influenced the concepts included in the KYOTO ontology.
Hierarchy - subsumption relations must be carefully distinguished in order to avoid influence from the lexicon that might lead to unsound inferences.
Qualities - Lexicalized adjectives that may not have a realist corollary need to be modeled in an orthogonal way.

47 Bibliography
Fellbaum, C., editor (1998). WordNet: An Electronic Lexical Database. The MIT Press.
Guarino, N., and Welty, C. (2004). An Overview of OntoClean. In Handbook on Ontologies, ed. S. Staab and R. Studer, pp. 151-172.
Herold, A., Hicks, A., Rigau, G., and Laparra, E. (2009). Kyoto Deliverable D6.2: Central Ontology Version - 1. www.kyoto-project.eu
Hicks, A., and Rigau, G. (2010). Kyoto Deliverable D8.3: Domain Extension of the Central Ontology. www.kyoto-project.eu
Izquierdo, R., Suárez, A., and Rigau, G. (2007). Exploring the automatic selection of basic level concepts. In Proceedings of the International Conference on Recent Advances on Natural Language Processing (RANLP'07), Borovetz, Bulgaria.
Masolo, C., Borgo, S., Gangemi, A., Guarino, N., Oltramari, A., and Schneider, L. (2002). Wonderweb Deliverable D17. The Wonderweb Library of Foundational Ontologies and the Dolce Ontology.
Smith, B. (2004). Beyond Concepts: Ontology as Reality Representation. In Proceedings of FOIS 2004, International Conference on Formal Ontology and Information Systems.
Vossen, P., et al. (2008). KYOTO: A System for Mining, Structuring and Distributing Knowledge Across Languages and Cultures. In Proceedings of LREC 2008, Marrakech, Morocco, May 28-30, 2008.

