1 From thesauri to rich ontologies: The AGROVOC case Boris Lauser Food and Agriculture Organization (FAO) Rome, Italy


1 From thesauri to rich ontologies: The AGROVOC case
Boris Lauser, Food and Agriculture Organization (FAO), Rome, Italy
boris.lauser@fao.org, www.fao.org
DELOS Workshop, Lund, Sweden, June 23, 2004

2 The problem
AI and Semantic Web applications need full-fledged ontologies that support reasoning.
Constructing such ontologies is expensive.
Existing KOS (KOS = Knowledge Organization System) do not provide the full set of precise concept relationships needed for reasoning, but existing KOS, both large and small, represent much intellectual capital.
How can this intellectual capital be put to use in constructing full-fledged ontologies?
Specifically: from AGROVOC to a full-fledged Food and Agriculture Ontology.

3 Some applications of a Food and Agriculture Ontology
Advice on crops and crop management (fertilization, irrigation)
Advice on pest management
Tracking contaminants through the food chain
Advice on safe food processing
Computing nutrition labels
Advice on healthy eating
Improved searching

4 AGROVOC relationships compared with more differentiated relationships of a Food and Agriculture Ontology

5 AGROVOC | Food and Agriculture Ontology
AGROVOC: undifferentiated hierarchical relationships
milk NT cow milk
milk NT milk fat
cows NT cow milk
Cheddar cheese BT cow milk
Food and Agriculture Ontology: differentiated relationships
milk […] cow milk
milk […] milk fat
cows […] cow milk*
Cheddar cheese […] cow milk
Rule 1: Part X […] Substance Y IF Animal W […] Part X AND Animal W […] Substance Y
Rule 2: Food Z […] Substance Y IF Food Z […] Part X AND Part X […] Substance Y
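As a rough illustration of why differentiated relationships matter for reasoning, the sketch below applies a rule shaped like Rule 2 to a few triples. The labels `madeFrom` and `containsSubstance` are placeholders assumed for this example, since the transcript does not preserve the relationship names from the slide, and the input facts are illustrative only.

```python
# Hypothetical relationship labels; they are NOT taken from the slide.
MADE_FROM = "madeFrom"
CONTAINS = "containsSubstance"


def apply_rule_2(triples):
    """Food Z CONTAINS Substance Y IF Food Z MADE_FROM Part X AND Part X CONTAINS Substance Y."""
    facts = set(triples)
    inferred = set()
    for food, rel1, part in facts:
        if rel1 != MADE_FROM:
            continue
        for subject, rel2, substance in facts:
            if rel2 == CONTAINS and subject == part:
                candidate = (food, CONTAINS, substance)
                # Only report facts that are not already known.
                if candidate not in facts:
                    inferred.add(candidate)
    return inferred


# Illustrative input facts (assumed for the example).
facts = [
    ("Cheddar cheese", MADE_FROM, "cow milk"),
    ("cow milk", CONTAINS, "milk fat"),
]
new_facts = apply_rule_2(facts)
print(new_facts)
```

With undifferentiated BT/NT links alone, this inference could not be stated, because the rule depends on knowing which kind of link connects each pair.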

6 From AGROVOC to FA Ontology
1) Define the FA Ontology structure
2) Fill in values from AGROVOC to the extent possible
3) Edit manually with computer assistance, using the rules-as-you-go approach and an ontology editor:
   make existing information more precise
   add new information

7 Define ontology structure: Overall model

8 Overall model (diagram)
Elements: Concept, Lexicalization/Term, String, Note, Relationship
Link labels: designated by, manifested as, annotation relationship
Relationships between concepts, between terms, between strings, and between relationships
Other information: language/culture, subvocabulary/scope, audience type, etc.

9 Define ontology structure: Relationship types

10 Isa
Table columns: Relationship | Inverse relationship (entries relate X and Y)

11 Holonymy / meronymy (the generic whole-part relationship)
Table columns: Relationship | Inverse relationship (entries relate X and Y)

12 Further relationship examples
Table columns: Relationship | Inverse relationship (entries relate X and Y)

13 Fill in values from AGROVOC
Fill in values from AGROVOC to the extent possible.
Arrange in structured sequence (to the extent possible based on the information in AGROVOC) to facilitate editing. (The editor can deal with similar problems at the same time.)
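One possible reading of "structured sequence" is sketched below: relationship instances are grouped by thesaurus relationship and by the last word of the target term, so that similar cases sit next to each other. The grouping key is an assumption made for illustration; the slide does not say how the sequence is built.

```python
from collections import defaultdict


def group_for_editing(relationships):
    """Group (source, rel, target) instances by (rel, last word of target)."""
    groups = defaultdict(list)
    for source, rel, target in relationships:
        head_word = target.split()[-1]
        groups[(rel, head_word)].append((source, rel, target))
    # Sort the keys so the editor sees a stable, predictable order.
    return dict(sorted(groups.items()))


sample = [
    ("milk", "NT", "cow milk"),
    ("acid soils", "BT", "chemical soil types"),
    ("milk", "NT", "goat milk"),
    ("acrisols", "BT", "genetic soil types"),
    ("milk", "NT", "milk fat"),
]
grouped = group_for_editing(sample)
for key, members in grouped.items():
    print(key, members)
```

Grouping of this kind puts `milk NT cow milk` and `milk NT goat milk` side by side, which is what lets an editor handle them in one step.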

14 Undifferentiated relationships from AGROVOC | Edited relationships
Undifferentiated relationships from AGROVOC:
milk NT cow milk
milk NT goat milk
milk NT buffalo milk
milk NT milk fat
milk RT milk protein
milk RT lactose
cows RT cow milk
goats RT goat milk
ewes RT ewe milk
goat milk RT goat cheese
ewe milk RT ewe cheese
acid soils BT chemical soil types
acrisols BT genetic soil types
alkaline soils BT chemical soil types
alluvial soils BT lithological soil types
chemical soil types BT soil types
Cichorium BT Asteraceae
Cichorium endivia BT Cichorium
Cichorium intybus BT Cichorium
Cichorium intybus RT coffee substitutes
Cichorium intybus RT root vegetables
blood NT blood protein
blood NT blood lipids
Edited relationships: (none at this step)

15 Edit manually with computer assistance
Use the rules-as-you-go approach and good ontology editing software that handles large ontologies efficiently:
make existing information more precise
add new information
Assumption: Entity types of concepts are known from AGROVOC or other sources (Langual, UMLS, WordNet); for example:
milk fat is a Substance
Asteraceae is a taxon
The editor may need to determine the entity type.

16 The rules-as-you-go approach
Exploit patterns to automate the conversion process.
Example:
1. An editor has determined that milk NT cow milk should become milk […] cow milk.
2. She recognizes that this is an example of the general pattern milk NT * milk → milk […] * milk (where * is the wildcard character).
3. Given this pattern, the system can derive automatically that milk NT goat milk should become milk […] goat milk.
Result:
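The wildcard step can be sketched as a small pattern matcher. This is a minimal illustration, not the system described in the talk: the function names are invented here, and `includesSpecific` is a placeholder label, since the transcript does not preserve the relationship name used on the slide.

```python
import re


def compile_pattern(pattern):
    """Turn a pattern such as 'milk NT * milk' into a regex; each '*' captures text."""
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("(.+)".join(parts))


def apply_pattern(pattern, template, statement):
    """Rewrite `statement` with `template` if it matches `pattern`, else return None."""
    match = compile_pattern(pattern).fullmatch(statement)
    if match is None:
        return None
    result = template
    for captured in match.groups():
        # Fill the wildcards of the template from left to right.
        result = result.replace("*", captured, 1)
    return result


PATTERN = "milk NT * milk"
TEMPLATE = "milk includesSpecific * milk"  # hypothetical label

for statement in ["milk NT goat milk", "milk NT buffalo milk", "milk NT milk fat"]:
    print(statement, "=>", apply_pattern(PATTERN, TEMPLATE, statement))
```

The statement `milk NT milk fat` does not end in "milk", so the pattern leaves it alone and it stays in the queue for a later rule.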

17 Undifferentiated relationships from AGROVOC | Edited relationships
Undifferentiated relationships from AGROVOC: same list as on slide 14.
Edited relationships:
milk […] cow milk
milk […] goat milk
milk […] buffalo milk

18 The rules-as-you-go approach
Exploit patterns to automate the conversion process.
1. Editor: milk NT milk fat → milk […] milk fat
2. Pattern: Substance NT/RT Substance → Substance […] Substance
3. Therefore: milk RT milk protein → milk […] milk protein
Result:
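A pattern stated over entity types rather than term strings could be sketched as a lookup keyed on (source type, thesaurus relationship, target type). The entity-type table and the label `containsSubstance` are assumptions for this example; the talk presupposes that entity types are already known, and the transcript does not preserve the real label.

```python
# Entity types are assumed to be known already (illustrative subset).
ENTITY_TYPE = {
    "milk": "Substance",
    "milk fat": "Substance",
    "milk protein": "Substance",
    "lactose": "Substance",
    "cows": "Animal",
}

# Hypothetical label standing in for the relationship name on the slide.
TYPED_PATTERNS = {
    ("Substance", "NT", "Substance"): "containsSubstance",
    ("Substance", "RT", "Substance"): "containsSubstance",
}


def convert(source, rel, target):
    """Return the differentiated triple, or None if no typed pattern applies."""
    key = (ENTITY_TYPE.get(source), rel, ENTITY_TYPE.get(target))
    label = TYPED_PATTERNS.get(key)
    if label is None:
        return None
    return (source, label, target)


print(convert("milk", "RT", "milk protein"))
print(convert("cows", "RT", "cow milk"))
```

Terms with an unknown entity type simply fall through, which matches the point that the editor may first have to determine the type.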

19 Undifferentiated relationships from AGROVOC | Edited relationships
Undifferentiated relationships from AGROVOC: same list as on slide 14.
Edited relationships:
milk […] cow milk
milk […] goat milk
milk […] buffalo milk
milk […] milk fat
milk […] milk protein
milk […] lactose
goat milk […] goat cheese
ewe milk […] ewe cheese
blood […] blood protein
blood […] blood lipids

20 The rules-as-you-go approach
Exploit patterns to automate the conversion process.
1. Editor: cows RT cow milk → cows […] cow milk
2. Pattern: Animal RT BodyPart → Animal […] BodyPart
3. Therefore: goats RT goat milk → goats […] goat milk
Result:

21 Undifferentiated relationships from AGROVOC | Edited relationships
Undifferentiated relationships from AGROVOC: same list as on slide 14.
Edited relationships:
milk […] cow milk
milk […] goat milk
milk […] buffalo milk
milk […] milk fat
milk […] milk protein
milk […] lactose
cows […] cow milk
goats […] goat milk
ewes […] ewe milk
goat milk […] goat cheese
ewe milk […] ewe cheese
blood […] blood protein
blood […] blood lipids

22 The rules-as-you-go approach
Exploit patterns to automate the conversion process.
1. Editor: acid soils BT chemical soil types → acid soils […] chemical soil types
2. Pattern: X BT * type* → X […] * type*
3. Therefore: acrisols BT genetic soil types → acrisols […] genetic soil types
Result:
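For a pattern that constrains only the target term, a glob match on that term may be enough. The sketch below uses Python's `fnmatch.fnmatchcase`; the label `hasType` is a hypothetical placeholder, because the relationship name on the slide is not preserved in the transcript.

```python
from fnmatch import fnmatchcase

TARGET_GLOB = "* type*"  # the target part of the pattern 'X BT * type*'
LABEL = "hasType"        # hypothetical label, not taken from the slide


def convert_type_statement(source, rel, target):
    """Apply 'X BT * type*': any source, relationship BT, target matching the glob."""
    if rel == "BT" and fnmatchcase(target, TARGET_GLOB):
        return (source, LABEL, target)
    return None


print(convert_type_statement("acrisols", "BT", "genetic soil types"))
print(convert_type_statement("Cichorium", "BT", "Asteraceae"))
```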

23 Undifferentiated relationships from AGROVOC | Edited relationships
Undifferentiated relationships from AGROVOC: same list as on slide 14.
Edited relationships:
milk […] cow milk
milk […] goat milk
milk […] buffalo milk
milk […] milk fat
milk […] milk protein
milk […] lactose
cows […] cow milk
goats […] goat milk
ewes […] ewe milk
goat milk […] goat cheese
ewe milk […] ewe cheese
acid soils […] chemical soil types
acrisols […] genetic soil types
alkaline soils […] chemical soil types
alluvial soils […] lithological soil types
chemical soil types […] soil types
blood […] blood protein
blood […] blood lipids

24 The rules-as-you-go approach
Exploit patterns to automate the conversion process.
1. Editor: Cichorium BT Asteraceae → Cichorium […] Asteraceae
2. Pattern: Taxon BT Taxon → Taxon […] Taxon
3. Therefore: Cichorium endivia BT Cichorium → Cichorium endivia […] Cichorium
Result:

25 Undifferentiated relationships from AGROVOC | Edited relationships
Undifferentiated relationships from AGROVOC: same list as on slide 14.
Edited relationships:
milk […] cow milk
milk […] goat milk
milk […] buffalo milk
milk […] milk fat
milk […] milk protein
milk […] lactose
cows […] cow milk
goats […] goat milk
ewes […] ewe milk
goat milk […] goat cheese
ewe milk […] ewe cheese
acid soils […] chemical soil types
acrisols […] genetic soil types
alkaline soils […] chemical soil types
alluvial soils […] lithological soil types
chemical soil types […] soil types
Cichorium […] Asteraceae
Cichorium endivia […] Cichorium
Cichorium intybus […] Cichorium
blood […] blood protein
blood […] blood lipids

26 The rules-as-you-go approach
Exploit patterns to automate the conversion process.
1. Editor: Cichorium intybus RT coffee substitutes → Cichorium intybus […] coffee substitutes
2. Pattern: Taxon RT FoodProduct → Taxon […] FoodProduct
3. Therefore: Cichorium intybus RT root vegetables → Cichorium intybus […] root vegetables
Result:

27 Undifferentiated relationships from AGROVOC | Edited relationships
Undifferentiated relationships from AGROVOC: same list as on slide 14.
Edited relationships:
milk […] cow milk
milk […] goat milk
milk […] buffalo milk
milk […] milk fat
milk […] milk protein
milk […] lactose
cows […] cow milk
goats […] goat milk
ewes […] ewe milk
goat milk […] goat cheese
ewe milk […] ewe cheese
acid soils […] chemical soil types
acrisols […] genetic soil types
alkaline soils […] chemical soil types
alluvial soils […] lithological soil types
chemical soil types […] soil types
Cichorium […] Asteraceae
Cichorium endivia […] Cichorium
Cichorium intybus […] Cichorium
Cichorium intybus […] coffee substitutes
Cichorium intybus […] root vegetables
blood […] blood protein
blood […] blood lipids

28 The rules-as-you-go approach: Discussion
Main idea: Formulate constraints to assist the editor.
The ontology may have many relationship types, perhaps > 100.
Constraints limit the relationship types that are possible in a specific case; show the editor only these.
If the constraints limit the possible relationship types to 1, conversion is automatic.
Constraints may depend on the thesaurus to be converted.

29 Constraints
Thesaurus relationships | Possible ontology relationships
NT / BT | […] | etc.
RT | […] | etc.

30 Constraints
Thesaurus relationships + entity types or values | Possible ontology relationships
milk NT * milk | milk […] * milk
Substance NT Substance | […] Substance
X BT * type* | X […] * type*
Taxon BT Taxon | […] Taxon
GeogrEntity BT GeogrEntity | […] GeogrEntity
BodyPart BT BodyPart | […] BodyPart
ChemSubstance BT ChemSubstance | […] ChemSubstance

31 Constraints
Thesaurus relationships + entity types or values | Possible ontology relationships
Substance RT Substance | […] Substance
LivingOrganism RT BodyPart | LivingOrganism […] BodyPart
Taxon RT FoodProduct | Taxon […] FoodProduct
GeogrEntity RT GeogrGrouping | GeogrEntity […] GeogrGrouping
Process RT Object | Process […] Object
ChemSubstance RT Function | ChemSubstance […] Function
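The constraint tables can be read as a lookup that narrows the full set of relationship types to the few that are possible for a given case. A minimal sketch follows, with invented relationship names (`relA` to `relD`) and an invented constraint table, since the actual labels are not preserved in the transcript.

```python
# Hypothetical relationship types; a real ontology may have more than 100.
ALL_RELATIONSHIP_TYPES = ["relA", "relB", "relC", "relD"]

# Hypothetical constraints: (source type, thesaurus relationship, target type)
# mapped to the ontology relationship types that remain possible.
CONSTRAINTS = {
    ("Taxon", "BT", "Taxon"): ["relA"],
    ("Substance", "RT", "Substance"): ["relB", "relC"],
}


def candidates(source_type, thesaurus_rel, target_type):
    """Return the relationship types allowed for this case (all types if unconstrained)."""
    key = (source_type, thesaurus_rel, target_type)
    return CONSTRAINTS.get(key, ALL_RELATIONSHIP_TYPES)


def resolve(source_type, thesaurus_rel, target_type):
    """Convert automatically when one candidate remains, otherwise build a menu."""
    options = candidates(source_type, thesaurus_rel, target_type)
    if len(options) == 1:
        return ("automatic", options[0])
    return ("menu", options)


print(resolve("Taxon", "BT", "Taxon"))
print(resolve("Substance", "RT", "Substance"))
print(resolve("Process", "RT", "Object"))
```

Keeping the constraints in a plain table means they can be swapped out per thesaurus, in line with the remark that constraints may depend on the thesaurus being converted.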

32 Checking by editor
Relationship instances created by the editor by selecting from a constraint-generated menu are final.
Relationship instances created automatically must be presented to the editor.
If the editor determines that the relationship instances are almost always correct, she checks a box "accept without checking".
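The checking policy can be sketched as a simple routing step. The class and field names below are invented for illustration, and the instance strings use a placeholder label; only the three routing cases come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Pattern:
    name: str
    accept_without_checking: bool = False  # the editor's checkbox


@dataclass
class ReviewState:
    final: list = field(default_factory=list)
    pending: list = field(default_factory=list)


def record(state, instance, pattern=None):
    """Route a relationship instance.

    pattern is None   -> the editor chose it from a menu, so it is final.
    trusted pattern   -> accepted without checking.
    any other pattern -> queued for the editor to review.
    """
    if pattern is None or pattern.accept_without_checking:
        state.final.append(instance)
    else:
        state.pending.append(instance)


state = ReviewState()
taxon_rule = Pattern("Taxon BT Taxon", accept_without_checking=True)
food_rule = Pattern("Taxon RT FoodProduct")

record(state, "milk relX cow milk")                           # picked from a menu
record(state, "Cichorium endivia relY Cichorium", taxon_rule)  # trusted pattern
record(state, "Cichorium intybus relZ root vegetables", food_rule)

print(state.final)
print(state.pending)
```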

33 Overall conversion process
One master editor must go through the file from start to finish, processing the relationship instances, creating patterns, and creating new relationship types as needed.
Assistant editors can apply the patterns.
In the first pass, the master editor should deal with the easy cases and leave the remaining cases for later.
Groups of similar relationship instances can be seen more easily in a smaller set.

34 Adding new relationship types and new relationship instances
AGROVOC does not contain all the relationship types or relationship instances needed for AI applications, so data must be added. For example:
Organism X […] Organism Y
ChemSubstance X […] Organism Y
Organism X […] Organism Y
Plant X […] Environment Y
FoodProduct X […] Diet Y

35 Conclusion
The rules-as-you-go approach is a realistic method for developing a rich ontology from an existing thesaurus.
Full paper: Reengineering Thesauri for New Applications: the AGROVOC Example. Journal of Digital Information, Volume 4 Issue 4. http://jodi.ecs.soton.ac.uk/Articles/v04/i04/Soergel/

36 References
For questions and discussion contact:
Boris Lauser, boris.lauser@fao.org
Dagobert Soergel, dsoergel@umd.edu
AOS: Agricultural Ontology Service Project, http://www.fao.org/agris/aos
AGMES: http://www.fao.org/agris/agmes

