Thesis Defense Mini-Ontology GeneratOr (MOGO) Mini-Ontology Generation from Canonicalized Tables Stephen Lynn Data Extraction Research Group Department.


1 Thesis Defense: Mini-Ontology GeneratOr (MOGO)
Mini-Ontology Generation from Canonicalized Tables
Stephen Lynn
Data Extraction Research Group
Department of Computer Science
Brigham Young University
Supported by the

2 TANGO Overview
TANGO: Table ANalysis for Generating Ontologies
Project consists of the following three components:
1. Transform tables into a canonicalized form
2. Generate mini-ontologies
3. Merge into a growing ontology

3 Sample Input
Region and State Information
Location    Population (2000)  Latitude  Longitude
Northeast   2,122,869
Delaware    817,376            45        -90
Maine       1,305,493          44        -93
Northwest   9,690,665
Oregon      3,559,547          45        -120
Washington  6,131,118          43        -120
Sample Output

4 Thesis Defense Mini-Ontology GeneratOr (MOGO)
• Concept/Value Recognition
• Relationship Discovery
• Constraint Discovery

5 Concept/Value Recognition
• Lexical Clues
  • Labels as data values
  • Data value assignment
• Data Frame Clues
  • Labels as data values
  • Data value assignment
• Default
  • Classifies any unclassified elements according to a simple heuristic.
[Figure: Concepts and Value Assignments. Labels shown: Location, Population, Latitude, Longitude, Region, State, Year. Values shown: Northeast, Northwest, Delaware, Maine, Oregon, Washington; 2,122,869, 817,376, 1,305,493, 9,690,665, 3,559,547, 6,131,118; 45, 44, 45, 43; -90, -93, -120; 2002, 2003.]
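As a rough illustration of the "labels as data values" idea and the default step, the sketch below checks each cell against a few recognizers and falls back to treating the cell as a concept label. Everything in it is an assumption made for illustration: the `DATA_FRAMES` lexicons and pattern, the `recognize` function, and the fallback rule are hypothetical, since the slides do not say how MOGO implements its clues or its default heuristic.

```python
import re

# Hypothetical data frames: each maps a concept name to a recognizer for
# its values. These lexicons and the pattern are illustrative only.
DATA_FRAMES = {
    "State": lambda s: s in {"Delaware", "Maine", "Oregon", "Washington"},
    "Region": lambda s: s in {"Northeast", "Northwest"},
    "Population": lambda s: re.fullmatch(r"\d{1,3}(,\d{3})+", s) is not None,
}

def recognize(cell):
    """Return (kind, concept) for a cell; kind is 'value' or 'concept'."""
    text = cell.strip()
    for concept, matches in DATA_FRAMES.items():
        if matches(text):
            # The cell is a data value of a known concept, even if the
            # table used it as a row label (for example "Delaware").
            return ("value", concept)
    # Assumed default: anything unrecognized is treated as a concept label.
    return ("concept", text)

results = {c: recognize(c) for c in ["Delaware", "Northwest", "817,376", "Latitude"]}
```

Here a row label such as "Delaware" ends up as a value of the State concept rather than as a concept of its own, which is the behavior the "labels as data values" bullets point at.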

6 Relationship Discovery
• Dimension Tree Mappings
• Lexical Clues
• Generalization/Specialization
• Aggregation
• Data Frames
• Ontology Fragment Merge

7 Constraint Discovery
• Generalization/Specialization
• Computed Values
• Functional Relationships
• Optional Participation
Region and State Information
Location    Population (2000)  Latitude  Longitude
Northeast   2,122,869
Delaware    817,376            45        -90
Maine       1,305,493          44        -93
Northwest   9,690,665
Oregon      3,559,547          45        -120
Washington  6,131,118          43        -120
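Two of the constraint types on this slide can be checked directly against the sample table. The sketch below is a minimal illustration, not MOGO's algorithm: the `parent` field (which states belong to which region) is an assumption, because the flattened table does not record the nesting, and the function names are hypothetical.

```python
# Rows from the "Region and State Information" table on the slide.
# The "parent" field is an assumption about which region contains each state.
rows = [
    {"location": "Northeast", "parent": None, "population": 2122869},
    {"location": "Delaware", "parent": "Northeast", "population": 817376},
    {"location": "Maine", "parent": "Northeast", "population": 1305493},
    {"location": "Northwest", "parent": None, "population": 9690665},
    {"location": "Oregon", "parent": "Northwest", "population": 3559547},
    {"location": "Washington", "parent": "Northwest", "population": 6131118},
]

def computed_totals(rows):
    """Return parents whose population equals the sum of their children."""
    sums = {}
    for row in rows:
        if row["parent"] is not None:
            sums[row["parent"]] = sums.get(row["parent"], 0) + row["population"]
    by_name = {row["location"]: row["population"] for row in rows}
    return sorted(p for p, total in sums.items() if by_name.get(p) == total)

def is_functional(rows, key, attr):
    """True if each key value maps to at most one attr value."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[key], row[attr]) != row[attr]:
            return False
    return True

totals = computed_totals(rows)  # regions whose value is a computed sum
functional = is_functional(rows, "location", "population")
```

With the sample data, both region populations equal the sum of their states (817,376 + 1,305,493 = 2,122,869 and 3,559,547 + 6,131,118 = 9,690,665), and each location has exactly one population, so Location to Population is functional.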

8 Validation
• Concept/Value Recognition
  • Correctly identified concepts
  • Missed concepts
  • False positives
  • Data values assignment
• Relationship Discovery
  • Valid relationship sets
  • Invalid relationship sets
  • Missed relationship sets
• Constraint Discovery
  • Valid constraints
  • Invalid constraints
  • Missed constraints
                        Precision  Recall  F-measure
Concept Recognition     87%        94%     90%
Relationship Discovery  73%        81%     77%
Constraint Discovery    89%        91%     90%
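The F-measure column is consistent with the balanced F-measure (the harmonic mean of precision and recall). The slides do not state which variant was used, so that is an assumption, but the sketch below recomputes the column from the reported precision and recall and arrives at the same rounded figures. The function and variable names are illustrative.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported on the slide.
reported = {
    "Concept Recognition": (0.87, 0.94),
    "Relationship Discovery": (0.73, 0.81),
    "Constraint Discovery": (0.89, 0.91),
}

# Recomputed F-measure per task, as a whole-number percentage.
recomputed = {
    task: round(100 * f_measure(p, r)) for task, (p, r) in reported.items()
}
```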

9 Concept Recognition
• What we counted:
  • Correct/Incorrect/Missing Concepts
  • Correct/Incorrect/Missing Labels
  • Data value assignments

10 Relationship Discovery
• What we counted:
  • Correct/incorrect/missing relationship sets
  • Correct/incorrect/missing aggregations and generalization/specializations

11 Constraint Discovery
• What we counted:
  • Correct/Incorrect/Missing:
    • Generalization/Specialization constraints
    • Computed value constraints
    • Functional constraints
    • Optional constraints
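Counts of correct, incorrect, and missing items map onto precision and recall in the usual way: precision is correct / (correct + incorrect) and recall is correct / (correct + missing). The sketch below assumes that standard mapping, which the slides do not spell out, and the counts in the example are made up for illustration; they are not results from the thesis.

```python
def precision_recall(correct, incorrect, missing):
    """Precision and recall from correct/incorrect/missing counts."""
    found = correct + incorrect      # everything the tool produced
    expected = correct + missing     # everything it should have produced
    precision = correct / found if found else 0.0
    recall = correct / expected if expected else 0.0
    return precision, recall

# Made-up counts for illustration only.
p, r = precision_recall(correct=8, incorrect=2, missing=0)
```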

12 Concept Recognition
• Successes
  • 98% of concepts identified
  • Missing label identification
  • 97% of values assigned to correct concept
• Common problems
  • Finding an appropriate label
  • Duplicate concepts

13 Relationship Discovery
• Recall of 92% for relationship sets
• Missing aggregations and generalizations/specializations
  • Only found in label nesting

14 Constraint Discovery
• F-measure of 98% for functional relationship sets
• Poor computed value discovery
  • Rows/Columns with totals

15 Conclusions
• Tool to generate mini-ontologies
• Assessment of accuracy of automatic generation
                        Precision  Recall  F-measure
Concept Recognition     87%        94%     90%
Relationship Discovery  73%        81%     77%
Constraint Discovery    89%        91%     90%

16 Future Work
• Tool Enhancements
  • Linguistic processing
  • Data frame library
  • Domain specific heuristics
• Alternate Uses
  • Annotation for the Semantic Web

