Semi-Automatic Generation of Mini-Ontologies from Canonicalized Relational Tables Chris Hathaway.


1 Semi-Automatic Generation of Mini-Ontologies from Canonicalized Relational Tables Chris Hathaway

2 Introduction
- Ontologies are an important tool for realizing the vision of the semantic web.
- The major obstacle is their creation and upkeep:
  - They must be created by experts.
  - Experts are biased in their knowledge, so agreement is needed.
  - Ontologies continually change, so upkeep is a massive task.
- Some automation is needed.

3 Introduction (cont’d)
- Current attempts at automatic ontology generation have not been successful because they extract from free-form, unstructured text.
- A more effective alternative is to extract ontologies from structured data on the web (tables, charts, etc.).
- TANGO project:
  - Part 1: Extract tables from the web
  - Part 2: Define mini-ontologies from tables
  - Part 3: Merge into a growing domain ontology

4 Process Overview
- Start with a canonicalized table.
- Generate likely candidates for:
  - Object sets
  - Relationship sets
  - Functional constraints
  - Inclusion constraints / hierarchical structure
- Get help from the user when needed.
- Choose the best candidate for the ontology.

5 Thesis Statement
- So far, the generation of effective ontologies has been unsuccessful because of the free-form style of Web information.
- By extracting concepts, constraints, and hierarchies from individual tables, we can create mini-ontologies that can later be merged into a domain ontology.
- The correctness of the generated ontologies can only be judged subjectively.

6 Example 1: Generate Concepts Create list of candidate concepts (usually column names)

7 Example 1: Generate Concepts Determine lexicalization (columns with associated values are lexical)
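The two concept-generation steps above (column names as candidate concepts, then lexicalization) could be sketched roughly as follows. This is a minimal illustration, not the presenter's implementation: the function name `candidate_concepts`, the list-of-dicts table representation, and the placeholder data are all assumptions, and "has associated values" is read here as "at least one non-empty cell".

```python
def candidate_concepts(rows):
    """Map each column name (candidate concept) to True if it looks lexical.

    Assumption: the canonicalized table is a list of dicts, one per row,
    and a column counts as lexical when at least one cell is non-empty.
    """
    if not rows:
        return {}
    concepts = {}
    for column in rows[0]:
        has_values = any(
            row.get(column) is not None and str(row.get(column)).strip() != ""
            for row in rows
        )
        concepts[column] = has_values
    return concepts


# Placeholder table (hypothetical, not taken from the slides).
example_rows = [
    {"Item": "A", "Price": "10", "Notes": ""},
    {"Item": "B", "Price": "20", "Notes": ""},
]
concepts = candidate_concepts(example_rows)
```

In this sketch an empty column such as `Notes` stays a candidate concept but is flagged non-lexical, which is where a user question might be raised.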

8 Example 1: Generate Concepts Current ontology

9 Example 1: Generate Relationships
- Decide relationship sets.
- The number of possible combinations is exponential.
- Basic assumption: one main concept relates to all others (its attributes).
- Goal: find the central column of interest.

10 Example 1: Generate Relationships Look for mapping between one column and title of table
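One way to look for a mapping between a column and the table title is simple word matching. The sketch below assumes a hypothetical helper `find_central_column`, a naive singular/plural rule, and a made-up title; the slides do not specify the actual matching procedure.

```python
def find_central_column(title, columns):
    """Return the single column whose name appears in the title, else None.

    Assumed matching rule: the lowercased column name, or the name plus
    a trailing "s", appears as a word in the title.
    """
    title_words = {word.strip(".,:;()").lower() for word in title.split()}
    matches = [
        column
        for column in columns
        if column.lower() in title_words or column.lower() + "s" in title_words
    ]
    # An ambiguous or missing match is left for the user to resolve.
    return matches[0] if len(matches) == 1 else None


central = find_central_column("Catalog of Items", ["Item", "Price", "Notes"])
no_match = find_central_column("Quarterly Summary", ["Item", "Price", "Notes"])
```

Returning `None` on zero or multiple matches keeps the decision open, so the generation process can fall back on asking the user.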

11 Example 1: Generate Relationships Current ontology

12 Example 1: Generate Constraints
- FDs and participation constraints
- FD definition: X → Y iff (X[i] = X[j]) → (Y[i] = Y[j]) for all row indexes i and j.
- Unless there is a solid case (two or more identical values), only consider FDs from the central object to its attributes.
- Use heuristics to set exact participation constraints (0:1, 1:*, etc.).
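The FD definition above translates directly into a check over the rows of a table. The following is a minimal sketch under stated assumptions: the names `holds_fd` and `has_repeated_values` are hypothetical, the table is a list of dicts, and the "solid case" is read as "the determinant column contains a repeated value", which is one interpretation of the slide.

```python
def holds_fd(rows, x, y):
    """Check X -> Y: rows that agree on column x must agree on column y."""
    seen = {}
    for row in rows:
        if row[x] in seen and seen[row[x]] != row[y]:
            return False
        seen[row[x]] = row[y]
    return True


def has_repeated_values(rows, x):
    """Assumed reading of a 'solid case': some value of x occurs twice or more."""
    values = [row[x] for row in rows]
    return len(set(values)) < len(values)


# Placeholder data (hypothetical, not taken from the slides).
fd_rows = [
    {"Item": "A", "Price": 10, "Category": "x"},
    {"Item": "B", "Price": 10, "Category": "y"},
    {"Item": "C", "Price": 20, "Category": "y"},
]
item_determines_price = holds_fd(fd_rows, "Item", "Price")
price_determines_category = holds_fd(fd_rows, "Price", "Category")
```

Note that `Item -> Price` holds trivially here because no `Item` value repeats, which is why evidence from repeated values matters before trusting an FD between non-central columns.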

13 Example 1: Generate Concepts Numerical values are usually functionally determined by the column of interest and have a 0:* participation constraint.

14 Example 1: Generate Constraints Completed mini-ontology

15 Example 2: Generate Concepts
- SubFamily, Group, and SubGroup are generic types.
- Enumerate column values as object sets because there are fewer than 5 divisions (applied recursively).
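The recursive enumeration rule could look roughly like the sketch below. The function name `enumerate_hierarchy`, the nested-dict output, and the sample rows are assumptions for illustration; only the "fewer than 5 divisions" threshold comes from the slide.

```python
def enumerate_hierarchy(rows, columns, limit=5):
    """Turn the distinct values of generic-type columns into nested object sets.

    A column is enumerated only when it has fewer than `limit` distinct
    values among the given rows; the rule is then applied recursively to
    the remaining columns within each value's subset of rows.
    """
    if not columns:
        return {}
    first, rest = columns[0], columns[1:]
    values = sorted({row[first] for row in rows})
    if len(values) >= limit:
        return {}
    return {
        value: enumerate_hierarchy(
            [row for row in rows if row[first] == value], rest, limit
        )
        for value in values
    }


# Illustrative rows (assumed; the slide's actual table is not reproduced here).
language_rows = [
    {"Language": "English", "SubFamily": "Germanic", "Group": "West"},
    {"Language": "German", "SubFamily": "Germanic", "Group": "West"},
    {"Language": "Swedish", "SubFamily": "Germanic", "Group": "North"},
    {"Language": "Spanish", "SubFamily": "Italic", "Group": "Romance"},
]
hierarchy = enumerate_hierarchy(language_rows, ["SubFamily", "Group"])
```

Each key in the nested result stands for a candidate object set, with its children as candidate specializations.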

16 Example 2: Generate Relationships
- Found a mapping from the central column of interest to the title (Language).
- Exceptions to the basic assumption:
  - Hierarchy (enumerated object sets)
  - Transitive FDs (if X → Y and Y → Z, remove X → Z)
- Create the ISA hierarchy from the table structure.
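The transitive-FD rule above can be illustrated on a set of single-attribute FDs. This is a sketch only: `remove_transitive_fds` is a hypothetical name, FDs are modeled as `(X, Y)` pairs, the column names in the example are placeholders, and the code assumes the FD set has no cycles (with cycles, this simple rule could drop too much).

```python
def remove_transitive_fds(fds):
    """Drop X -> Z whenever X -> Y and Y -> Z are both present.

    `fds` is a set of (determinant, dependent) pairs. Assumes no cyclic FDs.
    """
    redundant = {
        (x, z)
        for (x, y) in fds
        for (y2, z) in fds
        if y == y2 and x != y and y != z and x != z and (x, z) in fds
    }
    return fds - redundant


# Placeholder column names, chosen only to illustrate the rule.
all_fds = {
    ("Language", "SubFamily"),
    ("SubFamily", "Family"),
    ("Language", "Family"),
}
reduced_fds = remove_transitive_fds(all_fds)
```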

17 Example 2: Generate Relationships Current ontology

18 Example 2: Generate Hierarchical Constraints
- Assign members to each object set for easy calculation.
- Find inclusion dependencies:
  - Union: every member of the parent is a member of at least one child.
  - Intersection (less common): the child's members are always in both parents.
  - Mutual exclusion: the intersection of the members of any two children is empty.
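Once members are assigned to each object set, the three inclusion dependencies reduce to set operations. The sketch below uses Python sets; the function names and the integer member IDs are assumptions made for illustration.

```python
from itertools import combinations


def is_union(parent, children):
    """Union: every parent member belongs to at least one child."""
    return parent <= set().union(*children)


def is_mutually_exclusive(children):
    """Mutual exclusion: no two children share a member."""
    return all(a.isdisjoint(b) for a, b in combinations(children, 2))


def is_intersection(child, parents):
    """Intersection: every child member belongs to all of the parents."""
    return all(child <= parent for parent in parents)


# Placeholder member sets (hypothetical).
parent_members = {1, 2, 3, 4}
partitioned = [{1, 2}, {3, 4}]
overlapping = [{1, 2}, {2, 3}]

union_holds = is_union(parent_members, partitioned)
exclusion_holds = is_mutually_exclusive(partitioned)
```

When both union and mutual exclusion hold, as with `partitioned` here, the children form a partition of the parent.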

19 Example 2: Generate Hierarchical Constraints Completed mini-ontology

20 Getting Help from the User
- Sometimes human intervention is required to move on in the generation process.
- Effective use of the user’s input will rely on IDS statements:
  - Issue: explains the problem (e.g., no central object was found in the table).
  - Default: describes the default behavior (e.g., a new non-lexical object named Object will be created).
  - Suggestion: suggests an action for the user to follow (e.g., either choose the column central to describing the table, or name the new object set something appropriate).
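An IDS statement is naturally a small record with three fields. The sketch below is one possible representation, not the tool's actual data structure: the class name `IDSStatement` and the `render` method are assumptions, while the example text comes from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IDSStatement:
    """Issue / Default / Suggestion message shown to the user."""

    issue: str
    default: str
    suggestion: str

    def render(self):
        return (
            f"Issue: {self.issue}\n"
            f"Default: {self.default}\n"
            f"Suggestion: {self.suggestion}"
        )


no_central_object = IDSStatement(
    issue="No central object was found in the table.",
    default="A new non-lexical object named Object will be created.",
    suggestion=(
        "Either choose the column central to describing the table, "
        "or name the new object set something appropriate."
    ),
)
message = no_central_object.render()
```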

21 Choosing the Best Ontology
- Even with the given guidelines, a large set of possible mini-ontologies could still remain.
- Two options:
  - Ask the user a few of the most limiting questions to reduce the set to a small number.
  - Rank the ontologies according to how well they follow the guidelines and how they compare to other tables and the domain ontology.
- Pass the smaller set to the merging process.

22 Contributions to Computer Science
- Provides a larger resource for ontology-based information extraction.
- A quick and effective way of gathering information from the Web.
- A semi-automatic tool for generating useful ontologies, which serves the goals of the semantic web.

