Using Schema Matching to Simplify Heterogeneous Data Translation Tova Milo, Sagit Zohar Tel Aviv University.


1 Using Schema Matching to Simplify Heterogeneous Data Translation Tova Milo, Sagit Zohar Tel Aviv University

2 Introduction: Large amounts of data are available on the Web, but the data formats are not homogeneous. Most applications can handle only one format or a small number of formats, so data must be translated from one format to another.

3 Introduction: There are two approaches to translating data: (1) a specific program that translates from format A to format B (e.g. LaTeX to HTML); (2) data translation languages.

4 Introduction: The solution is TranScm, a data translation system. It automatically translates a portion (often a large portion) of the desired data. It does not replace data translation languages, but reduces the amount of programming needed in them.

5 TranScm Architecture (diagram components): Rule Base; Matching Module; Typing Module; GUI; Input Schema; Output Schema; Import/Export Library

6 Data Model: A tree (forest) model, similar to OEM. It allows an order on children and can handle cyclic structures by using ids as “pointers”.

7 Data Model (example tree): Article; title “Conceptual Concepts”; authors; author “Al Gore Ithm”, “G WWW Bush”; sections
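The tree model above can be pictured with a small sketch. This is only an illustration, not TranScm's actual representation: the `Node` class, its field names, the `index_by_id` helper, and the `cites` node under `sections` are all hypothetical, and the sketch assumes each quoted name in the example is its own `author` node.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    label: str
    value: Optional[str] = None      # atomic value, used by leaf nodes
    children: list["Node"] = field(default_factory=list)  # a list keeps child order
    node_id: Optional[str] = None    # optional id that other nodes can point to
    ref: Optional[str] = None        # id of another node, acting as a "pointer"


def index_by_id(root: Node) -> dict:
    """Collect every node that carries an id, so refs can be resolved later."""
    found = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.node_id is not None:
            found[node.node_id] = node
        stack.extend(node.children)
    return found


article = Node("Article", children=[
    Node("title", value="Conceptual Concepts"),
    Node("authors", children=[
        Node("author", value="Al Gore Ithm", node_id="a1"),
        Node("author", value="G WWW Bush", node_id="a2"),
    ]),
    # Hypothetical child: it points back at an author by id instead of
    # nesting the node, which is how the tree stays a tree despite the link.
    Node("sections", children=[Node("cites", ref="a1")]),
])

ids = index_by_id(article)
cited = ids[article.children[2].children[0].ref]
```

Because references are stored as ids rather than as nested nodes, a structure that is logically cyclic can still be stored and traversed as a plain ordered tree.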

8 Schema Model: Schemas are labeled graphs. Some nodes may be ordered. Each vertex is a schema element (type), and labels carry information about the node.

9 Schema Model Article [3] title [1] string authors [0,…,->] author [1] ref string sections [2]

10 Rules: Rules are the basis of both matching and translation. Each rule has an associated priority.

11 Rules: Each rule has two components: a matching component (a Match function and a Decendents (sic) function) and a translation component (a Translation function).

12 Matching The Match function examines schema labels to determine possible matches. The Decendents function checks the numbers and types of the children of the current node.

13 Matching (example): one schema has Article with child author; the other has Article with child authors, which in turn has child author.
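As a rough sketch of how a rule's two matching functions and its priority could fit together, the code below tries rules in priority order on the two Article schemas from the example. The class names, the two sample rules, and the convention that a lower number means higher priority are assumptions made for illustration; they are not taken from TranScm.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class SchemaNode:
    name: str
    children: list["SchemaNode"] = field(default_factory=list)


@dataclass
class Rule:
    name: str
    priority: int  # assumption: lower number means the rule is tried first
    match: Callable[[SchemaNode, SchemaNode], bool]        # looks at labels
    descendants: Callable[[SchemaNode, SchemaNode], bool]  # looks at children


def same_name(src: SchemaNode, tgt: SchemaNode) -> bool:
    return src.name == tgt.name


def same_child_names(src: SchemaNode, tgt: SchemaNode) -> bool:
    return [c.name for c in src.children] == [c.name for c in tgt.children]


def children_nested_one_level(src: SchemaNode, tgt: SchemaNode) -> bool:
    # True when every source child shows up as a grandchild of the target,
    # i.e. the target wraps the children in an extra grouping element.
    grandchildren = {g.name for c in tgt.children for g in c.children}
    return bool(src.children) and all(c.name in grandchildren for c in src.children)


RULES = [
    Rule("identical-children", 1, same_name, same_child_names),
    Rule("nested-children", 2, same_name, children_nested_one_level),
]


def first_matching_rule(src, tgt, rules=RULES) -> Optional[Rule]:
    """Return the highest-priority rule whose two checks both succeed."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.match(src, tgt) and rule.descendants(src, tgt):
            return rule
    return None


source = SchemaNode("Article", [SchemaNode("author")])
target = SchemaNode("Article", [SchemaNode("authors", [SchemaNode("author")])])
chosen = first_matching_rule(source, target)
```

Here the first rule rejects the pair because the child names differ (`author` versus `authors`), so the lower-priority rule that tolerates one level of nesting is the one selected.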

14 When Matching Fails: Matching can fail for two reasons: (1) something in the source can’t be matched to anything in the target with the current set of rules; (2) something in the source matches several items in the target equally well.
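The two failure cases can be told apart mechanically once each source element has a list of candidate targets. The sketch below is a simplification under stated assumptions: candidates are given as hypothetical `(target, priority)` pairs, and "equally well" is modeled as sharing the same best priority value.

```python
def classify_matches(candidates):
    """Split source elements into matched, unmatched, and ambiguous groups.

    candidates maps each source element name to a list of
    (target name, priority) pairs; a lower priority value is assumed better.
    """
    matched, unmatched, ambiguous = {}, [], {}
    for source, options in candidates.items():
        if not options:
            # Failure case 1: no rule produced a target for this element.
            unmatched.append(source)
            continue
        best = min(priority for _, priority in options)
        winners = [target for target, priority in options if priority == best]
        if len(winners) == 1:
            matched[source] = winners[0]
        else:
            # Failure case 2: several targets tie for the best match.
            ambiguous[source] = winners
    return matched, unmatched, ambiguous


# Hypothetical element names, used only to exercise the three outcomes.
matched, unmatched, ambiguous = classify_matches({
    "title": [("headline", 1)],
    "author": [("writer", 2), ("editor", 2)],
    "isbn": [],
})
```

The unmatched and ambiguous groups are the cases a user would then need to resolve by hand.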

15 When Matching Fails: Via the GUI, the user can do the following: Add; Disable; Modify; Override.

16 Translation: Using the mapping generated by the matching step and the appropriate rules, data is transformed from the input schema to the output schema. The translation process can make use of data translation languages and can perform type checking.
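A minimal sketch of the translation step, assuming the matching step has already produced a mapping from each source label to a target label plus a conversion function. The tuple-based tree encoding, the target labels, and the conversion functions are hypothetical stand-ins for whatever a rule's Translation function would do.

```python
def identity(value):
    return value


def translate(node, mapping):
    """Rewrite a (label, content) tree according to a label mapping.

    content is either a list of child nodes or an atomic string value.
    mapping maps a source label to (target label, conversion function).
    """
    label, content = node
    if label not in mapping:
        # An element that was never matched cannot be translated automatically.
        raise KeyError(f"no mapping for source element {label!r}")
    target_label, convert = mapping[label]
    if isinstance(content, list):
        # Inner node: translate children in order, preserving their sequence.
        return (target_label, [translate(child, mapping) for child in content])
    # Leaf node: apply the conversion to the atomic value.
    return (target_label, convert(content))


# Hypothetical mapping, standing in for the output of the matching step.
mapping = {
    "Article": ("Paper", identity),
    "title": ("heading", str.upper),
    "author": ("writer", str.strip),
}

source_data = ("Article", [
    ("title", "Conceptual Concepts"),
    ("author", " Al Gore Ithm "),
])
translated = translate(source_data, mapping)
```

In a fuller system the per-element conversion is where a data translation language could be plugged in for cases that simple renaming and value conversion do not cover.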

17 Conclusion: TranScm provides a general mechanism for data translation, handles the common, relatively simple translations automatically, and can use data translation languages for more difficult translations.

