
1 Extracting Information from Heterogeneous Information Sources Using Ontologically Specified Target Views. Joachim Biskup (Universität Dortmund) and David W. Embley (Brigham Young University). Funded by NSF.

2 Information Exchange: Source, Target; Information Extraction, Schema Matching. Leverage this … … to do this

3 Presentation Outline: Overview; Matching (Direct); Matching (Derived); Matching Algorithm; Summary

4

5 Requirements
1. f is an injective function.
2. f maps object sets to object sets and relationship sets to relationship sets.
3. f respects relationship-set arities.
4. f respects referential integrity.
5. f respects types.
6. f respects real-world identity.
7. f’s coercions are G/S compatible.
8. f respects subset constraints.
9. f respects mutual-exclusion constraints.
10. f respects union constraints.
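As a rough illustration of requirements 1 to 3 only, the sketch below checks a candidate mapping f held as a Python dict from target elements to source elements. The `ObjectSet` and `RelationshipSet` classes and the `check_mapping` helper are hypothetical names introduced here; they are not part of the presented framework, and requirements 4 to 10 are not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectSet:
    name: str


@dataclass(frozen=True)
class RelationshipSet:
    name: str
    members: tuple  # names of the connected object sets; len(members) is the arity


def check_mapping(f):
    """Return the violations of requirements 1-3 found in mapping f (a dict)."""
    problems = []
    # Requirement 1: injective, so no two target elements share a source element.
    if len(set(f.values())) != len(f):
        problems.append("not injective")
    for target, source in f.items():
        # Requirement 2: object sets map to object sets, rel. sets to rel. sets.
        if type(target) is not type(source):
            problems.append(f"kind mismatch: {target.name} -> {source.name}")
        # Requirement 3: relationship-set arities must agree.
        elif (isinstance(target, RelationshipSet)
              and len(target.members) != len(source.members)):
            problems.append(f"arity mismatch: {target.name} -> {source.name}")
    return problems
```

An empty result means only that these three structural checks passed, not that the mapping is valid in the sense of the theorem.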

6 User Interaction (IDS Statements)
Issue
– Explains the issue
– Example: units, may need transformation
Default
– Explains the default option
– Example: if no transformation, no conversion
Suggestion
– Gives a suggestion about how to resolve the issue
– Example: if needed, specify the conversion
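One possible way to hold an IDS statement is a small record with the three parts named on the slide. The `IDSStatement` class, its `render` method, and the exact wording of the example are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IDSStatement:
    issue: str       # explains the issue
    default: str     # explains what happens if the user does nothing
    suggestion: str  # suggests how to resolve the issue

    def render(self) -> str:
        """Format the statement for display to the user."""
        return (f"Issue: {self.issue}\n"
                f"Default: {self.default}\n"
                f"Suggestion: {self.suggestion}")


# Hypothetical wording for the units example on the slide.
units_ids = IDSStatement(
    issue="Units may differ between target and source; a transformation may be needed.",
    default="If no transformation is given, no conversion is applied.",
    suggestion="If needed, specify the conversion.",
)
```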

7 Theorem Let f be the generated mapping from target t to source s, populated such that s has a valid interpretation. Let t’ be the submodel of t populated from s by f. Then t’ has a valid interpretation. Proof: the paper is the proof …

8 Target (Graphical View)

9 Target (Textual View)

10 Source Example (Assumed to be Populated)

11 Matching (Direct): Object Sets; Relationship Sets

12 Object-Set Type Compatibility
1. type(a) = type(b)
2. type(a) ⊃ type(b)
3. type(a) ⊂ type(b)
4. type(a) ≠ type(b)
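The four cases can be sketched as a small classifier, assuming a is the target and b the source. The `SUBTYPE_OF` table is a hypothetical stand-in for whatever type hierarchy is available; its two entries are taken from the Integer/Real and Image/Video examples used in this talk.

```python
# Hypothetical subtype facts, written as (narrower type, wider type).
SUBTYPE_OF = {("Integer", "Real"), ("Image", "Video")}


def compatibility(target_type: str, source_type: str) -> str:
    """Classify a target/source type pair into one of the four cases."""
    if target_type == source_type:
        return "equal"
    if (source_type, target_type) in SUBTYPE_OF:
        return "target wider"   # e.g. target Real, source Integer
    if (target_type, source_type) in SUBTYPE_OF:
        return "source wider"   # e.g. target Integer, source Real
    return "incompatible"       # e.g. Image and String
```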

13 type(a) = type(b)
Same type
– string = string, but Airport ≠ Head Of State
– Need better matching techniques
Same type, different units
– Size vs. Nr Sq Km
– Need unit conversion
Same type, different format
– Date vs. Date, but 01/02/2002 vs. Jan 2, 2002
– Need format conversion
Same type, same units and format, different assumptions
– Altitude vs. Altitude, but altitude of aircraft and spacecraft differ
– Need same assumptions
Same type, same units and format, same assumption, OIDs
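The unit and format conversions mentioned here might look like the sketch below. It assumes, purely for illustration, that the source Size is in square miles and that 01/02/2002 is written month/day/year; neither assumption is stated on the slide, and both function names are hypothetical.

```python
from datetime import datetime

# Exact factor, derived from 1 mile = 1.609344 km.
SQ_KM_PER_SQ_MILE = 2.589988110336


def sq_miles_to_sq_km(size_sq_miles: float) -> float:
    """Unit conversion: a Size in square miles to Nr Sq Km."""
    return size_sq_miles * SQ_KM_PER_SQ_MILE


def slash_date_to_long(text: str) -> str:
    """Format conversion: '01/02/2002' (assumed MM/DD/YYYY) to 'Jan 2, 2002'."""
    d = datetime.strptime(text, "%m/%d/%Y")
    # %b is locale-dependent; the default C locale yields English abbreviations.
    return f"{d:%b} {d.day}, {d.year}"
```

If the source dates were day/month/year instead, the same string would denote 1 February 2002, which is the kind of differing assumption the slide warns about.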

14 type(a) ⊃ type(b) and type(a) ⊂ type(b)
Real ⊃ Integer or Video ⊃ Image
– Target has greater discriminating power
– Can add .0 or make a video of a single image (?)
Integer ⊂ Real or Image ⊂ Video
– Source has greater discriminating power
– Can round off or select one of the frames (?)
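A minimal sketch of these coercions follows. The function names are hypothetical, a video is modeled simply as a list of frames, and the choice of which frame to select is arbitrary, as the question marks on the slide suggest.

```python
def integer_to_real(value: int) -> float:
    """Target is wider: nothing is lost (2 becomes 2.0)."""
    return float(value)


def real_to_integer(value: float) -> int:
    """Source is wider: rounding discards the fractional part."""
    return round(value)


def image_to_video(image):
    """Target is wider: a 'video' consisting of a single frame."""
    return [image]


def video_to_image(frames, index=0):
    """Source is wider: select one frame (the first by default)."""
    return frames[index]
```

Python's `round` rounds halves to the nearest even integer, so `round(2.5)` is 2; a different rounding rule may be wanted in practice.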

15 type(a) ≠ type(b)
Image ≠ String
– Mismatch, even if same attribute (e.g. both City)
– Types can help discard potential matches
String(5) ≠ Integer
– But suppose the integer is 2
– Might work, but is “2.000” ok?
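The String(5) question can be made concrete: several renderings of the integer 2 fit in five characters, so the type check alone does not say which one is intended. The helper below is a hypothetical illustration that lists a few candidate renderings and keeps those that fit.

```python
def renderings_for_string5(value: int):
    """Return candidate String(5) renderings of an integer that fit in 5 characters."""
    candidates = [
        str(value),        # plain, e.g. "2"
        f"{value:>5}",     # right-aligned, space-padded
        f"{value:05d}",    # zero-padded
        f"{value:.3f}",    # three decimals, e.g. "2.000"
    ]
    return [c for c in candidates if len(c) <= 5]
```

For the value 2 all four candidates fit, which is why a user decision (or a default) is still needed; for a six-digit integer none fit and the match can be discarded.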

16 Relationship Match Requirements
Referential integrity
Constraints
– Cardinality
– Mandatory/Optional

17 Referential Integrity [Diagram: Target and Source, with object sets a, b, a’, b’, ..., a’’] The types of a, a’, and a’’ can all be different, but not arbitrary. Example: a (String), a’ (Integer), a’’ (Real).

18 Relationship-Set Constraint Compatibility 1.constr(a) constr(b) 2.(constr(a) constr(b)) 3.(constr(a) constr(b)) 4.(constr(a) constr(b))

19 constr(a) constr(b) [Diagram: Person and Car connected by “owns” and “drives” in one model and by a single relationship set “?” in the other] Need more information to resolve: perhaps “?” is “purchased.”

20 (constr(a) constr(b)) [Diagram: City and City Map in models a and b] The target (a) expects many maps, but the source can’t supply them.

21 (constr(a) constr(b)) [Diagram: City and City Map in models a and b] The target (a) expects one map, but the source can supply many.

22 (constr(a) constr(b)) [Diagram: City and City Map in models a and b] The target (a) expects at least one and potentially many maps, but the source may have none or at most one.
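One way to make the City / City Map cases concrete is to write each participation constraint as a (min, max) pair and flag the cases where source data could violate the target's bounds. This is a sketch under that assumption; the (min, max) encoding, the `MANY` constant, and the `constraint_issues` helper are hypothetical, and the check is narrower than the compatibility notion on the slides. In particular, a target that expects many maps paired with a source that supplies at most one produces no violation here, although the slides still treat it as a mismatch.

```python
import math

MANY = math.inf  # stands for an unbounded maximum


def constraint_issues(target, source):
    """Compare (min, max) participation constraints of a target and a source."""
    t_min, t_max = target
    s_min, s_max = source
    issues = []
    if s_min < t_min:
        issues.append("source may supply too few")
    if s_max > t_max:
        issues.append("source may supply too many")
    return issues
```

For example, a target of (1, 1) against a source of (1, MANY) reports that the source may supply too many, and a target of (1, MANY) against a source of (0, 1) reports that it may supply too few.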

23 Matching (Derived): Generalization/Specialization; Composite Values; Derived Relationship Sets; Displayable/Nondisplayable Object Sets

24 Generalization/Specialization
For a target object set, a source object set may:
– have no overlap (just ignore)
– have a proper subset (accept or find missing generalization)
– have the same values (direct match)
– have a proper superset (hard, except for roles)
– overlap (like proper subset and proper superset)
Consider roles and missing generalizations
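When the values of both object sets are available, the five cases above correspond to ordinary set comparisons. The sketch below assumes values can be compared directly for equality, which sidesteps the unit, format, and identity issues raised earlier; `classify_overlap` is a hypothetical helper name.

```python
def classify_overlap(target_values: set, source_values: set) -> str:
    """Classify how the source object set's values relate to the target's."""
    if not (target_values & source_values):
        return "no overlap"
    if source_values == target_values:
        return "same values"
    if source_values < target_values:
        return "proper subset"
    if source_values > target_values:
        return "proper superset"
    return "overlap"
```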

25 Roles [Diagram, target and source: City, Travel Video, CityClip: Video, Video With City Scene]

26 Missing Generalization [Diagram, target and source: City Map, Country Map, City Map: Image, Country Map: Image, Map: Image]

27 Composite Values: Composite in Source (split); Composite in Target (merge); Examples of Derived Relationships

28 Composite in Source [Diagram, target and source: Video, Time, Nr Hours, Nr Minutes] Note also that we generated a source path.

29 Composite in Source [Diagram, target and source: Video, Nr Hours, Nr Minutes]

30 Composite in Target [Diagram, target and source: Video, Nr Hours, Nr Minutes, Time]

31 Composite in Target [Diagram, target and source: Video, Time]
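The split and merge of a composite Time value into Nr Hours and Nr Minutes might be sketched as follows. The textual form "H:MM" for Time is an assumption; the slides do not say how Time is represented, and the function names are hypothetical.

```python
def split_time(time_text: str):
    """Composite in source: split 'H:MM' into (Nr Hours, Nr Minutes)."""
    hours, minutes = time_text.split(":")
    return int(hours), int(minutes)


def merge_time(nr_hours: int, nr_minutes: int) -> str:
    """Composite in target: merge Nr Hours and Nr Minutes into 'H:MM'."""
    return f"{nr_hours}:{nr_minutes:02d}"
```

Under this representation the two functions are inverses of each other, so a value split for one target can be merged back for another without loss.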

32 Displayable/Nondisplayable Object-Set Matches
Nondisplayable in Source: find a key
Nondisplayable in Target: create a key

33 Nondisplayable in Source. No Key: Discard Match. [Diagram, target and source: Airport, City, Airline; relationship sets “flies to” and “serves”]

34 Nondisplayable in Source. No Key: Discard Match. [Diagram, target and source: Airport, City, Airline; relationship sets “flies to” and “serves”]

35 Nondisplayable in Source. One Key: Choose It. [Diagram, target and source: Airport, City, Airline, Airport Name; relationship sets “flies to” and “serves”]

36 Nondisplayable in Source. One Key: Choose It. [Diagram, target and source: Airport, City, Airline, Airport Name; relationship sets “flies to” and “serves”]

37 Nondisplayable in Source. Two or More Keys: Choose One. [Diagram, target and source: Airport, City, Airline, Airport Name, Airport Code; relationship sets “flies to” and “serves”]

38 Nondisplayable in Source. Two or More Keys: Choose One. [Diagram, target and source: Airport, City, Airline, Airport Name, Airport Code; relationship sets “flies to” and “serves”]
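The three outcomes for a nondisplayable source object set (no key, one key, several keys) can be sketched as below. The sketch assumes keys are detected from the populated source data, one row per nondisplayable object, whereas the framework may well read them from schema constraints instead. The helper names and the "take the first key" policy are likewise assumptions.

```python
def find_keys(rows, attributes):
    """Return attributes whose values are present and distinct in every row."""
    keys = []
    for attr in attributes:
        values = [row.get(attr) for row in rows]
        if None not in values and len(set(values)) == len(values):
            keys.append(attr)
    return keys


def resolve_match(rows, attributes):
    """Apply the slide's rule: discard, choose the key, or choose one of several."""
    keys = find_keys(rows, attributes)
    if not keys:
        return None   # no key: discard the match
    return keys[0]    # one key: choose it; several: take the first (assumed policy)
```

With airports described by City, Airport Name, and Airport Code, City fails as a key as soon as two airports serve the same city, while the name and code both qualify and one of them has to be picked.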

39 Matching Algorithm

40

41 Sample Match Table

42 Pictorial View of Match Table target source

43 Summary

44 Concluding Remarks QED (the theorem holds) Let f be the generated mapping from target t to source s, populated such that s has a valid interpretation. Let t’ be the submodel of t populated from s by f. Then t’ has a valid interpretation. Proof: the paper is the proof …

45 Pictorial View of Match Table: t = target; s = source; f = the mapping; t’ has a valid interpretation; t’ = submodel

46 Concluding Remarks
QED (the theorem holds)
Merge (several sources)
– All sources extracted to same view
– Union merge
Object identity problems
Constraint problems
Source Modeling (convert to OSM)
Framework defined, but not implemented

