Semantic Matching Fausto Giunchiglia work in collaboration with Pavel Shvaiko The Italian-Israeli Forum on Computer Science, Haifa, June 17-18, 2003.

1 Semantic Matching Fausto Giunchiglia work in collaboration with Pavel Shvaiko The Italian-Israeli Forum on Computer Science, Haifa, June 17-18, 2003

2 Outline: Matching; Syntactic Matching; Semantic Matching; On Implementing Semantic Matching; Conclusions

3 MATCHING

4 Application Domains: Generic Model Management; Schema integration; Data warehouses; E-commerce; Semantic query processing; Data Coordination in P2P systems

5 Matching Problems: 1. RDB Schemas 2. OODB Schemas 3. XML Schemas 4. Concept Hierarchies 5. Ontologies

6 Example of Matching [Figure: two concept hierarchies to be matched. www.google.com: Arts, Organizations, Art History, Music, Baroque, History. www.yahoo.com: Arts&Humanities, Organizations, Art History, Design Art, Baroque, Architecture, History. Links between the hierarchies are annotated with similarity coefficients ($S_c = 1.0$) and semantic relations $S_r$.]

7 Matching. Match is an operator that takes two graph-like structures (e.g., database schemas or ontologies) and produces a mapping between elements of the two graphs that correspond semantically to each other.

8 Matching. The problem of matching can be decomposed into two steps: (1) extract graphs from the data and conceptual models; (2) match the resulting graphs (generic matching).

9 Matching. A mapping element is a 4-tuple $\langle m_{ID}, N_i^1, N_j^2, R \rangle$, $i = 1, \dots, h$; $j = 1, \dots, k$, where $m_{ID}$ is a unique identifier of the given mapping element; $N_i^1$ is the i-th node of the first graph, and h is the number of nodes in the first graph; $N_j^2$ is the j-th node of the second graph, and k is the number of nodes in the second graph; R specifies a similarity relation of the given nodes. A mapping is a set of mapping elements. Matching is the process of discovering mappings between two graphs through the application of a matching algorithm.
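The definition above can be sketched as a small data structure. This is only an illustration: the class name `MappingElement`, its field names, and the helper `in_range` are assumptions introduced here, not part of the presentation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class MappingElement:
    """One mapping element <m_ID, N_i^1, N_j^2, R> (field names are hypothetical)."""
    mapping_id: str               # m_ID: unique identifier of the mapping element
    node_1: int                   # i: index of the node in the first graph, 1..h
    node_2: int                   # j: index of the node in the second graph, 1..k
    relation: Union[float, str]   # R: a coefficient (syntactic) or a relation name (semantic)


def in_range(element: MappingElement, h: int, k: int) -> bool:
    """Check that the node indices respect i = 1..h and j = 1..k."""
    return 1 <= element.node_1 <= h and 1 <= element.node_2 <= k


# A mapping is a set of mapping elements.
mapping = {
    MappingElement("m1", 5, 2, "equal"),
    MappingElement("m2", 5, 1, "more_specific"),
}
```

Making the class frozen keeps elements hashable, so a mapping can be stored directly as a Python set.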

10 Matching: Syntactic and Semantic Matching. Syntactic Matching: R is computed between labels at nodes; R = [0,1]. Semantic Matching: R is computed between concepts at nodes; R = {set-theoretic relations, e.g., $=, \cap, \perp, \subseteq, \supseteq$}.

11 SYNTACTIC MATCHING

12 Syntactic Matching. A mapping element is a 4-tuple $\langle m_{ID}, L_i^1, L_j^2, R \rangle$, where $L_i^1$ is the label at the i-th node of the first graph; $L_j^2$ is the label at the j-th node of the second graph; R specifies a similarity relation in the form of a coefficient, which measures the similarity between the labels of the given nodes. Example: R is a similarity coefficient in [0,1] R =
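A minimal sketch of a label-based coefficient in [0,1] follows. It uses the generic string-similarity ratio from Python's standard `difflib` module as a stand-in; it is not the measure used by Cupid or any other system named in the slides, and the function name `label_similarity` is an assumption made for illustration.

```python
from difflib import SequenceMatcher


def label_similarity(label_1: str, label_2: str) -> float:
    """Return a coefficient in [0, 1]; 1.0 means the labels match exactly (ignoring case)."""
    return SequenceMatcher(None, label_1.lower(), label_2.lower()).ratio()


# Labels taken from the two hierarchies in the running example.
pairs = [
    ("Art History", "Art History"),
    ("Arts", "Arts&Humanities"),
    ("Music", "Architecture"),
]
for first, second in pairs:
    print(f"{first!r} vs {second!r}: {label_similarity(first, second):.2f}")
```

The coefficient depends only on the characters of the labels, which is exactly what makes the approach syntactic: two labels with the same meaning but different spelling score low.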

13 Example: Cupid [Figure: the www.google.com and www.yahoo.com hierarchies of slide 6, with links annotated by similarity coefficients $S_c$ = 1.0, 0.9, 1.0, 0.7, 1.0, 0.7; the figure distinguishes tentative links from the final result.]

14 The State of the Art. Cupid … is a hybrid matching prototype. It exploits linguistic and structural schema matching heuristics, and computes similarity coefficients between nodes of the trees. Similarity Flooding … is a hybrid matching prototype. It uses fix-point computation to determine correspondences between nodes of the graphs. COMA … is a composite matching prototype. It provides an extensible library of different matchers which manipulate DAGs and supports various ways of combining final results. As far as we know, so far only syntactic matching…

15 SEMANTIC MATCHING

16 Semantic Matching. A mapping element is a 4-tuple $\langle m_{ID}, C_i^1, C_j^2, R \rangle$, where $C_i^1$ is the concept of the i-th node of the first graph; $C_j^2$ is the concept of the j-th node of the second graph; R specifies a similarity relation in the form of a semantic relation between the extensions of concepts at the given nodes. Possible R's: equality {$=$}, overlapping {$\cap$}, mismatch {$\perp$}, more general/specific {$\supseteq$, $\subseteq$}. Example: R =
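Since R is defined over the extensions of concepts, the five relations can be illustrated with plain sets. The sketch below assumes that an extension is available as a finite set of item identifiers; the document names are invented for the example, and the function name `semantic_relation` is hypothetical.

```python
def semantic_relation(extension_1: frozenset, extension_2: frozenset) -> str:
    """Classify the set-theoretic relation between two concept extensions."""
    if extension_1 == extension_2:
        return "equal"            # =
    if extension_1 < extension_2:
        return "more_specific"    # first is a proper subset of the second
    if extension_1 > extension_2:
        return "more_general"     # first is a proper superset of the second
    if extension_1 & extension_2:
        return "overlapping"      # some shared items, neither contains the other
    return "mismatch"             # no shared items


# Invented extensions: the documents classified under each concept.
art_history = frozenset({"doc1", "doc2", "doc3"})
baroque = frozenset({"doc2", "doc3"})
music = frozenset({"doc3", "doc9"})
design = frozenset({"doc7"})

print(semantic_relation(baroque, art_history))   # more_specific
print(semantic_relation(art_history, music))     # overlapping
print(semantic_relation(art_history, design))    # mismatch
```

In practice extensions are rarely enumerable, which is why the later slides derive the relation by logical reasoning instead of by comparing sets directly.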

17 Examples: Analysis of Siblings. Suppose that we want to match nodes $5^1$ and $2^2$. Cupid: R = 0.8. This is because $A_1 = A_2$, $C_1 = C_2$ and we have the same structures on both sides (the order of links does not matter). A semantic matching approach compares concepts $A_1 \wedge C_1$ with $A_2 \wedge C_2$ and produces $C_5^1 = C_2^2$.

18 Examples: Analysis of Ancestors. Case 1. Suppose that we want to match nodes $5^1$ and $1^2$. Cupid does not find a similarity coefficient between the nodes under consideration, due to the significant differences in structure of the given graphs. Semantic matching: the concept denoted by the label at node $5^1$ is $C_1$, while the concept at node $5^1$ is $C_5^1 = A_1 \wedge C_1$. The concept at node $1^2$ is $C_1^2 = C_2$. Thus, $C_5^1 \subseteq C_1^2$.

19 Examples: Analysis of Ancestors. Case 2. Suppose that we want to match nodes $5^1$ and $5^2$. Cupid: R = 0.86. This is because of the identity of labels $A_1 = A_2$, $C_1 = C_2$. Semantic matching: the concept at node $5^1$ is $C_5^1 = A_1 \wedge C_1$, while the concept at node $5^2$ is $C_5^2 = A_2 \wedge * \wedge C_2$. Since we have that $A_1 = A_2$ and $C_1 = C_2$, then $C_5^2 \subseteq C_5^1$.

20 Examples: Enriched Analysis of Siblings. Suppose that we want to match nodes $2^1$ and $2^2$. Cupid: R = 0.68. This is mainly because of the entry in the thesaurus specifying Belgium as a part of Benelux, and due to the fact that the nodes with labels Benelux 1 and Belgium 2 are leaves. Semantic matching: we treat C 2 1 as Benelux 1  Netherlands 1  Luxembourg 1 = Belgium. Thus, $C_2^1 = C_2^2$.

21 ON IMPLEMENTING SEMANTIC MATCHING

22 On Implementation. Semantic Matching: Structure-level; Element-level (Weak Semantics Techniques, Strong Semantics Techniques)

23 Element-level Semantic Matching. Weak Semantics Techniques: analysis of strings {$=$}; analysis of data types {$=, \cap, \perp, \subseteq, \supseteq$}; analysis of soundex {$=$}. Strong Semantics Techniques: precompiled thesaurus syn key; WordNet, where #1 … sense number 1 of the word Art according to WordNet

24 Element-level Semantic Matching (cont.). Semantic Relations via WordNet. Equality: one concept is equal to another if there is at least one sense of the first concept which is a synonym of the second. Overlapping: one concept overlaps with the other if they have some senses in common. Mismatch: two concepts are mismatched if they have no sense in common. More general: one concept is more general than the other iff there exists at least one sense of the first concept that has a sense of the other as a hyponym or meronym. Less general: one concept is less general than the other iff there exists at least one sense of the first concept that has a sense of the other concept as a hypernym or holonym.
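The rules above can be tried out on a toy lexicon. Everything in the sketch below is an assumption made for illustration: the hand-written tables `SENSES` and `NARROWER` stand in for WordNet, synonymy is modelled as two words sharing a sense, and the order in which the rules are tested is a choice made here. Under this encoding overlapping coincides with equality, so it is not returned separately.

```python
# Toy stand-in for WordNet: each word maps to a set of sense identifiers.
SENSES = {
    "art": {"art#1"},
    "fine_art": {"art#1"},          # shares a sense with "art", so treated as a synonym
    "painting": {"painting#1"},
    "benelux": {"benelux#1"},
    "belgium": {"belgium#1"},
    "music": {"music#1"},
}

# sense -> senses directly below it (its hyponyms or meronyms)
NARROWER = {
    "art#1": {"painting#1"},        # hyponym
    "benelux#1": {"belgium#1"},     # meronym: Belgium is a part of Benelux
}


def descendants(sense: str) -> set:
    """All senses reachable from `sense` by following hyponym/meronym links."""
    seen = set()
    stack = [sense]
    while stack:
        for child in NARROWER.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def element_relation(word_1: str, word_2: str) -> str:
    """Apply the rules in an assumed order: equality, more general, less general, mismatch."""
    senses_1, senses_2 = SENSES[word_1], SENSES[word_2]
    if senses_1 & senses_2:
        return "equal"
    if any(descendants(sense) & senses_2 for sense in senses_1):
        return "more_general"
    if any(descendants(sense) & senses_1 for sense in senses_2):
        return "less_general"
    return "mismatch"


print(element_relation("art", "fine_art"))      # equal
print(element_relation("benelux", "belgium"))   # more_general
print(element_relation("art", "music"))         # mismatch
```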

25 Structure-level Semantic Matching. We translate the matching problem, namely the two graphs (in particular, the pair of nodes submitted to matching), into a propositional formula and then check for its validity. We check for validity using SAT.

26 Semantic Matching Algorithm: 1. Extract the two graphs 2. Compute element-level semantic matching 3. Compute concepts at nodes 4. Construct the propositional formula 5. Run SAT 6. Perform iterations

27 Semantic Matching Algorithm: Example – (1) Extract the two graphs. In the case of RDB, XML and OODB schemas, it is necessary to extract useful semantic information, for instance in the form of ontologies.

28 Semantic Matching Algorithm: Example – (2) Element-level semantic matching. For each node, compute semantic relations holding among all the concepts denoted by labels at nodes under consideration: $A_1 = A_2$, $B_1 = B_2$, $C_1 = C_2$, $D_1 = D_2$, $E_1 = E_2$.

29 Semantic Matching Algorithm: Example – (3) Compute concepts at nodes. Suppose we want to find a semantic relation between nodes $5^1$ and $1^2$. $C_1^1 = A_1$; $C_5^1 = A_1 \wedge C_1$; $C_1^2 = C_2$. Candidate relation: $C_5^1 \subseteq C_1^2$.
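A small sketch of this step, assuming (as the examples on the earlier slides suggest) that the concept at a node is the conjunction of the label concepts on the path from the root to that node. The tree fragments, the `parent`/`label` dictionaries, and the function name `concept_at_node` are hypothetical; only the fact that node 5 of the first graph lies below node 1 is taken from the example.

```python
from typing import Dict, Optional, Tuple


def concept_at_node(
    node: int,
    parent: Dict[int, Optional[int]],
    label: Dict[int, str],
) -> Tuple[str, ...]:
    """Return the conjuncts of the concept at `node`, ordered from the root down."""
    path = []
    current: Optional[int] = node
    while current is not None:
        path.append(label[current])
        current = parent.get(current)   # None once the root has been passed
    return tuple(reversed(path))


# Hypothetical fragment of the first graph: node 1 (label A1) is the parent of node 5 (label C1).
parent_1 = {1: None, 5: 1}
label_1 = {1: "A1", 5: "C1"}

# Hypothetical fragment of the second graph: node 1 (label C2) is the root.
parent_2 = {1: None}
label_2 = {1: "C2"}

print(" AND ".join(concept_at_node(5, parent_1, label_1)))   # A1 AND C1
print(" AND ".join(concept_at_node(1, parent_2, label_2)))   # C2
```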

30 Semantic Matching Algorithm: Example – (4) Construct the propositional formula. We translate all the semantic relations computed in step 2 into propositional formulas under the following rules: $A_1 \supseteq A_2 \Rightarrow A_2 \rightarrow A_1$; $A_1 \subseteq A_2 \Rightarrow A_1 \rightarrow A_2$; $A_1 = A_2 \Rightarrow A_1 \leftrightarrow A_2$; $A_1 \perp A_2 \Rightarrow \neg(A_1 \wedge A_2)$. From step 2 we have: $C_1 \leftrightarrow C_2$. We want to prove that $C_5^1 \subseteq C_1^2$ (we guess the relation between nodes at this stage), that is, $(A_1 \wedge C_1) \rightarrow C_2$. The resulting formula is $(C_1 \leftrightarrow C_2) \rightarrow ((A_1 \wedge C_1) \rightarrow C_2)$.
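The translation rules can be written down as a lookup table of Boolean functions. The sketch reads the four rules as: more general becomes $A_2 \rightarrow A_1$, more specific becomes $A_1 \rightarrow A_2$, equality becomes $A_1 \leftrightarrow A_2$, and mismatch becomes $\neg(A_1 \wedge A_2)$. The dictionary name `TRANSLATION` and its keys are assumptions made for illustration.

```python
from typing import Callable, Dict

# Each semantic relation between A1 and A2 becomes a propositional connective,
# represented here as a function of the truth values of A1 and A2.
TRANSLATION: Dict[str, Callable[[bool, bool], bool]] = {
    "more_general": lambda a1, a2: (not a2) or a1,    # A2 -> A1
    "more_specific": lambda a1, a2: (not a1) or a2,   # A1 -> A2
    "equal": lambda a1, a2: a1 == a2,                 # A1 <-> A2
    "mismatch": lambda a1, a2: not (a1 and a2),       # not (A1 and A2)
}

# Step 2 of the example gives C1 = C2, which translates to C1 <-> C2.
axiom = TRANSLATION["equal"]
print(axiom(True, True), axiom(True, False))   # True False
```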

31 Semantic Matching Algorithm: Example – (5) Run SAT. In order to prove that $(C_1 \leftrightarrow C_2) \rightarrow ((A_1 \wedge C_1) \rightarrow C_2)$ is valid, we prove that its negation, $(C_1 \leftrightarrow C_2) \wedge \neg((A_1 \wedge C_1) \rightarrow C_2)$, is unsatisfiable. SAT returns FALSE. Thus, $C_5^1 \subseteq C_1^2$.
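For a formula with three variables the check can be reproduced by enumerating all truth assignments. The sketch below uses that enumeration in place of a SAT solver (the slides do not say which solver is used); the function names are hypothetical.

```python
from itertools import product
from typing import Callable


def implies(p: bool, q: bool) -> bool:
    return (not p) or q


def is_valid(formula: Callable[..., bool], n_vars: int) -> bool:
    """A formula is valid iff its negation has no satisfying assignment."""
    negation_satisfiable = any(
        not formula(*values) for values in product([False, True], repeat=n_vars)
    )
    return not negation_satisfiable


def more_specific_guess(a1: bool, c1: bool, c2: bool) -> bool:
    # (C1 <-> C2) -> ((A1 and C1) -> C2): the guess that node 5 of graph 1
    # is more specific than node 1 of graph 2.
    return implies(c1 == c2, implies(a1 and c1, c2))


def more_general_guess(a1: bool, c1: bool, c2: bool) -> bool:
    # (C1 <-> C2) -> (C2 -> (A1 and C1)): the opposite guess, for comparison.
    return implies(c1 == c2, implies(c2, a1 and c1))


print(is_valid(more_specific_guess, 3))   # True: the negation is unsatisfiable
print(is_valid(more_general_guess, 3))    # False: A1 = False, C1 = C2 = True refutes it
```

Enumeration is exponential in the number of variables, so it only serves to make the reasoning concrete; a real implementation would hand the negated formula to a SAT solver.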

32 Semantic Matching Algorithm: Example – (6.1) Iterations. Iterations are performed by re-running SAT. Suppose, that C 2 1  C 2 2   …an oracle tells us that A 1 = F 2  G 2 After this additional analysis we can infer C 2 1 = C 2 2

33 Semantic Matching Algorithm: Example – (6.2) Iterations. …to use the result of a previous match. Suppose, that F 1  B 2 Having found that C 4 1  C 4 2 We can automatically infer that C 5 1  C 5 2

34 Example: Cupid vs. Semantic Matching [Figure: the www.google.com and www.yahoo.com hierarchies of slide 6, with links between their nodes annotated by semantic relations.]

35 Conclusions. We have made a rational reconstruction of the major matching problems and articulated them in terms of the more generic problem of matching graphs. We have identified semantic matching as a new approach for performing generic matching. We have proposed an implementation of semantic matching using SAT.

36 Future Work. Extend to a full graph matcher. How to extract semantics from schemas. Study how to take into account attributes and instances. Develop an efficient implementation of the system. Do a thorough testing of the system.

37 References. Project website: http://www.dit.unitn.it/~p2p/. F. Giunchiglia, P. Shvaiko, "Semantic Matching". Technical Report #DIT-03-013, Trento, 2003. Also to appear in Proc. of ODS at IJCAI-03. F. Giunchiglia, I. Zaihrayeu, "Making peer databases interact: a vision for an architecture supporting data coordination". In Proc. of the Conference of Information Agents (CIA 2002), Madrid, 2002.

