
1 Multi-Concept Alignment and Evaluation
Shenghui Wang, Antoine Isaac, Lourens van der Meij, Stefan Schlobach
Ontology Matching Workshop, Oct. 11th, 2007

2 Introduction: Multi-Concept Alignment
- Mappings involving combinations of concepts, e.g. o1:FruitsAndVegetables → (o2:Fruits OR o2:Vegetables)
- Also referred to as multiple or complex alignment
- Problem: only a few matching tools consider it (cf. [Euzenat & Shvaiko])

3 Why is MCA a Difficult Problem?
- Much larger search space: |O1| × |O2| → 2^|O1| × 2^|O2| (worked out below)
- How to measure similarity between sets of concepts? Based on which information and strategies? "Fruits and vegetables" vs. "Fruits" and "Vegetables" together
- Formal frameworks for MCA?
  - Representation primitives: owl:intersectionOf? skosm:AND?
  - Semantics: does A skos:broader (skosm:AND B C) entail A broader B & A broader C?
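Written out, the search-space comparison above is plain arithmetic (the example sizes are chosen only for illustration, not taken from the slides):

    \underbrace{|O_1| \cdot |O_2|}_{\text{1:1 matching}}
    \;\longrightarrow\;
    \underbrace{2^{|O_1|} \cdot 2^{|O_2|}}_{\text{multi-concept matching}}

    \text{e.g. } |O_1| = |O_2| = 1000:\quad
    10^{6} \text{ candidate pairs vs. } 2^{2000} \approx 10^{602} \text{ candidate subset pairs}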

4 Agenda
- The multi-concept alignment problem
- The Library case and the need for MCA
- Generating MCAs for the Library case
- Evaluating MCAs in the Library case
- Conclusion

5 Yet MCA is Needed in Real-Life Problems
- KB collections (cf. OAEI slides)
- Scenario: re-annotation of GTT-indexed books with Brinkman concepts

6 Yet MCA is Needed in Real-Life Problems
- Books can be indexed by several concepts, with post-coordination: co-occurrence matters. {G1="History", G2="the Netherlands"} in GTT → a book about Dutch history
- The granularity of the two vocabularies differs → {B1="Netherlands; History"} in Brinkman
- The alignment should therefore associate combinations of concepts (see the sketch below)
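A minimal sketch of what such a combination-level conversion rule could look like in code, assuming a simple set representation; the concept names are illustrative placeholders, not actual GTT or Brinkman identifiers:

    # Hypothetical conversion rule: a set of GTT concepts maps to a set of
    # Brinkman concepts (concept names are made up for illustration).
    rule = ({"gtt:History", "gtt:Netherlands"},    # left-hand side (GTT)
            {"brinkman:Netherlands;History"})      # right-hand side (Brinkman)

    # A book whose GTT annotations contain the whole left-hand side can be
    # re-annotated with the Brinkman concepts on the right-hand side.
    book_annotations = {"gtt:History", "gtt:Netherlands", "gtt:MiddleAges"}
    lhs, rhs = rule
    if lhs <= book_annotations:                    # set containment
        print("Translated annotations:", rhs)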

7 Agenda
- The multi-concept alignment problem
- The Library case and the need for MCA
- Generating MCAs for the Library case
- Evaluating MCAs in the Library case
- Conclusion

8 MCA for Annotation Translation: Approach
1. Produce similarity measures between individual concepts: Sim(A,B) = X
2. Group concepts based on their similarity: {G1,B1,G2,G3,B2}
3. Create conversion rules: {G1,G2,G3} → {B1,B2}
4. Extract a deployable alignment

9 MCA Creation: Similarity Measures
- The KB scenario has dually indexed books: Brinkman and GTT concepts co-occur
- Instance-based alignment techniques can therefore be used
- Between concepts from the same vocabulary, similarity mirrors possible combinations!

10 MCA Creation: 2 Similarity Measures
- Jaccard overlap measure applied to concept extensions
- Latent Semantic Analysis: computation of a similarity matrix, filtering noise due to insufficient data
- Both give similarities between concepts across vocabularies and inside vocabularies (see the sketch below)
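A minimal sketch of both measures, assuming concept extensions are sets of book identifiers and the LSA step is a plain truncated SVD over a concept-by-book matrix; the data and the rank k are illustrative, not the values used in the experiments:

    import numpy as np

    def jaccard(ext_a, ext_b):
        # Overlap of two concept extensions (sets of book identifiers).
        union = ext_a | ext_b
        return len(ext_a & ext_b) / len(union) if union else 0.0

    # Made-up extensions: 2 shared books out of 4 distinct ones -> 0.5
    print(jaccard({"b1", "b2", "b3"}, {"b2", "b3", "b4"}))

    def lsa_similarities(occurrence, k=2):
        # occurrence: concepts x books matrix (1 if the concept annotates
        # the book). A rank-k truncated SVD smooths the noise caused by
        # sparsely annotated concepts; cosine similarity is then taken in
        # the reduced space, across and within vocabularies alike.
        U, s, _ = np.linalg.svd(occurrence, full_matrices=False)
        reduced = U[:, :k] * s[:k]
        norms = np.clip(np.linalg.norm(reduced, axis=1, keepdims=True), 1e-12, None)
        unit = reduced / norms
        return unit @ unit.T           # concepts x concepts similarity matrix

    occ = np.array([[1, 1, 0, 0],      # concept A annotates books 1, 2
                    [1, 1, 1, 0],      # concept B annotates books 1, 2, 3
                    [0, 0, 1, 1]],     # concept C annotates books 3, 4
                   dtype=float)
    print(lsa_similarities(occ))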

11 MCA Creation: 2 Concept Aggregation Methods
- Simple ranking: for each concept, take the top k most similar concepts, then gather the GTT concepts and the Brinkman ones (see the sketch below)
- Clustering: partition concepts into similarity-based clusters, then gather concepts; a global approach in which the most relevant combinations should be selected
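A hedged sketch of the simple-ranking aggregation only (the clustering variant is not shown); sim is assumed to be a nested mapping of similarity scores from the previous step, and the vocabulary-prefix test is an illustrative convention:

    def rank_group(concept, concepts, sim, k=4):
        # Take the k concepts most similar to `concept` (from either
        # vocabulary), then split the group into its GTT and Brinkman
        # parts to form a candidate conversion rule.
        top = sorted((c for c in concepts if c != concept),
                     key=lambda c: sim[concept][c], reverse=True)[:k]
        group = {concept, *top}
        gtt = {c for c in group if c.startswith("gtt:")}
        brinkman = {c for c in group if c.startswith("brinkman:")}
        return gtt, brinkman           # candidate rule: gtt -> brinkman

    # Toy similarity values for illustration only
    sim = {"gtt:History": {"brinkman:Netherlands;History": 0.8,
                           "gtt:Netherlands": 0.6,
                           "brinkman:Europe": 0.1}}
    concepts = list(sim["gtt:History"]) + ["gtt:History"]
    print(rank_group("gtt:History", concepts, sim, k=2))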

12 Generated Rules
- Clustering generated far fewer rules, involving more concepts each

13 Agenda
- The multi-concept alignment problem
- The Library case and the need for MCA
- Generating MCAs for the Library case
- Evaluating MCAs in the Library case
- Conclusion

14 Evaluation Method: Data Sets
- Training and evaluation sets drawn from the dually indexed books: 2/3 training, 1/3 testing
- Two training samples: Random, and Rich (books with at least 8 annotations across both thesauri)

15 Evaluation Method: Applying Rules
Several configurations for firing a rule Gr → Br against a book's GTT annotation set Gt:
1. Gt = Gr
2. Gr ⊆ Gt
3. Gt ∩ Gr ≠ ∅
4. ALL
Example rules: Gr1 → Br1, Gr2 → Br2, Gr3 → Br3 (see the sketch below)
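A minimal sketch of the four firing configurations, assuming Gt is a book's GTT annotation set and (Gr, Br) a conversion rule; reading configurations 2 and 3 as containment and overlap is a reconstruction of the set symbols lost in the transcript:

    def fires(Gt, Gr, strategy):
        # Decide whether rule Gr -> Br applies to a book annotated with Gt.
        if strategy == 1:
            return Gt == Gr            # 1. exact match
        if strategy == 2:
            return Gr <= Gt            # 2. rule's GTT side contained in Gt
        if strategy == 3:
            return bool(Gt & Gr)       # 3. any shared concept
        return True                    # 4. ALL: every rule fires

    def translate(Gt, rules, strategy):
        # Union of the Brinkman sides of all rules that fire for this book.
        result = set()
        for Gr, Br in rules:
            if fires(Gt, Gr, strategy):
                result |= Br
        return result

    rules = [({"gtt:History", "gtt:Netherlands"},
              {"brinkman:Netherlands;History"})]
    print(translate({"gtt:History", "gtt:Netherlands", "gtt:MiddleAges"},
                    rules, strategy=2))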

16 Evaluation Measures
- Precision and recall for matched books, i.e. books that were given at least one good Brinkman annotation: Pb, Rb
- Precision and recall for annotation translation, averaged over books (formalized below)
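A hedged formalization of the two measure families, assuming B is the test set, T_b the translated Brinkman annotations proposed for book b, and A_b its original Brinkman annotations (the notation is ours, not from the slides):

    % Book level: a book is matched if at least one proposed annotation is good
    P_b = \frac{|\{b \in B : T_b \cap A_b \neq \emptyset\}|}{|\{b \in B : T_b \neq \emptyset\}|}
    \qquad
    R_b = \frac{|\{b \in B : T_b \cap A_b \neq \emptyset\}|}{|B|}

    % Annotation level, averaged over books
    P_a = \frac{1}{|B|} \sum_{b \in B} \frac{|T_b \cap A_b|}{|T_b|}
    \qquad
    R_a = \frac{1}{|B|} \sum_{b \in B} \frac{|T_b \cap A_b|}{|A_b|}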

17 Results: for the ALL Strategy

18 Results: Rich vs. Random Training Set
- Rich does not improve the results much: it biases towards richly annotated books
- Jaccard's performance goes down; LSA does better
- Statistical corrections allow simple grouping techniques to cope with data complexity

19 Results: for Clustering

20 Results: Jaccard vs. LSA
- For strategies 3 and ALL, LSA outperforms Jaccard; for 1 and 2, Jaccard outperforms LSA
- Simple similarity is better at finding explicit similarities that really occur in books
- LSA is better at finding potential similarities

21 Results: using LSA

22 Results: Clustering vs. Ranking
- Clustering performs better on strategies 1 and 2: its rules match existing annotations better and have better precision
- Ranking has higher recall but lower precision: the classical tradeoff (ranking keeps noise)

23 Agenda
- The multi-concept alignment problem
- The Library case and the need for MCA
- Generating MCAs for the Library case
- Evaluating MCAs in the Library case
- Conclusion

24 Conclusions
- Multi-concept alignment is an important problem: not extensively dealt with in the current literature, yet needed by applications
- We have first approaches to create such alignments, and to deploy them!
- We hope that further research will improve the situation (with our 'deployer' hat on): better alignments, more precise frameworks (methodology research)

25 Conclusions: Performances
- The evaluation shows mixed results: performances are generally very low, so these techniques cannot be used alone
- Note the dependence on requirements: settings where a manual indexer chooses among several candidates allow for lower precision
- Note the indexing variability: OAEI has demonstrated that manual evaluation somehow compensates for the bias of automatic evaluation

26 Thanks!

