Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 M APPING C OMPOSITION FOR M ATCHING L ARGE L IFE S CIENCE O NTOLOGIES A NIKA G ROSS, M ICHAEL H ARTUNG, T ORALF K IRSTEN, E RHARD R AHM 29 TH J ULY 2011,

Similar presentations


Presentation on theme: "1 M APPING C OMPOSITION FOR M ATCHING L ARGE L IFE S CIENCE O NTOLOGIES A NIKA G ROSS, M ICHAEL H ARTUNG, T ORALF K IRSTEN, E RHARD R AHM 29 TH J ULY 2011,"— Presentation transcript:

1 1 M APPING C OMPOSITION FOR M ATCHING L ARGE L IFE S CIENCE O NTOLOGIES A NIKA G ROSS, M ICHAEL H ARTUNG, T ORALF K IRSTEN, E RHARD R AHM 29 TH J ULY 2011, ICBO, B UFFALO

2 2 MeSH GALEN O NTOLOGIES Multiple interrelated ontologies in a domain (e.g. anatomy) UMLS SNOMED NCI Thesaurus Mouse Anatomy FMA Identify overlapping information between ontologies Create mappings

3 3 O NTOLOGY M ATCHING Manual creation of mappings between very large ontologies is too labor-intensive Semi-automatic generation of semantic correspondences (linguistic, structural, instance-based ontology matching) →Interrelation of ontologies →Integration of heterogeneous information sources Matching Mapping sim(O 1.a, O 2.b) = 0.8 sim(O 1.a, O 2.c) = 0.5 sim(O 1.c, O 2.c) = 1.0 further input, e.g. instances, dictionary … O1O1 O2O2

4 4 ? Indirect composition-based matching Via intermediate ontology (IO): important hub ontology, synonym dictionary, … C OMPOSING MA_0001421 UBERON:0001092 NCI_C32239 Synonym: Atlas Name: atlas Name: C1 Vertebra Name: cervical vertebra 1Synonym: cervical vertebra 1 Synonym: C1 vertebra →Find new correspondences via composition →Reuse existing mappings to →Increase match quality →Save computation time IO O1O1 O2O2

5 5 Composition-based ontology matching approach, reuse of previously determined mappings →composeMatch Optional match step to improve composition-based match quality → extendMatch Use of ontology and mapping operators: compose, match and extract Evaluation: indirect matching of MA and NCIT using large intermediate ontologies (UMLS, FMA, Uberon, RadLex) C ONTRIBUTIONS

6 6 Use mappings to intermediate ontologies IO 1, …, IO k to indirectly match O 1 and O 2 Reduce matching effort by reusing mappings to IO → very fast composition I NDIRECT M ATCHING... IO 1 IO 2 IO k O1O1 O2O2... O1O1 O2O2 OnOn HO O new →IO should have a significant overlap with O 1 and O 2 →IO 1, …, IO k may complement each other →Centralized hub HO →many mappings to other ontologies →O new aligned with any O i via HO

7 7 O1O1 IO 1 O2O2 occ = 1: CM O1,O2 = {(a,a),(b,b),(c,c)} occ = 2: CM O1,O2 = {(a,a)} Input: Two ontologies O 1 and O 2, list of intermediate ontologies IO 1 … IO k, occurrence count occ Output: Composed mapping CM O1,O2 (1) C OMPOSE M ATCH a bc de a b gh a bc d f a ic IO 2 MapList  empty for each IO i  IO do M O1,IOi  getMapping(O 1, IO i ) return merge (MapList, occ) MapList.add( compose (M O1,IOi, M IOi,O2 )) M IOi,O2  getMapping(IO i, O 2 ) end for MapList (c,c ), (a,a) (a,a), (b,b)

8 8 //Direct Mapping DM ΔO1ΔO2  match (ΔO 1, ΔO 2 ) EM O1,O2 (b,b) (a,a) (c,c) (d,d) EM O1,O2  merge ({CM O1,O2, DM ΔO1ΔO2 }, 1) return EM O1,O2 Input: Two ontologies O 1 and O 2, composed mapping CM O1,O2 Output: Extended Mapping EM O1,O2 (2) E XTEND M ATCH O1O1 O2O2 a bc de a bc d f ΔO 1  extract (O 1, CM O1,O2 ) ΔO 2  extract (O 2, inverse (CM O1,O2 )) CM O1,O2 (b,b) (a,a) (c,c) DM O1,O2 (d,d)

9 9 E VALUATION S ETUP Match problem Adult Mouse Anatomy (MA) NCI Thesaurus Anatomy part (NCIT) Preprocessing Normalization Linguistic Matcher (Name, synonyms, Trigram t = 0.8) Selection & Postprocessing Uberon UMLS MA NCIT RadLex FMA Gold standard ~1500 correspondences Precompute mappings using a match strategy ~5000 ~88,000 ~30,800 ~81,000 ~2,700~3,300 |concepts|

10 10 M APPING S TATISTICS 80% of MA48% of NCIT MA NCIT Uberon Is there a good coverage of MA and NCIT by intermediate ontologies? High overlap of covered MA and NCIT concepts → promising for composition-based match results

11 11 OAEI A NATOMY T RACK http://oaei.ontologymatching.org/[year]/anatomy AgrMaker 87.7%

12 12 Direct match result compared to composeMatch via each hub Additional matching of unmatched parts (extendMatch) E VALUATION 88.2% 86% Uberon & UMLS → best evaluated intermediate ontologies Intermediate Ontology IO

13 13 Combination of the four composed mappings Correspondences have to occur in at least 1,…,4 mappings E VALUATION union(occ=1) F-Measure90.2 Precision92.7 Recall87.8 Higher occurrences → Recall ↓ extendMatch → Recall ↑

14 14 Composition-based approach for indirect matching of large life science ontologies (composeMatch, extendMatch) Reuse mappings for improved match efficiency and quality (>90%) Evaluated several intermediate ontologies →Uberon and UMLS: very effective, suited as hub ontologies in the anatomy domain C ONCLUSIONS

15 15 F UTURE W ORK Investigate composition-based ontology matching for further domains Study the impact of additional mappings Determined by structural matching Existing mappings from BioPortal

16 16 M APPING C OMPOSITION FOR M ATCHING L ARGE L IFE S CIENCE O NTOLOGIES ? F UNDING German Research Foundation Grant RA497/18-1 “E VOLUTION OF O NTOLOGIES AND M APPINGS ”

17 17 S. Zhang, O.Bodenreider "Alignment of multiple ontologies of anatomy: Deriving indirect mappings from direct mappings to a reference." AMIA Annual Symposium (2005) A. Tordai, A. Ghazvinian, J. van Ossenbruggen, M.A. Musen, N.F. Noy "Lost in Translation? Empirical Analysis of Mapping Compositions for Large Ontologies." Proc. Ontology Matching Workshop @ ISWC (2010) Our work: Use multiple complementing intermediate ontologies + an additional extendMatch to improve recall of compose R ELATED W ORK

18 18 NameSynonym Matcher B ACKUP Synonym1 Name Synonym1 Synonym2 Concept MA Concept NCIT sim = max(sim 1,sim 2,...)


Download ppt "1 M APPING C OMPOSITION FOR M ATCHING L ARGE L IFE S CIENCE O NTOLOGIES A NIKA G ROSS, M ICHAEL H ARTUNG, T ORALF K IRSTEN, E RHARD R AHM 29 TH J ULY 2011,"

Similar presentations


Ads by Google