Inexact Matching of Ontology Graphs Using Expectation Maximization. Universidad Autónoma de Madrid, 15 January 2010. Prashant Doshi, Ravikanth Kolli, Christopher Thomas. Web Semantics: Science, Services and Agents on the World Wide Web, 2009. Keywords: ontology, matching, expectation-maximization
Agenda: Introduction; Expectation Maximization; Ontology Schema Model; Graph Matching with GEM; Random Sampling and Heuristics; Computational Complexity; Initial Results; Large Ontologies; Benchmarks; Conclusions
Introduction: The Semantic Web grows more useful as the number of ontologies increases. OWL and RDF are ontology representation languages based on labeled directed graphs. Formulation: 'Find the most likely map between the two ontologies'*
Expectation Maximization: A technique for finding the maximum likelihood estimate of the underlying model from observed data in the presence of missing data. E-step: formulation of the estimate. M-step: search for the maximum of the estimate. Relaxed search using generalized EM (GEM).
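The two steps can be illustrated on a toy problem that is unrelated to ontologies. The following is a minimal sketch, assuming a one-dimensional mixture of two unit-variance Gaussians with equal weights, where the missing data are the component labels; the function name `em_two_gaussians` and the synthetic data are illustrative only and do not come from the paper.

```python
import math
import random

def em_two_gaussians(data, iterations=50):
    """EM for a 1-D mixture of two unit-variance, equal-weight Gaussians.

    The missing data are the component labels of the points.
    """
    mu = [min(data), max(data)]  # crude initial guess for the two means
    for _ in range(iterations):
        # E-step: posterior probability that each point came from component 0.
        resp = []
        for x in data:
            p0 = math.exp(-0.5 * (x - mu[0]) ** 2)
            p1 = math.exp(-0.5 * (x - mu[1]) ** 2)
            resp.append(p0 / (p0 + p1))
        # M-step: the means that maximize the expected log-likelihood.
        w0 = sum(resp)
        w1 = len(data) - w0
        mu = [
            sum(r * x for r, x in zip(resp, data)) / w0,
            sum((1 - r) * x for r, x in zip(resp, data)) / w1,
        ]
    return mu

# Synthetic data: two clusters centred at 0 and 5 (assumed values).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(200)]
data += [rng.gauss(5.0, 1.0) for _ in range(200)]
means = em_two_gaussians(data)
```

In a generalized EM, the M-step only has to find parameters that improve the expected log-likelihood rather than ones that maximize it, which is what makes a relaxed, local search admissible.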
Ontology Schema Model: OWL and RDF (labeled directed graphs). Labels are removed, constructing a bipartite graph.
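One way such a construction could look is sketched below. It assumes that each labeled edge is replaced by an intermediate node, so that the resulting edges carry no labels and only connect a concept node to an edge node. The slide does not give the exact construction, so the function `to_bipartite`, the `label#index` naming, and the example triples are hypothetical.

```python
def to_bipartite(triples):
    """Turn labeled edges (subject, label, object) into an unlabeled bipartite graph.

    Assumption: every labeled edge becomes its own intermediate node.
    """
    concept_nodes = set()
    edge_nodes = []
    links = []
    for index, (subject, label, obj) in enumerate(triples):
        edge_node = f"{label}#{index}"  # hypothetical naming scheme
        concept_nodes.update([subject, obj])
        edge_nodes.append(edge_node)
        links.append((subject, edge_node))  # concept -> edge node
        links.append((edge_node, obj))      # edge node -> concept
    return concept_nodes, edge_nodes, links

# Illustrative triples, not taken from the paper.
triples = [
    ("Student", "subClassOf", "Person"),
    ("Student", "enrolledIn", "Course"),
]
concepts, edge_nodes, links = to_bipartite(triples)
```

Every link in the result joins one concept node and one edge node, which is the bipartite property.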
Graph Matching with GEM: A maximum likelihood estimation problem. Hidden variables: the mapping matrix. Local search guided by GEM. Search space.
Graph Matching with GEM: M* gives the maximum conditional probability of the data graph Od given the model graph Om. Only many-one matching. Focused on homeomorphisms.
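Read literally, the statement about M* can be written as the following optimization. This is a sketch of the slide's wording, not a formula copied from the paper; it assumes that M ranges over the candidate mapping matrices, and the symbol \mathcal{M} for that set is introduced here for illustration.

```latex
M^{*} = \arg\max_{M \in \mathcal{M}} P(O_d \mid O_m, M)
```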
Graph Matching with GEM: MLE problem with respect to the hidden mapping variables.
Graph Matching with GEM: Need to maximize:
Graph Matching with GEM: Probability that xa is in correspondence with ya given the assignment model. Each of the hidden variables.
Graph Matching with GEM: Graph constraints and Smith-Waterman.
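For reference, Smith-Waterman is a dynamic-programming algorithm for local sequence alignment. The sketch below scores the best local alignment between two strings. It assumes the technique is applied to concept names, and the scoring parameters (`match=2`, `mismatch=-1`, `gap=-1`) are illustrative choices, not values from the paper.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score between strings a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] is the best score of an alignment ending at a[i-1], b[j-1].
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,                        # start a new local alignment
                score[i - 1][j - 1] + step,  # align the two characters
                score[i - 1][j] + gap,    # gap in b
                score[i][j - 1] + gap,    # gap in a
            )
            best = max(best, score[i][j])
    return best

# Illustrative concept names (assumed, lower-cased beforehand).
same = smith_waterman("author", "author")
partial = smith_waterman("hasauthor", "author")
unrelated = smith_waterman("abc", "xyz")
```

The table has one cell per pair of characters, so the running time is quadratic in the string length.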
Graph Matching with GEM: Exhaustive search is not possible. Problem: local maxima. Use K random models plus heuristics (if two classes are mapped, map their parents) and random restarts.
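The idea of starting from K random models to escape local maxima can be sketched as a hill-climbing search with random restarts. The score used here (the number of data-graph edges preserved by a many-one mapping) is a stand-in chosen for illustration, not the likelihood used in the paper, and the parent-mapping heuristic is left out. All function names are hypothetical.

```python
import random

def score(mapping, data_edges, model_edges):
    """Stand-in quality measure: data edges preserved by the mapping."""
    return sum((mapping[u], mapping[v]) in model_edges for u, v in data_edges)

def local_search(mapping, data_edges, model_nodes, model_edges):
    """Greedy hill climbing: reassign one data node while the score improves."""
    mapping = dict(mapping)
    improved = True
    while improved:
        improved = False
        for node in list(mapping):
            for target in model_nodes:
                candidate = dict(mapping)
                candidate[node] = target
                if (score(candidate, data_edges, model_edges)
                        > score(mapping, data_edges, model_edges)):
                    mapping, improved = candidate, True
    return mapping

def random_restarts(data_nodes, data_edges, model_nodes, model_edges, k=10, seed=0):
    """Run the local search from k random many-one mappings and keep the best."""
    rng = random.Random(seed)
    best = None
    for _ in range(k):
        start = {n: rng.choice(model_nodes) for n in data_nodes}
        found = local_search(start, data_edges, model_nodes, model_edges)
        if best is None or (score(found, data_edges, model_edges)
                            > score(best, data_edges, model_edges)):
            best = found
    return best

# Toy graphs (assumed): a -> b -> c matched against x -> y -> z.
data_nodes, data_edges = ["a", "b", "c"], [("a", "b"), ("b", "c")]
model_nodes, model_edges = ["x", "y", "z"], {("x", "y"), ("y", "z")}
best = random_restarts(data_nodes, data_edges, model_nodes, model_edges)
```

Even in this toy case a single run can stall: the mapping a to y, b to z preserves one edge and cannot be improved by changing one node, which is why several starting points are tried.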
Computational Complexity: The SW (Smith-Waterman) technique is O(L^2). EM mapping is O(K*(|Vm|*|Vd|)^2).
Initial Experiments
Large Ontologies
Benchmarks
Conclusions: Structural and syntactic information vs. external resources. Weak performance: dissimilar names and structure. Good performance: extensions and flattening. Not scalable: partitioning and extension. No longer GEM, but converges. Future work: Markov chain Monte Carlo methods. Extensible algorithm: can include other approaches.