Hoai-Viet To1, Ryutaro Ichise2, and Hoai-Bac Le1

1 An Adaptive Machine Learning Framework with User Interaction for Ontology Matching
Hoai-Viet To (1), Ryutaro Ichise (2), and Hoai-Bac Le (1)
(1) Ho Chi Minh University of Science, Vietnam
(2) National Institute of Informatics, Japan
3/27/2017

2 Ontology Matching (OM) Problem
An ontology is a hierarchical structure used to organize concepts. Ontologies play an important role in Semantic Web development. Ontology matching finds correspondences between concepts from two ontologies. It is an important step when integrating heterogeneous information sources in the new Semantic Web environment.

3 Machine Learning Framework for OM
We introduced a machine learning framework for the ontology matching problem in [Ichise, 2008]. Our hypothesis: using a semi-supervised learning method will reduce the manual annotation cost.
[Figure: two ontologies with concepts Ca1, Ca2, Ca3 and Cb1, Cb2, Cb3]
Training instances:
ID           Sim1   Simn   Class
Ca1 ↔ Cb1    0.5    0.7    1
Ca1 ↔ Cb2    0.3    0.56
Pre-alignment:
Correct mappings: Ca1 ↔ Cb1, Ca2 ↔ Cb1, …
Incorrect mappings: Ca1 ↔ Cb2, Ca1 ↔ Cb3, …
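The training table on this slide can be read as one row per candidate concept pair: a vector of similarity values plus a class label taken from the pre-alignment. A minimal sketch of that construction follows. The function names, the concept labels, and the two toy similarity measures are hypothetical stand-ins, not the measures of [Ichise, 2008], and encoding incorrect mappings as class 0 is an assumption (the slide shows only the value 1 for a correct mapping).

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Pair = Tuple[str, str]
Measure = Callable[[str, str], float]

def sim_exact(a: str, b: str) -> float:
    """Toy measure: 1.0 if the labels are equal ignoring case, else 0.0."""
    return 1.0 if a.lower() == b.lower() else 0.0

def sim_char_overlap(a: str, b: str) -> float:
    """Toy measure: Jaccard overlap of the character sets of the labels."""
    sa, sb = set(a.lower()), set(b.lower())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

def build_instances(
    labels_a: Dict[str, str],
    labels_b: Dict[str, str],
    measures: List[Measure],
    correct: Set[Pair],
    incorrect: Set[Pair],
) -> List[Tuple[Pair, List[float], Optional[int]]]:
    """Return (pair, similarity vector, class) for every concept pair.

    Class is 1 for a correct pre-alignment mapping, 0 for an incorrect
    one (assumed encoding), and None when the pair is unlabeled.
    """
    rows = []
    for a, text_a in labels_a.items():
        for b, text_b in labels_b.items():
            sims = [m(text_a, text_b) for m in measures]
            if (a, b) in correct:
                cls: Optional[int] = 1
            elif (a, b) in incorrect:
                cls = 0
            else:
                cls = None
            rows.append(((a, b), sims, cls))
    return rows

# Hypothetical concept labels for two small ontologies.
labels_a = {"Ca1": "Music", "Ca2": "Rock"}
labels_b = {"Cb1": "music", "Cb2": "Sports"}
rows = build_instances(
    labels_a, labels_b, [sim_exact, sim_char_overlap],
    correct={("Ca1", "Cb1")}, incorrect={("Ca1", "Cb2")},
)
for row in rows:
    print(row)
```

Rows whose class is None are the unlabeled pairs that a semi-supervised learner would later try to label.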

4 Semi-supervised Learning with User Interaction
Basic idea: propagate labels through the unlabeled data. Problem: few labeled samples → low-confidence predictions.
[Figure: samples labeled Red and Blue, unlabeled samples marked "?", and a User Interaction step]
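The idea on this slide can be sketched as a self-training loop: confident predictions are propagated to unlabeled samples, and the least confident samples are handed to the user. This is only an illustration under stated assumptions. It uses a nearest-centroid classifier as a simple stand-in for the learner, the names `predict`, `self_train`, and `ask_user`, the 0.8 threshold, and the toy data are all hypothetical, and the seed labels are assumed to contain at least one sample of each class.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]

def predict(x: Vector, labeled: List[Tuple[Vector, int]]) -> Tuple[int, float]:
    """Nearest-centroid prediction: (class, confidence in [0.5, 1.0])."""
    dist = {}
    for cls in (0, 1):
        members = [v for v, y in labeled if y == cls]
        center = [sum(col) / len(members) for col in zip(*members)]
        dist[cls] = sum((xi - ci) ** 2 for xi, ci in zip(x, center)) ** 0.5
    best = min(dist, key=dist.get)
    total = dist[0] + dist[1]
    confidence = 0.5 if total == 0 else 1.0 - dist[best] / total
    return best, confidence

def self_train(
    features: Dict[str, Vector],
    seed_labels: Dict[str, int],
    ask_user: Callable[[str], int],
    threshold: float = 0.8,
    queries_per_round: int = 1,
    rounds: int = 2,
) -> Dict[str, int]:
    """Propagate confident labels; ask the user about uncertain samples."""
    labels = dict(seed_labels)
    for _ in range(rounds):
        labeled = [(features[i], y) for i, y in labels.items()]
        scored = {i: predict(features[i], labeled)
                  for i in features if i not in labels}
        if not scored:
            break
        # Step 1: accept predictions the learner is confident about.
        for i, (y, conf) in scored.items():
            if conf >= threshold:
                labels[i] = y
        # Step 2: send the least confident remaining samples to the user.
        uncertain = sorted((i for i in scored if i not in labels),
                           key=lambda i: scored[i][1])
        for i in uncertain[:queries_per_round]:
            labels[i] = ask_user(i)
    return labels

# Toy data: one similarity value per sample; "a" and "b" are the seeds.
features = {"a": [0.9], "b": [0.1], "c": [0.8], "d": [0.2], "e": [0.55]}
asked: List[str] = []

def ask_user(sample_id: str) -> int:
    """Simulated user who labels every queried sample as class 0."""
    asked.append(sample_id)
    return 0

labels = self_train(features, {"a": 1, "b": 0}, ask_user)
print(labels, asked)
```

In this toy run only the ambiguous sample "e" reaches the simulated user; the samples close to a labeled seed receive propagated labels.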

5 Adaptive Machine Learning Framework
Use multiple learning strategies plus user interaction.
[Figure: framework architecture with the components Ontology Storage, Ontology Parser, Similarity Calculator, Pre-alignment, Initialize labeling, learner training, and User Interaction]

6 Adaptive Machine Learning Framework
The similarity measures are based on those used in the machine learning framework proposed in [Ichise, 2008], which:
includes 24 string-based similarity measures;
calculates similarity between concept features, concept structure features, and concept hierarchical features.
Our system: Machine Learning Framework for Ontology Matching with User Interaction (MalfomUI)
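To make "string-based similarity measure" concrete, here is a small sketch with two generic measures: a normalized edit-distance similarity and a token-level Jaccard similarity. These are common textbook measures chosen for illustration; the source does not list the 24 measures of [Ichise, 2008], so nothing here should be read as one of them. The function names and the example labels and directory paths are hypothetical.

```python
import re
from typing import List

def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insert, delete, and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]

def edit_similarity(a: str, b: str) -> float:
    """1.0 for identical strings, falling toward 0.0 as they diverge."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest

def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of lower-cased alphanumeric tokens."""
    ta = {t for t in re.split(r"[^0-9a-z]+", a.lower()) if t}
    tb = {t for t in re.split(r"[^0-9a-z]+", b.lower()) if t}
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

def similarity_vector(label_a: str, path_a: str,
                      label_b: str, path_b: str) -> List[float]:
    """Compare concept labels and root-to-concept paths of two concepts."""
    return [
        edit_similarity(label_a, label_b),   # concept label
        token_jaccard(label_a, label_b),
        edit_similarity(path_a, path_b),     # hierarchical path
        token_jaccard(path_a, path_b),
    ]

# Hypothetical directory concepts.
print(similarity_vector("Rock", "Arts/Music/Rock", "Music", "Top/Arts/Music"))
```

Applying the same measures to different views of a concept (its label, its neighbors, its path) is one way a fixed set of string measures can yield a longer feature vector.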

7 Experiments
Purpose: compare the performance of our learning framework with other matching systems.
General setting:
Dataset from the directory track of the OAEI 2008 campaign [Caracciolo et al., 2008].
The dataset is constructed from three internet directories: Yahoo, Google, and Looksmart.
Simple equivalence relation.
The dataset includes 4487 labeled matching tasks, of which 2160 are positive samples and 2327 are negative samples.
Base learner: Naïve Bayes
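As an illustration of a Naïve Bayes base learner over similarity vectors, the sketch below fits a Gaussian Naïve Bayes model in plain Python. The Gaussian variant is an assumption made because similarity values are continuous; the slides do not say which variant or implementation was used. The function names, the variance floor, and the toy training data are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Model = Dict[int, Tuple[float, List[float], List[float]]]

def fit_gaussian_nb(X: Sequence[Sequence[float]], y: Sequence[int],
                    var_floor: float = 1e-3) -> Model:
    """Estimate a prior plus per-feature mean and variance for each class."""
    model: Model = {}
    for cls in set(y):
        rows = [x for x, c in zip(X, y) if c == cls]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Floor the variance so a constant feature cannot divide by zero.
        variances = [max(sum((v - m) ** 2 for v in col) / n, var_floor)
                     for col, m in zip(columns, means)]
        model[cls] = (n / len(y), means, variances)
    return model

def predict_proba(model: Model, x: Sequence[float]) -> Dict[int, float]:
    """Posterior class probabilities under the independence assumption."""
    log_post = {}
    for cls, (prior, means, variances) in model.items():
        lp = math.log(prior)
        for v, m, var in zip(x, means, variances):
            lp += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        log_post[cls] = lp
    top = max(log_post.values())          # subtract max for stability
    unnorm = {c: math.exp(lp - top) for c, lp in log_post.items()}
    z = sum(unnorm.values())
    return {c: u / z for c, u in unnorm.items()}

# Toy similarity vectors: class 1 = matching pair, class 0 = non-matching.
X = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.3]]
y = [1, 1, 0, 0]
model = fit_gaussian_nb(X, y)
print(predict_proba(model, [0.85, 0.85]))
```

The posterior returned by `predict_proba` is the kind of confidence score a semi-supervised loop can use to decide which predictions to trust and which to pass to the user.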

8 Experiments
Pre-experiment (supervised learning method): used as a baseline for comparison with the semi-supervised learning method. It studies the effect of training-set size on the performance of the supervised learning method.

9 Experimental Results
MalUI-5 to MalUI-4000: the suffix is the training set size.
[Figure: results by training set size; axis label: Training set size]

10 Experiments
Experiment: semi-supervised learning with user interaction.
Study the performance of the semi-supervised learning method with user interaction. Users annotate 20 samples in the initialization phase and then label 4 more samples in each of 2 feedback rounds (28 labeled samples in total).

11 Experimental Results
Comparison with other matching systems [Caracciolo et al., 2008].
MalUI-RF: change the size.

12 Experimental Results
Semi-supervised learning with user feedback can reduce the cost of manual annotation.
Systems: MalUI-30, MalUI-100, MalUI-500, MalUI-4000, MalUI-RF
Precision: 0.56 0.62 0.68 0.61
Recall: 0.59 0.63 0.74 0.75 0.73
F-Measure: 0.58 0.71 0.67
* In the MalfomUI-RF experiment, users need to label 28 samples in total.
Note the number of samples labeled in the RF experiment.
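For readers unfamiliar with the three metrics on this slide, the sketch below computes them for a set of predicted correspondences against a reference alignment, assuming the standard definitions (F-measure as the harmonic mean of precision and recall). The function name and the example mappings are hypothetical and are not taken from the experiments.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]

def precision_recall_f(predicted: Set[Pair],
                       reference: Set[Pair]) -> Tuple[float, float, float]:
    """Precision, recall, and F-measure of predicted correspondences."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Hypothetical alignment: 4 predicted mappings, 3 reference mappings,
# 2 of them in common.
predicted = {("a1", "b1"), ("a2", "b2"), ("a3", "b1"), ("a4", "b4")}
reference = {("a1", "b1"), ("a2", "b2"), ("a5", "b5")}
p, r, f = precision_recall_f(predicted, reference)
print(f"precision={p:.2f} recall={r:.2f} f-measure={f:.2f}")
```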

13 Conclusion
Conclusions: Our adaptive machine learning framework is effective: it requires less annotation effort while achieving comparable performance. Machine learning approaches with user interaction are promising for ontology matching systems.
Future work: Integrate more similarity measures to cover real datasets. Consider more complex semi-supervised models.
