An Interactive Approach to Collectively Resolving URI Coreference


1 An Interactive Approach to Collectively Resolving URI Coreference
Saisai Gong, Wei Hu, Gong Cheng, Yuzhong Qu

2 Contents
- Background
- Related Work
- Overview of our Approach
- Evolution of Individual Partitions
- Computing Consensus Partition
- Evaluation
- Conclusion

3 Background
- owl:sameAs
- URI coreference ……

4 Related Work
- Fully automatic approaches
  - OWL semantics
  - Similarities between descriptions
  - Self-training
- Automatic approaches remain far from perfect (see Ferrara et al. 2013)

5 Related Work (Cont.)
- Semi-automatic approaches
  - Active learning
  - Micro-task crowdsourcing
- Assumptions made by semi-automatic approaches
  - Users act as an "oracle"
  - There is one single right answer
- These assumptions do not always hold
  - Users may have different opinions
  - Disagreements among users happen
- Distinguish a user's individual URI coreference from that of the masses
- Resolve disagreements among users

6 Our Approach: iReC
- iReC: an interactive approach to collectively resolving URI coreference with user involvement
- Basic idea: achieve a good partition of the URI universe
  - Maintain an individual partition for each user
  - Form a consensus partition aggregated from the individual ones
  - Evolve the partitions through user interaction
- Two goals
  - Alleviate the burden of user involvement
  - Reflect the collective power of the masses

7 Overview of our Approach

8 Candidate Selector: Generating Candidates
- Find potential coreference from various sources
  - owl:sameAs links
  - Existing resolution services such as sameas.org
  - Keyword-based entity search engines such as Falcons Object Search
  - The user's individual partition
  - The consensus partition
- Merge URIs belonging to the same equivalence class into a candidate entity
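The merging step can be illustrated with a small union-find sketch. This is not taken from the slides: the function name `merge_into_candidates` and the `src1:`/`src2:`/`src3:` URIs are hypothetical, and the sketch assumes every source has already been reduced to a list of coreferent URI pairs.

```python
from collections import defaultdict


def merge_into_candidates(pairs):
    """Group URIs connected by coreference pairs into equivalence classes."""
    parent = {}

    def find(u):
        # Return the representative of u's class, registering u if unseen.
        parent.setdefault(u, u)
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # union the two classes

    classes = defaultdict(set)
    for u in list(parent):
        classes[find(u)].add(u)
    # Each sorted class plays the role of one candidate entity.
    return sorted(sorted(c) for c in classes.values())


# Hypothetical coreference links collected from several sources.
links = [
    ("src1:Nanjing", "src2:Nanjing"),
    ("src2:Nanjing", "src3:Nanjing"),
    ("src1:Paris", "src2:Paris"),
]
candidates = merge_into_candidates(links)
```

Here the three Nanjing URIs end up in one candidate entity even though `src1:Nanjing` and `src3:Nanjing` were never linked directly.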

9 Learning Binary Classifier
- Purpose: reduce user involvement
- Learning model: averaged perceptron (see Collins 2002), an online learning algorithm
- Learn the individual classifier both online and offline; learn the global one offline
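For readers unfamiliar with the averaged perceptron, a minimal sketch follows. It is a generic textbook version, not the authors' implementation: the function names, the bias-as-first-feature convention, the epoch count, and the toy feature vectors are all assumptions made for illustration.

```python
def train_averaged_perceptron(examples, n_features, epochs=5):
    """examples: list of (feature_vector, label) with label in {+1, -1}."""
    w = [0.0] * n_features
    total = [0.0] * n_features  # running sum of w after every step
    steps = 0
    for _ in range(epochs):
        for x, y in examples:
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:  # mistake: move w toward the correct side
                w = [wi + y * xi for wi, xi in zip(w, x)]
            total = [ti + wi for ti, wi in zip(total, w)]
            steps += 1
    # The averaged weights are what is used for prediction.
    return [ti / steps for ti in total]


def margin(w, x):
    """Signed score: positive means 'coreferent' in this sketch."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical URI-pair features: [bias, similarity 1, similarity 2].
examples = [
    ([1.0, 0.9, 0.8], +1),
    ([1.0, 0.1, 0.2], -1),
    ([1.0, 1.0, 0.7], +1),
    ([1.0, 0.0, 0.3], -1),
]
w_avg = train_averaged_perceptron(examples, n_features=3)
```

Because each update looks at one example at a time, the same loop body can be applied to a single fresh feedback pair (online) or replayed over a stored set (offline).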

10 Learning Binary Classifier
- Training data
  - Online: latest URI pairs from user feedback
  - Offline training examples
    - Positive: URI pairs from equivalence classes
    - Negative:
      - URI pairs from user feedback
      - URI pairs from different equivalence classes sharing types
      - URI pairs from Falcons search results

11 Learning Binary Classifier
- Training algorithm
  - Features: the Cartesian product of the two candidates' properties
  - Feature value: for each property pair, the maximum similarity (vsim) between the values of the two properties
    - URIs: vsim = 1 iff identical or in the same equivalence class
    - Numeric literals: vsim = 1 iff the difference is less than a threshold
    - Boolean literals: vsim = 1 iff the values are equal
    - Other literals: Jaccard similarity
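The feature computation can be sketched as below. Several details are assumptions rather than facts from the slides: values are modelled as hypothetical `(kind, value)` tuples, values of different kinds score 0, the numeric threshold defaults to `1e-6`, and Jaccard similarity is taken over lowercase whitespace tokens.

```python
def vsim(a, b, equivalent=frozenset(), threshold=1e-6):
    """Similarity of two property values, following the four cases above."""
    kind_a, val_a = a
    kind_b, val_b = b
    if kind_a != kind_b:
        return 0.0  # assumption: values of different kinds never match
    if kind_a == "uri":
        same = val_a == val_b or frozenset((val_a, val_b)) in equivalent
        return 1.0 if same else 0.0
    if kind_a == "number":
        return 1.0 if abs(val_a - val_b) < threshold else 0.0
    if kind_a == "boolean":
        return 1.0 if val_a == val_b else 0.0
    # Other literals: Jaccard similarity over token sets (assumed tokenizer).
    tokens_a = set(val_a.lower().split())
    tokens_b = set(val_b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def feature_vector(desc_a, desc_b, equivalent=frozenset()):
    """One feature per property pair: the maximum vsim over their values."""
    return {
        (p, q): max(
            (vsim(u, v, equivalent) for u in vals_a for v in vals_b),
            default=0.0,
        )
        for p, vals_a in desc_a.items()
        for q, vals_b in desc_b.items()
    }


# Hypothetical descriptions of two candidate entities.
desc_a = {
    "rdfs:label": [("literal", "Example Daily News")],
    "ex:founded": [("number", 1900)],
}
desc_b = {
    "foaf:name": [("literal", "The Example Daily News")],
    "ex:established": [("number", 1900)],
}
features = feature_vector(desc_a, desc_b)
```

With two properties on each side, the Cartesian product yields four features; the label/name pair shares three of four tokens and so scores 0.75.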

12 Learning Binary Classifier
Training algorithm

13 Selecting Most Beneficial Candidate
- Combine the individual classifier and the global one by their weights (α + β = 1)
- Confidence of coreference is based on the margin: the larger the absolute value of the margin, the higher the confidence
- Uncertainty is measured by the absolute value of the margin (a smaller value means higher uncertainty)
- Select the candidate with the minimum absolute value of the margin
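A small sketch of this selection rule follows. The slide does not say which of α and β belongs to which classifier, so the sketch assumes α weights the individual classifier and β = 1 − α weights the global one; the function names, weight vectors, and candidate features are hypothetical.

```python
def combined_margin(x, w_individual, w_global, alpha=0.5):
    """Margin of the alpha/beta-weighted combination of two linear classifiers."""
    beta = 1.0 - alpha  # enforces alpha + beta = 1

    def dot(w):
        return sum(wi * xi for wi, xi in zip(w, x))

    return alpha * dot(w_individual) + beta * dot(w_global)


def most_uncertain(candidates, w_individual, w_global, alpha=0.5):
    """Return the candidate whose combined margin is closest to zero."""
    return min(
        candidates,
        key=lambda name: abs(
            combined_margin(candidates[name], w_individual, w_global, alpha)
        ),
    )


# Hypothetical weights and candidate features ([bias, similarity]).
w_individual = [-0.75, 1.5]
w_global = [-0.25, 0.5]
candidates = {
    "c1": [1.0, 0.9],   # clearly on the positive side
    "c2": [1.0, 0.55],  # close to the decision boundary
    "c3": [1.0, 0.1],   # clearly on the negative side
}
chosen = most_uncertain(candidates, w_individual, w_global)
```

Asking the user about `c2` is the most informative choice here, since the classifiers are already fairly confident about the other two.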

14 Comparative Snippets
- Purpose: facilitate user interaction
- Coreferent (resp. non-coreferent): values of discriminative property pairs are significantly similar (resp. dissimilar)
- Discriminability of a property pair: the absolute value of its weight in the combined classifier

15 Comparative Snippets
- Compute a maximum weighted matching on the bipartite graph formed from property pairs
- Get the top-k property value pairs based on the maximum similarity of property values
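The two steps can be sketched as follows. This is an exhaustive search that is only practical for a handful of properties, not the matching algorithm used by the authors; it assumes non-negative edge weights (such as absolute classifier weights), and the property names, weights, and similarities are hypothetical. Ranking the matched pairs by descending similarity is also an assumption.

```python
from itertools import permutations


def max_weight_matching(left, right, weight):
    """Exhaustive maximum weighted bipartite matching for small inputs."""
    if len(left) > len(right):
        # Keep the smaller side on the left, then flip the pairs back.
        flipped = {(r, l): w for (l, r), w in weight.items()}
        return [(l, r) for r, l in max_weight_matching(right, left, flipped)]
    best, best_total = [], -1.0
    for perm in permutations(right, len(left)):
        pairs = list(zip(left, perm))
        total = sum(weight.get(p, 0.0) for p in pairs)
        if total > best_total:
            best, best_total = pairs, total
    # Drop pairs that have no edge in the graph.
    return [p for p in best if weight.get(p, 0.0) > 0]


def top_k_pairs(matching, similarity, k):
    """Keep the k matched property pairs with the highest value similarity."""
    return sorted(matching, key=lambda p: similarity[p], reverse=True)[:k]


# Hypothetical discriminability weights of property pairs.
weight = {("p1", "q1"): 2.0, ("p1", "q2"): 1.9, ("p2", "q1"): 1.8}
matching = max_weight_matching(["p1", "p2"], ["q1", "q2"], weight)

# Hypothetical maximum value similarities of the matched pairs.
similarity = {("p1", "q2"): 0.4, ("p2", "q1"): 0.9}
snippet = top_k_pairs(matching, similarity, k=1)
```

A greedy choice would take the heaviest edge `("p1", "q1")` and leave `p2` unmatched (total 2.0), whereas the matching above reaches 3.7, which is why a proper matching step is needed.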

16 Computing Consensus Partition
- Minimize disagreements between individual partitions
- In our approach, disagreement is measured with the symmetric difference distance
- The optimization problem is NP-complete
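As an illustration of the distance, the sketch below uses the common definition for partitions: the number of element pairs that are placed together in one partition but apart in the other. It assumes both partitions cover the same URIs; the function names and the toy partitions are hypothetical.

```python
from itertools import combinations


def coclustered_pairs(partition):
    """All unordered URI pairs that share a class in the partition."""
    return {
        frozenset(pair)
        for block in partition
        for pair in combinations(sorted(block), 2)
    }


def symmetric_difference_distance(p1, p2):
    """Number of pairs co-clustered in exactly one of the two partitions."""
    return len(coclustered_pairs(p1) ^ coclustered_pairs(p2))


def total_disagreement(consensus, individuals):
    """Objective to minimize: summed distance to every individual partition."""
    return sum(symmetric_difference_distance(consensus, p) for p in individuals)


# Two hypothetical individual partitions over the same four URIs.
user_1 = [{"a", "b", "c"}, {"d"}]
user_2 = [{"a", "b"}, {"c", "d"}]
distance = symmetric_difference_distance(user_1, user_2)
```

The two users disagree on the pairs (a, c), (b, c) and (c, d), so the distance is 3.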

17 Computing Consensus Partition
- Approximation algorithm, clustering-based
- Compute a partition on the union of the individual partitions' domains
  - First, initialize a similarity matrix Mtrx
  - Begin with each URI forming a separate equivalence class
  - For each class pair (i, j) whose Mtrx entry is > 0, merge classes i and j, and update Mtrx
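A greedy agglomerative sketch of these steps is given below. The slides do not define the matrix entries, so the sketch assumes a vote count: +1 for every user who places two URIs together and −1 for every user who knows both URIs but separates them. Merging the highest-scoring class pair first, and recomputing class similarities from the pairwise votes instead of updating a stored matrix, are likewise assumptions.

```python
from itertools import combinations


def consensus_partition(individuals):
    """Greedy consensus over partitions, each given as a list of sets."""
    uris = sorted(set().union(*(b for p in individuals for b in p)))
    # Pairwise votes over the union of all domains (assumed similarity).
    sim = {}
    for p in individuals:
        block_of = {u: i for i, block in enumerate(p) for u in block}
        for u, v in combinations(sorted(block_of), 2):
            vote = 1 if block_of[u] == block_of[v] else -1
            sim[(u, v)] = sim.get((u, v), 0) + vote

    def class_sim(a, b):
        # Similarity of two classes: sum of votes over all cross pairs.
        return sum(sim.get((min(u, v), max(u, v)), 0) for u in a for v in b)

    classes = [{u} for u in uris]  # start from singleton classes
    while True:
        best = max(
            combinations(range(len(classes)), 2),
            key=lambda ij: class_sim(classes[ij[0]], classes[ij[1]]),
            default=None,
        )
        if best is None or class_sim(classes[best[0]], classes[best[1]]) <= 0:
            return sorted(sorted(c) for c in classes)
        i, j = best
        classes[i] |= classes[j]  # merge class j into class i
        del classes[j]


# Three hypothetical individual partitions with different domains.
individuals = [
    [{"a", "b"}, {"c"}],
    [{"a", "b", "c"}],
    [{"a", "b"}, {"c", "d"}],
]
consensus = consensus_partition(individuals)
```

All three users agree on {a, b}; only one user merges c with it, so c stays apart and joins d, which is the only vote available for that pair.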

18 Computing Consensus Partition

19 Evaluation
- Build links between NYT and DBpedia from the OAEI benchmark
- 10-fold cross-validation

20 Evaluation F-Measure

21 Evaluation
- Examination
  - Choose 50 popular URIs from Falcons
  - Invite 10 people to resolve URI coreference on the 50 URIs using SView
  - On average: … verifications, 32.0 accepted as positive, 53.9 pairs of URIs in individual partitions

22 Evaluation
- User study
  - SUS vs. sigma: 72 vs. 68

23 Conclusion
- The averaged perceptron is feasible
- User involvement is reduced

24 References
- A. Ferrara, A. Nikolov, J. Noessner, and F. Scharffe. Evaluation of instance matching tools: The experience of OAEI. Journal of Web Semantics, 21:49-60, 2013.
- M. Collins. Discriminative training methods for hidden Markov models: Theory and experiments with perceptron algorithms. In Proc. of EMNLP, pages 1-8, 2002.

