KBP2014 Entity Linking Scorer Xiaoman Pan, Qi Li, Heng Ji, Xiaoqiang Luo, Ralph Grishman

Slides:



Advertisements
Similar presentations
Clustering Overview Algorithm Begin with all sequences in one cluster While splitting some cluster improves the objective function: { Split each cluster.
Advertisements

The Primal-Dual Method: Steiner Forest TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AA A A AA A A A AA A A.
Introduction to Algorithms Graph Algorithms
Towards a Quadratic Time Approximation of Graph Edit Distance Fischer, A., Suen, C., Frinken, V., Riesen, K., Bunke, H. Contents Introduction Graph edit.
Great Theoretical Ideas in Computer Science
1 © 2009 APQC. ALL RIGHTS RESERVED. Measurement Assessment Tool to Connect Questions to Measures u Based on “Tree Diagram” format u Can start at different.
Longest Common Subsequence
O(N 1.5 ) divide-and-conquer technique for Minimum Spanning Tree problem Step 1: Divide the graph into  N sub-graph by clustering. Step 2: Solve each.
Overview of the TAC2013 Knowledge Base Population Evaluation: Temporal Slot Filling Mihai Surdeanu with a lot help from: Hoa Dang, Joe Ellis, Heng Ji,
Bipartite Matching, Extremal Problems, Matrix Tree Theorem.
Overview of the TAC2013 Knowledge Base Population Evaluation: English Slot Filling Mihai Surdeanu with a lot help from: Hoa Dang, Joe Ellis, Heng Ji, and.
Comp 122, Spring 2004 Greedy Algorithms. greedy - 2 Lin / Devi Comp 122, Fall 2003 Overview  Like dynamic programming, used to solve optimization problems.
Great Theoretical Ideas in Computer Science for Some.
Label Placement and graph drawing Imo Lieberwerth.
Overview of the KBP 2013 Slot Filler Validation Track Hoa Trang Dang National Institute of Standards and Technology.
Graph Traversals Visit vertices of a graph G to determine some property: Is G connected? Is there a path from vertex a to vertex b? Does G have a cycle?
Efficient Query Evaluation on Probabilistic Databases
Global and Local Wikification (GLOW) in TAC KBP Entity Linking Shared Task 2011 Lev Ratinov, Dan Roth This research is supported by the Defense Advanced.
Great Theoretical Ideas in Computer Science.
1 Discrete Structures & Algorithms Graphs and Trees: II EECE 320.
Data Transmission and Base Station Placement for Optimizing Network Lifetime. E. Arkin, V. Polishchuk, A. Efrat, S. Ramasubramanian,V. PolishchukA. EfratS.
Iterative closest point algorithms
1 Optimization problems such as MAXSAT, MIN NODE COVER, MAX INDEPENDENT SET, MAX CLIQUE, MIN SET COVER, TSP, KNAPSACK, BINPACKING do not have a polynomial.
Whole Genome Alignment using Multithreaded Parallel Implementation Hyma S Murthy CMSC 838 Presentation.
Optimizing F-Measure with Support Vector Machines David R. Musicant Vipin Kumar Aysel Ozgur FLAIRS 2003 Tuesday, May 13, 2003 Carleton College.
A general approximation technique for constrained forest problems Michael X. Goemans & David P. Williamson Presented by: Yonatan Elhanani & Yuval Cohen.
1 高等演算法 Homework One 暨南大學資訊工程學系 黃光璿 2004/11/11. 2 Problem 1.
Dept. of Computer Science Distributed Computing Group Asymptotically Optimal Mobile Ad-Hoc Routing Fabian Kuhn Roger Wattenhofer Aaron Zollinger.
Job Scheduling Lecture 19: March 19. Job Scheduling: Unrelated Multiple Machines There are n jobs, each job has: a processing time p(i,j) (the time to.
Given Connections Solution
Two Discrete Optimization Problems Problem #2: The Minimum Cost Spanning Tree Problem.
An Information Theoretic Approach to Bilingual Word Clustering Manaal Faruqui & Chris Dyer Language Technologies Institute SCS, CMU.
Image Segmentation Rob Atlas Nick Bridle Evan Radkoff.
Tag-based Social Interest Discovery
Motif Discovery in Protein Sequences using Messy De Bruijn Graph Mehmet Dalkilic and Rupali Patwardhan.
C&O 355 Lecture 2 N. Harvey TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: A A.
V. V. Vazirani. Approximation Algorithms Chapters 3 & 22
C OLLECTIVE ANNOTATION OF WIKIPEDIA ENTITIES IN WEB TEXT - Presented by Avinash S Bharadwaj ( )
The Traveling Salesman Problem Approximation
JHU/CLSP/WS07/ELERFED Scoring Metrics for IDC, CDC and EDC David Day ELERFED JHU Workshop July 18, 2007.
EMIS 8373: Integer Programming NP-Complete Problems updated 21 April 2009.
1 Approximate Algorithms (chap. 35) Motivation: –Many problems are NP-complete, so unlikely find efficient algorithms –Three ways to get around: If input.
Exploiting Context Analysis for Combining Multiple Entity Resolution Systems -Ramu Bandaru Zhaoqi Chen Dmitri V.kalashnikov Sharad Mehrotra.
Graph Colouring L09: Oct 10. This Lecture Graph coloring is another important problem in graph theory. It also has many applications, including the famous.
MIDDLEWARE SYSTEMS RESEARCH GROUP Adaptive Content-based Routing In General Overlay Topologies Guoli Li, Vinod Muthusamy Hans-Arno Jacobsen Middleware.
Clustering.
Presenter : Kuang-Jui Hsu Date : 2011/3/24(Thur.).
Bipartite Matching. Unweighted Bipartite Matching.
CSCI-256 Data Structures & Algorithm Analysis Lecture Note: Some slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved. 25.
DNA, RNA and protein are an alien language
Efficient Belief Propagation for Image Restoration Qi Zhao Mar.22,2006.
On the Hardness of Optimal Vertex Relabeling and Restricted Vertex Relabeling Amihood Amir Benny Porat.
Tzvika Hartman Elad Verbin Bar Ilan University Tel Aviv University
Pattern Matching With Don’t Cares Clifford & Clifford’s Algorithm Orgad Keller.
Privacy Issues in Graph Data Publishing Summer intern: Qing Zhang (from NC State University) Mentors: Graham Cormode and Divesh Srivastava.
Pertemuan 23 IP Routing Protocols
NYU Coreference CSCI-GA.2591 Ralph Grishman.
What is the next line of the proof?
Great Theoretical Ideas in Computer Science
Lecture 24: NER & Entity Linking
Pattern Matching With Don’t Cares Clifford & Clifford’s Algorithm
Tractable MAP Problems
Neural Networks Chapter 4
Fair Clustering through Fairlets ( NIPS 2017)
Problem Solving 4.
TF Media 3rd official Meeting
Optimal Point Movement for Covering Circular Regions
Umans Complexity Theory Lectures
Approximation Algorithms for the Selection of Robust Tag SNPs
Presentation transcript:

KBP2014 Entity Linking Scorer Xiaoman Pan, Qi Li, Heng Ji, Xiaoqiang Luo, Ralph Grishman

Overview  We will apply two steps to evaluate the KBP2014 Entity Discovery and Linking results. 1. Clustering score 2. Entity Linking (Wikification) F-score

Overview  We use a tuple doc-id, start, end, entity-type, kb-id to represent each entity mention, where a special type of kb-id is NIL.  Let s be an entity mention in the system output, g be an corresponding gold-standard. An output mention s matches a reference mention g iff: 1. s.doc-id = g.doc-id, 2. s.start = g.start, s.end = g.end, 3. s.entity-type = g.entity-type, 4. and s.kb-id = g.kb-id.

Clustering score  In this step, we only concern clustering performance. Thus, change all mentions’ kb_id to NIL.  We will apply three metrics to evaluate the clustering score  B-Cubed metric  CEAF metric  Graph Edit Distance (G-Edit)

B-Cubed

CEAF  CEAF (Luo 2005)  Idea: a mention or entity should not be credited more than once  Formulated as a bipartite matching problem A special ILP problem efficient algorithm: Kuhn-Munkres  We will use CEAFm as the official scoring metric because it’s more sensitive to cluster size than CEAFe

CEAF (Luo, 2005)

G-Edit  The first step, evaluate the name mention tagging F-measure  The second step, evaluate the overlapped mentions by using graph-based edit distance  To do:  Find a theoretical proof that the connected component based method is the unique optimal solution  Try to make the matching more sensitive to cluster size

Wikification F-score  Existing Entity Linking (Wikification) F-score