Effective Entity Recognition and Typing by Relation Phrase-Based Clustering 20151116.


1 Effective Entity Recognition and Typing by Relation Phrase-Based Clustering

2 Content
Motivation
Definition
Problem model
ClusType Algorithm
Experiments

3 Motivation
Fine-grained type information is useful for downstream applications (e.g., it improved the F1 score by 93% for a relation extraction system [1]).
Traditional named entity recognition systems are designed for a few major types (e.g., person, organization, location) and general domains (e.g., news), and require additional steps to adapt to a new domain and new types.
Entity linking techniques suffer from limited coverage and freshness (e.g., over 50% of the entities mentioned in Web documents are unlinkable [2]).
Previous methods have difficulty handling entity mentions with sparse context.
There are often many ways to describe the same relation between two entities (e.g., "beat" and "won the game over").

4 Definition

5 Problem model
Based on several hypotheses:
Hypothesis 1: Entity-relation co-occurrences. If surface name c often appears as the left (right) argument of relation phrase p, then c's type indicator tends to be similar to the corresponding type indicator in p's type signature.
Hypothesis 2: Mention correlation. If there exists a strong correlation (i.e., within sentence, common neighbor mentions) between two candidate mentions that share the same name, then their type indicators tend to be similar.

6 Problem model
Hypothesis 3: Type signature consistency. If two relation phrases have similar cluster memberships, the type indicators of their left and right arguments (their type signatures) tend to be similar, respectively.
Hypothesis 4: Relation phrase similarity. Two relation phrases tend to have similar cluster memberships if they have similar (1) strings, (2) context words, and (3) left and right argument type indicators.

7 [Figure: illustration annotated with Hypotheses H1, H2, H3, and H4]

8 ClusType Algorithm Framework Overview
1. Perform phrase mining on a POS-tagged corpus to extract candidate entity mentions and relation phrases, and construct a heterogeneous graph G.
2. Collect seed entity mentions M_L as labels by linking extracted candidate mentions M to the KB Ψ.
3. Estimate the type indicator y for each unlinkable candidate mention m ∈ M_U with G, using clustering-integrated type propagation.

9 Candidate Generation[4]
1. Mine frequent contiguous patterns up to a fixed length.
2. Use a greedy agglomerative algorithm to merge them into longer phrases, terminating when the next highest-scoring merge does not meet a predefined significance threshold.
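The two steps above can be sketched roughly as follows. This is a minimal illustration, not the implementation from [4]: the function names, the `min_support` and `threshold` defaults, and the particular significance score (observed count of the merged phrase minus its expected count under independence, divided by the square root of the observed count) are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt

def frequent_ngrams(sentences, max_len=3, min_support=2):
    """Count contiguous n-grams (as tuples) up to max_len; keep those seen >= min_support times."""
    counts = Counter()
    for tokens in sentences:
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return {g: c for g, c in counts.items() if c >= min_support}

def significance(left, right, counts, total):
    """Assumed score: how much more often left+right occurs than chance would predict."""
    merged = counts.get(left + right, 0)
    if merged == 0:
        return float("-inf")  # the merged phrase is not frequent, so never merge
    expected = counts.get(left, 0) * counts.get(right, 0) / total
    return (merged - expected) / sqrt(merged)

def segment(tokens, counts, total, threshold=1.0):
    """Greedily merge the best-scoring adjacent pair until no merge meets the threshold."""
    phrases = [(t,) for t in tokens]
    while len(phrases) > 1:
        scores = [significance(phrases[i], phrases[i + 1], counts, total)
                  for i in range(len(phrases) - 1)]
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] < threshold:
            break
        phrases[best:best + 2] = [phrases[best] + phrases[best + 1]]
    return [" ".join(p) for p in phrases]
```

Here `total` is meant to be the number of tokens in the corpus. A real pipeline would also apply the POS-based filtering mentioned in the framework overview, which this sketch leaves out.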

10 Construction of Graph G
Name-Relation Phrase Subgraph

11 Construction of Graph G
Mention Correlation Subgraph
Mention-Name Subgraph: Washington <-> 76_Washington
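As a rough illustration of the name-relation phrase and mention-name subgraphs, the sketch below builds them from extracted triples. Several things are assumptions for the example: the input format of (left mention ID, relation phrase, right mention ID), the reading of a mention ID such as 76_Washington as "<document ID>_<surface name>", and the function names. The mention correlation subgraph is omitted because it needs context features that the slides do not spell out.

```python
from collections import defaultdict

def surface_name(mention_id):
    """Assumed ID scheme '<doc id>_<name>', e.g. '76_Washington' -> 'Washington'."""
    return mention_id.split("_", 1)[1]

def build_graph(triples):
    """triples: iterable of (left mention ID, relation phrase, right mention ID)."""
    left_args = defaultdict(int)    # (name, relation phrase) -> count as left argument
    right_args = defaultdict(int)   # (name, relation phrase) -> count as right argument
    mention_name = {}               # mention ID -> surface name (mention-name links)
    for left, phrase, right in triples:
        mention_name[left] = surface_name(left)
        mention_name[right] = surface_name(right)
        left_args[(mention_name[left], phrase)] += 1
        right_args[(mention_name[right], phrase)] += 1
    return dict(left_args), dict(right_args), mention_name
```

Keeping left and right argument counts separate mirrors Hypothesis 1, which treats the two argument positions of a relation phrase as carrying different type signatures.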

12 Clustering-integrated Type Propagation
1. Seed Mention Generation: use an entity name disambiguation tool and keep only the entities mapped with high confidence scores (η > 0.8).
2. Joint Optimization: jointly optimize the type indicators of entity names C and the type signatures of relation phrases {P_L, P_R}. The term F follows from Hypothesis 1 to model type propagation.
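The seed filtering step might look like the following sketch. The shape of the linker output (a mapping from mention ID to a KB type and a confidence score) and the function name are hypothetical; only the threshold η > 0.8 comes from the slide.

```python
def select_seeds(linked, eta=0.8):
    """Keep mentions whose linking confidence is strictly above eta.

    linked: mention ID -> (KB type, confidence score); this format is assumed.
    Returns: mention ID -> KB type for the retained seed mentions.
    """
    return {mention: kb_type
            for mention, (kb_type, confidence) in linked.items()
            if confidence > eta}
```

Mentions that fall at or below the threshold are treated as unlinkable and left for the propagation step to type.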

13 Clustering-integrated Type Propagation
A second term follows Hypotheses 3 and 4 to model the multi-view relation phrase clustering.
A third term models the type indicator for each entity mention candidate, the mention-mention links, and the supervision from seed mentions.
Finally, solve the real-valued relaxation of (2) and predict the exact type of each candidate mention from its type indicator.
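To give a feel for the propagation idea, here is a heavily simplified sketch that uses Hypothesis 1 only. It is not the ClusType joint optimization: there is no relation phrase clustering and no mention-level graph, seeds are given per surface name rather than per mention, and the update rule (alternating weighted sums with normalization) is an assumption chosen for brevity. The decision rule in `predict_type`, picking the largest entry of the indicator vector, is likewise an assumption.

```python
def propagate_types(left_args, right_args, seeds, types, iterations=10):
    """left_args/right_args: (name, phrase) -> count; seeds: name -> type."""
    k = len(types)
    one_hot = {n: [1.0 if t == s else 0.0 for t in types] for n, s in seeds.items()}
    names = {n for n, _ in list(left_args) + list(right_args)}
    C = {n: one_hot.get(n, [0.0] * k) for n in names}
    for _ in range(iterations):
        # Relation phrase signatures: weighted sums of their arguments' indicators.
        sig = {}
        for side, args in (("L", left_args), ("R", right_args)):
            for (name, phrase), w in args.items():
                vec = sig.setdefault((side, phrase), [0.0] * k)
                for j in range(k):
                    vec[j] += w * C[name][j]
        # Name indicators: weighted sums of matching signatures; seeds stay fixed.
        new_C = {}
        for n in names:
            if n in one_hot:
                new_C[n] = one_hot[n]
                continue
            vec = [0.0] * k
            for side, args in (("L", left_args), ("R", right_args)):
                for (name, phrase), w in args.items():
                    if name == n:
                        for j in range(k):
                            vec[j] += w * sig[(side, phrase)][j]
            total = sum(vec)
            new_C[n] = [v / total for v in vec] if total else vec
        C = new_C
    return C

def predict_type(indicator, types):
    """Assumed decision rule: the type with the largest score, or None if all scores are zero."""
    best = max(range(len(types)), key=indicator.__getitem__)
    return types[best] if indicator[best] > 0 else None
```

A name that never shares a relation phrase position with a seed keeps an all-zero indicator in this sketch and stays untyped; the clustering of relation phrases in the full method is what lets evidence flow between differently worded relations such as "beat" and "won the game over".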

14 Experiments


16 References
[1] X. Ling and D. S. Weld. Fine-Grained Entity Recognition. AAAI, 2012.
[2] T. Lin, Mausam, and O. Etzioni. No Noun Phrase Left Behind: Detecting and Typing Unlinkable Entities. EMNLP-CoNLL, 2012.
[3] X. Ren, A. El-Kishky, C. Wang, et al. ClusType: Effective Entity Recognition and Typing by Relation Phrase-Based Clustering. KDD, 2015.
[4] A. El-Kishky, Y. Song, C. Wang, C. R. Voss, and J. Han. Scalable Topical Phrase Mining from Text Corpora. VLDB, 2015.

