Text Analysis Conference Knowledge Base Population 2013 Hoa Trang Dang National Institute of Standards and Technology Sponsored by:


1 Text Analysis Conference Knowledge Base Population 2013 Hoa Trang Dang National Institute of Standards and Technology Sponsored by:

2 TAC KBP Goals
Goal: Populate a knowledge base (KB) with information about entities as found in a collection of source documents, following a specified schema for the KB
KBP 2009-2011: Focus on augmenting an existing KB. Decompose KBP into two tasks
▫Entity-Linking: link each given named entity mention to a node in reference KB (or create new node)
▫Slot-Filling: Learn attributes about target entities from the source documents and add new information about the entity to the reference KB
KBP 2012: Combine entity-linking and slot-filling to build a KB from scratch -> Cold Start
KBP 2013:
▫Conversational, informal data (discussion fora)
▫Temporal constraints for Slot Filling (2011 pilot)
▫Sentiment analysis for Slot Filling

3 TAC KBP 2013 Track Participants
Track coordinators
▫Hoa Dang (Slot Filler Validation)
▫Jim Mayfield (Entity Linking, Cold Start KBP)
▫Margaret Mitchell (Sentiment Slot Filling)
▫Mihai Surdeanu (English Slot Filling and Temporal Slot Filling)
LDC linguistic resource providers: Joe Ellis, Jeremy Getman, Justin Mott, Xuansong Li, Kira Griffitt, Stephanie M. Strassel, Jonathan Wright
Coordinators emeritus: Ralph Grishman, Heng Ji
Advisor: Boyan Onyshkevych
45 Teams
▫14 countries (21 USA, 9 China, 3 Spain, 2 Germany, ...)

4 6 (8) TAC KBP 2013 Tracks
Entity-Linking
▫English
▫Chinese
▫Spanish
Slot-Filling (English)
▫Regular
▫Sentiment
▫Temporal
▫Slot Filler Validation Task
Cold Start (English)

5 Entity Linking and Slot Filling Tracks
Goal: Augment a reference knowledge base (KB) with info about query entities (PER, ORG, GPE) as found in a diverse collection of documents
Reference KB: Oct 2008 Wikipedia snapshot. Each KB node corresponds to a Wikipedia page and contains:
▫Infobox
▫Wiki_text (free text not in infobox)
English source documents:
▫1M News docs
▫1M Web docs
▫99K Discussion Forum docs (threads)
Chinese source documents: 2M news, 800K Web
Spanish source documents: 900K news

6 Entity-Linking Evaluation Results
English
▫Participants: 26 teams
▫Highest F1: 0.721 (0.730 in 2012)
▫Median F1: 0.583 (0.536 in 2012)
Chinese
▫Participants: 4 teams
▫Highest F1: 0.622 (0.740 in 2012)
▫Median F1: 0.619 (0.617 in 2012)
Spanish
▫Participants: 3 teams
▫Highest F1: 0.709 (0.641 in 2012)
▫Median F1: 0.651 (0.612 in 2012)

7 Regular Slot Filling Evaluation Results
Participants: 18 teams
Human F1: 0.685 (0.814 in 2012)
Highest System F1: 0.373 (0.517 in 2012)
2nd Highest System F1: 0.339 (0.296 in 2012)
Median System F1: 0.150 (0.099 in 2012)

8 Sentiment Slot Filling Track
Sentiment analysis for KBP:
▫Holder (PER, ORG, GPE)
▫Target (PER, ORG, GPE)
▫Polarity (positive, negative)
Implemented as regular slot filling, with different set of slots
▫{per,org,gpe}:positive-towards
▫{per,org,gpe}:negative-towards
▫{per,org,gpe}:positive-from
▫{per,org,gpe}:negative-from
Participants: 3 teams
Evaluation results:
▫Human F1: 0.727
▫Highest System F1: 0.132
▫Median System F1: 0.014
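The brace notation in the slot list stands for one slot per entity type, which gives twelve concrete slot names. A minimal sketch of that expansion follows; the variable names are illustrative and not taken from the task definition.

```python
from itertools import product

# Components read off the slide's brace notation.
entity_types = ["per", "org", "gpe"]
polarities = ["positive", "negative"]
directions = ["towards", "from"]

# Expand {per,org,gpe}:{positive,negative}-{towards,from} into full slot names.
slots = [
    f"{etype}:{polarity}-{direction}"
    for etype, polarity, direction in product(entity_types, polarities, directions)
]

for slot in slots:
    print(slot)
```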

9 Temporal Slot Filling Track
Find tightest temporal constraints [T1 T2 T3 T4] on a given relation
▫Relation is true for a period beginning between T1 and T2
▫Relation is true for a period ending between T3 and T4
Participants: 5 teams
Evaluation results:
▫Human Accuracy: 0.688
▫Highest System Accuracy: 0.331
▫Median System Accuracy: 0.148
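The four-tuple can be read as two bounded ranges: one for when the relation starts and one for when it ends. The sketch below checks whether a known start and end date fit a given constraint. It assumes dates are `datetime.date` values and that `None` marks an unconstrained side; the `satisfies` helper and the sample dates are hypothetical and not part of the task definition or its scorer.

```python
from datetime import date
from typing import Optional, Tuple

# Assumed representation of a [T1 T2 T3 T4] constraint;
# None marks a side that is left unconstrained.
Constraint = Tuple[Optional[date], Optional[date], Optional[date], Optional[date]]

def _within(value: date, low: Optional[date], high: Optional[date]) -> bool:
    """True if low <= value <= high, treating None bounds as open."""
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True

def satisfies(start: date, end: date, constraint: Constraint) -> bool:
    """Check that a relation holding from `start` to `end` fits the constraint."""
    t1, t2, t3, t4 = constraint
    return _within(start, t1, t2) and _within(end, t3, t4)

# Made-up constraint: starts some time in 2001, ends on or after 2005-06-01.
constraint = (date(2001, 1, 1), date(2001, 12, 31), date(2005, 6, 1), None)

print(satisfies(date(2001, 3, 15), date(2007, 1, 1), constraint))
print(satisfies(date(2002, 3, 15), date(2007, 1, 1), constraint))
```

The first call fits the constraint; the second does not, because its start date falls after T2.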

10 Slot Filler Validation Track (SFV)
Task: Determine whether or not a candidate slot filler is correct
Objective: improve precision without excessive reduction of recall
Participants: 5 teams
Some SFV runs had overwhelmingly positive impact on individual SF runs!
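The precision/recall trade-off behind the SFV objective can be illustrated with the standard F1 formula (the harmonic mean of precision and recall). The sketch below uses made-up counts, not TAC results, and the official scorer may differ in how it counts correct, returned, and gold fillers.

```python
def precision_recall_f1(num_correct: int, num_returned: int, num_gold: int):
    """Return (precision, recall, F1) from raw counts."""
    precision = num_correct / num_returned if num_returned else 0.0
    recall = num_correct / num_gold if num_gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Illustrative counts only: a slot-filling run before and after a validator
# discards many wrong fillers and a few right ones.
before = precision_recall_f1(num_correct=40, num_returned=200, num_gold=100)
after = precision_recall_f1(num_correct=36, num_returned=60, num_gold=100)

print("before: P=%.3f R=%.3f F1=%.3f" % before)
print("after:  P=%.3f R=%.3f F1=%.3f" % after)
```

In this toy case the filter gives up a little recall (0.40 to 0.36) but raises precision enough (0.20 to 0.60) for F1 to improve.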

11 Cold Start KBP Track
Goal: Build a KB from scratch, containing all targeted info about all entities as found in a relatively closed domain corpus of documents
KB schema: same entity types and slots as regular slot-filling task
Source document collection:
▫50K Web pages from small-town publications (from TREC KBA document stream)
Required capabilities:
▫Entity-linking: Grounding all named entity mentions in docs to KB nodes
▫Slot-filling: Learning attributes about all named entities
Post-submission evaluation queries traverse KB starting from a single entity node (entity mention):
▫0-hop: Find all children of Michael Jordan
▫1-hop: Find date of birth of each of the children of Michael Jordan
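The 0-hop and 1-hop queries above amount to following one or two slot edges from a starting node. A minimal sketch under stated assumptions: the KB is held as subject/slot/filler triples, the entity IDs and date values are placeholders, and the slot names `per:children` and `per:date_of_birth` are assumed here for illustration. The `zero_hop` and `one_hop` helpers are hypothetical, not the track's evaluation code.

```python
from collections import defaultdict

# Toy KB with placeholder entity IDs and values (not real data).
triples = [
    ("E1", "per:children", "E2"),
    ("E1", "per:children", "E3"),
    ("E2", "per:date_of_birth", "DATE_A"),
    ("E3", "per:date_of_birth", "DATE_B"),
]

# Index the triples as kb[subject][slot] -> list of fillers.
kb = defaultdict(lambda: defaultdict(list))
for subject, slot, filler in triples:
    kb[subject][slot].append(filler)

def zero_hop(kb, entity, slot):
    """Fillers of `slot` for `entity`; .get avoids creating empty entries."""
    return list(kb.get(entity, {}).get(slot, []))

def one_hop(kb, entity, first_slot, second_slot):
    """Follow `first_slot` from `entity`, then `second_slot` from each result."""
    results = []
    for intermediate in zero_hop(kb, entity, first_slot):
        results.extend(zero_hop(kb, intermediate, second_slot))
    return results

print(zero_hop(kb, "E1", "per:children"))
print(one_hop(kb, "E1", "per:children", "per:date_of_birth"))
```

A 1-hop answer can only be right if both the intermediate entity and its own slot fill were extracted and linked correctly, so errors compound along the path.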

12 Cold Start Evaluation Results (Preliminary)
Participants: 3 teams
0-hop queries:
▫Highest F1: 0.384 (0.497 in 2012)
1-hop queries:
▫Highest F1: 0.145 (0.255 in 2012)
Combined 0-hop and 1-hop F1
▫Highest F1: 0.278 (~0.352 in 2012)

13 TAC KBP Discussion/Planning Sessions
Monday, November 18 (2:15-3:10pm):
▫English Slot Filling
▫Slot Filler Validation
▫Temporal Slot Filling?
▫+Spanish Slot Filling?
▫+Event identification and argument extraction?
Tuesday, November 19 (3:00-4:00pm):
▫Cold Start
▫English Entity Linking (as queries in Cold Start framework?)
▫Cross-Lingual Spanish and Chinese Entity Linking + Discussion forum

