Published by Uriel Sutt. Modified over 9 years ago.
1
Open IE to KBP Relations in 3 Hours
Stephen Soderland
John Gilmer, Rob Bart, Oren Etzioni, Daniel S. Weld
Turing Center, University of Washington
11/18/2013, TAC-KBP Workshop
9
Open IE
“Steve Jobs, the co-founder of Apple, died of cancer in his Palo Alto home.”
(Arg1, Rel, Arg2)
(Steve Jobs, died of, cancer)
(Steve Jobs, died in, his Palo Alto home)
(Steve Jobs, is co-founder of, Apple)
“Hamas denied responsibility for the attacks, which threaten to derail ongoing peace talks.”
(Arg1, Rel, Arg2)
(Hamas, denied responsibility for, the attacks)
(the attacks, threatened to derail, ongoing peace talks)
“Ribosomes, which are complexes made of ribosomal RNA and protein, are the cellular components that carry out protein synthesis.”
(Arg1, Rel, Arg2)
(Ribosomes, are complexes made of, ribosomal RNA and protein)
(Ribosomes, are, the cellular components)
(Ribosomes, carry out, protein synthesis)
10
Advantages of Open IE
– Robust
– Massively scalable
– Works out of the box
– Finds whatever relations are expressed in the text
– Not tied to an ontology of relations
Disadvantages
– Finds whatever relations are expressed in the text
– Not tied to an ontology of relations
Challenge
– Map Open IE to an ontology of relations
– Minimum of user effort
github/knowitall/openie
11
per:cause_of_death:
Head: high frequency
(Steve Jobs, died of, cancer)
(Steve Jobs, died from, cancer)
(Steve Jobs, passed away from, cancer)
(Steve Jobs, succumbed to, cancer)
(cancer, killed, Steve Jobs)
…
Long tail: low frequency
(cancer, claimed the life of, Steve Jobs)
(Steve Jobs, lost his battle to, cancer)
(Steve Jobs, was a victim of, cancer)
(Steve Jobs, could not beat, cancer)
(Steve Jobs, could not have prevented, his death from cancer)
(Steve Jobs, joins the ranks of, cancer fatalities)
…
12
Outline
Rules to map to target relations
– Rule language
– Semantic taggers
KBP system
– Architecture
– 3 hour rule set vs. 12 hour rule set
Results and discussion
Future work
13
Desiderata for Target Relation Mapping
– Works even if no annotated training data is available
– User may have limited skill in NLP and ML
– Rules are understandable to the user
– High precision and good generalization
Approach:
– Manually created rules based on Open IE tuples
– Simple rule language
– Rules combine lexical and semantic type constraints
– Extensible semantic types based on keyword tagger
14
Rule language
(Smith, was appointed, Acting Director of Acme Corporation)
Labels on the tuple: entity (Smith), slotfill (Acme Corporation)
Terms in Rule: Example
Target relation: per:employee_or_member_of
Query entity in: Arg1
Slotfill in: Arg2
Slotfill type: Organization
Arg1 terms: -
Relation terms: appointed
Arg2 terms: of
Functional? no
15
Rule language
(Smith, was appointed, Acting Director of Acme Corporation)
per:employee_or_member_of (Smith, Acme Corporation)
Terms in Rule: Example
Target relation: per:employee_or_member_of
Query entity in: Arg1
Slotfill in: Arg2
Slotfill type: Organization
Arg1 terms: -
Relation terms: appointed
Arg2 terms: of
Functional? no
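The rule on this slide can be read as a small matching procedure. The following is a minimal Python sketch of that reading, not the authors' implementation: the class names, the `apply_rule` function, and the `KNOWN_TYPES` lookup (a stand-in for the Organization tagger) are all hypothetical, and the "Functional?" flag is carried on the rule but not used.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class OpenIETuple:
    arg1: str
    rel: str
    arg2: str


@dataclass(frozen=True)
class Rule:
    target_relation: str
    entity_in: str            # "arg1" or "arg2"
    slotfill_in: str          # "arg1" or "arg2"
    slotfill_type: str
    rel_terms: Tuple[str, ...] = ()
    arg2_terms: Tuple[str, ...] = ()
    functional: bool = False  # kept from the slide; unused in this sketch


# Hypothetical stand-in for a semantic tagger: type name -> known phrases.
KNOWN_TYPES = {"Organization": ["Acme Corporation"]}


def find_typed_span(text: str, type_name: str) -> Optional[str]:
    """Return the first known phrase of the given type found in text."""
    for phrase in KNOWN_TYPES.get(type_name, []):
        if phrase.lower() in text.lower():
            return phrase
    return None


def contains_all(text: str, terms: Tuple[str, ...]) -> bool:
    """True if every required term appears as a whole word in text."""
    words = text.lower().split()
    return all(term.lower() in words for term in terms)


def apply_rule(rule: Rule, tup: OpenIETuple, query_entity: str):
    """Map one Open IE tuple to a target relation, or return None."""
    if query_entity.lower() not in getattr(tup, rule.entity_in).lower():
        return None
    if not contains_all(tup.rel, rule.rel_terms):
        return None
    if not contains_all(tup.arg2, rule.arg2_terms):
        return None
    slotfill = find_typed_span(getattr(tup, rule.slotfill_in), rule.slotfill_type)
    if slotfill is None:
        return None
    return (rule.target_relation, query_entity, slotfill)


rule = Rule("per:employee_or_member_of", "arg1", "arg2", "Organization",
            rel_terms=("appointed",), arg2_terms=("of",))
tup = OpenIETuple("Smith", "was appointed", "Acting Director of Acme Corporation")
result = apply_rule(rule, tup, "Smith")
```

Under these assumptions, `result` holds the relation name, the query entity, and the Organization span found in Arg2, mirroring the slide's per:employee_or_member_of (Smith, Acme Corporation).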
16
Semantic Tagging
General types
– Person, Organization, Location, Date
– NER tagger
– WordNet
User-specified types
– Keyword tagger
– User creates file of terms for the semantic type
– Tagger takes file as input
– Used lists from CMU’s NELL for KBP
github/knowitall/taggers
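A keyword tagger of the kind described here can be sketched in a few lines of Python. This is an illustrative sketch only and does not reflect the API of the taggers project: `load_terms`, `build_tagger`, and the type name `HeadJobTitle` are hypothetical, and the two terms are taken from the NELL examples in this talk.

```python
import re


def load_terms(lines):
    """Read one term per line, ignoring blank lines (as from a user's term file)."""
    return [line.strip() for line in lines if line.strip()]


def build_tagger(type_name, terms):
    """Return a function that tags occurrences of the terms with type_name."""
    if not terms:
        return lambda text: []
    # Longest terms first, so a multi-word term wins over a shorter one it contains.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )

    def tag(text):
        # Each tag is (matched text, semantic type, start offset, end offset).
        return [(m.group(0), type_name, m.start(), m.end())
                for m in pattern.finditer(text)]

    return tag


# In practice the lines would come from the user's file of terms.
terms = load_terms(["acting chief director", "", "vice-director"])
tag_head_job_title = build_tagger("HeadJobTitle", terms)
tags = tag_head_job_title("Smith was named acting chief director of Acme Corporation")
```

Matching is case-insensitive and on word boundaries, a design choice made for this sketch rather than something the slides specify.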
17
Semantic Types from CMU’s NELL
4K Job titles: academic coordinator … zonal underwriting manager
182 Head job titles: acting chief director … vice-director
47 Religions: Adventist … Zoroastrianism
114 Nationalities: Akkadian … Zambian
5K Cities: Aachen … Zwolle
536 State-provinces: Ad Dali … Zlitan
241 Countries: Afghanistan … Zimbabwe
18
Outline
Rules to map to target relations
– Rule language
– Semantic taggers
KBP system
– Architecture
– 3 hour rule set vs. 12 hour rule set
– Co-reference
Results and discussion
Future work
19
KBP Architecture
[Architecture diagram not preserved; surviving label: 200M tuples]
20
What We Did Not Handle
Entity disambiguation needed for KBP precision
– Good extraction for “Paul Gray”, but wrong Paul Gray
Mostly ignored this in our system
– Find any tuple that matched entity string
– Detect ambiguous entities if linked to multiple KB entries
– Discard all results for ambiguous entities
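The discard step described on this slide can be sketched as a simple filter. This is a hedged Python illustration, not the system's code: the function name, the dictionary shapes, and the KB identifiers are hypothetical.

```python
def discard_ambiguous(results, kb_links):
    """Keep results only for entity strings linked to at most one KB entry.

    results:  entity string -> list of extracted slot fills
    kb_links: entity string -> list of KB entry ids the string links to
    """
    return {
        entity: fills
        for entity, fills in results.items()
        if len(kb_links.get(entity, [])) <= 1
    }


results = {
    "Paul Gray": ["per:title(Paul Gray, bassist)", "per:title(Paul Gray, president)"],
    "Tantawi": ["per:title(Tantawi, sheik)"],
}
# Hypothetical KB ids; "Paul Gray" links to two different entries.
kb_links = {"Paul Gray": ["KB:A", "KB:B"], "Tantawi": ["KB:C"]}

kept = discard_ambiguous(results, kb_links)
```

Here all results for "Paul Gray" are dropped because the string links to more than one KB entry, which trades recall for precision.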
21
Creating Rule Sets
3 Hour Rule Set
– Avg 3 rules per relation
– Light editing of NELL keyword lists
per:cause_of_death = “died of”, “died from”, “died as a result of”, “died due to”
12 Hour Rule Set (over two week period)
– Avg 16 rules per relation
– Refined rules, testing on 2012 KBP answer key
– Further editing of NELL keyword lists
per:cause_of_death = “die of”, “dies of”, “dying of”, … “succumbed to”, “succumbs to”, …
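To see how the longer phrase list widens coverage, the two per:cause_of_death lists can be compared against the relation phrases from the earlier per:cause_of_death slide. This Python sketch uses only the phrases listed on the slides; the elided entries ("…") are not filled in, exact matching on the relation phrase is a simplification, and treating the 12-hour list as a superset of the 3-hour list is an assumption.

```python
# Relation phrases from the example tuples on the per:cause_of_death slide.
RELATION_PHRASES = [
    "died of", "died from", "passed away from", "succumbed to",
    "killed", "claimed the life of", "lost his battle to",
]

THREE_HOUR = {"died of", "died from", "died as a result of", "died due to"}

# Only the phrases shown on the slide; assumed here to extend the 3-hour list.
TWELVE_HOUR = THREE_HOUR | {
    "die of", "dies of", "dying of", "succumbed to", "succumbs to",
}


def covered(phrases, rule_terms):
    """Return the relation phrases matched exactly by a rule's term list."""
    return [p for p in phrases if p in rule_terms]


three_hour_hits = covered(RELATION_PHRASES, THREE_HOUR)
twelve_hour_hits = covered(RELATION_PHRASES, TWELVE_HOUR)
```

With these inputs the 3-hour list matches "died of" and "died from", and the listed 12-hour phrases additionally match "succumbed to"; long-tail phrases such as "lost his battle to" remain unmatched by either list.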
23
KBP Results
[Results chart not preserved]
Extractor Precision: per:title(Paul Gray, bassist), per:title(Paul Gray, president)
KBP Precision: per:title(Paul Gray, bassist), per:title(Paul Gray, president)
35% recall boost from 12 hours
24
Error Analysis
31% “Looked right to me”
– “Tantawi was the grand sheik” => per:title(Tantawi, sheik)
– “ETA's political wing Batasuna” => org:subsidiary(ETA, Batasuna)
23% Overgeneralized rules
– “Ginzburg was an outspoken critic” => per:title(Ginzburg, critic)
– “Meredith led the NFL in scoring” => per:employee_or_member_of(Meredith, NFL)
19% Rules matched on non-head terms
– “Kahn’s younger sister married Shankar” => per:spouse(Kahn, Shankar)
15% Open IE errors
12% Coref errors
25
Ceiling for Recall from Open IE
42% Extracts all information for KBP relation
16% Extractor truncates an argument
– Omits appositive or parenthetical
– “Sheikh Tantawi, the top Egyptian cleric who died on Wednesday…” (the top Egyptian cleric, died on, Wednesday)
10% Extractor misses “relational noun”
– “Tantawi, the Grand Imam of Al-Azhar”
10% No extraction of relevant part of sentence
– Syntactic complexity
4% Extraction error
18% Other
Callout on slide: 68%
26
Future Work
Increase recall of Open IE
Increase precision of rule applier
General method not tied to KBP task
– Plug in any ontology of relations
– Results not tied to query entity
Release as open-source software
27
Conclusion
Novel approach for KBP Slot Filling
– Run Open IE extractor on corpus
– Semantic taggers based on user-written keyword lists
– User-written rules to map target relations to Open IE
Results
– High extraction precision 0.80
– Moderate recall 0.10 (comparable to all but top sites)
Low human effort
– Requires no NLP or ML experience
– Only 3 hours effort gives high precision
28
Thank you
github/knowitall/openie
github/knowitall/taggers