Natural Language Knowledge Graphs Open-IE meets Knowledge Representation Ido Dagan Bar-Ilan University, Israel
Knowledge Representation (KR)
Two complementary frameworks:
- Knowledge graphs
  - Formal pre-specified schema & predicates
  - Require (supervised) IE to populate from text
  - Targeting established knowledge
- Open IE
  - Arbitrary propositions found in text (anything said)
  - Represented in natural language terms
Our research line: extend Open IE towards a richer KR framework
From Berant et al., 2014
Appeal: complex aggregation queries, via semantic parsing (beyond text-QA scope)
E.g., politicians' spouses who lived in Chicago
What’s missing in Open IE?
Enriching Proposition Structure and Coverage
Gabriel Stanovsky, Jessica Ficler, Ido Dagan, Yoav Goldberg
Based on a paper at the Semantic Parsing workshop, ACL 2014 (work in progress)
Open IE produces tuples of predicate and arguments:
(NASA, launched, Curiosity)
(Curiosity, is a, rover)
(Curiosity, is a, science lab)
(Curiosity, landed on, Mars)
(Curiosity, explores, Mars)
(Curiosity, surveys, Mars' surface)
(Curiosity, collects, rock samples)
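Such extractions can be held in a minimal data structure and queried directly; a sketch in Python (the names `Extraction` and `about` are illustrative, not from any Open IE toolkit):

```python
from collections import namedtuple

# An Open IE extraction: a predicate plus its arguments, all as NL strings
Extraction = namedtuple("Extraction", ["arg1", "predicate", "arg2"])

extractions = [
    Extraction("NASA", "launched", "Curiosity"),
    Extraction("Curiosity", "is a", "rover"),
    Extraction("Curiosity", "is a", "science lab"),
    Extraction("Curiosity", "landed on", "Mars"),
    Extraction("Curiosity", "explores", "Mars"),
    Extraction("Curiosity", "surveys", "Mars' surface"),
    Extraction("Curiosity", "collects", "rock samples"),
]

def about(subject, tuples):
    """Return all extractions whose first argument matches `subject`."""
    return [t for t in tuples if t.arg1 == subject]

print(len(about("Curiosity", extractions)))  # 6
```

Flat triples like these are easy to produce and query, which is exactly what makes their limitations (next slide) worth fixing.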
Limitations of Current Proposition Structure
- Falls short of capturing all information conveyed
- Falls short of representing the internal structure of information
→ Enrich proposition representation and extraction
Extracting Implied Propositions
Propositions can be implied from syntax:
- Possessives: "Curiosity's robotic arm is used to collect samples" → Curiosity has a robotic arm
- Apposition: "Curiosity, the Mars rover, landed on Mars" → Curiosity is the Mars rover
Also implied by adjectives, nominalizations, conjunctions, etc.
Enriching Structure
- Propositions can be embedded: "NASA utilizes Curiosity to survey Mars"
- Arguments and predicates may have internal structure: "Curiosity examines rock samples from Mars"
Proposition Structures
"NASA utilizes the Mars rover, Curiosity, to examine rock samples from Mars"
- Predicate: utilize; Subject: NASA; Object: the Mars rover; Comp: examine (Q: "Who utilizes the Mars rover?")
- Predicate: examine; Subject: NASA; Object: rock samples; Modifier: from Mars (Q: "What did NASA examine?")
- Predicate: is; Subject: Curiosity; Object: the Mars rover (Q: "What is Curiosity?")
Proposition Structures
"NASA utilizes the Mars rover, Curiosity, to examine rock samples from Mars"
- Predicate: utilize; Subject: NASA; Object: the Mars rover; Comp: examine
- Predicate: examine; Subject: the Mars rover; Object: rock samples; Modifier: from Mars
- Predicate: is; Subject: Curiosity; Object: the Mars rover
Explicitly represent implied propositions and embedded structure
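The embedded structure on this slide can be sketched as a small recursive type, where an embedded proposition sits in the complement slot of the embedding one (a hypothetical illustration, not the authors' actual tool output):

```python
from dataclasses import dataclass, field

@dataclass
class Proposition:
    predicate: str
    subject: str
    obj: str = None
    modifiers: list = field(default_factory=list)
    comp: "Proposition" = None  # embedded (complement) proposition

# NASA utilizes the Mars rover, Curiosity, to examine rock samples from Mars
examine = Proposition(predicate="examine", subject="the Mars rover",
                      obj="rock samples", modifiers=["from Mars"])
utilize = Proposition(predicate="utilize", subject="NASA",
                      obj="the Mars rover", comp=examine)
is_a = Proposition(predicate="is", subject="Curiosity", obj="the Mars rover")

# The embedded proposition is reachable from the embedding one
print(utilize.comp.predicate)  # examine
```

The recursion is the point: once propositions can nest, "NASA utilizes X to examine Y" is one structure rather than two disconnected flat tuples.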
Further Steps
Soon: a tool which produces proposition structures
- Generically: a better "syntax wrapper" for semantic processing (vs. dependency trees)
Add sub-proposition factuality (truth assertion)
- TruthTeller (Lotan et al., NAACL 2012)
- OLLIE (Mausam et al., EMNLP 2012)
Extract implied arguments
- From discourse rather than syntax (Stern and Dagan, ACL 2014)
TruthTeller: Predicate Truth Value Annotation
Add features to nodes: pt+ (positive), pt- (negative), pt? (unknown)
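The pt+/pt-/pt? annotations can be illustrated with a toy annotator built on a couple of hand-written lexical cues; TruthTeller itself uses a lexicon of predicate signatures and compositional rules, so this is only a sketch of the output labels, not the actual algorithm:

```python
# Toy predicate-truth annotator: assigns pt+ / pt- / pt? to a clause.
# Illustrative cue lists only; not TruthTeller's real lexicon.
NEGATORS = {"not", "never", "no"}
UNCERTAIN = {"might", "may", "possibly"}

def truth_value(clause):
    words = clause.lower().split()
    if any(w in NEGATORS for w in words):
        return "pt-"   # predicate asserted false
    if any(w in UNCERTAIN for w in words):
        return "pt?"   # truth left unknown
    return "pt+"       # predicate asserted true

print(truth_value("Curiosity landed on Mars"))        # pt+
print(truth_value("Curiosity did not land on Mars"))  # pt-
print(truth_value("Curiosity might land on Mars"))    # pt?
```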
[Entailment graph over terms: aspirin, painkiller, analgesic, drug; coffee, tea, caffeine]
Next step, graph aggregation: "Which drinks relieve headache?"
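Aggregation over such a term graph can be sketched as reachability over entailment edges plus a proposition lookup. The edges and propositions below are toy data assumed for illustration (including a "drink" node the slide's query implies but the figure does not show):

```python
# Entailment edges: specific term -> more general terms it entails
ENTAILS = {
    "aspirin": ["painkiller"],
    "painkiller": ["analgesic"],
    "analgesic": ["drug"],
    "coffee": ["drink"],
    "tea": ["drink"],
}

# Toy propositions attached to terms
PROPOSITIONS = {
    "coffee": ["relieves headache"],
    "tea": ["relieves headache"],
    "aspirin": ["relieves headache"],
}

def entails_term(term, target):
    """Does `term` entail `target`, transitively through the graph?"""
    stack, seen = [term], set()
    while stack:
        t = stack.pop()
        if t == target:
            return True
        if t not in seen:
            seen.add(t)
            stack.extend(ENTAILS.get(t, []))
    return False

# "Which drinks relieve headache?"
answer = sorted(t for t, props in PROPOSITIONS.items()
                if entails_term(t, "drink") and "relieves headache" in props)
print(answer)  # ['coffee', 'tea']
```

Aspirin relieves headache too, but it does not entail "drink", so the aggregation correctly excludes it.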
Our Contributions
- Structuring Open IE with Proposition Entailment Graphs
- Dataset: 30 gold-standard graphs, 1.5 million entailment annotations
- Algorithm for constructing Focused Proposition Entailment Graphs
- Analysis: predicate entailment is not quite what we thought
How do we recognize proposition entailment?
Lexical Entailment (logistic classifier)
Features:
- WordNet relations
- UMLS
- Distributional similarity
- String edit distance
Trained with supervision
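The classifier combines these lexical features into a single entailment probability; a pure-Python sketch with made-up weights (the real model learns its weights from the supervision, and the feature names here are illustrative):

```python
import math

# Illustrative weights; a trained logistic model would learn these
WEIGHTS = {"wordnet_synonym": 2.0, "wordnet_hypernym": 1.5,
           "distributional_sim": 1.0, "edit_similarity": 0.5}
BIAS = -1.5

def entailment_prob(features):
    """Logistic regression: sigmoid of the weighted feature sum."""
    z = BIAS + sum(WEIGHTS[k] * v for k, v in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# e.g. a pair like ("buy", "purchase"): synonym + high distributional similarity
p = entailment_prob({"wordnet_synonym": 1.0, "distributional_sim": 0.8})
print(round(p, 2))  # 0.79
```

With no active features the score stays below 0.5 (the negative bias), so entailment is predicted only when the lexical evidence outweighs it.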
Are WordNet relations capturing real-world predicate entailments?
Predicate Entailment vs. WordNet Relations
Why isn't WordNet capturing predicate entailment?
Over a predicate inference subset, how many predicate entailments are covered by WordNet?
- Positive indicators: synonyms, hypernyms, entailment
- Negative indicators: antonyms, hyponyms, co-hyponyms
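The indicator split above can be made concrete: treat some WordNet relation types as positive evidence for entailment and others as negative. A sketch over a tiny hand-coded relation table standing in for WordNet lookups (the pairs and labels are illustrative):

```python
# Relation types follow the slide's positive/negative indicator split
POSITIVE = {"synonym", "hypernym", "entailment"}
NEGATIVE = {"antonym", "hyponym", "cohyponym"}

# Tiny hand-coded stand-in for WordNet relation lookups
RELATIONS = {
    ("buy", "purchase"): "synonym",
    ("walk", "move"): "hypernym",    # walking entails moving
    ("snore", "sleep"): "entailment",
    ("win", "lose"): "antonym",
    ("move", "walk"): "hyponym",     # moving does not entail walking
}

def predicted_entailment(p1, p2):
    """Predict whether p1 entails p2 from the relation type, if any."""
    rel = RELATIONS.get((p1, p2))
    if rel in POSITIVE:
        return True
    if rel in NEGATIVE:
        return False
    return None  # no relation found: WordNet gives no signal

print(predicted_entailment("snore", "sleep"))  # True
print(predicted_entailment("move", "walk"))    # False
```

The slide's question is how often real predicate entailments fall into the first bucket, the second, or (most damagingly) the `None` case.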
Predicate Entailment is Context-Sensitive
Appeal of NL KR
- Scalable: in principle unlimited coverage
- Easy to communicate with people
  - Understand
  - Supervise: add knowledge (vs. in a logic representation)
- May add additional links between propositions
  - Causality, temporal, argumentative
- Supports at least some useful inferences
Integration with Logic-based Approaches
- Integrate with logical/formal representations for concrete phenomena
  - E.g. temporal, arithmetic, spatial
- Borrow ideas/methods from logic to apply over the NL KR
  - Which are relevant and applicable?
Text Exploration via NL Knowledge Graphs
- Customer interactions
- Exploratory search
Example: Service issues
- not happy with the catering
- coffee is awful
- they have horrible coffee
- disgusting coffee is served
- coffee in economy is awful
- no refreshments
- food on train is too expensive
- sandwiches are too expensive
- sandwiches are overpriced
- you charge too much for sandwiches
- food is bad
- food quality is disappointing
- bad food in premier
- not enough food selection
- expand meal options
- no vegetarian food
- provide veggie meals
- not happy with the service
- journey is too slow
- no clear information
- not happy with the staff
- staff is unfriendly
Customer Interactions Entailment Graph
- not happy with the catering: coffee is awful; coffee in economy is awful; no refreshments; food on train is too expensive; sandwiches are too expensive; food is bad; bad food in premier; not enough food selection; no vegetarian food
- not happy with the service: journey is too slow; no clear information; not happy with the staff; staff is unfriendly
- not happy with the toilets: toilets are dirty; toilets are smelly; missing hygienic supplies; no soap in toilets; no toilet paper
- not happy with train facilities: seats are uncomfortable; missing facilities; no children section; no WIFI; no AC in cars; facilities are bad; no AC; no AC in station
- station is too crowded: Marshfield station is too crowded; cars are congested; pathway is too narrow; lack of personal space; lack of storage space
- improve the website: online booking can be better; webpage shows old timetables; no website for android; can't find FAQ page
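The consolidation and hierarchy this graph provides can be sketched as an entailment DAG where querying a general complaint aggregates every more-specific statement beneath it. The edges below are a toy subset transcribed from the examples on this slide:

```python
# specific complaint -> the more general complaint it entails
EDGES = {
    "coffee in economy is awful": "coffee is awful",
    "coffee is awful": "not happy with the catering",
    "sandwiches are too expensive": "food on train is too expensive",
    "food on train is too expensive": "not happy with the catering",
    "staff is unfriendly": "not happy with the staff",
    "not happy with the staff": "not happy with the service",
}

def supported_by(general):
    """All statements that transitively entail `general`."""
    out = []
    for specific, target in EDGES.items():
        if target == general:
            out.append(specific)
            out.extend(supported_by(specific))
    return out

# Aggregate everything under the top-level catering complaint
print(sorted(supported_by("not happy with the catering")))
```

A single query on "not happy with the catering" thus surfaces the coffee and sandwich complaints without the analyst having to enumerate paraphrases.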
Conclusion: exciting research area
- Extend Open IE to become an NL-based knowledge graph
  - NL proposition structure
  - Graph of inter-proposition relations
    - Entailment: consolidation and hierarchy for propositions
    - Other relations desired: causal, temporal, argumentative, ...
- How does it integrate with formal/artificial-language KR?
Thank You!