Two Related Lexico-Syntactic Approaches to Entailment. Vasile Rus, Institute for Intelligent Systems, Department of Computer Science
Today: Outline. General strategy: –Map T (text) and H (hypothesis) into lexico-syntactic graphs –Perform graph subsumption between graph-T and graph-H –Additive strategy, not cascaded. Two approaches: –Lexico-syntactic approach –Dependency-based approach. Results, Comparison, Conclusions
The Two Approaches. Lexico-syntactic approach: –Lexical component –Syntactic component –Dependencies derived from phrase-based parse trees –Negation –Thesaurus. Dependency-based approach: –Dependencies from MINIPAR –Lexical component by default –Postprocessing (thanks to Vivi Nastase) to eliminate unused information and to retain only dependencies among content words
Graph Subsumption: Map nodes and edges in the H-graph to nodes and edges in the T-graph. The mapping is complex and is based on: –Named-entity inferences: Overture Services Inc -- Overture –Word-level entailment / equivalence: take over – buy –Syntactic information: Yahoo is the agent of buying
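The mapping step on this slide can be illustrated with a minimal sketch. It is not the system's actual algorithm: the lookup table EQUIVALENTS, the function names, and the relation labels "subj" and "obj" are all hypothetical stand-ins for the named-entity and word-level resources mentioned above.

```python
# Hypothetical toy resource standing in for named-entity inference and
# word-level equivalence; a real system would use richer resources.
EQUIVALENTS = {
    ("take over", "buy"),
    ("overture services inc", "overture"),
}

def word_match(h_word, t_word):
    """True if the two words are identical or listed as equivalent."""
    h, t = h_word.lower(), t_word.lower()
    return h == t or (h, t) in EQUIVALENTS or (t, h) in EQUIVALENTS

def map_nodes(h_nodes, t_nodes):
    """Map each H node to the first T node it matches, if any."""
    mapping = {}
    for h in h_nodes:
        for t in t_nodes:
            if word_match(h, t):
                mapping[h] = t
                break
    return mapping

def edge_matched(h_edge, t_edges, mapping):
    """An H edge matches if its mapped endpoints share the same relation in T."""
    head, rel, dep = h_edge
    if head not in mapping or dep not in mapping:
        return False
    return (mapping[head], rel, mapping[dep]) in t_edges

# T: "Yahoo took over Overture Services Inc"   H: "Yahoo bought Overture"
t_nodes = ["Yahoo", "take over", "Overture Services Inc"]
t_edges = {("take over", "subj", "Yahoo"),
           ("take over", "obj", "Overture Services Inc")}
h_nodes = ["Yahoo", "buy", "Overture"]
h_edges = [("buy", "subj", "Yahoo"), ("buy", "obj", "Overture")]

mapping = map_nodes(h_nodes, t_nodes)
all_edges_ok = all(edge_matched(e, t_edges, mapping) for e in h_edges)
```

In this toy case every H node and H edge finds a counterpart in T, which is the situation in which H would be judged to be subsumed by T.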
From Sentences to Graph Representation: Vertices represent content words; edges represent dependencies. –Local dependencies (intra-phrase) are obtained directly from a parse tree –Remote dependencies are obtained using an extended functional tagger –Alternatively, dependencies are obtained from MINIPAR (for the second approach)
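As a rough illustration of the representation, the sketch below builds a graph from dependency triples, whatever parser produced them. The Graph type, the build_graph function, the small FUNCTION_WORDS list, and the relation labels are assumptions made for this example, not part of the presented system or of MINIPAR's output format.

```python
from collections import namedtuple

# Hypothetical container: vertices are words, edges are labeled dependencies.
Graph = namedtuple("Graph", ["vertices", "edges"])

# Assumed, deliberately tiny list of function words to filter out.
FUNCTION_WORDS = {"the", "a", "an", "of", "is", "by", "to"}

def build_graph(dependencies):
    """Build a graph from (head, relation, dependent) triples,
    keeping only dependencies among content words."""
    vertices, edges = set(), set()
    for head, rel, dep in dependencies:
        if head.lower() in FUNCTION_WORDS or dep.lower() in FUNCTION_WORDS:
            continue  # skip dependencies that involve a function word
        vertices.update((head, dep))
        edges.add((head, rel, dep))
    return Graph(vertices, edges)

deps = [
    ("bought", "subj", "Yahoo"),
    ("bought", "obj", "Overture"),
    ("Overture", "det", "the"),  # dropped: "the" is not a content word
]
graph = build_graph(deps)
```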
The Entailment Score: The score is defined so that it is not symmetric: entail(T, H) ≠ entail(H, T). The score is also used as the confidence value.
The Parameters: The following parameters worked best on the development data: α = 0.5, β = 0.5, γ = 0
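The score formula itself does not survive in this transcript, so the following is only a sketch under a stated assumption: that α weights the fraction of H vertices matched in T, β weights the fraction of H edges matched in T, and γ is a constant offset. The function name entail_score and the toy graphs are hypothetical, and matching is reduced to exact set membership.

```python
def entail_score(t_graph, h_graph, alpha=0.5, beta=0.5, gamma=0.0):
    """Assumed form: alpha * vertex coverage + beta * edge coverage + gamma.
    Coverage is measured relative to H, which makes the score direction-dependent."""
    t_vertices, t_edges = t_graph
    h_vertices, h_edges = h_graph
    vertex_cov = len(h_vertices & t_vertices) / len(h_vertices) if h_vertices else 0.0
    edge_cov = len(h_edges & t_edges) / len(h_edges) if h_edges else 0.0
    return alpha * vertex_cov + beta * edge_cov + gamma

# Toy graphs (hypothetical): T carries one more vertex and edge than H.
T = ({"Yahoo", "buy", "Overture", "yesterday"},
     {("buy", "subj", "Yahoo"), ("buy", "obj", "Overture"), ("buy", "time", "yesterday")})
H = ({"Yahoo", "buy", "Overture"},
     {("buy", "subj", "Yahoo"), ("buy", "obj", "Overture")})

forward = entail_score(T, H)   # everything in H is covered by T
backward = entail_score(H, T)  # T contains material that H lacks
```

Because coverage is normalized by the size of H, swapping the arguments changes the result, which matches the slide's point that entail(T, H) ≠ entail(H, T).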
Negation. Explicit: –Clue phrases: no, not, neither … nor –Shortened forms: n't. Implicit: –Antonymy in WordNet. Hypothetical sentences: “a possible visit by Clinton to China” does not entail “Clinton visited China” –a form of negation
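Explicit negation detection from clue phrases can be sketched as below. This covers only the explicit case; antonymy and hypothetical sentences are not handled. The function names are hypothetical, and comparing negation counts by parity is an assumption of this example rather than something stated on the slide.

```python
import re

# Explicit clues from the slide; "nor" is left out so that
# "neither ... nor" is counted once rather than twice.
NEGATION_CLUES = {"no", "not", "neither"}

def count_explicit_negations(sentence):
    """Count clue words and contracted n't forms in a sentence."""
    text = sentence.lower().replace("\u2019", "'")  # normalize curly apostrophes
    tokens = re.findall(r"[a-z']+", text)
    return sum(1 for tok in tokens if tok in NEGATION_CLUES or tok.endswith("n't"))

def negation_mismatch(text, hypothesis):
    """Assumed rule: polarity differs when the negation counts differ in parity."""
    return count_explicit_negations(text) % 2 != count_explicit_negations(hypothesis) % 2

mismatch = negation_mismatch("Clinton did not visit China", "Clinton visited China")
```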
Conclusions: Lexical information helps significantly. The other components (synonymy, dependencies, negation) add value, but not significantly.
Missed Opportunities. Linguistic level: –Five = 5 –Tuscany province = province of Tuscany. The current subsumption algorithm is weak. T: Besancon is the capital of France’s watch and clock-making industry and of high precision engineering. H: Besancon is the capital of France. Solution: matching with more complex structures. World knowledge.
More Conclusions: Our system is light –Good for interactive environments such as Intelligent Tutoring Systems. No training involved –Just development to tune a few parameters
One More Conclusion: It is not clear whether there is a difference between the two ways of obtaining dependencies!
Thank you, everyone!