# COGEX at the Second RTE

Marta Tatu, Brandon Iles, John Slavick, Adrian Novischi, Dan Moldovan. Language Computer Corporation. April 10th, 2006.



### LCC's Submission to RTE2

- Linear combination of three entailment scores
  1. COGEX with constituency parse tree-derived logic forms
  2. COGEX with dependency parse tree-derived logic forms
  3. Lexical alignment between T and H
- For each pair i (T_i, H_i): if [condition not preserved in the transcript] then T_i entails H_i
- Lambda (λ) parameters learned on the development data for each task (IE, IR, QA, SUM)
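The slide's decision rule did not survive extraction, so the following is only a minimal sketch of how a linear combination of the three scores could be turned into a yes/no entailment decision. The `Weights` class, the function names, and in particular the `threshold` comparison are assumptions for illustration, not the authors' formula.

```python
from dataclasses import dataclass


@dataclass
class Weights:
    """Hypothetical per-task lambda weights for the three scores."""
    cogex_c: float    # COGEX on constituency-derived logic forms
    cogex_d: float    # COGEX on dependency-derived logic forms
    lex_align: float  # shallow lexical alignment


def combined_score(w: Weights, score_c: float, score_d: float, score_l: float) -> float:
    """Linear combination of the three entailment scores."""
    return w.cogex_c * score_c + w.cogex_d * score_d + w.lex_align * score_l


def entails(w: Weights, score_c: float, score_d: float, score_l: float,
            threshold: float = 0.5) -> bool:
    # ASSUMPTION: a simple threshold test; the slide's actual condition is lost.
    return combined_score(w, score_c, score_d, score_l) > threshold
```

In such a setup, one `Weights` instance would be fitted per task (IE, IR, QA, SUM) on the development data.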

### Approach to RTE with COGEX

- Transform the two text fragments into 3-layered logic forms
  - Syntactic
  - Semantic
  - Temporal
- Automatically create axioms to be used during the proof
  - Lexical Chains axioms
  - World Knowledge axioms
  - Linguistic transformation axioms
  - Semantic and temporal axioms
- Load COGEX's SOS (set of support) with T and ¬H, and its USABLE list of clauses with the generated axioms
- Search for a proof by iteratively removing clauses from the SOS and searching the USABLE list for possible inferences until a refutation is found
- If no contradiction is detected
  - Relax arguments
  - Drop entire predicates from H
- Compute the proof score

### COGEX Enhancements (1/3)

- Logic Form Transformation
  - Negations
    - `not_RB(x1,e1) & walk_VB(e1,x2,x3)` » `-walk_VB(e1,x2,x3)`
    - `not_RB(x1,e1) & walk_VB(e1,x2,x3) & fast_RB(x4,e1)` » `-fast_RB(x4,e1)`
    - `no/DT case_NN(x1) & confirm_VB(e1,x2,x1)` » `-confirm_VB(e1,x2,x1)`

### COGEX Enhancements (1/3)

- Logic Form Transformation
  - Temporal normalization of date/time predicates
    - "13th of January 1990" vs. "January 13th, 1990"
    - `13th_of_January_1990_NN(x1)` vs. `January_13th_1990_NN(x1)`
    - General form: `time_TMP(BeginFN(x1), year, month, day, hour, minute, second) & time_TMP(EndFN(x1), year, month, day, hour, minute, second)`
    - Both surface forms normalize to: `time_TMP(BeginFN(x1), 1990, 1, 13, 0, 0, 0) & time_TMP(EndFN(x1), 1990, 1, 13, 23, 59, 59)`
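The normalization above can be sketched as follows: both surface forms of the date are mapped to the same begin/end tuples, mirroring the `(year, month, day, hour, minute, second)` arguments of `time_TMP`. The function name `normalize_date` and the token-based parsing are assumptions for illustration; this is not COGEX's actual normalizer and it handles only full day-month-year dates.

```python
import re
from datetime import datetime

MONTHS = {name: number for number, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}


def normalize_date(text):
    """Return (begin, end) tuples of (year, month, day, hour, minute, second)."""
    day = month = year = None
    # Split into alphabetic and numeric tokens, so "13th" yields "13" and "th".
    for token in re.findall(r"[A-Za-z]+|\d+", text):
        if token in MONTHS:
            month = MONTHS[token]
        elif token.isdigit():
            if len(token) == 4:
                year = int(token)
            else:
                day = int(token)
    if None in (day, month, year):
        raise ValueError(f"not a full date: {text!r}")
    datetime(year, month, day)  # raises ValueError for impossible dates
    begin = (year, month, day, 0, 0, 0)
    end = (year, month, day, 23, 59, 59)
    return begin, end
```

With this sketch, `normalize_date("13th of January 1990")` and `normalize_date("January 13th, 1990")` return the same pair of tuples, which is what lets the prover unify the two date mentions.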

### COGEX Enhancements (1/3)

- Logic Form Transformation
  - Temporal context SUMO predicates (Clark et al., 2005)
    - (S, E1, E2): S is the temporal signal linking two events E1 and E2
    - `during_TMP(e1,x1)`, `earlier_TMP(e1,x1)`, …

### Logic Forms Differences

- Generate LF from two different sources
  - Constituency parse of the data
  - Dependency parse trees (data provided by the challenge organizers)
- Constituency
  - Semantic information
  - Temporal information
- Dependency
  - Captures better the (long-range) syntactic dependencies
  - Temporal normalization (only)
  - NEs imported from the constituency LF whenever the tokens matched (no control over tokenization)

### Logic Forms Differences

- Example: "Gilda Flores was kidnapped on the 13th of January 1990."
- Constituency: `Gilda_NN(x1) & Flores_NN(x2) & nn_NNC(x3,x1,x2) & _human_NE(x3) & kidnap_VB(e1,x9,x3) & on_IN(e1,x8) & 13th_NN(x4) & of_NN(x5) & January_NN(x6) & 1990_NN(x7) & nn_NNC(x8,x4,x5,x6,x7) & _date_NE(x8) & THM_SR(x3,e1) & TMP_SR(x8,e1) & time_TMP(BeginFN(x1), 1990, 1, 13, 0, 0, 0) & time_TMP(EndFN(x1), 1990, 1, 13, 23, 59, 59) & during_TMP(e1,x8)`
- Dependency: `Gilda_Flores_NN(x2) & _human_NE(x2) & kidnap_VB(e1,x4,x2) & on_IN(e1,x3) & 13th_NN(x3) & of_IN(x3,x1) & January_1990_NN(x1)`

### COGEX Enhancements (2/3)

- Axioms on Demand
  - Lexical Chains
    - Consider the first k = 3 senses for each word
    - Maximum length of a lexical chain = 3
    - The DERIVATIONAL WordNet relation is ambiguous with respect to the role of the noun
      - Derivation-ACT: `employ_VB(e1,x1,x2)` → `employment_NN(e1)`
      - Derivation-AGENT: `employ_VB(e1,x1,x2)` → `employer_NN(x1)`
      - Derivation-THEME: `employ_VB(e1,x1,x2)` → `employee_NN(x2)`
    - Morphological derivations between adjectives and verbs

### COGEX Enhancements (2/3)

- Axioms on Demand
  - Lexical Chains
    - Augment with the NE predicate for NE target concepts
      - `nicaraguan_JJ(x1,x2)` → `Nicaragua_NN(x1) & _country_NE(x1)`
    - Discard lexical chains
      - with more than 2 HYPONYMY relations (H too specific)
      - with a HYPONYMY followed by an ISA, e.g. `Chicago_NN(x1)` → `Detroit_NN(x1)`
      - which include general concepts: object/NN, act/VB, be/VB
    - In the (not preserved) generality measure: n_i = number of hyponyms of concept c_i, N = number of concepts in c_i's hierarchy
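The discard rules above, together with the maximum chain length of 3 from the previous slide, can be sketched as a simple filter. The chain representation (a list of `(relation, target_concept)` steps), the relation labels, and the function name `keep_chain` are assumptions made for this illustration. The slide's generality measure based on n_i and N is not reproduced here, so general concepts are checked against the fixed list named on the slide.

```python
GENERAL_CONCEPTS = {"object/NN", "act/VB", "be/VB"}
MAX_CHAIN_LENGTH = 3


def keep_chain(chain):
    """Return True if a lexical chain passes the discard rules.

    `chain` is a list of (relation, target_concept) steps (hypothetical format).
    """
    relations = [relation for relation, _ in chain]
    concepts = [concept for _, concept in chain]
    if len(chain) > MAX_CHAIN_LENGTH:
        return False
    # More than 2 HYPONYMY steps makes the hypothesis side too specific.
    if relations.count("HYPONYMY") > 2:
        return False
    # A HYPONYMY step immediately followed by an ISA step is rejected.
    if any(a == "HYPONYMY" and b == "ISA"
           for a, b in zip(relations, relations[1:])):
        return False
    # Chains passing through very general concepts are rejected.
    if any(concept in GENERAL_CONCEPTS for concept in concepts):
        return False
    return True
```

For example, `keep_chain([("DERIVATION", "employer/NN")])` passes, while any chain whose steps include `("ISA", "object/NN")` is dropped.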

### More Axioms

- Another 73 World Knowledge axioms
- Semantic Calculus: combinations of two semantic relations (82 axioms)
  - ISA, KINSHIP, CAUSE are transitive relations
  - `ISA_SR(x1,x2) & PAH_SR(x3,x2)` → `PAH_SR(x3,x1)`
    - "Mike is a rich man" → "Mike is rich"
- Temporal Reasoning Axioms (Clark et al., 2005) (65 axioms)
  - Dates entail more general times
    - October 2000 → year 2000
  - `during_TMP(e1,e2) & during_TMP(e2,e3)` → `during_TMP(e1,e3)`
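The transitivity axiom for `during_TMP` can be illustrated by computing the closure of a set of facts. This is a minimal sketch, assuming facts are stored as `(inner, outer)` pairs; the function name `during_closure` is hypothetical, and a theorem prover such as COGEX would apply the axiom during proof search rather than precompute the closure.

```python
def during_closure(facts):
    """Close a set of (inner, outer) during_TMP facts under transitivity."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # during(a, b) & during(b, d) -> during(a, d)
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure
```

Given `during_TMP(e1,e2)` and `during_TMP(e2,e3)`, the closure also contains the pair for `during_TMP(e1,e3)`.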

### COGEX Enhancements (3/3)

- Proof Re-Scoring
  - Entities mentioned in T and H are existentially quantified
    - (T) ∃ smart people → ∃ people (H)
    - (T) ∃ people ↛ ∃ smart people (H)
  - Universally quantified T and H entities
    - (T) ∀ people → ∀ smart people (H)
    - (T) ∀ smart people ↛ ∀ people (H)

### Shallow Lexical Alignment

- Compute the edit distance between T and H
  - Cost (deletion of a word from T) = 0
  - Cost (replacement of a word from T with another in H) = ∞
  - Cost (insertion of a word from H) = [value not preserved in the transcript]
  - Edit distance between synonyms = 0
- Example
  - T: The Council of Europe has 45 member states. Three countries from …
  - H: The Council of Europe is made up by 45 member states.
  - Operations: DEL ("has"), INS ("is made up by"), DEL ("Three countries from …")
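The cost scheme above can be sketched as a word-level dynamic program. The insertion cost is missing from the transcript, so `insert_cost` is an assumed parameter with an arbitrary default of 1.0; the function name `alignment_cost` and the synonym-pair representation are likewise hypothetical. Replacement is simply never offered as a move, which is equivalent to giving it infinite cost.

```python
import math


def alignment_cost(t_words, h_words, insert_cost=1.0, synonyms=frozenset()):
    """Edit distance from T to H: delete = 0, insert = insert_cost, replace = inf."""
    def same(a, b):
        a, b = a.lower(), b.lower()
        return a == b or (a, b) in synonyms or (b, a) in synonyms

    n, m = len(t_words), len(h_words)
    # dist[i][j] = cost of turning the first i words of T into the first j words of H
    dist = [[math.inf] * (m + 1) for _ in range(n + 1)]
    dist[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0:  # delete a word from T at no cost
                dist[i][j] = min(dist[i][j], dist[i - 1][j])
            if j > 0:  # insert a word from H (ASSUMED cost)
                dist[i][j] = min(dist[i][j], dist[i][j - 1] + insert_cost)
            if i > 0 and j > 0 and same(t_words[i - 1], h_words[j - 1]):
                # identical words or synonyms align for free
                dist[i][j] = min(dist[i][j], dist[i - 1][j - 1])
    return dist[n][m]
```

On the slide's example, only the words of H that have no counterpart in T ("is made up by") contribute to the cost, while the extra words of T are deleted for free.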

### Results

- Learned parameters
  - IE: score given by COGEX_C (constituency-based) with some correction from COGEX_D (dependency-based)
  - IR: the highest contribution is made by LexAlign (~62%)
- COGEX_D performs better on IE, IR, QA (~69% accuracy)
- COGEX_C performs better on SUM (~66% accuracy)
- The three-way combination outperforms every individual system and every two-system combination

### Results, Future Work

- Higher accuracy on the SUM task
  - SUM is the highest accuracy task for all systems (false entailment pairs had H completely unrelated to the texts T)
- IE: highest number of false positives
- Future enhancements
  - Other types of context: report, planning, etc.
  - Need for more axioms
    - Automatic gathering of semantic axioms
    - Paraphrase acquisition (phrase 1 → phrase 2)

Thank you! Questions?
