
1 A Machine Reading-based Approach to Update Summarization
Andrew Hickl, Kirk Roberts, and Finley Lacatusu
Language Computer Corporation
April 26, 2007

2 Overview
Introduction
Why Machine Reading?
System Overview
– Question Processing
– Sentence Retrieval and Ranking
– Recognizing Textual Entailment
– Sentence Selection
– Summary Generation
Results
– Main Task
– Update Task
Conclusions and Future Considerations

3 Update Summarization
As currently defined, the task of update summarization requires systems to maximize the amount of new information included in a summary that is not available from any previously-considered document.
– Don't consider identical content.
– Don't consider textually entailed content.
– Consider contradictory content.
– Potentially consider inferable content.
– Consider new information.
Making these distinctions requires access to models of the knowledge available from texts, not just shallow signals such as term overlap.

4 What is Machine Reading?
Machine Reading (MR) applications seek to promote the understanding of texts by providing a representation of the knowledge available from a corpus.
Three important components:
– Knowledge Acquisition: How can we automatically extract the semantic/pragmatic content of a text?
– Knowledge Representation: How do we encode the propositional content of a text in a regular manner?
– Stability/Control: How do we ensure that the knowledge acquired from text is consistent with previous commitments stored in a knowledge base?
We believe that the recognition of knowledge that's consistent with a KB is an important prerequisite for performing update summarization:
– Identify content that's already stored in the KB
– Identify content that's inferable from the KB
– Identify content that contradicts content in the KB
Consistency: Assume that knowledge is consistent with respect to a particular model M iff the truth of a proposition can be reasonably inferred from the other knowledge commitments of M.

5 From Textual Inference to Machine Reading
The recent attention paid to the task of recognizing textual entailment (Dagan et al. 2006) and textual contradiction (Harabagiu et al. 2006) has led to the development of systems capable of accurately recognizing different types of textual inference relationships in natural language texts.
Textual Entailment (RTE-3 Test Set):
– Text: A revenue cutter, the ship was named for Harriet Lane, niece of President James Buchanan, who served as Buchanan's White House hostess.
– Hyp: Lane worked at the White House.
Textual Contradiction:
– Text: A revenue cutter, the ship was named for Harriet Lane, niece of President James Buchanan, who served as Buchanan's White House hostess.
– Hyp: Lane never set foot in the White House.
Despite still being a relatively new evaluation area for NLP, statistical knowledge-lean approaches are achieving near human-like performance:
– PASCAL RTE-2 (2006): 75.38% accuracy (Hickl et al. 2006) [max: 86.5%]
– PASCAL RTE-3 (2007): 81.75% accuracy (Hickl and Bensley 2007) [max: 85.75%]
– Contradiction: 66.5% accuracy (Harabagiu, Hickl, and Lacatusu, 2006)
– Human Agreement: 86% (entailment), 81% (contradiction)

6 The Machine Reading Cycle
[Architecture diagram: documents are ingested from a text repository and commitments are extracted as text probes (hypotheses); each probe is checked against KB commitments ("texts") using textual entailment and textual contradiction, and knowledge selection and knowledge consolidation update the knowledge base with whatever is new or changed.] A minimal sketch of this loop follows.
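The cycle can be summarized as a simple loop. The sketch below is only illustrative; extract_commitments, entails, and contradicts are hypothetical stand-ins for LCC's commitment extractor, entailment recognizer, and contradiction recognizer, and the KB is assumed to expose a set-like interface.

```python
# Minimal sketch of the machine-reading cycle described above (not LCC's code).
# `extract_commitments`, `entails`, and `contradicts` are placeholder callables.

def machine_reading_cycle(documents, kb, extract_commitments, entails, contradicts):
    """Ingest documents, probe the KB with extracted commitments, and
    consolidate only the commitments that carry new or changed information."""
    for doc in documents:
        for probe in extract_commitments(doc):                      # text probes (hypotheses)
            known = any(entails(fact, probe) for fact in kb)        # already entailed by the KB?
            changed = any(contradicts(fact, probe) for fact in kb)  # contradicts the KB?
            if not known or changed:
                kb.add(probe)   # knowledge consolidation: keep new/updated commitments
    return kb
```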

7 Overview
Introduction
Why Machine Reading?
System Overview
– Question Processing
– Sentence Retrieval and Ranking
– Recognizing Textual Entailment
– Sentence Selection
– Summary Generation
Results
– Main Task
– Update Task
Conclusions and Future Considerations

8 Architecture of GISTexter
[Architecture diagram: a complex question passes through Question Processing (keyword extraction, syntactic question decomposition, semantic question decomposition), then Sentence Retrieval and Ranking (question answering and multi-document summarization followed by sentence ranking), then Summary Generation, producing a summary answer. For the machine reading update, commitment extraction feeds textual entailment and textual contradiction checks against the knowledge base, which drive sentence selection and send corrections and new knowledge back to the KB.]

9 Question Processing
GISTexter uses three different Question Processing modules in order to represent the information need of complex questions.
Example question: What are the long term and short term implications of Israel's continuing military action against Lebanon, including airstrikes on Hezbollah positions in Southern Lebanon?
Keyword Extraction/Alternation:
– Extracted: implications, Israel, military, action, (Southern) Lebanon, airstrikes, Hezbollah, positions
– Alternations: implications, effects, outcomes, disaster, scandal, crisis; Israel, Israeli, Jewish state; military action, attack, operation, onslaught, invasion; Lebanon, Lebanese; positions, locations, facilities, targets, bunkers, areas, situations
Syntactic Question Decomposition:
– What are the long term implications of Israel's action against Lebanon?
– What are the short term implications of Israel's action against Lebanon?
– What are the long term implications of Israeli airstrikes on Hezbollah positions in Southern Lebanon?
– What are the short term implications of Israeli airstrikes on Hezbollah positions in Southern Lebanon?
Semantic Question Decomposition:
– What ramifications could another round of airstrikes have on relations?
– What could foment anti-Israeli sentiment among the Lebanese population?
– What kinds of humanitarian assistance has Hezbollah provided in Lebanon?
– How much damage resulted from the Israeli airstrikes on Lebanon?
– Who has re-built roads, schools, and hospitals in Southern Lebanon?

10 Question Processing
Example of answer-driven semantic question decomposition:
Q0. What are the long-term ramifications of Israeli airstrikes against Hezbollah?
A0. Security experts warn that this round of airstrikes could have serious ramifications for Israel, including fomenting anti-Israeli sentiment among most of the Lebanese population for generations to come.
– R1. ramifications-airstrikes → Q1. What ramifications could this round of airstrikes have?
– R2. fomenting-unrest → Q2. What could foment anti-Israeli sentiment among the Lebanese population?
– R3. provide-humanitarian assistance → Q3. What kinds of humanitarian assistance has Hezbollah provided in Lebanon?
A1. The most recent round of Israeli airstrikes has caused significant damage to the Lebanese civilian infrastructure, resulting in more than an estimated $900 million in damage in the Lebanese capital of Beirut alone.
A2. Hezbollah has provided humanitarian assistance to the people of Southern Lebanon following recent airstrikes; a surprising move to many who believe Hezbollah's sole purpose was to foment unrest in Lebanon and Israel.
A3. Following the widespread destruction caused by Israeli airstrikes, Hezbollah has moved quickly to provide humanitarian assistance, including rebuilding roads, schools, and hospitals and ensuring that water and power are available in metropolitan areas.
– R4. result-COST → Q4. How much damage resulted from the airstrikes?
– R5. ORG-purpose → Q5. What is Hezbollah's sole purpose?
– R6. ORG-rebuild → Q6. Who has re-built roads, schools, and hospitals?

11 Sentence Retrieval and Ranking
As with our DUC 2006 system, we used two different mechanisms to retrieve relevant sentences for a summary:
– Question Answering (Hickl et al. 2006): Keywords are extracted from subquestions and automatically expanded; sentences are retrieved and ranked based on the number and proximity of keywords in each sentence; the top 10 answers from each subquestion are combined and re-ranked to produce a ranked list of sentences for a summary.
– Multi-Document Summarization (Harabagiu et al. 2006, Harabagiu & Lacatusu 2005): Topic signatures (Lin and Hovy 2000) and enhanced topic signatures (Harabagiu 2004) are computed for each relevant set of documents; sentences are retrieved based on keywords and re-ranked based on a combined topic score derived from the topic signatures.
All retrieved sentences were then re-ranked based on a number of features, including:
– Relevance score assigned by the retrieval engine
– Position in document
– Number of topical terms / named entities
– Length of original document
Feature weights were determined using a hill-climber trained on "human" summaries from the DUC 2005 and 2006 main tasks; a sketch of this linear re-ranking appears below.
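The re-ranking step can be pictured as a linear scoring function whose weights are tuned by hill-climbing against reference summaries. The sketch below is an assumption-laden illustration: the feature names, the random-restart step size, and the quality callback are not taken from the paper.

```python
# Minimal sketch of feature-weighted sentence re-ranking with a hill-climbed
# weight vector. `quality` scores a ranking against reference ("human")
# summaries and is supplied by the caller; sentences are dicts of features.
import random

FEATURES = ["retrieval_score", "doc_position", "topic_terms", "doc_length"]

def score(sentence, weights):
    """Linear combination of sentence features."""
    return sum(weights[f] * sentence[f] for f in FEATURES)

def rerank(sentences, weights):
    return sorted(sentences, key=lambda s: score(s, weights), reverse=True)

def hill_climb(sentences, quality, steps=1000, step_size=0.05):
    """Greedy hill-climbing over feature weights."""
    weights = {f: random.random() for f in FEATURES}
    best = quality(rerank(sentences, weights))
    for _ in range(steps):
        f = random.choice(FEATURES)
        trial = dict(weights, **{f: weights[f] + random.uniform(-step_size, step_size)})
        q = quality(rerank(sentences, trial))
        if q > best:
            weights, best = trial, q
    return weights
```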

12 Architecture of GISTexter
[Architecture diagram repeated from slide 8.]

13 Recognizing Textual Entailment
[Pipeline diagram: a text-hypothesis pair passes through Preprocessing, Commitment Extraction (producing text commitments and hypothesis commitments), Commitment Alignment, Lexical Alignment, Entailment Classification, and Contradiction Recognition, yielding a +TE or -TE judgment and separating entailed knowledge from newly extracted knowledge.]

14 Recognizing Textual Entailment
Step 1. Preprocessing of text-hypothesis pairs
– POS Tagging, Syntactic Parsing, Morphological Stemming, Collocation Detection
– Annotation of Tense/Aspect, Modality, Polarity
– Semantic Parsing (PropBank, NomBank, FrameNet)
– Named Entity Recognition (~300 named entity types)
– Temporal Normalization
– Temporal Relation Detection (t-link, s-link, a-link)
– Pronominal Co-reference
– Nominal Co-reference
– Synonymy and Antonymy Detection
– Predicate Alternation (based on a pre-cached corpus of predicate paraphrases)
A sketch of such a preprocessing pass is shown below.
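For illustration only, a preprocessing pass of this kind might look like the following sketch, using spaCy as a stand-in for LCC's in-house annotators; the modality, polarity, temporal, and paraphrase layers are omitted.

```python
# A minimal sketch of text-hypothesis preprocessing, using spaCy as a
# stand-in for the tools listed above (POS tagging, parsing, lemmatization,
# named entity recognition). Not the original system's pipeline.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes this model is installed

def preprocess(text, hyp):
    """Return basic annotations for a text-hypothesis pair."""
    annotated = {}
    for name, passage in (("text", text), ("hyp", hyp)):
        doc = nlp(passage)
        annotated[name] = {
            "tokens": [(t.text, t.lemma_, t.pos_, t.dep_) for t in doc],
            "entities": [(e.text, e.label_) for e in doc.ents],
            "doc": doc,   # keep the parse for later commitment extraction
        }
    return annotated
```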

15 Recognizing Textual Entailment
Step 2. Commitment Extraction
Text: A revenue cutter, the ship was named for Harriet Lane, niece of President James Buchanan, who served as Buchanan's White House hostess.
Hyp: Harriet Lane worked at the White House.
Commitments extracted from the text:
1. The ship was named for Harriet Lane.
2. A revenue cutter was named for Harriet Lane.
3. The ship was named for the niece of Buchanan.
4. Buchanan had the title of President.
5. Buchanan had a niece.
6. A revenue cutter was named for the niece of Buchanan.
7. Harriet Lane was the niece of Buchanan.
8. Harriet Lane was related to Buchanan.
9. Harriet Lane served as Buchanan's White House hostess.
10. Buchanan had a White House hostess.
11. There was a hostess at the White House.
12. The niece of Buchanan served as White House hostess.
13. Harriet Lane served as White House hostess.
14. Harriet Lane served as a hostess.
15. Harriet Lane served at the White House.
16. Harriet Lane worked at the White House. (the hypothesis commitment)
Phenomena handled: Conjunction, Subordination, Reported Speech, Appositives, Relative Clauses, Titles and Epithets, Co-reference Resolution, Ellipsis Resolution, Pre-Nominal Modifiers, Possessives.
A small sketch of appositive-based commitment extraction follows.
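As one concrete illustration, the sketch below extracts commitments from appositive constructions using a dependency parse; it covers just one of the phenomena listed above and is not the authors' extractor.

```python
# A minimal sketch of one commitment-extraction heuristic (appositives),
# using spaCy dependency parses; the full system handles many more
# constructions (relative clauses, titles, possessives, co-reference, ...).
import spacy

nlp = spacy.load("en_core_web_sm")

def appositive_commitments(sentence):
    """Turn 'X, Y, ...' appositions into simple 'X was Y' commitments."""
    doc = nlp(sentence)
    commitments = []
    for tok in doc:
        if tok.dep_ == "appos":                               # appositive modifier
            head = doc[tok.head.left_edge.i : tok.head.i + 1] # the modified NP
            appo = doc[tok.left_edge.i : tok.right_edge.i + 1]
            commitments.append(f"{head.text} was {appo.text}.")
    return commitments

# e.g. appositive_commitments("Harriet Lane, niece of President James Buchanan, "
#                             "served as Buchanan's White House hostess.")
# might yield ["Harriet Lane was niece of President James Buchanan."]
```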

16 Recognizing Textual Entailment
Step 3. Commitment Alignment
– Used Taskar et al. (2005)'s discriminative matching approach to word alignment: alignment prediction is cast as maximum-weight bipartite matching, and large-margin estimation is used to learn parameters w such that w · f(x_i, y_i) ≥ w · f(x_i, ŷ_i) + L(y_i, ŷ_i) for every alternative alignment ŷ_i, where y_i is the correct alignment, x_i is the sentence pair, w is the parameter vector, and f is the feature mapping.
– A reciprocal best-hit match ensures that only the best commitment alignments are considered.
Example: the hypothesis "Harriet Lane worked at the White House." aligns with text commitments 13 (Harriet Lane served as White House hostess.), 14 (Harriet Lane served as a hostess.), and 15 (Harriet Lane served at the White House.).
A sketch of the matching step appears below.
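The matching step can be sketched with an off-the-shelf assignment solver standing in for the trained large-margin matcher; the similarity function sim below is a placeholder supplied by the caller.

```python
# A minimal sketch of maximum-weight bipartite matching plus a reciprocal
# best-hit filter over commitment pairs. scipy's Hungarian solver is used
# here as a stand-in for the discriminative matcher described above.
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_commitments(text_commitments, hyp_commitments, sim):
    scores = np.array([[sim(t, h) for h in hyp_commitments]
                       for t in text_commitments])
    rows, cols = linear_sum_assignment(-scores)      # negate to maximize total score
    pairs = []
    for i, j in zip(rows, cols):
        # reciprocal best-hit: keep the pair only if each side is the
        # other's single best match
        if scores[i, j] == scores[i].max() == scores[:, j].max():
            pairs.append((text_commitments[i], hyp_commitments[j], scores[i, j]))
    return pairs
```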

17 Recognizing Textual Entailment
Step 4. Lexical Alignment
– Used a Maximum Entropy classifier to identify the best possible token-wise alignment for each phrase chunk found in a text-hypothesis pair, with features including:
– Morphological stemming / Levenshtein edit distance
– Numeric/date comparators (second, 2; 1920's, 1928)
– Named entity categories (350+ types from LCC's CiceroLite)
– WordNet synonymy/antonymy distance
A sketch of the token-pair features follows.
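For illustration, a few of these token-pair features might be computed as in the sketch below; the edit-distance and WordNet features are shown, while the named-entity and numeric/date comparators are omitted. The classifier itself is not shown.

```python
# A minimal sketch of token-pair features for lexical alignment; these would
# feed a maximum-entropy (logistic regression) classifier trained on
# known-aligned token pairs. Not the original feature set.
from nltk.corpus import wordnet as wn            # assumes NLTK WordNet is installed
from nltk.metrics.distance import edit_distance

def token_features(t_tok, h_tok):
    synsets_t = set(wn.synsets(t_tok))
    synsets_h = set(wn.synsets(h_tok))
    return {
        "exact_match": t_tok.lower() == h_tok.lower(),
        "edit_distance": edit_distance(t_tok.lower(), h_tok.lower()),
        "share_synset": bool(synsets_t & synsets_h),   # crude WordNet synonymy test
    }
```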

18 Recognizing Textual Entailment
Step 5. Entailment Classification and Contradiction Recognition
– Used a Decision Tree classifier (C4.5)
– 2006 (RTE-2): trained on 100K+ entailment pairs
– 2007 (RTE-3): trained only on the RTE-3 Development Set (800 pairs)
– If a NO judgment is returned: consider all other commitment-hypothesis pairs whose alignment probability is at or above a threshold (0.85); if none yields entailment, return NO as the RTE judgment.
– If a YES judgment is returned: use the system for recognizing textual contradiction (Harabagiu et al. 2006) to determine whether the hypothesis contradicts any other extracted commitment. If no contradiction can be found, report a positive instance of textual entailment; if a contradiction is found, report a negative instance of textual entailment.
The decision logic is sketched below.
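A hedged sketch of this decision logic follows; classify, contradicts, and alignment_prob are placeholders for the C4.5 entailment classifier, the contradiction recognizer, and the commitment-alignment probability, and the back-off behavior is my reading of the slide rather than a confirmed specification.

```python
# A minimal sketch of the entailment/contradiction decision logic above.
ALIGN_THRESHOLD = 0.85   # alignment-probability threshold stated on the slide

def judge_entailment(hyp, text_commitments, classify, contradicts, alignment_prob):
    best = max(text_commitments, key=lambda c: alignment_prob(c, hyp))
    if not classify(best, hyp):                     # initial NO judgment
        # back off to other well-aligned commitment-hypothesis pairs
        for c in text_commitments:
            if c is not best and alignment_prob(c, hyp) >= ALIGN_THRESHOLD:
                if classify(c, hyp):
                    break                           # found an entailing commitment
        else:
            return "NO"
    # YES judgment: check the hypothesis against all other commitments
    if any(contradicts(c, hyp) for c in text_commitments):
        return "NO"                                 # contradiction overrides entailment
    return "YES"
```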

19 Architecture of GISTexter
[Architecture diagram repeated from slide 8.]

20 Sentence Selection and KB Update
Step 1. Entailment confidence scores assigned to commitments are then used to re-rank the sentences that they were extracted from:
– Textual Entailment: entailed commitments (known information) get a negative weight; non-entailed commitments (new information) get a positive weight.
– Textual Contradiction: contradicted commitments (changed information) get a positive weight; non-contradicted commitments (no change) make no contribution.
– Confidence scores are normalized for textual entailment and textual contradiction.
Step 2. After each round of summarization, GISTexter's knowledge base is updated to include:
– All non-entailed commitments
– All contradicted commitments
Step 3. Fixed-length summaries were generated as in (Lacatusu et al. 2006):
– Top-ranked sentences were clustered based on topic signatures to promote coherence.
– Heuristics were used to insert paragraph breaks and drop words until the word limit was met.
A sketch of the re-scoring step follows.
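The re-scoring and KB update can be sketched as below; the 0.5 cut-offs and the exact magnitudes of each contribution are assumptions, since the slide only states the direction of each weight, and entail_conf and contra_conf stand in for the normalized confidences.

```python
# A minimal sketch of Step 1 (update re-scoring) and Step 2 (KB update).

def update_score(sentence_score, commitments, kb, entail_conf, contra_conf):
    """Adjust a sentence's rank score using its commitments' status vs. the KB."""
    score = sentence_score
    for c in commitments:
        e = max((entail_conf(fact, c) for fact in kb), default=0.0)
        x = max((contra_conf(fact, c) for fact in kb), default=0.0)
        if e >= 0.5:
            score -= e           # entailed: known information, penalize
        else:
            score += (1.0 - e)   # non-entailed: new information, reward
        if x >= 0.5:
            score += x           # contradicted: changed information, reward
    return score

def update_kb(kb, commitments, entailed, contradicted):
    """Add all non-entailed and all contradicted commitments to the KB."""
    for c in commitments:
        if c in contradicted or c not in entailed:
            kb.add(c)
    return kb
```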

21 Results: Main Task
Two differences between the 2006 and 2007 versions of GISTexter:
– Sentence Ranking: 2006 used textual entailment to create Pyramids from 6 candidate summaries; 2007 learned sentence weights based on the 2005 and 2006 summaries.
– Coreference Resolution: 2006 used heuristics to select sentences with "resolvable" pronouns; 2007 used a coreference resolution system to resolve all pronouns.

22 Non-Redundancy vs. Referential Clarity
Using output from a pronoun resolution system can boost referential clarity, but at what price?
– Only a modest gain in referential clarity: 3.71 → 4.09
– Marked loss in non-redundancy: 4.60 → 3.89
– The same redundancy filtering techniques were used in 2006 and 2007.
Summaries appear to be incurring a "repeat mention penalty"; we need to know when pronouns should be resolved and when they should not.
[Slide contrasts resolved output with the original context.] We need to revisit our heuristics!

23 Results: Update Task Evaluation results from the Update Task were encouraging: GISTexter produced some of the most responsive summaries evaluated in DUC 2007.

24 Results: Update Task
On average, "B" summaries were judged to be significantly worse than either "A" or "C" summaries on both Content Responsiveness and Modified Pyramid.
– It is unclear exactly why this was the case.
– It is not due to "over-filtering": KB_A was always smaller than KB_A+B, so there was less knowledge available to potentially entail commitments extracted from the text.

25 Future Considerations
What's the right way to deal with contradictory information?
– Do users want to be notified when information changes? When any information changes? When relevant information changes?
– How do you incorporate updates into a coherent text?
How can we evaluate the quality of updates?
– Current approaches only measure the responsiveness of individual summaries.
– Is it possible to create "gold standard" lists of the facts (propositions?) that are available from a reading of a text?
– Isn't it enough just to be responsive? For Q/A or QDS, yes; for database update tasks, maybe not.
How much recall do readers have?
– Is it fair to assume that a reader of a text has access to all of the knowledge stored in a knowledge repository?
– What level of "recap" is needed?

26 Ensuring Stability and Control
In order to take full advantage of the promise of machine reading for summarization, systems need to take steps to provide greater stability and control over the knowledge being added to a KB.
– Control: How do we keep from introducing error-full knowledge into our knowledge bases?
– Stability: How do we keep from removing accurate knowledge from our knowledge bases?
[Plot: quality of knowledge in the KB over time, contrasting curves for including perfect knowledge, introducing errors, introducing errors while removing accurate knowledge, and machine reading.]

27 Thank you!

28 Semantic Question Decomposition
The method for decomposing questions operates on a Markov Chain (MC) by performing a random walk on a bipartite graph of:
– sequences of operators on relations (Addition(R1), Remove(R1), Replace(R1, R2))
– questions created by previous sequences of operators
The Markov Chain alternates between selecting a sequence of operations {O_i} and generating a question decomposition Q_i, following transition probabilities p(Q_1|O_0), p(O_1|Q_1), p(Q_2|O_1), p(O_2|Q_2), and so on.
Assume the initial state of the MC depends on the initial sequence of operators available, {O_0}.
Defining {O_0} depends on access to a knowledge mapping function M_1(KB, T, TC):
– KB: available knowledge base
– T: available text in the corpus
– TC: concepts extracted from T
Assume that {O_0} represents the set of operators that maximizes the value of M_1.

29 Semantic Question Decomposition
Following (Lapata and Lascarides 2003), the role of M_1 is to coerce knowledge from a conceptual representation of a text that can be used in question decomposition.
State transition probabilities also depend on a second mapping function, M_2, defined as M_2(KB, T) = {C_L, R_L}:
– C_L: set of related concepts stored in a KB
– R_L: set of relations that exist between concepts in C_L
– Both C_L and R_L are assumed to be discovered using M_1.
This notation allows us to define a random walk for hypothesis generation using matrix notation. Given N = |C_L| and M = |R_L|, we define:
– a stochastic matrix A (with dimensions N × M) with entries a_{i,j} = p(r_i | h_j), where r_i is a sequence of relations and h_j is a partial hypothesis generated so far
– a second matrix B (with dimensions M × N) with entries b_{i,j} = p(h_i | r_j)
We can estimate the probabilities a_{i,j} and b_{i,j} by applying the Viterbi algorithm to the maximum likelihood estimates resulting from the knowledge mappings for M_1 and M_2.
Several possibilities exist for M_1 and M_2, including:
– density functions introduced by (Resnik 1995)
– the probabilistic framework for taxonomy representation of (Snow et al. 2006)
A small sketch of the alternating random walk appears below.
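For illustration, the alternating walk over the two matrices might be simulated as in the sketch below; the toy matrices and the direct sampling procedure are assumptions, not the estimation procedure described above (which uses Viterbi over maximum-likelihood estimates).

```python
# A minimal sketch of an alternating random walk between relation sequences
# and partial hypotheses. A[i, j] approximates p(r_i | h_j) and B[j, i]
# approximates p(h_j | r_i); the toy matrices below are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def alternating_walk(A, B, h0, steps=4):
    """Start from hypothesis index h0; alternately sample a relation from
    p(r | h) (a column of A) and a new hypothesis from p(h | r) (a column
    of B), returning the visited (relation, hypothesis) index pairs."""
    h = h0
    path = []
    for _ in range(steps):
        r = rng.choice(A.shape[0], p=A[:, h])   # sample relation given hypothesis
        h = rng.choice(B.shape[0], p=B[:, r])   # sample hypothesis given relation
        path.append((r, h))
    return path

# toy example: 3 relations, 2 hypotheses; columns of each matrix sum to 1
A = np.array([[0.5, 0.2], [0.3, 0.3], [0.2, 0.5]])   # p(r | h)
B = np.array([[0.7, 0.4, 0.1], [0.3, 0.6, 0.9]])     # p(h | r)
print(alternating_walk(A, B, h0=0))
```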

