Exploiting Timelines to Enhance Multi-document Summarization


1 Exploiting Timelines to Enhance Multi-document Summarization
Jun-Ping Ng, Yan Chen, Min-Yen Kan and Zhoujun Li
National University of Singapore; Beihang University

2 Cyclone Sidr 2007, JTWC designation: 06B
24 Jun 2014 ACL Timelines in Summarization
“A fierce cyclone packing extreme winds and torrential rain smashed into Bangladesh’s southwestern coast Thursday, …”
Image Courtesy: Univ. Wisconsin-Madison

3 “… wiping out homes and trees in what officials described as the worst storm in years.”
Image Courtesy: US Navy / Wikipedia

4 “More than 100,000 coastal villagers have been evacuated before the cyclone made landfall.”
Image Courtesy: US State Department / Wikipedia

5 1991 Bangladesh Cyclone
“The storm matched one in 1991 that sparked a tidal wave that killed an estimated 138,000 people, Karmakar told AFP.”
Image Courtesy: US Navy / Wikipedia

6 ACL 2014 - Timelines in Summarization
[2] “More than 100,000 coastal villagers have been evacuated before the cyclone made landfall.”
[1] “A fierce cyclone packing extreme winds and torrential rain smashed into Bangladesh’s southwestern coast Thursday, wiping out homes and trees in what officials described as the worst storm in years.”
[3] “The storm matched one in 1991 that sparked a tidal wave that killed an estimated 138,000 people, Karmakar told AFP.”


8 Timelines from Text
[3] “The storm matched one in 1991 that sparked a tidal wave that killed an estimated 138,000 people, Karmakar told AFP.”
[1] “A fierce cyclone packing extreme winds and torrential rain smashed into Bangladesh’s southwestern coast Thursday, wiping out homes and trees in what officials described as the worst storm in years.”
[2] “More than 100,000 coastal villagers have been evacuated before the cyclone made landfall.”

9 Key time spans are summary worthy
[3] “The storm matched one in 1991 that sparked a tidal wave that killed an estimated 138,000 people, Karmakar told AFP.”
[1] “A fierce cyclone packing extreme winds and torrential rain smashed into Bangladesh’s southwestern coast Thursday, wiping out homes and trees in what officials described as the worst storm in years.”
[2] “More than 100,000 coastal villagers have been evacuated before the cyclone made landfall.”

10 Timelines + Summarization
Diagram labels: Timelines (per input document); Timeline-derived features; Lexical and positional features; Summarization System; Summary
Notes: Overview of how we are putting things together.

11 Outline
Goal and Motivation
Timeline Generation
Integrating Timelines
  In Scoring: (Contextual) Importance, Density
  In Re-ordering: TimeMMR
Experiments
Discussion

12 Timeline Generation

13 1. Event-Event Temporal Classification
(Ng et al., 2013; EMNLP)
Article-wide; not all pairs shown for brevity

14 2. Event-Timex Temporal Classification
(Ng and Kan, 2012; COLING)
Only intra-sentence

15 3. Timex Normalization (HeidelTime; Strötgen and Gertz, 2013)
“Today” → June 6, 2014

16 Timeline Construction
Map normalized timexes to the timeline
Place events which OVERLAP with timexes onto the timeline
Place events which OVERLAP with other events onto the timeline
Insert the rest of the events based on BEFORE/AFTER ordering
Figure label: 1999
Notes: Once we have the results of the underlying temporal processing, we know the relative ordering between all the events we want to place onto a timeline, so constructing the timeline is fairly straightforward. This is one example of an algorithm for building a timeline; it is certainly not the only way.
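The four construction steps on this slide can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function name `build_timeline`, the `(a, relation, b)` tuples and the assumption that normalized timex values sort correctly as strings are all made up for the example. It merges OVERLAP items into one time span and orders the spans with Python's standard `graphlib` module (Python 3.9+). Like the slide's algorithm, it does not reconcile contradictory relations; a cyclic ordering would raise `graphlib.CycleError`.

```python
from graphlib import TopologicalSorter


def build_timeline(timexes, relations):
    """Toy timeline builder (hypothetical data model).

    timexes:   dict of timex id -> normalized value, e.g. {"t1": "1991"}
    relations: list of (a, rel, b) with rel in {"OVERLAP", "BEFORE", "AFTER"}
    Returns a list of time spans (sets of ids) in temporal order.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path compression.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    # Step 1: map normalized timexes to the timeline; equal values share a span.
    by_value = {}
    for timex, value in timexes.items():
        find(timex)
        if value in by_value:
            union(timex, by_value[value])
        else:
            by_value[value] = timex

    # Steps 2 and 3: anything that OVERLAPs a timex or an event joins its span.
    for a, rel, b in relations:
        find(a)
        find(b)
        if rel == "OVERLAP":
            union(a, b)

    # Step 4: order the spans. graph maps each span to its predecessor spans.
    graph = {find(x): set() for x in parent}
    values = sorted(by_value)  # assumes values sort chronologically as strings
    for earlier, later in zip(values, values[1:]):
        graph[find(by_value[later])].add(find(by_value[earlier]))
    for a, rel, b in relations:
        if rel == "AFTER":
            a, b = b, a
        elif rel != "BEFORE":
            continue
        if find(a) != find(b):
            graph[find(b)].add(find(a))

    members = {}
    for x in parent:
        members.setdefault(find(x), set()).add(x)
    return [members[root] for root in TopologicalSorter(graph).static_order()]
```

For the cyclone example, two timexes ("1991", "2007") plus the relations "tidal_wave OVERLAP 1991", "smashed OVERLAP 2007", "evacuated BEFORE smashed" and "evacuated AFTER tidal_wave" would give three spans in the order 1991, evacuation, 2007.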

17 Integrating Timelines into SWING
SWING (Ng et al., COLING 2012, TAC 2011)
State-of-the-art open-source extractive summarizer: https://github.com/WING-NUS/SWING
Basic, k of n sentence summaries
Diagram labels: Temporal Processing; Summarization Pipeline; Time Span Importance; Contextual Time Span Importance; Sentence Temporal Coverage Density; TimeMMR
Notes: I base my experiments on SWING, an open-source summarization system. We originally developed SWING to participate in the Text Analysis Conference (TAC). In the evaluation, SWING came in first when evaluated with ROUGE, a popular automatic measure. It also did very well in manual pyramid evaluations for content, placing in the top 2 if I remember correctly. It is open-source and freely available for non-commercial use, so do download it and give it a try if you are interested.

18 1. Time Span Importance (TSI)
Time spans which contain many events are more salient
Sentences which reference events in these time spans are thus better candidates for a summary
TS_L: the time span with max # of …
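The transcript truncates the TSI definition, so the following is only one plausible reading, not the formula from the talk. It assumes a sentence is scored by averaging, over the time spans its events fall in, the event count of each span divided by the event count of the largest span (TS_L). The function name and the dictionary-based inputs are hypothetical.

```python
def time_span_importance(sentence_spans, span_event_counts):
    """Assumed TSI sketch.

    sentence_spans:    ids of the time spans referenced by the sentence's events
    span_event_counts: dict of span id -> number of events in that span
    """
    if not sentence_spans or not span_event_counts:
        return 0.0
    largest = max(span_event_counts.values())  # event count of TS_L
    if largest == 0:
        return 0.0
    # Each referenced span contributes its size relative to the largest span.
    total = sum(span_event_counts.get(s, 0) / largest for s in sentence_spans)
    return total / len(sentence_spans)
```

Under this reading, a sentence whose only event sits in the busiest time span scores 1.0, and sentences that reference sparsely populated spans score lower.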

19 2. Contextual Time Span Importance (CTSI)
Time spans near important time spans are themselves important
Search left and right for local peaks
[equation]
Notes: Explain concepts of TSI, CTSI and TCD
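The equation for CTSI did not survive in the transcript, so this sketch only illustrates the "search left and right for local peaks" idea under stated assumptions. It treats the timeline as an ordered list of per-span event counts, calls a span a peak when its count is non-zero and at least as large as both neighbours, and discounts the nearest peak on each side by its distance. The discount `1 / (1 + distance)` and both function names are invented for the example.

```python
def find_peaks(counts):
    """Indices of spans whose event count is a local maximum (and non-zero)."""
    peaks = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i + 1 < len(counts) else 0
        if c > 0 and c >= left and c >= right:
            peaks.append(i)
    return peaks


def contextual_importance(counts, index):
    """Assumed CTSI sketch: importance a span inherits from its nearest peaks."""
    peaks = find_peaks(counts)
    if not peaks:
        return 0.0
    largest = max(counts)
    # Nearest peak at or to the left, and at or to the right, of the span.
    left = [p for p in peaks if p <= index]
    right = [p for p in peaks if p >= index]
    candidates = []
    if left:
        candidates.append(left[-1])
    if right:
        candidates.append(right[0])
    # Peak height (relative to the largest span), discounted by distance.
    return max(counts[p] / largest / (1 + abs(p - index)) for p in candidates)
```

With counts `[1, 5, 2, 0, 3]`, the peaks are at indices 1 and 4, and the span at index 2 inherits half of the main peak's importance because it is one step away.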

20 3. Sentence Temporal Coverage Density (TCD)
Favour sentences which contain more events covering a wide variety of time spans
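The slide gives no formula for TCD. As a rough illustration only, the sketch below assumes density is the number of distinct time spans covered by a sentence divided by the number of events in it, which rewards variety of time spans but is not necessarily the definition used in the talk. The function name and input format are hypothetical.

```python
def temporal_coverage_density(event_spans):
    """Assumed TCD sketch.

    event_spans: for each event in the sentence, the id of its time span.
    """
    if not event_spans:
        return 0.0
    # Distinct spans covered, normalized by the number of events.
    return len(set(event_spans)) / len(event_spans)
```

A sentence with three events spread over the 1991 and 2007 spans would score 2/3 under this assumption, while one whose two events share a single span would score 0.5.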

21 Identifying Redundancies
SWING makes use of the Maximal Marginal Relevance (MMR) algorithm to identify redundancies in selected sentences
MMR is based largely on surface lexical similarities
Idea: Let’s use time as a basis to penalize the selection of sentences from redundant time periods.

22 TimeMMR
Beyond lexical similarities, identify sentences which contain substantial time span overlap.
Candidate sentences which share many time spans with selected sentences are penalized.
Equation annotation: proportion of overlap
Example (lexically dissimilar but redundant):
An official in Barisal, 120 kilometres south of Dhaka, spoke of severe destruction as the 500 kilometre-wide mass of cloud passed overhead. “Many trees have been uprooted and houses and schools blown away,” Mostofa Kamal, a district relief and rehabilitation officer, told AFP by telephone.
“Mud huts have been damaged and the roofs of several houses blown off,” said the state’s relief minister, Mortaza Hossain.
Notes: The example shows events which happen in the same time span but are lexically dissimilar.
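The penalty described on this slide can be sketched as a greedy selection loop. This is a simplified illustration, not SWING's TimeMMR: the overlap measure (share of a candidate's time spans already covered by selected sentences), the multiplicative `penalty` weight and the tuple layout of `sentences` are assumptions made for the example, and the lexical MMR term is left out entirely.

```python
def span_overlap(candidate_spans, selected_spans):
    """Proportion of the candidate's time spans already covered by the summary."""
    candidate_spans = set(candidate_spans)
    if not candidate_spans:
        return 0.0
    return len(candidate_spans & set(selected_spans)) / len(candidate_spans)


def time_mmr_select(sentences, k, penalty=0.5):
    """Greedily pick k sentences, discounting those from already covered spans.

    sentences: list of (sentence_id, relevance_score, set_of_time_span_ids)
    """
    selected, covered = [], set()
    remaining = list(sentences)
    while remaining and len(selected) < k:

        def adjusted(item):
            _, score, spans = item
            return score * (1.0 - penalty * span_overlap(spans, covered))

        best = max(remaining, key=adjusted)
        remaining.remove(best)
        selected.append(best[0])
        covered |= set(best[2])
    return selected
```

With three candidates where the top two share the same time span, the penalty lets a lower-scored sentence from a different span overtake the second one, which mirrors the "lexically dissimilar but redundant" case above.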

23 Experiments
Data
  TAC 2010 dataset for training
  TAC 2011 dataset for testing
Temporal Processing Systems
  HeidelTime (Strötgen and Gertz, 2013)
  E-T temporal classification (Ng and Kan, 2012)
  E-E temporal classification (Ng et al., 2013)
Summarization baseline
  SWING (Ng et al., 2012)

24 Results
#    Configuration                            R-2
R    SWING                                    0.1339
B1   CLASSY                                   0.1278
1    SWING + Timeline Features                0.1394*
2    SWING + Timeline Features + TimeMMR      0.1389
* = p < 0.1, ** = p < 0.05, against row R
Doesn’t seem very effective!
Notes: Explain filtering

25 Analysis: Timelines contain errors
Errors from underlying temporal processing systems
Simplifying assumptions made in timeline construction
Lack of consistency checking and validation
For effective use, we must identify good timelines
  Identify timelines which potentially contain more errors
  Exclude these when performing summarization

26 Reliability Filtering
Short timelines can result when the system fails to extract or relate events and timexes
Features derived from short timelines are prone to have extreme values
Use the length of a timeline as a gauge of its accuracy
Don’t use timelines shorter than average (as computed over the whole collection)
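The length-based filter is simple enough to sketch directly. The snippet assumes a timeline's length is its number of time spans and that timelines are held in a dictionary keyed by document id; both choices, and the name `filter_timelines`, are assumptions for illustration.

```python
def filter_timelines(timelines):
    """Keep only timelines at least as long as the collection average.

    timelines: dict of document id -> timeline (a list of time spans)
    """
    if not timelines:
        return {}
    average = sum(len(t) for t in timelines.values()) / len(timelines)
    # Timelines shorter than the average are treated as unreliable and dropped.
    return {doc: t for doc, t in timelines.items() if len(t) >= average}
```

Documents whose timelines are dropped would presumably fall back to the lexical and positional features alone, though the slides do not spell this out.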

27 With Reliability Filtering
#    Configuration                                      R-2
R    SWING                                              0.1339
B1   CLASSY                                             0.1278
1    SWING + Timeline Features                          0.1394*
2    SWING + Timeline Features + TimeMMR                0.1389
3    SWING + Timeline Features [Filtered]               0.1418**
4    SWING + Timeline Features + TimeMMR [Filtered]     0.1402**
* = p < 0.1, ** = p < 0.05, against row R
TimeMMR doesn’t seem effective! Why?
Notes: Explain “filtering” verbally and show that it is useful!

28 Does TimeMMR actually help?
L1 An Iraqi reporter threw his shoes at visiting U.S. President George W. Bush and called him a “dog” in Arabic during a news conference with Iraqi Prime Minister Nuri al-Maliki in Baghdad R1
L2 “All I can report is it is a size 10.” R2
L3 Muntadhar al-Zaidi, reporter of Baghdadiya television jumped and threw his two shoes one by one at the president, who ducked and thus narrowly missed being struck, raising chaos in the hall in Baghdad’s heavily fortified Green Zone. The incident occurred as Bush was appearing with Iraqi Prime Minister Nouri al-Maliki. R3
L4 The president lowered his head and the first shoe hit the American and Iraqi flags behind the two leaders. R4
L5 The president lowered his head and the … R5
Legend: Possibly redundant?
Left summary: worse by R-2. Right summary: better by R-2.
Could an (automated) evaluation metric cater for time?
Notes: The right summary is better according to R-2. However, we argue that R3 is “redundant”, and that the left summary gives more information.

29 Conclusion
Use of automatic timeline generation
Integration of timelines into summarization
  Sentence scoring via timeline features
  Sentence re-ordering via TimeMMR
Length-based timeline filtering helps to ameliorate errors
For details on temporal processing, see Jun-Ping’s work at COLING 2012 and EMNLP 2013, and his doctoral thesis (2014)
Questions?
Notes: If not, ask for more detailed analysis!

30 Additional Slides

31 Related Work
For Sentence Reordering
  Barzilay et al., 1999
Recency as an indicator of salience
  Goldstein et al., 2000; Wan, 2007; Demartini et al., 2010
Liu et al., 2009 (“Temporal Graph”)
Wu, 2008 (“Largest Cluster”)
TREC Temporal Summarization Track
  Not as relevant; about monitoring an event over time
Callout: Close to our TSI

32 With time features (better) vs. baseline (worse)

33 TSI: A crane accident
With TSI; better
Without TSI; worse
With TSI, the summary includes the cause of the accident; the alternative R1 sentence is background information and does not occur at any key time span.

34 CTSI: Coral Reef Preservation
With CTSI; better
Without CTSI; worse
With CTSI, the “warn” and “disappear” events were promoted in importance due to their proximity to peak P.

35 Timeline Caveats
Some events span a long period of time (e.g., “1999”)
  Events are ordered based on the start of the duration
Timeline captures relative order
Construction algorithm does not attempt to reconcile contradictions
Notes: The definition of a timeline I am adopting here is somewhat simplified, mainly because we assume only 3 temporal relations. So here are some caveats with the chosen representation.

36 Timex Normalization
F1 scores
Source: Bethard, 2013

37 References
Jun-Ping Ng. Interpreting Text with Time. Doctoral Thesis, National University of Singapore, 2014.
Jun-Ping Ng, Min-Yen Kan, Ziheng Lin, Wei Feng, Bin Chen, Jian Su, Chew-Lim Tan. Exploiting Discourse Analysis for Article-Wide Temporal Classification. EMNLP 2013.
Jun-Ping Ng, Praveen Bysani, Ziheng Lin, Min-Yen Kan, Chew-Lim Tan. Exploiting Category-Specific Information for Multi-Document Summarization. COLING 2012.
Jun-Ping Ng, Min-Yen Kan. Improved Temporal Relation Classification using Dependency Parses and Selective Crowdsourced Annotations. COLING 2012.

