Exploiting Timelines to Enhance Multi-document Summarization
Jun-Ping Ng, Yan Chen, Min-Yen Kan, Zhoujun Li
DSO National Laboratories, National University of Singapore, Beihang University
Outline
  Overview
  Approach
  Experiments and Results
  Discussion
OVERVIEW
Multi-document Summarization
Extractive Summarization
  Find the most salient sentences in the source collection
  Top-k sentences are extracted to compose the final summary
Two Storms
  (1) A fierce cyclone packing extreme winds and torrential rain smashed into Bangladesh’s southwestern coast Thursday, wiping out homes and trees in what officials described as the worst storm in years.
  (2) More than 100,000 coastal villagers have been evacuated before the cyclone made landfall.
  (3) The storm matched one in 1991 that sparked a tidal wave that killed an estimated 138,000 people, Karmakar told AFP.
Timeline
APPROACH
Merging Timelines Into Summarization
Temporal Processing
  Based on TimeML (Pustejovsky et al. 2003)
  Basic temporal units – events + timexes
  Three steps
    – Event-timex temporal relation classification
    – Event-event temporal relation classification
    – Timex normalization
  Merge to obtain timelines
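To make the merge step concrete, here is a minimal sketch, assuming the classifiers and the normalizer have already run. The function name `build_timeline`, the input shapes, and the example values are hypothetical, and event-event relations are left out for brevity; this is not the authors' implementation.

```python
from collections import defaultdict


def build_timeline(event_timex_links, normalized_timex):
    """Group events under the normalized value of the timex they are linked to.

    event_timex_links: list of (event, timex) pairs (assumed classifier output).
    normalized_timex: dict mapping a timex string to a sortable value
                      (assumed normalizer output).
    Returns a list of (time span, [events]) pairs sorted by time span.
    """
    timeline = defaultdict(list)
    for event, timex in event_timex_links:
        value = normalized_timex.get(timex)
        if value is not None:  # skip events whose timex could not be normalized
            timeline[value].append(event)
    return sorted(timeline.items())


# Illustrative input only; the normalized values are made up for the example.
links = [("smashed", "Thursday"), ("evacuated", "Thursday"), ("killed", "1991")]
normalized = {"Thursday": "t1", "1991": "t0"}
timeline = build_timeline(links, normalized)
```

Each entry of `timeline` is one time span with the events that fall in it, which is the structure the later sketches assume.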
Timelines
Summarization --- SWING
Sentence Scoring
  Time span importance
  Contextual time span importance
  Sentence temporal coverage density
Defining Timeline Features
Time Span Importance (TSI)
  Time spans which contain many events are more salient
  Sentences which reference events in these time spans are thus better candidates for a summary
Scoring TSI
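The scoring formula from this slide is not reproduced in the text, so the following is only one plausible sketch of the idea. It assumes a time span's importance is its event count normalized by the largest count, and that a sentence takes the best score among the spans its events fall in. Both choices, and all names, are assumptions.

```python
from collections import Counter


def time_span_importance(event_spans):
    """Score each time span by its event count, scaled to [0, 1].

    event_spans: dict mapping an event to its time span (hypothetical shape).
    """
    counts = Counter(event_spans.values())
    if not counts:
        return {}
    max_count = max(counts.values())
    return {span: n / max_count for span, n in counts.items()}


def sentence_tsi(sentence_events, event_spans, tsi):
    """Assumed aggregation: a sentence scores as its most important span."""
    spans = {event_spans[e] for e in sentence_events if e in event_spans}
    return max((tsi[s] for s in spans), default=0.0)


# Illustrative data: two events in span "t1", one in span "t0".
event_spans = {"smashed": "t1", "evacuated": "t1", "killed": "t0"}
tsi = time_span_importance(event_spans)
score = sentence_tsi(["killed"], event_spans, tsi)
```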
Contextual Time Span Importance (CTSI)
  Time spans near “important” time spans may also be important
Scoring CTSI
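As with TSI, the formula itself is missing from the text. A minimal sketch of the "nearby spans inherit importance" idea follows, assuming spans are ordered along the timeline and that a neighbour's contribution decays geometrically with distance. The `window` and `decay` parameters are hypothetical, not taken from the talk.

```python
def contextual_importance(tsi_in_order, window=1, decay=0.5):
    """For each span, sum the decayed TSI of its neighbours within `window`.

    tsi_in_order: TSI scores listed in timeline order (assumed input).
    A span's own score is excluded so that CTSI captures context only.
    """
    n = len(tsi_in_order)
    scores = []
    for i in range(n):
        total = 0.0
        for j in range(max(0, i - window), min(n, i + window + 1)):
            if j != i:
                total += tsi_in_order[j] * decay ** abs(i - j)
        scores.append(total)
    return scores


# The span at index 1 is important, so its neighbours pick up context scores.
ctsi = contextual_importance([0.2, 1.0, 0.2, 0.0])
```

In this example the two spans adjacent to the important one receive the highest contextual scores, even though their own TSI is low.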
Sentence Temporal Coverage Density (TCD)
  Number of sentences in a summary is limited
  Favour sentences which
    – contain more events
    – cover a wide variety of time spans
Scoring TCD
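The exact TCD definition is not recoverable from the slide text. The sketch below assumes one simple reading: the fraction of the timeline's time spans that a sentence's events touch, so that sentences with more events spread over more spans score higher. The function name and input shapes are hypothetical.

```python
def temporal_coverage_density(sentence_events, event_spans, all_spans):
    """Fraction of all timeline spans covered by the sentence's events.

    sentence_events: events mentioned in the sentence.
    event_spans: dict mapping an event to its time span.
    all_spans: set of every time span on the timeline.
    """
    if not all_spans:
        return 0.0
    covered = {event_spans[e] for e in sentence_events if e in event_spans}
    return len(covered) / len(all_spans)


# Illustrative data: a four-span timeline with three placed events.
event_spans = {"smashed": "t2", "evacuated": "t1", "killed": "t0"}
all_spans = {"t0", "t1", "t2", "t3"}
wide = temporal_coverage_density(["smashed", "killed"], event_spans, all_spans)
narrow = temporal_coverage_density(["smashed"], event_spans, all_spans)
```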
Sentence Re-ordering
  SWING makes use of the Maximal Marginal Relevance (MMR) algorithm to identify redundancies in selected sentences
  MMR is heavily biased towards lexical and surface similarities
Beyond Lexical Penalties
  An official in Barisal, 120 kilometres south of Dhaka, spoke of severe destruction as the 500 kilometre-wide mass of cloud passed overhead.
  “Many trees have been uprooted and houses and schools blown away,” Mostofa Kamal, a district relief and rehabilitation officer, told AFP by telephone.
  “Mud huts have been damaged and the roofs of several houses blown off,” said the state’s relief minister, Mortaza Hossain.
TimeMMR
  Novel dimension to redundancy detection
  Beyond lexical similarities, identify sentences which contain substantial time span overlaps
  Candidate sentences which share many time spans with selected sentences are penalised
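A minimal sketch of how a time span penalty could sit alongside the lexical one in an MMR-style greedy selection. It assumes Jaccard overlap for both word sets and span sets and a linear blend controlled by `time_weight`; these choices, the parameter values, and the candidate data are all hypothetical rather than the authors' formulation.

```python
def jaccard(a, b):
    """Jaccard overlap between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def redundancy(candidate, selected, time_weight):
    """Blend lexical and time span overlap with already selected sentences."""
    if not selected:
        return 0.0
    lexical = max(jaccard(candidate["words"], s["words"]) for s in selected)
    temporal = max(jaccard(candidate["spans"], s["spans"]) for s in selected)
    return (1 - time_weight) * lexical + time_weight * temporal


def time_mmr_select(candidates, k, lam=0.7, time_weight=0.5):
    """Greedily pick k sentences, trading relevance against redundancy."""
    selected, remaining = [], list(candidates)
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda c: lam * c["score"]
            - (1 - lam) * redundancy(c, selected, time_weight),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# A and B share no words but describe the same time span; C covers another.
candidates = [
    {"id": "A", "score": 0.90, "words": {"cyclone", "coast"}, "spans": {"t1"}},
    {"id": "B", "score": 0.85, "words": {"trees", "houses"}, "spans": {"t1"}},
    {"id": "C", "score": 0.80, "words": {"storm", "tidal"}, "spans": {"t0"}},
]
with_time = [c["id"] for c in time_mmr_select(candidates, 2)]
lexical_only = [c["id"] for c in time_mmr_select(candidates, 2, time_weight=0.0)]
```

With the time penalty switched off (`time_weight=0.0`), the lexically distinct but temporally overlapping sentence B is chosen second; with it on, C is preferred.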
EXPERIMENTS AND RESULTS
Results
  TAC-2010 data set used to train the regression model
  TAC-2011 data set used for testing
  Using timelines leads to better summaries!
  System         ROUGE-2
  SWING
  Timelines      0.1394*
  + TimeMMR
Overcoming Errors
  Timelines contain errors
    – Errors from underlying temporal processing systems
    – Simplifying assumptions made in timeline construction
    – Lack of consistency checking and validation
Reliability Filtering
  Identify timelines which potentially contain more errors
  Exclude these when performing summarization
Length as a Metric
  Use the length of a timeline as a gauge of its “accuracy”
  Drop timelines which are shorter than the average length, computed over the whole input document collection
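The length filter can be sketched in a few lines. This assumes "length" means the number of time spans in a timeline and that a timeline exactly at the average is kept; both are assumptions, as are the names used.

```python
def filter_timelines(timelines):
    """Keep only timelines at least as long as the collection average.

    timelines: dict mapping a document id to its list of time spans
               (hypothetical shape).
    """
    if not timelines:
        return {}
    average = sum(len(t) for t in timelines.values()) / len(timelines)
    return {doc: t for doc, t in timelines.items() if len(t) >= average}


# Lengths 2, 5 and 8 give an average of 5, so only d1 is dropped.
timelines = {
    "d1": ["t0", "t1"],
    "d2": ["t0", "t1", "t2", "t3", "t4"],
    "d3": ["t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7"],
}
kept = filter_timelines(timelines)
```

Documents whose timelines are dropped would fall back to summarization without timeline features.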
Results
  Experiments repeated with reliability filtering
  Significant improvement obtained
  After filtering, timelines are used in 21 out of 44 document sets
  System                     ROUGE-2
  SWING
  Timelines                  0.1394*
  + Timelines + Filtering    **
  + TimeMMR
  TimeMMR + Filtering        **
DISCUSSION
Text Example
  The Army’s surgeon general criticized stories in The Washington Post disclosing problems at Walter Reed Army Medical Center, saying the series unfairly characterized the living conditions and care for soldiers recuperating from wounds at the hospital’s facilities.
  Defense Secretary Robert Gates says people found to have been responsible for allowing substandard living conditions for soldier outpatients at Walter Reed Army Medical Center in Washington will be “held accountable,” although so far no one in the Army chain of command has offered to resign.
  A top Army general vowed to personally oversee the upgrading of Walter Reed Army Medical Center’s Building 18, a dilapidated former hotel that houses wounded soldiers as outpatients.
  Top Army officials visited Building 18, the decrepit former hotel housing more than 80 recovering soldiers, outside
  “I’m not sure it was an accurate representation,” Lt. Gen. Kevin Kiley, chief of the Army Medical Command which oversees Walter Reed and all Army health care, told reporters during a news conference.
  Timelines Used    SWING
Future Work
  Study the use of alternative evaluation metrics, especially for TimeMMR
  Look at better metrics for reliability filtering
  Expand the scope of the timelines that are used for more flexibility
Conclusion
  Time is useful for summarization!
  Sentence Scoring
    – Derive features from a timeline
    – Combine features with a supervised learning summarization framework
  Sentence Re-ordering
    – Use overlapping time spans to identify redundancies
Thank you!