Exploiting Timelines to Enhance Multi-document Summarization Jun-Ping Ng, Yan Chen, Min-Yen Kan, Zhoujun Li DSO National Laboratories National University.


1 Exploiting Timelines to Enhance Multi-document Summarization Jun-Ping Ng, Yan Chen, Min-Yen Kan, Zhoujun Li DSO National Laboratories National University of Singapore Beihang University

2 Outline: Overview, Approach, Experiments and Results, Discussion

3 OVERVIEW

4 Multi-document Summarization

5 Extractive Summarization: Find the most salient sentences in the source collection. The top-k sentences are extracted to compose the final summary.
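A minimal sketch of top-k extraction, assuming each sentence already has a salience score. The function name `extract_summary`, the toy length-based scorer, and the choice to keep selected sentences in source order are illustrative assumptions, not details taken from the slides.

```python
from typing import Callable, List


def extract_summary(sentences: List[str],
                    score: Callable[[str], float],
                    k: int) -> List[str]:
    """Return the k highest-scoring sentences, kept in their source order."""
    # Rank sentence indices by score, highest first.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)
    # Restore source order among the k winners (an assumed design choice).
    chosen = sorted(ranked[:k])
    return [sentences[i] for i in chosen]


# Toy scorer (assumption): longer sentences count as more salient.
docs = ["Short one.",
        "A much longer sentence about the cyclone.",
        "Medium length text."]
summary = extract_summary(docs, score=len, k=2)
```

In a real system the scorer would be a learned model rather than sentence length; only the rank-then-cut step is the point here.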

6 Two Storms: (1) A fierce cyclone packing extreme winds and torrential rain smashed into Bangladesh’s southwestern coast Thursday, wiping out homes and trees in what officials described as the worst storm in years. (2) More than 100,000 coastal villagers have been evacuated before the cyclone made landfall. (3) The storm matched one in 1991 that sparked a tidal wave that killed an estimated 138,000 people, Karmakar told AFP.

8 Timeline

9 APPROACH

10 Merging Timelines Into Summarization

11 Temporal Processing: Based on TimeML (Pustejovsky et al., 2003). Basic temporal units: events and timexes. Three steps: (1) event-timex temporal relation classification; (2) event-event temporal relation classification; (3) timex normalization. The results are merged to obtain timelines.
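To make the final merge step concrete, here is a small sketch under strong simplifying assumptions: the classifiers have already linked each event to one timex, and each timex has been normalized to an ISO-style string. The function `build_timeline`, the input shapes, and the normalized values in the example are hypothetical; event-event relations are ignored entirely.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_timeline(event_timex: List[Tuple[str, str]],
                   timex_values: Dict[str, str]) -> List[Tuple[str, List[str]]]:
    """Group events under the normalized value of the timex they are linked to."""
    spans = defaultdict(list)
    for event, timex in event_timex:
        value = timex_values.get(timex)
        if value is not None:  # skip events whose timex was not normalized
            spans[value].append(event)
    # ISO-style strings sort chronologically when compared as text.
    return sorted(spans.items())


# Illustrative links and normalized values (assumptions, not from the slides).
links = [("smashed", "Thursday"), ("wiping", "Thursday"), ("sparked", "1991")]
values = {"Thursday": "2007-11-15", "1991": "1991"}
timeline = build_timeline(links, values)
```

Each entry of `timeline` pairs a time span with the events placed in it, which is the structure the later scoring sketches assume.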

12 Timelines

13 Summarization: SWING (https://github.com/WING-NUS/SWING)

14 Sentence Scoring: time span importance; contextual time span importance; sentence temporal coverage density

15 Defining Timeline Features

16 Time Span Importance (TSI): Time spans which contain many events are more salient. Sentences which reference events in these time spans are thus better candidates for a summary.

17 Scoring TSI
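The scoring formula itself is not part of this transcript. One plausible formulation, offered only as an assumption, is to normalize each span's event count by the largest count and average those values over the spans a sentence's events fall into. The names `time_span_importance` and `sentence_tsi` are hypothetical.

```python
from typing import Dict, List


def time_span_importance(timeline: Dict[str, List[str]]) -> Dict[str, float]:
    """Map each time span to its event count, normalized by the largest count."""
    largest = max((len(events) for events in timeline.values()), default=0)
    if largest == 0:
        return {span: 0.0 for span in timeline}
    return {span: len(events) / largest for span, events in timeline.items()}


def sentence_tsi(sentence_events: List[str],
                 timeline: Dict[str, List[str]]) -> float:
    """Average the importance of the spans that hold the sentence's events."""
    tsi = time_span_importance(timeline)
    span_of = {e: span for span, events in timeline.items() for e in events}
    scores = [tsi[span_of[e]] for e in sentence_events if e in span_of]
    return sum(scores) / len(scores) if scores else 0.0


# Toy timeline: span label -> events placed in that span (illustrative).
timeline = {"1991": ["sparked"],
            "Thursday": ["smashed", "wiping", "evacuated"]}
busy = sentence_tsi(["smashed", "wiping"], timeline)
quiet = sentence_tsi(["sparked"], timeline)
```

Here the sentence about Thursday's events scores higher than the one about 1991 because its span holds more events.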

18 Contextual Time Span Importance (CTSI): Time spans near “important” time spans may also be important.

19 Scoring CTSI
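The transcript does not include the CTSI formula. As an assumed illustration of the idea, the sketch below scores each span by the event counts of its neighbours on the timeline, discounted by distance. The function name and the `window` and `decay` parameters are hypothetical.

```python
from typing import List


def contextual_importance(event_counts: List[int],
                          window: int = 1,
                          decay: float = 0.5) -> List[float]:
    """Score each span by its neighbours' event counts, discounted by distance."""
    n = len(event_counts)
    scores = []
    for i in range(n):
        total = 0.0
        for offset in range(1, window + 1):
            weight = decay ** offset
            # Look one step further left and right on each iteration.
            for j in (i - offset, i + offset):
                if 0 <= j < n:
                    total += weight * event_counts[j]
        scores.append(total)
    return scores


# Event counts for four consecutive time spans (illustrative).
counts = [1, 0, 5, 1]
ctsi = contextual_importance(counts)
```

With these toy counts the second span has no events of its own yet receives the highest contextual score, because it sits next to the busiest span.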

20 Sentence Temporal Coverage Density (TCD): The number of sentences in a summary is limited. Favour sentences which – contain more events – cover a wide variety of time spans

21 Scoring TCD
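The TCD formula is not reproduced in this transcript. A simple stand-in, stated here as an assumption rather than the authors' definition, is the fraction of the timeline's distinct spans that a sentence's events touch. The name `temporal_coverage` and the `span_of` mapping are hypothetical.

```python
from typing import Dict, List


def temporal_coverage(sentence_events: List[str],
                      span_of: Dict[str, str]) -> float:
    """Fraction of the timeline's distinct spans touched by the sentence's events."""
    all_spans = set(span_of.values())
    if not all_spans:
        return 0.0
    covered = {span_of[e] for e in sentence_events if e in span_of}
    return len(covered) / len(all_spans)


# Event -> time span lookup (illustrative labels).
span_of = {"smashed": "Thursday", "wiping": "Thursday",
           "sparked": "1991", "killed": "1991"}
narrow = temporal_coverage(["smashed", "wiping"], span_of)  # one span
wide = temporal_coverage(["smashed", "sparked"], span_of)   # two spans
```

Both toy sentences mention two events, but the one spanning both time spans scores higher, which matches the stated preference for wide temporal coverage.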

22 Sentence Re-ordering: SWING makes use of the Maximal Marginal Relevance (MMR) algorithm to identify redundancies in selected sentences. MMR is heavily biased towards lexical and surface similarities.

23 Beyond Lexical Penalties: An official in Barisal, 120 kilometres south of Dhaka, spoke of severe destruction as the 500 kilometre-wide mass of cloud passed overhead. “Many trees have been uprooted and houses and schools blown away,” Mostofa Kamal, a district relief and rehabilitation officer, told AFP by telephone. “Mud huts have been damaged and the roofs of several houses blown off,” said the state’s relief minister, Mortaza Hossain.

24 TimeMMR: A novel dimension to redundancy detection. Beyond lexical similarities, identify sentences which contain substantial time span overlaps. Candidate sentences which share many time spans with selected sentences are penalised.
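As a rough sketch of the penalty described above, the code below runs a greedy MMR-style selection in which the redundancy term is the Jaccard overlap of time-span sets. The lexical similarity term of standard MMR is left out to keep the example short, and the function names, the `lam` trade-off weight, and the use of Jaccard overlap are all assumptions rather than the presented system's definition.

```python
from typing import Dict, List, Set


def span_overlap(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap between two sets of time spans."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def time_mmr(relevance: Dict[str, float],
             spans: Dict[str, Set[str]],
             k: int,
             lam: float = 0.7) -> List[str]:
    """Greedily pick k sentences, penalising time-span overlap with earlier picks."""
    selected: List[str] = []
    candidates = set(relevance)
    while candidates and len(selected) < k:
        def mmr(s: str) -> float:
            penalty = max((span_overlap(spans[s], spans[t]) for t in selected),
                          default=0.0)
            return lam * relevance[s] - (1 - lam) * penalty
        best = max(sorted(candidates), key=mmr)  # sorted() makes ties deterministic
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy sentences: s1 and s2 cover the same time span, s3 a different one.
relevance = {"s1": 0.9, "s2": 0.85, "s3": 0.6}
spans = {"s1": {"Thursday"}, "s2": {"Thursday"}, "s3": {"1991"}}
picked = time_mmr(relevance, spans, k=2)
```

In the toy data `s2` is more relevant than `s3`, but it repeats the time span of `s1`, so the second pick goes to `s3`.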

25 EXPERIMENTS AND RESULTS

26 Results: TAC-2010 data set used to train the regression model; TAC-2011 data set used to test. Using timelines leads to better summaries! Systems compared on ROUGE-2: SWING; + Timelines (0.1394*); + TimeMMR

27 Overcoming Errors: Timelines contain errors – errors from underlying temporal processing systems – simplifying assumptions made in timeline construction – lack of consistency checking and validation

28 Reliability Filtering: Identify timelines which potentially contain more errors. Exclude these when performing summarization.

29 Length as a Metric: Use the length of a timeline as a gauge of its “accuracy”. Drop timelines which are shorter than the average length, computed over the whole input document collection.
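The length-based filter can be sketched in a few lines. This assumes one timeline per document set and measures length as the number of entries on the timeline; both the unit of length and the `filter_by_length` name are assumptions made for illustration.

```python
from typing import Dict, List


def filter_by_length(timelines: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep timelines whose length is at least the collection-wide average."""
    if not timelines:
        return {}
    average = sum(len(t) for t in timelines.values()) / len(timelines)
    return {name: t for name, t in timelines.items() if len(t) >= average}


# Hypothetical document sets with timelines of different lengths.
timelines = {
    "set_a": ["e1", "e2", "e3", "e4"],
    "set_b": ["e1"],
    "set_c": ["e1", "e2", "e3", "e4"],
}
kept = filter_by_length(timelines)  # average length is 3, so set_b is dropped
```

Document sets whose timeline is dropped would fall back to summarization without timeline features.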

30 Results: Experiments repeated with reliability filtering. Significant improvement obtained. After filtering, timelines are used in 21 out of 44 document sets. Systems compared on ROUGE-2: SWING; + Timelines (0.1394*); + Timelines + Filtering (**); + TimeMMR; + TimeMMR + Filtering (**)

31 DISCUSSION

32 Text Example: The Army’s surgeon general criticized stories in The Washington Post disclosing problems at Walter Reed Army Medical Center, saying the series unfairly characterized the living conditions and care for soldiers recuperating from wounds at the hospital’s facilities. Defense Secretary Robert Gates says people found to have been responsible for allowing substandard living conditions for soldier outpatients at Walter Reed Army Medical Center in Washington will be “held accountable,” although so far no one in the Army chain of command has offered to resign. A top Army general vowed to personally oversee the upgrading of Walter Reed Army Medical Center’s Building 18, a dilapidated former hotel that houses wounded soldiers as outpatients. Top Army officials visited Building 18, the decrepit former hotel housing more than 80 recovering soldiers, outside “I’m not sure it was an accurate representation,” Lt. Gen. Kevin Kiley, chief of the Army Medical Command which oversees Walter Reed and all Army health care, told reporters during a news conference. Column labels: Timelines Used; SWING

33 Future Work: Study the use of alternative evaluation metrics, especially for TimeMMR. Look at better metrics for reliability filtering. Expand the scope of the timelines that are used for more flexibility.

34 Conclusion: The use of time is useful for summarization! Sentence Scoring – derive features from a timeline – combine features with a supervised learning summarization framework. Sentence Re-ordering – use overlapping time spans to identify redundancies.

35 Thank you!

