From extracts to abstracts: human summary production operations for computer-aided summarisation Laura Hasler University of Wolverhampton

1 From extracts to abstracts: human summary production operations for computer-aided summarisation Laura Hasler University of Wolverhampton L.Hasler@wlv.ac.uk CALP 2007: 30.09.07

2 Overview
Computer-aided summarisation (CAS)
Summary production stage of summarisation
Classification of human summary production operations (and guidelines)
Evaluation of classification (and guidelines derived from it)
Some conclusions and possibilities for future work

3 Computer-aided summarisation
Feasible alternative to fully automatic summarisation given problems of coherence/readability with automatic extracts
Automatic summarisation methods produce an extract (document exploration, relevance assessment) which is then post-edited by user (summary production)
No resources to ensure consistency
Focus of this research on summary production (extract → abstract) to improve coherence and readability

4 Aim of the research
A) Chernobyl reactor number 4 was ripped apart by an explosion on 26 April 1986. Last September, the IAEA and the WHO released a report. Its headline conclusion that radiation from the accident would kill a total of 4000 people was widely reported.
B) Last September, the IAEA/WHO released a report on the explosion of Chernobyl reactor number 4 on 26 April 1986, concluding that radiation from the accident would kill a total of 4000 people. (h03-ljh)

5 Classification of operations
43 pairs of news texts (extract, abstract)
30% extracts (CAST guidelines) → 20% abstracts
5 general classes of operations
– Atomic: deletion, insertion
– Complex: replacement, reordering, merging
Each split into sub-operations (26 in total)
Sub-operations linked to triggers, or recognisable surface forms
Function of units also important
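
The five general classes and their atomic/complex split can be sketched as a small data structure. This is only an illustrative encoding: the names `OperationClass`, `ATOMIC`, `COMPLEX` and `is_atomic` are hypothetical, and the 26 sub-operations are left out because the slides do not list them in full.

```python
from enum import Enum


class OperationClass(Enum):
    """The five general classes of summary production operations."""
    DELETION = "deletion"
    INSERTION = "insertion"
    REPLACEMENT = "replacement"
    REORDERING = "reordering"
    MERGING = "merging"


# Atomic operations can be used alone or as building blocks of complex ones.
ATOMIC = frozenset({OperationClass.DELETION, OperationClass.INSERTION})
COMPLEX = frozenset({
    OperationClass.REPLACEMENT,
    OperationClass.REORDERING,
    OperationClass.MERGING,
})


def is_atomic(op: OperationClass) -> bool:
    """Return True for the atomic classes (deletion, insertion)."""
    return op in ATOMIC
```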

6 Deletion
“The process of removing a unit from a certain place in the extract so it does not appear in the same place in the abstract”
Used alone or as part of complex operations
Very useful for reducing text when used alone
Deletes non-essential units (details, repetitions)
Complete sentences, subordinate clauses, PPs, reporting clauses, determiners, be

7 Deletion examples
[I suspect that] the set would be the ideal book for a physicist to be cast away with on a desert island. (new-sci-B7L-54-ljh)
Three papers published recently in Science move us a little closer to understanding the basis of the disease[, which turns out to be highly complex]. (sci04done-an)
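
In the examples above, square brackets mark the units that are deleted. A minimal sketch of applying such deletions to an annotated sentence follows. The function name `apply_deletions` is hypothetical, and the sketch assumes brackets are never nested and are used only to mark deleted units. It does not re-capitalise the sentence after a sentence-initial deletion.

```python
import re


def apply_deletions(annotated: str) -> str:
    """Remove every [bracketed] unit and tidy the leftover spacing."""
    # Drop each non-nested [ ... ] span, brackets included.
    text = re.sub(r"\[[^\[\]]*\]", "", annotated)
    # Collapse runs of whitespace left behind by the removal.
    text = re.sub(r"\s+", " ", text).strip()
    # Remove any space that now precedes a punctuation mark.
    return re.sub(r"\s+([.,;:!?])", r"\1", text)


print(apply_deletions(
    "[I suspect that] the set would be the ideal book for a physicist "
    "to be cast away with on a desert island."
))
```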

8 Insertion
“The process of adding a unit which is not present in the extract into the abstract”
Used alone or as part of complex operations
Interesting because it adds text to something which is supposed to be reduced
Used to add coherence and to clarify whilst saving space
Connectives, modifiers, ‘formulaic units’, punctuation

9 Insertion examples
He sees the need to raise public awareness and demystify science and technology as a key point… (new-sci-B7L-75-ljh) [X sees Y as Z]
The TV series Men of Science is now being shown in a few other areas. (new-sci-B7L-69-ljh)

10 Replacement
“The deletion of one unit and the insertion of a different unit in the same place in the text”
Complex operation, can be used in combination with other complex operations
Useful for avoiding repetition and saving space
Pronominalisation, lexical substitution, NP restructuring, nominalisation, VPs, passivisation, abbreviations

11 Replacement examples
[Zhanat Carr, a radiation scientist with the WHO in Geneva,] The WHO [says] admits the 5000 deaths were omitted because the report was a "political communication tool". (h03-ljh)
[All this] [is] hardly Culver’s fault. [The same difficulties are to be found in all other parts of evolutionary ecology.] → These general difficulties of evolutionary ecology are hardly Culver’s fault. (new-sci-B7L-63-ljh)
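
Because replacement is defined as a deletion followed by an insertion in the same place, it can be sketched as a composition of the two atomic operations. The sketch below assumes the text has already been segmented into a list of units. The segmentation and the function names `delete`, `insert` and `replace` are illustrative, not part of the original classification.

```python
from typing import List


def delete(units: List[str], index: int) -> List[str]:
    """Atomic deletion: return a copy without the unit at `index`."""
    return units[:index] + units[index + 1:]


def insert(units: List[str], index: int, unit: str) -> List[str]:
    """Atomic insertion: return a copy with `unit` placed at `index`."""
    return units[:index] + [unit] + units[index:]


def replace(units: List[str], index: int, unit: str) -> List[str]:
    """Complex operation: delete one unit, insert another in the same place."""
    return insert(delete(units, index), index, unit)


# Hand-segmented fragment of the first replacement example (h03-ljh).
extract = [
    "Zhanat Carr, a radiation scientist with the WHO in Geneva,",
    "says",
    "the 5000 deaths were omitted",
]
abstract = replace(replace(extract, 0, "The WHO"), 1, "admits")
print(" ".join(abstract))
```

The functions return new lists rather than editing in place, so the extract stays available for comparison with the abstract.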

12 Reordering
“The deletion of a unit from one place in the extract and its insertion in a different place in the abstract”
Complex operation, can be used in combination with other complex operations
Sub-functions rather than operations – difficult to sub-classify
Emphasises information, improves coherence and readability

13 Reordering example
Text about world’s second face transplant, all other sentences about a specific person/operation
Experts predict the number of these operations will rise rapidly as centres around the world gear up to perform the procedure. (h01-ljh)
S2 → last sentence
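
The move of the second sentence to the end of the text can be sketched as a deletion from one position followed by an insertion at another. The function name `reorder` and the placeholder sentence labels are hypothetical. Note that `dst` is interpreted as an index into the list after the unit has been removed.

```python
from typing import List


def reorder(sentences: List[str], src: int, dst: int) -> List[str]:
    """Delete the unit at `src`, then insert it at `dst`.

    `dst` indexes the list as it stands after the deletion.
    """
    unit = sentences[src]
    remaining = sentences[:src] + sentences[src + 1:]
    return remaining[:dst] + [unit] + remaining[dst:]


# Placeholder labels standing in for the sentences of an extract.
extract = ["S1", "S2", "S3", "S4"]
# Move the second sentence (index 1) to the end.
abstract = reorder(extract, 1, len(extract) - 1)
print(abstract)
```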

14 Merging
“Taking information from different units in the extract and presenting it as one unit in the abstract”
All other operations can be used
Large class, most difficult to sub-classify – anything (appropriate) goes!
Best embodies abstracting as opposed to extracting – conciseness
Restructuring of clauses/sentences, punctuation/connectives

15 Merging example
In October 1980 Zuccarelli filed [an expensive] European patent application, covering nine countries including Britain[. … The cost of pushing a European patent through in nine countries is around $10000. The cost of application alone is around $2000 and Zuccarelli has already paid an extra $500 for a further stage of official examination]. (new-sci-B7K-37)

16 Evaluation
Applied guidelines to a different set of extracts
25 human-produced extracts + corresponding abstracts
25 automatically produced extracts + corresponding abstracts
Developed Centering Theory as an evaluation method (evaluation metric) due to unsuitability of existing evaluation methods

17 Centering Theory (CT) (Grosz, Joshi & Weinstein 1995)
Parametric theory of local coherence and salience
Accounts for coherence using repetitions of entities across consecutive utterances
Uses the relationship between repetitions to derive ‘transitions’
Transitions are ordered in preference from most to least coherent
Metric developed to reflect the effect of transitions in summaries
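
How transitions are derived from repeated entities can be sketched as follows. This is a hedged illustration, not the metric developed in this research, which the slides do not specify. It assumes each utterance is already reduced to a list of entities ranked by salience (its forward-looking centers, most salient first). It uses the four-way CONTINUE / RETAIN / SMOOTH-SHIFT / ROUGH-SHIFT split common in the Centering literature, whereas Grosz, Joshi & Weinstein (1995) describe a single shift. The `NO-CB` label for utterance pairs that share no entity, and all function names, are assumptions made for this example.

```python
from typing import List, Optional


def backward_center(prev_cf: List[str], cur_cf: List[str]) -> Optional[str]:
    """Cb of the current utterance: the highest-ranked entity of the
    previous utterance that is also realised in the current one."""
    realised = set(cur_cf)
    for entity in prev_cf:  # prev_cf is ranked, most salient first
        if entity in realised:
            return entity
    return None


def transition(prev_cb: Optional[str], cur_cb: Optional[str],
               cur_cf: List[str]) -> str:
    """Classify one transition from the two backward-looking centers
    and the preferred center (first element of cur_cf)."""
    if cur_cb is None:
        return "NO-CB"  # assumed label: no entity is repeated
    preferred = cur_cf[0]
    # An undefined previous Cb is treated as "unchanged".
    if prev_cb is None or cur_cb == prev_cb:
        return "CONTINUE" if cur_cb == preferred else "RETAIN"
    return "SMOOTH-SHIFT" if cur_cb == preferred else "ROUGH-SHIFT"


def transitions(cf_lists: List[List[str]]) -> List[str]:
    """Label every pair of consecutive utterances in a text."""
    labels = []
    prev_cb = None
    for prev_cf, cur_cf in zip(cf_lists, cf_lists[1:]):
        cur_cb = backward_center(prev_cf, cur_cf)
        labels.append(transition(prev_cb, cur_cb, cur_cf))
        prev_cb = cur_cb
    return labels


# Toy entity lists, invented for illustration only.
print(transitions([["report", "IAEA"],
                   ["report", "conclusion"],
                   ["conclusion", "report"]]))
```

A summary-level score would then need some weighting over these labels. That weighting is left open here because the slides do not describe it.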

18 Evaluation 2
Human judgment obtained to complement CT
Overall, human summary production operations improve texts: CT = 78%; Judge = 82%
Agreement between CT and judge = 70%
Classification and resulting guidelines can be reliably used during post-editing in CAS
CT is useful as an evaluation method

19 Conclusions
Analysis and classification of human summary production operations for CAS (→ guidelines)
Evaluation: applying these operations to extracts results in more coherent/readable abstracts
Guidelines can help CAS system users in their task
Future work
To use more human summarisers/judges to further validate classification/guidelines
To look at scientific texts (also popular in AS)
To further explore CT for evaluation

20 Thank you! Any questions?

