
1 Automatic summarization Dragomir R. Radev University of Michigan radev@umich.edu

2 Outline
– What is summarization
– Genres of summarization (single-doc, multi-doc, query-based, etc.)
– Extractive vs. non-extractive summarization
– Evaluation metrics
– Current systems: Marcu/Knight, MEAD/Lemur, NewsInEssence/NewsBlaster
– What is possible and what is not

3 Goal of summarization
Preserve the "most important information" in a document. Make use of redundancy in text. Maximize information density.
Compression ratio: CR = |S| / |D|
Retention ratio: RR = i(S) / i(D), where S is the summary, D the document, |·| their lengths, and i(·) the information content.
Goal: i(S) / i(D) > |S| / |D|
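A minimal sketch of the two ratios, assuming word-level lengths and, since the slide does not define i(·), approximating information content by the number of distinct content words (an illustrative assumption, not the slide's definition):

```python
# Toy compression and retention ratios for a summary/document pair.
# ASSUMPTION: i(.) approximated by distinct content-word counts.

STOPWORDS = {"the", "a", "an", "of", "in", "is", "to", "and"}

def content_words(text):
    return {w for w in text.lower().split() if w not in STOPWORDS}

def compression_ratio(summary, document):
    return len(summary.split()) / len(document.split())

def retention_ratio(summary, document):
    return len(content_words(summary)) / len(content_words(document))

doc = ("The court ruled on Monday. The ruling was welcomed by activists. "
       "Several appeals are expected in the coming months.")
summ = "The court ruling was welcomed by activists."

cr = compression_ratio(summ, doc)
rr = retention_ratio(summ, doc)
# A good summary retains more information than length: RR > CR.
print(f"CR={cr:.2f}  RR={rr:.2f}  good summary: {rr > cr}")
```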

4 Sentence-extraction-based (SE) summarization
– Classification problem
– Approximation

5 Typical approaches to SE summarization
– Manually selected features: position, overlap with query, cue words, structural information, overlap with centroid
– Reranking: maximal marginal relevance (MMR) [Carbonell & Goldstein 98]
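A sketch of the MMR reranking loop from Carbonell & Goldstein (1998): greedily pick the sentence that balances relevance to the query against redundancy with already-selected sentences. The word-overlap cosine used as sim(·,·) here is an illustrative stand-in, not the paper's similarity function:

```python
import math

def cosine(a, b):
    """Cosine similarity over bag-of-words sets (toy stand-in)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / math.sqrt(len(wa) * len(wb))

def mmr_rerank(candidates, query, lam=0.7, k=3):
    """Greedy MMR: argmax over the pool of
    lam*sim(s, query) - (1-lam)*max_{t in selected} sim(s, t)."""
    selected, pool = [], list(candidates)
    while pool and len(selected) < k:
        def score(s):
            redundancy = max((cosine(s, t) for t in selected), default=0.0)
            return lam * cosine(s, query) - (1 - lam) * redundancy
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected

sents = ["the senate passed the budget bill",
         "the budget bill passed the senate today",
         "protests erupted over the budget cuts"]
print(mmr_rerank(sents, query="budget bill vote", k=2))
```

Note how the redundancy term pushes the second, near-duplicate sentence below the third one once the first is selected.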

6 Non-SE summarization
– Discourse-based [Marcu 97]
– Lexical chains [Barzilay & Elhadad 97]
– Template-based [Radev & McKeown 98]

7 Evaluation metrics
Intrinsic measures:
– Precision, recall
– Kappa
– Relative utility [Radev et al. 00]
– Similarity measures (cosine, overlap, BLEU)
Extrinsic measures:
– Classification accuracy
– Informativeness for question answering
– Relevance correlation

8 Precision and recall
With two judges' sentence extracts scored against each other, one judge's selection serving as the reference for the other:
Precision(J1) = |J1 ∩ J2| / |J1|   Recall(J1) = |J1 ∩ J2| / |J2|
Precision(J2) = |J1 ∩ J2| / |J2|   Recall(J2) = |J1 ∩ J2| / |J1|
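A minimal sketch of this computation, treating each judge's extract as a set of sentence indices (the set representation is my assumption):

```python
def precision_recall(system, reference):
    """P and R for sentence extracts given as sets of sentence ids."""
    hits = len(system & reference)
    p = hits / len(system) if system else 0.0
    r = hits / len(reference) if reference else 0.0
    return p, r

j1 = {0, 2, 5}       # sentences picked by judge 1
j2 = {0, 3, 5, 7}    # sentences picked by judge 2

p1, r1 = precision_recall(j1, j2)   # J1 scored against J2
p2, r2 = precision_recall(j2, j1)   # J2 scored against J1
# By symmetry, P(J1) = R(J2) and R(J1) = P(J2).
print(f"P(J1)={p1:.2f} R(J1)={r1:.2f}  P(J2)={p2:.2f} R(J2)={r2:.2f}")
```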

9 Kappa
N: number of items (index i)
n: number of categories (index j)
k: number of annotators
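The kappa formula itself did not survive the transcript; the N/n/k notation matches Fleiss' kappa, so here is a sketch under that assumption:

```python
def fleiss_kappa(table):
    """Fleiss' kappa. table[i][j] = number of the k annotators who put
    item i into category j; every row must sum to the same k."""
    N = len(table)        # items
    n = len(table[0])     # categories
    k = sum(table[0])     # annotators per item

    # Observed agreement: mean per-item pairwise agreement P_i.
    p_bar = sum(
        (sum(c * c for c in row) - k) / (k * (k - 1)) for row in table
    ) / N

    # Chance agreement from the marginal category proportions.
    p_e = sum((sum(row[j] for row in table) / (N * k)) ** 2
              for j in range(n))

    return (p_bar - p_e) / (1 - p_e)

# 4 sentences, 2 categories ("in summary" / "not"), 3 annotators each.
table = [[3, 0], [2, 1], [0, 3], [1, 2]]
print(f"kappa = {fleiss_kappa(table):.3f}")
```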

10 Similarity measures: cosine, overlap
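The formulas were lost in the transcript; a sketch of the two standard definitions, cosine over word-count vectors and overlap normalized by the smaller set (my reading of the slide):

```python
import math
from collections import Counter

def cosine(x_text, y_text):
    """cos(X, Y) = sum_w x_w * y_w / (||X|| * ||Y||) over word counts."""
    x = Counter(x_text.lower().split())
    y = Counter(y_text.lower().split())
    dot = sum(x[w] * y[w] for w in x)
    nx = math.sqrt(sum(v * v for v in x.values()))
    ny = math.sqrt(sum(v * v for v in y.values()))
    return dot / (nx * ny) if nx and ny else 0.0

def overlap(x_text, y_text):
    """overlap(X, Y) = |X intersect Y| / min(|X|, |Y|) over word sets."""
    x = set(x_text.lower().split())
    y = set(y_text.lower().split())
    return len(x & y) / min(len(x), len(y)) if x and y else 0.0

a = "the cat sat on the mat"
b = "the cat lay on the rug"
print(f"cosine={cosine(a, b):.2f}  overlap={overlap(a, b):.2f}")
```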

11 Relevance correlation (RC): the degree to which document rankings produced from summaries correlate with the rankings produced from the corresponding full documents in a retrieval setting

12 Properties of evaluation metrics

13 Case study
– Multi-document
– News
– User-centered
– NewsInEssence [HLT 01]
– NewsBlaster [HLT 02]

24 Web resources
http://www.summarization.com
http://duc.nist.gov
http://www.newsinessence.com
http://www.clsp.jhu.edu/ws2001/groups/asmd/
http://www.cs.columbia.edu/~jing/summarization.html
http://www.dcs.shef.ac.uk/~gael/alphalist.html
http://www.csi.uottawa.ca/tanka/ts.html
http://www.ics.mq.edu.au/~swan/summarization/

25 Generative probabilistic models for summarization Wessel Kraaij TNO TPD

26 Summarization architecture
What do human summarizers do?
– A: Start from scratch: analyze, transform, synthesize (top-down)
– B: Select material and revise: "cut and paste summarization" (Jing & McKeown 1999)
Automatic systems:
– Extraction: selection of material
– Revision: reduction, combination, syntactic transformation, paraphrasing, generalization, sentence reordering
(Complexity increases along the spectrum from extracts to abstracts.)

27 Required knowledge

28 Examples of generative models in summarization systems
– Sentence selection
– Sentence / document reduction
– Headline generation

29 Ex. 1: Sentence selection
Conroy et al. (DUC 2001): HMM at the sentence level; each state has an associated feature vector (position, length, # content terms). Compute the probability of each sentence being a summary sentence.
Kraaij et al. (DUC 2001): rank sentences according to their posterior probability given a mixture model.
+ Grammaticality is OK
– Lacks aggregation, generalization, multi-document summarization (MDS)
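A toy sketch in the spirit of the second approach: rank sentences by their average log-likelihood under a mixture of a document language model and a background model. The maximum-likelihood unigram estimates and the mixing weight are illustrative assumptions, not the DUC 2001 system:

```python
import math
from collections import Counter

def unigram_lm(text):
    """Maximum-likelihood unigram model (toy; no smoothing)."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def sentence_score(sentence, doc_lm, bg_lm, lam=0.8):
    """Average log-likelihood under lam*P_doc + (1-lam)*P_background."""
    words = sentence.lower().split()
    logp = sum(
        math.log(lam * doc_lm.get(w, 0.0) + (1 - lam) * bg_lm.get(w, 1e-6))
        for w in words
    )
    return logp / len(words)

doc = ("the volcano erupted on tuesday . ash clouds closed the airport . "
       "officials said the eruption may continue for days .")
background = "the a on for of said may days and in it was"

doc_lm, bg_lm = unigram_lm(doc), unigram_lm(background)
sentences = ["ash clouds closed the airport .",
             "officials said the eruption may continue for days .",
             "it was a tuesday ."]
for s in sorted(sentences, key=lambda s: sentence_score(s, doc_lm, bg_lm),
                reverse=True):
    print(f"{sentence_score(s, doc_lm, bg_lm):7.3f}  {s}")
```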

30 Ex. 2: Sentence reduction

31 Knight & Marcu (AAAI 2000)
Compression: delete substrings in an informed way (based on the parse tree)
– Required: PCFG parser, tree-aligned training corpus
– Channel model: probabilistic model for expansion of a parse tree
– Results: much better than the NP baseline
+ Tight control of grammaticality
+ Mimics revision operations by humans
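Knight & Marcu operate on parse trees; as a much simpler stand-in for the noisy-channel idea, this sketch brute-forces word deletions and scores each candidate with a toy bigram source model plus a per-deletion channel cost. All probabilities are hand-set for illustration and bear no relation to the trained models:

```python
import itertools
import math

# Toy bigram "source model"; the real system uses a PCFG over
# compressed parse trees, not a word bigram model.
BIGRAM_LOGP = {
    ("<s>", "the"): -0.5, ("the", "mayor"): -1.0,
    ("mayor", "resigned"): -1.2, ("resigned", "</s>"): -0.5,
    ("the", "embattled"): -3.0, ("embattled", "mayor"): -1.5,
    ("resigned", "yesterday"): -2.0, ("yesterday", "</s>"): -1.0,
}
UNSEEN = -8.0

def source_logp(words):
    seq = ["<s>"] + words + ["</s>"]
    return sum(BIGRAM_LOGP.get(bg, UNSEEN) for bg in zip(seq, seq[1:]))

def channel_logp(short, long_len):
    # Crude channel: each deleted word costs a fixed log-probability.
    return -0.7 * (long_len - len(short))

def compress(sentence):
    words = sentence.split()
    best, best_score = words, -math.inf
    # Enumerate all non-empty subsequences (fine for toy lengths only).
    for r in range(1, len(words) + 1):
        for keep in itertools.combinations(range(len(words)), r):
            cand = [words[i] for i in keep]
            score = source_logp(cand) + channel_logp(cand, len(words))
            if score > best_score:
                best, best_score = cand, score
    return " ".join(best)

print(compress("the embattled mayor resigned yesterday"))
# -> "the mayor resigned": fluent under the source model, short
#    enough that the deletion cost is worth paying.
```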

32 Daumé & Marcu (ACL 2002)
Document compression, noisy channel
– Based on syntactic structure and discourse structure (extension of the Knight & Marcu model)
– Required: discourse and syntactic parsers; a training corpus where EDUs in summaries are aligned with the documents
– Cannot handle interesting document lengths (due to complexity)

33 Ex. 3: Headline generation

34 Berger & Mittal (SIGIR 2000)
Input: web pages (often not running text)
– Trigram language model
– IBM Model 1-like channel model: choose length, draw word from source model and replace with a similar word; independence assumption
– Trained on the Open Directory
+ Non-extractive
– Grammaticality and coherence are disappointing; indicative summaries only

35 Zajic, Dorr & Schwartz (DUC 2002)
Headline generation from a full story: maximize P(S|H)·P(H)
– Channel model based on an HMM consisting of a bigram model of headline words and a unigram model of story words; bigram language model
– Decoding parameters are crucial for producing good results (length, position, strings)
+ Good results in fluency and accuracy
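A toy sketch of the underlying decoding problem: choose a fixed number of story words, in story order, that score best under a headline bigram model. The hand-set log-probabilities and the hard length target stand in for the trained models and decoding parameters described on the slide:

```python
import math
from functools import lru_cache

STORY = ("officials said the storm caused severe "
         "flooding across the coast").split()
TARGET_LEN = 3  # decoding length parameter (crucial per the slide)

# Hand-set headline bigram log-probs; unseen pairs get a floor.
HEADLINE_BIGRAM = {
    ("<s>", "storm"): -1.0, ("<s>", "severe"): -2.0,
    ("storm", "flooding"): -1.4, ("flooding", "coast"): -1.1,
    ("storm", "floods"): -1.2, ("floods", "coast"): -1.0,
    ("coast", "</s>"): -0.5,
}
FLOOR = -6.0

def bigram(a, b):
    return HEADLINE_BIGRAM.get((a, b), FLOOR)

def best_headline():
    """DP over (story position, words emitted so far, previous word)."""
    @lru_cache(maxsize=None)
    def dp(pos, n, prev):
        if n == TARGET_LEN:                 # headline complete
            return bigram(prev, "</s>"), ()
        if pos == len(STORY):               # ran out of story words
            return -math.inf, ()
        skip = dp(pos + 1, n, prev)         # do not use this word
        w = STORY[pos]                      # or emit it into the headline
        tail_score, tail = dp(pos + 1, n + 1, w)
        take = (bigram(prev, w) + tail_score, (w,) + tail)
        return max(skip, take, key=lambda t: t[0])
    score, words = dp(0, 0, "<s>")
    return " ".join(words), score

print(best_headline())  # -> ('storm flooding coast', -4.0)
```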

36 Conclusions
– Fluent headlines are within reach of simple generative models
– High-quality summaries (coverage, grammaticality, coherence) require higher-level symbolic representations
– The cut & paste metaphor divides the work into manageable sub-problems
– The noisy channel method is effective, but not always efficient

37 Open issues
– Audience (user model)
– Types of source documents
– Dealing with redundancy
– Information ordering (e.g., temporal)
– Coherent text
– Cross-lingual summarization (Norbert Fuhr)
– Use summaries to improve IR (or CLIR): relevance correlation
– LMs for text generation
– Possibly not a well-defined problem (low inter-judge agreement)
– Develop models with more linguistic structure
– Develop integrated models, e.g. by using priors (Rosenfeld)
– Build efficient implementations
– Evaluation: define a manageable task

