Event-Centric Summary Generation Lucy Vanderwende, Michele Banko and Arul Menezes One Microsoft Way, WA, USA DUC 2004.

Presentation transcript:

2 Abstract
Our primary interest is twofold:
–To explore an event-centric approach to summarization
–To explore a generation approach to summary realization

3 Introduction
Identifying important events, as opposed to entities
Generation component
–Human-authored summaries rely less on sentence extraction
Graph-scoring algorithm
–To identify the highest-weighted nodes to guide content selection

4 System Description
MSR-NLP
–Analysis component
  Rule-based syntactic analysis component
  Produces a logical form
  –Syntactic variations, word labels
–Generation component
  Syntactic realization component
  Produces a syntactic tree

5 Creating document representations
Cluster sentences
Analyze each sentence to get its logical form

6 Creating document representations
Produce triples from the logical form
–(LFNode_i, rel, LFNode_j)
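As a minimal sketch of the triple representation named on this slide, the block below stores each (LFNode_i, rel, LFNode_j) as a small record. The `Triple` class, the `to_triples` helper and the hand-written `edges` list are hypothetical names for illustration; the edges stand in for the output of a logical-form analyzer, which is not shown here.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One (LFNode_i, rel, LFNode_j) triple read off a logical form."""
    head: str  # LFNode_i, base form of the first node
    rel: str   # semantic relation label, e.g. "Tobj"
    tail: str  # LFNode_j, base form of the second node


def to_triples(edges):
    """Wrap raw (head, rel, tail) tuples in Triple records.

    `edges` is assumed to come from some logical-form analysis step;
    here it is simply a hand-written list.
    """
    return [Triple(*edge) for edge in edges]


# Illustrative input only, using the two example triples from the slides.
edges = [
    ("leave", "Tobj", "London_Bridge_Hospital"),
    ("leave", "Tobj", "government"),
]
triples = to_triples(edges)
```

Keeping the relation label inside the record means the later graph step can count how often each labelled relationship is observed.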

7 Forming Document Graph
Take those triples and join nodes by way of their semantic relations, using a bidirectional link structure
Keep track of how many times each relationship is observed
Stop words are not included in the graph construction
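The graph construction described on this slide could be sketched roughly as follows. This is an assumption-laden illustration, not the authors' implementation: the `build_graph` function, the tiny `STOP_WORDS` set and the "Mod" relation label are all hypothetical, and triples are plain (head, rel, tail) tuples.

```python
from collections import Counter, defaultdict

# Illustrative stop-word list only; the real list is not given in the slides.
STOP_WORDS = {"the", "a", "of"}


def build_graph(triples, stop_words=STOP_WORDS):
    """Join triple nodes with bidirectional links and count each relationship."""
    counts = Counter()             # (head, rel, tail) -> times observed
    neighbors = defaultdict(set)   # node -> nodes linked to it (both directions)
    for head, rel, tail in triples:
        if head in stop_words or tail in stop_words:
            continue               # stop words never enter the graph
        counts[(head, rel, tail)] += 1
        neighbors[head].add(tail)
        neighbors[tail].add(head)
    return counts, neighbors


# Illustrative input: one relationship seen twice, one seen once,
# and one triple dropped because it involves a stop word.
example = [
    ("leave", "Tobj", "London_Bridge_Hospital"),
    ("leave", "Tobj", "London_Bridge_Hospital"),
    ("leave", "Tobj", "government"),
    ("the", "Mod", "hospital"),
]
counts, neighbors = build_graph(example)
```

Storing both directions in `neighbors` is one simple way to get the bidirectional link structure; the per-triple counter keeps the observation frequency separately.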

8

9 Node scoring using PageRank
Using the PageRank algorithm
–Developed for hyperlink structures such as the WWW
–A link between nodes counts as a vote for the linked node

10 Node scoring using PageRank
PageRank framework
–"Pages" correspond to base forms of words in the documents
–"Hyperlinks" correspond to semantic relationships
–Verbs identify events
–Nouns identify entities
–Use events to identify summary content
Typically, the algorithm converges in around 40 iterations
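To make the node-scoring step concrete, here is a minimal sketch of the standard iterative PageRank computation on a bidirectional word graph. The damping factor of 0.85, the convergence tolerance and the unweighted links are assumptions; the slides do not state which settings the system uses, and the `pagerank` function and example graph are hypothetical.

```python
def pagerank(neighbors, damping=0.85, tol=1e-6, max_iter=200):
    """Iterate PageRank on an undirected graph until scores stop changing.

    `neighbors` maps each node to the set of nodes it is linked to; links
    are assumed symmetric, so every node has at least one neighbor.
    Returns the scores and the number of iterations performed.
    """
    nodes = list(neighbors)
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    iteration = 0
    for iteration in range(1, max_iter + 1):
        new = {}
        for v in nodes:
            # Each neighbor u splits its current score evenly among its links.
            votes = sum(score[u] / len(neighbors[u]) for u in neighbors[v])
            new[v] = (1 - damping) / n + damping * votes
        delta = sum(abs(new[v] - score[v]) for v in nodes)
        score = new
        if delta < tol:
            break
    return score, iteration


# Illustrative graph: the verb "leave" linked to three noun nodes.
graph = {
    "leave": {"London_Bridge_Hospital", "government", "patient"},
    "London_Bridge_Hospital": {"leave"},
    "government": {"leave"},
    "patient": {"leave"},
}
scores, iterations = pagerank(graph)
top_node = max(scores, key=scores.get)
```

In this toy graph the verb node collects votes from every noun it is linked to, which is the intuition behind using highly scored verbs to pick out events.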

11 Graph Scoring
Use PageRank scores to assess the link weight LW(i->n)

12 Summary Generation
Summaries are generated by extracting and merging portions of the logical form
–Identify important triples
  Defined as a highly weighted node together with its most highly weighted link
  e.g. (leave, Tobj, London_Bridge_Hospital), not (leave, Tobj, government)
–Extract fragments, divided into "event" and "entity" fragments
  Event fragments are used to generate the summary
  Entity fragments are used to expand upon references to the same entity within the selected event fragment
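The selection of an important triple for a given node might look like the sketch below. The link-weight formula is not recoverable from the transcript, so `link_weight` here is a stand-in assumption (observation count times the scores of both endpoints), not the authors' LW definition; the function names and the numeric scores are likewise invented for illustration.

```python
def link_weight(triple, counts, scores):
    """Hypothetical link weight: observation count times both node scores."""
    head, _rel, tail = triple
    return counts[triple] * scores[head] * scores[tail]


def best_triple_for(node, counts, scores):
    """Return the most highly weighted triple headed by `node`, or None."""
    candidates = [t for t in counts if t[0] == node]
    return max(
        candidates,
        key=lambda t: link_weight(t, counts, scores),
        default=None,
    )


# Illustrative numbers only; they are not results from the system.
counts = {
    ("leave", "Tobj", "London_Bridge_Hospital"): 1,
    ("leave", "Tobj", "government"): 1,
}
scores = {"leave": 0.4, "London_Bridge_Hospital": 0.3, "government": 0.1}
chosen = best_triple_for("leave", counts, scores)
```

With these made-up scores the hospital triple outranks the government triple, mirroring the contrast drawn on the slide.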

13

14 Summary Generation
Event fragment ordering
–Cluster event fragments by the event they refer to
–Choose the fragment with the greatest number of argument nodes for each event
–Order the selected event fragments
  Group sentences referring to the same entity together
  Order sentences that exhibit event coreference
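One possible reading of the fragment selection and ordering steps is sketched below. It assumes each event fragment is an (event, arguments) pair with at least one argument, and it treats the first argument as the entity used for grouping; both are simplifications for illustration rather than details given in the slides, and the function names and sample fragments are hypothetical.

```python
from collections import defaultdict


def select_fragments(fragments):
    """Cluster fragments by event and keep the one with the most arguments."""
    by_event = defaultdict(list)
    for event, args in fragments:
        by_event[event].append((event, args))
    return [max(group, key=lambda f: len(f[1])) for group in by_event.values()]


def order_by_entity(fragments):
    """Place fragments that share a leading entity next to each other.

    Entities are ranked by first appearance; the sort is stable, so
    fragments about the same entity keep their original relative order.
    """
    first_seen = {}
    for _event, args in fragments:
        first_seen.setdefault(args[0], len(first_seen))
    return sorted(fragments, key=lambda f: first_seen[f[1][0]])


# Illustrative fragments only.
fragments = [
    ("leave", ("patient", "London_Bridge_Hospital")),
    ("leave", ("patient",)),
    ("announce", ("government", "decision")),
    ("arrive", ("patient",)),
]
selected = select_fragments(fragments)
ordered = order_by_entity(selected)
```

Here the two "patient" fragments end up adjacent, ahead of the "government" fragment, which approximates grouping sentences that refer to the same entity.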

15 Experiments and Evaluation
(Rule-based pronoun resolution method, 75% accuracy)

16 Experiments and Evaluation
Reason: the potential to introduce disfluent text

17

18 Directions and Future Work
Produce more human-like generated summaries
Further study the impact of anaphora resolution
Study new page-ranking algorithms
While ordering groups event fragments that mention the same entity, we have not yet implemented a system to combine them into larger logical form constructions