Efficiently Computed Lexical Chains As an Intermediate Representation for Automatic Text Summarization. H.G. Silber and K.F. McCoy, University of Delaware.


Efficiently Computed Lexical Chains As an Intermediate Representation for Automatic Text Summarization. H.G. Silber and K.F. McCoy, University of Delaware (Computational Linguistics, Vol. 28, No. 4, 2002)

2/24 Abstract
- We present a linear-time algorithm for lexical chain computation, which makes lexical chains a computationally feasible candidate as an intermediate representation for automatic text summarization.
- A method for evaluating lexical chains as an intermediate step in summarization is also presented and carried out; such an evaluation was not possible before, due to the computational complexity of previous lexical chain algorithms.

3/24 Introduction
- Summarization has been viewed as a two-step process:
  1. Extract important concepts from the source text by building an intermediate representation of some sort.
  2. Use this intermediate representation to generate a summary.
- In this research we concentrate on the first step of summarization and follow Barzilay and Elhadad (1997) in employing lexical chains to extract important concepts from a document.

4/24 Introduction
- Employing lexical chains has faced a number of difficulties:
  - Previous methods for computing lexical chains have been either manual, or automated but with exponential running time.
  - It is not clear how to evaluate this stage independently of the second stage of summarization.

5/24 Introduction: Lexical Chains Described
- Lexical chains can be computed in a source document by grouping sets of words that are semantically related. Two noun instances are related if any of the following holds (a sketch of these relation checks appears below):
  - They are identical and used in the same sense. (The house on the hill is large. The house is made of wood.)
  - They are used in the same sense. (The car is fast. My automobile is faster.)
  - Their senses have a hypernym/hyponym relation between them. (John owns a car. It is a Toyota.)
  - Their senses are siblings in the hypernym/hyponym tree. (The truck is fast. The car is faster.)
- Each noun instance must belong to exactly one lexical chain.
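
These relation checks can be made concrete with a short sketch against WordNet. This is not the authors' implementation: it uses NLTK's WordNet interface, and the helper name relation_type is ours.

```python
# A sketch of the four noun relations using NLTK's WordNet interface.
# Requires: nltk.download('wordnet'). `relation_type` is our own name.
from nltk.corpus import wordnet as wn

def relation_type(sense_a, sense_b):
    """Classify the relation between two noun synsets, if any."""
    if sense_a == sense_b:
        return "identity/synonymy"   # same synset: identical word or synonym
    if sense_b in sense_a.hypernyms() or sense_a in sense_b.hypernyms():
        return "hypernym/hyponym"    # direct parent/child in the noun hierarchy
    if set(sense_a.hypernyms()) & set(sense_b.hypernyms()):
        return "sibling"             # the two senses share a direct hypernym
    return None

car = wn.synsets("car", pos=wn.NOUN)[0]          # car.n.01
auto = wn.synsets("automobile", pos=wn.NOUN)[0]  # also car.n.01
truck = wn.synsets("truck", pos=wn.NOUN)[0]
print(relation_type(car, auto))   # identity/synonymy
print(relation_type(car, truck))  # sibling (both motor vehicles)
```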

6/24 Algorithm Definition

7/24 Algorithm Definition: The First Step
- Processing a document involves creating a large array of "meta-chains". Its size is the number of noun senses in WordNet plus the number of nouns in the document, to handle the possibility of words not found in WordNet.
- A "meta-chain" represents all possible chains that can contain the sense whose number is the index of the array.
- When a noun is encountered, for every sense of the noun in WordNet, the noun sense is placed into every meta-chain for which it has an identity, synonym, or hypernym relation with that sense. (A sketch of this construction follows below.)
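
In outline, the construction might look like the following sketch. The representation is our own simplification: senses are integer indices into a conceptual sense array, and related_senses is a hypothetical stand-in for the WordNet lookups described above.

```python
# A sketch of meta-chain construction, under our own simplifications:
# related_senses(noun) (hypothetical) maps each sense of the noun to
# the sense indices it reaches by identity, synonymy, or hypernymy.
from collections import defaultdict

def build_meta_chains(nouns, related_senses):
    # meta_chains[s]: every (position, noun, sense) that could join
    # the chain anchored at sense index s.
    meta_chains = defaultdict(list)
    for pos, noun in enumerate(nouns):
        for sense, reachable in related_senses(noun).items():
            for target in reachable:
                meta_chains[target].append((pos, noun, sense))
    return meta_chains
```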

8/24 Algorithm Definition

9/24 Algorithm Definition: The Second Step, Finding the Best Interpretation
- For each noun, look at each chain it belongs to and figure out, based on the type of relation and distance factors, which meta-chain the noun contributes to most.
- In the event of a tie, the higher sense number is used.
- The noun is then deleted from all other meta-chains.
- Once all of the nouns have been processed, what is left is the interpretation whose score is maximum. (A sketch of this greedy selection follows below.)
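
A sketch of this greedy step, continuing the assumptions above; contribution is a hypothetical scorer standing in for the relation-type/distance table on the scoring slides.

```python
# Greedy disambiguation over the meta-chains built above.
def best_interpretation(nouns, meta_chains, contribution):
    for pos, noun in enumerate(nouns):
        # Chains this noun instance joined during construction. The
        # paper stores back-pointers in each word object so this
        # lookup is O(1); we scan for brevity.
        candidates = [s for s, members in meta_chains.items()
                      if any(p == pos for p, _, _ in members)]
        if not candidates:
            continue
        # Keep the chain this noun contributes to most; ties break
        # toward the higher sense number, as the slide specifies.
        best = max(candidates,
                   key=lambda s: (contribution(noun, pos, meta_chains[s]), s))
        # Delete the noun from every other meta-chain.
        for s in candidates:
            if s != best:
                meta_chains[s] = [m for m in meta_chains[s] if m[0] != pos]
    return meta_chains
```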

10/24 Algorithm Definition: Scoring System
- Allow different types of relations within a lexical chain to contribute to that chain differently.
- Allow the distance between word instances in the original document to affect each word's contribution to a chain.

11/24 Algorithm Definition

12/24 Algorithm Definition
- If it is the first word in the chain, the "Identical Word" relation score is used. If not, the type of relation is determined, and the closest noun in the chain to which it is related is found.
- Using the distance between these two words and the relation type, we look up this word's contribution to the overall chain score.
- Once chains are computed, we use the notion of a "strong chain" (Barzilay and Elhadad, 1997) to pick some chains as representing the important concepts of the original document. (A sketch of the score lookup follows below.)
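
The lookup could be organized as a small table keyed by relation type and distance bucket. The weights and distance buckets below are placeholders, not the paper's published constants; only the shape of the lookup is taken from the slides.

```python
# A sketch of the score lookup keyed by relation type and distance.
# Weights and buckets are ILLUSTRATIVE placeholders only.
RELATION_SCORES = {
    # relation: (same sentence, within 3 sentences, farther apart)
    "identity/synonymy": (1.0, 1.0, 1.0),
    "hypernym/hyponym":  (1.0, 0.5, 0.5),
    "sibling":           (1.0, 0.3, 0.2),
}

def word_contribution(relation, sentence_gap, first_in_chain=False):
    if first_in_chain:
        return 1.0  # the "Identical Word" score is used for a chain's first word
    bucket = 0 if sentence_gap == 0 else (1 if sentence_gap <= 3 else 2)
    return RELATION_SCORES[relation][bucket]
```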

13/24 Linear Runtime Proof
- The runtime of the full algorithm is O(POS tagging) + O(our algorithm).
- Since it does not change from execution to execution of the algorithm, we take the size and structure of WordNet to be constant.
- We examine each phase of our algorithm to show that the extraction of these lexical chains can indeed be done in linear time.

14/24 Linear Runtime Proof

15/24 Linear Runtime Proof: Collection of WordNet Information
- For each noun in the source document that appears in WordNet, each sense that the word can take must be examined.
- For each sense, we must walk up and down the hypernym/hyponym graph, collecting all parent and child information.
- Lastly, we must collect all of the senses in WordNet that are siblings of the sense being processed.
- The runtime is: n * (log2(C3) + C1*C2 + C1*C5), where the Ci are constants determined by the size and structure of WordNet.

16/24 Linear Runtime Proof: Building the Graph
- The graph of all possible interpretations is nothing more than an array of sense values of size 66025 + n, which we call the sense array.
- For each word, we examine each relation computed from WordNet as above, and for each of these relations we modify the list that is indexed in the sense array.
- Lastly, we add the value of how this word affects the score of the chain, based on the scoring system, to an array stored within the word structure.
- The runtime is: n * C6 * 4.

17/24 Linear Runtime Proof: Extracting the Best Interpretation
- For each word in the source document, we look at each chain to which the word can belong.
- A list of pointers to these chains is stored within the word object, so looking them up takes O(1).
- The runtime for this phase is: n * C6 * 4.

18/24 Linear Runtime Proof: Overall Runtime Performance
- Summing the steps listed above gives the overall runtime:
  n * (log2(94474) + 45474*4) in the worst case
  n * (3256 + log2(94474) + 55*4) in the average case
- While the worst-case constants are quite large, in the average case they are reasonable.
- This algorithm is O(n) in the number of nouns in the source document.
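
Plugging in the slide's own numbers makes the contrast concrete (a quick back-of-the-envelope check, not from the paper):

```python
# Arithmetic on the per-noun constants quoted above.
import math

worst = math.log2(94474) + 45474 * 4        # ~181,913 operations per noun
average = 3256 + math.log2(94474) + 55 * 4  # ~3,493 operations per noun
print(f"worst case:   {worst:,.0f} per noun")
print(f"average case: {average:,.0f} per noun")
```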

19/24 Evaluation Design
- We propose an evaluation of this intermediate stage that is independent of the generation phase of summarization.
- The basis of our evaluation is the premise that if lexical chains are a good intermediate representation, then, given a summary, each noun in the summary should be used in the same sense as some word instance grouped into a strong chain in the original document.
- Moreover, we expect that all (or most) strong chains in the document should be represented in the summary.

20/24 Evaluation Design
- We focus on two general categories of summary:
  - Scientific documents with abstracts: 10 scientific articles (5 computer science, 3 anthropology, 2 biology).
  - Chapters from university-level textbooks that contain chapter summaries: 14 chapters from 10 university textbooks (4 computer science, 6 anthropology, 2 history, 2 economics).

21/24 Evaluation Design
- For each document in the corpus, the document and its summary were analyzed separately to produce lexical chains.
- By comparing the sense numbers of words in each chain in the document with the computed sense of each noun instance in the summary, we can determine whether the summary indeed contains the same "concepts" as indicated by the lexical chains.
- The specific metrics for the analysis are (a sketch of their computation follows below):
  - The number and percentage of strong chains from the original text that are represented in the summary (analogous to recall).
  - The number and percentage of noun instances in the summary that represent strong chains in the document (analogous to precision).
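
A sketch of how these two metrics could be computed, under our own representation (each strong chain as a set of sense indices, the summary as a list of computed noun senses); this is not the paper's code.

```python
# Recall/precision analogues over strong chains and summary senses.
def chain_metrics(strong_chains, summary_senses):
    summary = set(summary_senses)
    # Recall analogue: strong chains with at least one sense that
    # also appears in the summary.
    covered = sum(1 for chain in strong_chains if chain & summary)
    recall = covered / len(strong_chains) if strong_chains else 0.0
    # Precision analogue: summary noun instances whose sense falls in
    # some strong chain of the document.
    in_chain = sum(1 for s in summary_senses
                   if any(s in chain for chain in strong_chains))
    precision = in_chain / len(summary_senses) if summary_senses else 0.0
    return recall, precision
```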

22/24 Evaluation Design
- They contain a great number of pronouns and other anaphoric expressions.

23/24 Discussion and Future Work
- Proper nouns are often used in naturally occurring text, but we have no information about them.
- Anaphora resolution is a bigger issue; much better results are anticipated once anaphora resolution is added to the system.
- Other issues that may affect the results stem from WordNet's coverage and the semantic information it captures.
- Issues regarding generation of a summary based on lexical chains need to be addressed and are the subject of our current work.

24/24 Conclusions
- In this paper, we have outlined an efficient, linear-time algorithm for computing lexical chains as an intermediate representation for automatic text summarization.
- In addition, we have presented a method for evaluating lexical chains as an intermediate representation and have applied it to 24 documents. The results of this evaluation are promising.