Comparing Information Extraction Pattern Models Mark Stevenson and Mark A. Greenwood Natural Language Processing Group University of Sheffield, UK.

Presentation transcript:

Information Extraction Patterns
A popular approach to Information Extraction uses lexico-syntactic patterns which match text and identify items of interest.
Several recent approaches have been based on extraction patterns derived from dependency parses.
Unsupervised approaches to learning extraction patterns extract all possible patterns and then try to identify the useful ones.

“Microsoft, forced to recruit after Adams unexpectedly resigned, last week hired Boor as interim replacement.”
[Figure: dependency parse of the sentence. Nodes: hire/V, Microsoft/N, Boor/N, resign/V, Adams/N, unexpectedly/R, force/V, recruit/N, last/J, week/N, replacement/N, an/DT, interim/J. Edge labels: nsubj, nobj, as, to, after, partmod, dep, det, amod.]

“Microsoft, forced to recruit after Adams unexpectedly resigned, last week hired Boor as interim replacement.”
[Figure: the same parse reduced to the nodes hire/V, Microsoft/N, Boor/N, resign/V and Adams/N, with edge labels nsubj, nobj and nsubj.]

Predicate Argument Model
A pattern consists of a subject-verb-object tuple (Yangarber, 2003; Stevenson and Greenwood, 2005).
[Figure: example dependency tree with nodes hire/V, IBM/N, Smith/N, resign/V, Jones/N and edge labels nsubj, nobj, after, nsubj.]

Chain Model
Extraction patterns are chain-shaped paths in the dependency tree, rooted at a verb (Sudo et al., 2001; Sudo et al., 2003).
[Figure: example dependency tree with nodes hire/V, IBM/N, Smith/N, resign/V, Jones/N and edge labels nsubj, nobj, after, nsubj.]

Linked Chain Model
Patterns are chains, or any pair of chains sharing their root (Greenwood et al., 2005).
[Figure: example dependency tree with nodes hire/V, IBM/N, Smith/N, resign/V, Jones/N and edge labels nsubj, nobj, after, nsubj.]

Subtree Model
Patterns are any subtree of the dependency tree (Sudo et al., 2003).
By definition, this model contains all the patterns proposed by the previous two models.
[Figure: example dependency tree with nodes hire/V, IBM/N, Smith/N, resign/V, Jones/N and edge labels nsubj, nobj, after, nsubj.]

Comparing Models
The models identify different parts of a sentence.
– “Smith joined Acme Inc. as CEO”
– The SVO model identifies the link between “Smith” and “Acme Inc.”
– The chain model identifies the link between “Acme Inc.” and “CEO”
– The linked chain and subtree models could identify both links
But there is a price to be paid.
– The models generate different numbers of patterns for a given dependency tree
– More patterns probably require more memory and processing
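To make the difference between the models concrete, here is a minimal Python sketch that enumerates SVO tuples, chains and linked chains from a toy tree shaped like the one in the slides' figure. The edge attachments in `TREE`, the function names, and the reading of a linked chain as "two chains from the same verb that leave through different children" are assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations

# Toy dependency tree in the style of the slides' figure:
# parent -> [(relation, child)]. Edge attachments are assumed.
TREE = {
    "hire/V": [("nsubj", "IBM/N"), ("nobj", "Smith/N"), ("after", "resign/V")],
    "resign/V": [("nsubj", "Jones/N")],
}

def verbs(tree):
    """Return every node tagged /V, whether it appears as parent or child."""
    nodes = set(tree)
    for edges in tree.values():
        nodes.update(child for _rel, child in edges)
    return sorted(n for n in nodes if n.endswith("/V"))

def svo_patterns(tree):
    """One (subject, verb, object) tuple per verb; missing slots are None."""
    out = []
    for v in verbs(tree):
        rels = {rel: child for rel, child in tree.get(v, [])}
        out.append((rels.get("nsubj"), v, rels.get("nobj")))
    return out

def paths_from(tree, node):
    """Yield every downward path from `node` that has at least one edge."""
    for _rel, child in tree.get(node, []):
        yield (node, child)
        for tail in paths_from(tree, child):
            yield (node,) + tail

def chains(tree):
    """Chains: downward paths rooted at a verb."""
    return [p for v in verbs(tree) for p in paths_from(tree, v)]

def linked_chains(tree):
    """Pairs of chains sharing a verb root but leaving via different children."""
    pairs = []
    for v in verbs(tree):
        for a, b in combinations(list(paths_from(tree, v)), 2):
            if a[1] != b[1]:
                pairs.append((a, b))
    return pairs

if __name__ == "__main__":
    print(svo_patterns(TREE))
    print(len(chains(TREE)), "chains")
    print(len(linked_chains(TREE)), "linked chain pairs")
```

Even on this five-node tree the richer models already produce more patterns than the SVO model, which is the price referred to above.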

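Model Complexity
Let T be a dependency tree consisting of N nodes, and let V be the set of verb nodes in T.
Let d(v) be the number of nodes in the subtree rooted at v ∈ V, i.e. v itself plus its descendants.
[The two equations on this slide were not preserved. Their surviving complexity labels read: “Linear” and “Linear, polynomial in worst case”.]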
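The slide's equations are missing from this transcript, so the following Python sketch only illustrates how counts could be derived from d(v) under two stated assumptions: the SVO model yields at most one tuple per verb, and the chain model yields one chain per (verb, descendant) pair, i.e. d(v) − 1 chains per verb. The function names and the toy trees are hypothetical.

```python
def subtree_size(children, node):
    """d(node): the node itself plus all of its descendants."""
    return 1 + sum(subtree_size(children, c) for c in children.get(node, []))

def count_svo(verbs):
    """Assumed SVO count: one tuple per verb, so linear in the tree size."""
    return len(verbs)

def count_chains(children, verbs):
    """Assumed chain count: one chain per (verb, descendant) pair."""
    return sum(subtree_size(children, v) - 1 for v in verbs)

def nested_verbs(n):
    """Degenerate tree of n verbs, each the only child of the previous one."""
    names = [f"v{i}" for i in range(n)]
    children = {names[i]: [names[i + 1]] for i in range(n - 1)}
    return children, names

if __name__ == "__main__":
    toy = {"hire": ["IBM", "Smith", "resign"], "resign": ["Jones"]}
    toy_verbs = ["hire", "resign"]
    print(count_svo(toy_verbs), count_chains(toy, toy_verbs))

    # With n nested verbs the chain count is n(n-1)/2, which is quadratic.
    children, names = nested_verbs(6)
    print(count_chains(children, names))
```

Under these assumptions the chain count grows linearly when verbs are few and shallow, and quadratically in the degenerate case where every node is a verb nested under the previous one, which is consistent with the "linear, polynomial in worst case" label.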

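Let $C(v)$ denote the set of child nodes of a verb $v$ and $c_i$ its $i$-th child, so that $C(v) = \{c_1, c_2, \ldots, c_{|C(v)|}\}$.
The number of subtrees can be defined recursively.
[The equations on this slide were not preserved. Their surviving complexity labels read: “Polynomial” and “Exponential”.]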
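Since the slide's formulas did not survive, the sketch below uses a standard recursion rather than the authors' own definition: the number of connected subtrees rooted at a node n is the product over its children c of (1 + the number rooted at c), and the total is the sum over all nodes. It also counts linked chain pairs per verb as the sum over child pairs i < j of d(c_i)·d(c_j), which assumes the two chains must leave the verb through different children. All names are hypothetical.

```python
from itertools import combinations
from math import prod

def subtree_size(children, node):
    """d(node): the node itself plus all of its descendants."""
    return 1 + sum(subtree_size(children, c) for c in children.get(node, []))

def rooted_subtrees(children, node):
    """Connected subtrees whose topmost node is `node` (a leaf gives 1)."""
    return prod(1 + rooted_subtrees(children, c) for c in children.get(node, []))

def all_nodes(children, root):
    """Yield every node reachable from `root`."""
    yield root
    for c in children.get(root, []):
        yield from all_nodes(children, c)

def count_subtrees(children, root):
    """Total connected subtrees, counting single nodes as subtrees."""
    return sum(rooted_subtrees(children, n) for n in all_nodes(children, root))

def count_linked_chain_pairs(children, verbs):
    """Pairs of chains from the same verb that pass through different children."""
    total = 0
    for v in verbs:
        for a, b in combinations(children.get(v, []), 2):
            total += subtree_size(children, a) * subtree_size(children, b)
    return total

if __name__ == "__main__":
    toy = {"hire": ["IBM", "Smith", "resign"], "resign": ["Jones"]}
    print(count_linked_chain_pairs(toy, ["hire", "resign"]))
    print(count_subtrees(toy, "hire"))

    # A root with k leaf children has 2**k subtrees rooted at the root alone.
    star = {"root": [f"leaf{i}" for i in range(10)]}
    print(count_subtrees(star, "root"))
```

The star-shaped tree shows why the subtree count is exponential in the number of children, while the pairwise product keeps the linked chain count polynomial.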

Experiments
The aim is to identify how well each pattern model captures the relations occurring in an IE corpus.
Extract patterns from a parsed corpus and, for each model, check whether its patterns contain the items participating in the relation.
We do NOT attempt to extract the relations, only to determine whether they can be represented.

Corpora
Used corpora representing two extraction tasks
– Management succession, e.g. “Stevens succeeds Fred Casey who retired from the OCC in June”
– Various biomedical texts, e.g. “Expression of sigma(K)-dependent cwlH gene depended on gerE”

Parsers
1. MINIPAR
2. Machinese Syntax Parser
3. Stanford Parser
Parser            SVO     Chains   Linked Chains   Subtrees
Minipar           2,980   52,659   149,504         353,778,240,702,149,000
Machinese Syntax  2,382   67,690   265,631         4,641,825,924
Stanford          2,950   76,620   478,643         1,696,259,251,073

Evaluating Expressivity
Coverage: the proportion of relations in the corpus for which there exists a pattern that includes both items participating in that relation.
Analysis showed that the parsers often failed to generate a parse that included all the words in the sentence.
– For some relations it may be impossible to generate a pattern that covers them.
– No model can outperform the subtree model.
Bounded coverage: the proportion of those relations in the corpus that can be represented (given a dependency parse) for which there exists a pattern that includes both participating items.
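The two measures can be sketched in Python as follows. The data layout (`relation`, `patterns`, `parsed_nodes`) is hypothetical, and the sketch assumes a relation counts as representable when both of its items appear somewhere in the parse; the slides do not spell out that test.

```python
def is_covered(example):
    """True if some pattern contains both items of the relation."""
    a, b = example["relation"]
    return any(a in p and b in p for p in example["patterns"])

def is_representable(example):
    """Assumed test: both items of the relation appear in the parse."""
    a, b = example["relation"]
    return a in example["parsed_nodes"] and b in example["parsed_nodes"]

def coverage(examples):
    """Covered relations as a fraction of all relations."""
    if not examples:
        return 0.0
    return sum(is_covered(e) for e in examples) / len(examples)

def bounded_coverage(examples):
    """Covered relations as a fraction of the representable ones only."""
    representable = [e for e in examples if is_representable(e)]
    if not representable:
        return 0.0
    return sum(is_covered(e) for e in representable) / len(representable)

if __name__ == "__main__":
    examples = [
        {"relation": ("Smith", "Acme"), "parsed_nodes": {"Smith", "join", "Acme"},
         "patterns": [{"Smith", "join", "Acme"}]},
        {"relation": ("Acme", "CEO"), "parsed_nodes": {"join", "Acme", "CEO"},
         "patterns": [{"join", "Acme", "CEO"}]},
        {"relation": ("Jones", "IBM"), "parsed_nodes": {"Jones", "resign", "IBM"},
         "patterns": [{"Jones", "resign"}]},   # representable but not covered
        {"relation": ("Boor", "Microsoft"), "parsed_nodes": {"Boor", "hire"},
         "patterns": [{"Boor", "hire"}]},      # parser dropped "Microsoft"
    ]
    print(coverage(examples), bounded_coverage(examples))
```

Bounded coverage removes relations that no pattern could capture because of parser failures, so it isolates the expressivity of the pattern model from the quality of the parse.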

Management Succession Results
[Table: coverage (%) for SVO, chains, linked chains and subtrees, and bounded coverage (%) for SVO, chains and linked chains, for each parser (MINIPAR, Machinese Syntax, Stanford); the values were not preserved.]
SVO and chains do not cover many of the relations.
The subtree and linked chain models have roughly the same coverage.

Biomedical Results
[Table: coverage (%) for SVO, chains, linked chains and subtrees, and bounded coverage (%) for SVO, chains and linked chains, for each parser (MINIPAR, Machinese Syntax, Stanford); the values were not preserved.]
There is more difference between linked chains and the simpler models on biomedical text.
SVO and chains consistently perform badly; linked chains do well.

Bounded coverage results for all models are lower on the biomedical corpora.
– Parsers are not generally well adapted to deal with these sorts of text; more parsing errors?
– Nominalisations appear more common in these texts, e.g. “the DNA-dependent assembly of regulon into rings”
[Figure: dependency tree with nodes assembly/N, dependent/A, regulon/N, rings/N, DNA/N.]

Results Summary
Average coverage for each pattern model over all texts
No statistically significant difference between (1) SVO and chains or (2) linked chains and subtrees

Summary
Comparison of four models for Information Extraction patterns based on dependency trees
– Trade-off between pattern complexity and tractability
The linked chain model performs well
– But it may have problems with certain linguistic constructions (such as nominalisations)