1 I256: Applied Natural Language Processing Marti Hearst Nov 13, 2006

2 Today: Automating Lexicon Construction

3 PMI (Turney 2001)
Pointwise Mutual Information, posed as an alternative to LSA
score(choice_i) = log2( p(problem & choice_i) / (p(problem) * p(choice_i)) )
With various assumptions, this simplifies to:
score(choice_i) = p(problem & choice_i) / p(choice_i)
Turney experiments with 4 ways of computing this from search-engine hit counts, e.g.:
score1(choice_i) = hits(problem AND choice_i) / hits(choice_i)
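A minimal Python sketch of these scores, assuming a hits() lookup that stands in for a search-engine hit count (Turney queried a real web search engine; the counts below are made up purely for illustration):

```python
import math

def hits(query):
    """Hypothetical stand-in for a search-engine hit count (made-up numbers)."""
    fake_counts = {"problem": 1200, "choice": 800, "problem AND choice": 300}
    return fake_counts.get(query, 1)

def pmi_score(problem, choice, total_docs=10_000):
    """score = log2( p(problem & choice) / (p(problem) * p(choice)) )."""
    p_joint = hits(f"{problem} AND {choice}") / total_docs
    p_problem = hits(problem) / total_docs
    p_choice = hits(choice) / total_docs
    return math.log2(p_joint / (p_problem * p_choice))

def score1(problem, choice):
    """The simplified score1 = hits(problem AND choice) / hits(choice)."""
    return hits(f"{problem} AND {choice}") / hits(choice)

print(pmi_score("problem", "choice"), score1("problem", "choice"))
```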

4 Dependency Parser (Lin 98)
A syntactic parser that emphasizes dependency relationships between lexical items.
Example sentences: "Alice is the author of the book." / "The book is written by Alice."
[Figure: dependency diagrams of the two sentences, with relation labels such as s, pred, mod, det, be, by-subj, and pcomp-n. Illustration by Bengi Mizrahi]
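The relation labels above come from Lin's Minipar parser. As a rough illustration of the same idea with a tool available today (not Minipar, and with a different label set), here is a spaCy sketch that prints each token's labeled dependency on its head, assuming the en_core_web_sm model is installed:

```python
import spacy

# Assumes the small English model is installed: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

for sentence in ["Alice is the author of the book.", "The book is written by Alice."]:
    doc = nlp(sentence)
    for token in doc:
        # Each token depends on a head token via a labeled relation.
        print(f"{token.text:>10}  --{token.dep_}-->  {token.head.text}")
    print()
```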

5 Automating Lexicon Construction

6 Slide adapted from Manning & Raghavan What is a Lexicon?
A database of the vocabulary of a particular domain (or a language)
More than a list of words/phrases
Usually some linguistic information
–Morphology (manag-e/es/ing/ed → manage)
–Syntactic patterns (transitivity etc.)
Often some semantic information
–Is-a hierarchy
–Synonymy
–Numbers converted to a normal form: Four → 4
–Dates converted to a normal form
–Alternative names converted to an explicit form: Mr. Carr, Tyler, Presenter → Tyler Carr

7 Slide adapted from Manning & Raghavan Lexica in Text Mining
Many text mining tasks require named entity recognition, and named entity recognition requires a lexicon in most cases.
Example 1: Question answering
–"Where is Mount Everest?"
–A list of geographic locations increases accuracy
Example 2: Information extraction
–Consider scraping book data from amazon.com; the template contains a field "publisher"
–A list of publishers increases accuracy
Manual construction is expensive: thousands of person-hours!
Sometimes an unstructured inventory is sufficient; often you need more structure, e.g., a hierarchy

8 Semantic Relation Detection
Goal: automatically augment a lexical database
Many potential relation types:
–ISA (hypernymy/hyponymy)
–Part-Of (meronymy)
Idea: find unambiguous contexts which (nearly) always indicate the relation of interest

9 Lexico-Syntactic Patterns (Hearst 92)

10 Lexico-Syntactic Patterns (Hearst 92)

11 Adding a New Relation

12 Automating Semantic Relation Detection
Lexico-syntactic patterns:
–Should occur frequently in text
–Should (nearly) always suggest the relation of interest
–Should be recognizable with little pre-encoded knowledge
These patterns have been used extensively by other researchers.
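As a hedged illustration (not code from the lecture), one Hearst-style pattern, "X such as A, B, and C", can be approximated with a regular expression. The regex below is a deliberately rough toy, not the grammar from Hearst (1992):

```python
import re

# One Hearst-style pattern: "X such as A, B(,) and C" suggests A, B, C are kinds of X.
SUCH_AS = re.compile(r"(\w+) such as ([\w ,]+?)(?:\.|$)")

def hyponym_pairs(sentence):
    """Yield (hyponym, hypernym) pairs suggested by the 'such as' pattern (toy version)."""
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group(1)
        # Split the enumeration on commas and a final "and"/"or", dropping determiners.
        for item in re.split(r",\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+", match.group(2)):
            item = re.sub(r"^(?:the|a|an)\s+", "", item.strip())
            if item:
                yield item, hypernym

print(list(hyponym_pairs("He plays instruments such as the guitar, violin, and cello.")))
# [('guitar', 'instruments'), ('violin', 'instruments'), ('cello', 'instruments')]
```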

13 Slide adapted from Manning & Raghavan Lexicon Construction (Riloff 93)
Attempt 1: iterative expansion of a phrase list
1. Start with a large text corpus and a list of seed words
2. Identify "good" seed word contexts
3. Collect close nouns in those contexts
4. Compute confidence scores for the nouns
5. Iteratively add high-confidence nouns to the seed word list; go to 2
Output: a ranked list of candidates (a toy sketch of this loop follows below)
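A toy sketch of this loop, under simplifying assumptions that are mine rather than Riloff's: a "good" context is simply any sentence containing a current lexicon word, a candidate is any non-stopword token in such a context, and the confidence score is raw co-occurrence frequency:

```python
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "or", "of", "in", "they", "use", "other"}

def expand_lexicon(sentences, seed_words, rounds=3, top_k=5):
    """Grow a seed list by iterating over co-occurrence contexts (toy confidence = frequency)."""
    lexicon = {w.lower() for w in seed_words}
    for _ in range(rounds):
        scores = Counter()
        for sentence in sentences:
            tokens = [t.strip(".,").lower() for t in sentence.split()]
            if lexicon & set(tokens):                       # a "good" context: contains a lexicon word
                for token in tokens:
                    if token.isalpha() and token not in lexicon and token not in STOPWORDS:
                        scores[token] += 1                  # toy confidence score
        lexicon.update(word for word, _ in scores.most_common(top_k))
    return lexicon

corpus = ["They use TNT and other explosives.", "The rebels fired rockets and a missile."]
print(sorted(expand_lexicon(corpus, ["bomb", "dynamite", "explosives"])))
# ['bomb', 'dynamite', 'explosives', 'tnt']
```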

14 Slide adapted from Manning & Raghavan Lexicon Construction: Example
Category: weapon
Seed words: bomb, dynamite, explosives
Context: ⟨…⟩ and ⟨…⟩
Iterate:
–Context: "They use TNT and other explosives."
–Add word: TNT
Other words added by the algorithm: rockets, bombs, missile, arms, bullets

15 Slide adapted from Manning & Raghavan Lexicon Construction: Attempt 2
Multilevel bootstrapping (Riloff and Jones 1999)
Generate two data structures in parallel:
–The lexicon
–A list of extraction patterns
Input as before:
–Corpus (not annotated)
–List of seed words

16 Slide adapted from Manning & Raghavan Multilevel Bootstrapping
Initial lexicon: the seed words
Level 1: mutual bootstrapping
–Extraction patterns are learned from lexicon entries
–New lexicon entries are learned from extraction patterns
–Iterate
Level 2: filter the lexicon
–Retain only the most reliable lexicon entries
–Go back to level 1
The two-level scheme performs better than level 1 alone (see the skeleton below).
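A structural skeleton of the two levels. Only the control flow follows the slide; the three helper callables (learn_patterns, extract_entries, score_entry) are hypothetical placeholders to be supplied by the caller:

```python
def multilevel_bootstrap(corpus, seeds, learn_patterns, extract_entries, score_entry,
                         outer_rounds=5, inner_rounds=3, keep_top=5):
    """Two-level bootstrapping: mutual bootstrapping inside, a reliability filter outside.

    learn_patterns(corpus, lexicon)      -> candidate extraction patterns
    extract_entries(corpus, patterns)    -> candidate lexicon entries
    score_entry(entry, patterns, corpus) -> reliability score for a candidate entry
    """
    lexicon = set(seeds)
    for _ in range(outer_rounds):
        temp_lexicon = set(lexicon)
        patterns = []
        # Level 1: mutual bootstrapping between patterns and lexicon entries.
        for _ in range(inner_rounds):
            patterns = learn_patterns(corpus, temp_lexicon)
            temp_lexicon |= set(extract_entries(corpus, patterns))
        # Level 2: keep only the most reliable new entries, then start over.
        new_entries = sorted(temp_lexicon - lexicon,
                             key=lambda e: score_entry(e, patterns, corpus), reverse=True)
        lexicon |= set(new_entries[:keep_top])
    return lexicon
```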

17 Slide adapted from Manning & Raghavan Scoring of Patterns
Example concept: company; example pattern: "owned by"
Patterns are scored as follows:
score(pattern) = (F/N) * log(F)
–F = number of unique lexicon entries produced by the pattern
–N = total number of unique phrases produced by the pattern
Selects for patterns that are
–Selective (the F/N part)
–High-yield (the log(F) part)
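The pattern score is straightforward to compute; the slide does not state the base of the logarithm, so base 2 is assumed here:

```python
import math

def pattern_score(f_unique_lexicon_hits, n_unique_phrases):
    """(F / N) * log2(F): F = unique lexicon entries the pattern produced,
    N = total unique phrases it produced."""
    F, N = f_unique_lexicon_hits, n_unique_phrases
    if F == 0 or N == 0:
        return 0.0
    return (F / N) * math.log2(F)

print(pattern_score(10, 40))   # ~0.83: fairly selective and a decent yield
print(pattern_score(2, 40))    # 0.05: the low yield drags the score down
```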

18 Slide adapted from Manning & Raghavan Scoring of Noun Phrases
Noun phrases are scored as follows:
score(NP) = sum_k (1 + 0.01 * score(pattern_k))
where the sum runs over all patterns that fire for the NP
–The main criterion is the number of independent patterns that fire for this NP
–NPs found by high-confidence patterns get a higher score
Example:
–New candidate phrase: boeing
–Occurs in: "owned by", "sold to", "offices of"
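A small sketch of the noun-phrase score with the constants shown above; the pattern scores in the example are made-up numbers:

```python
def np_score(firing_pattern_scores):
    """Each pattern that fires for the NP contributes 1 plus a small bonus tied to its
    own score, so the number of independent firing patterns dominates the total."""
    return sum(1 + 0.01 * score for score in firing_pattern_scores)

# "boeing" extracted by three patterns ("owned by", "sold to", "offices of"):
print(np_score([0.83, 0.61, 0.42]))   # ~3.02: dominated by the count of patterns
```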

19 Slide adapted from Manning & Raghavan Shallow Parsing
Shallow parsing is needed
–For identifying noun phrases and their heads
–For generating extraction patterns
–For scoring: when are two noun phrases the same?
Head phrase matching: X matches Y if X is the rightmost substring of Y
–"New Zealand" matches "Eastern New Zealand"
–"New Zealand cheese" does not match "New Zealand"
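The head-phrase matching rule translates directly into a word-level check:

```python
def head_matches(x, y):
    """True if phrase x is the rightmost (word-aligned) substring of phrase y."""
    x_words, y_words = x.lower().split(), y.lower().split()
    return 0 < len(x_words) <= len(y_words) and y_words[-len(x_words):] == x_words

print(head_matches("New Zealand", "Eastern New Zealand"))   # True
print(head_matches("New Zealand cheese", "New Zealand"))    # False
```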

20 Slide adapted from Manning & Raghavan Seed Words

21 Slide adapted from Manning & Raghavan Mutual Bootstrapping

22 Slide adapted from Manning & Raghavan Extraction Patterns

23 Slide adapted from Manning & Raghavan Level 1: Mutual Bootstrapping
Drift can occur: it only takes one bad apple to spoil the barrel (example: "head").
Introduce level-2 bootstrapping to prevent drift.

24 Slide adapted from Manning & Raghavan Level 2: Meta-Bootstrapping

25 Slide adapted from Manning & Raghavan Evaluation

26 Slide adapted from Manning & Raghavan CoTraining (Collins & Singer 99)
A similar back-and-forth between an extraction algorithm and a lexicon
New: they use word-internal features
–Is the word all caps? (IBM)
–Is the word all caps with at least one period? (N.Y.)
–Does it contain a non-alphabetic character? (AT&T)
–The constituent words of the phrase ("Bill" is a feature of the phrase "Bill Clinton")
Classification formalism: decision lists
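A hedged sketch of such word-internal (spelling) features; the feature names and exact definitions below are illustrative, not Collins and Singer's originals:

```python
import re

def spelling_features(phrase):
    """Word-internal (spelling) features in the spirit of Collins & Singer (1999)."""
    features = set()
    if phrase.isalpha() and phrase.isupper():
        features.add("all_caps")                        # e.g. IBM
    if "." in phrase and re.fullmatch(r"[A-Z.]+", phrase):
        features.add("all_caps_with_periods")           # e.g. N.Y.
    if re.search(r"[^A-Za-z.\s]", phrase):
        features.add("contains_nonalpha_char")          # e.g. AT&T
    for word in phrase.split():
        features.add(f"contains_word={word.lower()}")   # e.g. "bill" for "Bill Clinton"
    return features

print(spelling_features("Bill Clinton"))
print(spelling_features("N.Y."))
print(spelling_features("AT&T"))
```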

27 Slide adapted from Manning & Raghavan Collins & Singer: Seed Words Note that the categories are more generic than in the case of Riloff/Jones.

28 Slide adapted from Manning & Raghavan Collins & Singer: Algorithm
1. Train decision rules on the current lexicon (initially: the seed words). Result: a new set of decision rules.
2. Apply the decision rules to the training set. Result: a new lexicon.
3. Repeat

29 Slide adapted from Manning & Raghavan Collins & Singer: Results Per-token evaluation?

30 More Recent Work
–KnowItAll system at U. Washington
–WebFountain project at IBM

31 Slide adapted from Manning & Raghavan Lexica: Limitations
Named entity recognition is more than lookup in a list.
–Linguistic variation: manage, manages, managed, managing
–Non-linguistic variation: human gene MYH6 in the lexicon, MYH7 in the text
–Ambiguity: what if a phrase has two different semantic classes? (Bioinformatics example: gene/protein metonymy)

32 Slide adapted from Manning & Raghavan Discussion
–Partial resources are often available; e.g., you have a gazetteer and want to extend it to a new geographic area.
–Some manual post-editing is necessary for high quality.
–Semi-automated approaches offer good coverage with much-reduced human effort.
–Drift is not a problem in practice if there is a human in the loop anyway.
–An approach that can deal with diverse evidence is preferable.
–Hand-crafted features (the period in "N.Y.") help a lot.