1 I256: Applied Natural Language Processing Marti Hearst Sept 18, 2006

2 Why do puns make us groan? He drove his expensive car into a tree and found out how the Mercedes bends. Isn't the Grand Canyon just gorges?

3 Why do puns make us groan? Time flies like an arrow. Fruit flies like a banana.

4 Predicting Next Words One reason puns make us groan is that they play on our assumptions about what the next word will be. They also exploit: homonymy – same sound, different spelling and meaning (bends, Benz; gorges, gorgeous); polysemy – same spelling, different meanings.

5 Review: ConditionalFreqDist() Data Structure
A CFD is a collection of FreqDist() objects, indexed by the "condition" being tested or compared.
Initialize a new one:
  cfd = ConditionalFreqDist()
Add a count:
  cfd['austen-emma'].inc('she')
  cfd['austen-pride'].inc('she')
  cfd['austen-pride'].inc('he')
Access each FreqDist object by indexing on its condition:
  cfd['austen-emma'].samples()  # ['she']
  cfd['austen-pride'].N()       # 2
Get a list of the conditions from the cfd object:
  cfd.conditions()  # ['austen-emma', 'austen-pride']
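In a current NLTK release the same data structure is nltk.ConditionalFreqDist, but the .inc(), .samples(), and similar nltk_lite methods are gone; a minimal sketch of the equivalent calls (the condition names are just labels here):

  from nltk.probability import ConditionalFreqDist

  cfd = ConditionalFreqDist()
  cfd['austen-emma']['she'] += 1    # indexing replaces the old .inc() call
  cfd['austen-pride']['she'] += 1
  cfd['austen-pride']['he'] += 1

  print(list(cfd['austen-emma']))   # ['she']  (replaces .samples())
  print(cfd['austen-pride'].N())    # 2
  print(cfd.conditions())           # the conditions seen so far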

6 Computing Next Words

7 Computing Bigrams via Storing Adjacent Word Counts
cfd = ConditionalFreqDist()
prev = None
for word in sentence.split(" "):
    cfd[prev].inc(word.lower())
    prev = word.lower()
Trace for sentence = "The dog ate the crab":
  prev = None, word = "the"
  prev = "the", word = "dog"
  prev = "dog", word = "ate"
  prev = "ate", word = "the"
  prev = "the", word = "crab"
cfd['the'].samples()  # ['dog', 'crab']
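The same adjacency-counting idea scales up to a whole corpus. A minimal sketch with the current NLTK API, assuming the Brown corpus has been downloaded via nltk.download('brown'); the choice of the news category is arbitrary:

  import nltk
  from nltk.corpus import brown

  # For each word, count which words follow it in the news portion of Brown.
  words = [w.lower() for w in brown.words(categories='news')]
  cfd = nltk.ConditionalFreqDist()
  for prev, word in nltk.bigrams(words):
      cfd[prev][word] += 1

  print(cfd['the'].most_common(5))   # the five most frequent words seen after "the"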

8 Auto-generate a Story How to fix this? Use a random number generator.

9 Auto-generate a Story The choice() function chooses one item randomly from a list (from random import *)
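Putting the pieces together, a minimal sketch of story generation with the current NLTK API; the seed word, story length, and corpus category are arbitrary choices, not from the original slides:

  import random
  import nltk
  from nltk.corpus import brown

  words = [w.lower() for w in brown.words(categories='romance')]
  cfd = nltk.ConditionalFreqDist(nltk.bigrams(words))

  def generate_story(cfd, word='she', length=20):
      # Repeatedly pick a random word from those observed after the current word.
      story = [word]
      for _ in range(length):
          followers = list(cfd[word])
          if not followers:        # dead end: no observed successor
              break
          word = random.choice(followers)
          story.append(word)
      return ' '.join(story)

  print(generate_story(cfd))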

10 Adapted from slide by Bonnie Dorr Applications Why do we want to predict a word, given some preceding words? To rank the likelihood of sequences containing various alternative hypotheses, e.g. for speech recognition: Theatre owners say popcorn/unicorn sales have doubled... To assess the likelihood/goodness of a sentence, for text generation or machine translation: The doctor recommended a cat scan. / El doctor recomendó una exploración del gato.

11 Python Tip: Lists can build Lists
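The tip most likely refers to list comprehensions; a minimal illustrative sketch (the example sentence and variable names are ours, not from the slide):

  sentence = "The dog ate the crab"
  words = [w.lower() for w in sentence.split()]    # build one list from another
  long_words = [w for w in words if len(w) > 3]    # filter while building
  print(words)        # ['the', 'dog', 'ate', 'the', 'crab']
  print(long_words)   # ['crab']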

12 Bigram counts How to get the counts in a compact way from a CFD for all the ngrams starting with a given word? How to include the words themselves along with their counts?
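One way to answer both questions, sketched with the current NLTK API (reusing the toy sentence from above so the example is self-contained):

  import nltk

  sentence = "the dog ate the crab"
  cfd = nltk.ConditionalFreqDist(nltk.bigrams(sentence.split()))

  # Word/count pairs for every bigram starting with a given word, most frequent first.
  print(cfd['the'].most_common())   # [('dog', 1), ('crab', 1)]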

13 Comparing Modal Verb Counts How to implement this? Which modals best characterize each genre? Hint to get you started:
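As a starting point, one possible implementation sketched with the current NLTK API (assuming the Brown corpus is downloaded; the particular genres and modals listed are illustrative):

  import nltk
  from nltk.corpus import brown

  modals = ['can', 'could', 'may', 'might', 'must', 'will']
  genres = ['news', 'religion', 'hobbies', 'science_fiction', 'romance', 'humor']

  # Condition on the genre and count every word occurring in it.
  cfd = nltk.ConditionalFreqDist(
      (genre, word.lower())
      for genre in genres
      for word in brown.words(categories=genre))

  # Print a genre-by-modal table of counts.
  cfd.tabulate(conditions=genres, samples=modals)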

14 Comparing Modals

15 Comparing Modals

16 Part-of-Speech Tagging

17 Modified from Diane Litman's version of Steve Bird's notes Terminology Tagging: the process of associating labels with each token in a text. Tags: the labels. Tag Set: the collection of tags used for a particular task.

18 Example from the GENIA corpus Typically a tagged text is a sequence of white-space separated base/tag tokens: These/DT findings/NNS should/MD be/VB useful/JJ for/IN therapeutic/JJ strategies/NNS and/CC the/DT development/NN of/IN immunosuppressants/NNS targeting/VBG the/DT CD28/NN costimulatory/NN pathway/NN./.
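A sketch of turning such word/tag tokens into (word, tag) pairs with the current NLTK API (nltk.tag.str2tuple splits each token at its final slash):

  import nltk

  tagged_text = ("These/DT findings/NNS should/MD be/VB useful/JJ for/IN "
                 "therapeutic/JJ strategies/NNS ./.")
  tokens = [nltk.tag.str2tuple(t) for t in tagged_text.split()]
  print(tokens[:3])   # [('These', 'DT'), ('findings', 'NNS'), ('should', 'MD')]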

19 Modified from Diane Litman's version of Steve Bird's notes What does Tagging do? 1. Collapses distinctions: lexical identity may be discarded, e.g. all personal pronouns tagged with PRP. 2. Introduces distinctions: ambiguities may be resolved, e.g. deal tagged with NN or VB. 3. Helps in classification and prediction.

20 Modified from Diane Litman's version of Steve Bird's notes Significance of Parts of Speech A word's POS tells us a lot about the word and its neighbors: it limits the range of meanings (deal), of pronunciations (OBject vs. obJECT), or both (wind); helps in stemming; limits the range of following words; can help select nouns from a document for summarization; and is the basis for partial parsing (chunked parsing). Parsers can build trees directly on the POS tags instead of maintaining a lexicon.

21 Slide modified from Massimo Poesio's Choosing a tagset The choice of tagset greatly affects the difficulty of the problem. We need to strike a balance between getting better information about context and making it possible for classifiers to do their job.

22 Slide modified from Massimo Poesio's Some of the best-known Tagsets Brown corpus: 87 tags (more when tags are combined) Penn Treebank: 45 tags Lancaster UCREL C5 (used to tag the BNC): 61 tags Lancaster C7: 145 tags

23 Modified from Diane Litman's version of Steve Bird's notes The Brown Corpus An early digital corpus (1961), Francis and Kucera, Brown University. Contents: 500 texts, each 2000 words long, drawn from American books, newspapers, and magazines, representing genres such as science fiction, romance fiction, press reportage, scientific writing, and popular lore.

24 Modified from Diane Litman's version of Steve Bird's notes Penn Treebank The first large syntactically annotated corpus: 1 million words from the Wall Street Journal, with part-of-speech tags and syntax trees.

25 What are the most frequent Brown tags?
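A sketch of how to answer this with the current NLTK API (assuming the Brown corpus is downloaded; NLTK exposes the original Brown tagset here unless a simplified tagset is requested):

  import nltk
  from nltk.corpus import brown

  tags = [tag for (word, tag) in brown.tagged_words()]
  fd = nltk.FreqDist(tags)
  print(fd.most_common(10))   # e.g. common noun, preposition, article near the top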

26 Slide modified from Massimo Poesio's How hard is POS tagging? (The slide tabulates number of tags per word against number of word types.) In the Brown corpus, 12% of word types are ambiguous, but 40% of word tokens are ambiguous.
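A sketch of how to estimate these figures with the current NLTK API (a word type counts as ambiguous here if it occurs with more than one tag anywhere in Brown):

  import nltk
  from nltk.corpus import brown

  tagged = [(w.lower(), t) for (w, t) in brown.tagged_words()]
  cfd = nltk.ConditionalFreqDist(tagged)          # word -> FreqDist over its tags
  ambiguous = {w for w in cfd.conditions() if len(cfd[w]) > 1}

  type_rate = len(ambiguous) / len(cfd.conditions())
  token_rate = sum(1 for (w, t) in tagged if w in ambiguous) / len(tagged)
  print(type_rate, token_rate)   # type vs. token ambiguity rates (exact values depend on the tagset)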

27 Tagging methods Hand-coded; statistical taggers; the Brill (transformation-based) tagger.

28 nltk_lite tag package Types of taggers: tag.Default(), tag.Regexp(), tag.Lookup(), tag.Affix(), tag.Unigram(), tag.Bigram(), tag.Trigram(). Actions: tag.tag(), tag.tagsents(), tag.untag(), tag.train(), tag.accuracy(), tag.tag2tuple(), tag.string2words(), tag.string2tags().
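For comparison, a minimal sketch with the corresponding classes in the current NLTK API (the class names changed after nltk_lite; the split into training and test sentences is an arbitrary choice):

  import nltk
  from nltk.corpus import brown

  train_sents = brown.tagged_sents(categories='news')[:3000]
  test_sents = brown.tagged_sents(categories='news')[3000:3500]

  default = nltk.DefaultTagger('NN')                         # tag every token NN
  unigram = nltk.UnigramTagger(train_sents, backoff=default)

  print(unigram.tag("The dog ate the crab".split()))
  print(unigram.accuracy(test_sents))   # called evaluate() in older NLTK releases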

29 Hand-coded Tagger Make up some regexp rules that make use of morphology
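A sketch of such a hand-coded tagger with the current NLTK API, using a few morphology-based regexp rules (the particular rules are illustrative, not taken from the original slide):

  import nltk

  patterns = [
      (r'.*ing$', 'VBG'),                 # gerunds
      (r'.*ed$', 'VBD'),                  # simple past
      (r'.*es$', 'VBZ'),                  # 3rd person singular present
      (r'.*ould$', 'MD'),                 # modals: could, would, should
      (r'.*\'s$', 'NN$'),                 # possessive nouns
      (r'.*s$', 'NNS'),                   # plural nouns
      (r'^-?[0-9]+(\.[0-9]+)?$', 'CD'),   # cardinal numbers
      (r'.*', 'NN'),                      # default: tag everything else as a noun
  ]
  tagger = nltk.RegexpTagger(patterns)
  print(tagger.tag("The dogs were racing and jumping".split()))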

30 Compare to Brown tags

31 Modified from Massimo Poesio's lecture Tagging with lexical frequencies Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN Problem: assign a tag to race given its lexical frequency. Solution: choose the tag with the greater lexical likelihood, comparing P(race|VB) with P(race|NN). Actual estimates from the Switchboard corpus: P(race|NN) = .00041, P(race|VB) = .00003
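A sketch of how such lexical likelihoods can be estimated from a tagged corpus with the current NLTK API (using Brown rather than Switchboard, so the numbers will differ from the slide's):

  import nltk
  from nltk.corpus import brown

  # Condition on the tag and count words: P(word|tag) is estimated as C(tag, word) / C(tag).
  cfd = nltk.ConditionalFreqDist((tag, word.lower())
                                 for (word, tag) in brown.tagged_words())

  for tag in ('NN', 'VB'):
      print(tag, cfd[tag].freq('race'))   # relative frequency of "race" given the tag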

32 Next Time N-gram taggers; training, testing, and determining accuracy; the Brill tagger.