CSA2050: Introduction to Computational Linguistics Part of Speech (POS) Tagging I Introduction Tagsets Approaches.



April 2005, CLINT Lecture IV
Acknowledgment
Most slides are taken from Bonnie Dorr’s course notes, which are in turn based on Jurafsky & Martin, Chapter 8.

Bibliography
R. Weischedel, R. Schwartz, J. Palmucci, M. Meteer, L. Ramshaw, Coping with Ambiguity and Unknown Words through Probabilistic Models, Computational Linguistics 19(2), 1993.
C. Samuelsson, Morphological tagging based entirely on Bayesian inference, in 9th Nordic Conference on Computational Linguistics, NODALIDA-93, Stockholm.
A. Ratnaparkhi, A maximum entropy model for part of speech tagging, Proceedings of the Conference on Empirical Methods in Natural Language Processing, 1996.

Outline
The tagging task
Tagsets
Three different approaches

Definition: PoS-Tagging
“Part-of-Speech Tagging is the process of assigning a part-of-speech or other lexical class marker to each word in a corpus” (Jurafsky and Martin)
WORDS  the  girl  kissed  the  boy  on  the  cheek
TAGS   DET  N     V       DET  N    P   DET  N

Motivation
Corpus analysis of tagged corpora yields useful information.
Speech synthesis: pronunciation, e.g. CONtent (N) vs. conTENT (Adj)
Speech recognition: word class-based N-grams predict the category of the next word
Information retrieval: stemming, selection of high-content words
Word-sense disambiguation

English Parts of Speech
1. Pronoun: any substitute for a noun or noun phrase
2. Adjective: any qualifier of a noun
3. Verb: any action or state of being
4. Adverb: any qualifier of an adjective or verb
5. Preposition: any establisher of relation and syntactic context
6. Conjunction: any syntactic connector
7. Interjection: any emotional greeting (or "exclamation")

Tagsets: how detailed?
Tagset          Number of tags
Swedish SUC     25
Penn Treebank   46
German STTS     50
Lancaster BNC   61
Lancaster Full  146

Penn Treebank Tagset
[Table of Penn Treebank tags not recovered; surviving entries: PRP, PRP$]

Example of Penn Treebank Tagging of Brown Corpus Sentence
The/DT grand/JJ jury/NN commented/VBD on/IN a/DT number/NN of/IN other/JJ topics/NNS ./.
TAGS   VB    DT    NN      .
WORDS  Book  that  flight  .
TAGS   VBZ   DT    NN      VB     NN      ?
WORDS  Does  that  flight  serve  dinner  ?
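The word/TAG notation used in the Brown Corpus example is easy to read into a program. The following is a minimal sketch, not part of the original slides; the function name `parse_tagged` is a hypothetical helper, and the input is assumed to be whitespace-separated word/TAG tokens.

```python
def parse_tagged(sentence):
    """Split a 'word/TAG word/TAG ...' string into (word, tag) pairs."""
    pairs = []
    for token in sentence.split():
        # rpartition splits on the LAST slash, so only the tag is peeled
        # off even if the word itself contains a slash.
        word, _, tag = token.rpartition("/")
        pairs.append((word, tag))
    return pairs


tagged = ("The/DT grand/JJ jury/NN commented/VBD on/IN a/DT "
          "number/NN of/IN other/JJ topics/NNS ./.")
pairs = parse_tagged(tagged)
print(pairs[:3])
```

Splitting on the last slash is a design choice that keeps the sentence-final token `./.` intact as the pair (".", ".").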

Two Problems
1. Multiple tags for the same word
2. Unknown words

Multiple tags for the same word
1. He can can a can.
2. I can light a fire and you can open a can of beans. Now the can is open, and we can eat in the light of the fire.
3. Flying planes can be dangerous.

Multiple tags for the same word
Words often belong to more than one word class, e.g. "this":
  This is a nice day = PRP (pronoun)
  This day is nice = DT (determiner)
  You can go this far = RB (adverb)
Many of the most common words (by volume of text) are ambiguous.

How Hard is the Tagging Task?
In the Brown Corpus:
  11.5% of word types are ambiguous
  40% of word tokens are ambiguous
Most word types in English are unambiguous.
Many of the most common words are ambiguous, which is why the token rate is so much higher than the type rate.
Typically, the possible tags of an ambiguous word are not equally probable.
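The difference between type ambiguity and token ambiguity can be sketched in a few lines. This is an illustration on a tiny invented corpus, not a reproduction of the Brown Corpus figures; the function name `ambiguity_rates` and the toy data are assumptions made for the example, and words are lowercased before counting.

```python
from collections import Counter, defaultdict


def ambiguity_rates(tagged_tokens):
    """Return (type ambiguity rate, token ambiguity rate) for (word, tag) pairs."""
    tags_per_word = defaultdict(set)
    token_counts = Counter()
    for word, tag in tagged_tokens:
        w = word.lower()
        tags_per_word[w].add(tag)
        token_counts[w] += 1
    # A word type is ambiguous if it was seen with more than one tag.
    ambiguous = {w for w, tags in tags_per_word.items() if len(tags) > 1}
    type_rate = len(ambiguous) / len(tags_per_word)
    token_rate = sum(token_counts[w] for w in ambiguous) / sum(token_counts.values())
    return type_rate, token_rate


# Toy data invented for illustration only.
toy = [("he", "PRP"), ("can", "MD"), ("can", "VB"), ("a", "DT"), ("can", "NN"),
       ("the", "DT"), ("can", "NN"), ("is", "VBZ"), ("open", "JJ")]
type_rate, token_rate = ambiguity_rates(toy)
print(f"types: {type_rate:.2f}, tokens: {token_rate:.2f}")
```

Here only one of six word types is ambiguous, but because that type is frequent it accounts for four of nine tokens, the same pattern the slide describes for the Brown Corpus.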

Word Class Ambiguity (in the Brown Corpus)
Unambiguous (1 tag):   35,340 types
Ambiguous (2-7 tags):   4,100 types
  2 tags  3,760
  3 tags    264
  4 tags     61
  5 tags     12
  6 tags      2
  7 tags      1
(DeRose, 1988)

3 Approaches to Tagging
1. Rule-Based Tagger: EngCG Tagger (Voutilainen 1995, 1999)
2. Stochastic Tagger: HMM-based Tagger
3. Transformation-Based Tagger: Brill Tagger (Brill 1995)

Unknown Words
1. Assume every unknown word is ambiguous among all possible tags.
   Advantage: simplicity
   Disadvantage: ignores the fact that unknown words are unlikely to be closed-class words
2. Assume that the probability distribution of tags for unknown words is the same as for words that have been seen just once.
3. Make use of morphological information.
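The second strategy above can be sketched as follows: collect the tags of words seen exactly once in the training data and normalise the counts. This is a minimal illustration under stated assumptions; `hapax_tag_distribution` is a hypothetical name, the training data is invented, and the input must be a list (it is traversed twice) containing at least one once-seen word.

```python
from collections import Counter


def hapax_tag_distribution(tagged_tokens):
    """Estimate p(tag | unknown word) from words that occur exactly once."""
    word_counts = Counter(word for word, _ in tagged_tokens)
    hapax_tags = Counter(tag for word, tag in tagged_tokens
                         if word_counts[word] == 1)
    total = sum(hapax_tags.values())
    return {tag: n / total for tag, n in hapax_tags.items()}


# Toy training data invented for illustration only.
train_tokens = [("the", "DT"), ("dog", "NN"), ("saw", "VBD"), ("the", "DT"),
                ("cat", "NN"), ("Fido", "NNP"), ("barked", "VBD")]
dist = hapax_tag_distribution(train_tokens)
print(dist)
```

In this toy data the closed-class word "the" occurs twice, so DT receives no probability mass, which addresses the disadvantage noted for the first strategy.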

Combining Features
The last method (using morphological information) makes use of different features, e.g. ending in -ed (suggests verb) or initial capital (suggests proper noun). Typically, a given tag is correlated with a combination of such features. These have to be incorporated into the statistical model.

Combining Tag-Predicting Features in Unknown Words: HMM Models
Weischedel et al. (1993): for each feature f and tag t (e.g. proper noun), build a probability estimator p(f|t). Assume the features are independent and multiply the probabilities together.
Samuelsson (1993): rather than preselecting features, considers all possible suffixes up to length 10 as features for predicting tags.
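The two ideas on this slide can be sketched roughly as below. This is not the implementation from either paper: the three binary features, the add-one smoothing, the toy training words, and all function names are assumptions chosen for illustration. The `score` function multiplies per-feature estimates p(f|t) under the independence assumption, and `suffixes` enumerates candidate suffix features up to a maximum length.

```python
from collections import Counter, defaultdict


def features(word):
    """Binary surface features of a word (a hypothetical feature set)."""
    return {
        "capitalized": word[:1].isupper(),
        "ends_in_ed": word.endswith("ed"),
        "ends_in_s": word.endswith("s"),
    }


def train(tagged_words):
    """Count, per tag, how often each feature is true."""
    tag_counts = Counter()
    feat_counts = defaultdict(Counter)  # tag -> feature name -> count of True
    for word, tag in tagged_words:
        tag_counts[tag] += 1
        for name, value in features(word).items():
            if value:
                feat_counts[tag][name] += 1
    return tag_counts, feat_counts


def score(word, tag, tag_counts, feat_counts):
    """Product over features of p(f | t), using add-one smoothing."""
    p = 1.0
    for name, value in features(word).items():
        p_true = (feat_counts[tag][name] + 1) / (tag_counts[tag] + 2)
        p *= p_true if value else 1.0 - p_true
    return p


def suffixes(word, max_len=10):
    """All suffixes of the word up to max_len characters, shortest first."""
    return [word[-n:] for n in range(1, min(max_len, len(word)) + 1)]


# Toy training data invented for illustration only.
data = [("London", "NNP"), ("Paris", "NNP"), ("walked", "VBD"),
        ("jumped", "VBD"), ("dogs", "NNS"), ("cats", "NNS")]
tag_counts, feat_counts = train(data)
best = max(tag_counts, key=lambda t: score("blorked", t, tag_counts, feat_counts))
print(best, suffixes("blorked", 3))
```

In a full HMM tagger these scores would stand in for the emission probability of the unknown word and would be combined with tag transition probabilities; the sketch ranks tags by the feature product alone.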

Combining Tag-Predicting Features in Unknown Words: Maximum Entropy (ME) Models
An ME model is a classifier which assigns a class to an observation by computing a probability from an exponential function of a weighted set of features of the observation.
An MEMM uses the Viterbi algorithm to extend the application of ME to labelling a sequence of observations.
For further details see Ratnaparkhi (1996).
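The "exponential function of a weighted set of features" can be made concrete with a small sketch. It assumes binary features and computes p(c|x) as exp of the summed weights of the active features for class c, divided by a normalising constant. The weights below are hand-set for illustration rather than trained, and the names `maxent_probs` and the feature strings are hypothetical; this is not Ratnaparkhi's feature set.

```python
import math


def maxent_probs(active_features, weights, classes):
    """p(c | x) = exp(sum of weights for (c, f), f active in x) / Z."""
    scores = {c: sum(weights.get((c, f), 0.0) for f in active_features)
              for c in classes}
    # Subtract the maximum score before exponentiating for numerical stability.
    m = max(scores.values())
    exps = {c: math.exp(s - m) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}


# Hand-set illustrative weights; a real model would learn these from data.
weights = {
    ("VBD", "suffix=ed"): 2.0,
    ("NN", "suffix=ed"): -0.5,
    ("NNP", "capitalized"): 2.5,
}
probs = maxent_probs({"suffix=ed"}, weights, ["NN", "VBD", "NNP"])
print(max(probs, key=probs.get))
```

Unlike the independence product used in the HMM approach, overlapping features simply contribute additional weights to the exponent, so correlated features can be combined without an independence assumption.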

Summary
External parameters to the tagging task are (i) the size of the chosen tagset and (ii) the coverage of the lexicon which gives possible tags to words.
Two main problems: (i) disambiguation of tags and (ii) dealing with unknown words.
Several methods are available for dealing with (ii): HMMs and MEMMs.