Natural Language Processing

Slides:



Advertisements
Similar presentations
School of something FACULTY OF OTHER School of Computing FACULTY OF ENGINEERING PoS-Tagging theory and terminology COMP3310 Natural Language Processing.
Advertisements

Grammar Spinner Touch any part of the screen to begin. (Or click your mouse) Touch the screen again each time you want to spin.
In your notebooks make a list of the following for your MadLib. The number next to the part of speech indicates how many of each word you need. Plural.
Ch 9 Part of Speech Tagging (slides adapted from Dan Jurafsky, Jim Martin, Dekang Lin, Rada Mihalcea, and Bonnie Dorr and Mitch Marcus.) ‏
BİL711 Natural Language Processing
Noun. Noun - verb noun Noun - verb article- adj. - adj. - Noun - verb.
Part-of-speech tagging. Parts of Speech Perhaps starting with Aristotle in the West (384–322 BCE) the idea of having parts of speech lexical categories,
Part of Speech Tagging Importance Resolving ambiguities by assigning lower probabilities to words that don’t fit Applying to language grammatical rules.
Unsupervised Structure Prediction with Non-Parallel Multilingual Guidance July 27 EMNLP 2011 Shay B. Cohen Dipanjan Das Noah A. Smith Carnegie Mellon University.
Natural Language Processing Lecture 8—9/24/2013 Jim Martin.
LING 388 Language and Computers Lecture 22 11/25/03 Sandiway FONG.
Part II. Statistical NLP Advanced Artificial Intelligence Part of Speech Tagging Wolfram Burgard, Luc De Raedt, Bernhard Nebel, Lars Schmidt-Thieme Most.
Learning Bit by Bit Hidden Markov Models. Weighted FSA weather The is outside
POS based on Jurafsky and Martin Ch. 8 Miriam Butt October 2003.
Word classes and part of speech tagging Chapter 5.
Parts of Speech Created by Ms. Cherylon E. Lewis English/Language Arts
CS224N Interactive Session Competitive Grammar Writing Chris Manning Sida, Rush, Ankur, Frank, Kai Sheng.
1 POS Tagging: Introduction Heng Ji Feb 2, 2008 Acknowledgement: some slides from Ralph Grishman, Nicolas Nicolov, J&M.
Part II. Statistical NLP Advanced Artificial Intelligence Applications of HMMs and PCFGs in NLP Wolfram Burgard, Luc De Raedt, Bernhard Nebel, Lars Schmidt-Thieme.
Parts of Speech Sudeshna Sarkar 7 Aug 2008.
Czech-English Word Alignment Ondřej Bojar Magdalena Prokopová
Hindi Parts-of-Speech Tagging & Chunking Baskaran S MSRI.
Dr.Chisolm What’s happening twitter.com/DrChisolmPlace
10/30/2015CPSC503 Winter CPSC 503 Computational Linguistics Lecture 7 Giuseppe Carenini.
_____________________ Definition Part of Speech (circle one) Picture Antonym (Opposite) Vocab Word Noun Pronoun Adjective Adverb Conjunction Verb Interjection.
Word classes and part of speech tagging Chapter 5.
Linguistics The eleventh week. Chapter 4 Syntax  4.1 Introduction  4.2 Word Classes.
CSA2050: Introduction to Computational Linguistics Part of Speech (POS) Tagging I Introduction Tagsets Approaches.
GoBack definitions Level 1 Parts of Speech GoBack is a memorization game; the teacher asks students definitions, and when someone misses one, you go back.
Part-of-speech tagging
CPSC 422, Lecture 27Slide 1 Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 27 Nov, 16, 2015.
◦ Process of describing the structure of phrases and sentences Chapter 8 - Phrases and sentences: grammar1.
LING/C SC/PSYC 438/538 Lecture 18 Sandiway Fong. Adminstrivia Homework 7 out today – due Saturday by midnight.
The Parts of Speech nouns verbs adjectives adverbs prepositions interjections conjunctions pronouns.
Word classes and part of speech tagging. Slide 1 Outline Why part of speech tagging? Word classes Tag sets and problem definition Automatic approaches.
Part-of-Speech Tagging & Sequence Labeling Hongning Wang
Word classes and part of speech tagging Chapter 5.
Parts of Speech By: Miaya Nischelle Sample. NOUN A noun is a person place or thing.
SI485i : NLP Set 7 Syntax and Parsing. 2 Syntax Grammar, or syntax: The kind of implicit knowledge of your native language that you had mastered by the.
POS TAGGING AND HMM Tim Teks Mining Adapted from Heng Ji.
ENGLISH is a language Learning mode of ENGLISH Subject Language(Spoken) Literature Competition.
Speech and Language Processing SLP Chapter 5. 10/31/1 2 Speech and Language Processing - Jurafsky and Martin 2 Today  Parts of speech (POS)  Tagsets.
CSC 594 Topics in AI – Natural Language Processing
Lecture 5 POS Tagging Methods
Lecture 9: Part of Speech
CSC 594 Topics in AI – Natural Language Processing
English Basics Mrs.Azzah.
Introduction NLP Applications
Parts of Speech Review.
Introduction to Machine Learning and Text Mining
Words are classified into eight categories
wow the hungry cat chase the mouse under the table and quickly ate it
CS4705 Part of Speech tagging
CSC 594 Topics in AI – Natural Language Processing
LING/C SC/PSYC 438/538 Lecture 20 Sandiway Fong.
CSC 594 Topics in AI – Natural Language Processing
CSC 594 Topics in AI – Natural Language Processing
KEY CHALLENGES Overview POS Tagging
Grammar Review.
CSCI 5832 Natural Language Processing
Automatically Enhancing Tagging Accuracy and Readability for Common Freeware Taggers Martin Weisser Center for Linguistics & Applied Linguistics Guangdong.
GRAMMAR: PARTS OF SPEECH
Core Concepts Lecture 1 Lexical Frequency.
Parts of the speech and abbreviations
FIRST SEMESTER GRAMMAR
Natural Language Processing
The building blocks of language!
Part-of-Speech Tagging Using Hidden Markov Models
Natural Language Processing (NLP)
Parts of Speech.
Presentation transcript:

Natural Language Processing Part-of-Speech Tagging

Parts of Speech 8–10 traditional parts of speech Variously called: Noun, verb, adjective, adverb, preposition, article, interjection, pronoun, conjunction, … Variously called: Parts of speech, lexical categories, word classes, morphological classes, lexical tags, ... Lots of debate within linguistics about the number, nature, and universality of these We’ll completely ignore this debate

POS Examples N noun chair, bandwidth, pacing V verb study, debate, munch ADJ adjective purple, tall, ridiculous ADV adverb unfortunately, slowly P preposition of, by, to DET determiner the, a, that, those INT interjection ouch, hey PRO pronoun I, me, mine CONJ conjunction and, but, for, because

POS Tagging The process of assigning a part-of-speech or lexical class marker to each word in a collection WORD tag the DET koala N put V the DET keys N on P table N

Why is POS Tagging Useful? First step of a vast number of practical tasks Speech synthesis How to pronounce “lead”? INsult inSULT OBject obJECT OVERflow overFLOW DIScount disCOUNT CONtent conTENT Parsing Need to know if a word is an N or V before you can parse Information extraction Finding names, relations, etc. Machine Translation

Open and Closed Classes Closed class: a small fixed membership Prepositions: of, in, by, … Auxiliaries: may, can, will had, been, … Pronouns: I, you, she, mine, his, them, … Usually function words (short common words which play a role in grammar) Open class: new ones can be created all the time English has 4: Nouns, Verbs, Adjectives, Adverbs Many languages have these 4, but not all!

POS Tagging Choosing a Tagset There are so many parts of speech, potential distinctions we can draw To do POS tagging, we need to choose a standard set of tags to work with Could pick very coarse tagsets N, V, Adj, Adv, … More commonly used set is finer grained, the “Penn TreeBank tagset”, 45 tags PRP$, WRB, WP$, VBG Even more fine-grained tagsets exist

Penn TreeBank POS Tagset

POS Tagging Words often have more than one POS: back The back door = JJ On my back = NN Win the voters back = RB Promised to back the bill = VB The POS tagging problem is to determine the tag for a particular instance of a word These examples from Dekang Lin

How Hard is POS Tagging? Measuring Ambiguity

Evaluation So once you have you POS tagger running how do you evaluate it? Overall error rate with respect to a manually annotated gold-standard test set Error rates on particular tags Tag confusions ... Accuracy typically reaches 96–97% for English newswire text What about Turkish? What about twitter?

Error Analysis Look at a confusion matrix See what errors are causing problems Noun (NN) vs ProperNoun (NNP) vs Adj (JJ) Preterite (VBD) vs Participle (VBN) vs Adjective (JJ)