Dragomir Radev Wrocław, Poland July 29, 2009 Computational Linguistics

Example (from a famous movie) Dave Bowman: Open the pod bay doors, HAL. HAL: I’m sorry Dave. I’m afraid I can’t do that.

Instructor Dragomir Radev, Professor, Computer Science and Information, Linguistics, University of Michigan

Natural Language Understanding … about teaching computers to make sense of naturally occurring text. … involves programming, linguistics, artificial intelligence, etc. …includes machine translation, question answering, dialogue systems, database access, information extraction, game playing, etc.

Example: I saw her fall. How many different interpretations does this sentence have? How many of them are reasonable/grammatical?

Silly sentences
Children make delicious snacks
Stolen painting found by tree
I saw the Grand Canyon flying to New York
Court to try shooting defendant
Ban on nude dancing on Governor’s desk
Red tape holds up new bridges
Iraqi head seeks arms
Blair wins on budget, more lies ahead
Local high school dropouts cut in half
Hospitals are sued by seven foot doctors
In America a woman has a baby every 15 minutes. How does she do that?

Types of ambiguity
Morphological: Joe is quite impossible. Joe is quite important.
Phonetic: Joe’s finger got number.
Part of speech: Joe won the first round.
Syntactic: Call Joe a taxi.
PP attachment: Joe ate pizza with a fork. Joe ate pizza with meatballs. Joe ate pizza with Mike. Joe ate pizza with pleasure.
Sense: Joe took the bar exam.
Modality: Joe may win the lottery.
Subjectivity: Joe believes that stocks will rise.
Scoping: Joe likes ripe apples and pears.
Negation: Joe likes his pizza with no cheese and tomatoes.
Referential: Joe yelled at Mike. He had broken the bike. / Joe yelled at Mike. He was angry at him.
Reflexive: John bought him a present. John bought himself a present.
Ellipsis and parallelism: Joe gave Mike a beer and Jeremy a glass of wine.
Metonymy: Boston called and left a message for Joe.

NLP: information extraction, named entity recognition, trend analysis, subjectivity analysis, text classification, anaphora resolution, alias resolution, cross-document cross-reference, parsing, semantic analysis, word sense disambiguation, word clustering, question answering, summarization, document retrieval (filtering, routing), structured text (relational tables), paraphrasing and entailment identification, text generation, machine translation.

Syntactic categories
Substitution test: Nathalie likes { black / Persian / tabby / small / easy to raise } cats.
Open (lexical) and closed (functional) categories: open-class items include coinages like "no-fly-zone" and "yadda yadda yadda"; closed-class items include "the" and "in".

Jabberwocky (Lewis Carroll) Twas brillig, and the slithy toves Did gyre and gimble in the wabe: All mimsy were the borogoves, And the mome raths outgrabe. "Beware the Jabberwock, my son! The jaws that bite, the claws that catch! Beware the Jubjub bird, and shun The frumious Bandersnatch!"

Phrase structure
The sentence "That man caught the butterfly with a net" has the parse [S [NP That man] [VP [VBD caught] [NP the butterfly] [PP [IN with] [NP a net]]]].

Sample phrase-structure grammar
S → NP VP
NP → AT NNS
NP → AT NN
NP → NP PP
VP → VP PP
VP → VBD
VP → VBD NP
PP → IN NP
AT → the
NNS → children | students | mountains
VBD → slept | ate | saw
IN → in | of
NN → cake
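
The grammar above can be made concrete with a short program. The following is a minimal sketch, assuming nothing beyond the rules on this slide: the grammar is stored as a Python dictionary, and a small helper (here called generate, an illustrative name) randomly expands symbols into sentences. The depth limit exists only because the NP → NP PP and VP → VP PP rules are recursive.

```python
import random

# Toy phrase-structure grammar from the slide: each left-hand side maps to
# a list of possible right-hand sides.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["AT", "NNS"], ["AT", "NN"], ["NP", "PP"]],
    "VP":  [["VP", "PP"], ["VBD"], ["VBD", "NP"]],
    "PP":  [["IN", "NP"]],
    "AT":  [["the"]],
    "NNS": [["children"], ["students"], ["mountains"]],
    "VBD": [["slept"], ["ate"], ["saw"]],
    "IN":  [["in"], ["of"]],
    "NN":  [["cake"]],
}

def generate(symbol="S", depth=0, max_depth=6):
    """Randomly expand a symbol into a list of words (depth-limited so the
    recursive NP -> NP PP and VP -> VP PP rules cannot loop forever)."""
    if symbol not in GRAMMAR:                  # terminal word
        return [symbol]
    options = GRAMMAR[symbol]
    if depth >= max_depth:                     # past the limit, take the shortest rule
        options = sorted(options, key=len)[:1]
    rhs = random.choice(options)
    words = []
    for sym in rhs:
        words.extend(generate(sym, depth + 1, max_depth))
    return words

print(" ".join(generate()))                    # e.g. "the children saw the cake"
```

Running it a few times produces sentences like "the students slept" or "the children ate the cake of the mountains", which shows both what the grammar covers and how freely it overgenerates.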

Phrase structure grammars Local dependencies Non-local dependencies Subject-verb agreement The women who found the wallet were given a reward. wh-extraction Should Peter buy a book? Which book should Peter buy? Empty nodes

Subcategorization Subject: The children eat candy. Object: The children eat candy. Prepositional phrase: She put the book on the table. Predicative adjective: We made the man angry. Bare infinitive: She helped me walk. To-infinitive: She likes to walk. Participial phrase: She stopped singing that tune at the end. That-clause: She thinks that it will rain tomorrow. Question-form clauses: She asked me what book I was reading.

Phrase structure ambiguity Grammars are used for generating and parsing sentences Parses Syntactic ambiguity Attachment ambiguity: Our company is training workers. The children ate the cake with a spoon. High vs. low attachment Garden path sentences: The horse raced past the barn fell. Is the book on the table red?

Sentence-level constructions
Declarative vs. imperative sentences
Imperative sentences: S → VP
Yes-no questions: S → Aux NP VP
Wh-type questions: S → Wh-NP VP
Fronting (less frequent): On Tuesday, I would like to fly to San Diego

Semantics and pragmatics Lexical semantics and compositional semantics Hypernyms, hyponyms, antonyms, meronyms and holonyms (part-whole relationship, tire is a meronym of car), synonyms, homonyms Senses of words, polysemous words Homophony (bass). Collocations: white hair, white wine Idioms: to kick the bucket

Discourse analysis Anaphoric relations: 1. Mary helped Peter get out of the car. He thanked her. 2. Mary helped the other passenger out of the car. The man had asked her for help because of his foot injury. Information extraction problems (entity cross-referencing): Hurricane Hugo destroyed 20,000 Florida homes. At an estimated cost of one billion dollars, the disaster has been the most costly in the state’s history.

Pragmatics The study of how knowledge about the world and language conventions interact with literal meaning. Speech acts Research issues: resolution of anaphoric relations, modeling of speech acts in dialogues

Coordination
Coordinate noun phrases: NP → NP and NP
S → S and S
Similar for VP, etc.

Agreement
Examples:
Do any flights stop in Chicago?
Do I get dinner on this flight?
Does Delta fly from Atlanta to Boston?
What flights leave in the morning?
* What flight leave in the morning?
Rules:
S → Aux NP VP
S → 3sgAux 3sgNP VP
S → Non3sgAux Non3sgNP VP
3sgAux → does | has | can …
non3sgAux → do | have | can …

Agreement
We now need similar rules for pronouns, also for number agreement, etc.
3SgNP → (Det) (Card) (Ord) (Quant) (AP) SgNominal
Non3SgNP → (Det) (Card) (Ord) (Quant) (AP) PlNominal
SgNominal → SgNoun | SgNoun SgNoun
etc.

Combinatorial explosion What other phenomena will cause the grammar to expand? Solution: parameterization with feature structures (see Chapter 11)

Parsing as search
Grammar:
S → NP VP
S → Aux NP VP
S → VP
NP → Det Nominal
NP → Proper-Noun
Nominal → Noun
Nominal → Noun Nominal
Nominal → Nominal PP
VP → Verb
VP → Verb NP
Lexicon:
Det → that | this | a
Noun → book | flight | meal | money
Verb → book | include | prefer
Aux → does
Proper-Noun → Houston | TWA
Prep → from | to | on

Parsing as search
Book that flight.
Parse: [S [VP [Verb Book] [NP [Det that] [Nominal [Noun flight]]]]]
Two types of constraints on the parses: (a) some come from the input string, (b) others come from the grammar.

Top-down parsing
The search space of partial trees rooted in S: first expand S → NP VP, S → Aux NP VP, and S → VP, then expand the NP in each (Det Nominal or PropN) and the VP in each (Verb or Verb NP).

Bottom-up parsing
Bottom-up parsing of "Book that flight": start from the words, consider both the Noun and Verb readings of "book", tag "that" as Det and "flight" as Noun, build a Nominal and an NP over "that flight", combine it with the Verb into a VP, and finally build S.
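
As a companion to the search just described, here is a minimal recursive-descent recognizer in Python. It assumes the grammar and lexicon from the "Parsing as search" slide, plus an assumed PP → Prep NP rule so the prepositions are usable; the parse function name and the use of lowercase words are illustrative choices. Directly left-recursive options (Nominal → Nominal PP) are skipped, since naive top-down parsing loops on them.

```python
# Grammar and lexicon from the "Parsing as search" slide
# (PP -> Prep NP is an assumed addition so the prepositions can be used).
RULES = {
    "S":       [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP":      [["Det", "Nominal"], ["Proper-Noun"]],
    "Nominal": [["Noun"], ["Noun", "Nominal"], ["Nominal", "PP"]],
    "VP":      [["Verb"], ["Verb", "NP"]],
    "PP":      [["Prep", "NP"]],
}
LEXICON = {
    "Det":         {"that", "this", "a"},
    "Noun":        {"book", "flight", "meal", "money"},
    "Verb":        {"book", "include", "prefer"},
    "Aux":         {"does"},
    "Proper-Noun": {"houston", "twa"},
    "Prep":        {"from", "to", "on"},
}

def parse(symbols, words):
    """Top-down, depth-first search: can this list of symbols derive
    exactly this list of words?"""
    if not symbols:
        return not words                       # success only if all words are consumed
    first, rest = symbols[0], symbols[1:]
    if first in LEXICON:                       # pre-terminal: must match the next word
        return bool(words) and words[0] in LEXICON[first] and parse(rest, words[1:])
    for rhs in RULES.get(first, []):
        if rhs and rhs[0] == first:            # skip directly left-recursive expansions
            continue
        if parse(rhs + rest, words):
            return True
    return False

print(parse(["S"], "book that flight".split()))                   # True
print(parse(["S"], "does this flight include a meal".split()))    # True
print(parse(["S"], "book flight that".split()))                   # False
```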

Grammatical Relations and Free Ordering of Subject and Object (Russian)
OSV: - Кого же Вася увидел? - Машу Вася увидел. ("(Actually,) who did Vasya see?" - "Vasya saw Masha.")
SVO: - Кого увидел Вася? - Вася увидел Машу. ("Who did Vasya see?" - "Vasya saw Masha.")
OVS: - Да кого увидел Вася? - Машу увидел Вася. ("Well, whom did Vasya see?" - "It was Masha whom Vasya saw.")
SOV: - Кого же Вася увидел? - Вася Машу увидел. ("Who did Vasya see?" - "Vasya saw Masha.")
VSO: - Увидел Вася кого? - Увидел Вася Машу. ("Who did Vasya see?" - "Vasya saw Masha.")
VOS: - Увидел Машу кто? - Увидел Машу Вася. ("Who saw Masha, in the end?" - "It was Vasya who saw Masha.")
Slide from Lori Levin, originally by Leonid Iomdin

Features and unification
Grammatical categories have properties; constraint-based formalisms exploit this.
Example: "this flights" – agreement is difficult to handle at the level of bare grammatical categories.
Example: "many water" – count vs. mass nouns.
Sample rule that takes features into account: S → NP VP, but only if the number of the NP equals the number of the VP.

Feature structures
[CAT NP, NUMBER SINGULAR, PERSON 3]
With a nested value: [CAT NP, AGREEMENT [NUMBER SG, PERSON 3]]
Feature paths: {x agreement number}

Unification
[NUMBER SG] ⊔ [NUMBER SG] = [NUMBER SG] (succeeds)
[NUMBER SG] ⊔ [NUMBER PL] fails
[NUMBER SG] ⊔ [NUMBER []] = [NUMBER SG]
[NUMBER SG] ⊔ [PERSON 3] = ?
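
The unification operation itself is easy to prototype. Below is a minimal sketch, assuming feature structures are plain nested Python dictionaries with the empty dict standing in for the unconstrained value []; it has no reentrancy (shared structure), which a full implementation would need. The unify name and the None failure value are illustrative.

```python
FAIL = None   # returned when unification fails

def unify(f1, f2):
    """Unify two feature structures represented as nested dicts;
    atomic values must be equal, {} unifies with anything."""
    if f1 == {}:
        return f2
    if f2 == {}:
        return f1
    if isinstance(f1, dict) and isinstance(f2, dict):
        result = dict(f1)
        for feature, value in f2.items():
            if feature in result:
                sub = unify(result[feature], value)
                if sub is FAIL:
                    return FAIL                 # a clash anywhere fails the whole thing
                result[feature] = sub
            else:
                result[feature] = value         # absent feature: just add it
        return result
    return f1 if f1 == f2 else FAIL             # atomic values

print(unify({"NUMBER": "SG"}, {"NUMBER": "SG"}))   # {'NUMBER': 'SG'}
print(unify({"NUMBER": "SG"}, {"NUMBER": "PL"}))   # None (failure)
print(unify({"NUMBER": "SG"}, {"NUMBER": {}}))     # {'NUMBER': 'SG'}
print(unify({"NUMBER": "SG"}, {"PERSON": 3}))      # {'NUMBER': 'SG', 'PERSON': 3}
```

The last call answers the "= ?" on the slide: the two structures are compatible, so unification simply combines them.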

Agreement
S → NP VP, with {NP AGREEMENT} = {VP AGREEMENT}
Does this flight serve breakfast? Do these flights serve breakfast?
S → Aux NP VP, with {Aux AGREEMENT} = {NP AGREEMENT}

Agreement
These flights / This flight
NP → Det Nominal, with {Det AGREEMENT} = {Nominal AGREEMENT}
Verb → serve, with {Verb AGREEMENT NUMBER} = PL
Verb → serves, with {Verb AGREEMENT NUMBER} = SG

Subcategorization
VP → Verb, with {VP HEAD} = {Verb HEAD} and {VP HEAD SUBCAT} = INTRANS
VP → Verb NP, with {VP HEAD} = {Verb HEAD} and {VP HEAD SUBCAT} = TRANS
VP → Verb NP NP, with {VP HEAD} = {Verb HEAD} and {VP HEAD SUBCAT} = DITRANS
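
A rough sketch of how these SUBCAT constraints act as a filter, assuming a hand-made lexicon (the verbs and the VERB_LEXICON and licenses names are illustrative): each verb's HEAD carries a SUBCAT value, and a VP rule only applies when its shape matches that value.

```python
# Illustrative lexicon: each verb's HEAD carries a SUBCAT value.
VERB_LEXICON = {
    "slept": {"CAT": "Verb", "HEAD": {"SUBCAT": "INTRANS"}},
    "found": {"CAT": "Verb", "HEAD": {"SUBCAT": "TRANS"}},
    "gave":  {"CAT": "Verb", "HEAD": {"SUBCAT": "DITRANS"}},
}

# The VP rule shape required by each SUBCAT value, mirroring the slide.
VP_SHAPE = {
    "INTRANS": ["Verb"],
    "TRANS":   ["Verb", "NP"],
    "DITRANS": ["Verb", "NP", "NP"],
}

def licenses(verb, np_complements):
    """True if the verb's SUBCAT frame matches the number of NP complements."""
    subcat = VERB_LEXICON[verb]["HEAD"]["SUBCAT"]
    return VP_SHAPE[subcat] == ["Verb"] + ["NP"] * len(np_complements)

print(licenses("slept", []))                     # True
print(licenses("found", ["a wallet"]))           # True
print(licenses("gave", ["a book"]))              # False: ditransitive wants two NPs
print(licenses("gave", ["Mary", "a book"]))      # True
```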

Eliza [Weizenbaum, 1966] User: Men are all alike ELIZA: IN WHAT WAY User: They’re always bugging us about something or other ELIZA: CAN YOU THINK OF A SPECIFIC EXAMPLE? User: Well, my boyfriend made me come here ELIZA: YOUR BOYFRIEND MADE YOU COME HERE User: He says I’m depressed much of the time ELIZA: I AM SORRY TO HEAR THAT YOU ARE DEPRESSED

Eliza-style regular expressions
s/.* YOU ARE (depressed|sad).*/I AM SORRY TO HEAR YOU ARE \1/
s/.* YOU ARE (depressed|sad).*/WHY DO YOU THINK YOU ARE \1/
s/.* all.*/IN WHAT WAY/
s/.* always.*/CAN YOU THINK OF A SPECIFIC EXAMPLE/
Step 1: replace first person references with second person references
Step 2: use additional regular expressions to generate replies
Step 3: use scores to rank possible transformations
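
Those substitution rules translate almost directly into code. The sketch below is an illustrative re-implementation of the idea (not Weizenbaum's original script): step 1 swaps first-person words for second-person ones, step 2 applies the first matching pattern; the scoring of competing transformations (step 3) is left out.

```python
import re

# Step 2 rules, in the spirit of the s/// patterns on the slide.
RULES = [
    (r".*YOU ARE (DEPRESSED|SAD).*", r"I AM SORRY TO HEAR YOU ARE \1"),
    (r".*YOU ARE (DEPRESSED|SAD).*", r"WHY DO YOU THINK YOU ARE \1"),
    (r".*\bALL\b.*",                 r"IN WHAT WAY"),
    (r".*ALWAYS.*",                  r"CAN YOU THINK OF A SPECIFIC EXAMPLE"),
]

# Step 1: map first-person forms to second-person forms.
PRONOUNS = {"I": "YOU", "ME": "YOU", "MY": "YOUR", "AM": "ARE"}

def respond(utterance):
    words = utterance.upper().split()
    text = " ".join(PRONOUNS.get(w, w) for w in words)
    for pattern, reply in RULES:               # first matching rule wins (no scoring)
        if re.match(pattern, text):
            return re.sub(pattern, reply, text)
    return "PLEASE GO ON"

print(respond("They're always bugging us about something or other"))
# CAN YOU THINK OF A SPECIFIC EXAMPLE
print(respond("I am depressed much of the time"))
# I AM SORRY TO HEAR YOU ARE DEPRESSED
```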

Finite-state automata Finite-state automata (FSA) Regular languages Regular expressions

Finite-state automata (machines)
The sheep language (baa!, baaa!, baaaa!, baaaaa!, …) is described by the regular expression baa+! and accepted by an FSA with states q0 through q4: arcs labeled b, a, a lead from q0 to q3, a self-loop on a stays at q3, and an arc labeled ! leads into q4. Arcs are state transitions; q4 is the final state.

Input tape: the automaton reads the symbols on an input tape (e.g. a b a ! b) one cell at a time, starting in state q0.

Finite-state automata
Q: a finite set of N states q0, q1, …, qN
Σ: a finite input alphabet of symbols
q0: the start state
F: the set of final states
δ(q, i): the transition function

State-transition table for the sheep-language FSA:
State   b   a   !
0       1   ∅   ∅
1       ∅   2   ∅
2       ∅   3   ∅
3       ∅   3   4
4:      ∅   ∅   ∅   (final state)
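
A transition table like this is all that is needed to run the machine. Here is a minimal Python sketch of the sheep-language FSA, with the table as a dictionary of dictionaries; the accepts name and the use of strings as tapes are illustrative.

```python
# Transition table for the sheep language /baa+!/ (states 0-4, state 4 final).
TRANSITIONS = {
    0: {"b": 1},
    1: {"a": 2},
    2: {"a": 3},
    3: {"a": 3, "!": 4},   # the self-loop on 'a' gives "one or more a's"
    4: {},
}
START, FINAL = 0, {4}

def accepts(tape):
    """Run the deterministic FSA over the input tape, one symbol at a time."""
    state = START
    for symbol in tape:
        if symbol not in TRANSITIONS[state]:
            return False                       # no transition defined: reject
        state = TRANSITIONS[state][symbol]
    return state in FINAL                      # accept only if we end in a final state

for s in ["baa!", "baaaa!", "ba!", "baa", "abab"]:
    print(s, accepts(s))    # True, True, False, False, False
```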

Morphemes Stems, affixes Affixes: prefixes, suffixes, infixes: hingi (borrow) – humingi (agent) in Tagalog, circumfixes: sagen – gesagt in German Concatenative morphology Templatic morphology (Semitic languages) : lmd (learn), lamad (he studied), limed (he taught), lumad (he was taught)

Morphological analysis: rewrites = re + write + s; unbelievably = un + believe + able + ly.

Inflectional morphology Tense, number, person, mood, aspect Five verb forms in English 40+ forms in French Six cases in Russian, seven in Polish Up to 40,000 forms in Turkish (you will cause X to cause Y to … do Z)

Derivational morphology Nominalization: computerization, appointee, killer, fuzziness Formation of adjectives: computational, embraceable, clueless

Finite-state morphological parsing
cats: cat +N +PL
cat: cat +N +SG
cities: city +N +PL
geese: goose +N +PL
ducks: (duck +N +PL) or (duck +V +3SG)
merging: merge +V +PRES-PART
caught: (catch +V +PAST-PART) or (catch +V +PAST)
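
A full solution uses a finite-state lexicon and transducer; the sketch below fakes that with a hand-written lexicon and a few suffix rules, just to show the input/output behaviour listed above. The word lists and the analyze helper are illustrative, and irregular forms are simply enumerated.

```python
# Irregular forms are listed explicitly; regular ones are handled by suffix rules.
IRREGULAR = {
    "geese":  [("goose", "+N +PL")],
    "caught": [("catch", "+V +PAST"), ("catch", "+V +PAST-PART")],
}
NOUNS = {"cat", "city", "duck"}
VERBS = {"duck", "merge"}

def analyze(word):
    """Return all analyses of a surface form as 'stem +features' strings."""
    if word in IRREGULAR:
        return [f"{stem} {feats}" for stem, feats in IRREGULAR[word]]
    analyses = []
    if word in NOUNS:
        analyses.append(f"{word} +N +SG")
    if word.endswith("ies") and word[:-3] + "y" in NOUNS:    # cities -> city
        analyses.append(f"{word[:-3]}y +N +PL")
    elif word.endswith("s") and word[:-1] in NOUNS:          # cats, ducks
        analyses.append(f"{word[:-1]} +N +PL")
    if word.endswith("s") and word[:-1] in VERBS:            # ducks as a verb
        analyses.append(f"{word[:-1]} +V +3SG")
    if word.endswith("ing") and word[:-3] + "e" in VERBS:    # merging -> merge
        analyses.append(f"{word[:-3]}e +V +PRES-PART")
    return analyses

for w in ["cats", "cat", "cities", "geese", "ducks", "merging", "caught"]:
    print(w, "->", analyze(w))
```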

Phonetic symbols IPA Arpabet Examples

Using WFST for language modeling Phonetic representation Part-of-speech tagging

Dependency grammars Lexical dependencies between head words Top-level predicate of a sentence is the root Useful for free word order languages Also simpler to parse

Dependencies: John/NNP likes/VBZ tabby/JJ cats/NNS. The head "likes" has dependents "John" and "cats", and "tabby" depends on "cats"; the corresponding constituent structure is [S [NP John] [VP likes [NP tabby cats]]].

Discourse, dialogue, anaphora Example: John went to Bill’s car dealership to check out an Acura Integra. He looked at it for about half an hour. Example: I’d like to get from Boston to San Francisco, on either December 5th or December 6th. It’s okay if it stops in another city along the way.

Information extraction and discourse analysis Example: First Union Corp. is continuing to wrestle with severe problems unleashed by a botched merger and a troubled business strategy. According to industry insiders at Paine Webber, their president, John R. Georgius, is planning to retire by the end of the year. Problems with summarization and generation

Reference resolution The process of reference (associating “John” with “he”). Referring expressions and referents. Needed: discourse models Problem: many types of reference!

Example (from Webber 91)
According to John, Bob bought Sue an Integra, and Sue bought Fred a Legend.
But that turned out to be a lie. (the referent is a speech act)
But that was false. (a proposition)
That struck me as a funny way to describe the situation. (a manner of description)
That caused Sue to become rather poor. (an event)
That caused them both to become rather poor. (a combination of several events)

Reference phenomena
Indefinite noun phrases: I saw an Acura Integra today.
Definite noun phrases: The Integra was white.
Pronouns: It was white.
Demonstratives: this Acura.
Inferrables: I almost bought an Acura Integra today, but a door had a dent and the engine seemed noisy. Mix the flour, butter, and water. Knead the dough until smooth and shiny.

Constraints on coreference Number agreement: John has an Acura. It is red. Person and case agreement: (*) John and Mary have Acuras. We love them (where We=John and Mary) Gender agreement: John has an Acura. He/it/she is attractive. Syntactic constraints: –John bought himself a new Acura. –John bought him a new Acura. –John told Bill to buy him a new Acura. –John told Bill to buy himself a new Acura –He told Bill to buy John a new Acura.

Preferences in pronoun interpretation Recency: John has an Integra. Bill has a Legend. Mary likes to drive it. Grammatical role: John went to the Acura dealership with Bill. He bought an Integra. (?) John and Bill went to the Acura dealership. He bought an Integra. Repeated mention: John needed a car to go to his new job. He decided that he wanted something sporty. Bill went to the Acura dealership with him. He bought an Integra.

Preferences in pronoun interpretation Parallelism: Mary went with Sue to the Acura dealership. Sally went with her to the Mazda dealership. ??? Mary went with Sue to the Acura dealership. Sally told her not to buy anything. Verb semantics: John telephoned Bill. He lost his pamphlet on Acuras. John criticized Bill. He lost his pamphlet on Acuras.

Salience weights in Lappin and Leass
Sentence recency: 100
Subject emphasis: 80
Existential emphasis: 70
Accusative emphasis: 50
Indirect object and oblique complement emphasis: 40
Non-adverbial emphasis: 50
Head noun emphasis: 80

Lappin and Leass (cont’d) Recency: weights are cut in half after each sentence is processed. Examples: –An Acura Integra is parked in the lot. (subject) –There is an Acura Integra parked in the lot. (existential predicate nominal) –John parked an Acura Integra in the lot. (object) –John gave Susan an Acura Integra. (indirect object) –In his Acura Integra, John showed Susan his new CD player. (demarcated adverbial PP)

Algorithm
1. Collect the potential referents (up to four sentences back).
2. Remove potential referents that do not agree in number or gender with the pronoun.
3. Remove potential referents that do not pass intrasentential syntactic coreference constraints.
4. Compute the total salience value of the referent by adding any applicable values for role parallelism (+35) or cataphora (-175).
5. Select the referent with the highest salience value. In case of a tie, select the closest referent in terms of string position.
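
A minimal sketch of this procedure, assuming the mentions and their grammatical factors are already identified (a real system would get them from a parser); the parallelism and cataphora adjustments of step 4 are omitted, and the resolve and salience names are illustrative. Each sentence boundary between a mention and the pronoun halves the mention's contribution, as in the recency rule above.

```python
WEIGHTS = {
    "recency": 100, "subject": 80, "existential": 70, "object": 50,
    "indirect_object": 40, "non_adverbial": 50, "head_noun": 80,
}

def salience(mention):
    """Sum the weights of the salience factors marked on a mention."""
    return sum(WEIGHTS[f] for f in mention["factors"])

def resolve(pronoun, discourse):
    """discourse: earlier sentences, each a list of mention dicts.
    Scores are halved for every sentence boundary back to the mention."""
    scores = {}
    for distance, sentence in enumerate(reversed(discourse)):
        for m in sentence:
            if m["gender"] != pronoun["gender"]:
                continue                                   # agreement filter (step 2)
            decayed = salience(m) / (2 ** (distance + 1))  # recency decay
            scores[m["entity"]] = scores.get(m["entity"], 0) + decayed
    return max(scores, key=scores.get) if scores else None

discourse = [[   # "John saw a beautiful Acura Integra at the dealership last week."
    {"entity": "John", "gender": "masc",
     "factors": ["recency", "subject", "non_adverbial", "head_noun"]},   # 310/2 = 155
    {"entity": "Integra", "gender": "neut",
     "factors": ["recency", "object", "non_adverbial", "head_noun"]},    # 280/2 = 140
    {"entity": "dealership", "gender": "neut",
     "factors": ["recency", "non_adverbial", "head_noun"]},              # 230/2 = 115
]]

print(resolve({"gender": "masc"}, discourse))   # John     ("He showed ...")
print(resolve({"gender": "neut"}, discourse))   # Integra  ("... it to Bill")
```

With these weights, the halved totals come out to 155, 140, and 115, matching the values in the worked example on the following slides.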

Example: John saw a beautiful Acura Integra at the dealership last week. He showed it to Bill. He bought it. (Salience-factor table for John, the Integra, and the dealership, with columns Rec, Subj, Exist, Obj, IndObj, NonAdv, HeadN, and Total; the resulting totals appear on the next slides.)

Example (cont’d)
Referent     Phrases                          Value
John         {John}                           155
Integra      {a beautiful Acura Integra}      140
dealership   {the dealership}                 115

Example (cont’d)
Referent     Phrases                          Value
John         {John, he1}                      465
Integra      {a beautiful Acura Integra}      140
dealership   {the dealership}                 115

Example (cont’d)
Referent     Phrases                          Value
John         {John, he1}                      465
Integra      {a beautiful Acura Integra, it}  420
dealership   {the dealership}                 115

Example (cont’d)
Referent     Phrases                          Value
John         {John, he1}                      465
Integra      {a beautiful Acura Integra, it}  420
Bill         {Bill}                           270
dealership   {the dealership}                 115

Example (cont’d)
Referent     Phrases                           Value
John         {John, he1}                       232.5
Integra      {a beautiful Acura Integra, it1}  210
Bill         {Bill}                            135
dealership   {the dealership}                  57.5

Observations: Lappin & Leass, tested on computer manuals, reached 86% accuracy on unseen data. Centering (Grosz, Joshi, Weinstein): additional concept of a "center" – at any time in discourse, an entity is centered. Backward-looking center; forward-looking centers (a set). Centering has not been automatically tested on actual data.

Part of speech tagging. Problems: transport, object, discount, address. More problems: content. French: est, président, fils. "Book that flight" – what is the part of speech associated with "book"? POS tagging: assigning parts of speech to words in a text. Three main techniques: rule-based tagging, stochastic tagging, transformation-based tagging.

Rule-based POS tagging Use dictionary or FST to find all possible parts of speech Use disambiguation rules (e.g., ART+V) Typically hundreds of constraints can be designed manually

Example in French ("La teneur moyenne en uranium des rivières, bien que délicate à calculer"):
Word        Candidate tags           Gloss
^           (beginning of sentence)
La          rf b nms u               article
teneur      nfs nms                  noun, feminine singular
moyenne     jfs nfs v1s v2s v3s      adjective, feminine singular
en          p a b                    preposition
uranium     nms                      noun, masculine singular
des         p r                      preposition
rivières    nfp                      noun, feminine plural
,           x                        punctuation
bien_que    cs                       subordinating conjunction
délicate    jfs                      adjective, feminine singular
à           p                        preposition
calculer    v                        verb

Sample rules
BS3 BI1: A BS3 (3rd person subject personal pronoun) cannot be followed by a BI1 (1st person indirect personal pronoun). In the example "il nous faut" (we need), "il" has the tag BS3MS and "nous" has the tags [BD1P BI1P BJ1P BR1P BS1P]. The negative constraint "BS3 BI1" rules out "BI1P", and thus leaves only 4 alternatives for the word "nous".
N K: The tag N (noun) cannot be followed by a tag K (interrogative pronoun); an example in the test corpus would be "... fleuve qui ..." (... river, that ...). Since "qui" can be tagged both as an "E" (relative pronoun) and a "K" (interrogative pronoun), the "E" will be chosen by the tagger, since an interrogative pronoun cannot follow a noun ("N").
R V: A word tagged with R (article) cannot be followed by a word tagged with V (verb): for example "l'appelle" (calls him/her). The word "appelle" can only be a verb, but "l'" can be either an article or a personal pronoun. Thus, the rule will eliminate the article tag, giving preference to the pronoun.
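
These negative constraints can be applied mechanically once every word has its set of dictionary tags. The sketch below is illustrative, not the original tagger: the candidate tag sets and the BP3 pronoun tag for "l'" are invented stand-ins, and constraints are written as forbidden (left-prefix, right-prefix) pairs that prune a neighbour's tags whenever one side of the pair is unambiguous.

```python
# Candidate tags per word (illustrative; BP3 is a made-up pronoun tag for "l'").
CANDIDATES = {
    "il":      {"BS3MS"},
    "nous":    {"BD1P", "BI1P", "BJ1P", "BR1P", "BS1P"},
    "faut":    {"V"},
    "l'":      {"R", "BP3"},        # article or personal pronoun
    "appelle": {"V"},
}

# Forbidden (left-tag-prefix, right-tag-prefix) bigrams, as in the rules above.
NEGATIVE = [("BS3", "BI1"), ("N", "K"), ("R", "V")]

def disambiguate(words):
    tags = [set(CANDIDATES[w]) for w in words]
    for i in range(len(words) - 1):
        for left, right in NEGATIVE:
            if all(t.startswith(left) for t in tags[i]):
                kept = {t for t in tags[i + 1] if not t.startswith(right)}
                if kept:                       # never empty a word's tag set
                    tags[i + 1] = kept
            if all(t.startswith(right) for t in tags[i + 1]):
                kept = {t for t in tags[i] if not t.startswith(left)}
                if kept:
                    tags[i] = kept
    return tags

print(disambiguate(["il", "nous", "faut"]))    # "nous" loses its BI1P reading
print(disambiguate(["l'", "appelle"]))         # "l'" loses the article tag R
```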

Confusion matrix over the tags IN, JJ, NN, NNP, RB, VBD, and VBN (matrix cell values omitted). Most confusing: NN vs. NNP vs. JJ, VBD vs. VBN vs. JJ.

HMM Tagging
T = argmax_T P(T|W), where T = t_1, t_2, …, t_n
By Bayes' theorem: P(T|W) = P(T) P(W|T) / P(W)
Thus we are attempting to choose the sequence of tags that maximizes the right-hand side of the equation; P(W) can be ignored.
P(T) P(W|T) = ?
P(T) is called the prior, P(W|T) is called the likelihood.

HMM tagging (cont’d)
P(T) P(W|T) = ∏_i P(w_i | w_1 t_1 … w_{i-1} t_{i-1} t_i) P(t_i | t_1 … t_{i-2} t_{i-1})
Simplification 1: P(W|T) = ∏_i P(w_i | t_i)
Simplification 2: P(T) = ∏_i P(t_i | t_{i-1})
T = argmax_T P(T|W) = argmax_T ∏_i P(w_i | t_i) P(t_i | t_{i-1})

Estimates
P(NN|DT) = C(DT,NN)/C(DT) = 56509/… = .49
P(is|VBZ) = C(VBZ,is)/C(VBZ) = 10073/21627 = .47
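
The estimates themselves are just relative frequencies over a tagged corpus. A minimal sketch of the maximum-likelihood computation, using a toy seven-token corpus in place of the large tagged corpus behind the counts quoted above; the helper names are illustrative.

```python
from collections import Counter

# Toy tagged corpus; real estimates would use a large tagged corpus.
corpus = [("the", "DT"), ("cat", "NN"), ("is", "VBZ"), ("cute", "JJ"),
          ("the", "DT"), ("dog", "NN"), ("barks", "VBZ")]

tag_counts = Counter(tag for _, tag in corpus)
bigram_counts = Counter((corpus[i][1], corpus[i + 1][1])
                        for i in range(len(corpus) - 1))
emission_counts = Counter(corpus)

def p_tag_given_prev(tag, prev):      # P(t_i | t_{i-1}) = C(prev, tag) / C(prev)
    return bigram_counts[(prev, tag)] / tag_counts[prev]

def p_word_given_tag(word, tag):      # P(w_i | t_i) = C(tag, word) / C(tag)
    return emission_counts[(word, tag)] / tag_counts[tag]

print(p_tag_given_prev("NN", "DT"))   # 1.0 in this toy corpus
print(p_word_given_tag("the", "DT"))  # 1.0 in this toy corpus
```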

Example Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NR People/NNS continue/VBP to/TO inquire/VB the/AT reason/NN for/IN the/AT race/NN for/IN outer/JJ space/NN TO: to+VB (to sleep), to+NN (to school)

Example: two candidate tag sequences for "Secretariat is expected to race tomorrow": NNP VBZ VBN TO VB NR (with "race" tagged as a verb) versus NNP VBZ VBN TO NN NR (with "race" tagged as a noun).

Example (cont’d)
P(VB|TO) = .83, P(NN|TO) = …
P(race|VB) = …, P(race|NN) = …
P(NR|VB) = .0027, P(NR|NN) = .0012
P(VB|TO) P(NR|VB) P(race|VB) = …
P(NN|TO) P(NR|NN) P(race|NN) = …

Decoding Finding what sequence of states is the source of a sequence of observations Viterbi decoding (dynamic programming) – finding the optimal sequence of tags Input: HMM and sequence of words, output: sequence of states
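
A minimal Viterbi sketch for the bigram HMM tagger described above. The transition and emission tables are tiny stand-ins: .83, .0027, and .0012 are the values quoted on the earlier "race" slide, while the remaining numbers are made-up placeholders, and the <s> start symbol and function name are illustrative.

```python
import math

# Tiny illustrative probability tables: .83, .0027 and .0012 are the values
# quoted on the "race" slide; the other numbers are made-up placeholders.
TAGS = ["TO", "VB", "NN", "NR"]
TRANS = {                                   # P(tag | previous tag)
    ("<s>", "TO"): 1.0,
    ("TO", "VB"): 0.83, ("TO", "NN"): 0.001,
    ("VB", "NR"): 0.0027, ("NN", "NR"): 0.0012,
}
EMIT = {                                    # P(word | tag)
    ("to", "TO"): 1.0,
    ("race", "VB"): 0.0001, ("race", "NN"): 0.0005,
    ("tomorrow", "NR"): 1.0,
}

def viterbi(words):
    """Return the most probable tag sequence for the observed words."""
    # trellis[i][tag] = (best log-probability of reaching tag, back-pointer tag)
    trellis = [{"<s>": (0.0, None)}]
    for word in words:
        column = {}
        for tag in TAGS:
            emit_p = EMIT.get((word, tag), 0.0)
            if emit_p == 0.0:
                continue
            for prev, (score, _) in trellis[-1].items():
                trans_p = TRANS.get((prev, tag), 0.0)
                if trans_p == 0.0:
                    continue
                cand = score + math.log(trans_p) + math.log(emit_p)
                if tag not in column or cand > column[tag][0]:
                    column[tag] = (cand, prev)        # keep the best predecessor
        trellis.append(column)
    # Backtrace: start at the best state in the last column, follow back-pointers.
    tag = max(trellis[-1], key=lambda t: trellis[-1][t][0])
    tags = []
    for column in reversed(trellis[1:]):
        tags.append(tag)
        tag = column[tag][1]
    return list(reversed(tags))

print(viterbi(["to", "race", "tomorrow"]))   # ['TO', 'VB', 'NR']
```

As on the slides, the dynamic program keeps only the best-scoring path into each state, so the work per word is linear in the number of tag pairs rather than exponential in the sentence length.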