Part-of-Speech Tagging Using Hidden Markov Models

1 Part-of-Speech Tagging Using Hidden Markov Models
Alankar Kotwal, Sharad Mirani (12d d070003) Hi, I'm Alankar, and this is Sharad. Our project is about a crucial part of Natural Language Processing called POS tagging. EE 638: Estimation and Identification

2 Teach a computer to recognize the role of words in a sentence
So this is the problem in general.

3 ‘Roles of words’? Why do we care?
Text → Understanding 'Why do we care?' is a natural question to ask. The entire field of Natural Language Processing is built on the hope that computers can, one day, 'understand' text content as well as humans can. However, there is an inherent gap between words, their meanings, and the concepts they convey in a sentence. Clearly, all three are important in understanding the sentence. Our job here is to work at the highest level of all, the concept level. Each word expresses a concept: a noun is an entity, an adjective is a quality. We will see how parts of speech arise naturally when trying to understand the 'concept' carried by the words in a sentence.

4 Why is this non-trivial?
Are you understanding this? She was very understanding. Verb (participle)! Adjective! One might say this could easily be achieved: build a set of rules that dictate which word is which part of speech. But there exist words that act as multiple parts of speech depending on context. Even the word 'dogs' can be a verb. Some words, like 'still', can represent as many as seven distinct parts of speech. Now take the example on the screen: what distinguishes 'understanding' in the two sentences? It is a verb in the first sentence and an adjective in the second. Note that given the 'context' of the word, it is possible to predict the POS definitively! This turns out to be true in general, and it motivates why Hidden Markov Models are such a good fit for the problem.

5 The Markov Model: Transitions
Start Pronoun Noun Verb Article Adjective End Thus, in general, a small sentence model would look like this. For instance, the sentence 'She was very understanding' would traverse pronoun, verb, adverb, adjective.
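The transition structure sketched on this slide can be written down as a table of probabilities. A minimal sketch, with made-up numbers for illustration only (the real values are estimated from the corpus during training):

```python
# Hypothetical transition probabilities for the small model on the
# slide; each row sums to 1. Real values come from training data.
transitions = {
    "START":     {"Pronoun": 0.4, "Article": 0.4, "Noun": 0.2},
    "Pronoun":   {"Verb": 1.0},
    "Article":   {"Adjective": 0.5, "Noun": 0.5},
    "Adjective": {"Adjective": 0.3, "Noun": 0.4, "END": 0.3},
    "Noun":      {"Verb": 0.6, "END": 0.4},
    "Verb":      {"Adjective": 0.4, "Article": 0.3, "END": 0.3},
}

def sequence_prob(tags):
    """Probability of a full tag sequence: product of chain steps."""
    path = ["START"] + tags + ["END"]
    p = 1.0
    for a, b in zip(path, path[1:]):
        p *= transitions.get(a, {}).get(b, 0.0)
    return p
```

Under a Markov chain like this, the probability of a whole tag sequence factors into one transition probability per step, which is what makes counting-based estimation tractable.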

6 The Markov Model: Emissions
Start Pronoun: He, She, It Verb: Is, Was, Presenting Noun: Estimation, Identification Adjective: Beautiful, Interesting Article: A, An, The End Each state emits words; we model the distribution of words each state emits. Unknown words? Define a word __UNK__ which has equal emission probability in all states.
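The emission side, including the __UNK__ convention from the slide, can be sketched like this; the per-word probabilities are hypothetical:

```python
# Hypothetical emission distributions for the states on the slide.
emissions = {
    "Pronoun":   {"he": 0.4, "she": 0.4, "it": 0.2},
    "Verb":      {"is": 0.5, "was": 0.4, "presenting": 0.1},
    "Noun":      {"estimation": 0.5, "identification": 0.5},
    "Adjective": {"beautiful": 0.5, "interesting": 0.5},
    "Article":   {"a": 0.3, "an": 0.2, "the": 0.5},
}
UNK_PROB = 1e-3  # __UNK__: equal emission probability in every state

def emission_prob(state, word):
    # A word never seen in training falls back to the __UNK__ mass.
    # Because it is the same for every state, it never biases the tag
    # choice by itself; only the transition context decides.
    return emissions[state].get(word, UNK_PROB)
```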

7 Sentence Representation
Words, POS → Numbers, via HashMap. Computers cannot operate on words directly. How do we represent sentences? HashMaps: map each word and each POS tag to an integer index.
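A minimal sketch of this hashing step, assuming a reserved __UNK__ slot for words that never appeared in training:

```python
# Assign each distinct token the next free integer index.
def build_index(tokens):
    index = {}
    for tok in tokens:
        index.setdefault(tok, len(index))
    return index

train_words = "she was very understanding".split()
word_index = build_index(train_words + ["__UNK__"])

def encode(sentence):
    # Unseen words collapse onto the __UNK__ index.
    unk = word_index["__UNK__"]
    return [word_index.get(w, unk) for w in sentence.split()]
```

The same construction is applied to the POS tag set, so both sentences and tag sequences become integer arrays.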

8 Our dataset: WSJ Excerpts
Confidence/NN in/IN the/DT pound/NN is/VBZ widely/RB expected/VBN to/TO take/VB another/DT sharp/JJ dive/NN ./. Word/POS format. We picked a dataset of sentences taken from the Wall Street Journal; you see an example on the screen.
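Parsing one such Word/POS line is a one-liner; splitting on the last '/' keeps words that themselves contain a slash intact. A sketch:

```python
# Split each token on its LAST '/' into a (word, tag) pair.
line = ("Confidence/NN in/IN the/DT pound/NN is/VBZ widely/RB "
        "expected/VBN to/TO take/VB another/DT sharp/JJ dive/NN ./.")

pairs = [tok.rsplit("/", 1) for tok in line.split()]
words = [w for w, _ in pairs]
tags = [t for _, t in pairs]
```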

9 Estimating Transition Probability
Training. Estimating transition probability: increment the count of each (POS, next POS) pair. Estimating emission probability: increment the count of each (POS, word) pair. How do we train?
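The counting pass described on the slide can be sketched as follows (function and table names are assumptions, not the project's actual code):

```python
from collections import defaultdict

def train(tagged_sentences):
    """One pass over [(word, tag), ...] sentences: count, then
    normalise each row of counts into a probability distribution."""
    trans = defaultdict(lambda: defaultdict(int))
    emit = defaultdict(lambda: defaultdict(int))
    for sentence in tagged_sentences:
        prev = "START"
        for word, tag in sentence:
            trans[prev][tag] += 1   # (previous POS, POS) count
            emit[tag][word] += 1    # (POS, word) count
            prev = tag
        trans[prev]["END"] += 1     # close the sentence
    def normalise(table):
        return {k: {v: c / sum(row.values()) for v, c in row.items()}
                for k, row in table.items()}
    return normalise(trans), normalise(emit)
```

Because both estimates are simple relative frequencies, training is a single linear scan over the corpus.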

10 Hash the input test cases
Testing. Hash the input test cases, then run the Viterbi algorithm. How do we test?
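A compact sketch of the Viterbi decoding step, assuming dict-of-dict probability tables `trans` and `emit` like those produced by the counting step (names are assumptions, and ties are broken arbitrarily):

```python
def viterbi(words, tags, trans, emit, unk_prob=1e-3):
    """Most likely tag sequence for `words` under a bigram HMM."""
    # best[t] = (probability, tag path) of the best path ending in t
    best = {t: (trans.get("START", {}).get(t, 0.0)
                * emit.get(t, {}).get(words[0], unk_prob), [t])
            for t in tags}
    for word in words[1:]:
        step = {}
        for t in tags:
            e = emit.get(t, {}).get(word, unk_prob)
            prob, path = max(
                ((best[p][0] * trans.get(p, {}).get(t, 0.0), best[p][1])
                 for p in tags),
                key=lambda x: x[0])
            step[t] = (prob * e, path + [t])
        best = step
    prob, path = max(
        ((best[t][0] * trans.get(t, {}).get("END", 0.0), best[t][1])
         for t in tags),
        key=lambda x: x[0])
    return path
```

A production version would work in log space to avoid underflow on long sentences; raw products are kept here for readability.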

11 Our results A set is a collection. => {ART, NN, VBZ, ART, NN, .} Did you set my alarm? => {VBD, PRP, VB, PRP$, NN, .} When bank financing for the buy-out collapsed last week, so did UAL's stock. => {WRB, NN, NN, IN, DT, NN, VBD, JJ, NN, ,, RB, VBD, NNP, POS, NN, .} This is my awesome* project for Estimation*. => {DT, VBZ, PRP$, JJ, NN, IN, NN, .} A few results demonstrating the capability of the HMM to model context. The stars denote words not present in the train set. Train set: 8637 sentences; test set: 1952 sentences; 45 POS tags => error rate 5.17%.

12 Where do we fail? Unknown words in some cases => not much we can do without 'meaning'. Confusing usage: how do you tell whether 'set' is present or past tense? => Global context? Difficulty disambiguating POS from just the previous word => Trigram HMMs! So here are our failure cases. Compare these to what a human would do in the same situation!
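The trigram fix mentioned above conditions each tag on the previous two tags instead of one. The counting step would extend along these lines (a sketch; only the counting is shown, not the normalisation or decoding):

```python
from collections import defaultdict

def trigram_counts(tag_sequences):
    """Count (tag[i-2], tag[i-1], tag[i]) triples over a corpus,
    padding the front with two START symbols and closing with END."""
    counts = defaultdict(int)
    for tags in tag_sequences:
        padded = ["START", "START"] + list(tags) + ["END"]
        for a, b, c in zip(padded, padded[1:], padded[2:]):
            counts[(a, b, c)] += 1
    return counts
```

The trade-off is sparsity: with 45 tags there are 45³ possible triples, so trigram models usually need smoothing or back-off to bigram counts.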

13 Thank You It has been a real pleasure exploring NLP from a statistical perspective! Thank you!

