1
CS460/IT632 Natural Language Processing / Language Technology for the Web
Lecture 2 (06/01/06)
Prof. Pushpak Bhattacharyya, IIT Bombay
Part of Speech (PoS) Tagging

2
**Tagging or Annotation**

The purpose is disambiguation: a word can have a number of labels, and the problem is to give it a unique label.
PoS tagging makes use of the local context, whereas sense tagging needs long-distance dependencies and is hence more difficult.
PoS tagging is needed mainly in parsing, and also in other applications.

3
**Approaches**

Rule-based approach
Statistical approach
We will mainly focus on the statistical approach.

4
**Types of Tagging Tasks**

PoS tagging
Named entity tagging
Sense tagging
Parse tree tagging

5
**PoS Tagging Example**

"The Orange ducks clean the bills."
Assign tags to each word from the lexicon; multiple possibilities exist.

6
**Lexicon**

Dictionary entries:
The: DT (determiner)
Orange: NN (noun), JJ (adjective)
Duck: NN, VB (base verb)
Clean: NN, VB
Bill: NN, VB
DT, JJ, NN, VB are called syntactic entities or PoS tags.

7
**PoS tagging as a sequence labelling task**

The task is to assign the correct PoS tag sequence to the words. It can be:
Unigram: consider one word at a time while deciding the sequence.
Multigram: consider multiple words.
There are 16 (= 1 × 2 × 2 × 2 × 1 × 2) possible tag sequences for the "duck" example (enumerated in the sketch below).
It is a classification problem: classify each word's tag correctly into the right category.
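
A minimal Python sketch of this enumeration, using the toy lexicon from the previous slide (variable names are illustrative):

```python
from itertools import product

# Toy lexicon from the slide: each word maps to its possible PoS tags.
lexicon = {
    "the":    ["DT"],
    "orange": ["NN", "JJ"],
    "ducks":  ["NN", "VB"],
    "clean":  ["NN", "VB"],
    "bills":  ["NN", "VB"],
}

sentence = "The Orange ducks clean the bills".split()

# Cartesian product of the per-word tag sets: 1*2*2*2*1*2 = 16 sequences.
candidates = list(product(*(lexicon[w.lower()] for w in sentence)))
print(len(candidates))   # 16
print(candidates[0])     # ('DT', 'NN', 'NN', 'NN', 'DT', 'NN')
```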

8
**Challenges**

Lexical ambiguity: multiple tag choices for a word.
Morphological analysis: finding the root word.
Tokenization: finding word boundaries.
In the Thai language there are no blank spaces between words.
Non-trivial in general (example: capturing boundaries when a word is continued on the next line with a "-").

9
**Named Entity tagging**

Example 1: "Mohan went to school in Kolkata", tagged as "Mohan_Person went to school_Place in Kolkata_Place".
Example 2: "Kolkata bore the brunt of 1947 riots when 1947 children died at Kolkata", tagged as "Kolkata_? bore the brunt of 1947_year riots when 1947_num children died at Kolkata_Place".

10
**Sense tagging**

Detecting the meaning of each word. Our example is tagged as: The Orange_{colour} ducks_{bird} clean the bills_{body_part}.
Sense tagging here is done by means of hypernymy. Semantic relations like hypernymy are stored in the lexical resource called WordNet.
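
As an illustration of the hypernymy relation, a small sketch using NLTK's WordNet interface (this assumes nltk and its WordNet data are installed; the exact synsets returned depend on the WordNet version):

```python
from nltk.corpus import wordnet as wn

# Look up the senses (synsets) of "duck" and walk one step up the
# hypernymy hierarchy for each sense.
for synset in wn.synsets("duck"):
    hypernyms = [h.name() for h in synset.hypernyms()]
    print(synset.name(), "->", hypernyms)
# e.g. duck.n.01 -> ['anseriform_bird.n.01'] (output varies by version).
# Sense tagging picks the synset whose hypernyms fit the context.
```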

11
**Parse Tree tagging**

Example parse tree: (tree figure shown on the slide; its bracketed form appears on the next slide)

12
**Parse Tree tagging (contd.)**

Given a grammar, one can construct the parse tree. Annotation produces the following structure:
[[The_DT [Orange_JJ Ducks_NN]NP]NP [clean_VB [the_DT [bills_NN]NP]NP]VP]S
This structure is called the Penn Treebank form.
From the Treebank form, one can arrive at a grammar through learning.
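
The same analysis in the more common Penn Treebank bracket style (node labels rather than tag suffixes) can be read back into a tree object; a minimal sketch, assuming NLTK is available:

```python
from nltk import Tree

# The slide's annotation rewritten in standard Penn Treebank notation.
s = """(S (NP (DT The) (NP (JJ Orange) (NN ducks)))
          (VP (VB clean) (NP (DT the) (NP (NN bills)))))"""
tree = Tree.fromstring(s)
tree.pretty_print()   # draws the parse tree as ASCII art
print(tree.pos())     # [('The', 'DT'), ('Orange', 'JJ'), ...]
```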

13
**Statistical Formulation of the PoS tagging problem**

Input: W1, W2, ..., Wn: the words; C1, C2, ..., Cm: the repository of lexical tags (DT, JJ, NN, etc.).
Output: the "best" PoS tag sequence Ci1, Ci2, Ci3, ..., Cin for the given words.
"Best" means: P(Ci1, Ci2, Ci3, ..., Cin | W1, W2, ..., Wn) is the maximum over all possible C-sequences.

14
**Statistical Formulation of the PoS tagging problem (contd.)**

Example: P(DT JJ NN | The orange duck) > P(DT NN VB | The orange duck) is required.
Why? Because given the phrase "The orange duck", there is overwhelming evidence in the corpus that DT JJ NN is the right tag sequence.

15
**Mathematical machinery**


16
**Bayes Theorem**

P(A|B) = P(A) · P(B|A) / P(B), where:
P(A): prior probability
P(A|B): posterior probability
P(B|A): likelihood
Why apply Bayes theorem? This is the generative vs. discriminative model question.
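
A tiny numeric check of the theorem with invented probabilities (say A = "the tag is NN", B = "the word is 'duck'"; all numbers are illustrative only):

```python
# Invented numbers for illustration only.
p_A = 0.3            # prior P(A)
p_B_given_A = 0.02   # likelihood P(B|A)
p_B = 0.01           # evidence P(B)

p_A_given_B = p_A * p_B_given_A / p_B   # posterior P(A|B)
print(p_A_given_B)   # ~0.6
```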

17
**Apply Bayes theorem**

P(Ci1, Ci2, Ci3, ..., Cin | W1, W2, ..., Wn) = P(C|W) = P(C) · P(W|C) / P(W)
where C = <Ci1, Ci2, Ci3, ..., Cin> and W = <W1, W2, ..., Wn>.

18
**Best tag sequence**

C* = <Ci1, Ci2, Ci3, ..., Cin>*, where * signifies the best C-sequence:
C* = argmax P(C|W)
Since the denominator P(W) is common to all tag sequences,
C* = argmax P(C) · P(W|C)
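
A brute-force sketch of this argmax over the 16 candidate sequences, with invented scores (`p_C` and `p_W_given_C` are toy stand-ins; real taggers use dynamic programming rather than enumeration):

```python
from itertools import product

lexicon = {"the": ["DT"], "orange": ["NN", "JJ"],
           "ducks": ["NN", "VB"], "clean": ["NN", "VB"],
           "bills": ["NN", "VB"]}
words = "the orange ducks clean the bills".split()

def p_C(tags):
    # Toy prior: favour sequences that start DT JJ NN.
    return 0.9 if tags[:3] == ("DT", "JJ", "NN") else 0.1

def p_W_given_C(words, tags):
    # Toy likelihood: uniform, so the prior decides here.
    return 1.0

best = max(product(*(lexicon[w] for w in words)),
           key=lambda tags: p_C(tags) * p_W_given_C(words, tags))
print(best)   # a sequence beginning ('DT', 'JJ', 'NN', ...)
```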

19
**Processing the 1st part**

P(C) = P(Ci1, Ci2, Ci3, ..., Cin)
= P(Ci1) · P(Ci2|Ci1) · P(Ci3|Ci1 Ci2) ... P(Cin|Ci1 Ci2 ... Cin-1)
(on applying the chain rule of probability)
Example: P(DT JJ NN) = P(DT) · P(JJ|DT) · P(NN|DT JJ)

20
**Markov assumption**

A tag depends only on a window of preceding tags, not on everything that the chain rule of probability demands.
A K-gram Markov assumption considers only the previous K-1 tags.
Typical values: K = 3 for English and (it seems) K = 5 for Hindi.

21
**Apply assumption**

With K = 2 (the bigram assumption), our problem becomes:
P(C) = ∏_{i=1..n} P(Ci | Ci-1)
with C0 the sentence-beginning marker.
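
A minimal sketch of this bigram prior (the transition probabilities and the `<s>` start marker are invented for illustration; in practice they are estimated from counts in a tagged corpus):

```python
# Toy bigram tag-transition table P(C_i | C_{i-1}); "<s>" plays C0.
trans = {
    ("<s>", "DT"): 0.6,
    ("DT", "JJ"):  0.3,
    ("JJ", "NN"):  0.5,
}

def p_C_bigram(tags):
    """P(C) = product over i of P(Ci | Ci-1), with C0 = '<s>'."""
    p, prev = 1.0, "<s>"
    for tag in tags:
        p *= trans.get((prev, tag), 1e-6)  # tiny floor for unseen pairs
        prev = tag
    return p

print(p_C_bigram(("DT", "JJ", "NN")))   # 0.6 * 0.3 * 0.5 ≈ 0.09
```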

22
**Exercise given in the lecture**

Contrast PoS tagging with sense tagging. Find an example that shows the difference.
