POS Tagging & Chunking
Sambhav Jain, LTRC, IIIT Hyderabad
POS Tagging - Introduction
Why do we need POS tags?
– They assign each word a category from a tag set that is far smaller than the vocabulary
– They are essential input for downstream NLP systems
– They carry grammatical information about the word
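To make this concrete, here is a tiny invented example of what a tagger produces. The tag names follow common Penn-Treebank-style conventions, which these slides do not fix, so treat them purely as illustration:

# A tagged sentence: each word, drawn from a large open vocabulary,
# is mapped to one label from a small closed tag set.
tagged = [("The", "DT"), ("old", "JJ"), ("one", "NN"), ("in", "IN"),
          ("the", "DT"), ("garden", "NN"), ("sleeps", "VBZ")]
print(tagged)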
Building a POS Tagger
Rule-Based Approaches
– Come up with rules by hand (a toy sketch follows below)
– E.g. the substitution test: The {sad, green, fat, old} one in the garden.
Statistical Learning / Machine Learning Approaches
– Let the machine learn automatically from annotated instances
– Unsupervised learning is also possible (Baum-Welch)
Pros and cons of each approach
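To make the rule-based route concrete, a minimal Python sketch with a handful of invented, hand-written rules (a real rule-based tagger needs far more rules plus a strategy for ordering and repairing them):

# Toy rule-based tagger: closed-class lookups plus crude suffix rules.
def rule_based_tag(word):
    if word.lower() in {"the", "a", "an"}:
        return "DT"          # determiners form a small closed class
    if word.endswith("ly"):
        return "RB"          # many -ly words are adverbs
    if word.endswith("ing") or word.endswith("ed"):
        return "VB"          # crude verb guess
    return "NN"              # default to noun

print([(w, rule_based_tag(w)) for w in "The old one sleeping quietly".split()])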
Machine Learning - ???
How can a machine learn from examples? (on board)
Problem Statement
Given W = w1 w2 ... wn
Find T = t1 t2 ... tn
i.e. the tag sequence T for which P(T | W) is maximum
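Read literally, the problem statement is a search over every possible tag sequence. A brute-force sketch (the scoring function is left abstract here, since the model is introduced on the next slide) shows why this naive search is exponential in sentence length and why an efficient decoder is needed later:

from itertools import product

# Naive argmax over all |tagset|^n tag sequences, feasible only for toy inputs.
def brute_force_best_tags(words, tagset, score):
    # score(words, tags) returns some estimate of P(T | W)
    return max(product(tagset, repeat=len(words)),
               key=lambda tags: score(words, tags))

# Dummy usage with a scorer that simply prefers NN tags:
print(brute_force_best_tags(["the", "dog"], ("DT", "NN", "VB"),
                            lambda ws, ts: sum(t == "NN" for t in ts)))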
Hidden Markov Models
P(T | W) = P(W | T) * P(T) / P(W)
T* = argmax_T P(W | T) * P(T) / P(W) = argmax_T P(W | T) * P(T)
P(W | T) = P(w1 w2 ... wn | t1 t2 ... tn) (on board)
– Chain rule
– Markov assumption
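A sketch of the resulting score: after Bayes' rule, the chain rule, a bigram Markov assumption over tags, and the usual assumption that each word depends only on its own tag, P(W | T) * P(T) reduces to a product of emission and transition probabilities. The emission and transition dictionaries are assumed inputs here; how to estimate them is the next slide:

# P(W | T) * P(T) is approximated by  prod_i P(w_i | t_i) * P(t_i | t_i-1)
def hmm_score(words, tags, emission, transition, start="<s>"):
    score, prev = 1.0, start
    for w, t in zip(words, tags):
        score *= emission.get((t, w), 0.0) * transition.get((prev, t), 0.0)
        prev = t
    return score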
Hidden Markov Models
How can we learn these values from an annotated corpus?
– Emission matrix
– Transition matrix
Example (on board)
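A sketch of how both matrices fall out of simple counting over an annotated corpus (maximum-likelihood estimates; a real tagger would also smooth for unseen words and tag pairs). Here corpus is assumed to be a list of sentences, each a list of (word, tag) pairs:

from collections import Counter

# Relative-frequency estimates:
#   emission[(t, w)]   = count(tag t emits word w)      / count(t)
#   transition[(p, t)] = count(tag p followed by tag t) / count(p as predecessor)
def estimate_hmm(corpus, start="<s>"):
    emit, trans, tag_count, prev_count = Counter(), Counter(), Counter(), Counter()
    for sentence in corpus:
        prev = start
        for word, tag in sentence:
            emit[(tag, word)] += 1
            trans[(prev, tag)] += 1
            tag_count[tag] += 1
            prev_count[prev] += 1
            prev = tag
    emission = {(t, w): c / tag_count[t] for (t, w), c in emit.items()}
    transition = {(p, t): c / prev_count[p] for (p, t), c in trans.items()}
    return emission, transition

# Tiny usage example:
emission, transition = estimate_hmm([[("the", "DT"), ("dog", "NN"), ("barks", "VBZ")]])
print(emission[("NN", "dog")], transition[("DT", "NN")])   # 1.0 1.0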
Hidden Markov Models
Given the emission and transition matrices:
– Tag a new sequence
– Complexity: enumerating all tag sequences is exponential in sentence length; the Viterbi algorithm reduces decoding to O(n * |tagset|^2)
Viterbi algorithm - decoding
Example (on board)
Viterbi Algorithm (Decoding)
Most probable tag sequence given the text:
T* = arg max_T P_λ(T | W)
   = arg max_T P_λ(W | T) P_λ(T) / P_λ(W)      (Bayes' theorem)
   = arg max_T P_λ(W | T) P_λ(T)               (W is constant for all T)
   = arg max_T Π_i [ a(t_i-1, t_i) b(w_i | t_i) ]
   = arg max_T Σ_i log [ a(t_i-1, t_i) b(w_i | t_i) ]
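A compact decoder sketch working in negative-log space, as in the last line above (a sum of logs instead of a product). neg_log_a and neg_log_b are assumed to be dictionaries holding -log a(t_prev, t) and -log b(w | t); missing entries are treated as impossible. Each word costs O(|tagset|^2) work, which is the gain over brute-force enumeration:

import math

# Viterbi: for every tag, keep only the cheapest path (summed -log cost) ending in it.
def viterbi(words, tags, neg_log_a, neg_log_b, start="t0"):
    best = {start: (0.0, [])}               # tag -> (cost, tag sequence so far)
    for w in words:
        new_best = {}
        for t in tags:
            emit = neg_log_b.get((t, w), math.inf)
            new_best[t] = min(
                ((cost + neg_log_a.get((p, t), math.inf) + emit, path + [t])
                 for p, (cost, path) in best.items()),
                key=lambda item: item[0])
        best = new_best
    return min(best.values(), key=lambda item: item[0])   # (total cost, best tag sequence)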
Worked example: a trellis with states t1, t2, t3 at each of the words w1, w2, w3, expanded from the start state t0.

Transition matrix A(t_prev, t):
          t1       t2       t3
t0        0.005    0.02     0.1
t1        0.02     0.1      0.005
t2        0.5      0.0005   0.0005
t3        0.05     0.05     0.005

Emission matrix B(t, w):
          w1       w2       w3
t1        0.2      0.005    0.005
t2        0.02     0.2      0.0005
t3        0.02     0.02     0.05
The same example with negative base-10 logs of the probabilities:

-log A    t1       t2       t3
t0        2.3      1.7      1
t1        1.7      1        2.3
t2        0.3      3.3      3.3
t3        1.3      1.3      2.3

-log B    w1       w2       w3
t1        0.7      2.3      2.3
t2        1.7      0.7      3.3
t3        1.7      1.7      1.3

The trellis is annotated with the edge costs from -log A and the cumulative path scores at the nodes: -3, -3.4, -2.7 after w1; -6, -4.7, -6.7 after w2; -7.3, -9.3, -10.3 after w3.
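As a quick arithmetic check, the first trellis column follows directly from the t0 row of -log A and the w1 column of -log B, and reproduces the node scores -3, -3.4 and -2.7 shown above:

# cost(t at w1) = -log A(t0, t) + -log B(t, w1)
neg_log_a_t0 = {"t1": 2.3, "t2": 1.7, "t3": 1.0}
neg_log_b_w1 = {"t1": 0.7, "t2": 1.7, "t3": 1.7}
for t in ("t1", "t2", "t3"):
    print(t, round(neg_log_a_t0[t] + neg_log_b_w1[t], 1))   # 3.0, 3.4, 2.7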
Tools for POS Tagging
POS tagging is a sequence labeling task.
Tools:
– HMM based: TNT tagger (http://www.coli.uni-saarland.de/~thorsten/tnt/)
– CRF based: CRF++ (http://crfpp.googlecode.com/svn/trunk/doc/index.html)
CRF for Chunking (on board)
Tools:
– CRF++ (http://crfpp.googlecode.com/svn/trunk/doc/index.html)
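Chunking is cast as one more sequence-labeling problem: every token gets a B-X, I-X or O label marking where a chunk of type X begins and continues. CRF++ reads training data as one token per line with whitespace-separated feature columns, the label in the last column and sentences separated by blank lines. A small invented fragment in a word / POS tag / chunk label layout:

The      DT    B-NP
old      JJ    I-NP
dog      NN    I-NP
sleeps   VBZ   B-VP
.        .     O

Training and tagging then use the crf_learn and crf_test programs described in the CRF++ documentation linked above.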
Understanding the CRF Template

# Unigram
U00:%x[-2,0]
U01:%x[-1,0]
U02:%x[0,0]
U03:%x[1,0]
U04:%x[2,0]
U05:%x[-1,0]/%x[0,0]
U06:%x[0,0]/%x[1,0]

U10:%x[-2,1]
U11:%x[-1,1]
U12:%x[0,1]
U13:%x[1,1]
U14:%x[2,1]
U15:%x[-2,1]/%x[-1,1]
U16:%x[-1,1]/%x[0,1]
U17:%x[0,1]/%x[1,1]
U18:%x[1,1]/%x[2,1]

U20:%x[-2,1]/%x[-1,1]/%x[0,1]
U21:%x[-1,1]/%x[0,1]/%x[1,1]
U22:%x[0,1]/%x[1,1]/%x[2,1]

# Bigram
B
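In a CRF++ template, %x[row,col] picks up the value at a relative row offset and a fixed column of the input file (with the layout above, column 0 is the word and column 1 the POS tag). Each unigram (U) line is expanded into one feature per token by substituting the referenced values, while the single B line adds bigram features over adjacent output labels. For example, for the token dog in the invented fragment above:

U02:%x[0,0]            ->  U02:dog      (current word)
U12:%x[0,1]            ->  U12:NN       (current POS tag)
U16:%x[-1,1]/%x[0,1]   ->  U16:JJ/NN    (previous/current POS-tag bigram)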