PoS tagging and Chunking with HMM and CRF

1 PoS tagging and Chunking with HMM and CRF
Pranjal Awasthi, Delip Rao, Ravindran Balaraman
Dept. of CSE, IIT Madras

2 Outline
Overview of the system
PoS tagging with HMM
Chunking with CRF
Results
Summary

3 Overview of the system
Aim: To leverage existing tools and algorithms (for English) for the NLPAI task
Tools used: TnT tagger, TBL, MALLET

4 Overview of the system
PoS tagging: TnT + TBL
Chunking: CRF (MALLET)

5 The TnT tagger (Brants, 2000)
A second-order Hidden Markov Model based tagger
Used for English and other languages
On the NLPAI dataset, TnT alone gave F1 = 78.9
Why TnT?
PoS tagging is a sequence labeling task
HMMs and CRFs are good candidates
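To illustrate what "second order" means here: the tagger conditions each tag on the two preceding tags. Brants (2000) smooths the trigram estimate by linear interpolation with bigram and unigram estimates. The sketch below shows that interpolation only. The function names, the toy tags, and the fixed lambda weights are assumptions for illustration; TnT estimates its weights from data by deleted interpolation, and this is not TnT's code.

```python
from collections import Counter


def count_tag_ngrams(tag_sequences):
    """Count tag unigrams, bigrams, and trigrams over a list of tag sequences."""
    uni, bi, tri = Counter(), Counter(), Counter()
    for tags in tag_sequences:
        for i, t in enumerate(tags):
            uni[t] += 1
            if i >= 1:
                bi[(tags[i - 1], t)] += 1
            if i >= 2:
                tri[(tags[i - 2], tags[i - 1], t)] += 1
    return uni, bi, tri


def interpolated_trigram_prob(t1, t2, t3, uni, bi, tri, lambdas=(0.1, 0.3, 0.6)):
    """Estimate P(t3 | t1, t2) as a weighted mix of unigram, bigram, and
    trigram relative frequencies. The lambda values are placeholders."""
    l1, l2, l3 = lambdas
    total = sum(uni.values())
    p_uni = uni[t3] / total if total else 0.0
    p_bi = bi[(t2, t3)] / uni[t2] if uni[t2] else 0.0
    p_tri = tri[(t1, t2, t3)] / bi[(t1, t2)] if bi[(t1, t2)] else 0.0
    return l1 * p_uni + l2 * p_bi + l3 * p_tri


# Toy training data with made-up tags (not the NLPAI tagset).
uni, bi, tri = count_tag_ngrams([["DT", "NN", "VB"], ["DT", "NN", "NN"]])
p = interpolated_trigram_prob("DT", "NN", "VB", uni, bi, tri)
```

Because the unigram term is never zero for a tag seen in training, an unseen trigram still gets a small non-zero transition probability.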

6 Poor performance of CRFs in PoS tagging
For the NLPAI dataset, F1 = 69.4
Features used: w_{i-1}, w_{i-1}w_i, w_{i+1}, w_i w_{i+1}
Linear chain CRF was used (MALLET)
Reasons for poor performance:
Large number of PoS tags (26) compared to chunking
Selection of features
Type of CRF?
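As a rough sketch of the four context features listed on this slide, the function below builds them for position i of a sentence and writes one line per token as "features ... label", which is the whitespace-separated layout MALLET's SimpleTagger reads (blank line between sentences). The feature name strings, the helper names, and the toy English sentence are assumptions; the slides do not show the authors' actual feature files.

```python
def context_features(words, i):
    """Previous word, previous+current bigram, next word, current+next bigram."""
    feats = []
    if i > 0:
        feats.append("w-1=" + words[i - 1])
        feats.append("w-1w0=" + words[i - 1] + "_" + words[i])
    if i + 1 < len(words):
        feats.append("w+1=" + words[i + 1])
        feats.append("w0w+1=" + words[i] + "_" + words[i + 1])
    return feats


def to_tagger_lines(words, tags):
    """One line per token: space-separated features followed by the label."""
    return [
        " ".join(context_features(words, i) + [tags[i]])
        for i in range(len(words))
    ]


# Toy sentence with made-up tags, for illustration only.
words = ["the", "dog", "barks"]
tags = ["DT", "NN", "VB"]
lines = to_tagger_lines(words, tags)
```

Note that the list on the slide does not include the current word w_i on its own, so this sketch omits it as well.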

7 Transformation Based Learning (Brill, 1995)
Added as a post-processing step to “correct” TnT output
Idea: Derive correction rules during training based on observing what has gone wrong
Apply these rules for testing
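A minimal sketch of the idea, assuming a single rule template ("change tag A to B when the previous tag is C") and a single training sequence. At each round the learner picks the rule that removes the most errors from the current tagging, applies it, and repeats. The template, function names, and toy tags are illustrative assumptions, not the authors' TBL implementation.

```python
def apply_rule(tags, rule):
    """Change `frm` to `to` wherever the previous tag equals `prev`."""
    frm, to, prev = rule
    out = list(tags)
    for i in range(1, len(tags)):
        # Conditions are checked on the input tags, so changes do not cascade.
        if tags[i] == frm and tags[i - 1] == prev:
            out[i] = to
    return out


def count_errors(tags, gold):
    return sum(t != g for t, g in zip(tags, gold))


def best_rule(current, gold):
    """Return the candidate rule with the largest net error reduction."""
    candidates = {
        (current[i], gold[i], current[i - 1])
        for i in range(1, len(current))
        if current[i] != gold[i]
    }
    best, best_gain = None, 0
    for rule in candidates:
        gain = count_errors(current, gold) - count_errors(apply_rule(current, rule), gold)
        if gain > best_gain:
            best, best_gain = rule, gain
    return best


def learn_rules(predicted, gold, max_rules=10):
    """Greedily learn an ordered list of correction rules."""
    rules, current = [], list(predicted)
    for _ in range(max_rules):
        rule = best_rule(current, gold)
        if rule is None:  # no rule reduces the error count any further
            break
        rules.append(rule)
        current = apply_rule(current, rule)
    return rules


# Toy tagger output versus gold tags (made-up tags).
predicted = ["DT", "VB", "VB", "DT", "VB"]
gold = ["DT", "NN", "VB", "DT", "NN"]
rules = learn_rules(predicted, gold)
```

At test time the learned rules are applied in order to the base tagger's output with `apply_rule`.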

8 Transformation Based Learning (contd …)
Use of TBL improved F1 by about 1.8 points (78.94 to 80.74)
TBL is sensitive to the templates used
Possible improvements on template selection
Training time can be long unless indexing is used

9 Summary of PoS tagging Results
Model      Precision   Recall   F1
CRF                             69.40
TnT                             78.94
TnT+TBL                         80.74
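For reference, F1 is the harmonic mean of precision and recall. The helper below is a small sketch of that arithmetic. The function name and the counts in the usage line are hypothetical and are not taken from the NLPAI results.

```python
def precision_recall_f1(num_correct, num_predicted, num_gold):
    """Compute precision, recall, and F1 from raw counts."""
    precision = num_correct / num_predicted if num_predicted else 0.0
    recall = num_correct / num_gold if num_gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, for illustration only.
precision, recall, f1 = precision_recall_f1(num_correct=8, num_predicted=10, num_gold=16)
```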

10 Chunking with CRF
Based on (Sha & Pereira, 2003)
Using SimpleTagger provided with MALLET
Chunking accuracies:
Chunking with          F1
Reference PoS tags     89.69
Generated PoS tags     79.58

11 Summary
Demonstrated the use of off-the-shelf software for tagging and chunking
Only code written: TBL + glue scripts
Overall PoS F1 = 80.74 and Chunk F1 = 79.58
Have we “hit the wall” in pure ML based tools? Not sure yet!

12 Thanks!

