Part-Of-Speech Tagging using Neural Networks Ankur Parikh LTRC IIIT Hyderabad


1 Part-Of-Speech Tagging using Neural Networks Ankur Parikh LTRC IIIT Hyderabad ankur.parikh85@gmail.com

2 Outline
1. Introduction
2. Background and Motivation
3. Experimental Setup
4. Preprocessing
5. Representation
6. Single-neuro tagger
7. Experiments
8. Multi-neuro tagger
9. Results
10. Discussion
11. Future Work

3 Introduction
- POS tagging: the process of assigning a part-of-speech tag to each word of natural-language text, based on both its definition and its context.
- Uses: parsing of sentences, MT, IR, word sense disambiguation, speech synthesis, etc.
- Methods:
  1. Statistical approach
  2. Rule-based approach

4 Background: Previous Approaches
- A lot of work has been done on Hindi using machine learning taggers such as TNT and CRF.
- Trade-off: performance versus training time
  - Lower precision affects later processing stages.
  - For a new domain or a new corpus, parameter tuning is a non-trivial task.

5 Background: Previous Approaches & Motivation
- Empirically chosen context
- Effective handling of corpus-based features
- Need of the hour:
  - Good performance
  - Less training time
  - Multiple contexts
  - Effective use of corpus-based features
- Two approaches, and their comparison with TNT and CRF
- Word-level tagging

6 Experimental Setup: Corpus statistics
- Tag set of 25 tags
Corpus        Size (in words)   Unseen words (%)
Training      187,095           -
Development   23,565            5.33%
Testing       23,281            8.15%

7 Experimental Setup: Tools and Resources
- Tools
  - CRF++
  - TNT
  - Morfessor Categories – MAP
- Resources
  - Universal Word – Hindi Dictionary
  - Hindi WordNet
  - Morph Analyzer

8 Preprocessing
- The XC tag is removed (Gadde et al., 2008).
- Lexicon
  - For each unique word w of the training corpus => ENTRY(t1, …, t24)
  - where tj = c(posj, w) / c(w), i.e. the relative frequency of tag posj among the occurrences of w
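The lexicon entry above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function name `build_lexicon`, the three-tag toy tag set, and the toy corpus are all assumptions made for the example (the slides use 24 tags and a Hindi corpus).

```python
from collections import Counter, defaultdict


def build_lexicon(tagged_corpus, tagset):
    """Map each word w to ENTRY(t1..tn), where tj = c(pos_j, w) / c(w)."""
    word_counts = Counter()
    word_tag_counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        word_counts[word] += 1
        word_tag_counts[word][tag] += 1
    # A Counter returns 0 for tags never seen with the word.
    return {
        word: [word_tag_counts[word][tag] / total for tag in tagset]
        for word, total in word_counts.items()
    }


# Toy data for illustration only; not taken from the corpus in the slides.
TAGSET = ["NN", "VM", "JJ"]
CORPUS = [("khel", "NN"), ("khel", "VM"), ("khel", "NN"), ("accha", "JJ")]
LEXICON = build_lexicon(CORPUS, TAGSET)
```

Each entry sums to 1, so it can be read as the tag distribution of the word in the training corpus.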

9 Representation: Encoding & Decoding
- Each word w is encoded as an n-element vector INPUT(t1, t2, …, tn), where n = size of the tag set.
- INPUT(t1, t2, …, tn) comes from the lexicon if the training corpus contains w.
- If w is not in the training corpus:
  - N(w) = number of possible POS tags for w
  - tj = 1/N(w) if posj is a candidate
       = 0 otherwise
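A possible reading of this encoding rule, as a sketch. The names `encode_word` and `candidate_tags` are hypothetical, the toy data is invented, and the fallback for a word with no known candidates (treat every tag as a candidate) is an assumption that the slide does not state.

```python
def encode_word(word, lexicon, candidate_tags, tagset):
    """Return INPUT(t1..tn) for a word.

    Seen words use their lexicon entry. Unseen words get 1/N(w) on each
    candidate tag and 0 elsewhere.
    """
    if word in lexicon:
        return list(lexicon[word])
    candidates = candidate_tags.get(word, [])
    if not candidates:
        # Assumption (not in the slides): with no candidate information,
        # every tag is treated as a candidate.
        candidates = tagset
    n_candidates = len(candidates)
    return [1.0 / n_candidates if tag in candidates else 0.0 for tag in tagset]


# Toy data for illustration only.
TAGSET = ["NN", "VM", "JJ"]
LEXICON = {"accha": [0.0, 0.0, 1.0]}
CANDIDATES = {"naya": ["JJ", "NN"]}  # e.g. looked up in a dictionary

SEEN = encode_word("accha", LEXICON, CANDIDATES, TAGSET)
UNSEEN = encode_word("naya", LEXICON, CANDIDATES, TAGSET)
UNKNOWN = encode_word("xyz", LEXICON, CANDIDATES, TAGSET)
```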

10 Representation: Encoding & Decoding
- For each word w, the desired output is encoded as D = (d1, d2, …, dn).
  - dj = 1 if posj is the desired output
       = 0 otherwise
- In testing, for each word w, an n-element vector OUTPUT(o1, …, on) is returned.
  - Result = posj, if oj = max(OUTPUT)
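The desired-output encoding and the decoding rule can be illustrated as follows. The function names and the toy tag set are assumptions for the example. Ties are broken in favour of the first tag in tag-set order, which the slide does not specify.

```python
def encode_desired(tag, tagset):
    """D = (d1..dn) with dj = 1 for the desired tag and 0 otherwise."""
    return [1.0 if candidate == tag else 0.0 for candidate in tagset]


def decode_output(output, tagset):
    """Return pos_j such that o_j = max(OUTPUT); first index wins on ties."""
    best_index = max(range(len(output)), key=lambda j: output[j])
    return tagset[best_index]


# Toy tag set for illustration only.
TAGSET = ["NN", "VM", "JJ"]
DESIRED = encode_desired("VM", TAGSET)
RESULT = decode_output([0.1, 0.7, 0.2], TAGSET)
```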

11 Single – neuro tagger: Structure

12 Single – neuro tagger: Training & Tagging
- Error back-propagation learning algorithm
- Weights are initialized with random values
- Sequential mode
- Momentum term
- Eta = 0.4 and Alpha = 0.1
- In tagging, it can give multiple outputs or a sorted list of all tags.
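The training recipe above can be sketched as a small one-hidden-layer network updated after every pattern (sequential mode). This is an illustrative sketch under several assumptions: Eta is taken to be the learning rate and Alpha the momentum coefficient, the units are sigmoids trained on squared error, and the class name `TinyTagger`, the layer sizes, and the toy patterns are invented for the example. The slides do not show the actual network code.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyTagger:
    """One-hidden-layer network, online back-propagation with momentum."""

    def __init__(self, n_in, n_hidden, n_out, eta=0.4, alpha=0.1, seed=0):
        rng = random.Random(seed)
        self.eta, self.alpha = eta, alpha
        # One row per unit; the last entry of each row is the bias weight.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]
        # Previous weight changes, needed for the momentum term.
        self.dw1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        self.dw2 = [[0.0] * (n_hidden + 1) for _ in range(n_out)]

    def forward(self, x):
        xb = list(x) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = hidden + [1.0]
        out = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in self.w2]
        return xb, hb, out

    def train_pattern(self, x, desired):
        """Update the weights on one pattern; return its squared error."""
        xb, hb, out = self.forward(x)
        delta_out = [(d - o) * o * (1.0 - o) for d, o in zip(desired, out)]
        # Hidden deltas use the output weights from before this update.
        delta_hid = [
            hb[j] * (1.0 - hb[j])
            * sum(delta_out[k] * self.w2[k][j] for k in range(len(out)))
            for j in range(len(hb) - 1)
        ]
        self._update(self.w2, self.dw2, delta_out, hb)
        self._update(self.w1, self.dw1, delta_hid, xb)
        return sum((d - o) ** 2 for d, o in zip(desired, out))

    def _update(self, weights, previous, deltas, inputs):
        for k, delta in enumerate(deltas):
            for j, value in enumerate(inputs):
                step = self.eta * delta * value + self.alpha * previous[k][j]
                weights[k][j] += step
                previous[k][j] = step


# Toy patterns for illustration only: (input vector, desired output).
DATA = [
    ([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]),
    ([0.0, 1.0, 0.0], [0.0, 1.0, 0.0]),
    ([0.0, 0.0, 1.0], [0.0, 0.0, 1.0]),
]
net = TinyTagger(n_in=3, n_hidden=4, n_out=3)
errors = [sum(net.train_pattern(x, d) for x, d in DATA) for _ in range(1000)]
```

Sorting the output vector by value, rather than taking only its maximum, would give the sorted list of tags mentioned in the last bullet.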

13 Experiments: Development Data
Features                                          Precision
Corpus based and contextual                       93.19%
Root of the word                                  93.38%
Length of the word                                94.04%
Handling of unseen words                          95.62%
  Root -> Dictionary -> Word net -> Morfessor
  {tj = (c(posj,s) + c(posj,p)) / (c(s) + c(p))}
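The formula in the last row can be illustrated as below. The slide does not define s and p. This sketch assumes they are the suffix and prefix segments that Morfessor produces for an unseen word, and that c(s) and c(p) are the total training-corpus counts of those segments. The function name, the uniform fallback, and the toy counts are likewise assumptions.

```python
def affix_tag_vector(suffix, prefix, segment_tag_counts, tagset):
    """tj = (c(pos_j, s) + c(pos_j, p)) / (c(s) + c(p))."""
    s_counts = segment_tag_counts.get(suffix, {})
    p_counts = segment_tag_counts.get(prefix, {})
    total = sum(s_counts.values()) + sum(p_counts.values())
    if total == 0:
        # Assumption (not in the slides): fall back to a uniform vector
        # when neither segment was seen in training.
        return [1.0 / len(tagset)] * len(tagset)
    return [(s_counts.get(tag, 0) + p_counts.get(tag, 0)) / total
            for tag in tagset]


# Toy counts for illustration only: segment -> {tag: count}.
TAGSET = ["NN", "VM", "JJ"]
SEGMENT_TAG_COUNTS = {
    "ta": {"VM": 3, "NN": 1},  # hypothetical suffix segment
    "an": {"JJ": 2, "NN": 2},  # hypothetical prefix segment
}
VECTOR = affix_tag_vector("ta", "an", SEGMENT_TAG_COUNTS, TAGSET)
```

Reading the formula as a ratio of sums keeps the vector normalised, which matches the lexicon entries used for seen words.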

14 Development of the system

15 Multi – neuro tagger: Structure

16 Multi – neuro tagger: Training

17 Multi – neuro tagger: Learning curves

18 Multi – neuro tagger: Results
Structure    Context   Development   Test
97-48-24     3         95.44%        91.87%
121-48-24    4_prev    95.64%        92.05%
121-48-24    4_next    95.66%        91.95%
145-72-24    5         95.55%        92.15%
169-72-24    6_prev    95.56%        92.14%
169-72-24    6_next    95.54%        92.14%
193-96-24    7         95.46%        92.07%

19 Multi – neuro tagger: Comparison
- Precision after voting: 92.19%
Tagger                 Development   Test     Training time
TNT                    95.18%        91.58%   1-2 seconds
Multi – neuro tagger   95.78%        92.19%   13-14 minutes
CRF                    96.05%        92.92%   2-2.5 hours
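The slides do not say which voting scheme combines the networks trained on different contexts. As an illustration only, the sketch below assumes a plain majority vote over the tag chosen by each network, with ties going to the earliest network in the list. The function name `vote` and the sample predictions are invented.

```python
from collections import Counter


def vote(predictions):
    """Return the tag chosen by most networks.

    On a tie, the tag of the earliest network among the tied tags wins.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for tag in predictions:
        if counts[tag] == best:
            return tag


# Hypothetical tags proposed for one word by seven context-specific networks.
PREDICTIONS = ["NN", "VM", "NN", "JJ", "NN", "VM", "NN"]
WINNER = vote(PREDICTIONS)
```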

20 Conclusion
- Single versus multi-neuro tagger
- Multi-neuro tagger versus TNT and CRF
- Corpus and dictionary based features
- More parameters need to be tuned
- 24^5 = 7,962,624 n-grams, versus 250,560 weights
- Well suited for Indian languages

21 Future Work
- Better voting schemes (confidence point based)
- Finding the right context (probability based)
- Various structures and algorithms
  - Sequential neural network
  - Convolutional neural network
  - Combination with SVM

22 Thank You!! Queries???

