Classifying Parts of Speech Based on Sparse Data
Katherine Brainard


1 Classifying Parts of Speech Based on Sparse Data
Katherine Brainard

2 The Problem
- Sparse data has little contextual information
- Many words fall into this category
- Automatic PoS taggers and finders are useful

3 Approach
- Relatively easy to learn categories from frequent words
- Infrequent words are often more “regular” than their common counterparts
- Learn frequent words, then use these to classify infrequent words
- Uses clustering for the frequent words
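The two-stage idea on this slide can be sketched roughly as follows. The slides do not say which features or clustering algorithm were used, so everything here is an assumption for illustration: left/right neighbour counts as context features, a small spherical k-means with deterministic initialisation, and the hypothetical names `context_vectors`, `kmeans`, `classify_sparse`, `min_count`, and `k`.

```python
import math
from collections import Counter


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else v[:]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def context_vectors(sentences, words, features):
    """Count left and right neighbours of each word, restricted to `features`."""
    index = {f: i for i, f in enumerate(features)}
    vecs = {w: [0.0] * (2 * len(features)) for w in words}
    for sent in sentences:
        padded = ["<s>"] + list(sent) + ["</s>"]
        for i in range(1, len(padded) - 1):
            w = padded[i]
            if w not in vecs:
                continue
            left, right = padded[i - 1], padded[i + 1]
            if left in index:
                vecs[w][index[left]] += 1
            if right in index:
                vecs[w][len(features) + index[right]] += 1
    return {w: normalize(v) for w, v in vecs.items()}


def nearest(vec, centroids):
    """Index of the centroid with the highest dot-product similarity."""
    return max(range(len(centroids)), key=lambda c: dot(vec, centroids[c]))


def kmeans(vecs, k, iters=20):
    """Spherical k-means; the first k words (sorted) seed the centroids."""
    words = sorted(vecs)
    centroids = [vecs[w][:] for w in words[:k]]
    assign = {}
    for _ in range(iters):
        assign = {w: nearest(vecs[w], centroids) for w in words}
        for c in range(len(centroids)):
            members = [vecs[w] for w in words if assign[w] == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = normalize(mean)
    return assign, centroids


def classify_sparse(sentences, min_count=3, k=3):
    """Stage 1: cluster frequent words. Stage 2: place rare words by context."""
    counts = Counter(w for s in sentences for w in s)
    frequent = sorted(w for w, c in counts.items() if c >= min_count)
    rare = sorted(w for w, c in counts.items() if c < min_count)
    if not frequent:
        raise ValueError("no word reaches min_count")
    # Assumption: contexts are described only in terms of frequent words
    # and sentence boundaries, so rare words share the same feature space.
    features = frequent + ["<s>", "</s>"]
    freq_vecs = context_vectors(sentences, frequent, features)
    freq_assign, centroids = kmeans(freq_vecs, min(k, len(frequent)))
    rare_vecs = context_vectors(sentences, rare, features)
    rare_assign = {w: nearest(rare_vecs[w], centroids) for w in rare}
    return freq_assign, rare_assign
```

Calling `classify_sparse(sentences)` on a list of tokenised sentences returns two dictionaries mapping frequent and infrequent words to cluster ids. Processing the rare words only after the frequent clusters exist is one possible reading of "learn frequent words, then use these to classify infrequent"; the original system may have differed.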

4 Evaluating the Model
- Somewhat tricky: want an evaluation function that doesn't encourage degenerate behavior
- Evaluation separated from clustering
- Used both a bigram probability model and comparison with already-tagged data
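As a rough illustration of the two kinds of evaluation named on this slide, the sketch below scores a word-to-cluster assignment in two ways. Both scoring functions are assumptions, since the slides do not give formulas: a class-based bigram log-likelihood, with P(w_i | w_{i-1}) = P(c_i | c_{i-1}) · P(w_i | c_i) estimated on the same data, and a many-to-one accuracy against gold tags. The names `class_bigram_log_likelihood` and `many_to_one_accuracy` are hypothetical.

```python
import math
from collections import Counter


def class_bigram_log_likelihood(sentences, assign):
    """Corpus log-likelihood under a class bigram model (MLE, no smoothing)."""
    class_bigrams = Counter()   # (previous class, class) counts
    class_context = Counter()   # how often each class appears as "previous"
    class_counts = Counter()    # tokens per class
    word_counts = Counter()     # tokens per word
    for sent in sentences:
        prev = "<s>"
        for w in sent:
            c = assign[w]
            class_bigrams[(prev, c)] += 1
            class_context[prev] += 1
            class_counts[c] += 1
            word_counts[w] += 1
            prev = c
    ll = 0.0
    for sent in sentences:
        prev = "<s>"
        for w in sent:
            c = assign[w]
            ll += math.log(class_bigrams[(prev, c)] / class_context[prev])
            ll += math.log(word_counts[w] / class_counts[c])
            prev = c
    return ll


def many_to_one_accuracy(assign, gold):
    """Map each cluster to its most common gold tag; score word types."""
    by_cluster = {}
    for w, c in assign.items():
        if w in gold:
            by_cluster.setdefault(c, Counter())[gold[w]] += 1
    correct = sum(max(tags.values()) for tags in by_cluster.values())
    total = sum(sum(tags.values()) for tags in by_cluster.values())
    return correct / total if total else 0.0
```

Both measures are easy to game, which is one way to read the "degenerate behavior" concern: many-to-one accuracy reaches 1.0 if every word gets its own cluster, so the number of clusters has to be fixed or penalised when comparing assignments.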

5 Results
- Improvement of ~36% from delaying processing of data
- About 2.5 times better than classifying infrequent words into one lump
- Using just contextual data produced the best performance

