POS Tagging1 POS Tagging 1 POS Tagging Rule-based taggers Statistical taggers Hybrid approaches.

Slides:



Advertisements
Similar presentations
Three Basic Problems Compute the probability of a text: P m (W 1,N ) Compute maximum probability tag sequence: arg max T 1,N P m (T 1,N | W 1,N ) Compute.
Advertisements

School of something FACULTY OF OTHER School of Computing FACULTY OF ENGINEERING PoS-Tagging theory and terminology COMP3310 Natural Language Processing.
Machine Learning Approaches to the Analysis of Large Corpora : A Survey Xunlei Rose Hu and Eric Atwell University of Leeds.
Three Basic Problems 1.Compute the probability of a text (observation) language modeling – evaluate alternative texts and models P m (W 1,N ) 2.Compute.
Part of Speech Tagging Importance Resolving ambiguities by assigning lower probabilities to words that don’t fit Applying to language grammatical rules.
1 Statistical NLP: Lecture 12 Probabilistic Context Free Grammars.
Tagging with Hidden Markov Models. Viterbi Algorithm. Forward-backward algorithm Reading: Chap 6, Jurafsky & Martin Instructor: Paul Tarau, based on Rada.
Part II. Statistical NLP Advanced Artificial Intelligence Part of Speech Tagging Wolfram Burgard, Luc De Raedt, Bernhard Nebel, Lars Schmidt-Thieme Most.
CS4705 Natural Language Processing.  Regular Expressions  Finite State Automata ◦ Determinism v. non-determinism ◦ (Weighted) Finite State Transducers.
Midterm Review CS4705 Natural Language Processing.
Part-of-speech Tagging cs224n Final project Spring, 2008 Tim Lai.
Ch 10 Part-of-Speech Tagging Edited from: L. Venkata Subramaniam February 28, 2002.
1 Empirical Learning Methods in Natural Language Processing Ido Dagan Bar Ilan University, Israel.
More about tagging, assignment 2 DAC723 Language Technology Leif Grönqvist 4. March, 2003.
Part-of-Speech (POS) tagging See Eric Brill “Part-of-speech tagging”. Chapter 17 of R Dale, H Moisl & H Somers (eds) Handbook of Natural Language Processing,
POS based on Jurafsky and Martin Ch. 8 Miriam Butt October 2003.
Tagging – more details Reading: D Jurafsky & J H Martin (2000) Speech and Language Processing, Ch 8 R Dale et al (2000) Handbook of Natural Language Processing,
Introduction LING 572 Fei Xia Week 1: 1/3/06. Outline Course overview Problems and methods Mathematical foundation –Probability theory –Information theory.
Boosting Applied to Tagging and PP Attachment By Aviad Barzilai.
Statistical techniques in NLP Vasileios Hatzivassiloglou University of Texas at Dallas.
Syllabus Text Books Classes Reading Material Assignments Grades Links Forum Text Books עיבוד שפות טבעיות - שיעור חמישי POS Tagging Algorithms עידו.
Part of speech (POS) tagging
تمرين شماره 1 درس NLP سيلابس درس NLP در دانشگاه هاي ديگر ___________________________ راحله مکي استاد درس: دکتر عبدالله زاده پاييز 85.
Dept. of Computer Science & Engg. Indian Institute of Technology Kharagpur Part-of-Speech Tagging and Chunking with Maximum Entropy Model Sandipan Dandapat.
Machine Learning in Natural Language Processing Noriko Tomuro November 16, 2006.
Course Summary LING 572 Fei Xia 03/06/07. Outline Problem description General approach ML algorithms Important concepts Assignments What’s next?
1 Introduction LING 575 Week 1: 1/08/08. Plan for today General information Course plan HMM and n-gram tagger (recap) EM and forward-backward algorithm.
Natural Language Understanding
Albert Gatt Corpora and Statistical Methods Lecture 9.
Part-of-Speech Tagging
Lemmatization Tagging LELA /20 Lemmatization Basic form of annotation involving identification of underlying lemmas (lexemes) of the words in.
1 Persian Part Of Speech Tagging Mostafa Keikha Database Research Group (DBRG) ECE Department, University of Tehran.
Part II. Statistical NLP Advanced Artificial Intelligence Applications of HMMs and PCFGs in NLP Wolfram Burgard, Luc De Raedt, Bernhard Nebel, Lars Schmidt-Thieme.
EMNLP’01 19/11/2001 ML: Classical methods from AI –Decision-Tree induction –Exemplar-based Learning –Rule Induction –T ransformation B ased E rror D riven.
Parts of Speech Sudeshna Sarkar 7 Aug 2008.
Some Advances in Transformation-Based Part of Speech Tagging
Graphical models for part of speech tagging
Distributional Part-of-Speech Tagging Hinrich Schütze CSLI, Ventura Hall Stanford, CA , USA NLP Applications.
Comparative study of various Machine Learning methods For Telugu Part of Speech tagging -By Avinesh.PVS, Sudheer, Karthik IIIT - Hyderabad.
Dept. of Computer Science & Engg. Indian Institute of Technology Kharagpur Part-of-Speech Tagging for Bengali with Hidden Markov Model Sandipan Dandapat,
Albert Gatt Corpora and Statistical Methods Lecture 10.
인공지능 연구실 정 성 원 Part-of-Speech Tagging. 2 The beginning The task of labeling (or tagging) each word in a sentence with its appropriate part of speech.
CSA2050: Introduction to Computational Linguistics Part of Speech (POS) Tagging II Transformation Based Tagging Brill (1995)
Part-of-Speech Tagging Foundation of Statistical NLP CHAPTER 10.
Real-World Semi-Supervised Learning of POS-Taggers for Low-Resource Languages Dan Garrette, Jason Mielens, and Jason Baldridge Proceedings of ACL 2013.
11 Chapter 14 Part 1 Statistical Parsing Based on slides by Ray Mooney.
10/30/2015CPSC503 Winter CPSC 503 Computational Linguistics Lecture 7 Giuseppe Carenini.
Transformation-Based Learning Advanced Statistical Methods in NLP Ling 572 March 1, 2012.
Word classes and part of speech tagging Chapter 5.
Optimizing Local Probability Models for Statistical Parsing Kristina Toutanova, Mark Mitchell, Christopher Manning Computer Science Department Stanford.
Tokenization & POS-Tagging
CSA2050: Introduction to Computational Linguistics Part of Speech (POS) Tagging I Introduction Tagsets Approaches.
Supertagging CMSC Natural Language Processing January 31, 2006.
Shallow Parsing for South Asian Languages -Himanshu Agrawal.
Human Language Technology Part of Speech (POS) Tagging II Rule-based Tagging.
HMM vs. Maximum Entropy for SU Detection Yang Liu 04/27/2004.
Stochastic and Rule Based Tagger for Nepali Language Krishna Sapkota Shailesh Pandey Prajol Shrestha nec & MPP.
Part of Speech Tagging in Context month day, year Alex Cheng Ling 575 Winter 08 Michele Banko, Robert Moore.
Word classes and part of speech tagging. Slide 1 Outline Why part of speech tagging? Word classes Tag sets and problem definition Automatic approaches.
CSA2050: Introduction to Computational Linguistics Part of Speech (POS) Tagging II Transformation Based Tagging Brill (1995)
Overview of Statistical NLP IR Group Meeting March 7, 2006.
Part-of-Speech Tagging CSCI-GA.2590 – Lecture 4 Ralph Grishman NYU.
N-Gram Model Formulas Word sequences Chain rule of probability Bigram approximation N-gram approximation.
A CRF-BASED NAMED ENTITY RECOGNITION SYSTEM FOR TURKISH Information Extraction Project Reyyan Yeniterzi.
Part-Of-Speech Tagging Radhika Mamidi. POS tagging Tagging means automatic assignment of descriptors, or tags, to input tokens. Example: “Computational.
CSC 594 Topics in AI – Natural Language Processing
CSCI 5832 Natural Language Processing
CS4705 Natural Language Processing
Hindi POS Tagger By Naveen Sharma ( )
Meni Adler and Michael Elhadad Ben Gurion University COLING-ACL 2006
Presentation transcript:

POS Tagging1 POS Tagging 1 POS Tagging Rule-based taggers Statistical taggers Hybrid approaches

POS Tagging2 POS Tagging 2 Yo bajo con el hombre bajo a tocarel bajo bajo la escalera. PP VM AQ NC SP TDNC VM SP VM AQ NC SP NC SP TD VM AQ NC SP VM AQ NC SP TD NC PP NC FP Words taken isolatedly are ambiguous regarding its POS

POS Tagging3 POS Tagging 3 Yo bajo con el hombre bajo a tocarel bajo bajo la escalera. PP VM AQ NC SP TDNC VM SP VM AQ NC SP NC SP TD VM AQ NC SP VM AQ NC SP TD NC PP NC FP Most of words have a unique POS within a context

POS Tagging4 POS Tagging 4 The goal of a POS tagger is to assign each word the most likely within a context Pos taggers Rule-based Statistical Hybrid

POS Tagging5 POS Tagging 5 W = w 1 w 2 … w n sequence of words T = t 1 t 2 …t n sequence of POS tags For each word w i only some of the tags can be assigned (except the unknown words). We can get them from a lexicon or a morphological analyzer. Tagset. Open vs closed categories f : W  T = f(W)

POS Tagging6 POS Tagging 6 Knowledge-driven taggers Usually rules built manually Limited amount of rules (  1000) LM and smoothing explicitely defined. Rule-based taggers

POS Tagging7 POS Tagging 7 TAGGIT, Green,Rubin,1971 TOSCA, Oosdijk,1991 Constraint Grammars, EngCG, Voutilainen,1994, Karlsson et al, 1995 AMBILIC, de Yzaguirre et al, 2000  Linguistically motivated rules  High precision  ej. EngCG 99.5% –High development cost –Not transportable –Time cost of tagging Rule-based taggers

POS Tagging8 POS Tagging 8 A CG consists of a sequence of subgrammars each one consisting of a set of restrictions (constraints) which set context conditions ej. =0 VFIN (-1 TO)) Discards POS VFIN when the previous word is “to” ENGCG ENGTWOL Reductionist POS tagging 1,100 constraints 93-97% of the words are correctily disambiguated 99.7% accuracy Heuristic rules can be applied over the rest 2-3% residual ambiguity with 99.6% precision CG syntactic Constraint Grammars CG

POS Tagging9 POS Tagging 8 LM and smoothing automatically learned from tagged corpora (supervised learning). Data-driven taggers Statistical inference Techniques coming from speech processing Statistical POS taggers

POS Tagging10 POS Tagging 9 CLAWS, Garside et al, 1987 De Rose, 1988 Church, 1988 Cutting et al,1992 Merialdo, 1994  Well founded theoretical framework  Simple models.  Acceptable precision  > 97%  Language independent –Learning the model –Sparseness –less precision Statistical POS taggers

POS Tagging11 POS Tagging 10 N-gram smoothing interpolation HMM ML Supervised learning MLE Semi-supervised Forward-Backward, Baum-Welch (EM Expectation Maximization) Charniak, 1993 Jelinek, 1998 Manning, Schütze, 1999 Statistical POS taggers

POS Tagging12 POS Tagging 11 Contextual probability (trigrams) Lexical Probability 3-gram tagger

POS Tagging13 POS Tagging 12 Hidden States associated to n-grams Transition probabilities restricted to valid transitions: *BC -> BC* Emision probabilities restricted by lexicons HMM tagger

POS Tagging14 POS Tagging 13 Transformation-based, error-driven Based on rules automatically acquired Maximum Entropy Combination of several knowledge sources No independence is assumed A high number of parameters is allowed (e.g. lexical features) Brill, 1995 Roche,Schabes, 1995 Ratnaparkhi, 1998, Rosenfeld, 1994 Ristad, 1997 Hybrid systems

POS Tagging15 POS Tagging 14 Based on transformation rules that corerect errors produced by an initial HMM tagger rule change label A into label B when... Each rule corresponds to the instantiation of a templete templetes The previous (following) word is tagged with Z One of the two previous (following) words is tagged with Z The previous word is tagged with Z and the following with W... Learning of the variables A,B,Z,W through an iterative process That choose at each iteration the rule (the instanciation) correcting more errors. Brill’s system

POS Tagging16 POS Tagging 15 Decision trees Supervised learning ej. TreeTagger Case-based, Memory-based Learning IGTree Relaxation labelling Statistical and linguistic constraints ej. RELAX Black,Magerman, 1992 Magerman 1996 Màrquez, 1999 Màrquez, Rodríguez, 1997 TiMBL Daelemans et al, 1996 Padrò, 1997 Other complex systems

POS Tagging17 POS Tagging 16 root P(IN)=0.81 P(RB)=0.19 Word Form leaf P(IN)=0.83 P(RB)=0.17 tag(+1) P(IN)=0.13 P(RB)=0.87 tag(+2) P(IN)=0.013 P(RB)=0.987 “As”,“as” RB IN other... IN/RB ambiguity IN (preposition) RB (adverb) stat í stical interpretation : P( RB | word=“A/as” & tag(+1)=RB & tag(+2)=IN) = P( IN | word=“A/as” & tag(+1)=RB & tag(+2)=IN) = ^ ^ Màrquez’s Tree-tagger

POS Tagging18 POS Tagging 17 Combination of LM in a tagger STT+ RELAX Combination of taggers through votation bootstrapping Combinación of classifiers bagging (Breiman, 1996) boosting (Freund, Schapire, 1996) Màrquez, Rodríguez, 1998 Màrquez, 1999 Padrò, 1997 Màrquez et al, 1998 Brill, Wu, 1998 Màrquez et al, 1999 Abney et al, 1999 Combining taggers

POS Tagging19 Viterbi algorithm Tagged text Raw text Morphological analysis Language Model Disambiguation N-grams Lexical probs. + + Contextual probs. POS Tagging 18 Màrquez’s STT+

POS Tagging20 Relaxation Labelling (Padró, 1996) Tagged text Raw text Morphological analysis Language Model Disambiguation Linguistic rules N-grams + + Set of constraints POS Tagging 19 Padró’s Relax