
1 Part of Speech Tagging
Importance:
Resolving ambiguities by assigning lower probabilities to words that do not fit their context
Applying grammatical rules to language to parse the meanings of sentences and phrases

2 Part of Speech Tagging
Determine a word's lexical class based on its context.

3 Approaches to POS Tagging
Initialize and maintain tagging criteria:
Supervised: uses pre-tagged corpora
Unsupervised: automatically induces classes using probabilities and learning algorithms
Partially supervised: combines the above approaches
Algorithms:
Rule-based: use pre-defined grammatical rules
Stochastic: use HMMs and other probabilistic algorithms
Neural: use neural networks to learn the probabilities

4 The man ate the fish on the boat in the morning
Example:
Word      Tag
The       Determiner
man       Noun
ate       Verb
the       Determiner
fish      Noun
on        Preposition
the       Determiner
boat      Noun
in        Preposition
the       Determiner
morning   Noun
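The tagging in the example above can be sketched as a minimal dictionary-lookup tagger. The tag dictionary here is hand-built for this one sentence; a real tagger derives it from a corpus and must handle words with more than one possible tag.

```python
# Minimal dictionary-lookup tagger for the example sentence.
# TAGS is a hypothetical, hand-built lexicon for illustration only.
TAGS = {
    "the": "Determiner", "man": "Noun", "ate": "Verb", "fish": "Noun",
    "on": "Preposition", "boat": "Noun", "in": "Preposition",
    "morning": "Noun",
}

def tag_sentence(sentence):
    # Look each word up in the lexicon; fall back to "Unknown".
    return [(w, TAGS.get(w.lower(), "Unknown")) for w in sentence.split()]

print(tag_sentence("The man ate the fish on the boat in the morning"))
```

This only works because every word in the sentence is unambiguous in the toy lexicon; the rest of the slides deal with what happens when it is not.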

5 Word Class Categories
Note: personal pronouns are often tagged PRP, possessive pronouns PRP$.

6 Word Classes
Open classes (classes that frequently spawn new words): common nouns, verbs, adjectives, adverbs.
Closed classes (classes that don't often spawn new words):
prepositions: on, under, over, …
particles: up, down, on, off, …
determiners: a, an, the, …
pronouns: she, he, I, who, …
conjunctions: and, but, or, …
auxiliary verbs: can, may, should, …
numerals: one, two, three, third, …
Particle: an uninflected item with a grammatical function that does not clearly belong to a major part of speech. Example: He looked up the word.

7 The Linguistics Problem
Words are often in multiple classes. Example: this
This is a nice day = pronoun
This day is nice = determiner
You can go this far = adverb
Accuracy of 96–97% is the baseline for new algorithms; 100% is impossible even for human annotators.
Tag ambiguity (DeRose, 1988):
1 tag (unambiguous): 35,340 words
2 tags: 3,760
3 tags: 264
4 tags: 61
5 tags: 12
6 tags: 2
7 tags: 1

8 Rule-Based Tagging
Basic idea: assign all possible tags to each word, then remove tags according to a set of rules.
Example rule:
IF word+1 is an adjective, adverb, or quantifier ending a sentence
AND word-1 is not a verb like "consider"
THEN eliminate non-adverb tags
ELSE eliminate adverb tags
Such systems use more than 1,000 hand-written rules.

9 Stage 1: Rule-based tagging
First stage: FOR each word, get all possible parts of speech using a morphological analysis algorithm.
Example — possible tags for "She promised to back the bill":
She: PRP
promised: VBN, VBD
to: TO
back: NN, RB, JJ, VB
the: DT
bill: VB, NN

10 Stage 2: Rule-based Tagging
Apply rules to remove possibilities.
Example rule: IF VBD is an option and VBN|VBD follows "<start> PRP" THEN eliminate VBN.
After applying the rule to "She promised to back the bill":
She: PRP
promised: VBD (VBN eliminated)
to: TO
back: NN, RB, JJ, VB
the: DT
bill: VB, NN
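The two stages above can be sketched in a few lines. The lexicon and the single elimination rule are taken from the slides' example; a real rule-based tagger (e.g. a constraint-grammar system) has a full morphological analyzer and over a thousand such rules.

```python
# Sketch of two-stage rule-based tagging for the slides' example.
# LEXICON is a toy stand-in for a morphological analysis algorithm.
LEXICON = {
    "she": ["PRP"], "promised": ["VBN", "VBD"], "to": ["TO"],
    "back": ["NN", "RB", "JJ", "VB"], "the": ["DT"], "bill": ["VB", "NN"],
}

def stage1(words):
    # Stage 1: assign every possible tag to each word.
    return [list(LEXICON[w.lower()]) for w in words]

def stage2(candidates):
    # Stage 2 rule: IF VBD is an option and VBN|VBD follows
    # "<start> PRP" THEN eliminate VBN.
    if len(candidates) > 1 and candidates[0] == ["PRP"] \
            and "VBD" in candidates[1]:
        candidates[1] = [t for t in candidates[1] if t != "VBN"]
    return candidates

words = "She promised to back the bill".split()
cands = stage2(stage1(words))
print(cands[1])  # tags remaining for "promised"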

11 Stochastic Tagging
Use the probability of a certain tag occurring given the various possibilities. Requires a training corpus.
Problem to overcome: an algorithm to assign tags to words that are not in the corpus.
Naive method: choose the most frequent tag in the training text for each word. Result: about 90% accuracy.
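The naive method is simple enough to show in full. The training data below is an invented toy corpus of (word, tag) pairs; real systems count tags over a large pre-tagged corpus, and the fallback tag for unseen words is one possible heuristic, not a fixed rule.

```python
from collections import Counter, defaultdict

# Naive stochastic tagging: pick each word's most frequent tag in a
# toy, hand-made training corpus of (word, tag) pairs.
training = [
    ("the", "DT"), ("can", "NN"), ("is", "VBZ"), ("full", "JJ"),
    ("she", "PRP"), ("can", "MD"), ("can", "MD"), ("swim", "VB"),
]

counts = defaultdict(Counter)
for word, tag in training:
    counts[word][tag] += 1

def most_frequent_tag(word):
    # Fall back to NN for words not in the corpus (a common heuristic).
    if word in counts:
        return counts[word].most_common(1)[0][0]
    return "NN"

print(most_frequent_tag("can"))  # MD occurs twice, NN once
```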

12 HMM Stochastic Tagging
Intuition: pick the most likely tag based on context.
Maximize P(word | tag) × P(tag | previous n tags) using an HMM.
Observed: W = w1, w2, …, wn
Hidden: T = t1, t2, …, tn
Goal: find the tag sequence T that most likely generated the word sequence W.
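The standard way to maximize this product over all tag sequences is the Viterbi algorithm, sketched below for a bigram HMM with just two states. All probabilities are invented for illustration; a real tagger estimates them from a training corpus.

```python
# Viterbi decoding for a tiny two-state bigram HMM.
# start_p, trans_p, emit_p are hypothetical probabilities.
states = ["NN", "VB"]
start_p = {"NN": 0.6, "VB": 0.4}
trans_p = {"NN": {"NN": 0.3, "VB": 0.7}, "VB": {"NN": 0.8, "VB": 0.2}}
emit_p = {
    "NN": {"fish": 0.7, "swim": 0.3},
    "VB": {"fish": 0.2, "swim": 0.8},
}

def viterbi(words):
    # V[i][s] = (best probability of state s at word i, backpointer)
    V = [{s: (start_p[s] * emit_p[s].get(words[0], 0.0), None)
          for s in states}]
    for i in range(1, len(words)):
        row = {}
        for s in states:
            prev = max(states, key=lambda p: V[i - 1][p][0] * trans_p[p][s])
            prob = (V[i - 1][prev][0] * trans_p[prev][s]
                    * emit_p[s].get(words[i], 0.0))
            row[s] = (prob, prev)
        V.append(row)
    # Trace back the most likely tag sequence.
    last = max(states, key=lambda s: V[-1][s][0])
    tags = [last]
    for i in range(len(words) - 1, 0, -1):
        last = V[i][last][1]
        tags.append(last)
    return list(reversed(tags))

print(viterbi(["fish", "swim"]))
```

Viterbi keeps only the best path into each state at each position, so decoding is linear in sentence length rather than exponential in the number of possible tag sequences.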

13 Transformation-Based Tagging (TBL)
(Brill tagging) combines the rule-based and stochastic approaches: it uses rules to guess at tags, with machine learning over a tagged corpus as input.
Basic idea: later rules correct errors made by earlier rules.
Set the most probable tag for each word as a start value.
Change tags according to rules of the type: IF word-1 is a determiner and word is a verb THEN change the tag to noun.
Training uses a tagged corpus:
Step 1: write a set of rule templates.
Step 2: order the rules based on corpus accuracy.

14 TBL: The Algorithm
Step 1: use a dictionary to label every word with its most likely tag.
Step 2: select the transformation rule that most improves the tagging.
Step 3: re-tag the corpus by applying the rule.
Repeat steps 2–3 until accuracy reaches a threshold.
RESULT: an ordered sequence of transformation rules.
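One training iteration of this algorithm can be sketched as follows. The corpus, the initial unigram tags, and the candidate rules are all toy inventions; a real Brill tagger generates candidates from rule templates and iterates until improvement falls below a threshold.

```python
# One TBL training iteration on a toy corpus: start every word at its
# most likely tag, then pick the candidate transformation that repairs
# the most errors against the gold tags.
words = ["the", "run", "can", "run"]
gold  = ["DT", "NN", "MD", "VB"]
# Hypothetical most-likely initial tags (Step 1):
initial = {"the": "DT", "run": "VB", "can": "MD"}

def apply_rule(tags, rule):
    # rule = (previous tag, tag to change, new tag)
    prev_tag, from_tag, to_tag = rule
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i - 1] == prev_tag and tags[i] == from_tag:
            out[i] = to_tag
    return out

def errors(tags):
    return sum(t != g for t, g in zip(tags, gold))

# Candidate rules instantiated from a "previous tag" template (Step 2):
candidates = [("DT", "VB", "NN"), ("MD", "NN", "VB")]
tags = [initial[w] for w in words]  # ['DT', 'VB', 'MD', 'VB']
best = min(candidates, key=lambda r: errors(apply_rule(tags, r)))
print(best, errors(apply_rule(tags, best)))
```

The selected rule ("after a determiner, change VB to NN") is exactly the shape of rule the previous slide describes; Step 3 would re-tag the corpus with it and repeat.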

15 TBL: Problems, Advantages, Accuracy
Problems:
Infinite loops are possible, and rules may interact.
Training and execution are slower than for an HMM.
Advantages:
The set of transformations can be constrained with templates: IF tag Z or word W is in position *-k THEN replace tag X with tag Y.
Learns a small number of simple, non-stochastic rules.
Speed optimizations are possible using finite-state transducers.
TBL is the best-performing algorithm on unknown words.
The rules are compact and can be inspected by humans.
Accuracy:
The first 100 rules achieve 96.8% accuracy; the first 200 rules achieve 97.0%.

16 Neural Networks
A digital approximation of biological neurons.

17 Digital Neuron
[Diagram: inputs, each multiplied by a weight W, are summed (Σ) and passed through an activation function f(n) to produce the output.]
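The diagram above reduces to a few lines of code: a weighted sum followed by an activation function. The sigmoid used here and the weights are illustrative choices, not part of the slide.

```python
import math

# A single digital neuron: weighted sum of inputs (the Σ block) passed
# through an activation function f(n), here a sigmoid.
def neuron(inputs, weights, bias=0.0):
    n = sum(x * w for x, w in zip(inputs, weights)) + bias  # Σ
    return 1.0 / (1.0 + math.exp(-n))                       # f(n)

print(neuron([1.0, 0.5], [0.4, -0.2]))
```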

18 Transfer Functions
[Plot: transfer (activation) functions mapping a neuron's input to its output.]

19 Networks Without Feedback
Multiple inputs and a single layer; multiple inputs and multiple layers.

20 Feedback (Recurrent Networks)

21 Supervised Learning
[Diagram: inputs from the environment feed both the neural network and the actual system; the difference (Σ, + / −) between the expected output and the actual output is the error.]
Training: run a set of training data through the network and compare the outputs to the expected results. Backpropagate the errors to update the neural weights until the outputs match what is expected.

22 Multilayer Perceptron
Definition: a network of neurons in which the outputs of some neurons are connected through weighted connections to the inputs of other neurons.
[Diagram: inputs → first hidden layer → second hidden layer → output layer.]
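A forward pass through such a network is just the single-neuron computation repeated layer by layer. The network shape and weights below are arbitrary placeholders; a trained network would have learned them.

```python
import math

# Forward pass through a small multilayer perceptron: each layer is a
# weight matrix plus a bias vector; outputs of one layer feed the next.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    # One neuron per row of the weight matrix.
    return [sigmoid(sum(x * w for x, w in zip(inputs, row)) + b)
            for row, b in zip(weights, biases)]

def mlp(inputs, layers):
    for weights, biases in layers:
        inputs = layer(inputs, weights, biases)
    return inputs

net = [
    ([[0.5, -0.3], [0.8, 0.2]], [0.1, -0.1]),  # hidden layer: 2 neurons
    ([[1.0, -1.0]], [0.0]),                    # output layer: 1 neuron
]
print(mlp([1.0, 0.0], net))
```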

23 Backpropagation of Errors
[Diagram: function signals flow forward through the network; error signals propagate backward.]
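For a single sigmoid neuron the two signal flows are easy to see in code: the forward pass computes the function signal, the backward pass turns the output error into a weight update. The squared-error loss, learning rate, and training data are illustrative assumptions.

```python
import math

# One gradient-descent step for a single sigmoid neuron with squared
# error, showing the forward (function) and backward (error) signals.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_step(w, b, x, target, lr=0.5):
    y = sigmoid(w * x + b)                 # forward: function signal
    delta = (y - target) * y * (1.0 - y)   # backward: error signal
    return w - lr * delta * x, b - lr * delta

w, b = 0.0, 0.0
for _ in range(1000):
    w, b = train_step(w, b, x=1.0, target=1.0)
print(sigmoid(w * 1.0 + b))  # output moves toward the target
```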

