
1 Semi-Supervised Natural Language Learning Reading Group I set up a site at: http://www.cs.cmu.edu/~acarlson/semisupervised/ Cover other applications of semi-supervised learning? Volunteers? Every week or bi-weekly? Time change? 1pm? Noon?

2 Unsupervised Word Sense Disambiguation Rivaling Supervised Methods Author: David Yarowsky (1995) Presented by: Andy Carlson

3 Word Sense Disambiguation Determining which sense of a word is meant in a given sentence. “Toyota is considering opening a plant in Detroit.” “The banana plant is grown all over the tropics for its fruit.” Different from sense induction: we assume we already know the distinct senses.

4 Using unlabeled data Two properties of language let us use unlabeled data: One sense per collocation – nearby words provide strong and consistent clues. One sense per discourse – within a document, the sense of a word is highly consistent. We can base an iterative bootstrapping algorithm on these two properties.

5 One sense per discourse How accurate? How frequently does it apply?


7 Decision Lists List of rules of the form “collocation => sense” Example: life (within 2-10 words) => biological sense of plant Rules are ordered by log-likelihood ratio
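
The ordering by log-likelihood ratio can be sketched in a few lines. This is a minimal illustration, not the paper's exact formulation: the example assumes two senses of plant ('living' and 'factory'), uses simple word collocations, and adds a small smoothing constant so the ratio is defined when one count is zero; the paper's collocations also encode position (adjacent word, within ±k words, etc.).

```python
import math
from collections import defaultdict

def build_decision_list(labeled, smoothing=0.1):
    """Build a decision list from (collocation, sense) pairs.

    Each candidate rule "collocation => sense" is scored by a
    smoothed log-likelihood ratio
    log(P('living' | collocation) / P('factory' | collocation));
    the list is sorted so the most reliable rules come first.
    """
    counts = defaultdict(lambda: {'living': 0, 'factory': 0})
    for colloc, sense in labeled:
        counts[colloc][sense] += 1
    rules = []
    for colloc, c in counts.items():
        llr = math.log((c['living'] + smoothing) /
                       (c['factory'] + smoothing))
        sense = 'living' if llr > 0 else 'factory'
        rules.append((abs(llr), colloc, sense))
    rules.sort(reverse=True)  # strongest evidence first
    return rules

# Toy labeled data: 'life' and 'manufacturing' are unambiguous
# cues, 'growth' is an even split and scores 0.
rules = build_decision_list([
    ('life', 'living'), ('life', 'living'), ('life', 'living'),
    ('manufacturing', 'factory'), ('manufacturing', 'factory'),
    ('growth', 'living'), ('growth', 'factory'),
])
```

Note that an ambiguous collocate like growth ends up at the bottom of the list, so it only fires when no stronger rule matches.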

8 The algorithm – step 1 Find all occurrences of the given polysemous word We follow examples for the word plant


10 Step 2 – Initial Labeling For each sense of the word, identify a small number of training examples. Strategies: dictionary words, human labeling of the most frequent collocates, or human-chosen collocates. Example: the words life and manufacturing are used as seed collocations.
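
The seed-collocation strategy amounts to a one-pass scan: any context containing a seed word gets that seed's sense, and everything else stays unlabeled. A sketch under the slide's assumptions (contexts as token lists, the life/manufacturing seeds for plant):

```python
def seed_label(contexts, seeds):
    """Assign an initial sense to each context that contains a seed
    collocation; all other contexts stay unlabeled (None).

    `contexts` is a list of token lists around the target word;
    `seeds` maps a seed collocate to its sense.
    """
    labels = []
    for tokens in contexts:
        label = None
        for word, sense in seeds.items():
            if word in tokens:
                label = sense
                break
        labels.append(label)
    return labels

labels = seed_label(
    [['the', 'plant', 'life', 'nearby'],
     ['opened', 'a', 'manufacturing', 'plant'],
     ['the', 'plant', 'closed']],
    {'life': 'living', 'manufacturing': 'factory'},
)
# labels -> ['living', 'factory', None]
```

The large unlabeled residual (the None entries) is exactly what the bootstrapping iterations in step 3 gradually absorb.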

11 Labeled as ‘living’ plant

12 Unlabeled examples

13 Labeled as ‘factory’ plant

14 Sample initial state

15 Step 3a Train the decision list based on the current labeling of the state space

16 Step 3b Apply learned classifier to all examples
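
Applying the learned classifier in step 3b means walking the decision list and letting the first sufficiently confident rule fire. A minimal sketch, assuming `rules` is a (score, collocation, sense) list sorted by descending score, as a decision-list learner would produce; the hypothetical threshold keeps low-confidence rules from labeling examples prematurely:

```python
def classify(tokens, rules, threshold=0.0):
    """Return the sense of the first (highest-scoring) rule whose
    collocation appears in the context, or None if no rule above
    `threshold` matches."""
    for score, colloc, sense in rules:
        if score >= threshold and colloc in tokens:
            return sense
    return None

# Hypothetical learned rules, strongest first.
rules = [(3.4, 'life', 'living'),
         (3.0, 'manufacturing', 'factory'),
         (1.2, 'equipment', 'factory')]

print(classify(['plant', 'and', 'animal', 'life'], rules))  # living
print(classify(['plant', 'equipment', 'order'], rules))     # factory
print(classify(['the', 'plant', 'closed'], rules))          # None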

17 Step 3c Optionally, apply the one-sense-per-discourse constraint
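
The discourse constraint can be sketched as a majority vote within each document: every occurrence, including ones the classifier left unlabeled, inherits the document's dominant sense. This is an illustrative simplification; (doc_id, label) pairs are an assumed representation, not the paper's data structures.

```python
from collections import Counter, defaultdict

def apply_one_sense_per_discourse(examples):
    """Relabel every example with its document's majority sense.

    `examples` is a list of (doc_id, label) pairs, where label is a
    sense string or None for unlabeled occurrences. Documents with
    no labeled occurrence are left untouched.
    """
    by_doc = defaultdict(list)
    for doc_id, label in examples:
        if label is not None:
            by_doc[doc_id].append(label)
    majority = {d: Counter(ls).most_common(1)[0][0]
                for d, ls in by_doc.items()}
    return [(d, majority.get(d, label)) for d, label in examples]

fixed = apply_one_sense_per_discourse(
    [('d1', 'living'), ('d1', None), ('d1', 'living'), ('d2', None)]
)
# fixed -> [('d1', 'living'), ('d1', 'living'),
#           ('d1', 'living'), ('d2', None)]
```

Besides correcting stray misclassifications, this is how the constraint extends labels to examples the decision list could not reach, which accelerates the bootstrapping.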


20 After steps 3b and 3c

21 Step 3d Repeat step 3 iteratively. Details: grow the window size for collocations, and randomly perturb the class-inclusion threshold.

22 Step 4 Stop. The algorithm converges to a stable residual set.

23 Sample final state

24 Final decision list

25 Results


