Learning Bit by Bit Class 4 - Ngrams. Ngrams Counting words Using observation to make predictions.

Presentation transcript:

1 Learning Bit by Bit Class 4 - Ngrams

2 Ngrams Counting words Using observation to make predictions

3 Ngrams Corpus/Corpora

4 Unigram “how’s the weather out there?” [how’s, the, weather, out, there]

5 Unigram How many words are there?

6 Unigram How many times does “weather” occur?

7 Unigram Prob “weather” = occurrences of “weather”/ total # words

8 Unigram P(“weather”) = c(“weather”) / c(total)
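
The unigram estimate above can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function name `unigram_probability` is hypothetical, and the "corpus" is just the five tokens from slide 4.

```python
from collections import Counter

def unigram_probability(tokens, word):
    """Return c(word) / c(total) for a list of tokens (hypothetical helper)."""
    counts = Counter(tokens)
    # Counter returns 0 for unseen words, so unseen words get probability 0.
    return counts[word] / len(tokens)

# Tokens from the slide-4 example sentence.
tokens = ["how's", "the", "weather", "out", "there"]
p_weather = unigram_probability(tokens, "weather")
print(p_weather)  # 1 occurrence out of 5 tokens
```

With a realistic corpus the same function applies unchanged; only the token list grows.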

9 Bigram “the storm swept through the land” [(the, storm), (storm, swept), (swept, through), (through, the), (the, land)]

10 Bigram How many times does “storm” follow “the”?

11 Bigram How many times does the word “the” occur?

12 Bigram Prob “storm” given “the” = occurrences of “the storm” / occurrences of “the”

13 Bigram P(“storm” | “the”) = c(“the storm”) / c(“the”), in general P(word n | word n-1)
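
As a rough sketch of the bigram calculation, the snippet below counts word pairs in the slide-9 sentence and divides by the count of the first word. The helper names `bigrams` and `bigram_probability` are assumptions made for illustration.

```python
from collections import Counter

def bigrams(tokens):
    """Pair each token with the token that follows it."""
    return list(zip(tokens, tokens[1:]))

def bigram_probability(tokens, prev_word, word):
    """Estimate P(word | prev_word) as c(prev_word word) / c(prev_word)."""
    pair_counts = Counter(bigrams(tokens))
    prev_count = tokens.count(prev_word)
    if prev_count == 0:
        return 0.0  # prev_word never seen, so no estimate is available
    return pair_counts[(prev_word, word)] / prev_count

storm_tokens = "the storm swept through the land".split()
p_storm_given_the = bigram_probability(storm_tokens, "the", "storm")
print(p_storm_given_the)  # "the" occurs twice, "the storm" once
```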

14 Markov Assumption The assumption that the probability of a word depends only on the previous word, or the previous few words, rather than on the entire history: P(“land” | “the storm swept through the”) ≈ P(“land” | “the”)

15 N-gram Extends the bigram model to condition on the previous N-1 words
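
Written out for a whole sentence, the assumption turns the exact chain rule into a product of short-context terms. The notation below ($w_i$ for the $i$-th word, $m$ for the sentence length) is introduced here for illustration and does not appear on the slides; the bigram case is shown.

```latex
% Exact chain rule: each word is conditioned on its full history.
P(w_1, \dots, w_m) = \prod_{i=1}^{m} P(w_i \mid w_1, \dots, w_{i-1})

% Bigram (first-order Markov) approximation: keep only the previous word.
P(w_1, \dots, w_m) \approx \prod_{i=1}^{m} P(w_i \mid w_{i-1})
```

Here the $i = 1$ term is read as $P(w_1)$, or as $P(w_1 \mid \text{start})$ if a sentence-start marker is used.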

16 Maximum Likelihood Estimation N-gram probability based on corpus counts: P(word n | word n-1) = count of times word n-1 is followed by word n / count of all times word n-1 occurs
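
One way to picture maximum likelihood estimation is to build the whole table of next-word probabilities at once. The sketch below is an assumption-laden illustration (the name `mle_bigram_model` is hypothetical). Note one simplification: the denominator counts only occurrences of a word that are followed by another word, so the final token of the corpus contributes no row.

```python
from collections import Counter, defaultdict

def mle_bigram_model(tokens):
    """Map each word to a dict of next-word probabilities from raw counts."""
    follower_counts = defaultdict(Counter)
    for prev_word, word in zip(tokens, tokens[1:]):
        follower_counts[prev_word][word] += 1
    model = {}
    for prev_word, followers in follower_counts.items():
        total = sum(followers.values())
        # Relative frequency: count of (prev_word, w) over count of prev_word.
        model[prev_word] = {w: c / total for w, c in followers.items()}
    return model

model = mle_bigram_model("the storm swept through the land".split())
print(model["the"])  # "storm" and "land" each follow "the" once
```

Each row of the table sums to 1, which is what makes it a conditional probability distribution over the next word.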

17 Trigram “the quick red fox jumped the quick black bear. The quick red fox hopped away.” [(the, quick, red), (quick, red, fox), (red, fox, jumped), (fox, jumped, the), (jumped, the, quick), (the, quick, black), (quick, black, bear), (the, quick, red), (quick, red, fox), (red, fox, hopped), (fox, hopped, away)]

18 Trigram How many times does “the quick red” occur?

19 Trigram How many times does “the quick” occur?

20 Trigram Prob “red” given “the quick” = occurrences of “the quick red” / occurrences of “the quick”
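
The trigram questions on slides 18 to 20 can be checked with a small counting sketch. The helper `ngrams` is a hypothetical name. As an assumption matching the trigram list on slide 17, the text is lowercased and n-grams are not allowed to cross the sentence boundary.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# The two sentences from slide 17, lowercased, punctuation dropped.
sentences = [
    "the quick red fox jumped the quick black bear".split(),
    "the quick red fox hopped away".split(),
]

trigram_counts = Counter()
bigram_counts = Counter()
for sentence in sentences:
    trigram_counts.update(ngrams(sentence, 3))
    bigram_counts.update(ngrams(sentence, 2))

c_the_quick_red = trigram_counts[("the", "quick", "red")]
c_the_quick = bigram_counts[("the", "quick")]
p_red = c_the_quick_red / c_the_quick  # P("red" | "the quick")
print(c_the_quick_red, c_the_quick, p_red)
```

On this tiny corpus “the quick red” occurs twice and “the quick” three times, so the estimate is 2/3.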

21 Test it in Google Google “the weather” How many results?

22 Test it in Google Google “the weather is” How many results?

23 Test it in Google Google “the weather out” How many results?

24 Test it in Google Google “weather the out” How many results?

25 Test it in Google Prob “out” given “the weather” = count of “the weather out” / count of “the weather”

26 Test it in Google Why so few results for “weather the out”?

27 Training and Testing Training set: larger, e.g. 80-90% of the corpus. Testing set: smaller, e.g. 10-20% of the corpus.
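
A minimal sketch of such a split is shown below. The function name `split_corpus`, the 0.8 default, and the placeholder sentences are all assumptions for illustration. The split simply cuts the list in order, so in practice the sentences might be shuffled first.

```python
def split_corpus(sentences, train_fraction=0.8):
    """Cut a list of sentences into a larger training part and a smaller test part."""
    cut = int(len(sentences) * train_fraction)
    return sentences[:cut], sentences[cut:]

# Placeholder corpus of ten items, standing in for real sentences.
corpus = [f"sentence {i}" for i in range(10)]
train_set, test_set = split_corpus(corpus)
print(len(train_set), len(test_set))  # 8 training items, 2 test items
```

N-gram counts would then be collected from `train_set` only, with `test_set` held back to check how well the predictions generalize.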

28 Examples

