LING 388: Computers and Language


1 LING 388: Computers and Language
Lecture 30

2 Last class today
Reminder: TCE survey!
Submit your term project (Friday); have already received some!

3 nltk book: chapter 5 (Adjectives and Adverbs)
(Attributive) The falling man …
(Predicative) The man is falling …
Sometimes there is a difference in meaning:
Mary's really nice teacher
Mary's teacher is really nice

4 nltk book: chapter 5 (Exploring Tagged Corpora)
Words that follow 'often', using .words():
Word categories that follow 'often', using .tagged_words():
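The two searches named on this slide can be sketched without the corpus. This is a minimal sketch, assuming a small hand-made tagged sample (hypothetical, not Brown data) and using zip() over adjacent items as a stand-in for nltk.bigrams. The helper name followers is an assumption made for illustration. With NLTK installed, brown.words() and brown.tagged_words() would supply the real input.

```python
from collections import Counter

# Hypothetical toy sample of (word, tag) pairs; not taken from the Brown corpus.
tagged = [
    ('He', 'PRON'), ('often', 'ADV'), ('runs', 'VERB'), ('.', '.'),
    ('She', 'PRON'), ('is', 'VERB'), ('often', 'ADV'), ('late', 'ADJ'), ('.', '.'),
    ('They', 'PRON'), ('often', 'ADV'), ('walk', 'VERB'), ('.', '.'),
]

def followers(pairs, target):
    """Return the (word, tag) items that immediately follow the word `target`."""
    # zip(pairs, pairs[1:]) yields adjacent items, i.e. bigrams.
    return [nxt for (cur, nxt) in zip(pairs, pairs[1:]) if cur[0] == target]

after_often = followers(tagged, 'often')

# Distinct words that follow 'often' (what .words() would give us).
words_after = sorted({w for (w, _) in after_often})

# Tag frequencies of those words (what .tagged_words() adds).
tags_after = Counter(t for (_, t) in after_often)

print(words_after)
print(tags_after.most_common())
```

Counting tags rather than words shows which categories tend to follow 'often', which is the point of the second search.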

5 nltk book: chapter 5 (Trigram search)
>>> from nltk.corpus import brown
>>> brown.tagged_sents()
[[('The', 'AT'), ('Fulton', 'NP-TL'), ('County', 'NN-TL'), ('Grand', 'JJ-TL'),
  ('Jury', 'NN-TL'), ('said', 'VBD'), ('Friday', 'NR'), ('an', 'AT'),
  ('investigation', 'NN'), ('of', 'IN'), ("Atlanta's", 'NP$'), ('recent', 'JJ'),
  ('primary', 'NN'), ('election', 'NN'), ('produced', 'VBD'), ('``', '``'),
  ('no', 'AT'), ('evidence', 'NN'), ("''", "''"), ('that', 'CS'), ('any', 'DTI'),
  ('irregularities', 'NNS'), ('took', 'VBD'), ('place', 'NN'), ('.', '.')],
 [('The', 'AT'), ('jury', 'NN'), ('further', 'RBR'), ('said', 'VBD'), ('in', 'IN'),
  ('term-end', 'NN'), ('presentments', 'NNS'), ('that', 'CS'), ('the', 'AT'),
  ('City', 'NN-TL'), ('Executive', 'JJ-TL'), ('Committee', 'NN-TL'), (',', ','),
  ('which', 'WDT'), ('had', 'HVD'), ('over-all', 'JJ'), ('charge', 'NN'),
  ('of', 'IN'), ('the', 'AT'), ('election', 'NN'), (',', ','), ('``', '``'),
  ('deserves', 'VBZ'), ('the', 'AT'), ('praise', 'NN'), ('and', 'CC'),
  ('thanks', 'NNS'), ('of', 'IN'), ('the', 'AT'), ('City', 'NN-TL'),
  ('of', 'IN-TL'), ('Atlanta', 'NP-TL'), ("''", "''"), ('for', 'IN'),
  ('the', 'AT'), ('manner', 'NN'), ('in', 'IN'), ('which', 'WDT'), ('the', 'AT'),
  ('election', 'NN'), ('was', 'BEDZ'), ('conducted', 'VBN'), ('.', '.')],
 ...]

6 nltk book: chapter 5 (Trigram search)
>>> import nltk
>>> for ts in brown.tagged_sents():
...     for (w1,t1), (w2,t2), (w3,t3) in nltk.trigrams(ts):
...         if t1.startswith('V') and t3.startswith('V') and t2 == 'TO':
...             print(w1,w2,w3)
...
combined to achieve
continue to place
serve to protect
help to intensify
seems to overtake
want to buy
4023 such fragments
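The same verb + TO + verb pattern can be tried on a few hand-written sentences. This is a minimal sketch, assuming hypothetical toy sentences with Brown-style tags (not real corpus data). The local trigrams helper is a plain-Python stand-in for nltk.trigrams, and verb_to_verb is a name made up for this illustration.

```python
# Hypothetical toy sentences with Brown-style tags; not real corpus data.
tagged_sents = [
    [('They', 'PPSS'), ('want', 'VB'), ('to', 'TO'), ('buy', 'VB'), ('it', 'PPO'), ('.', '.')],
    [('He', 'PPS'), ('went', 'VBD'), ('to', 'IN'), ('town', 'NN'), ('.', '.')],
    [('It', 'PPS'), ('seems', 'VBZ'), ('to', 'TO'), ('work', 'VB'), ('.', '.')],
]

def trigrams(seq):
    """Yield consecutive triples from a list (stand-in for nltk.trigrams)."""
    return zip(seq, seq[1:], seq[2:])

def verb_to_verb(sents):
    """Collect (w1, w2, w3) where a V* tag is followed by TO and another V* tag."""
    hits = []
    for sent in sents:
        for (w1, t1), (w2, t2), (w3, t3) in trigrams(sent):
            if t1.startswith('V') and t2 == 'TO' and t3.startswith('V'):
                hits.append((w1, w2, w3))
    return hits

matches = verb_to_verb(tagged_sents)
for m in matches:
    print(*m)
print(len(matches), 'such fragments')
```

The second toy sentence is skipped because its 'to' is tagged IN (preposition), not TO, which shows why the search tests the tag rather than the word.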

7 nltk book: chapter 5 (tag ambiguity > 3)

8 nltk book: chapter 5 (tag ambiguity > 3)
>>> import nltk
>>> from nltk.corpus import brown
>>> bnt = brown.tagged_words(categories='news', tagset='universal')
>>> cfd = nltk.ConditionalFreqDist((w.lower(),tag) for (w,tag) in bnt)
>>> cfd
<ConditionalFreqDist with conditions>
>>> cfd['bank']
FreqDist({'NOUN': 16})
>>> cfd['grant']
FreqDist({'NOUN': 11, 'VERB': 4})
>>> len(cfd['grant'])
2

9 nltk book: chapter 5 (tag ambiguity > 3)
>>> for w in sorted(cfd.conditions()):
...     if len(cfd[w]) > 3:
...         print(w,cfd[w])
...
best <FreqDist with 4 samples and 31 outcomes>
close <FreqDist with 4 samples and 12 outcomes>
open <FreqDist with 4 samples and 33 outcomes>
present <FreqDist with 4 samples and 30 outcomes>
that <FreqDist with 4 samples and 829 outcomes>

10 nltk book: chapter 5 (tag ambiguity > 3)
>>> for w in sorted(cfd.conditions()):
...     if len(cfd[w]) > 3:
...         print(w,cfd[w].most_common())
...
best [('ADJ', 28), ('ADV', 1), ('NOUN', 1), ('VERB', 1)]
close [('ADV', 6), ('ADJ', 3), ('VERB', 2), ('NOUN', 1)]
open [('ADJ', 13), ('VERB', 11), ('NOUN', 8), ('ADV', 1)]
present [('ADJ', 21), ('ADV', 7), ('NOUN', 1), ('VERB', 1)]
that [('ADP', 546), ('DET', 150), ('PRON', 128), ('ADV', 5)]
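The counting behind the tag-ambiguity search can be reproduced with the standard library alone. This is a minimal sketch, assuming a hypothetical toy list of (word, tag) pairs with universal-style tags (not Brown data). A defaultdict of Counter objects stands in for nltk.ConditionalFreqDist, and the function name ambiguous is made up for this illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy (word, tag) pairs with universal-style tags; not Brown data.
tagged = [
    ('Open', 'VERB'), ('the', 'DET'), ('open', 'ADJ'), ('door', 'NOUN'),
    ('in', 'ADP'), ('the', 'DET'), ('open', 'NOUN'), ('and', 'CONJ'),
    ('stay', 'VERB'), ('open', 'ADV'), ('that', 'DET'), ('that', 'PRON'),
]

# Map each lowercased word to a Counter of its tags
# (a plain-Python stand-in for nltk.ConditionalFreqDist).
tag_counts = defaultdict(Counter)
for word, tag in tagged:
    tag_counts[word.lower()][tag] += 1

def ambiguous(counts, threshold):
    """Return {word: most_common list} for words with more than `threshold` distinct tags."""
    return {w: c.most_common()
            for w, c in sorted(counts.items())
            if len(c) > threshold}

result = ambiguous(tag_counts, 3)
for w, tags in result.items():
    print(w, tags)
```

Lowercasing before counting matters here: 'Open' and 'open' are pooled into one entry, just as w.lower() pools them in the slides.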

