Natural Language Processing Slides adapted from Pedro Domingos


1 Natural Language Processing Slides adapted from Pedro Domingos
What's the problem? Input: natural language sentences. Output: a parse tree and a semantic interpretation (a logical representation).

2 Example Applications Enables great user interfaces!
Spelling and grammar checkers. Document understanding on the WWW. Spoken language control systems: banking, shopping. Classification systems for messages and articles. Machine translation tools.

3 NLP Problem Areas Morphology: structure of words
Syntactic interpretation (parsing): create a parse tree of a sentence. Semantic interpretation: translate a sentence into the representation language. Pragmatic interpretation: take the current situation into account. Disambiguation: there may be several interpretations; choose the most probable.

4 Some Difficult Examples
From the newspapers: Squad helps dog bite victim. Helicopter powered by human flies. Levy won’t hurt the poor. Once-sagging cloth diaper industry saved by full dumps. Ambiguities: Lexical: meanings of ‘hot’, ‘back’. Syntactic: I heard the music in my room. Referential: The cat ate the mouse. It was ugly.

5 Parsing Context-free grammars:
EXPR -> NUMBER
EXPR -> VARIABLE
EXPR -> (EXPR + EXPR)
EXPR -> (EXPR * EXPR)
(2 + X) * (17 + Y) is in the grammar. (2 + (X)) is not. Why do we call them context-free?
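To make the membership question concrete, here is a minimal recursive-descent recognizer for this expression grammar. This is my own sketch; the tokenizer and function names are not from the slides.

```python
import re

# Tokens: numbers, variables, parentheses, '+' and '*'.
TOKEN_RE = re.compile(r"\d+|[A-Za-z]\w*|[()+*]")

def tokenize(s):
    return TOKEN_RE.findall(s)

def parse_expr(tokens, i=0):
    """Try to recognize an EXPR starting at position i.
    Returns the position just after the EXPR, or None on failure."""
    if i >= len(tokens):
        return None
    tok = tokens[i]
    if tok.isdigit() or tok.isidentifier():      # EXPR -> NUMBER | VARIABLE
        return i + 1
    if tok == "(":                               # EXPR -> (EXPR + EXPR) | (EXPR * EXPR)
        j = parse_expr(tokens, i + 1)
        if j is None or j >= len(tokens) or tokens[j] not in "+*":
            return None
        k = parse_expr(tokens, j + 1)
        if k is None or k >= len(tokens) or tokens[k] != ")":
            return None
        return k + 1
    return None

def in_grammar(s):
    tokens = tokenize(s)
    return parse_expr(tokens) == len(tokens)

print(in_grammar("(2 + X)"))     # True
print(in_grammar("(2 + (X))"))   # False: "(X)" is not a valid EXPR on its own
```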

6 Using CFGs for Parsing
Can natural language syntax be captured using a context-free grammar? Yes, no, sort of, for the most part, maybe. Words: nouns, adjectives, verbs, adverbs. Determiners: the, a, this, that. Quantifiers: all, some, none. Prepositions: in, onto, by, through. Connectives: and, or, but, while. Words combine into phrases: NP (noun phrases), VP (verb phrases).

7 An Example Grammar
S -> NP VP
VP -> V NP
NP -> NAME
NP -> ART N
ART -> a | the
V -> ate | saw
N -> cat | mouse
NAME -> Sue | Tom

8 Example Parse The mouse saw Sue.
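The tree for this sentence under the slide-7 grammar can be reproduced with NLTK; the snippet below is my own illustration (NLTK is not part of the original slides).

```python
import nltk

# The slide-7 grammar, with terminal symbols quoted as NLTK requires.
grammar = nltk.CFG.fromstring("""
S -> NP VP
VP -> V NP
NP -> NAME | ART N
ART -> 'a' | 'the'
V -> 'ate' | 'saw'
N -> 'cat' | 'mouse'
NAME -> 'Sue' | 'Tom'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse(['the', 'mouse', 'saw', 'Sue']):
    print(tree)
# (S (NP (ART the) (N mouse)) (VP (V saw) (NP (NAME Sue))))
```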

9 Ambiguity
"Sue bought the cat biscuits"
S -> NP VP
VP -> V NP
VP -> V NP NP
NP -> N
NP -> N N
NP -> Det NP
Det -> the
V -> ate | saw | bought
N -> cat | mouse | biscuits | Sue | Tom
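This grammar assigns the sentence two parse trees: one reads "the cat biscuits" as a single noun phrase (biscuits of the cat-biscuit kind), the other uses the ditransitive rule VP -> V NP NP (Sue bought biscuits for the cat). A sketch in the same style as the NLTK snippet above (again my own illustration) makes the ambiguity visible:

```python
import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
VP -> V NP | V NP NP
NP -> N | N N | Det NP
Det -> 'the'
V -> 'ate' | 'saw' | 'bought'
N -> 'cat' | 'mouse' | 'biscuits' | 'Sue' | 'Tom'
""")

trees = list(nltk.ChartParser(grammar).parse(['Sue', 'bought', 'the', 'cat', 'biscuits']))
print(len(trees))        # 2
for tree in trees:
    print(tree)          # one tree per reading
```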

10 Example: Chart Parsing
Three main data structures: a chart, a key list, and a set of edges.
Chart: a table indexed by starting point (1, 2, 3, 4 across the columns) and length (1, 2, 3, 4 down the rows); each cell records the name of the terminal or non-terminal found at that position.

11 Key List and Edges Key list: Push down stack of chart entries
Key list (for "The box floats"): "the", "box", "floats".
Edges: rules that can be applied to chart entries to build up larger entries.
The chart holds the entries "the" (starting point 1), "box" (starting point 2), and "floats" (starting point 3), each of length 1; an example edge is Det -> the o, where "o" marks how much of the rule's right-hand side has been matched so far.

12 Chart Parsing Algorithm
Loop while there are entries in the key list:
1. Remove an entry from the key list.
2. If the entry is already in the chart, add the edge list and go back to step 1.
3. Otherwise, add the entry from the key list to the chart.
4. For all rules that begin with the entry's type, add an edge for that rule.
5. For all edges that need the entry next, add an extended edge (see below).
6. If the extended edge is finished, add an entry to the key list with its type, start point, length, and edge list.
To extend an edge e with chart entry c:
Create a new edge e'.
Set start(e') to start(e).
Set end(e') to end(c).
Set rule(e') to rule(e) with "o" moved beyond c.
Set righthandside(e') to righthandside(e) + c.

13 Try it
S -> NP VP
VP -> V
NP -> Det N
Det -> the
N -> box
V -> floats
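Below is a minimal sketch of the slide-12 algorithm applied to this grammar and the sentence "The box floats". The data-structure details and names (entries as (symbol, start, length) triples, edges as (lhs, rhs, dot, start, end) tuples) are my own choices, not the original implementation; to keep the sketch short, the key list is seeded so that the words are processed left to right.

```python
# A minimal bottom-up chart parser following the slide-12 algorithm.

RULES = [            # the slide-13 grammar
    ("S",   ("NP", "VP")),
    ("VP",  ("V",)),
    ("NP",  ("Det", "N")),
    ("Det", ("the",)),
    ("N",   ("box",)),
    ("V",   ("floats",)),
]

def parse(words):
    chart = set()                      # entries: (symbol, start, length)
    edges = set()                      # edges: (lhs, rhs, dot, start, end)
    # Key list: a stack of entries, seeded in reverse so words pop left-to-right.
    keylist = [(w, i, 1) for i, w in reversed(list(enumerate(words, start=1)))]

    def add_edge(lhs, rhs, dot, start, end):
        edge = (lhs, rhs, dot, start, end)
        if edge in edges:
            return
        edges.add(edge)
        if dot == len(rhs):            # finished edge -> new key-list entry
            keylist.append((lhs, start, end - start))

    while keylist:
        entry = keylist.pop()          # 1. remove an entry from the key list
        if entry in chart:             # 2. already known: skip it
            continue
        chart.add(entry)               # 3. add the entry to the chart
        sym, start, length = entry
        end = start + length
        for lhs, rhs in RULES:         # 4. rules that begin with the entry's type
            if rhs[0] == sym:
                add_edge(lhs, rhs, 1, start, end)
        for lhs, rhs, dot, estart, eend in list(edges):
            # 5./6. edges that need this entry next get extended
            if dot < len(rhs) and rhs[dot] == sym and eend == start:
                add_edge(lhs, rhs, dot + 1, estart, end)
    return chart

chart = parse(["the", "box", "floats"])
print(("S", 1, 3) in chart)            # True: an S of length 3 starts at position 1
```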

14 Semantic Interpretation
Our goal: translate sentences into a logical form. But sentences convey more than true/false: "It will rain in Seattle tomorrow." vs. "Will it rain in Seattle tomorrow?" A sentence can be analyzed into its propositional content and its speech act: tell, ask, request, deny, suggest.

15 Propositional Content
We develop a logic-like language for representing propositional content, which must deal with word-sense ambiguity and scope ambiguity:
Proper names --> objects (John, Alon)
Nouns --> unary predicates (woman, house)
Verbs --> transitive: binary predicates (find, go); intransitive: unary predicates (laugh, cry)
Quantifiers: most, some
Example: "John loves Mary" --> Loves(John, Mary)
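As a toy illustration of this mapping (my own sketch; the tuple representation and the tiny lexicon below are not from the slides), a logical form can be assembled from a simple subject-verb-object clause:

```python
# Toy logical forms as nested tuples, e.g. ('Loves', 'John', 'Mary').
LEXICON = {
    "John":   ("name", "John"),
    "Mary":   ("name", "Mary"),
    "loves":  ("verb_trans", "Loves"),
    "laughs": ("verb_intrans", "Laughs"),
}

def interpret(subject, verb, obj=None):
    """Map a (subject, verb[, object]) clause to a predicate-argument term."""
    kind, pred = LEXICON[verb]
    subj = LEXICON[subject][1]
    if kind == "verb_trans":
        return (pred, subj, LEXICON[obj][1])   # transitive verb: binary predicate
    return (pred, subj)                        # intransitive verb: unary predicate

print(interpret("John", "loves", "Mary"))   # ('Loves', 'John', 'Mary')
print(interpret("Mary", "laughs"))          # ('Laughs', 'Mary')
```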

16 Statistical NLP (see Charniak, Statistical Language Learning, MIT Press, 1993)
Consider the problem of part-of-speech tagging: "The box floats". "the" --> Det; "box" --> N; "floats" --> V. Given a sentence w(1,n), where w(i) is the i-th word, we want to find the tags t(i) assigned to each word w(i).

17 The Equations
Find the tag sequence t(1,n) that maximizes P[t(1,n)|w(1,n)].
By Bayes' rule, P[t(1,n)|w(1,n)] = P[w(1,n)|t(1,n)] * P[t(1,n)] / P[w(1,n)].
Since P[w(1,n)] does not depend on the tags, we only need to maximize P[w(1,n)|t(1,n)] * P[t(1,n)].
Assume that a word depends only on its own tag, and a tag depends only on the previous tag:
P[w(j)|w(1,j-1), t(1,j)] = P[w(j)|t(j)], and
P[t(j)|w(1,j-1), t(1,j-1)] = P[t(j)|t(j-1)].
Thus we want to maximize the product over j of P[w(j)|t(j)] * P[t(j)|t(j-1)].

18 Example “The box floats”: given a corpus (a training set)
Assignment one: t(1) = Det, t(2) = V, t(3) = V. P(V|Det) is rather low, and so is P(V|V), so this assignment is less likely than
Assignment two: t(1) = Det, t(2) = N, t(3) = V. P(N|Det) is high, and P(V|N) is high, so this assignment is more likely!
In general, a Hidden Markov Model can be used to find the most probable tag sequence.
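A minimal Viterbi-decoding sketch for this example; the transition and emission probabilities below are invented purely for illustration (they are not from the slides or from any corpus).

```python
# Viterbi decoding for "the box floats" with made-up toy probabilities.
TAGS = ["Det", "N", "V"]

# Transition probabilities P(tag | previous tag); "<s>" is the start state.
TRANS = {
    ("<s>", "Det"): 0.8, ("<s>", "N"): 0.15, ("<s>", "V"): 0.05,
    ("Det", "N"): 0.9,   ("Det", "V"): 0.05, ("Det", "Det"): 0.05,
    ("N", "V"): 0.6,     ("N", "N"): 0.3,    ("N", "Det"): 0.1,
    ("V", "Det"): 0.5,   ("V", "N"): 0.3,    ("V", "V"): 0.2,
}

# Emission probabilities P(word | tag).
EMIT = {
    ("Det", "the"): 0.7,
    ("N", "box"): 0.01, ("N", "floats"): 0.005,
    ("V", "floats"): 0.01, ("V", "box"): 0.002,
}

def viterbi(words):
    # best[tag] = (probability of the best path ending in tag, that path)
    best = {"<s>": (1.0, [])}
    for w in words:
        new_best = {}
        for tag in TAGS:
            cands = [
                (p * TRANS.get((prev, tag), 1e-6) * EMIT.get((tag, w), 1e-6),
                 path + [tag])
                for prev, (p, path) in best.items()
            ]
            new_best[tag] = max(cands)
        best = new_best
    return max(best.values())

prob, tags = viterbi(["the", "box", "floats"])
print(tags)   # ['Det', 'N', 'V']
```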

19 Experiments Charniak and colleagues ran experiments on a collection of documents called the "Brown Corpus", where tags are assigned by hand. 90% of the corpus was used for training and the other 10% for testing. They show they can get 95% correctness with HMMs. A really simple algorithm, assigning to each word w its highest-probability tag argmax_t P(t|w), already gets 91% correctness!
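A sketch of that simple baseline (my own illustration; the tiny training list stands in for a real tagged corpus such as the Brown Corpus training split):

```python
from collections import Counter, defaultdict

def train_baseline(tagged_corpus):
    """tagged_corpus: iterable of (word, tag) pairs from the training split."""
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[word][tag] += 1
    # For each word, remember its single most frequent tag (argmax_t P(t|w)).
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}

def tag(sentence, most_likely, default="N"):
    # Unknown words fall back to a default tag (here: noun).
    return [most_likely.get(w, default) for w in sentence]

most_likely = train_baseline([("the", "Det"), ("box", "N"), ("box", "V"),
                              ("box", "N"), ("floats", "V")])
print(tag(["the", "box", "floats"], most_likely))   # ['Det', 'N', 'V']
```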

20 Natural Language Summary
Parsing: context-free grammars with features. Semantic interpretation: translate sentences into a logic-like language; use additional domain knowledge for word-sense disambiguation; use context to disambiguate references.

