
1
Probabilistic Parsing Ling 571 Fei Xia Week 4: 10/18-10/20/05

2
Outline
–Misc: Hw3 and Hw4 (lexicalized rules)
–CYK recap: converting CFG into CNF; N-best output
–Quiz #2
–Common probability equations
–Independence assumption
–Lexicalized models

3
CYK Recap

4
Converting CFG into CNF
–CNF
–Extended CNF
–CFG in general vs. CFG for natural languages
–Converting CFG into CNF
–Converting PCFG into CNF
–Recovering parse trees

5
Definition of CNF
A, B, C are non-terminals, a is a terminal, S is the start symbol.
Definition 1:
–A → B C
–A → a
–S → ε
where B, C are not start symbols.
Definition 2 (ε-free grammar):
–A → B C
–A → a

6
Extended CNF
Definition 3:
–A → B C
–A → a or A → B
We use Def 3:
–Unit rules such as NP → N are allowed.
–No need to remove unit rules during conversion.
–The CYK algorithm needs to be modified.

7
CYK algorithm with Def 2
For every rule A → w_i, set P[i][i][A] = P(A → w_i)
For span = 2 to N
 for begin = 1 to N-span+1
  end = begin + span - 1
  for m = begin to end-1
   for all non-terminals A, B, C:
    if P(A → B C) * P[begin][m][B] * P[m+1][end][C] > P[begin][end][A]
    then update P[begin][end][A] and its backpointer

8
CYK algorithm with Def 3
For every position i:
 for all A, if A → w_i, set P[i][i][A] = P(A → w_i)
 for all A and B, if A → B, update P[i][i][A]
For span = 2 to N
 for begin = 1 to N-span+1
  end = begin + span - 1
  for m = begin to end-1
   for all non-terminals A, B, C: … (as with Def 2)
   for all non-terminals A and B, if A → B, update P[begin][end][A]
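The modified algorithm can be sketched in Python. The toy grammar, rule probabilities, and sentence below are my own illustration, not from the slides; unit rules are handled by closing each cell under A → B after the lexical/binary steps:

```python
# A minimal sketch of probabilistic CYK for a grammar in extended CNF
# (Def 3: binary rules A -> B C, lexical rules A -> w, unit rules A -> B).
# Grammar and probabilities are illustrative, not from the slides.
from collections import defaultdict

binary = {('S', ('NP', 'VP')): 1.0,
          ('VP', ('V', 'NP')): 1.0}
lexical = {('V', 'likes'): 1.0, ('N', 'he'): 0.5, ('N', 'her'): 0.5}
unit = {('NP', 'N'): 1.0}   # unit rule NP -> N is allowed under Def 3

def apply_unit_rules(cell):
    """Close one chart cell under unit rules A -> B."""
    changed = True
    while changed:
        changed = False
        for (a, b), p in unit.items():
            if b in cell and p * cell[b] > cell.get(a, 0.0):
                cell[a] = p * cell[b]
                changed = True
    return cell

def cyk(words):
    n = len(words)
    chart = defaultdict(dict)          # chart[(begin, end)][A] = best prob
    for i, w in enumerate(words):      # span 1: lexical rules, then unit rules
        for (a, word), p in lexical.items():
            if word == w:
                chart[(i, i)][a] = p
        apply_unit_rules(chart[(i, i)])
    for span in range(2, n + 1):       # spans 2..n: binary rules, then unit rules
        for begin in range(n - span + 1):
            end = begin + span - 1
            cell = chart[(begin, end)]
            for m in range(begin, end):
                for (a, (b, c)), p in binary.items():
                    pb = chart[(begin, m)].get(b, 0.0)
                    pc = chart[(m + 1, end)].get(c, 0.0)
                    if p * pb * pc > cell.get(a, 0.0):
                        cell[a] = p * pb * pc
            apply_unit_rules(cell)
    return chart[(0, n - 1)].get('S', 0.0)

print(cyk(['he', 'likes', 'her']))   # prints 0.25
```

Because unit-rule closure runs inside every cell, NP → N never has to be eliminated from the grammar, which is the point of Def 3.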

9
CFG
CFG in general:
–G = (N, T, P, S)
–Rules: A → α, where A ∈ N and α ∈ (N ∪ T)*
CFG for natural languages:
–G = (N, T, P, S)
–Pre-terminals: POS tags
–Rules:
 Syntactic rules: non-terminals on both sides
 Lexicon: a pre-terminal rewriting to a word

10
Conversion from CFG to CNF
CFG (in general) to CNF (Def 1):
–Add S0 → S
–Remove ε-rules
–Remove unit rules
–Replace n-ary rules with binary rules
CFG (for NL) to CNF (Def 3):
–CFG (for NL) has no ε-rules
–Unit rules are allowed in CNF (Def 3)
–Only the last step (binarization) is necessary

11
An example
VP → V NP PP PP
To recover the parse tree w.r.t. the original CFG, just remove the added non-terminals.

12
Converting PCFG into CNF
VP → V NP PP PP 0.1
=> VP → V X1 0.1
 X1 → NP X2 1.0
 X2 → PP PP 1.0
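This conversion can be sketched as a small helper. The function name `binarize` is my own; the X1, X2, … names follow the slide, and as on the slide the first new rule keeps the original probability while the introduced rules get probability 1.0:

```python
# A sketch of binarizing one n-ary PCFG rule: the first new rule keeps the
# original probability, the introduced rules get probability 1.0.
# The helper name `binarize` is an assumption, not from the slides.
import itertools

_new_ids = itertools.count(1)

def binarize(lhs, rhs, prob):
    """Split lhs -> rhs (len(rhs) >= 2) into binary rules."""
    rules = []
    rhs = list(rhs)
    while len(rhs) > 2:
        new_nt = f'X{next(_new_ids)}'           # fresh non-terminal X1, X2, ...
        rules.append((lhs, (rhs[0], new_nt), prob))
        lhs, rhs, prob = new_nt, rhs[1:], 1.0   # remaining rules get prob 1.0
    rules.append((lhs, tuple(rhs), prob))
    return rules

rules = binarize('VP', ['V', 'NP', 'PP', 'PP'], 0.1)
for r in rules:
    print(r)
# ('VP', ('V', 'X1'), 0.1)
# ('X1', ('NP', 'X2'), 1.0)
# ('X2', ('PP', 'PP'), 1.0)
```

Because every introduced rule has probability 1.0, the probability of any tree is unchanged by the conversion, so the best parse under the CNF grammar maps back to the best parse under the original grammar.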

13
CYK with N-best output

14
N-best parse trees
Best parse tree: T* = argmax_T P(T, S)
N-best parse trees: the N trees with the highest P(T, S)

15
CYK algorithm for N-best
For every rule A → w_i, initialize P[i][i][A]
For span = 2 to N
 for begin = 1 to N-span+1
  end = begin + span - 1
  for m = begin to end-1
   for all non-terminals A, B, C:
    for each pair (i, j) of entries in P[begin][m][B] and P[m+1][end][C]:
     val = P(A → B C) * P[begin][m][B][i] * P[m+1][end][C][j]
     if val > one of the probs in P[begin][end][A]
      then remove the last element of P[begin][end][A] and insert val into the (sorted) array;
      remove the last element of B[begin][end][A] and insert (m, B, C, i, j) into B[begin][end][A].
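The update step above can be sketched as follows. Each cell keeps a descending array of its N best probabilities plus a parallel backpointer array; the toy values and string backpointers are illustrative, and in a real parser each backpointer would be the tuple (m, B, C, i, j) from the slide:

```python
# A sketch of the N-best cell update: keep the N best probabilities sorted
# in descending order, with a parallel array of backpointers.
# The toy values and string backpointers are illustrative only.
N_BEST = 3

def update(probs, backs, val, backpointer):
    """Insert (val, backpointer) if val beats one of the stored probs."""
    if len(probs) == N_BEST and val <= probs[-1]:
        return                               # not better than the current worst
    pos = 0
    while pos < len(probs) and probs[pos] >= val:
        pos += 1                             # find position in descending order
    probs.insert(pos, val)
    backs.insert(pos, backpointer)
    if len(probs) > N_BEST:                  # drop the last (worst) element
        probs.pop()
        backs.pop()

probs, backs = [], []
for val, bp in [(0.2, 'a'), (0.5, 'b'), (0.1, 'c'), (0.4, 'd')]:
    update(probs, backs, val, bp)
print(probs)   # [0.5, 0.4, 0.2]
print(backs)   # ['b', 'd', 'a']
```

Keeping the array sorted makes the "is val better than one of the stored probs" test a single comparison against the last element.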

16
Mary bought books with cash
S → NP VP (1,1,1); S → NP VP (1,1,2)
VP → V NP (2,1,1)
VP → VP PP (3,1,1)
NP → NP PP (3,1,1)
PP → P NP (4,1,1)
N → cash; NP → N
P → with
S → NP VP (1,1,1)
VP → V NP (2,1,1)
N → books; NP → N
V → bought
N → book; NP → N

17
Common probability equations

18
Three types of probability
Joint prob: P(x,y) = prob of x and y happening together
Conditional prob: P(x|y) = prob of x given a specific value of y
Marginal prob: P(x) = prob of x, summed over all possible values of y

19
Common equations
–Marginalization: P(x) = Σ_y P(x, y)
–Chain rule: P(x, y) = P(y) P(x|y) = P(x) P(y|x)
–Bayes' rule: P(x|y) = P(y|x) P(x) / P(y)

20
An example
#(words) = 100, #(nouns) = 40, #(verbs) = 20
“books” appears 10 times: 3 times as a verb, 7 as a noun
P(w=books) = 0.1
P(w=books, t=noun) = 0.07
P(t=noun | w=books) = 0.7
P(t=noun) = 0.4
P(w=books | t=noun) = 7/40
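The slide's numbers can be checked mechanically with the chain rule and Bayes' rule; a small sketch using the same counts:

```python
# Checking the slide's example with the chain rule and Bayes' rule.
# Counts: 100 words, 40 nouns, 20 verbs; "books" occurs 10 times (7 noun, 3 verb).
p_books = 10 / 100                 # P(w=books) = 0.1
p_books_noun = 7 / 100             # P(w=books, t=noun) = 0.07
p_noun = 40 / 100                  # P(t=noun) = 0.4

p_noun_given_books = p_books_noun / p_books    # P(t=noun | w=books) = 0.7
p_books_given_noun = p_books_noun / p_noun     # P(w=books | t=noun) = 7/40

# chain rule: P(w, t) = P(w) * P(t | w)
assert abs(p_books * p_noun_given_books - p_books_noun) < 1e-12
# Bayes' rule: P(w | t) = P(t | w) * P(w) / P(t)
assert abs(p_noun_given_books * p_books / p_noun - p_books_given_noun) < 1e-12

print(p_noun_given_books, p_books_given_noun)
```

Every quantity here comes straight from relative frequencies in the counts, which is exactly how the slide derives them.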

21
More general cases

22
Independence assumption

23
Two variables A and B are independent if
–P(A, B) = P(A) * P(B)
–P(A) = P(A|B)
–P(B) = P(B|A)
Two variables A and B are conditionally independent given C if
–P(A, B | C) = P(A|C) * P(B|C)
–P(A | B, C) = P(A|C)
–P(B | A, C) = P(B|C)
The independence assumption is used to remove some conditioning factors, which reduces the number of parameters in a model.

24
PCFG parsers
A PCFG assumes that each rule is applied independently of the other rules in a derivation.
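Under this assumption, the probability of a tree is just the product of the probabilities of the rules it uses; a sketch with an illustrative grammar and derivation (the numbers are my own, not from the slides):

```python
# Under the PCFG independence assumption, P(tree) is the product of the
# probabilities of the rules used in the derivation.
# The toy grammar, probabilities, and derivation are illustrative only.
from math import prod

rule_prob = {('S', ('NP', 'VP')): 1.0,
             ('NP', ('N',)): 0.6,
             ('VP', ('V', 'NP')): 0.8,
             ('N', ('he',)): 0.1,
             ('V', ('likes',)): 0.2,
             ('N', ('her',)): 0.1}

# rules used in one derivation of "he likes her"
derivation = [('S', ('NP', 'VP')),
              ('NP', ('N',)), ('N', ('he',)),
              ('VP', ('V', 'NP')), ('V', ('likes',)),
              ('NP', ('N',)), ('N', ('her',))]

p_tree = prod(rule_prob[r] for r in derivation)
print(p_tree)
```

Note that nothing in this product looks at where a rule is used or what words are nearby; that locality is precisely what the next slides criticize.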

25
Problems of independence assumptions
Lexical independence:
–P(VP → V, V → bought) = P(VP → V) * P(V → bought)
See Table 12.2 on M&S p. 418:

              come    take    think   want
VP → V         9.5%    2.6%    4.6%    5.7%
VP → V NP      1.1%   32.1%    0.2%   13.9%
VP → V PP     34.5%    3.1%    7.1%    0.3%
VP → V SBAR    6.6%    0.3%   73.0%    0.2%

26
Problems of independence assumptions (cont)
Structural independence:
–P(S → NP VP, NP → Pron) = P(S → NP VP) * P(NP → Pron)
See Table 12.3 on M&S p. 420:

                % as subj   % as obj
NP → Pron         13.7%       2.1%
NP → Det NN        5.6%       4.6%
NP → NP SBAR       0.5%       2.6%
NP → NP PP         5.6%      14.1%

27
Dealing with the problems
Lexical rules:
–P(VP → V | V=come)
–P(VP → V | V=think)
Adding context info: condition each rule on a function of the context that groups contexts into equivalence classes.

28
PCFG
A PCFG assumes that each rule is applied independently of the other rules in a derivation.

29
A lexicalized model

30
An example: he likes her

31
Head-head probability

32
Head-rule probability

33
Collecting the counts
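One standard way to collect such counts (the slides' exact parameterization is not reproduced here) is relative-frequency estimation from a treebank. A hypothetical sketch for a head-rule probability P(rule | parent category, parent head word); the event tuples and names are my own illustration:

```python
# A sketch of MLE (relative-frequency) estimation for a lexicalized model's
# head-rule probability P(rule | parent category, parent head word).
# The event tuples are hypothetical, not the slides' exact parameterization.
from collections import Counter

# (parent, head word, rule) events extracted from hypothetical treebank trees
events = [('VP', 'likes', 'VP -> V NP'),
          ('VP', 'likes', 'VP -> V NP'),
          ('VP', 'likes', 'VP -> V'),
          ('VP', 'comes', 'VP -> V')]

joint = Counter((p, h, r) for p, h, r in events)      # count(parent, head, rule)
cond = Counter((p, h) for p, h, _ in events)          # count(parent, head)

def p_rule(rule, parent, head):
    """Relative frequency: count(parent, head, rule) / count(parent, head)."""
    return joint[(parent, head, rule)] / cond[(parent, head)]

print(p_rule('VP -> V NP', 'VP', 'likes'))   # 2/3
```

Conditioning on the head word is what lets the model prefer VP → V NP for "likes" but VP → V for "comes", addressing the lexical-independence problem from the earlier tables.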

34
Remaining problems: he likes her
–Prob(T, S) is the same if the sentence is changed to “her likes he”.

35
Previous model

36
A new model

37
New formula he likes her
