September PROBABILISTIC CFGs & PROBABILISTIC PARSING Università di Venezia, 3 October 2003
Probabilistic CFGs. Context-free grammar rules are of the form: – S → NP VP. In a probabilistic CFG, we assign a probability to each of these rules: – S → NP VP, P(S → NP VP | S)
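One way to picture the definition above is as a table from rules to conditional probabilities, where the probabilities of all rules sharing a left-hand side sum to 1. The following is a minimal sketch, not code from the slides: the names `RULES`, `lhs_totals` and `is_proper` are hypothetical, and the `VP → VP PP` rule with probability 0.3 is an assumption added so that the VP rules are normalised.

```python
from collections import defaultdict
from math import isclose

# Hypothetical toy grammar: (lhs, rhs) -> P(lhs -> rhs | lhs).
# The 1.0 and 0.7 values match the Prolog rules in the slides;
# the VP -> VP PP rule is an assumption added so the VP rules sum to 1.
RULES = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("VP", "PP")): 0.3,
}

def lhs_totals(rules):
    """Sum the rule probabilities for each left-hand side."""
    totals = defaultdict(float)
    for (lhs, _rhs), prob in rules.items():
        totals[lhs] += prob
    return dict(totals)

def is_proper(rules):
    """True if the probabilities sum to 1 for every left-hand side."""
    return all(isclose(total, 1.0) for total in lhs_totals(rules).values())

print(lhs_totals(RULES))
print(is_proper(RULES))
```

The check uses `math.isclose` rather than `==` because sums such as 0.7 + 0.3 are not guaranteed to be exactly 1.0 in floating point.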
Why PCFGs? DISAMBIGUATION: with a PCFG, probabilities can be used to choose the most likely parse. ROBUSTNESS: rather than excluding ungrammatical input outright, a PCFG can assign it a very low probability. LEARNING: CFGs cannot be learned from positive data only, whereas PCFGs can.
An example of a PCFG
PCFGs in Prolog (courtesy Doug Arnold)
% First argument: probability of the subtree; second argument: the parse tree.
s(P0, [s,NP,VP]) --> np(P1,NP), vp(P2,VP), { P0 is 1.0*P1*P2 }.
….
vp(P0, [vp,V,NP]) --> v(P1,V), np(P2,NP), { P0 is 0.7*P1*P2 }.
Notation and assumptions
Independence assumptions. PCFGs specify a language model, just like n-grams. However, we again need to make some independence assumptions: the probability of a subtree is independent of:
The language model defined by PCFGs
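The slide body is not preserved here, so the following is only a sketch of the standard formulation: a tree's probability is the product of the probabilities of the rules used in it, and a sentence's probability is the sum over all of its parse trees. The symbols t, w_{1m} and yield(t) are notational assumptions, not taken from the slides.

```latex
% Probability of a parse tree t: product over the rule applications in t
\[
P(t) = \prod_{(A \to \beta) \in t} P(A \to \beta \mid A)
\]

% Probability of a sentence w_{1m}: sum over all parse trees that yield it
\[
P(w_{1m}) = \sum_{t \,:\, \mathrm{yield}(t) = w_{1m}} P(t)
\]
```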
Using PCFGs to disambiguate: “Astronomers saw stars with ears”
A second parse
Choosing among the parses, and the sentence’s probability
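The arithmetic behind this choice can be sketched as follows. The rule probabilities are an assumption: they are the ones used for this sentence in Manning and Schütze, chapter 11, since the grammar on the example slide is not recoverable from the text. The labels `T1` (PP attached to the noun phrase) and `T2` (PP attached to the verb phrase) are likewise hypothetical names for the two parses.

```python
from math import prod

# Assumed rule probabilities (Manning and Schütze, ch. 11 example grammar);
# the slide's own grammar is not preserved in the text.
P = {
    "S -> NP VP": 1.0, "PP -> P NP": 1.0,
    "VP -> V NP": 0.7, "VP -> VP PP": 0.3,
    "NP -> NP PP": 0.4, "NP -> astronomers": 0.1,
    "NP -> stars": 0.18, "NP -> ears": 0.18,
    "V -> saw": 1.0, "P -> with": 1.0,
}

# Rules used in each parse of "astronomers saw stars with ears".
T1 = ["S -> NP VP", "NP -> astronomers", "VP -> V NP", "V -> saw",
      "NP -> NP PP", "NP -> stars", "PP -> P NP", "P -> with", "NP -> ears"]
T2 = ["S -> NP VP", "NP -> astronomers", "VP -> VP PP", "VP -> V NP",
      "V -> saw", "NP -> stars", "PP -> P NP", "P -> with", "NP -> ears"]

def tree_prob(rules_used):
    """Probability of a tree: product of the probabilities of its rules."""
    return prod(P[rule] for rule in rules_used)

p1, p2 = tree_prob(T1), tree_prob(T2)
sentence_prob = p1 + p2            # sum over all parses of the sentence
best = "T1" if p1 > p2 else "T2"   # the more probable parse wins

print(f"P(T1) = {p1:.7f}, P(T2) = {p2:.7f}")
print(f"P(sentence) = {sentence_prob:.7f}, preferred parse: {best}")
```

Under these assumed probabilities the two parses differ only in `NP -> NP PP` (0.4) versus `VP -> VP PP` (0.3), so the noun-phrase attachment comes out ahead.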
Parsing with PCFGs: a comparison with HMMs. An HMM defines a REGULAR GRAMMAR:
Parsing with CFGs: a comparison with HMMs
Inside and outside probabilities (cf. forward and backward probabilities for HMMs)
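The definitions themselves are not preserved in the slide text. As a sketch, in the notation of Manning and Schütze, chapter 11 (an assumption, since the notation slide is also lost): N^j_{pq} means that nonterminal N^j spans words p through q, N^1 is the start symbol, and the grammar G is in Chomsky normal form.

```latex
% Inside probability: N^j generates the words w_p ... w_q
\[
\beta_j(p,q) = P(w_{pq} \mid N^j_{pq}, G)
\]

% Outside probability: everything outside the span, together with N^j over it
\[
\alpha_j(p,q) = P(w_{1(p-1)},\, N^j_{pq},\, w_{(q+1)m} \mid G)
\]

% Inside recursion, base case and inductive step
\[
\beta_j(k,k) = P(N^j \to w_k), \qquad
\beta_j(p,q) = \sum_{r,s} \sum_{d=p}^{q-1}
  P(N^j \to N^r N^s)\, \beta_r(p,d)\, \beta_s(d+1,q)
\]

% Sentence probability from the inside probability of the start symbol
\[
P(w_{1m} \mid G) = \beta_1(1,m)
\]
```

The inside probabilities play the role of the HMM backward probabilities and the outside probabilities that of the forward probabilities.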
Parsing with probabilistic CFGs
The algorithm
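The algorithm on this slide is not preserved in the text. A common choice for finding the most probable parse is a Viterbi-style probabilistic CKY, sketched below under stated assumptions: the grammar is in Chomsky normal form, its probabilities are those of the Manning and Schütze chapter 11 example, and the names `BINARY`, `LEXICAL`, `viterbi_cky` and `build_tree` are hypothetical. It may differ in detail from what was presented.

```python
from collections import defaultdict

# Assumed CNF grammar (Manning and Schütze, ch. 11 example).
BINARY = {  # (A, B, C) -> P(A -> B C)
    ("S", "NP", "VP"): 1.0, ("PP", "P", "NP"): 1.0,
    ("VP", "V", "NP"): 0.7, ("VP", "VP", "PP"): 0.3,
    ("NP", "NP", "PP"): 0.4,
}
LEXICAL = {  # (A, word) -> P(A -> word)
    ("NP", "astronomers"): 0.1, ("NP", "ears"): 0.18, ("NP", "saw"): 0.04,
    ("NP", "stars"): 0.18, ("NP", "telescopes"): 0.1,
    ("V", "saw"): 1.0, ("P", "with"): 1.0,
}

def viterbi_cky(words):
    """Fill a chart with the best probability per (span, nonterminal)."""
    n = len(words)
    best = defaultdict(dict)  # best[(i, j)][A]: best prob of A over words[i:j]
    back = {}                 # backpointers for recovering the best tree
    # Initialization: spans of length 1 come from lexical rules.
    for i, w in enumerate(words):
        for (a, word), p in LEXICAL.items():
            if word == w:
                best[(i, i + 1)][a] = p
                back[(i, i + 1, a)] = w
    # Longer spans: try every split point k and every binary rule.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, b, c), p in BINARY.items():
                    if b in best[(i, k)] and c in best[(k, j)]:
                        cand = p * best[(i, k)][b] * best[(k, j)][c]
                        if cand > best[(i, j)].get(a, 0.0):
                            best[(i, j)][a] = cand
                            back[(i, j, a)] = (k, b, c)
    return best, back

def build_tree(back, i, j, a):
    """Follow backpointers to rebuild the best tree as nested tuples."""
    entry = back[(i, j, a)]
    if isinstance(entry, str):  # lexical rule: entry is the word itself
        return (a, entry)
    k, b, c = entry
    return (a, build_tree(back, i, k, b), build_tree(back, k, j, c))

words = "astronomers saw stars with ears".split()
best, back = viterbi_cky(words)
prob = best[(0, len(words))].get("S", 0.0)
tree = build_tree(back, 0, len(words), "S")
print(prob)
print(tree)
```

Replacing the maximisation with a sum over split points and rules would give inside probabilities instead of the single best parse.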
Example
Initialization
Example
Example
Learning the probabilities: the Treebank
Learning probabilities. Reconstruct the rules used in the analysis of the Treebank. Estimate probabilities by relative frequency: P(A → B) = C(A → B) / C(A)
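The two steps above (read the rules off the trees, then divide counts) can be sketched on a toy example. The two-tree `TREEBANK` below is hypothetical, as are the helper names `rules_in` and `estimate`; a real treebank would be read from annotated files rather than written inline.

```python
from collections import Counter

# Hypothetical two-tree "treebank". Trees are nested tuples
# (label, child, ...); plain strings are words.
TREEBANK = [
    ("S", ("NP", "astronomers"),
          ("VP", ("V", "saw"), ("NP", "stars"))),
    ("S", ("NP", "astronomers"),
          ("VP", ("VP", ("V", "saw"), ("NP", "stars")),
                 ("PP", ("P", "with"), ("NP", "telescopes")))),
]

def rules_in(tree):
    """Yield every rule (lhs, rhs) used in a tree, top-down."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (label, rhs)
    for child in children:
        if not isinstance(child, str):
            yield from rules_in(child)

def estimate(treebank):
    """Relative-frequency estimate: P(A -> B) = C(A -> B) / C(A)."""
    rule_counts = Counter(r for tree in treebank for r in rules_in(tree))
    lhs_counts = Counter()
    for (lhs, _rhs), count in rule_counts.items():
        lhs_counts[lhs] += count
    return {rule: count / lhs_counts[rule[0]]
            for rule, count in rule_counts.items()}

probs = estimate(TREEBANK)
for (lhs, rhs), p in sorted(probs.items()):
    print(f"{lhs} -> {' '.join(rhs)}: {p:.3f}")
```

In this toy data VP is expanded three times, twice as `V NP` and once as `VP PP`, so the estimates are 2/3 and 1/3.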
Probabilistic lexicalised CFGs (Collins, 1997; Charniak, 2000)
Parsing evaluation
Performance of current parsers
Readings: Manning and Schütze, chapters 11 and 12
Acknowledgments: Some slides and the Prolog code are borrowed from Doug Arnold. Thanks also to Chris Manning & Diego Molla.