More on Text Management. Context Free Grammars Context Free Grammars are a more natural model for Natural Language Syntax rules are very easy to formulate.


1 More on Text Management

2 Context Free Grammars
Context Free Grammars (CFGs) are a more natural model for natural language
Syntax rules are very easy to formulate using CFGs
Provably more expressive than Finite State Machines
– E.g. can check for balanced parentheses
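The balanced-parentheses claim can be made concrete with a small recognizer. This is only a sketch: the grammar S → ( S ) S | ε is an assumption (the slide does not give one), and the names `_parse_s` and `is_balanced` are illustrative.

```python
from typing import Optional


def _parse_s(s: str, i: int) -> Optional[int]:
    """Derive a prefix of s[i:] from S and return the index just past it.

    Assumed grammar (not from the slides): S -> '(' S ')' S | empty.
    Returns None when an opening parenthesis has no matching close.
    """
    if i < len(s) and s[i] == "(":
        # Production S -> '(' S ')' S
        j = _parse_s(s, i + 1)
        if j is None or j >= len(s) or s[j] != ")":
            return None
        return _parse_s(s, j + 1)
    # Production S -> empty: consume nothing.
    return i


def is_balanced(s: str) -> bool:
    """Accept s only if the whole string is derived from S."""
    return _parse_s(s, 0) == len(s)
```

A finite state machine cannot do this for arbitrary nesting depth, because it has only a fixed number of states with which to count unmatched opening parentheses.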

3 Context Free Grammars
Non-terminals
Terminals
Production rules
– V → w, where V is a non-terminal and w is a sequence of terminals and non-terminals

4 Context Free Grammars
Can be used as acceptors
Can be used as a generative model
Similarly to the case of Finite State Machines
How long can a string generated by a CFG be?
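The generative use of a CFG can be sketched as a random top-down expansion. The toy grammar below is hypothetical (it does not come from the slides), as are the names `GRAMMAR` and `generate`. Because NP can reach itself through PP, there is no upper bound on the length of a generated string, which is one way to answer the question above.

```python
import random

# Hypothetical toy grammar: non-terminal -> list of right-hand sides.
# Any symbol that is not a key of GRAMMAR is treated as a terminal.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"], ["the", "N", "PP"]],
    "VP": [["V", "NP"]],
    "PP": [["near", "NP"]],
    "N": [["dog"], ["cat"]],
    "V": [["sees"]],
}


def generate(symbol: str, rng: random.Random) -> list:
    """Expand symbol by repeatedly picking a production uniformly at random."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: emit it as a word
    words = []
    for sym in rng.choice(GRAMMAR[symbol]):
        words.extend(generate(sym, rng))
    return words


if __name__ == "__main__":
    print(" ".join(generate("S", random.Random(0))))
```

Picking productions uniformly is a simplification made here for brevity; it is not part of the definition of a CFG.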

5 Stochastic Context Free Grammar
Non-terminals
Terminals
Production rules, each associated with a probability
– V → w, where V is a non-terminal and w is a sequence of terminals and non-terminals
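One possible representation of a stochastic CFG is sketched below. The grammar, its probabilities, and the helper names are all hypothetical. The sketch assumes the usual convention, not stated on the slide, that the probabilities of all rules sharing a left-hand side sum to 1, and that the probability of a derivation is the product of the probabilities of the rules it uses.

```python
import math

# Hypothetical toy SCFG: non-terminal -> list of (right-hand side, probability).
SCFG = {
    "S": [(("NP", "VP"), 1.0)],
    "NP": [(("the", "dog"), 0.6), (("the", "cat"), 0.4)],
    "VP": [(("sleeps",), 0.7), (("sees", "NP"), 0.3)],
}


def is_normalised(grammar) -> bool:
    """Check that the rule probabilities of every non-terminal sum to 1."""
    return all(
        math.isclose(sum(p for _, p in rules), 1.0) for rules in grammar.values()
    )


def derivation_probability(grammar, rules_used) -> float:
    """Multiply the probabilities of the rules applied in one derivation."""
    prob = 1.0
    for lhs, rhs in rules_used:
        prob *= dict(grammar[lhs])[rhs]
    return prob


# Derivation of "the dog sees the cat".
DERIVATION = [
    ("S", ("NP", "VP")),
    ("NP", ("the", "dog")),
    ("VP", ("sees", "NP")),
    ("NP", ("the", "cat")),
]
```

With these numbers the derivation has probability 1.0 × 0.6 × 0.3 × 0.4 = 0.072.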

6 Chomsky Normal Form
Every rule is of the form
– V → V1 V2, where V, V1, V2 are non-terminals
– V → t, where V is a non-terminal and t is a terminal
Every (S)CFG can be written in this form
Makes designing many algorithms easier
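One algorithm that relies on Chomsky Normal Form is CYK recognition: because every rule has either two non-terminals or one terminal on the right, a span of words can only be built from a single word or from two adjacent shorter spans. The sketch below assumes a hypothetical toy grammar; `BINARY`, `LEXICAL`, and `cyk_recognise` are illustrative names.

```python
# Hypothetical CNF grammar (not from the slides).
BINARY = [("S", "NP", "VP"), ("NP", "Det", "N"), ("VP", "V", "NP")]  # A -> B C
LEXICAL = [("Det", "the"), ("N", "dog"), ("N", "cat"), ("V", "sees")]  # A -> t


def cyk_recognise(words, binary=BINARY, lexical=LEXICAL, start="S") -> bool:
    """Return True if the start symbol derives the whole word sequence."""
    n = len(words)
    chart = {}  # (i, j) -> set of non-terminals that derive words[i:j]
    for i, w in enumerate(words):
        chart[i, i + 1] = {a for a, t in lexical if t == w}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cell = set()
            for k in range(i + 1, j):  # split point between the two children
                for a, b, c in binary:
                    if b in chart[i, k] and c in chart[k, j]:
                        cell.add(a)
            chart[i, j] = cell
    return n > 0 and start in chart[0, n]
```

The three nested loops over span, start position, and split point give running time cubic in the sentence length, times the number of binary rules.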

7 Questions
What is the probability of a string?
– Defined as the sum of the probabilities of all possible derivations of the string
Given a string, what is its most likely derivation?
– Also called the Viterbi derivation or parse
– Easy adaptation of the Viterbi Algorithm for HMMs
Given a training corpus and a CFG (no probabilities), learn the probabilities of the derivation rules
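The first two questions differ only in how competing derivations are combined: summed for the string probability, maximised for the Viterbi derivation. The sketch below illustrates this on a hypothetical ambiguous grammar in Chomsky Normal Form, S → S S (0.3) | a (0.7); all names are illustrative. For brevity it returns only the probability of the best derivation and keeps no back-pointers, so it does not recover the parse tree itself.

```python
from collections import defaultdict

# Hypothetical ambiguous SCFG in CNF: S -> S S (0.3) | a (0.7).
BINARY = {("S", "S", "S"): 0.3}  # (A, B, C) -> P(A -> B C)
LEXICAL = {("S", "a"): 0.7}      # (A, t)    -> P(A -> t)


def chart_score(words, combine, start="S") -> float:
    """Fill a CYK-style chart, merging alternative analyses with `combine`."""
    n = len(words)
    chart = defaultdict(float)  # (i, j, A) -> score of A deriving words[i:j]
    for i, w in enumerate(words):
        for (a, t), p in LEXICAL.items():
            if t == w:
                chart[i, i + 1, a] = combine(chart[i, i + 1, a], p)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, b, c), p in BINARY.items():
                    cand = p * chart[i, k, b] * chart[k, j, c]
                    chart[i, j, a] = combine(chart[i, j, a], cand)
    return chart[0, n, start]


def string_probability(words) -> float:
    """Sum over all derivations of the string."""
    return chart_score(words, lambda x, y: x + y)


def viterbi_probability(words) -> float:
    """Probability of the single most likely derivation."""
    return chart_score(words, max)
```

For "a a a" there are two derivations, each with probability 0.3² × 0.7³ = 0.03087, so the string probability is 0.06174 while the Viterbi probability is 0.03087.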

8 Inside-outside probabilities
Inside probability: probability of generating w_p … w_q from non-terminal N^j
Outside probability: total probability of beginning with the start symbol N^1 and generating N^j together with everything outside w_p … w_q
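Written as formulas, the two definitions might look as follows. The symbols β (inside) and α (outside), the grammar G, the sentence length m, and the notation N^j_{pq} for "N^j spans words p through q" are assumptions borrowed from common textbook usage; they do not appear in the transcript.

```latex
\begin{align*}
\beta_j(p,q)  &= P\bigl(w_p \cdots w_q \mid N^j_{pq},\, G\bigr) \\
\alpha_j(p,q) &= P\bigl(w_1 \cdots w_{p-1},\; N^j_{pq},\; w_{q+1} \cdots w_m \mid G\bigr)
\end{align*}
```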

9 CYK algorithm
[Figure: tree diagram. N^j has children N^r and N^s; N^r covers w_p … w_d and N^s covers w_{d+1} … w_q.]
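The configuration in this figure corresponds to the standard recursion for the inside probability of a grammar in Chomsky Normal Form. The formula below is the usual textbook form, not recovered from the slide; β_j(p,q) is assumed to denote the probability of generating w_p … w_q from N^j.

```latex
\begin{align*}
\beta_j(k,k) &= P(N^j \to w_k) \\
\beta_j(p,q) &= \sum_{r,s} \sum_{d=p}^{q-1}
    P(N^j \to N^r N^s)\, \beta_r(p,d)\, \beta_s(d+1,q), \qquad p < q
\end{align*}
```

The inner sum ranges over every split point d, and the outer sum over every pair of child non-terminals.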

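10 CYK algorithm
[Figure: tree diagram. The start symbol N^1 covers w_1 … w_m; N^f has a left child N^j covering w_p … w_q and a right child N^g covering w_{q+1} … w_e.]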

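11 CYK algorithm
[Figure: tree diagram. The start symbol N^1 covers w_1 … w_m; N^f has a left child N^g covering w_e … w_{p-1} and a right child N^j covering w_p … w_q.]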

12 Outside probability
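The formula for this slide did not survive extraction. A standard textbook recursion is given below as an assumption about what it showed. Here α_j(p,q) denotes the outside probability of N^j spanning w_p … w_q, β the inside probability, and m the sentence length. The first sum treats N^j as a left child with sibling N^g to its right, the second as a right child with sibling N^g to its left.

```latex
\begin{align*}
\alpha_1(1,m) &= 1, \qquad \alpha_j(1,m) = 0 \ \text{for } j \neq 1 \\
\alpha_j(p,q) &= \sum_{f,g} \sum_{e=q+1}^{m}
      \alpha_f(p,e)\, P(N^f \to N^j N^g)\, \beta_g(q+1,e) \\
  &\quad + \sum_{f,g} \sum_{e=1}^{p-1}
      \alpha_f(e,q)\, P(N^f \to N^g N^j)\, \beta_g(e,p-1)
\end{align*}
```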

13 Probability of a sentence
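The formula for this slide is missing from the transcript. In the usual formulation, assumed here, the probability of the sentence w_1 … w_m is the inside probability of the start symbol N^1 over the whole sentence. It can equally be computed from outside probabilities at any single word position k.

```latex
\begin{align*}
P(w_1 \cdots w_m \mid G) &= \beta_1(1,m) \\
                         &= \sum_j \alpha_j(k,k)\, P(N^j \to w_k)
                            \qquad \text{for any } 1 \le k \le m
\end{align*}
```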

14 The probability that a binary rule is used (1)

15 The probability that N^j is used (2)

16

17 The probability that a unary rule is used (3)

18 Multiple training sentences (1) (2)
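The formulas for slides 14 to 18 are not in the transcript. The block below gives the standard inside-outside re-estimation quantities as an assumption about their content, numbered to match the slide titles. Here π = P(w_1 … w_m | G), α and β are the outside and inside probabilities, w^k is a vocabulary item, and [·] equals 1 when its condition holds and 0 otherwise.

```latex
\begin{align}
E[N^j \to N^r N^s \text{ used}] &= \frac{1}{\pi}
   \sum_{p=1}^{m-1} \sum_{q=p+1}^{m} \sum_{d=p}^{q-1}
   \alpha_j(p,q)\, P(N^j \to N^r N^s)\, \beta_r(p,d)\, \beta_s(d+1,q) \\
E[N^j \text{ used}] &= \frac{1}{\pi}
   \sum_{p=1}^{m} \sum_{q=p}^{m} \alpha_j(p,q)\, \beta_j(p,q) \\
E[N^j \to w^k \text{ used}] &= \frac{1}{\pi}
   \sum_{h=1}^{m} \alpha_j(h,h)\, P(N^j \to w^k)\, [\,w_h = w^k\,]
\end{align}
```

Under this formulation, the re-estimated probability of a binary rule is (1) divided by (2), and that of a unary rule is (3) divided by (2). With several training sentences, each expectation is computed per sentence and summed over the corpus before the ratio is taken.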

