# More on Text Management: Context Free Grammars

## Presentation transcript

More on Text Management

Context Free Grammars

- Context Free Grammars are a more natural model for natural language
- Syntax rules are very easy to formulate using CFGs
- Provably more expressive than Finite State Machines
  - E.g. can check for balanced parentheses
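The balanced-parentheses claim can be made concrete with a minimal sketch. It assumes the grammar S → ( S ) S | ε, which is a common textbook choice and not one given on the slide; the function names `parse_S` and `is_balanced` are likewise hypothetical. Each branch of the recursive function mirrors one production.

```python
def parse_S(text, i=0):
    """Try to consume S -> '(' S ')' S | epsilon starting at index i.

    Returns the index just after the consumed S, or None on failure.
    """
    if i < len(text) and text[i] == "(":
        j = parse_S(text, i + 1)          # inner S
        if j is None or j >= len(text) or text[j] != ")":
            return None                   # missing closing parenthesis
        return parse_S(text, j + 1)       # trailing S
    return i                              # epsilon production consumes nothing


def is_balanced(text):
    """Accept exactly the strings derivable from S (assumed grammar)."""
    return parse_S(text) == len(text)


if __name__ == "__main__":
    for sample in ["", "()", "(()())", "())", "(("]:
        print(repr(sample), is_balanced(sample))
```

A finite state machine cannot do this for unbounded nesting depth, because it would need a distinct state for every possible number of open parentheses.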

Context Free Grammars

- Non-terminals
- Terminals
- Production rules
  - V → w, where V is a non-terminal and w is a sequence of terminals and non-terminals

Context Free Grammars

- Can be used as acceptors
- Can be used as a generative model
- Similar to the case of Finite State Machines
- How long can a string generated by a CFG be?
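To illustrate the generative use and the question about string length, here is a small sketch that enumerates every string a grammar can produce within a bounded number of rule applications. The grammar (S → ( S ) S | ε), the `RULES` table, and the `generate` function are assumptions made for illustration. Because the grammar is recursive, raising the bound keeps producing longer strings, so there is no upper limit on length.

```python
from collections import deque

# Assumed toy grammar: S -> ( S ) S | epsilon
RULES = {"S": [["(", "S", ")", "S"], []]}


def generate(start="S", max_steps=5):
    """Return all terminal strings derivable with at most max_steps rule applications."""
    results = set()
    queue = deque([((start,), 0)])        # (sentential form, steps used so far)
    while queue:
        form, steps = queue.popleft()
        # Position of the leftmost non-terminal, or None if the form is all terminals.
        idx = next((i for i, sym in enumerate(form) if sym in RULES), None)
        if idx is None:
            results.add("".join(form))
            continue
        if steps == max_steps:
            continue                      # budget exhausted, abandon this form
        for rhs in RULES[form[idx]]:
            queue.append((form[:idx] + tuple(rhs) + form[idx + 1:], steps + 1))
    return results


if __name__ == "__main__":
    print(sorted(generate(max_steps=5), key=len))
    print(max(len(s) for s in generate(max_steps=9)))
```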

Stochastic Context Free Grammar

- Non-terminals
- Terminals
- Production rules, each associated with a probability
  - V → w, where V is a non-terminal and w is a sequence of terminals and non-terminals
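The slide does not spell out how the rule probabilities are constrained. Under the usual convention (stated here as an assumption), the rules that share a left-hand side form a probability distribution, and a derivation's probability is the product of the probabilities of the rules it uses:

$$
\sum_{w} P(V \to w) = 1 \quad \text{for every non-terminal } V, \qquad P(\text{derivation}) = \prod_{\text{rules } r \text{ used}} P(r)
$$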

Chomsky Normal Form

- Every rule has one of two forms:
  - V → V1 V2, where V, V1, V2 are non-terminals
  - V → t, where V is a non-terminal and t is a terminal
- Every (S)CFG can be written in this form (setting aside the empty string, which needs a special start rule)
- Makes designing many algorithms easier
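As a small worked example (the grammar is a hypothetical illustration, not taken from the slides), the non-empty balanced-parentheses grammar $S \to (\,S\,) \mid (\,) \mid S\,S$ can be put into Chomsky Normal Form by introducing helper non-terminals $L$, $R$ and $X$:

$$
\begin{aligned}
S &\to L\,X \mid L\,R \mid S\,S \\
X &\to S\,R \\
L &\to \texttt{(} \\
R &\to \texttt{)}
\end{aligned}
$$

Every right-hand side is now either two non-terminals or a single terminal, and the generated language is unchanged.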

Questions

- What is the probability of a string?
  - Defined as the sum of the probabilities of all possible derivations of the string
- Given a string, what is its most likely derivation?
  - Also called the Viterbi derivation or parse
  - Easy adaptation of the Viterbi algorithm for HMMs
- Given a training corpus and a CFG (no probabilities), learn the probabilities of the derivation rules
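The first two questions can be sketched with one CYK-style chart over a grammar in Chomsky Normal Form: summing over alternatives gives the string probability, and taking the maximum gives the score of the Viterbi derivation. The toy grammar (S → S S with probability 0.4, S → a with probability 0.6) and all function names are assumptions for illustration only.

```python
# Hypothetical toy SCFG in Chomsky Normal Form (not from the slides).
BINARY = {("S", "S", "S"): 0.4}  # P(S -> S S)
UNARY = {("S", "a"): 0.6}        # P(S -> a)


def cyk_chart(words, combine):
    """Fill a chart keyed by (p, q, non-terminal) over 0-based inclusive spans.

    combine=sum yields total (inside) probabilities;
    combine=max yields the probability of the best single derivation.
    """
    n = len(words)
    chart = {}
    for p, word in enumerate(words):
        for (nt, term), prob in UNARY.items():
            if term == word:
                chart[p, p, nt] = prob
    for length in range(2, n + 1):
        for p in range(n - length + 1):
            q = p + length - 1
            scores = {}
            for d in range(p, q):                     # split point
                for (nt, left, right), prob in BINARY.items():
                    score = (prob
                             * chart.get((p, d, left), 0.0)
                             * chart.get((d + 1, q, right), 0.0))
                    if score > 0.0:
                        scores.setdefault(nt, []).append(score)
            for nt, values in scores.items():
                chart[p, q, nt] = combine(values)
    return chart


def string_probability(words, start="S"):
    return cyk_chart(words, sum).get((0, len(words) - 1, start), 0.0)


def viterbi_probability(words, start="S"):
    return cyk_chart(words, max).get((0, len(words) - 1, start), 0.0)


if __name__ == "__main__":
    sentence = "a a a".split()
    print(string_probability(sentence), viterbi_probability(sentence))
```

For "a a a" this grammar has two derivations of equal probability, so the string probability is twice the Viterbi score. Recovering the Viterbi parse itself would additionally require storing back-pointers, which this sketch omits.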

Inside-outside probabilities

- Inside probability: the probability of generating $w_p \dots w_q$ from non-terminal $N^j$.
- Outside probability: the total probability of beginning with the start symbol $N^1$ and generating $N^j$ and everything outside $w_p \dots w_q$.
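In symbols, using common textbook notation that is assumed here rather than taken from the slide ($\beta$ for inside, $\alpha$ for outside, $G$ for the grammar, $m$ for the sentence length, and $N^j_{pq}$ for "$N^j$ spans positions $p$ to $q$"):

$$
\beta_j(p,q) = P(w_p \dots w_q \mid N^j_{pq}, G), \qquad
\alpha_j(p,q) = P(w_1 \dots w_{p-1},\; N^j_{pq},\; w_{q+1} \dots w_m \mid G)
$$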

CYK algorithm (figure: $N^j$ spans $w_p \dots w_q$ and expands into $N^r$ over $w_p \dots w_d$ and $N^s$ over $w_{d+1} \dots w_q$)
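The slide's formula was not preserved. For a grammar in Chomsky Normal Form, the standard inside recursion matching this picture is the following, where $\beta_j(p,q)$ denotes the inside probability of $N^j$ over $w_p \dots w_q$ (notation assumed):

$$
\beta_j(k,k) = P(N^j \to w_k), \qquad
\beta_j(p,q) = \sum_{r,s} \sum_{d=p}^{q-1} P(N^j \to N^r N^s)\, \beta_r(p,d)\, \beta_s(d+1,q)
$$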

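CYK algorithm (figure: below the root $N^1$ spanning $w_1 \dots w_m$, a parent $N^f$ expands into $N^j$ over $w_p \dots w_q$ and a right sibling $N^g$ over $w_{q+1} \dots w_e$)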

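CYK algorithm (figure: below the root $N^1$ spanning $w_1 \dots w_m$, a parent $N^f$ expands into a left sibling $N^g$ over $w_e \dots w_{p-1}$ and $N^j$ over $w_p \dots w_q$)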

Outside probability
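The formula on this slide was lost in extraction. The standard outside recursion, written in assumed textbook notation ($\alpha$ for outside, $\beta$ for inside, $m$ for the sentence length), sums over the two cases where $N^j$ is the left or the right child of a parent $N^f$:

$$
\begin{aligned}
\alpha_1(1,m) &= 1, \qquad \alpha_j(1,m) = 0 \ \text{ for } j \neq 1 \\
\alpha_j(p,q) &= \sum_{f,g} \sum_{e=q+1}^{m} \alpha_f(p,e)\, P(N^f \to N^j N^g)\, \beta_g(q+1,e) \\
&\quad + \sum_{f,g} \sum_{e=1}^{p-1} \alpha_f(e,q)\, P(N^f \to N^g N^j)\, \beta_g(e,p-1)
\end{aligned}
$$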

Probability of a sentence
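The slide's formula is missing. In the standard formulation, the sentence probability is the inside probability of the start symbol over the whole sentence, and it can equally be recovered at any word position k from the outside probabilities as the sum over j of α_j(k,k) · P(N^j → w_k). The sketch below checks that identity numerically. The toy grammar and all names are hypothetical, and indices are 0-based, unlike the 1-based slides.

```python
# Hypothetical toy SCFG in Chomsky Normal Form (not from the slides).
BINARY = {("S", "S", "S"): 0.4}               # P(S -> S S)
UNARY = {("S", "a"): 0.3, ("S", "b"): 0.3}    # P(S -> a), P(S -> b)


def inside(words):
    """beta[p, q, N]: probability that N generates words[p..q] (inclusive)."""
    n, beta = len(words), {}
    for p, word in enumerate(words):
        for (nt, term), prob in UNARY.items():
            if term == word:
                beta[p, p, nt] = beta.get((p, p, nt), 0.0) + prob
    for length in range(2, n + 1):
        for p in range(n - length + 1):
            q = p + length - 1
            for d in range(p, q):
                for (nt, left, right), prob in BINARY.items():
                    inc = (prob * beta.get((p, d, left), 0.0)
                           * beta.get((d + 1, q, right), 0.0))
                    if inc:
                        beta[p, q, nt] = beta.get((p, q, nt), 0.0) + inc
    return beta


def outside(words, beta, start="S"):
    """alpha[p, q, N]: probability of everything outside words[p..q] plus N on that span."""
    n = len(words)
    alpha = {(0, n - 1, start): 1.0}
    # Visit parent spans from longest to shortest and push mass down to the children.
    for length in range(n, 1, -1):
        for p in range(n - length + 1):
            q = p + length - 1
            for (parent, left, right), prob in BINARY.items():
                a = alpha.get((p, q, parent), 0.0)
                if not a:
                    continue
                for d in range(p, q):
                    b_left = beta.get((p, d, left), 0.0)
                    b_right = beta.get((d + 1, q, right), 0.0)
                    # A child's outside mass = parent's outside * rule * sibling's inside.
                    alpha[p, d, left] = alpha.get((p, d, left), 0.0) + a * prob * b_right
                    alpha[d + 1, q, right] = alpha.get((d + 1, q, right), 0.0) + a * prob * b_left
    return alpha


def probability_at_position(words, alpha, k):
    """Sum over non-terminals N of alpha[k, k, N] * P(N -> words[k])."""
    return sum(alpha.get((k, k, nt), 0.0) * prob
               for (nt, term), prob in UNARY.items() if term == words[k])


if __name__ == "__main__":
    sentence = "a b a".split()
    beta = inside(sentence)
    alpha = outside(sentence, beta)
    print(beta[0, len(sentence) - 1, "S"])
    print([probability_at_position(sentence, alpha, k) for k in range(len(sentence))])
```

Both routes should agree up to floating-point rounding, which makes the identity a handy sanity check when implementing the outside pass.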

The probability that a binary rule is used (1)
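The formula itself did not survive extraction. In the standard inside-outside formulation (notation assumed: $\alpha$ outside, $\beta$ inside, $m$ the sentence length, $P(w_{1m})$ the sentence probability), the expected number of times the rule $N^j \to N^r N^s$ is used in a derivation of the sentence is:

$$
E[N^j \to N^r N^s \text{ used}] = \frac{1}{P(w_{1m})} \sum_{p=1}^{m-1} \sum_{q=p+1}^{m} \sum_{d=p}^{q-1} \alpha_j(p,q)\, P(N^j \to N^r N^s)\, \beta_r(p,d)\, \beta_s(d+1,q)
$$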

The probability that $N^j$ is used (2)
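The slide's equation is missing. A standard form, under the assumed notation $\alpha$ (outside), $\beta$ (inside) and sentence probability $P(w_{1m})$, gives the expected number of times $N^j$ appears in a derivation of the sentence:

$$
E[N^j \text{ used}] = \frac{1}{P(w_{1m})} \sum_{p=1}^{m} \sum_{q=p}^{m} \alpha_j(p,q)\, \beta_j(p,q)
$$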

The probability that a unary rule is used (3)
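Again the equation was not preserved. In the standard formulation (assumed notation: $\alpha$ outside, $P(w_{1m})$ the sentence probability, $w^k$ a vocabulary item, and $[\cdot]$ equal to 1 when the condition holds and 0 otherwise), the expected number of uses of the rule $N^j \to w^k$ is:

$$
E[N^j \to w^k \text{ used}] = \frac{1}{P(w_{1m})} \sum_{h=1}^{m} \alpha_j(h,h)\, P(N^j \to w_h)\, [w_h = w^k]
$$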

Multiple training sentences (1) (2)
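The slide's formulas are not recoverable from the transcript. In the standard treatment, with several training sentences assumed independent, the expected counts are summed over sentences before normalizing. Writing $f_i(j,r,s)$, $g_i(j,k)$ and $h_i(j)$ for the expected number of uses in sentence $i$ of the binary rule $N^j \to N^r N^s$, of the unary rule $N^j \to w^k$, and of the non-terminal $N^j$ (these symbol names are assumptions), one re-estimation step is:

$$
\hat{P}(N^j \to N^r N^s) = \frac{\sum_i f_i(j,r,s)}{\sum_i h_i(j)}, \qquad
\hat{P}(N^j \to w^k) = \frac{\sum_i g_i(j,k)}{\sum_i h_i(j)}
$$

Repeating this step until the corpus likelihood stops improving is an instance of expectation-maximization, so it converges to a local rather than a guaranteed global optimum.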
