The Practical Value of Statistics for Sentence Generation: The Perspective of the Nitrogen System Irene Langkilde-Geary
How well do statistical n-grams make linguistic decisions?
Subject-Verb Agreement: I am 2797; I are 47; I is 14
Article-Noun Agreement: a trust 394; an trust 0; the trust 1355; a trusts 2; an trusts 0; the trusts 115
Singular vs Plural: their trust 28; their trusts 8
Word Choice: reliance 567; trust 6100; reliances 0; trusts 1083
More Examples
Relative pronoun: visitor who 9; visitors who 20; visitor which 0; visitors which 0; visitor that 9; visitors that 14
Singular vs Plural: visitor 575; visitors 1083
Verb Tense: admire 212; admired 211; admires 107
Preposition:
in Japan 5413; to Japan 1196
came to 2443; arrived in 544
came in 1498; arrived to 35
came into 244; arrived into 0
came to Japan 7; arrived to Japan 0
came into Japan 1; arrived into Japan 0
came in Japan 0; arrived in Japan 4
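As a rough illustration of how such counts can drive a linguistic decision, the sketch below picks, among competing phrasings, the one with the highest count. The counts are copied from the slides; the function name `pick_most_frequent` is hypothetical, and picking the raw maximum is a simplifying assumption rather than Nitrogen's actual ranking method.

```python
# N-gram counts copied from the slides above.
COUNTS = {
    "I am": 2797, "I are": 47, "I is": 14,
    "a trust": 394, "an trust": 0, "the trust": 1355,
    "arrived in Japan": 4, "arrived to Japan": 0, "arrived into Japan": 0,
}


def pick_most_frequent(candidates, counts=COUNTS):
    """Return the candidate phrase with the highest count (unseen phrases count as 0)."""
    return max(candidates, key=lambda phrase: counts.get(phrase, 0))


agreement = pick_most_frequent(["I am", "I are", "I is"])
preposition = pick_most_frequent(
    ["arrived in Japan", "arrived to Japan", "arrived into Japan"]
)
print(agreement)
print(preposition)
```

Note how thin the evidence is for the three-word phrases: the preposition choice rests on a count of 4 against 0.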
How can we get a computer to learn by “reading”?
Nitrogen takes a two-step approach
1. Enumerate all possible expressions
2. Rank them in order of probabilistic likelihood
Why two steps? They are independent.
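The two steps can be sketched as two independent functions. This is a minimal illustration, not Nitrogen's implementation: the slot alternatives in `SLOTS` and the scoring function `toy_score` are hypothetical stand-ins, with the scorer taking the place of a real statistical model.

```python
from itertools import product

# Hypothetical alternatives for each slot of one sentence (illustration only).
SLOTS = [
    ["Visitors", "Visitor"],
    ["who", "which"],
    ["arrived in", "arrived to"],
    ["Japan"],
    ["admire", "admires"],
    ["Mount Fuji."],
]


def enumerate_candidates(slots):
    """Step 1: enumerate every combination, grammatical or not."""
    return [" ".join(choice) for choice in product(*slots)]


def rank(candidates, score):
    """Step 2: order candidates from most to least likely under `score`."""
    return sorted(candidates, key=score, reverse=True)


# Toy stand-in for a statistical model: reward a few preferred phrases.
PREFERRED = ("Visitors who", "arrived in", "Japan admire ")


def toy_score(sentence):
    return sum(phrase in sentence for phrase in PREFERRED)


candidates = enumerate_candidates(SLOTS)
best = rank(candidates, toy_score)[0]
print(len(candidates), "candidates; best:", best)
```

Because `rank` only needs a scoring function, the enumerator and the ranker can be developed and replaced separately, which is the point of keeping the steps independent.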
Assigning probabilities
Ngram model
Formula for bigrams: $P(S) = P(w_1 \mid \mathrm{START}) \cdot P(w_2 \mid w_1) \cdots P(w_n \mid w_{n-1})$
Probabilistic syntax (current work)
– A variant of probabilistic parsing models
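A minimal sketch of bigram scoring follows, in which each word is conditioned on the word immediately before it. The three-sentence corpus is invented for illustration, and add-one smoothing is an assumption made here so that unseen bigrams do not get zero probability; the slides do not say which smoothing Nitrogen uses. Log probabilities are summed instead of multiplying raw probabilities to avoid underflow.

```python
import math
from collections import Counter

START = "<s>"

# Hypothetical toy corpus (not from the slides).
CORPUS = [
    "visitors who arrived in Japan admire Mount Fuji",
    "visitors admire Japan",
    "a visitor admires Mount Fuji",
]


def train_bigrams(corpus):
    """Count left-context words and adjacent word pairs."""
    contexts, bigrams = Counter(), Counter()
    for sentence in corpus:
        words = [START] + sentence.split()
        contexts.update(words[:-1])
        bigrams.update(zip(words, words[1:]))
    return contexts, bigrams


def sentence_logprob(sentence, contexts, bigrams, vocab_size):
    """log P(S) = sum of log P(w_i | w_{i-1}), with add-one smoothing (an assumption)."""
    words = [START] + sentence.split()
    total = 0.0
    for prev, word in zip(words, words[1:]):
        p = (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)
        total += math.log(p)
    return total


contexts, bigrams = train_bigrams(CORPUS)
vocab_size = len({w for s in CORPUS for w in s.split()})

good = sentence_logprob("visitors admire Mount Fuji", contexts, bigrams, vocab_size)
bad = sentence_logprob("visitors admires Mount Fuji", contexts, bigrams, vocab_size)
print(good > bad)
```

On this toy corpus the agreeing form scores higher because "visitors admire" was observed and "visitors admires" was not.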
Sample Results of Bigram model
Random path (out of a set of 11,664,000 semantically-related sentences):
Visitant which came into the place where it will be Japanese has admired that there was Mount Fuji.
Top three:
1. Visitors who came in Japan admire Mount Fuji.
2. Visitors who came in Japan admires Mount Fuji.
3. Visitors who arrived in Japan admire Mount Fuji.
Strengths
– Reflects the reality that 55% of dependencies (Stolcke et al. 1997) are binary and hold between adjacent words
– Embeds linear ordering constraints
Limitations of Bigram model
Example: Visitors come in Japan. Reason: A three-way dependency
Example: He planned increase in sales. Reason: Part-of-speech ambiguity
Example: A tourist who admire Mt. Fuji... Reason: Long-distance dependency
Example: A dog eat/eats bone. Reason: Previously unseen ngrams
Example: I cannot sell their trust. Reason: Nonsensical head-arg relationship
Example: The methods must be modified to the circumstances. Reason: Improper subcat structure
Representation of enumerated possibilities (easily on the order of $10^{15}$ to $10^{32}$ or more)
Options:
– List
– Lattice
– Forest
Issues:
– space/time constraints
– redundancy
– localization of dependencies
– non-uniform weights of dependencies
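To make the space issue concrete, the sketch below builds a small word lattice and counts how many sentences it encodes without listing them. The lattice (20 positions with 3 alternative words each) and the function `count_paths` are hypothetical illustrations, not Nitrogen's data structures.

```python
def count_paths(lattice, start, end):
    """Count distinct start-to-end paths in an acyclic word lattice."""
    memo = {}

    def visit(node):
        if node == end:
            return 1
        if node not in memo:
            # Sum the path counts of every outgoing edge's target node.
            memo[node] = sum(visit(nxt) for _, nxt in lattice.get(node, []))
        return memo[node]

    return visit(start)


# Hypothetical lattice: node i has 3 word-labelled edges to node i + 1.
lattice = {i: [(f"w{i}_{k}", i + 1) for k in range(3)] for i in range(20)}

edges = sum(len(out) for out in lattice.values())
paths = count_paths(lattice, 0, 20)
print(edges, "edges encode", paths, "paths")
```

Here 60 edges stand in for 3^20 (about 3.5 billion) sentences, which is why an explicit list becomes impractical long before a lattice does.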
Mapping Rules
1. Recast one input to another
– (implicitly providing varying levels of abstraction)
2. Assign linear order to constituents
3. Add missing info to under-specified inputs
Matching Algorithm
Rule order determines priority. Generally:
– Recasting < linear ordering < under-specification
– High (more semantic) level of abstraction < low (more syntactic)
– Distant position (adjuncts) from head < near (complements)
– Basic properties < specialized
Properties used by Nitrogen
:cat [nn, vv, jj, rb, etc.]
:polarity [+, -]
:number [sing, plural]
:tense [past, present]
:person [1s 2s 3s 1p 2p 3p s p all]
:mood [indicative, pres-part, past-part, infinitive, to-inf, imper]
How many grammar rules needed for English?
Sentence -> Constituent+
Constituent -> Constituent+ OR Leaf
Leaf -> Punc* FunctionWord* ContentWord FunctionWord* Punc*
FunctionWord -> ``and'' OR ``or'' OR ``to'' OR ``on'' OR ``is'' OR ``been'' OR ``the'' OR ….
ContentWord -> Inflection(RootWord,Morph)
RootWord -> ``dog'' OR ``eat'' OR ``red'' OR ....
Morph -> none OR plural OR third-person-singular ...
Computational Complexity
$\frac{x^2}{A^2} + \frac{y^2}{B^2} = 1$
(Figure: plot with axes X and Y, annotated "???")
Advantages of a statistical approach for symbolic generation module
– Shifts focus from “grammatical” to “possible”
– Significantly simplifies knowledge bases
– Broadens coverage
– Potentially improves quality of output
– Dramatically reduces information demands on client
– Greatly increases robustness