Introduction to Natural Language Generation


1 Introduction to Natural Language Generation
Yael Netzer Department of Computer Science Ben Gurion University

2 Outline
Introduction: what is NLG
Traditional architecture of an NLG system
Statistical methods in NLG
FUF/SURGE
An example in Hebrew: the noun phrase
A statistical method for generation
Yael Netzer BGU November 6, 2001

3 What is Natural Language Generation (NLG)
NLG is the process of constructing natural language outputs from non-linguistic inputs. [VanLinden]
NLG is mapping some communication goal to some surface utterance that satisfies the goal. [Reiter & Dale]

4 Aspects in NLG
Theoretical and practical interests:
- Theoretical: modeling various depths of human language representation and production.
- Practical: engineering human/computer interfaces (the computer as an author or authoring aid).

5 Example systems
NLG as an author:
- Weather reports (FoG)
- Stock market descriptions
- Museum artifact descriptions (ILEX)
- "Personal" letters to customers (AlethGen)
NLG as an author aid
Integrated (partial) NLG uses:
- NLG in augmentative and alternative communication
- Summarization (integrating 'cut and paste' techniques with generation)
- Machine translation (generation from an interlingua)

6 Inputs of NLG systems
Formally, the input to an NLG system can be defined as a four-tuple {k, c, u, d}:
k - knowledge source (tables of numbers, a knowledge representation language): domain dependent, no generalizations.
c - communicative goal: the consequence of a given execution of the system (considering appropriate information).

7 NLG input spec. (cont.)
u - user model: a characterization of the hearer or intended audience for whom the text is generated.
d - discourse history: previous interactions between the user and the NLG system, used to control anaphoric forms and prevent repetition.
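
The four-tuple can be pictured as a plain record. The following is only a minimal sketch: the class name `NLGInput`, its field names, the helper `already_mentioned`, and the sample weather values are all illustrative assumptions, not part of any NLG system named in these slides.

```python
from dataclasses import dataclass, field


@dataclass
class NLGInput:
    """Hypothetical container for the four-tuple {k, c, u, d}."""
    knowledge: dict                  # k: domain-dependent data (e.g. a table of numbers)
    goal: str                        # c: the communicative goal
    user_model: dict                 # u: characterization of the intended audience
    discourse_history: list = field(default_factory=list)  # d: what was said so far


def already_mentioned(nlg_input, entity):
    """Use the discourse history to detect repetition (or license an anaphor)."""
    return entity in nlg_input.discourse_history


# Made-up sample values, for illustration only.
spec = NLGInput(
    knowledge={"month": "June", "mean_temp_c": 17.2, "rain_mm": 4.0},
    goal="summarize-month",
    user_model={"expertise": "layperson"},
)
```

A later stage could call `already_mentioned(spec, "June")` before deciding whether to repeat a name or use a pronoun.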

8 The output for an NLG system
Any text conveying the communicative goal: it can be a single word like "yes" in a dialogue, or a text of many paragraphs in other cases.
The output should suit the medium: web pages with hyperlinks, a voice stream, etc.

9 Main (Pipeline) Architecture
Content determination: what information should be included in the text?
Document structuring: how to organize the text.
Lexicalisation: choosing particular words or phrases.
Aggregation: composing chunks of information into sentences.
Referring expression generation: what properties should be used in referring to an entity.
Surface realization: mapping the underlying content of the text to a grammatically correct sentence that expresses the desired meaning.
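
To make the order of the stages concrete, here is a toy pipeline in which each stage is one small function. It is a sketch under heavy assumptions: the weather data, the `LEXICON` and `ORDER` tables, and every function body are invented for illustration, and real systems implement each stage with far richer knowledge. It also assumes at least one value deviates from normal.

```python
# Hypothetical domain tables (assumptions for this sketch).
ORDER = ["temperature", "rainfall"]
LEXICON = {
    ("temperature", "below"): "cooler",
    ("temperature", "above"): "warmer",
    ("rainfall", "below"): "drier",
    ("rainfall", "above"): "wetter",
}


def content_determination(data):
    """Keep only attributes that deviate from normal, as (attribute, direction) messages."""
    messages = []
    for attribute, (value, normal) in data.items():
        if value < normal:
            messages.append((attribute, "below"))
        elif value > normal:
            messages.append((attribute, "above"))
    return messages


def document_structuring(messages):
    """Impose a fixed, domain-specific ordering on the messages."""
    return sorted(messages, key=lambda m: ORDER.index(m[0]))


def lexicalisation(messages):
    """Map each message to a word."""
    return [LEXICON[m] for m in messages]


def aggregation(adjectives):
    """Combine several predicates into a single coordinated one."""
    return " and ".join(adjectives)


def referring_expression(entity):
    """Refer to the entity with a definite description."""
    return "the " + entity


def surface_realization(subject, predicate):
    """Linearize into a sentence and capitalize its first letter."""
    sentence = f"{subject} was {predicate} than average."
    return sentence[0].upper() + sentence[1:]


def generate(entity, data):
    messages = document_structuring(content_determination(data))
    predicate = aggregation(lexicalisation(messages))
    return surface_realization(referring_expression(entity), predicate)


# Each attribute maps to (observed value, normal value); the numbers are made up.
report = generate("month", {"rainfall": (3.0, 20.0), "temperature": (15.0, 18.0)})
```

With this sample input, `report` is "The month was cooler and drier than average."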

10 Content Determination
The process of deciding what to say.
No general rules: domain specific.
What is important: what should always be included, what is exceptional information, etc.
In practice, constructs a set of messages from the underlying data (entities, concepts and relations).

11 Document Structuring
Imposing ordering and structure over the information:
- conceptual grouping
- rhetorical relationships

12 Lexical choice
Lexical chooser: determines the particular words used to express concepts and relations.
Trade-off: complexity of coding vs. richer language.
Choosing content words: information is mapped from a conceptual vocabulary.
The lexical chooser (LC) should supply a variety of words, consider the user model (precise vs. general description of a weather phenomenon), and account for pragmatic considerations (formal vs. casual style).
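
As a rough illustration of how the user model and style could drive word choice, the sketch below looks a concept up in a small table keyed by (expertise, style). The table contents, the key names, and the fallback rule are assumptions made for this example only.

```python
# Hypothetical lexicon: concept -> {(expertise, style): word}.
LEXICON = {
    "precipitation": {
        ("expert", "formal"): "precipitation",
        ("layperson", "formal"): "rainfall",
        ("layperson", "casual"): "rain",
    }
}
DEFAULT_KEY = ("layperson", "formal")


def choose_word(concept, expertise, style):
    """Pick a word for the concept, falling back to a default register."""
    options = LEXICON[concept]
    return options.get((expertise, style), options[DEFAULT_KEY])
```

For instance, `choose_word("precipitation", "layperson", "casual")` returns "rain", while an unlisted combination falls back to the default entry.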

13 Aggregation
Aggregation can be performed at various stages:
- In the planner: combines similar data.
- In lexicalization: aggregates several concepts into one lexical element.
- Aggregation of sentences: "The month was cooler than average." and "The month was drier than average." become "The month was cooler and drier than average."
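
Sentence-level aggregation of the kind shown in the month example can be sketched as follows. The clause representation, a (subject, verb, adjective, complement) tuple, is an assumption chosen to keep the example short; adjacent clauses are merged only when everything except the adjective matches.

```python
def aggregate(clauses):
    """Merge adjacent clauses that share subject, verb and complement."""
    merged = []
    for subject, verb, adjective, complement in clauses:
        last = merged[-1] if merged else None
        if last and (last[0], last[1], last[3]) == (subject, verb, complement):
            last[2].append(adjective)          # coordinate the adjectives
        else:
            merged.append([subject, verb, [adjective], complement])
    return [
        f"{subject} {verb} {' and '.join(adjectives)} {complement}."
        for subject, verb, adjectives, complement in merged
    ]


sentences = aggregate([
    ("The month", "was", "cooler", "than average"),
    ("The month", "was", "drier", "than average"),
])
```

Here `sentences` holds the single aggregated sentence from the slide; clauses with different subjects or complements are left as separate sentences.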

14 Referring expression generation
An entity can be referred to in many ways: initial reference, subsequent reference, distinguishing descriptions, definite descriptions, pronouns.
Proper names: באר שבע (Be'er Sheva), באר שבע בית הנגב (Be'er Sheva, house of the Negev)
Definite descriptions: the train that leaves at 10am, the next train
Pronouns: it
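
One simple way to choose between these forms is to consult the discourse history. The policy below (full description on first mention, pronoun when the entity was the last one mentioned, reduced description otherwise) is an illustrative assumption, not an algorithm from these slides; the entity records, including the reduced form "the train" and the station entity, are made up for the example.

```python
def refer(entity, history):
    """Pick a referring form for `entity` and record the mention in `history`."""
    if entity["id"] not in history:
        form = entity["full"]        # initial reference
    elif history[-1] == entity["id"]:
        form = entity["pronoun"]     # nothing intervened: a pronoun is unambiguous
    else:
        form = entity["short"]       # another entity intervened: reduced description
    history.append(entity["id"])
    return form


# Hypothetical entity records.
train = {"id": "t1", "full": "the train that leaves at 10am",
         "short": "the train", "pronoun": "it"}
station = {"id": "s1", "full": "the central station",
           "short": "the station", "pronoun": "it"}

history = []
forms = [refer(train, history), refer(train, history),
         refer(station, history), refer(train, history)]
```

After these four calls, `forms` is `["the train that leaves at 10am", "it", "the central station", "the train"]`.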

15 Syntactic realizer
Syntactic realizer: syntax and morphology.
The most general, domain-independent component (but definitely language dependent).
Various usage scenarios.
Input to syntactic realization is not observable.
Input for syntactic realizers in NLG:
- What knowledge is needed to prepare the input?
- Who supplies this knowledge?
- Can we find a common abstraction, shared across languages and applications?

16 Possible techniques for realizers
Bi-directional grammar specification.
Grammar specifications tuned for generation.
Templates.
Corpus statistics.

17 A note on bi-directional grammar
Realization is, in some respects, easier than parsing:
- no need to handle the full range of syntax that a human might use,
- no need to resolve ambiguities,
- no need to recover from ill-formed input.
A bi-directional grammar is, in theory, an elegant approach.
However, most NLG systems use a generation-oriented grammar.

18 Why not bi-directional?
The output of an NLU parser is very different from the input to an NLG realizer.
It is not obvious that lexicalization is part of realization.
In practice, large bi-directional grammars are not easy to engineer.
Moreover, generation is a process of making choices, including the choice to use 'canned text' when needed.

19 Syntactic Realizer
This work concerns syntactic realizers: the grammar.
Input for the grammar: a lexicalized representation of a phrase, at various levels of abstraction.
Output of the grammar: a grammatical string that represents the information in the input as accurately as possible.

20 The input question
[Figure: diagram with boxes labeled Application, Knowledge base, Content planner and lexicon, and Syntactic Realizer; the input to the Syntactic Realizer is marked "Input??".]

21 FUF/SURGE - Implementation
The grammar is written in FUF, the Functional Unification Formalism [Elhadad].
FD (functional description): a list of (att val) pairs, where val is an atom, an FD, or a path.
Grammar: a meta-FD, with disjunction expressed by ALT and control by NONE, GIVEN, ANY.
All components of the generation process can be implemented with this formalism.
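
The core operation behind this formalism, unifying an input FD with a grammar FD, can be approximated in a few lines. This is a deliberately reduced sketch and not FUF itself: FDs are modeled as nested Python dicts, paths and the NONE, GIVEN, ANY controls are left out, and the disjunction helper `unify_alt` simply returns the first branch that unifies, with no backtracking. All names and the two grammar branches are assumptions for illustration.

```python
FAIL = None  # marker for a failed unification


def unify(a, b):
    """Unify two FDs (nested dicts) or atoms; return the merged FD or FAIL."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for att, val in b.items():
            if att in result:
                sub = unify(result[att], val)
                if sub is FAIL:
                    return FAIL          # conflicting values under the same attribute
                result[att] = sub
            else:
                result[att] = val        # the grammar enriches the input
        return result
    return a if a == b else FAIL         # atoms unify only if they are equal


def unify_alt(fd, alternatives):
    """Toy disjunction: return the result of the first branch that unifies."""
    for branch in alternatives:
        result = unify(fd, branch)
        if result is not FAIL:
            return result
    return FAIL


# Hypothetical two-branch grammar keyed on the cat feature.
grammar = [
    {"cat": "clause", "pattern": ["subject", "verb", "object"]},
    {"cat": "np", "pattern": ["determiner", "head"]},
]
enriched = unify_alt({"cat": "np", "head": {"lex": "girl"}}, grammar)
```

The input's `cat` value selects the matching branch, and the result carries both the input features and the features contributed by the grammar.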

22 Requirements for a syntactic realizer
Mapping thematic structure onto syntactic roles.
Control of syntactic paraphrasing and alternations.
Provision of defaults for syntactic features.
Propagation of agreement features.
Selection of closed class words.
The imposition of linear precedence constraints.
The inflection of open class words.

23 SURGE [Elhadad&Robin 96]
Draws on Functional Grammar, HPSG, and descriptive studies of language.
Input for the grammar is a lexicalized representation of a phrase (a clause, NP, AP).
Minimal syntactic information in the input keeps purely syntactic knowledge out of the earlier stages of the process, gives the grammar paraphrasing power, and is useful for multilingual applications.

24 Input for SURGE in general
Each constituent has the feature cat, which determines which part of the grammar it will be unified with.
The representation of a clause is mostly semantic: a process (in SFL terms) and its participants.
Clause paraphrasing can be controlled with a single feature, such as focus.
The input of an NP uses mostly syntactic features, so NP paraphrases require different inputs.

25 An Example
John kissed the girl.
((cat clause)
 (tense past)
 (process ((type material) (agentless no) (lex "kiss")))
 (participants ((agent ((cat proper) (lex "John")))
                (affected ((cat common) (lex "girl"))))))
Adding (focus {partic affected}) to the same input yields the passive paraphrase:
The girl was kissed by John.
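
The effect of the focus feature can be imitated with a toy realizer. The sketch below is not SURGE: it mirrors the input as a Python dict, handles only past-tense clauses with an agent and an affected participant, takes verb forms from a small hypothetical table `FORMS`, and assumes common nouns are realized with a definite article, as in the slide's output.

```python
# Hypothetical morphology table covering only the verb used in the example.
FORMS = {"kiss": {"past": "kissed", "past_participle": "kissed"}}


def noun_phrase(fd):
    """Proper nouns stand alone; common nouns get a definite article (assumption)."""
    if fd["cat"] == "proper":
        return fd["lex"]
    return "the " + fd["lex"]


def realize(fd):
    """Realize a past-tense clause; focus on the affected participant selects the passive."""
    verb = fd["process"]["lex"]
    agent = noun_phrase(fd["participants"]["agent"])
    affected = noun_phrase(fd["participants"]["affected"])
    if fd.get("focus") == ("partic", "affected"):
        words = [affected, "was", FORMS[verb]["past_participle"], "by", agent]
    else:
        words = [agent, FORMS[verb]["past"], affected]
    sentence = " ".join(words) + "."
    return sentence[0].upper() + sentence[1:]


clause = {
    "cat": "clause",
    "tense": "past",
    "process": {"type": "material", "agentless": "no", "lex": "kiss"},
    "participants": {
        "agent": {"cat": "proper", "lex": "John"},
        "affected": {"cat": "common", "lex": "girl"},
    },
}
active = realize(clause)
passive = realize({**clause, "focus": ("partic", "affected")})
```

The same semantic input yields "John kissed the girl." without the focus feature and "The girl was kissed by John." with it, which is the paraphrasing power the slide attributes to keeping syntactic detail out of the input.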

