CSE 415 -- (c) S. Tanimoto, 2008 Natural Language Understanding


1 Natural Language Understanding
Outline:
  Motivation
  Structural vs Statistical Approaches
  Syntax
  Semantics
  Semantic grammars
  Augmented Transition Nets
  NLU in Closed Worlds: Operational Semantics
  The STONEWORLD program
  Statistical NLP

2 Motivation
  Make it easier for people to give commands to computers.
  Allow computers to perform language translation.
  Allow computers to listen to lectures and read books, in order to alleviate the knowledge acquisition bottleneck.
  Improve information retrieval services, including search engines such as Google.
  Integrate robots into human society.
  Better understand human communication and linguistics.

3 Structural vs Statistical Approaches
Structural Approach: Analytical approach based on the linguistic structure of language, esp. syntax as studied by Chomsky. Encompasses handcrafted lexical analyzers, parsers, semantic interpreters, and knowledge bases. Example technique: Augmented Transition Nets based on semantic grammars.
Statistical Approach: Grows out of the availability of large language corpora via the Internet, and improvements in machine learning technology. Example technique: Latent Semantic Analysis.

4 Levels of Analysis for NLU (for both structural and statistical approaches)
(Read up from the acoustic level to the pragmatic level)
  Pragmatic level (goals, intents, dialog, rhetorical structure, speech acts)
  Semantic level (meaning, representation)
  Syntactic level (grammar, phrase structure)
  Lexical, Morphological level (words, inflections)
  Phonological level (acoustic features -- phonemes)
  Acoustic level (sensing, signal processing)

5 Syntax, Semantics, Pragmatics
By taking a more systematic approach to NLU at these levels (than was done in programs like ELIZA), we will be able to create more useful and reliable natural language interfaces.
Issues to resolve:
  What is the ultimate purpose of language, and how does that influence NLU?
  How can the phrase structure of natural language be captured in a grammar?
  How can meaning be interpreted and represented?
  How can the syntax and semantics of a system be designed to match the needs of an application?

6 Communicating with Language
Language is for communication.
Communication usually means sending and receiving information.
Sentences describe events, states of the world, objects and ideas, feelings and attitudes, and hypothetical situations.
Phrase-structure grammars provide a method of organizing the components of messages, allowing for a great variety of possible meanings.

7 Syntax
Describes the form, not meaning, of sentences in a language.
Syntax is traditionally described with formal systems called grammars.
A context-free grammar can be specified with 4 components: G = (Σ, V, S, P), where
  Σ is a finite set of terminal symbols called the alphabet.
  V is a finite set of nonterminal symbols (“syntactic categories,” e.g., noun, noun-phrase, clause, etc.).
  S is a distinguished member of V called the start symbol (or “the initial sentential form”).
  P is a finite set of productions (rewrite rules). Each production has the form A → b_0 b_1 ... b_{n-1}, where A is a nonterminal symbol and each b_i is either a terminal or nonterminal.

8 Example Grammar from a Formal Languages Context
G = ({0, 1}, {S, A, B}, S, P), where P = {
  S → A
  S → B
  A → 0A0
  A → 1
  B → 1B1
  B → 0
}
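
As a rough illustration, the four components of this grammar can be written down as data and then expanded mechanically. The sketch below is not from the slides: the names `Grammar`, `is_well_formed`, and `sentences_up_to` are hypothetical, and it assumes every symbol is a single character and that no production shrinks a sentential form (true of this grammar).

```python
from collections import deque
from typing import NamedTuple


class Grammar(NamedTuple):
    terminals: frozenset     # the alphabet (Sigma)
    nonterminals: frozenset  # V
    start: str               # S
    productions: tuple       # P, as (left-hand side, right-hand side) pairs


# The example grammar, assuming one character per symbol.
G = Grammar(
    terminals=frozenset("01"),
    nonterminals=frozenset("SAB"),
    start="S",
    productions=(
        ("S", "A"), ("S", "B"),
        ("A", "0A0"), ("A", "1"),
        ("B", "1B1"), ("B", "0"),
    ),
)


def is_well_formed(g):
    """Check the constraints stated in the definition of a grammar."""
    symbols = g.terminals | g.nonterminals
    return (
        g.start in g.nonterminals
        and not (g.terminals & g.nonterminals)
        and all(lhs in g.nonterminals and set(rhs) <= symbols
                for lhs, rhs in g.productions)
    )


def sentences_up_to(g, max_len):
    """Breadth-first expansion, always rewriting the leftmost nonterminal.

    Pruning by length is safe only because no right-hand side is empty.
    """
    found, seen, queue = set(), {g.start}, deque([g.start])
    while queue:
        form = queue.popleft()
        idx = next((i for i, ch in enumerate(form) if ch in g.nonterminals), None)
        if idx is None:          # no nonterminal left: this is a sentence
            found.add(form)
            continue
        for lhs, rhs in g.productions:
            if lhs == form[idx]:
                new = form[:idx] + rhs + form[idx + 1:]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
    return sorted(found, key=lambda s: (len(s), s))


print(sentences_up_to(G, 5))
```

Listing the short sentences this way suggests the shape of the language: a single 1 surrounded by equal runs of 0s, or a single 0 surrounded by equal runs of 1s.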

9 Example Grammar from a Computational Linguistics context
G = ({symbols, are, tools}, {S, N, V}, S, P), where P = {
  S → NVN
  N → symbols
  N → tools
  V → are
}
A derivation of a sentence from S:
  S ⇒ NVN ⇒ tools VN ⇒ tools are N ⇒ tools are symbols
Each item in the sequence is a sentential form.
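
The derivation can be replayed step by step. The following is a minimal sketch, not the course's code: `PRODUCTIONS`, `leftmost_step`, and `derive` are hypothetical names, and sentential forms are stored as lists of symbols because the terminals here are whole words.

```python
# Productions of the example grammar, keyed by nonterminal.
PRODUCTIONS = {
    "S": [["N", "V", "N"]],
    "N": [["symbols"], ["tools"]],
    "V": [["are"]],
}


def leftmost_step(form, rhs):
    """Rewrite the leftmost nonterminal in `form` using right-hand side `rhs`."""
    for i, sym in enumerate(form):
        if sym in PRODUCTIONS:
            if rhs not in PRODUCTIONS[sym]:
                raise ValueError(f"no production {sym} -> {' '.join(rhs)}")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("form has no nonterminal left to rewrite")


def derive(choices, start="S"):
    """Return every sentential form produced by applying `choices` in order."""
    forms = [[start]]
    for rhs in choices:
        forms.append(leftmost_step(forms[-1], rhs))
    return forms


steps = derive([["N", "V", "N"], ["tools"], ["are"], ["symbols"]])
print(" => ".join(" ".join(form) for form in steps))
```

Choosing a right-hand side that the grammar does not contain raises an error, which is one way to see that a derivation is only valid if every step uses a production in P.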

10 Exercise
For each of the strings below, determine whether or not it is in L(G), the language generated by G. If it’s in the language, give a derivation.
  01
  λ
  011001
  01S10
  101S101
G = ({0, 1}, {S}, S, P), where P = {S → 01S, S → 10S, S → 0S1, S → 1S0, S → 01, S → 10}
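
One way to check answers to this exercise mechanically is to run the productions backwards: every production places terminals at fixed positions, so a candidate string can be peeled apart until only 01 or 10 remains. This is a sketch under that observation; `in_language` is a hypothetical name, and the empty string stands in for λ.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def in_language(w):
    """True if the terminal string w can be derived from S in the exercise grammar."""
    # Sentences contain terminals only, so anything with another symbol is rejected.
    if any(ch not in "01" for ch in w):
        return False
    # S -> 01 and S -> 10
    if w in ("01", "10"):
        return True
    # Every other production yields at least four terminals.
    if len(w) < 4:
        return False
    return (
        # S -> 01S and S -> 10S: strip a two-symbol prefix.
        (w[:2] in ("01", "10") and in_language(w[2:]))
        # S -> 0S1 and S -> 1S0: strip differing first and last symbols.
        or (w[0] != w[-1] and in_language(w[1:-1]))
    )


for candidate in ["01", "", "011001", "01S10", "101S101"]:
    print(repr(candidate), in_language(candidate))
```

The function only decides membership; writing out the derivation for each accepted string is still left to the reader.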

11 Semantics
The job of semantic analysis is to construct a representation of the meaning of a piece of NL text.
Meaning representations can be
  descriptive: like definitions of words in a dictionary
  operational: e.g., executable program code
  anything in-between
Semantic primitives: Often the meaning of a word or small phrase consists of a reference to a node in a semantic network, such as WordNet.
Semantic compounds: More complex meanings may be represented as case frames, or (relatively) small semantic networks whose nodes in turn reference nodes in a large semantic network or dictionary.

12 Semantics (cont.)
One approach: Representation of meaning using case frames.
A frame is an attribute-value structure. In a case frame, the frame has a type that usually corresponds to a verb. The particular kinds of attributes in the frame depend on the type.
“Alexander took an exam.”
  Action: take (write, submit to)
  Agent: Alexander
  Object: examination
  Time: past
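
A case frame maps naturally onto an attribute-value structure in code. The sketch below is only one possible encoding: the class name `CaseFrame`, the lowercase slot names, and the helper `answer_who` are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CaseFrame:
    action: str                                # frame type, usually a verb
    slots: dict = field(default_factory=dict)  # case name -> filler

    def get(self, case, default=None):
        """Look up a case such as 'agent' or 'object'."""
        return self.slots.get(case, default)


# "Alexander took an exam."
frame = CaseFrame(
    action="take",
    slots={"agent": "Alexander", "object": "examination", "time": "past"},
)


def answer_who(frame):
    """Answer a 'who did it?' question by reading the agent slot."""
    return frame.get("agent", "unknown")


print(answer_who(frame))
```

Keeping the slots in a dictionary reflects the point that the set of attributes depends on the frame type: a different verb could carry different cases without changing the class.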

13 Semantic Analysis: Interpretation
The process of semantic analysis starts with either NL text or a parse (e.g., parse tree). It produces a representation of the meaning of the text.
This process is also called “semantic interpretation” or simply “interpretation”.
One successful approach to interpretation for some computer applications involves coordinating parsing and interpretation (similar to syntax-directed translation in some programming language compilers).
For this approach, we usually need a “semantic grammar”...

14 Semantic Grammar
A semantic grammar is a grammar whose syntactic categories correspond directly to groups of words whose meanings can be largely inferred from the parse.
  [...] → [...] the [...]
  [...] → do | perform | start | finish
  [...] → job | task | command | activity | operation
“start the activity”
“do the operation”
“finish the job”
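
Because the categories of a semantic grammar carry meaning, a successful parse is already most of an interpretation. Here is a minimal sketch of that idea for the three-word commands quoted on this slide. The category names `ACTIONS` and `OBJECTS` and the function `parse_command` are hypothetical labels chosen for illustration, not names taken from the slide.

```python
# Word groups from the slide; the category names are assumptions.
ACTIONS = {"do", "perform", "start", "finish"}
OBJECTS = {"job", "task", "command", "activity", "operation"}


def parse_command(text):
    """Parse '<action word> the <object word>' and return its meaning, or None."""
    words = text.lower().split()
    if len(words) != 3:
        return None
    action, article, obj = words
    if action in ACTIONS and article == "the" and obj in OBJECTS:
        # The category each word matched tells us its role directly.
        return {"action": action, "object": obj}
    return None


print(parse_command("start the activity"))
print(parse_command("finish the job"))
```

A sentence that is grammatical English but outside the two word groups, such as "eat the job", simply fails to parse.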

15 Controlled Language
A controlled language is a subset of a natural language specified in a computer-based representation or formal system for the purpose of facilitating analysis or understanding by computer.
The language generated by a semantic grammar is one type of controlled language.

16 Augmented Transition Nets
An ATN is a language processor that combines parsing and translation.
It is based on a collection of transition diagrams.
[Figure: transition diagram with arcs labeled “do, etc.”, “the”, and “job, etc.”]
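
To make the combination of parsing and translation concrete, here is a deliberately small sketch of a single transition diagram whose arcs fill registers as words are consumed. It is a simplification under stated assumptions: the state names `Q0` to `Q3`, the register names, and `run_atn` are hypothetical, and a full ATN would also allow arcs that call other diagrams recursively.

```python
ACTION_WORDS = {"do", "perform", "start", "finish"}
OBJECT_WORDS = {"job", "task", "command", "activity", "operation"}

# Each arc: (test on the current word, register to fill or None, next state).
ARCS = {
    "Q0": [(lambda w: w in ACTION_WORDS, "action", "Q1")],
    "Q1": [(lambda w: w == "the", None, "Q2")],
    "Q2": [(lambda w: w in OBJECT_WORDS, "object", "Q3")],
}
FINAL_STATES = {"Q3"}


def run_atn(words, state="Q0", registers=None):
    """Follow arcs over `words`; return the filled registers or None on failure."""
    registers = dict(registers or {})
    if not words:
        return registers if state in FINAL_STATES else None
    word, rest = words[0], words[1:]
    for test, register, next_state in ARCS.get(state, []):
        if test(word):
            updated = dict(registers)
            if register is not None:
                updated[register] = word   # the "augmentation": store meaning
            result = run_atn(rest, next_state, updated)
            if result is not None:         # otherwise backtrack to the next arc
                return result
    return None


print(run_atn("do the job".split()))
```

The registers returned at a final state play the role of the translation: they are what the rest of a program would act on.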

17 Stone World
A microworld: 2-D cellular space in which various objects can be placed.
An agent “Mace” that takes commands from the user, and which inhabits the microworld.
Stationary objects: pillars, wells, quarries.
Portable objects: stones, gems.
Actions: Mace can move and can carry objects.
A natural-language interface: Augmented transition network based on a semantic grammar.
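
A microworld of this kind could be modeled with a grid, an agent position, and a carried object. The sketch below is an illustration only and is not the Stone World program: the class and method names, the coordinate convention, the one-object-per-cell rule, and the rule that Mace acts on adjacent cells are all assumptions.

```python
# Assumed convention: x grows to the east, y grows to the south.
DIRECTIONS = {"north": (0, -1), "south": (0, 1), "east": (1, 0), "west": (-1, 0)}
PORTABLE = {"stone", "gem"}


class MicroWorld:
    def __init__(self, width, height, mace=(0, 0)):
        self.width, self.height = width, height
        self.mace = mace          # Mace's cell
        self.objects = {}         # (x, y) -> object name, one object per cell
        self.carrying = None      # what Mace holds, if anything

    def place(self, pos, name):
        self.objects[pos] = name

    def _neighbor(self, direction):
        dx, dy = DIRECTIONS[direction]
        return (self.mace[0] + dx, self.mace[1] + dy)

    def _inside(self, pos):
        return 0 <= pos[0] < self.width and 0 <= pos[1] < self.height

    def go(self, direction):
        """Move one cell if the target is inside the grid and empty."""
        target = self._neighbor(direction)
        if self._inside(target) and target not in self.objects:
            self.mace = target
            return True
        return False

    def take(self, direction):
        """Pick up a portable object from the adjacent cell."""
        target = self._neighbor(direction)
        if self.carrying is None and self.objects.get(target) in PORTABLE:
            self.carrying = self.objects.pop(target)
            return True
        return False

    def put(self, direction):
        """Put the carried object down on an empty adjacent cell."""
        target = self._neighbor(direction)
        if self.carrying and self._inside(target) and target not in self.objects:
            self.objects[target] = self.carrying
            self.carrying = None
            return True
        return False
```

Each method returns whether the action succeeded, which gives a language interface something to report back to the user.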

18 Stone World Motivation
Demonstrates a full combination of syntax, semantics, actions, and responses.
An artificial, closed world permits unambiguous interpretation.
Stone World offers a substrate upon which experiments and games can be constructed.
Stone World, while simple by comparison, shares these features with the well-known research system SHRDLU, developed by Terry Winograd at MIT.

19 Stone World’s ATN
[Figure: ATN diagram. States: G1, G2, G3, T2, T3, T4, P2, P3, LAST. Arc labels: SHOW, (GO-VERB), (TAKE-VERB), (PUT-VERB), UP, DOWN, IT, TO, TOWARD, (NP1), (DNP1), *.]

20 Stone World’s ATN (Cont)
[Figure: ATN sub-diagrams. States: NP1, NP2, DNP1, DNP2. Arc labels: (ARTICLE), (OBJ-NOUN), (DIRECTION-NOUN).]

21 Demonstration of Stone World
The Python Implementation of Stone World consists of two parts:
  1. representation and methods for accessing and transforming the state of the microworld;
  2. the Augmented Transition Network and other support for the natural-language interface.

22 Sample Conversation
WALK NORTH
* I UNDERSTAND YOU.
OK
GO TO THE WEST
* I UNDERSTAND YOU.
OK
GO WEST
* I UNDERSTAND YOU.
OK
TAKE A STONE FROM THE QUARRY
* I UNDERSTAND YOU.
OK
DROP THE STONE TOWARD THE EAST
* I UNDERSTAND YOU.
OK
TAKE A STONE
* I UNDERSTAND YOU.
OK
DROP IT TO THE NORTH
* I UNDERSTAND YOU.
OK
GO SOUTH
* I UNDERSTAND YOU.
OK

23 Statistical NLP
Statistics has long been a part of computational linguistics. However, interest in the approach grew rapidly during the 1990s as the Internet grew.
Subareas include corpus-based language description, applications in improving search-engine indexing and retrieval, question answering, and data mining.

24 Statistical NLP (cont)
Latent Semantic Analysis: use of singular-value decomposition of large term-document matrices to create “semantic spaces” in which semantically related words and documents tend to be close together (to be presented later).
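
The decomposition behind Latent Semantic Analysis can be summarized in a few lines. The notation below is the conventional one for the singular-value decomposition and is an assumption here, since the slide gives no formulas: A is an m-by-n term-document matrix, k is the number of retained dimensions, and e_j is the j-th standard basis vector.

```latex
\begin{align*}
A &= U \Sigma V^{\top},
  && A \in \mathbb{R}^{m \times n} \\
A_k &= U_k \Sigma_k V_k^{\top},
  && k \ll \min(m, n) \\
\hat{d}_j &= \Sigma_k V_k^{\top} e_j
  && \text{(document } j \text{ in the } k\text{-dimensional semantic space)} \\
\operatorname{sim}(d_i, d_j) &=
  \frac{\hat{d}_i \cdot \hat{d}_j}{\lVert \hat{d}_i \rVert \, \lVert \hat{d}_j \rVert}
  && \text{(cosine similarity)}
\end{align*}
```

Keeping only the k largest singular values is what places related words and documents near each other: terms that occur in similar documents end up with similar coordinates even if they never co-occur directly.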

