Understanding Natural Language


1 Understanding Natural Language
14.0 The Natural Language Understanding Problem
14.1 Deconstructing Language: A Symbolic Analysis
14.2 Syntax
14.3 Syntax and Knowledge with ATN Parsers
14.4 Stochastic Tools for Language Analysis
14.5 Natural Language Applications
14.6 Epilogue and References
14.7 Exercises
Additional source used in preparing the slides: Patrick H. Winston’s AI textbook, Addison Wesley, 1993.

2 Chapter objective Give a brief introduction to deterministic techniques used in understanding natural language

3 An early natural language understanding system: SHRDLU (Winograd, 1972)

4 It could converse about a blocks world What is sitting on the red block? What shape is the blue block on the table? Place the green cylinder on the red brick. What color is the block on the red block? Shape?

5 The problems Understanding language is not merely understanding the words: it requires inference about the speaker’s goals, knowledge, assumptions. The context of interaction is also important.  Do you know where Rekhi 309 is?  Yes.  Good, then please go there and pick up the documents.  Do you know where Rekhi 309 is?  Yes, go up the stairs and enter the semi-circular section.  Thank you.

6 The problems (cont’d) Implementing a natural language understanding program requires that we represent knowledge and expectations of the domain and reason effectively about them: nonmonotonicity, belief revision, metaphor, planning, learning, …
Shall I compare thee to a summer’s day? / Thou art more lovely and more temperate: / Rough winds do shake the darling buds of May, / And summer’s lease hath all too short a date. (Shakespeare’s Sonnet XVIII)

7 The problems (cont’d) There are three major issues involved in understanding natural language: A large amount of human knowledge is assumed. Language is pattern-based: phoneme, word, and sentence orders are not random. Language acts are the products of agents embedded in complex environments.

8 SHRDLU’s solution Restrict focus to a microworld: the blocks world. Constrain the language: use templates. Do not deal with problems involving commonsense reasoning: one can still communicate meaningfully.

9 Linguists’ approach Prosody: the rhythm and intonation of language. Phonology: how sounds are combined to form language. Morphology: the morphemes (components) that make up words. Syntax: the rules for forming legal phrases and sentences. Semantics: the meaning of words, phrases, and sentences. Pragmatics: the effect of language on the listener. World knowledge: background knowledge of the physical world.

10 Stages of language analysis 1. Parsing: analyze the syntactic structure of a sentence. 2. Semantic interpretation: analyze the meaning of a sentence. 3. Contextual/world knowledge representation: analyze the expanded meaning of a sentence. For instance, consider the sentence: Tarzan kissed Jane.

11 The result of parsing would be:

12 The result of semantic interpretation

13 The result of contextual/world knowledge interpretation

14 Can also represent questions: Who loves Jane?

15 Parsing using Context-Free Grammars A set of rewrite rules:
1. sentence → noun_phrase verb_phrase
2. noun_phrase → noun
3. noun_phrase → article noun
4. verb_phrase → verb
5. verb_phrase → verb noun_phrase
6. article → a
7. article → the
8. noun → man
9. noun → dog
10. verb → likes
11. verb → bites
The words a, the, man, dog, likes, and bites are the terminals; sentence, noun_phrase, verb_phrase, article, noun, and verb are the nonterminals; together, terminals and nonterminals are the symbols of the grammar.
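A minimal sketch of how this grammar could be encoded as data, in hypothetical Python (the names GRAMMAR and NONTERMINALS are illustrative, not from the text):

# Hypothetical encoding of the "dogs world" grammar above:
# each nonterminal maps to the list of alternative right-hand sides of its rules.
GRAMMAR = {
    "sentence":    [["noun_phrase", "verb_phrase"]],
    "noun_phrase": [["noun"], ["article", "noun"]],
    "verb_phrase": [["verb"], ["verb", "noun_phrase"]],
    "article":     [["a"], ["the"]],
    "noun":        [["man"], ["dog"]],
    "verb":        [["likes"], ["bites"]],
}
NONTERMINALS = set(GRAMMAR)   # every other symbol ("a", "man", ...) is a terminal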

16 Parsing It is the search for a legal derivation of the sentence:
sentence
⇒ noun_phrase verb_phrase
⇒ article noun verb_phrase
⇒ The noun verb_phrase
⇒ The man verb_phrase
⇒ The man verb noun_phrase
⇒ The man bites noun_phrase
⇒ The man bites article noun
⇒ The man bites the noun
⇒ The man bites the dog
Each intermediate form is a sentential form.

17 Parsing (cont’d) The result is a parse tree. A parse tree is a structure where each node is a symbol from the grammar. The root node is the starting nonterminal, the intermediate nodes are nonterminals, the leaf nodes are terminals. “Sentence” is the starting nonterminal. There are two classes of parsing algorithms  top-down parsers: start with the starting symbol and try to derive the complete sentence  bottom-up parsers: start with the complete sentence and attempt to find a series of reductions to derive the start symbol

18 The parse tree

19 Parsing is a search problem Search for the correct derivation If a wrong choice is made, the parser needs to backtrack Recursive descent parsers maintain backtrack pointers Look-ahead techniques help determine the proper rule to apply We’ll study transition network parsers (and augmented transition networks)
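A minimal sketch of this search, assuming the hypothetical GRAMMAR table above: a top-down, depth-first parser in which Python generators supply the backtracking (the names derive and accepts are illustrative, not from the text).

def derive(symbol, words, pos):
    # Yield every input position reachable by deriving `symbol` from words[pos:].
    if symbol not in NONTERMINALS:                  # terminal: must match the input word
        if pos < len(words) and words[pos] == symbol:
            yield pos + 1
        return
    for rhs in GRAMMAR[symbol]:                     # try each rule for the nonterminal
        yield from derive_sequence(rhs, words, pos)

def derive_sequence(symbols, words, pos):
    # Derive a sequence of symbols, backtracking over each symbol's alternatives.
    if not symbols:
        yield pos
        return
    for p in derive(symbols[0], words, pos):
        yield from derive_sequence(symbols[1:], words, p)

def accepts(sentence):
    words = sentence.lower().split()
    return any(p == len(words) for p in derive("sentence", words, 0))

print(accepts("the man bites the dog"))   # True
print(accepts("man the dog bites"))       # False

A bottom-up parser would instead start from the words themselves and apply reductions toward the start symbol.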

20 Transition networks

21 Transition networks (cont’d) A transition network grammar is a set of finite-state machines representing the rules of the grammar. Each network corresponds to a single nonterminal. Arcs are labeled with either terminal or nonterminal symbols. Each path from the start state to the final state corresponds to a rule for that nonterminal. If there is more than one rule for a nonterminal, there are multiple paths from the start state to the final state (e.g., noun_phrase).

22 The main idea Finding a successful transition through the network corresponds to replacing the nonterminal with the RHS of one of its rules. Parsing a sentence is a matter of traversing the networks: If the label of the transition (arc) is a terminal, it must match the input, and the input pointer advances. If the label of the transition (arc) is a nonterminal, the corresponding transition network is invoked recursively. If several alternative paths are possible, each must be tried with backtracking (very much like a nondeterministic finite automaton) until a successful path is found.
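To illustrate, the dogs world grammar can be recast as such a set of networks, sketched here as a hypothetical Python table (the state names s0, s1, s2 are illustrative): one small finite-state machine per nonterminal, with arcs labeled by terminals or nonterminals, and alternative rules becoming alternative paths.

NETWORKS = {
    # nonterminal: {"start": state, "final": {states}, "arcs": {state: [(arc_label, next_state)]}}
    "sentence":    {"start": "s0", "final": {"s2"},
                    "arcs": {"s0": [("noun_phrase", "s1")],
                             "s1": [("verb_phrase", "s2")]}},
    "noun_phrase": {"start": "s0", "final": {"s2"},
                    "arcs": {"s0": [("noun", "s2"), ("article", "s1")],   # two rules, two paths
                             "s1": [("noun", "s2")]}},
    "verb_phrase": {"start": "s0", "final": {"s1", "s2"},
                    "arcs": {"s0": [("verb", "s1")],
                             "s1": [("noun_phrase", "s2")]}},
    "article":     {"start": "s0", "final": {"s1"},
                    "arcs": {"s0": [("a", "s1"), ("the", "s1")]}},
    "noun":        {"start": "s0", "final": {"s1"},
                    "arcs": {"s0": [("man", "s1"), ("dog", "s1")]}},
    "verb":        {"start": "s0", "final": {"s1"},
                    "arcs": {"s0": [("likes", "s1"), ("bites", "s1")]}},
}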

23 Parsing the sentence “Dog bites.”

24 Notes A “successful parse” is a complete traversal of the net for the starting nonterminal, from its initial state s_initial to its final state s_final. If no path works, the parse “fails”: the input is not a valid sentence (or part of a sentence). The following algorithm is initially called with the starting nonterminal, i.e., parse(sentence), so it starts with the net for “sentence.”

25 The algorithm
Function parse(grammar_symbol);
begin
  save pointer to current location in input stream;
  case
    grammar_symbol is a terminal:
      if grammar_symbol matches the next word in the input stream
        then return(success)
        else begin
          reset input stream;
          return(failure)
        end;

26 The algorithm (cont’d)
    … (case continued)
    grammar_symbol is a nonterminal:
      begin
        retrieve the transition network labeled by grammar_symbol;
        state := start state of network;
        if transition(state) returns success
          then return(success)
          else begin
            reset input stream;
            return(failure)
          end
      end
  end
end.

27 The algorithm (cont’d)
Function transition(current_state);
begin
  case
    current_state is a final state:
      return(success)
    current_state is not a final state:
      while there are unexamined transitions out of current_state do
        begin
          grammar_symbol := the label on the next unexamined transition;
          if parse(grammar_symbol) returns success
            then begin
              next_state := state at the end of the transition;
              if transition(next_state) returns success
                then return(success)
            end
        end;
      return(failure)
  end
end.
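A hypothetical Python rendering of this pseudocode, assuming the NETWORKS table sketched earlier; generators stand in for the explicit “save/reset input stream” bookkeeping, so a failure simply yields nothing and the caller’s input position is untouched.

def parse(symbol, words, pos):
    # Yield each input position reachable after recognizing `symbol` starting at `pos`.
    if symbol not in NETWORKS:                     # grammar_symbol is a terminal
        if pos < len(words) and words[pos] == symbol:
            yield pos + 1                          # matched: advance the input pointer
        return                                     # otherwise: failure
    net = NETWORKS[symbol]                         # grammar_symbol is a nonterminal
    yield from transition(net, net["start"], words, pos)

def transition(net, state, words, pos):
    if state in net["final"]:
        yield pos                                  # current_state is a final state: success
    for label, nxt in net["arcs"].get(state, []):  # try every unexamined transition
        for p in parse(label, words, pos):         # recursive call on the arc's label
            yield from transition(net, nxt, words, p)

def sentence_ok(text):
    words = text.lower().rstrip(".").split()
    return any(p == len(words) for p in parse("sentence", words, 0))

print(sentence_ok("Dog bites."))               # True
print(sentence_ok("The man bites the dog."))   # True
print(sentence_ok("Bites the dog."))           # False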

28 Modifications to return the parse tree 1. Each time the function parse is called with a terminal symbol as argument and that terminal matches the next symbol of input, it returns a tree consisting of a single leaf node labeled with that symbol. 2. When parse is called with a nonterminal, N, it calls transition. If transition succeeds, it returns an ordered set of subtrees. Parse combines these into a tree whose root is N and whose children are the subtrees returned by transition.

29 Modifications to return the parse tree (cont’d) 3. In searching for a path through a network, transition calls parse on the label of each arc. On success, parse returns a tree representing a parse of that symbol. Transition saves these subtrees in an ordered set and, on finding a path through the network, returns the ordered set of parse trees corresponding to the sequence of arc labels on the path.
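A hypothetical continuation of that sketch: the same two functions can return parse trees by yielding a (tree, position) pair at each step, with terminals as leaves and transition collecting the ordered subtrees found along its path.

def parse_tree(symbol, words, pos):
    # Yield (tree, new_pos) pairs; a tree is a bare word for a terminal,
    # or (nonterminal, [subtrees]) for a nonterminal.
    if symbol not in NETWORKS:                      # terminal: a single leaf node
        if pos < len(words) and words[pos] == symbol:
            yield symbol, pos + 1
        return
    net = NETWORKS[symbol]
    for children, p in transition_tree(net, net["start"], words, pos):
        yield (symbol, children), p                 # root labeled with the nonterminal

def transition_tree(net, state, words, pos):
    if state in net["final"]:
        yield [], pos                               # empty ordered set of subtrees
    for label, nxt in net["arcs"].get(state, []):
        for subtree, p in parse_tree(label, words, pos):
            for rest, q in transition_tree(net, nxt, words, p):
                yield [subtree] + rest, q           # this arc's subtree, then the rest of the path

words = "the man bites the dog".split()
full = [t for t, p in parse_tree("sentence", words, 0) if p == len(words)]
print(full[0])   # ('sentence', [('noun_phrase', [...]), ('verb_phrase', [...])])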

30 Comments on transition networks They capture the regularity in sentence structure. They exploit the fact that only a small vocabulary is needed in a specific domain. If a sentence “doesn’t make sense,” it might be caught by the domain information; for instance, the response to both of the following requests is “there is none”: “Pick up the blue cylinder” and “Pick up the red blue cylinder.”

31 The Chomsky Hierarchy and CFGs In a CFG, only a single nonterminal is allowed on the left-hand side of a rule. CFGs are not powerful enough to represent natural language. Simply add plural nouns and verbs to the dogs world grammar:
noun → men
noun → dogs
verb → bite
verb → like
“A men bites a dogs” will be a legal sentence.
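Continuing the hypothetical GRAMMAR and accepts sketch from earlier, the overgeneration is easy to reproduce: the plural forms enter as ordinary alternatives, so the parser has no way to check number agreement.

GRAMMAR["noun"] += [["men"], ["dogs"]]    # plural nouns, added with no agreement marking
GRAMMAR["verb"] += [["bite"], ["like"]]   # plural verbs
print(accepts("a men bites a dogs"))      # True: the CFG cannot enforce agreement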

32 Options to deal with context Extend CFGs Use context-sensitive grammars (CSGs) With CSGs, the only restriction is that the RHS must be at least as long as the LHS. Note that the next higher class, the recursively enumerable (Turing-recognizable) languages, is not usually regarded as an option.

33 A context-sensitive grammar
sentence → noun_phrase verb_phrase
noun_phrase → article number noun
noun_phrase → number noun
number → singular
number → plural
article singular → a singular
article singular → the singular
article plural → the plural
singular noun → dog singular
singular noun → man singular
plural noun → men plural
plural noun → dogs plural
singular verb_phrase → singular verb
plural verb_phrase → plural verb

34 A context-sensitive grammar (cont’d)
singular verb → bites
singular verb → likes
plural verb → bite
plural verb → like

35 “The dogs bite”
sentence
⇒ noun_phrase verb_phrase
⇒ article number noun verb_phrase
⇒ article plural noun verb_phrase
⇒ The plural noun verb_phrase
⇒ The dogs plural verb_phrase
⇒ The dogs plural verb
⇒ The dogs bite

36 CSGs for practical parsing
1. The number of rules and nonterminals in the grammar increases drastically.
2. They obscure the phrase structure of the language that is so clearly represented in the context-free rules.
3. By attempting to handle more complicated checks for agreement and semantic consistency in the grammar itself, they lose many of the benefits of separating the syntactic and semantic components of language.
4. CSGs do not address the problem of building a semantic representation of the text.