Inteligenta Artificiala (Artificial Intelligence)

Inteligenta Artificiala, Universitatea Politehnica Bucuresti, academic year 2003-2004, Adina Magda Florea, http://turing.cs.pub.ro/ia_2005

Curs nr. 12: Natural Language Processing (Prelucrarea limbajului natural)

Defining Languages with Backus-Naur Form (BNF) A formal language is defined as a set of strings, where each string is a sequence of symbols. Most languages of interest consist of an infinite set of strings, so we need a concise way to characterize the set: we use a grammar. Terminal symbols: the symbols or words that make up the strings of the language. Example: the set of symbols for the language of simple arithmetic expressions is {0,1,2,3,4,5,6,7,8,9,+,-,*,/,(,)}

Components in a BNF Grammar Nonterminal symbols: categorize subphrases of the language. Example: the nonterminal symbol NP (NounPhrase) denotes an infinite set of strings, including “you” and “the big dog”

Components in a BNF Grammar Start symbol: the nonterminal symbol that denotes the complete strings of the language. Set of rewrite rules or productions: LHS → RHS, where LHS is a nonterminal and RHS is a sequence of zero or more symbols (either terminal or nonterminal)

Example: BNF Grammar for Simple Arithmetic Expressions
Exp → Exp Operator Exp | ( Exp ) | Number
Number → Digit | Number Digit
Digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Operator → + | - | * | /
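As a concrete illustration (not shown on the slide), the same grammar can be written almost directly in Prolog's DCG notation, the formalism introduced later in this lecture. This is a minimal sketch: the predicate names (expr, num, etc.) are only illustrative, and the left-recursive rules are rewritten so that Prolog's top-down search terminates.

:- use_module(library(lists)).

% Exp -> Exp Operator Exp | (Exp) | Number, rewritten without left recursion.
expr --> term, operator, expr.
expr --> term.
term --> ['('], expr, [')'].
term --> num.
% Number -> Digit | Number Digit, written right-recursively (same language).
num --> digit, num.
num --> digit.
digit --> [D], { member(D, ['0','1','2','3','4','5','6','7','8','9']) }.
operator --> [Op], { member(Op, ['+','-','*','/']) }.

% Example query, with the input already split into one symbol per list element:
% ?- phrase(expr, ['(', '3', '+', '4', ')', '*', '7']).
% true .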

The Component Steps of Communication A typical communication, in which the speaker S wants to transmit the proposition P to the hearer H using the words W, is composed of seven processes: three take place in the speaker and four in the hearer.

Processes in the Speaker Intention S wants H to believe P (where S typically believes P) Generation S chooses the words W (because they express the meaning P) Synthesis S utters the words W (usually addressing them to H)

Processes in the Hearer Perception H perceives W’ (ideally W’ = W, but misperception is possible) Analysis H infers that W’ has possible meanings P1,…,Pn (words and phrases can have several meanings)

Processes in the Hearer Disambiguation H infers that S intended to express Pi (where ideally Pi = P, but misinterpretation is possible) Incorporation H decides to believe Pi (or rejects it if it is out of line with what H already believes)

Observations If the perception refers to spoken utterances, this is speech recognition. If the perception refers to handwritten text, this is handwriting recognition. Neural networks have been applied successfully to both speech recognition and handwriting recognition.

Observations Analysis, disambiguation, and incorporation form natural language understanding; they rely on the assumption that the words of the sentence are known. Often, recognition of individual words is driven by the sentence structure, so perception and analysis interact, as do analysis, disambiguation, and incorporation.

Defining a Grammar Lexicon - the list of allowable vocabulary words, grouped into categories (parts of speech): open classes - words are added to the category all the time (natural language is dynamic, it constantly evolves); closed classes - a small number of words, and generally no other words are expected to be added

Example - A Small Lexicon
Noun → stench | breeze | wumpus | ...
Verb → is | see | smell | ...
Adjective → right | left | smelly | ...
Adverb → here | there | ahead | ...
Pronoun → me | you | I | it
RelPronoun → that | who
Name → John | Mary
Article → the | a | an
Preposition → to | in | on
Conjunction → and | or | but

The Grammar Associated to the Lexicon Combine the words into phrases. Use nonterminal symbols to define different kinds of phrases: sentence S, noun phrase NP, verb phrase VP, prepositional phrase PP, relative clause RelClause

Example - The Grammar Associated to the Lexicon
S → NP VP | S Conjunction S
NP → Pronoun | Noun | Article Noun | NP PP | NP RelClause
VP → Verb | VP NP | VP Adjective | VP PP | VP Adverb
PP → Preposition NP
RelClause → RelPronoun VP
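To connect this with the Prolog code used later in the lecture, here is a small sketch (not on the slide) of a fragment of this grammar and lexicon in the same append-based style; the chosen word lists and predicate names are only illustrative.

% A fragment of the lexicon as Prolog facts.
noun(stench).  noun(breeze).  noun(wumpus).
verb(is).      verb(see).     verb(smell).
article(the).  article(a).    article(an).
pronoun(i).    pronoun(you).  pronoun(it).

% A fragment of the grammar rules above, over lists of words.
np([W])      :- pronoun(W).
np([W])      :- noun(W).
np([W1, W2]) :- article(W1), noun(W2).
vp([W])      :- verb(W).
vp(S)        :- vp(S1), np(S2), append(S1, S2, S).
sentence(S)  :- np(S1), vp(S2), append(S1, S2, S).

% Example query:
% ?- sentence([i, see, the, wumpus]).
% true .
% (As discussed later in the lecture, recursive rules such as the second vp clause
% can make this naive search loop on inputs that have no parse.)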

Syntactic Analysis (Parsing) Parsing is the problem of constructing a derivation tree for an input string from a formal definition of a grammar. Parsing algorithms may be divided into two classes: top-down parsing bottom-up parsing

Top-Down Parsing Start with the top-level sentence symbol and attempt to build a tree whose leaves match the target sentence's words (the terminals) Better if many alternative terminal symbols for each word Worse if many alternative rules for a phrase

Example for Top-Down Parsing: "John hit the ball"
1. S
2. S → NP, VP
3. S → Noun, VP
4. S → John, Verb, NP
5. S → John, hit, NP
6. S → John, hit, Article, Noun
7. S → John, hit, the, Noun
8. S → John, hit, the, ball

Bottom-Up Parsing Start with the words in the sentence (the terminals) and attempt to find a series of reductions that yield the sentence symbol Better if many alternative rules for a phrase Worse if many alternative terminal symbols for each word

Example for Bottom-Up Parsing
1. John, hit, the, ball
2. Noun, hit, the, ball
3. Noun, Verb, the, ball
4. Noun, Verb, Article, ball
5. Noun, Verb, Article, Noun
6. NP, Verb, Article, Noun
7. NP, Verb, NP
8. NP, VP
9. S

Definite Clause Grammar (DCG) Problems with BNF grammars: BNF only talks about strings, not meanings. We want to describe context-sensitive grammars, but BNF is context-free. Introduce a formalism that can handle both of these problems: use first-order logic to talk about strings and their meanings

Definite Clause Grammar (DCG) We are interested in using language for communication, so we need some way of associating a meaning with each string. Each nonterminal symbol becomes a one-place predicate that is true of strings that are phrases of that category. Example: Noun(“ball”) is a true logical sentence; Noun(“the”) is a false logical sentence

Definite Clause Grammar (DCG) A definite clause grammar (DCG) is a grammar in which every sentence must be a definite clause. A definite clause is a type of Horn clause that, when written as an implication, has exactly one atom in the conclusion and a conjunction of zero or more atoms in the hypothesis, for example A1 ∧ A2 ∧ … ⇒ C1

Example 1 In BNF notation, we have: S → NP VP. In first-order logic notation, we have: NP(s1) ∧ VP(s2) ⇒ S(Append(s1, s2)). We read: if there is a string s1 that is a noun phrase and a string s2 that is a verb phrase, then the string formed by appending them together is a sentence

Example 2 In BNF notation, we have: Noun → ball | book. In first-order logic notation, we have: (s = “ball” ∨ s = “book”) ⇒ Noun(s). We read: if s is the string “ball” or the string “book”, then the string s is a noun
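The same two rules, written in Prolog's built-in DCG notation (a sketch, not on the slide), where the input string is a list with one word per element:

noun --> [ball].
noun --> [book].

% Example queries:
% ?- phrase(noun, [ball]).   % true:  "ball" is a noun
% ?- phrase(noun, [the]).    % false: "the" is not a noun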

Rules to Translate BNF into DCG

Augmenting the DCG Extend the notation to incorporate grammars that cannot be expressed in BNF. Nonterminal symbols can be augmented with extra arguments

Augmenting the DCG Add one argument for semantics In DCG, the nonterminal NP translates as a one-place predicate where the single argument is a string: NP(s) In the augmented DCG, we can write NP(sem) to express “an NP with semantics sem”. This gets translated into logic as the two-place predicate NP(sem, s)

Augmenting the DCG Add one argument for semantics.
Augmented grammar rule: S(sem) → NP(sem1) VP(sem2) {compose(sem1, sem2, sem)}
FOPL: NP(s1, sem1) ∧ VP(s2, sem2) ∧ compose(sem1, sem2, sem) ⇒ S(Append(s1, s2), sem)
PROLOG: see later on

Semantic Interpretation Compositional semantics - the semantics of any phrase is a function of the semantics of its subphrases; it does not depend on any other phrase before, after, or encompassing the given phrase. But natural languages do not have a compositional semantics in the general case.

% The sentence meaning is built from the NP and VP meanings (compositional semantics).
sentence(S, Sem) :- np(S1, Sem1), vp(S2, Sem2), append(S1, S2, S), Sem = [Sem1 | Sem2].
% A noun phrase: article + noun; its meaning is the noun's meaning.
np([S1, S2], Sem) :- article(S1), noun(S2, Sem).
% Verb phrases: a bare verb, verb + color adjective, or verb + noun (a part).
vp([S], Sem) :- verb(S, Sem1), Sem = [property, Sem1].
vp([S1, S2], Sem) :- verb(S1), adjective(S2, color, Sem1), Sem = [color, Sem1].
vp([S1, S2], Sem) :- verb(S1), noun(S2, Sem1), Sem = [parts, Sem1].
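The slide does not show the lexicon these clauses rely on; a hypothetical one (the word and meaning choices are only illustrative) makes the grammar runnable:

% Hypothetical lexicon, just enough to query the clauses above.
article(the).  article(a).
noun(ball, ball).   noun(box, box).
verb(is).                      % verb/1 is used by the adjective and noun VP rules
verb(is, being).               % verb/2 is used by the bare-verb VP rule
adjective(red, color, red).
adjective(blue, color, blue).

% Example query:
% ?- sentence([the, ball, is, red], Sem).
% Sem = [ball, color, red]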

Problems with Augmented DCG The previous grammar will generate sentences that are not grammatically correct. NL is not a context-free language. We must deal with: cases; agreement between the subject and the main verb of the sentence (the predicate); verb subcategorization: the complements that a verb can accept

Solution Augment the existing rules of the grammar to deal with context issues Start by parameterizing the categories NP and Pronoun so that they take a parameter indicating their case

CASES
Nominative case (subjective case) + agreement: I take the bus / Je prends l’autobus / Eu iau autobuzul; You take the bus / Tu prends l’autobus / Tu iei autobuzul; He takes the bus / Il prend l’autobus / El ia autobuzul
Accusative case (objective case): He gives me the book / Il me donne le livre / El imi da cartea
Dative case: You are talking to me / Il parle avec moi / El vorbeste cu mine

Example - The Grammar Using Augmentations to Represent Noun Cases
S → NP(Subjective) VP
NP(case) → Pronoun(case) | Noun | Article Noun
Pronoun(Subjective) → I | you | he | she
Pronoun(Objective) → me | you | him | her

% The subject NP must be in the subjective case.
sentence(S) :- np(S1, subjective), vp(S2), append(S1, S2, S).
np([S], Case) :- pronoun(S, Case).
np([S], _ ) :- noun(S).
np([S1, S2], _ ) :- article(S1), noun(S2).
pronoun(i, subjective).
pronoun(you, _ ).
pronoun(he, subjective).
pronoun(she, subjective).
pronoun(me, objective).
pronoun(him, objective).
pronoun(her, objective).
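A hypothetical verb-phrase rule and a few extra lexicon entries (not on the slide, all names illustrative) show the case check in action:

% Hypothetical additions so the case-checking grammar can be queried.
noun(bus).
article(the).
verb(take).
vp([V | Rest]) :- verb(V), np(Rest, objective).

% Example queries:
% ?- sentence([i, take, the, bus]).    % true:  "i" can be subjective
% ?- sentence([me, take, the, bus]).   % false: "me" is objective only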

Verb Subcategorization Augment the DCG with a new parameter to describe the verb subcategorization The grammar must state which verbs can be followed by which other categories. This is the subcategorization information for the verb Each verb has a list of complements

Integrate Verb Subcategorization into the Grammar A subcategorization list is a list of complement categories that the verb accepts Augment the category VP to take a subcategorization argument that indicates the complements that are needed to form a complete VP

Integrate Verb Subcategorization into the Grammar Change the rule for S to say that it requires a verb phrase that has all its complements, and thus a subcategorization list of [ ]. Rule: S → NP(Subjective) VP([ ]). The rule can be read as “A sentence can be composed of a NP in the subjective case, followed by a VP which has a null subcategorization list”
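In the Prolog style used earlier, and assuming vp/2 carries the subcategorization list as its second argument (a sketch, not on the slide), the rule for S becomes:

% The VP must have consumed all its complements: empty subcategorization list.
sentence(S) :- np(S1, subjective), vp(S2, []), append(S1, S2, S).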

Integrate Verb Subcategorization into the Grammar Verb phrases can take adjuncts, which are phrases that are not licensed by the individual verb, but rather may appear in any verb phrase. Phrases representing time and place are adjuncts, because almost any action or event can have a time or a place. VP(subcat) → VP(subcat) PP | VP(subcat) Adverb. Example: I smell the wumpus now

VP(subcat) → VP([NP | subcat]) NP(Objective)
           | VP([Adjective | subcat]) Adjective
           | VP([PP | subcat]) PP
           | Verb(subcat)
           | VP(subcat) PP
           | VP(subcat) Adverb
The first line can be read as “A VP, with a given subcategorization list, subcat, can be formed by a VP followed by a NP in the objective case, as long as that VP has a subcategorization list that starts with the symbol NP and is followed by the elements of the list subcat”

Verb subcategorization lists and examples:
give [NP, PP] - give the gold in the box to me
give [NP, NP] - give me the gold
smell [NP] - smell a wumpus
smell [Adjective] - smell awful
smell [PP] - smell like a wumpus
is [Adjective] - is smelly
is [PP] - is in the box
is [NP] - is a pit
died [ ] - died
believe [S] - believe the wumpus is dead

VP(subcat) → VP([NP | subcat]) NP(Objective)
           | VP([Adjective | subcat]) Adjective
           | VP([PP | subcat]) PP
           | Verb(subcat)
           | VP(subcat) PP
           | VP(subcat) Adverb
% Translation of the first rule: the head carries subcat, the embedded VP carries [np | subcat].
vp(S, Subcat) :- vp(S1, [np | Subcat]), np(S2, objective), append(S1, S2, S).
% Lexicon: each verb listed with one of its subcategorization lists.
vp([give], [np, pp]).
vp([give], [np, np]).
vp([smell], [np]).
vp([smell], [adjective]).
vp([smell], [pp]).
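A hypothetical, runnable rearrangement of these clauses (not on the slide): the lexical facts are listed before the recursive rule so that Prolog reaches the base cases first, and a minimal NP fragment is added so the example can be queried.

% Verb entries (base cases) first.
vp([give],  [np, pp]).
vp([give],  [np, np]).
vp([smell], [np]).
% A VP still expecting an NP complement, followed by an NP in the objective case.
vp(S, Subcat) :- vp(S1, [np | Subcat]), np(S2, objective), append(S1, S2, S).

% Minimal NP fragment, just enough for the example.
np([P], Case) :- pronoun(P, Case).
np([A, N], _) :- article(A), noun(N).
pronoun(me, objective).
article(the).
noun(gold).

% Example query (take the first answer; deeper backtracking does not terminate,
% for the reason discussed on the next slide):
% ?- vp([give, me, the, gold], Subcat).
% Subcat = []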

But it is dangerous to translate the rule VP(subcat) → VP(subcat) PP directly: the resulting clause would be left-recursive (vp calling vp with the same arguments), and Prolog's depth-first search would loop. Solution: use a separate predicate vp1 for the verb phrase without adjuncts:
vp(S, Subcat) :- vp1(S1, Subcat), pp(S2), append(S1, S2, S).
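The slide does not spell out vp1; here is a hypothetical sketch of the split, with a toy lexicon (all entries illustrative): vp1/2 covers the verb and its complements, and vp/2 adds the optional adjuncts on top, so no clause calls itself with unchanged arguments.

% Toy lexicon for the sketch.
verb([give],  [np, np]).
verb([smell], [np]).
np([the, gold], objective).
pp([in, the, morning]).

% vp1: the verb and its complements (no adjuncts).
vp1(S, Subcat) :- verb(S, Subcat).
vp1(S, Subcat) :- vp1(S1, [np | Subcat]), np(S2, objective), append(S1, S2, S).

% vp: a vp1, optionally followed by a PP adjunct.
vp(S, Subcat) :- vp1(S, Subcat).
vp(S, Subcat) :- vp1(S1, Subcat), pp(S2), append(S1, S2, S).

% Example query (first answer; like the slide's own append-based clauses,
% exhaustive backtracking over unparsable input can still loop):
% ?- vp([smell, the, gold], []).
% true .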

Generative Capacity of Augmented Grammars The generative capacity of an augmented grammar depends on the number of values the augmentations can take. If that number is finite, the augmented grammar is equivalent to a context-free grammar

Semantic Interpretation The semantic interpretation is responsible for getting all possible interpretations, and disambiguation is responsible for choosing the best one. Disambiguation is done starting from the pragmatic interpretation of the sentence.

Pragmatic Interpretation Complete the semantic interpretation by adding information about the current situation. Pragmatics describes how language is used and its effects on the listener. Pragmatics explains why it is not appropriate to answer "Yes" to the question "Do you know what time it is?"

Indexicals Indexical - a phrase that refers directly to the current situation Example I am in Bucharest today.

Anaphora Anaphora - the occurrence of phrases referring to objects that have been mentioned previously Example John was hungry. He entered a restaurant. The ball hit the house. It broke the window.

Ambiguity Lexical Ambiguity Syntactic Ambiguity Referential Ambiguity Pragmatic Ambiguity

Lexical Ambiguity A word has more than one meaning Examples A clear sky A clear profit The way is clear John is clear It is clear that ...

Syntactic Ambiguity Can occur with or without lexical ambiguity Examples I saw the Statue of Liberty flying over New York. I saw John in a restaurant with a telescope.

Referential Ambiguity Occurs because natural languages consist almost entirely of words for categories, not for individual objects Example John met Mary and Tom. They went to a restaurant. Block A is on block B and it is not clear.

Pragmatic Ambiguity Occurs when the speaker and the hearer disagree on what the current situation is Example I will meet you tomorrow.