English Syntax Read J & M Chapter 9.

Two Kinds of Issues
Linguistic – what are the facts about language? The rules of syntax (grammar).
Algorithmic – what are effective computational procedures for dealing with those facts? Building parsers.

What is Syntax? Try 1: the rules for stringing words together to form sentences.
The boys hit the ball. vs. Ball boys hit the the.
I gave Sue a ride to the store. vs. I gave Sue ride to store.
I saw the book that Mary had written. vs. I saw the book what Mary had written.
But if that's all it were, we wouldn't have to do much for understanding, assuming legal input.

What is Syntax? Try 2: The rules for forming constituents that correspond to meaningful entities. Example: The cat with the furry tail purred.

Why Do We Care about Syntax?
Morphology → POS Tagging → Syntax → Semantics → Discourse Integration
Generation goes backwards. For this reason, we generally want declarative representations of the facts.

Sometimes We Need it Even if We Don’t Go All the Way Question answering: Lawyers whose clients committed fraud vs Lawyers who committed fraud vs Clients whose lawyers committed fraud

Finding Constituents in Sentences
A constituent is a word or group of words that functions as a unit. How can we discern constituents?
Semantically: The cat with the furry tail purred. What can be chopped out and replaced by a single word?
Agnes purred.
* Agnes tail purred.

Finding Constituents in Sentences, cont'd
Preposed and postposed constructions:
Early next year I'd like to go to Paris.
I'd like to go to Paris early next year.
I'd like early next year to go to Paris.
* Early I'd like to go to Paris next year.
* I'd like early to go to Paris next year.
* The early next year old man would like to go to Paris.

How Many Kinds of Constituents are There? Although there may be an infinite number of possible constituent tokens, there’s quite a small number of constituent types, e.g., NP, PP, VP. On what basis can we group tokens into types? Occurrence in similar contexts.

How Many Kinds of Constituents are There, cont'd
The cat with the furry tail purred.
Every dog wore a collar.
Most of the children in the room brought a dog with a furry tail and a collar.
The furry tail brought a room.
Every room purred.
A dog with a furry tail and a collar purred.
Mary saw most of the children in the room.
NPs occur as subjects, objects of verbs, and objects of prepositions.

Single Word Constituents
Single word constituents are exactly the parts of speech that we have already considered. How many of these single-word constituent types are there? Look at the sizes of tagsets. Lots of design decisions:
Sue bought the big white house.
* Sue bought the white big house.
Are big and white the same POS?

Simple Constituent Types Don't Capture Everything
* The cat with a furry tail purred a collar.
Mary imagined a cat with a furry tail.
Mary decided to go.
* Mary decided a cat with a furry tail.
Mary decided a cat with a furry tail would be her next pet.
Mary gave Lucy the food.
* Mary decided Lucy the food.

Subcategorization Frames

Frame            Verbs                     Example
Ø                eat, sleep, …             I want to eat
NP               prefer, find, leave, …    Find [NP the flight from Pittsburgh to Boston]
NP NP            show, give, …             Show [NP me] [NP airlines with flights from Pittsburgh]
PP from PP to    fly, travel, …            I would like to fly [PP from Boston] [PP to Philadelphia]
NP PP with       help, load, …             Can you help [NP me] [PP with a flight]
VP to            prefer, want, need, …     I would prefer [VP to go by United Airlines]
VP brst          can, would, might, …      I can [VP go from Boston]
S                mean                      Does this mean [S AA has a hub in Boston]?

(VP to = infinitival VP; VP brst = bare-stem VP)
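One convenient way to operationalize such a table is a lexicon that maps each verb to the complement frames it licenses. The sketch below is my own illustration in Python; the frame labels and the allows helper are invented for this example rather than taken from J & M.

# A hypothetical subcategorization lexicon: verbs map to the complement
# frames they allow, using labels modeled on the table above
# ("VP_to" = infinitival VP, "VP_brst" = bare-stem VP, "S" = sentential).
SUBCAT = {
    "eat":    [()],                        # no complement required
    "find":   [("NP",)],
    "show":   [("NP", "NP")],
    "fly":    [("PP_from", "PP_to")],
    "help":   [("NP", "PP_with")],
    "prefer": [("NP",), ("VP_to",)],       # one verb may allow several frames
    "can":    [("VP_brst",)],
    "mean":   [("S",)],
}

def allows(verb, frame):
    """True if the lexicon licenses this complement frame for the verb."""
    return frame in SUBCAT.get(verb, [])

print(allows("show", ("NP", "NP")))   # True:  Show me airlines ...
print(allows("eat", ("NP", "NP")))    # False: eat is not listed as ditransitive here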

The Role of the Lexicon in Parsing
Serves as the starting point for POS tagging.
Provides additional information such as subcategorization:
For verbs.
For adjectives: I'm angry with Mary. I'm angry at Mary. I'm mad at Mary. * I'm mad with Mary.
For nouns: Jane has a passion for old movies. Jane has an interest in old movies.

One Other Barrier to a Small Number of Kinds of Constituents - Agreement
Number agreement: The boys want to go to the game(s). * The boy want to go to the game(s).
Case agreement: I want to give it to him. * Me want to give it to he.
In English it's just pronouns, but not so in many other languages.

The Solution – Augmenting the Constituent Types
To solve these and other problems, one strategy is to augment constituent types with other sorts of information:
V +pl +[NP NP]  →  VP/NP/NP +pl    Show
VP/NP +pl                          Show me
VP +pl                             Show me the book.
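Read concretely, the slash notation above says that a verb carries its unfilled complement slots plus feature annotations, and each complement it combines with removes one slot. The following is a minimal sketch of that bookkeeping; the class and function names (Cat, apply_complement) are my own, not a standard formalism.

from dataclasses import dataclass

@dataclass
class Cat:
    base: str          # e.g. "VP"
    missing: tuple     # remaining complement slots, e.g. ("NP", "NP")
    features: dict     # e.g. {"num": "pl"}
    words: tuple       # words spanned so far

def apply_complement(fn, arg_cat, arg_words):
    """Combine e.g. VP/NP/NP with an NP, yielding VP/NP; None if no slot fits."""
    if not fn.missing or fn.missing[0] != arg_cat:
        return None
    return Cat(fn.base, fn.missing[1:], fn.features, fn.words + arg_words)

show  = Cat("VP", ("NP", "NP"), {"num": "pl"}, ("Show",))   # VP/NP/NP +pl
step1 = apply_complement(show, "NP", ("me",))               # VP/NP +pl   "Show me"
step2 = apply_complement(step1, "NP", ("the", "book"))      # VP +pl      "Show me the book"
print(step2.base, step2.missing, step2.features, step2.words)
# VP () {'num': 'pl'} ('Show', 'me', 'the', 'book')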

Specifying a Language
The set of sentences in English is large (maybe even infinite). We want a concise (i.e., much shorter than a list of sentences) definition of it. We have a finite (in fact quite small) set of constituent types (NP, VP, etc.) from which to build our description. So we appeal to recursion and write grammar rules such as:
S → NP VP
VP → V NP
NP → NP PP
NP → NP S (The boy who went to the store won the game.)
PP → prep NP
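To see how recursion buys a concise definition, here is a small generator (my own sketch, not part of the chapter) that expands the rules above into strings of POS tags; because NP → NP PP can reapply, the rule set is tiny but the set of generated strings is unbounded. The depth cap exists only so the sketch terminates.

import random

# The rules from this slide, with POS tags (Det, N, V, prep) as terminals.
RULES = {
    "S":  [["NP", "VP"]],
    "VP": [["V", "NP"]],
    "NP": [["Det", "N"], ["NP", "PP"]],   # the second alternative is recursive
    "PP": [["prep", "NP"]],
}

def expand(symbol, depth=0, max_depth=3):
    """Expand a symbol into a sequence of POS tags by recursive rule application."""
    if symbol not in RULES:
        return [symbol]                                   # POS tag: stop here
    alternatives = RULES[symbol]
    # Past the depth cap, take the first (non-recursive) alternative.
    rhs = alternatives[0] if depth >= max_depth else random.choice(alternatives)
    result = []
    for sym in rhs:
        result.extend(expand(sym, depth + 1, max_depth))
    return result

print(" ".join(expand("S")))   # e.g. "Det N V Det N prep Det N"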

A Context-Free Grammar for English
If we ignore: subcategorization, agreement, gapping
then we can build a context-free grammar for English that does a pretty good job of generating all and only the acceptable sentences, and of building reasonable parse trees for those sentences.
We'll look at whether English is formally context-free later.

Context-Free Grammars
A context-free grammar (CFG) is a 4-tuple:
1. A set of non-terminal symbols N
2. A set of terminals Σ (disjoint from N)
3. A set of productions P, each of the form A → α, where A is a non-terminal and α is a string of symbols from the infinite set of strings (Σ ∪ N)*
4. A designated start symbol S
In our grammar of English: Σ is the set of POS, and N is the set of remaining constituent types, e.g., NP, VP, PP.
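As a data structure the 4-tuple is direct to write down. The sketch below is my own (including the class name CFG and the well-formedness check); it just records N, Σ, P, and S and verifies that every production has the shape A → α with A ∈ N and α over (Σ ∪ N).

from dataclasses import dataclass

@dataclass
class CFG:
    nonterminals: set    # N
    terminals: set       # Σ — here, the set of POS tags
    productions: list    # P: (lhs, rhs) pairs, rhs a tuple over Σ ∪ N
    start: str           # S

    def is_well_formed(self):
        if self.nonterminals & self.terminals:
            return False                      # N and Σ must be disjoint
        vocab = self.nonterminals | self.terminals
        return (self.start in self.nonterminals and
                all(lhs in self.nonterminals and all(s in vocab for s in rhs)
                    for lhs, rhs in self.productions))

g = CFG(
    nonterminals={"S", "NP", "VP", "PP"},
    terminals={"Det", "N", "V", "prep"},
    productions=[("S", ("NP", "VP")), ("VP", ("V", "NP")),
                 ("NP", ("Det", "N")), ("NP", ("NP", "PP")),
                 ("PP", ("prep", "NP"))],
    start="S",
)
print(g.is_well_formed())   # True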

Derivations Using CFGs
The standard formal definition: the language L_G generated by grammar G is the set of strings composed of terminal symbols which can be derived from the designated start symbol S.
L_G = {w | w ∈ Σ* and S ⇒* w}
But we won't generally want our grammar to go all the way down to words. We want to let the lexicon do that. That's why we let Σ be the set of POS. So the grammar may generate strings such as:
N V Det N

Derivations Using CFGs
So we will use the following definition:
L_G = {s | there is a w ∈ Σ* such that S ⇒* w and s can be derived from w by substituting words for POS as licensed by the lexicon}
Note that this doesn't change the formal picture. We could instead augment our grammar with tens of thousands of rules of the form:
N → phlogiston
This is a system design decision.
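The design decision described here is easy to mimic: let the grammar stop at POS tags, and let a separate lexicon expand each tag into words. The snippet below is a toy illustration; the LEXICON entries and the substitutions helper are invented for this example.

LEXICON = {
    "Det":  ["the", "a"],
    "N":    ["cat", "pizza", "phlogiston"],
    "V":    ["ate", "purred"],
    "Name": ["John", "Mary"],
}

def substitutions(pos_string):
    """Yield every word string obtainable from a POS string via the lexicon."""
    if not pos_string:
        yield []
        return
    head, *rest = pos_string
    for word in LEXICON.get(head, []):
        for tail in substitutions(rest):
            yield [word] + tail

# The grammar derived the POS string "N V Det N"; the lexicon turns it into
# 36 word strings, among them "cat ate the pizza".
for sentence in substitutions(["N", "V", "Det", "N"]):
    print(" ".join(sentence))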

Context-Free Grammars and Parse Trees
Grammar:
S → NP VP
NP → Name
NP → Det N
VP → V NP
Parse tree for "John ate the pizza":
(S (NP (Name John))
   (VP (V ate)
       (NP (Det the) (N pizza))))
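For comparison, the same toy grammar and sentence can be run through an off-the-shelf parser. The sketch below uses NLTK, which is my choice of toolkit here, not something prescribed by the slides.

import nltk

grammar = nltk.CFG.fromstring("""
    S    -> NP VP
    NP   -> Name | Det N
    VP   -> V NP
    Name -> 'John'
    Det  -> 'the'
    N    -> 'pizza'
    V    -> 'ate'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("John ate the pizza".split()):
    print(tree)
    # (S (NP (Name John)) (VP (V ate) (NP (Det the) (N pizza))))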

Long Distance Dependencies
Who did she say she saw ____ coming down the hill?
She did say she saw who coming down the hill.
The boy she saw coming down the road was crying.
The boy she saw _____ coming down the road was crying.

Long Distance Dependencies – A Linguistic Solution
Transformational Grammar (Chomsky, 1965): a context-free grammar generates base forms; a transformational component moves constituents around and may delete them from the surface form.
But how can we run these rules backwards? This approach went out of fashion at least 20 years ago.

Long Distance Dependencies – Computational Solutions
Augmented Transition Networks: allow arbitrary actions on the arcs. These permit insertions and movements of constituents. But any procedural solution won't be reversible for generation.
Unification systems: declarative patterns for assigning constituents to fill subcategorization slots.
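The declarative flavor of unification can be illustrated with flat feature structures: two structures unify exactly when they assign no conflicting values. The toy function below is my own simplification; real unification also handles nested and reentrant structures.

def unify(fs1, fs2):
    """Merge two flat feature structures; return None if any feature conflicts."""
    merged = dict(fs1)
    for feature, value in fs2.items():
        if feature in merged and merged[feature] != value:
            return None                     # conflicting values: unification fails
        merged[feature] = value
    return merged

np_features = {"num": "pl", "person": 3}    # e.g. "the boys"
vp_sg = {"num": "sg"}                       # e.g. "wants to go"
vp_pl = {"num": "pl"}                       # e.g. "want to go"

print(unify(np_features, vp_sg))   # None: *The boys wants to go
print(unify(np_features, vp_pl))   # {'num': 'pl', 'person': 3}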

Spoken Language Syntax
Speech is collected in utterances rather than in text. Spoken language is looser than written language, with more pauses, ‘nonverbal events’, and disfluencies such as er, uh, um.
Sample spoken language utterances from users interacting with ATIS.

Spoken Language Syntax The repair often has the same structure as the constituent immediately before the interruption point.