Class 1a. Introduction to the enterprise

1 CAS LX 522 Syntax I

2 Some things we know Is this English? Why? The cat slept.
Slept the cat. Cat slept the. Cat the slept. Why?

3 The task
What do we know? The comes before cat, and cat comes before slept.
Try to generalize. Slept is the verb; maybe this holds of all verbs. The cat is the subject; maybe this holds of all subjects. Subjects contain the and a noun, with the coming first. An English sentence has a subject followed by a verb.
Formalize (make precise). Nouns: cat, dog. Verbs: slept, yawned. [Sentence [Subject the Noun ] Verb ]

4 The task
Check: [Sentence [Subject the Noun ] Verb ] The cat slept. The dog yawned. The cat yawned. The dog slept.
Look at further data (predictions): The cat chased the dog. This is an English sentence, but our schema cannot produce it. Our “theory of English sentences” is insufficient. We need to revise/extend it.
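The check can be sketched in a few lines of Python. This is only an illustration, assuming the tiny lexicon given so far; the names `NOUNS`, `VERBS`, and `generate` are hypothetical and not part of the course material.

```python
from itertools import product

# Lexicon from the slides so far.
NOUNS = ["cat", "dog"]
VERBS = ["slept", "yawned"]

def generate():
    """Every word string allowed by [Sentence [Subject the Noun ] Verb ]."""
    return {f"the {noun} {verb}" for noun, verb in product(NOUNS, VERBS)}

predicted = generate()
for sentence in sorted(predicted):
    print(sentence)

# The counterexample is good English but is not generated by the schema.
print("the cat chased the dog" in predicted)
```

The schema yields exactly the four checked sentences, and the membership test comes out false for the counterexample, which is what forces the revision.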

5 The task Consider the counterexample (or the class of counterexamples) to understand where the current theory falls short. The cat chased the dog. The dog is probably the same kind of thing as the cat, but we don’t want to call it a “subject” (it’s traditionally called the “object”). It contains the and a noun, and the noun seems to be the most important part. Since it contains more than one word, we can call it a “phrase”—it’s not a whole sentence, but it’s more than a word. So, we’ll call it a “noun phrase.”

6 The task Consider the counterexample (or the class of counterexamples) to understand where the current theory falls short. The cat chased the dog. In this English sentence, there is a noun phrase both before and after the verb. So, in addition to our previous schema, we add a second one. Theory of English sentences: [Sentence [NP the Noun ] Verb ] [Sentence [NP the Noun ] Verb [NP the Noun ] ]

7 Lather, rinse, repeat And the process continues. The cat chased a dog.
A cat chased the dog. A cat chased a dog. It looks like an NP can either have the or a as its first element. Thus: Theory of English sentences: [Sentence [NP the Noun ] Verb ] [Sentence [NP a Noun ] Verb ] [Sentence [NP the Noun ] Verb [NP the Noun ] ] [Sentence [NP the Noun ] Verb [NP a Noun ] ] [Sentence [NP a Noun ] Verb [NP the Noun ] ] [Sentence [NP a Noun ] Verb [NP a Noun ] ]

8 Generalizing What we’ve ended up with is a bit clumsy, but we can now generalize our schemas to make this more compact: [NP the Noun ] [NP a Noun ] [Sentence NP Verb ] [Sentence NP Verb NP] Not only does this reduce the amount we have to write down, but it actually makes a more profound prediction: If this much of our theory of English sentences is right, then anything that can be a noun phrase subject can also be a noun phrase object. This is not just making our notation more compact, but it is a substantive addition to the theory.

There are some further ways we can consolidate our theory of English sentences by using some common notational tools. X is optional: (X) Either Y or Z: {Y/Z} Thus: [Sentence NP Verb (NP) ] [NP {the/a} Noun ] Unlike our introduction of a separate schema for NP, this change is not a substantive change to our theory of English sentences; it is just a shorthand for the same theory.
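To see that the shorthand adds nothing new, it can be spelled back out mechanically. The sketch below is an illustration only: the function name `expand` is hypothetical, and it assumes the symbols are separated by spaces and that (X) and {Y/Z} are never nested.

```python
from itertools import product

def expand(schema):
    """Spell out a schema written with (X) and {Y/Z} as a list of plain schemas."""
    choices = []
    for token in schema.split():
        if token.startswith("(") and token.endswith(")"):
            choices.append([token[1:-1], None])      # (X): X is present or absent
        elif token.startswith("{") and token.endswith("}"):
            choices.append(token[1:-1].split("/"))   # {Y/Z}: either Y or Z
        else:
            choices.append([token])                  # ordinary symbol
    return [" ".join(t for t in combo if t is not None)
            for combo in product(*choices)]

print(expand("[Sentence NP Verb (NP) ]"))
print(expand("[NP {the/a} Noun ]"))
```

Each shorthand schema expands to the two separate schemas it abbreviates. The same function also spells out data written this way, such as "The cat chased {a/the} dog."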

10 The grumpy cat As a demonstration of the benefit of introducing a separate NP schema, consider: The grumpy cat chased the unhappy dog. How can we extend our theory of English sentences to allow for this sentence? What other word sequences are predicted to be English sentences? Are they?

11 Now, what are we doing? Ok, so we have the beginnings of a theory of English sentences. But what is it? As we’ve developed it, it is a description of sentences of English, what we might need if we wanted to program a computer to produce English sentences. But it is also a subset of what English speakers know about English. You may or may not have previously thought about the fact that subjects precede verbs and objects follow verbs (or the analog in your native language), but you knew it nevertheless. You could identify sequences of words that did not have this property as not being part of your language, but it’s tacit knowledge. As such, we have to study this knowledge indirectly, based on what are judged to be valid sentences and what aren’t.

An English speaker has a complex system of knowledge that allows him/her to distinguish between sentences of English and non-sentences of English. We’ll refer to this system as a grammar. At its simplest, a grammar is a means of deciding whether a sequence of words is grammatical (e.g., a sentence of English) or not. We’re studying the properties of that system. It’s not always obvious what it is that is wrong with non-sentences, but still the judgments (intuitions) are clear.

*Big that under staple run the jump swim. *The dog are snoring. These are ungrammatical: there is a problem with their form, and they are not English. We write * to indicate this. My toothbrush is pregnant again. This is nonsensical, given our knowledge about the world (not about English), but it is grammatical. As I knitted the sock fell to the floor. The horse raced past the barn fell. The rat the cat the dog chased caught escaped adeptly. These are interestingly difficult to parse, but once you “get it,” they are fine (if clumsy) sentences of English. There are many things that can go wrong with a string. It might sound bad because it is ungrammatical, unparseable, or semantically anomalous. The grammar of a language assigns the stars, and it’s the grammar we seek to explain. So this is our primary data.

In describing data, people will often use the (), {} shorthand notation to indicate optionality or options: Pat (quickly) ran to the bank. Pat ran to the bank. Pat quickly ran to the bank. Pat washed (*quickly) the asparagus. Pat washed the asparagus. *Pat washed quickly the asparagus. The dish ran away with *(the) spoon. The dish ran away with the spoon. *The dish ran away with spoon. The cat chased {a/the} dog. The cat chased a dog. The cat chased the dog.

I sat by the bank. Sometimes we might have reason to expect ambiguity that is not there. We also use an asterisk to indicate this, placed on a disambiguating continuation: the string cannot express this meaning. How did John say Mary fixed the car? With a wrench. In a high-pitched voice. How did John ask if Mary fixed the car? *With a wrench.

Bill told her mother that Mary is a genius. Bill told her that Mary is a genius. I told Mary that Pat gave a book to me. Who did I tell that Pat gave a book to me? *Who did I tell Mary that gave a book to me? Who did I tell Mary that Pat gave a book to? I loaned Mary the book Pat gave me. Who did I loan the book Pat gave me? *Who did I loan Mary the book gave me? *Who did I loan Mary the book Pat gave?

17 How do people know this? All native speakers of English know this.
Little kids weren’t told these rules (or punished for violating them)… “You can’t question a subject in a complement embedded with that” “You can’t use a proper name as an object if the subject is co-referential.”

18 Two questions What do people know about their language?
Including things we know “unconsciously” How do people come to know it? Tricky question for things that we don’t know we know.

19 Systematicity What people eventually end up with is a system with which they can produce (and rate) sentences. A grammar. Even if you’ve never heard these before, you know which one is “English” and which one isn’t: Eight very lazy elephants drank brandy. Eight elephants very lazy brandy drank. Kids say wugs.

Adults know if a given sentence S is grammatical or ungrammatical. This is part of the knowledge kids gain through language acquisition. Kids hear grammatical sentences (positive evidence) Kids are not generally told which sentences are ungrammatical (no negative evidence)

21 Positive and negative evidence
One of the striking things about child language is how few errors children actually make. For negative feedback to work, the kids have to make the errors (so that they can get the negative response). But they don’t make the errors. (Kids do make errors, but not of the kind that one might expect if they were just trying to extract patterns from the language data they hear.)

22 Poverty of the stimulus
What is the next number in this sequence? 1, 2, 3, __ How do you form a yes-no question? Pat will leave. Will Pat leave? The book that you were reading was good. *Book the that you were reading was good? *Were the book that you reading was good? Was the book that you were reading good?
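The two starred questions are what simple string-based rules would produce. As a rough Python sketch, compare a linear rule (front the first auxiliary in the string) with a structure-dependent one (front the auxiliary that follows the whole subject). The function names and the toy auxiliary list are assumptions for illustration, and the subject/predicate split is supplied by hand; the code does not discover the structure itself.

```python
AUXILIARIES = {"will", "was", "were"}  # assumed toy list, enough for these examples

def front_first_aux(words):
    """Linear rule: move the first auxiliary in the string to the front."""
    i = next(k for k, w in enumerate(words) if w in AUXILIARIES)
    return [words[i]] + words[:i] + words[i + 1:]

def front_main_aux(subject, predicate):
    """Structure-dependent rule: move the auxiliary that follows the whole subject."""
    aux, rest = predicate[0], predicate[1:]
    return [aux] + subject + rest

simple = "pat will leave".split()
print(" ".join(front_first_aux(simple)))
print(" ".join(front_main_aux(["pat"], ["will", "leave"])))

subject = "the book that you were reading".split()
predicate = "was good".split()
print(" ".join(front_first_aux(subject + predicate)))   # the starred string
print(" ".join(front_main_aux(subject, predicate)))     # the grammatical question
```

Both rules agree on "Pat will leave," so simple data of that kind cannot distinguish them. Only the example with an auxiliary inside the subject separates the two, and there the linear rule gives the ungrammatical result.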

23 The “Language instinct”
The linguistic capacity is part of being human. Like having two arms, ten fingers, a vision system, humans have a language faculty. The language faculty (tightly) constrains what kinds of languages a child can learn. =“Universal Grammar” (UG).

24 But languages differ English, French: Subject Verb Object (SVO)
John ate an apple. Pierre a mangé une pomme. Japanese, Korean: Subject Object Verb (SOV) Taroo-wa ringo-o tabeta. Chelswu-ka sakwa-lul mekessta. Irish, Arabic (VSO), Malagasy (VOS), …

25 But languages differ English: Adverbs before verbs
Mary quickly eats an apple. (also: Mary ate an apple quickly) *Mary eats quickly an apple. French: Adverbs after verbs Geneviève mange rapidement une pomme. *Geneviève rapidement mange une pomme.

26 Parameters We can categorize languages in terms of their word order: SVO, SOV, VSO. This is a parameter by which languages differ. The dominant formal theory of first language acquisition holds that children have access to a set of parameters by which languages can differ; acquisition is the process of setting those parameters. What are the parameters? What are the “universal” principles of grammar?
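As a toy illustration of what "setting a parameter" means for word order, the Python sketch below treats the order as a single setting applied to the same three constituents. The function `linearize` and its arguments are hypothetical, and real parameters are not assumed to be simple permutation strings like this.

```python
def linearize(subject, verb, obj, order):
    """Arrange the three constituents according to a word-order setting such as 'SVO'."""
    parts = {"S": subject, "V": verb, "O": obj}
    return " ".join(parts[symbol] for symbol in order)

print(linearize("John", "ate", "an apple", "SVO"))        # English setting
print(linearize("Taroo-wa", "tabeta", "ringo-o", "SOV"))  # Japanese setting
```

The procedure is the same for both languages; only the value of `order` changes.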

27 The enterprise The data we will primarily be concerned with are native speaker intuitions. Native speakers, faced with a sentence S, know whether the sentence S is part of their language or isn’t. These intuitions are highly systematic. We want to uncover the system (which is unconscious knowledge) behind the intuitions of native speakers—their knowledge of language.

28 I-language We are studying the system behind one person’s pattern of intuitions. Speakers growing up in the same community have very similar knowledge, but language is an individual thing (“I-language”). One doesn’t need to ask the Académie française whether Geneviève rapidement mange une pomme is a sentence of French. One knows. The language of a community can be characterized, but it is external to the speaker (“E-language”): not any one person’s knowledge, but a generalization over many people’s I-languages. For example, Parisian French.

29 Competence We are also concerned with what a person knows—what characterizes a person’s language competence. We are in general not concerned here with how a person ends up using this knowledge (performance). You still have your language competence when you are sleeping, in the absence of any performance. Being drunk doesn’t make one think “bought some John coffee” is English, though perhaps one might say it.

30 Prescriptive rules Another thing we need to be cautious of is prescriptive rules. Often prescriptive rules of “good grammar” turn out to be impositions on our native grammar which run counter to our native competence. After all, why did they need to be rules in the first place?

31 Prescriptive rules Prepositions are things you don’t end a sentence with. It is important to religiously avoid splitting infinitives. Remember: Capitalize the first word after a colon. Don’t be so immodest as to say I and John left; say John and I left instead. Impact is not a verb. The book which you just bought is offensive.

32 Prescriptive rules When making grammaticality judgments (or when asking others to make grammaticality judgments), we must do our best to factor out prescriptive rules (learned explicitly, e.g., in school). We’re not interested in studying the prescriptive rules; we could just look them up, and they aren’t likely to tell us anything deep about the makeup of the human mind. They’re really just a “secret handshake,” allowing educated people to detect one another.

33 Syntax as science Syntax, as practiced here, is a scientific enterprise. This means, in particular, approaching syntax using the scientific method. Step 1: Gather observations (data) Step 2: Make generalizations Step 3: Form hypotheses Step 4: Test predictions made by these hypotheses, returning to step 1.

34 Syntax as science This is pretty much the way other scientific disciplines work… biology, chemistry, physics. We may start out with a kind of “folk understanding” of a field. For example, you push something and it moves. You stop pushing, and it stops. The sun revolves around the earth from East to West, followed by the moon. Water is a basic element, like fire. Whales are very big fish, like dolphins, or tuna, but bigger. Ockham’s Razor: posit as few concepts and relations as we can get away with. A leaner theory is a better theory. A more easily falsifiable theory is a better theory too.

35 Levels of adequacy If our hypotheses can predict the existence of the grammatical sentences in a corpus (a set of grammatical sentences), they are observationally adequate. Note: the grammar described by “some number of words appear in some order” is observationally adequate, for pretty much any language. This is not a very difficult or satisfying level of adequacy to reach. Nor is it disprovable, but it hasn’t really advanced our understanding of the world. If our hypotheses can predict the native-speaker intuitions about which sentences are grammatical and which are ungrammatical, they are descriptively adequate.

36 Levels of adequacy If we can take a descriptively adequate set of hypotheses one step further and account not only for the native speaker judgments but also for how children come to have these judgments, our hypotheses are explanatorily adequate. It’s this last level that we are hoping to achieve. Basic principles. Parameters of variation. How to set the parameters from the child’s input.

English has an infinite number of sentences. Any natural language does. John said that English has an infinite number of sentences. Mary said that John said that English has an infinite number of sentences. Pat said that Mary said that John said that English has an infinite number of sentences. Tracy said that Pat said that Mary said that John said that English has an infinite number of sentences. Chris said that Tracy said that Pat said that Mary said that John said that English has an infinite number of sentences. If S is a sentence and N is a name, N said that S is also a sentence. S → N said that S. Some of the earliest work in grammatical theory was done by trying to state rules of this form, the goal being to generate the sentences of a language.
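Because the rule's output is again a sentence, it can be applied to its own result any number of times. A minimal Python sketch (the function name `embed` is hypothetical):

```python
def embed(sentence, names):
    """Apply the rule 'N said that S' once per name, innermost name first."""
    for name in names:
        sentence = f"{name} said that {sentence}"
    return sentence

base = "English has an infinite number of sentences"
print(embed(base, ["John"]))
print(embed(base, ["John", "Mary", "Pat"]))
```

Every additional name gives a longer sentence, so there is no longest one. A finite rule generates an unbounded set of sentences.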

Serious scientific study of sentence structure of this kind generally began in the 1950s, driven to a large extent by the work of Noam Chomsky. It’s now half a century later, and we have learned a lot about how syntax works.

Progress was incremental, and often required revising our assumptions about how sentences are really put together. Data was examined, generalizations were arrived at, hypotheses were formed, predictions were tested—and often led to revisions of the generalizations and the hypotheses, and so forth.

Two goals of the class: Think like a syntactician. Be able to read (relatively recent) books, articles, etc. about syntax. It’s not really enough to just know what people concluded; we need to understand why they concluded what they did.

S → NP VP. VP → V (NP). Mid-70’s, X-Bar Theory (a generalization about what are possible phrase structure rules, or PSRs). In the 80’s, a fairly significant shift to Government and Binding Theory (viewing grammar a little less like a computer program). Very productive. In the 90’s, another shift to the Minimalist Program (an attempt at simplification, as well as a change in philosophy).
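Rules of this kind read very much like a program, which is part of what the later shift moved away from. As a hedged illustration, here is a small Python recognizer for the two rules above. It assumes the NP schema from earlier in the class (the or a followed by a noun) and a toy lexicon; the function names are hypothetical.

```python
DETERMINERS = {"the", "a"}
NOUNS = {"cat", "dog"}
VERBS = {"slept", "yawned", "chased"}

def parse_np(words, i):
    """NP: the or a, then a noun. Return (bracketing, next index) or None."""
    if i + 1 < len(words) and words[i] in DETERMINERS and words[i + 1] in NOUNS:
        return f"[NP {words[i]} {words[i + 1]}]", i + 2
    return None

def parse_vp(words, i):
    """VP: a verb with an optional object NP, taken if one is present."""
    if i < len(words) and words[i] in VERBS:
        obj = parse_np(words, i + 1)
        if obj:
            return f"[VP {words[i]} {obj[0]}]", obj[1]
        return f"[VP {words[i]}]", i + 1
    return None

def parse_s(sentence):
    """S: an NP then a VP. Return a bracketing if the whole string is covered, else None."""
    words = sentence.lower().rstrip(".").split()
    np = parse_np(words, 0)
    if np is None:
        return None
    vp = parse_vp(words, np[1])
    if vp is None or vp[1] != len(words):
        return None
    return f"[S {np[0]} {vp[0]}]"

print(parse_s("The cat chased a dog."))
print(parse_s("The cat slept."))
print(parse_s("Slept the cat."))
```

Note that this sketch also accepts "The cat slept the dog," because nothing in these two rules says which verbs take objects.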

