Is natural language regular? Context-free? (chapter 13)
Is English a regular language? (Chomsky, 1956) Consider the syntactic structures:
If S1, then S2
Either S3, or S4
The man who said S5 is arriving today
Note that “if” must be followed by “then”, and “either” must be followed by “or”. We can embed these structures, one in another, to create more complicated sentences:
Is English a regular language? If either the man who said S5 is arriving today or the man who said S5 is arriving tomorrow, then the man who said S6 is arriving the day after… Let’s relabel the words in this sentence using the rules: if → a, then → a, either → b, or → b, other words → ε. We now have the string abba = (ab)(ab)^R, where (ab)^R is ab reversed.
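The relabelling on this slide can be checked mechanically. The following is a minimal sketch, not part of the original slides: the names `RELABEL`, `relabel`, and `is_x_xr` are made up for illustration, and the sentence is the slide's example with the trailing ellipsis dropped.

```python
# Relabelling rules from the slide: the four keywords become symbols,
# every other word maps to the empty string (epsilon).
RELABEL = {"if": "a", "then": "a", "either": "b", "or": "b"}


def relabel(sentence: str) -> str:
    """Replace each whole word by its symbol; unlisted words vanish."""
    words = sentence.lower().replace(",", " ").split()
    return "".join(RELABEL.get(word, "") for word in words)


def is_x_xr(s: str) -> bool:
    """True if s = x + reversed(x) for some x (an even-length palindrome)."""
    return len(s) % 2 == 0 and s == s[::-1]


sentence = (
    "If either the man who said S5 is arriving today "
    "or the man who said S5 is arriving tomorrow, "
    "then the man who said S6 is arriving the day after"
)
labels = relabel(sentence)
print(labels)           # abba
print(is_x_xr(labels))  # True
```

Matching whole words matters here: "tomorrow" contains the letters "or" but is not the keyword "or", so it maps to ε.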
Is English a regular language? Nesting “if/then” and “either/or” clauses allows us to create sentences in English that behave exactly like the language xx^R, where x is a string of one or more a’s followed by one or more b’s, and x^R is x reversed. We know (for example, from the pumping lemma) that xx^R is not a regular language, so English can’t be a regular language!
English isn’t regular: Take Two (Partee et al., 1990): Center-embedded structures
The cat likes tuna fish.
The cat the dog chased likes tuna fish.
The cat the dog the rat bit chased likes tuna fish.
The cat the dog the rat the elephant admired bit chased likes tuna fish.
We’re building sentences of the form (the + noun)^n (transitive verb)^(n-1) likes tuna fish. So we’re dealing with the language L = x^n y^(n-1) likes tuna fish, but L isn’t regular, so English isn’t regular.
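The pattern (the + noun)^n (transitive verb)^(n-1) can be made concrete with a small generator. This is an illustrative sketch only; `center_embedded` and its argument layout are assumptions, not something taken from the slides.

```python
def center_embedded(head, embedded):
    """Build a center-embedded sentence.

    head     -- the noun that "likes tuna fish"
    embedded -- (noun, verb) pairs, outermost clause first; each verb
                belongs to the noun it is paired with
    """
    nouns = [head] + [noun for noun, _ in embedded]
    # Verbs surface in the reverse order of their subjects: the most
    # deeply embedded clause is closed first.
    verbs = [verb for _, verb in reversed(embedded)]
    words = [f"the {noun}" for noun in nouns] + verbs + ["likes tuna fish"]
    text = " ".join(words)
    return text[0].upper() + text[1:] + "."


print(center_embedded("cat", []))
# The cat likes tuna fish.
print(center_embedded("cat", [("dog", "chased"), ("rat", "bit")]))
# The cat the dog the rat bit chased likes tuna fish.
```

With n nouns the function always emits n-1 verbs, which is exactly the dependency a finite automaton cannot track for unbounded n.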
Is natural language context-free? English seems to be context-free. Many attempts have been made to prove that natural languages aren’t context-free; only 2 have survived publication (Huybregts, 1984; Shieber, 1985), based on the syntax of a dialect of Swiss German. Intuition: Swiss German allows sentences like the following: (dative nouns) (accusative nouns) (dative-taking verbs) (accusative-taking verbs), where the number of dative nouns must match the number of dative-taking verbs, and likewise for the accusatives. This would give us a language of the form a^n b^m c^n d^m, which is not context-free!
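To make the shape a^n b^m c^n d^m concrete, here is a minimal membership check. It is only a sketch (the function name `is_cross_serial` is hypothetical), and it demonstrates the pattern rather than proving anything about context-freeness.

```python
import re


def is_cross_serial(s: str) -> bool:
    """True if s = a^n b^m c^n d^m for some n, m >= 0."""
    # The block order a* b* c* d* is a regular property ...
    match = re.fullmatch(r"(a*)(b*)(c*)(d*)", s)
    if match is None:
        return False
    a, b, c, d = (len(group) for group in match.groups())
    # ... but the two interleaved count constraints are not: the a's must
    # match the c's and the b's must match the d's, so the dependencies cross.
    return a == c and b == d


print(is_cross_serial("aabccd"))   # True  (n = 2, m = 1)
print(is_cross_serial("aabbccd"))  # False (two b's, one d)
```

The crossing of the a/c and b/d dependencies is what a single stack cannot handle: a stack can match nested pairs, as in a^n b^m c^m d^n, but not interleaved ones.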
Is natural language context-free? Examples of Swiss German (word-for-word gloss, followed by the English translation):
…we [John] DAT [the house] ACC [helped paint].
…we helped John paint the house.
…we [the children] ACC [John] DAT [the house] ACC [have wanted to let help paint].
…we have wanted to let the children help John paint the house.
Complexity and Human Processing What do all of the example sentences we’ve looked at have in common? Complexity: The cat the dog the rat the elephant admired bit chased likes tuna fish. We’ve been dealing with nesting (if/then, either/or) and center-embedding (like the “zoo sentence”). Are these nested structures difficult because they’re ungrammatical? Probably not. The complex sentences are using the same structures as the simpler sentences.
Complexity and Human Processing The difficulty of these sentences seems to depend on the number of embeddings. But how do we write a grammar that allows N embeddings, but not N+1? Humans just aren’t good at parsing multiple embeddings. We have this problem in many languages, not just English. Why? Humans may have a limited-size stack (Yngve, 1960).
Complexity and Human Processing If our stack size (read: memory) really is limited, maybe English is a regular language! Reason: A context-free grammar parsed with a finite limit on stack size can be modeled by a finite automaton. Swiss German example: a^n b^m c^n d^m. If there’s a limit on the size of n and m, the language is finite, so we can write down all of the grammar rules.
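The bounded-stack idea can be illustrated on the center-embedding pattern x^n y^(n-1) z, where x stands for "the + noun", y for a transitive verb, and z for "likes tuna fish". The sketch below is an assumption-laden illustration, not the slides' construction: `build_dfa`, `accepts`, the state names, and the one-letter alphabet are all hypothetical. It writes out an explicit finite transition table once the embedding depth is capped.

```python
def build_dfa(max_depth: int) -> dict:
    """Transition table for { x^n y^(n-1) z : 1 <= n <= max_depth }."""
    delta = {}
    # ("noun", k): k nouns read so far; at most max_depth are allowed.
    for k in range(max_depth):
        delta[(("noun", k), "x")] = ("noun", k + 1)
    for k in range(1, max_depth + 1):
        owed = k - 1  # verbs that must follow k nouns
        if owed == 0:
            delta[(("noun", k), "z")] = "accept"
        else:
            delta[(("noun", k), "y")] = ("verb", owed - 1)
    # ("verb", j): j verbs are still owed before "likes tuna fish".
    for j in range(max_depth - 1):
        if j == 0:
            delta[(("verb", 0), "z")] = "accept"
        else:
            delta[(("verb", j), "y")] = ("verb", j - 1)
    return delta


def accepts(delta: dict, symbols: str) -> bool:
    """Run the automaton; a missing transition means rejection."""
    state = ("noun", 0)
    for symbol in symbols:
        state = delta.get((state, symbol))
        if state is None:
            return False
    return state == "accept"


dfa = build_dfa(3)
print(accepts(dfa, "xxxyyz"))    # True:  depth 3 is within the limit
print(accepts(dfa, "xxxxyyyz"))  # False: depth 4 exceeds the limit
```

The number of states grows with `max_depth` but stays finite for any fixed limit, which is the point of the slide: the counter that would need an unbounded stack is folded into the state names.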