Psych156A/Ling150: Psychology of Language Learning Lecture 19 Learning Structure with Parameters.

Slides:



Advertisements
Similar presentations
Reciprocal Teaching: Session 1. Twilight Course Overview Session 1: An Introduction to Reciprocal Teaching Introduction to the 4 key strategies used in.
Advertisements

How Children Acquire Language
The Nature of Language Learning
Psych 156A/ Ling 150: Acquisition of Language II Lecture 12 Poverty of the Stimulus I.
Foundations of psycholinguistics Week 3 The beginnings of language acquisition Vasiliki (Celia) Antoniou.
Psych 156A/ Ling 150: Acquisition of Language II Lecture 17 Learning Language Structure.
Main points of Interlanguage, Krashen, and Universal Grammar
Language, Mind, and Brain by Ewa Dabrowska Chapter 9: Syntactic constructions, pt. 1.
Week 3a. UG and L2A: Background, principles, parameters CAS LX 400 Second Language Acquisition.
1 Language and kids Linguistics lecture #8 November 21, 2006.
Learning linguistic structure with simple recurrent networks February 20, 2013.
Second Language Acquisition Video series with Dr. Frank Tuzi
Psych 156A/ Ling 150: Acquisition of Language II Lecture 14 Poverty of the Stimulus III.
Module 14 Thought & Language. INTRODUCTION Definitions –Cognitive approach method of studying how we process, store, and use information and how this.
Explaining first language acquisition
Components important to the teaching of reading
Introduction to Linguistics and Basic Terms
PSY 369: Psycholinguistics Language Acquisition: Morphology.
Psych 156A/ Ling 150: Acquisition of Language II Lecture 6 Words in Fluent Speech II.
Language, Mind, and Brain by Ewa Dabrowska Chapter 9: Syntactic constructions, pt. 2.
Psych 156A/ Ling 150: Psychology of Language Learning Lecture 11 Poverty of the Stimulus II.
Psych 56L/ Ling 51: Acquisition of Language Lecture 8 Phonological Development III.
Psych 156A/ Ling 150: Acquisition of Language II Lecture 12 Poverty of the Stimulus II.
Psych 56L/ Ling 51: Acquisition of Language Lecture 13 Development of Syntax & Morphology III.
Psych 156A/ Ling 150: Psychology of Language Learning Lecture 14 Learning Language Structure.
Psych 156A/ Ling 150: Acquisition of Language II Lecture 13 Learning Biases.
Psych156A/Ling150: Psychology of Language Learning Lecture 18 Language Structure II.
On learning a Language-21 Today Review theories on language learning: Behaviorist psychology (Skinner) Universal Grammar (Chomsky) Monitor Theory (Krashen)
Lecture 1 Introduction: Linguistic Theory and Theories
Psych156A/Ling150: Psychology of Language Learning Lecture 17 Language Structure.
Chapter 9: Thinking, Language, and Intelligence
Psych 56L/ Ling 51: Acquisition of Language Lecture 2 Introduction Continued.
Psych 56L/ Ling 51: Acquisition of Language Lecture 8 Phonological Development III.
Psych 156A/ Ling 150: Acquisition of Language II Lecture 16 Language Structure II.
Cognitive Development: Language Infants and children face an especially important developmental task with the acquisition of language.
Infant Speech Perception & Language Processing. Languages of the World Similar and Different on many features Similarities –Arbitrary mapping of sound.
Language.  Our spoken, written, or signed words and the ways we combine them as we think and communicate  Human essence: the qualities of the mind are.
What is Language? Education 388 Lecture 3 January 23, 2008 Kenji Hakuta, Professor.
The Communicative Language Teaching Lecture # 18.
Adele E. Goldberg. How argument structure constructions are learned.
Statistical Learning in Infants (and bigger folks)
Simulated Evolution of Language By: Jared Shane I400: Artificial Life as an approach to Artificial Intelligence January 29, 2007.
Cognitive and Language Development Pertemuan 4 Matakuliah: E Psikologi Pendidikan Tahun: 2010.
First Language Acquisition
Total Physical Response (TPR)
Language.
Psych 156A/ Ling 150: Acquisition of Language II
Language Communication is part of cognition
Language Spoken, Gestured or Written words and the way we combine them as we think and communicate Does language truly set us apart from all other species?
Psych156A/Ling150: Psychology of Language Learning Lecture 15 Poverty of the Stimulus II.
First Language Acquisition
Language Mr. Koch AP Psychology Forest Lake High School.
1 Vocabulary acquisition from extensive reading: A case study Maria Pigada and Norbert Schmitt ( 2006)
Chapter 3 Language Acquisition: A Linguistic Treatment Jang, HaYoung Biointelligence Laborotary Seoul National University.
Psych 156A/ Ling 150: Psychology of Language Learning Lecture 9 Words in Fluent Speech II.
The development of speech production
PSYC 206 Lifespan Development Bilge Yagmurlu.
Language, Mind, and Brain by Ewa Dabrowska
Psych156A/Ling150: Psychology of Language Learning
2nd Language Learning Chapter 2 Lecture 4.
Theories of Language Development
Psych 156A/ Ling 150: Psychology of Language Learning
THE NATURE of LEARNER LANGUAGE
Language.
Over the past fifty years, three main theoretical positions have been advanced to explain language development from infancy through the early school years:
Psych156A/Ling150: Psychology of Language Learning
Over the past fifty years, three main theoretical positions have been advanced to explain language development from infancy through the early school years:
Traditional Grammar VS. Generative Grammar
Theories of Language Acquisition
Psychology Chapter 8 Section 5: Language.
Presentation transcript:

Psych156A/Ling150: Psychology of Language Learning Lecture 19 Learning Structure with Parameters

Announcements Next class: Review session for final - Review homework and quiz questions, come in with questions to go over - If you want, you may me which questions you would like to discuss in class. We’ll prioritize based on how many people want to discuss any given question. - Remember: review questions are available for the last 3 lectures (“Structure & Learning Structure”). These are fair game for the final. HW6: average 33.2 out of 43

Language Variation: Summary While languages may differ on many levels, they have many similarities at the level of language structure (syntax). Even languages with no shared history seem to share similar structural patterns. One way for children to learn the complex structures of their language is to have them already be aware of the ways in which human languages can vary. Then, they listen to their native language data to decide which patterns their native language follows. Languages can be thought to vary structurally on a number of linguistic parameters. One purpose of parameters is to explain how children learn some hard-to-notice structural properties.

Learning Structure with Statistical Learning: The Relation Between Parameters and Probability

Learning Complex Systems Like Language Only humans seem able to learn human languages Something in our biology must allow us to do this. Chomsky: this is what Universal Grammar is - innate biases for learning language that are available to humans because of our biological makeup (specifically, the biology of our brains).

Learning Complex Systems Like Language But obviously language is learned, not just prespecified beforehand. Children learn their native language, not just any old language. However, we see constrained variation across languages: sounds, words, structure. EnglishNavajo

Learning Complex Systems Like Language The big point: need both innate biases & probabilistic learning abilities We need to find a way to explicitly integrate them with each other, so that we can understand how learning language might work. It will likely involve both prior knowledge about language (which may come from the biology of our brains) as well as general-purpose learning strategies like probabilistic/statistical learning. EnglishNavajo

Combining Language-Specific Biases with Probabilistic Learning Statistics for word segmentation (remember Gambell & Yang (2006)) “Modeling shows that the statistical learning (Saffran et al. 1996) does not reliably segment words such as those in child- directed English. Specifically, precision is 41.6%, recall is 23.3%. In other words, about 60% of words postulated by the statistical learner are not English words, and almost 80% of actual English words are not extracted. This is so even under favorable learning conditions”. Unconstrained (simple) statistics: not so good.

Combining Language-Specific Biases with Probabilistic Learning If statistical learning is constrained by language-specific knowledge (Unique Stress Constraint: words have only one main stress), performance increases dramatically: 73.5% precision, 71.2% recall. Constrained statistics - much better! Statistics for word segmentation (remember Gambell & Yang (2006))

Combining Statistical Learning With Language-Specific Biases A big deal: “Although infants seem to keep track of statistical information, any conclusion drawn from such findings must presuppose that children know what kind of statistical information to keep track of.” Ex: Transitional Probability …of rhyming syllables? …of individual sounds (b, a, p, d, …)? …of stressed syllables? No…any syllable sequences. P(pa | da )? language-specific bias

Constraints for Structure-Learning Parameters = constraints on language variation. Only certain rules/patterns are possible. Grammar = combination of language rules. = combination of parameter values. So, use statistical learning to learn which value (for each parameter) that the native language uses for its grammar.

Yang (2004): Variational Learning Idea taken from evolutionary biology: Individual grammars compete against each other in a child’s mind to see which grammar can best analyze the available data. A grammar’s “fitness” is determined by how well the grammar fares with native language data. Llueve It-rains. “It’s raining.” Intuition: Most successful grammar will be the native language grammar. This grammar will “win”, once the child encounters enough native language data.

Yang (2004): Variational Learning Initially, each grammar is equally likely to be the native language grammar. A grammar will have a probability associated with it, which represents that grammar’s likelihood of being the native language grammar. So, initially, all grammars have the same probability. 1/3 3 grammars, G = 3 Initial probability for any given grammar = 1/G = 1/3

Yang (2004): Variational Learning After the child has encountered native language data, some grammars will have been more successful while other grammars will have been less successful. So, the probabilities associated with these grammars will reflect that. The more successful grammars will have a higher probability associated with them Intuition: Most successful grammar will be the native language grammar. This grammar will have a probability near 1.0 once the child encounters enough native language data.

Grammar Success How can some grammars be successful while other grammars are not? Example: Native language data is Vamos 1st-pl-come “We’re coming” One parameter may be whether it’s okay to leave off or drop the subject (+/- subject-drop). Value 1: Must always have a subject (-subject-drop) Value 2: May optionally drop the subject (+subject-drop)

Grammar Success How can some grammars be successful while other grammars are not? Example: Native language data is Vamos 1st-pl-come “We’re coming” Suppose a grammar with the -subject-drop value tried to analyze this data point. It would not be able to since this sentence does not have an overt subject. So, a -subject-drop grammar is not compatible with this data point. Its probability will go down.

Grammar Success How can some grammars be successful while other grammars are not? Example: Native language data is Vamos 1st-pl-come “We’re coming” > Suppose a grammar with the -subject-drop value tried to analyze this data point. It would not be able to since this sentence does not have an overt subject. So, a -subject-drop grammar is not compatible with this data point. Its probability will go down.

Grammar Success How can some grammars be successful while other grammars are not? Example: Native language data is Vamos 1st-pl-come “We’re coming” >.29 However, suppose a grammar with the +subject-drop value tried to analyze this data point. It would be able to since it allows sentences to not have an overt subject. So, a +subject-drop grammar is compatible with this data point. Its probability will go up. 0.5

Grammar Success How can some grammars be successful while other grammars are not? Example: Native language data is Vamos 1st-pl-come “We’re coming” > >.51 However, suppose a grammar with the +subject-drop value tried to analyze this data point. It would be able to since it allows sentences to not have an overt subject. So, a +subject-drop grammar is compatible with this data point. Its probability will go up.

Grammar Success How can some grammars be successful while other grammars are not? Example: Native language data is Vamos 1st-pl-come “We’re coming” >.29 Key point: This data is unambiguous for the +subject-drop value. Only grammars with the +subject-drop parameter value will be able to successfully analyze this data point >.51

Unambiguous Data Unambiguous data from the target language can only be analyzed by grammars that use the target language’s parameter value. This makes unambiguous data very influential data for the child to encounter, since it is incompatible with the parameter value that is incorrect for the target language. Ex: the -subject-drop value is not compatible with sentences that drop the subject subject like Vamos 1st-pl-come “We’re coming”

Unambiguous Data Idea (from Yang (2004)): The more unambiguous data there is, the faster the native language’s parameter value will “win” (reach a probability near 1.0). This means that the child will learn the associated structural pattern faster. Example: the more unambiguous +subject-drop data the child encounters, the faster a child should learn that the native language allows subjects to be dropped

Unambiguous Data Learning Examples Wh-fronting for questions Wh-word moves to the front (like English) Sarah will see who?

Unambiguous Data Learning Examples Wh-fronting for questions Wh-word moves to the front (like English) Who will Sarah will see who ?

Unambiguous Data Learning Examples Wh-fronting for questions Wh-word moves to the front (like English) Who will Sarah will see who ? Wh-word stays “in place” (like Chinese) Sarah will see who?

Unambiguous Data Learning Examples Wh-fronting for questions Parameter: +/- wh-fronting Native language value (English): +wh-fronting Unambiguous data: any (normal) wh-question, with wh-word in front (ex: “Who will Sarah see?”) Frequency of unambiguous data to children: 25% of input Age of +wh-fronting acquisition: very early (before 1 yr, 8 mos)

Unambiguous Data Learning Examples Verb raising Verb moves “above” (before) the adverb/negative word (French) Jean souvent voit Marie Jean often sees Marie Jean pas voit Marie Jean not sees Marie

Unambiguous Data Learning Examples Verb raising Verb moves “above” (before) the adverb/negative word (French) Jean voit souvent voit Marie Jean sees often Marie“Jean often sees Marie.” Jean voit pas voit Marie Jean sees not Marie“Jean doesn’t see Marie.”

Unambiguous Data Learning Examples Verb raising Verb moves “above” (before) the adverb/negative word (French) Jean voit souvent voit Marie Jean sees often Marie“Jean often sees Marie.” Jean voit pas voit Marie Jean sees not Marie“Jean doesn’t see Marie.” Verb stays “below” (after) the adverb/negative word (English) Jean often sees Marie. Jean does not see Marie.

Unambiguous Data Learning Examples Verb raising Parameter: +/- verb-raising Native language value (French): +verb-raising Unambiguous data: verb adverb/negative word data points (“Jean voit souvent Marie”) Frequency of unambiguous data to children: 7% of input Age of +verb-raising acquisition: 1 yr, 8 months

Unambiguous Data Learning Examples Verb Second Verb moves to second phrasal position, some other phrase moves to the first position (German) Sarah das Buch liest Sarah the book reads

Unambiguous Data Learning Examples Verb Second Verb moves to second phrasal position, some other phrase moves to the first position (German) Sarah liest Sarah das Buch liest Sarah reads the book “Sarah reads the book.”

Unambiguous Data Learning Examples Verb Second Verb moves to second phrasal position, some other phrase moves to the first position (German) Sarah liest Sarah das Buch liest Sarah reads the book “Sarah reads the book.” Sarah das Buch liest Sarah the book reads

Unambiguous Data Learning Examples Verb Second Verb moves to second phrasal position, some other phrase moves to the first position (German) Sarah liest Sarah das Buch liest Sarah reads the book “Sarah reads the book.” Das Buch liest Sarah das Buch liest The book reads Sarah“Sarah reads the book.”

Unambiguous Data Learning Examples Verb Second Verb moves to second phrasal position, some other phrase moves to the first position (German) Sarah liest Sarah das Buch liest Sarah reads the book “Sarah reads the book.” Das Buch liest Sarah das Buch liest The book reads Sarah“Sarah reads the book.” Verb does not move (English) Sarah reads the book.

Unambiguous Data Learning Examples Verb Second Parameter: +/- verb-second Native language value (German): +verb-second Unambiguous data: Object Verb Subject data points (“Das Buch liest Sarah”) Frequency of unambiguous data to children: 1.2% of input Age of +verb-second acquisition: ~3 yrs

Unambiguous Data Learning Examples Intermediate wh-words in complex questions (“scope marking”) (Hindi, German) … wer Recht hat? …who right has “…who has the right?”

Unambiguous Data Learning Examples Intermediate wh-words in complex questions (“scope marking”) (Hindi, German) Wer glaubst du wer Recht hat? Who think-2nd-sg you who right has “Who do you think has the right?”

Unambiguous Data Learning Examples Intermediate wh-words in complex questions (“scope marking”) (Hindi, German) Wer glaubst du wer Recht hat? Who think-2nd-sg you who right has “Who do you think has the right?” No intermediate wh-words in complex questions (English) Who do you think who has the right?

Unambiguous Data Learning Examples Intermediate wh-words in complex questions (“scope marking”) (Hindi, German) Wer glaubst du wer Recht hat? Who think-2nd-sg you who right has “Who do you think has the right?” No intermediate wh-words in complex questions (English) Who do you think has the right?

Unambiguous Data Learning Examples Intermediate wh-words in complex questions (“scope marking”) Parameter: +/- intermediate-wh Native language value (English): - intermediate-wh Unambiguous data: complex questions of a particular kind (“Who do you think has the right?”) Frequency of unambiguous data to children: 0.2% of input Age of -intermediate-wh acquisition: > 4 yrs

Unambiguous Data Examples Summary Parameter valueFrequency of unambiguous data Age of acquisition +wh-fronting (English)25%Before 1 yr, 8 months +verb-raising (French)7%1 yr, 8 months +verb-second (German)1.2%3 yrs -intermediate-wh (English)0.2%> 4 yrs The quantity of unambiguous data available in the child’s input seems to be a good indicator of when they will acquire the knowledge. The more there is, the sooner they learn the right parameter value for their native language.

Summary: Variational Learning for Language Structure Big idea: The time course of when a parameter is set depends on how frequent the necessary evidence is in child- directed speech. This falls out from the probabilistic learning framework, where unambiguous data for the native language parameter value punishes the non-native language value. Predictions of variational learning: Parameters set early: more unambiguous data Parameters set late: less unambiguous data These predictions seem to be born out by available data on when children learn certain structural patterns (parameter values) about their native language.

Questions?