LING 696B: Computational Models of Phonological Learning. Ying Lin, Department of Linguistics, University of Arizona.


1 LING 696B: Computational Models of Phonological Learning Ying Lin Department of Linguistics University of Arizona

2 Instructor Education: Beijing University (Information Science), UCLA (Math, Linguistics) Research interest: Computational linguistics Theories and models of learning Phonetics and phonology Language evolution

3 And you? me: Name Background / department / what classes have you taken, etc. Why are you interested in this course? Topics that you would like to be discussed (cf. the shopping list in the syllabus) Rescheduling requests (office hours, etc.)

4 Outline Motivation / inspiration: Why study learning? Why use computational models? Phonological acquisition in the first few years of life Statistical learning by infants and by machines Course business

5 Why study learning? Phonology: study of the knowledge of sound patterns (LING 510) Inventories, contrasts, features Phonotactics: blick, not bnick or rbick Alternations: in + possible -> impossible Traditional methodology: identify these patterns by eye, construct grammars Implicitly assuming these patterns are equally important for the learner

6 Why study learning? A different perspective: modeling speakers instead of languages Take a grammar, and ask: If this is what the speaker knows about her language, how can she learn the grammar from the evidence available to her? This can potentially be a test for theories

7 Why study learning? Phonetics: perception, production and acoustic properties of speech sounds (LING 515) Traditional division of labor: Phonology: dealing with form, symbolic Phonetics: dealing with substance, numerical The “interface” problem: how should they talk to each other?

8 Why study learning? The issue of representation 1. How is the speech signal encoded in the speaker’s mind? 2. How do infants “crack the speech code” and become speakers who can use the code? Phonology has provided many conjectures for Q1, but not enough work has been done to address Q2

9 Alternative methodologies Experimental work Extending knowledge of existing patterns to novel forms (many in this department) Miniature / artificial languages that contain relevant patterns of interest (Gerken, Gomez) This class -- computational work

10 Why computational models? Intuitive ideas about learning are often vague Inductive learning: can humans learn any type of generalization? Answer: No The most convincing arguments are computational ones

11 Why computational models? Universal Grammar: the crucial initial bias / hypothesis space that a learner needs to know Computational modeling quantifies such a bias, and makes it assessable with empirical data

12 A quick overview of phonological acquisition Prenatal to newborn 4-day-old French babies prefer listening to French over Russian (Mehler et al, 1988) Prepared to learn any phonetic contrast in human languages (Eimas, 71)

13 A quick overview of phonological acquisition 6 months: Effects of experience begin to surface (Kuhl, 92) months: from a universal perceiver to a language-specific one (Werker & Tees, 84)

14 A quick overview of phonological acquisition months: infants know the sequential patterns of their language English vs. Dutch (Jusczyk, 93) “avoid” vs. “waardig” Weird English vs. better English (Jusczyk, 94) “mim” vs. “thob” Impossible Dutch vs. Dutch (Friederici, 93) “bref” vs. “febr”

15 A quick overview of phonological acquisition 6 months: segment words out of speech stream based on transition probabilities (Saffran et al, 96) pabikutibudogolatupabikudaropi… 8 months: remembering words heard in stories read to them by graduate students (Jusczyk and Hohne, 97)
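A minimal sketch of the transitional-probability idea behind the segmentation result above. It is not the procedure of Saffran et al. It assumes the stream is already split into CV syllables. The four-word lexicon (read off the sample stream on the slide), the word order, the 0.75 threshold and all function names are hypothetical choices for illustration.

```python
from collections import Counter

def syllabify(word):
    """Split a toy word into two-letter CV syllables (assumption: all syllables are CV)."""
    return [word[i:i + 2] for i in range(0, len(word), 2)]

def transitional_probabilities(syllables):
    """Estimate P(next syllable | current syllable) for every adjacent pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

def segment(syllables, tp, threshold):
    """Place a word boundary wherever the transitional probability falls below threshold."""
    words, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        if tp[(prev, nxt)] < threshold:
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words

# Hypothetical toy input: four made-up words concatenated without pauses.
lexicon = ["pabiku", "tibudo", "golatu", "daropi"]
order = [0, 1, 2, 0, 3, 1, 0, 2, 3, 2, 1, 3, 0, 1, 3, 2, 0, 3]
stream = [syl for idx in order for syl in syllabify(lexicon[idx])]

tp = transitional_probabilities(stream)
words = segment(stream, tp, threshold=0.75)
print(words[:5])
```

In this toy stream every within-word transition has probability 1, while transitions across word boundaries are spread over several continuations, so a single threshold separates them. Real speech would not be this clean.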

16 Mechanism: Statistical learning by infants Learning phonetic categories from bi-modal distributions (Maye, Gerken & Werker, 02) Phonotactics (Jusczyk et al, 93) Word segmentation (Saffran et al, 96)
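As an illustration of how two phonetic categories could be recovered from a bi-modal distribution, here is a small sketch that fits a two-component Gaussian mixture with EM. This is one possible model, not the method of Maye, Gerken & Werker. The 8-step continuum and the token counts below are made up for the example, and the function names are hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_two_gaussians(data, iterations=100):
    """EM for a two-component 1-D Gaussian mixture; returns (weights, means, variances)."""
    n = len(data)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    weights = [0.5, 0.5]
    means = [float(min(data)), float(max(data))]  # start the two categories at the extremes
    variances = [grand_var, grand_var]
    for _ in range(iterations):
        # E-step: posterior probability that each token belongs to component 0
        resp0 = []
        for x in data:
            p0 = weights[0] * normal_pdf(x, means[0], variances[0])
            p1 = weights[1] * normal_pdf(x, means[1], variances[1])
            resp0.append(p0 / (p0 + p1))
        # M-step: re-estimate each component from its weighted tokens
        for k in (0, 1):
            resp = [r if k == 0 else 1.0 - r for r in resp0]
            total = sum(resp)
            weights[k] = total / n
            means[k] = sum(r * x for r, x in zip(resp, data)) / total
            var = sum(r * (x - means[k]) ** 2 for r, x in zip(resp, data)) / total
            variances[k] = max(var, 1e-3)  # floor to keep the variance from collapsing
    return weights, means, variances

# Hypothetical exposure counts along an 8-step continuum, with peaks near steps 2 and 7.
bimodal_counts = {1: 1, 2: 4, 3: 2, 4: 1, 5: 1, 6: 2, 7: 4, 8: 1}
tokens = [step for step, count in bimodal_counts.items() for _ in range(count)]

weights, means, variances = fit_two_gaussians(tokens)
print(means)
```

The sketch fixes the number of categories at two. Deciding between one and two categories from the data would need an additional model-comparison step.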

17 Statistical learning by machines A booming enterprise, influencing a number of fields Computer Science Electrical Engineering (signal processing, communication, control, …) Biology Has generated sophisticated tools Motivated by real applications

18 What do those machines have to do with humans? Two kinds of statistical machines 1. Architecture is based on an understanding of the domain. Variables and interactions have clear meanings 2. An input-output device that can be easily applied to any domain, by turning various knobs [Diagram: machine. Input: text, signal, parse trees, … Output: yes/no or some score]

19 Why do we need statistics in the model? Sensory input to the learner is often noisy and ambiguous, and contains much variation The language of probability is the only coherent way of reasoning about uncertainty. The statistical perspective unifies a number of proposals, e.g. Rules vs. analogy Exemplars vs. categories

20 Statistics and UG My take on this: not an “either-or”, but a “both-and” relationship Statistics help build models that can be tested on realistic data Strong / weak assumptions about UG lead to different models Weak bias: exemplars, neural nets Stronger bias: Markov chain phonotactics Even stronger bias: stochastic OT
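To make "Markov chain phonotactics" concrete, here is a rough sketch of a first-order (bigram) model with add-alpha smoothing. It scores a form by the log probability of its segment-to-segment transitions. The toy lexicon, the use of letters as stand-ins for segments, the smoothing constant and the function names are all assumptions for illustration.

```python
import math
from collections import Counter

BOUNDARY = "#"  # marks the word edges

def train_bigrams(words):
    """Count segment bigrams (including word boundaries) over a word list."""
    bigram_counts, context_counts, symbols = Counter(), Counter(), {BOUNDARY}
    for word in words:
        segments = [BOUNDARY] + list(word) + [BOUNDARY]
        symbols.update(segments)
        for a, b in zip(segments, segments[1:]):
            bigram_counts[(a, b)] += 1
            context_counts[a] += 1
    return bigram_counts, context_counts, symbols

def log_prob(word, model, alpha=0.1):
    """Log probability of a word under the bigram model with add-alpha smoothing."""
    bigram_counts, context_counts, symbols = model
    segments = [BOUNDARY] + list(word) + [BOUNDARY]
    total = 0.0
    for a, b in zip(segments, segments[1:]):
        p = (bigram_counts[(a, b)] + alpha) / (context_counts[a] + alpha * len(symbols))
        total += math.log(p)
    return total

# Hypothetical toy lexicon; letters stand in for phonological segments.
toy_lexicon = ["black", "blink", "brick", "stick", "click",
               "bliss", "snack", "lick", "pick", "slick"]
model = train_bigrams(toy_lexicon)

scores = {w: log_prob(w, model) for w in ["blick", "bnick", "rbick"]}
print(scores)
```

Because "bl" is attested in the toy lexicon while "bn" and "rb" are not, the unattested forms get lower scores. The score is gradient rather than a yes/no judgment.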

21 Statistics and UG But, given a set of data Lots of generalizations can be made Lots of descriptive statistics can be counted You must know what to count in order to draw a meaningful conclusion

22 The basic steps of computational modeling Identifying and formulating a problem, with your theoretical commitments Gathering enough data that mirrors a learner’s input This is often in the form of a corpus, searchable by machine Providing an algorithm, carrying out computation, assessing results

23 The basic steps of computational modeling Identifying and formulating a problem Gathering enough data that mirrors a learner’s input This is often in the form of a corpus, searchable by machine Providing an algorithm, carrying out computation, assessing results We will do lots of this in LING 539!

24 Some examples (details to follow later) Learning phonological rules from morphological paradigms (Albright & Hayes, 03) Requires pairs of words, e.g. spling / splung yields the rule i -> u / spl_ng But are these words already on the workbench?
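A simplified sketch of how a rule of the form A -> B / left _ right can be read off a word pair, and how two such rules might be collapsed by keeping only their shared context. This is not the Albright & Hayes learner, which generalizes over features and evaluates rule reliability. The string-based treatment, the second pair (cling / clung) and the function names are assumptions for illustration.

```python
def common_prefix_len(a, b):
    """Length of the longest shared prefix of two strings."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n

def extract_rule(base, derived):
    """Return (A, B, left, right) such that A -> B / left _ right maps base to derived."""
    p = common_prefix_len(base, derived)
    # Shared suffix, measured on what remains after the shared prefix
    s = common_prefix_len(base[p:][::-1], derived[p:][::-1])
    left = base[:p]
    right = base[len(base) - s:]
    return base[p:len(base) - s], derived[p:len(derived) - s], left, right

def generalize(rule1, rule2):
    """Collapse two rules with the same change by keeping only the context they share."""
    a1, b1, left1, right1 = rule1
    a2, b2, left2, right2 = rule2
    if (a1, b1) != (a2, b2):
        return None  # different changes cannot be collapsed in this sketch
    k = common_prefix_len(left1[::-1], left2[::-1])  # shared material adjacent to the change
    shared_left = left1[len(left1) - k:]
    shared_right = right1[:common_prefix_len(right1, right2)]
    return a1, b1, shared_left, shared_right

rule_spling = extract_rule("spling", "splung")
rule_cling = extract_rule("cling", "clung")
general_rule = generalize(rule_spling, rule_cling)
print(rule_spling, general_rule)
```

Here the first pair gives i -> u / spl_ng, and combining it with the second pair leaves the broader context l_ng.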

25 Some examples (details to follow later) Not most of the time: Maybe they are chopped out of longer stretches of sequences (Brent, 96) Chopping is better done with some idea of what sound sequences are allowed (Brent, Yang) Phonotactic learning can be modeled with OT (Hayes, 99) [Diagram: chop “dujulaIkD  kIti” into “Do you like the kitty?”]

26 Some examples (details to follow later) Not most of the time: Maybe they are chopped out of longer stretches of sequences (Brent, 96) Chopping is better done with some idea of what sound sequences are allowed (Brent, Yang) Phonotactic learning can be modeled with OT (Hayes, 99) [Diagram: chop “dujulaIkD  kIti” into “Do you like the kitty?”] But wait, is this what it sounds like to toddlers?

27 Some examples (details to follow later) Probably not when they are young: Maybe they have to form categories first by chopping up waveforms that they have heard (Lin, 05)

28 Some examples (details to follow later) Phonological acquisition is a big problem Infants -- the best learning machine around -- take a large amount of input The whole picture will likely consist of many interacting parts We may have to focus on one smaller problem at a time [Diagram: units, phonotactics, lexicon, morphology]

29 Course business Format of the course: Lectures In-class presentations Guest lectures Requirements: Readings: will be posted on the webpage (coming soon) 4 assignments (exercises / short papers) or a term project

30 Course business me Name Background / department / what classes have you taken, etc. Why are you interested in this course? Topics that you would like to be discussed (cf. the shopping list in the syllabus) Rescheduling requests (office hours, etc.)