Modelling Language Evolution
Lecture 2: Learning Syntax
Simon Kirby, University of Edinburgh
Language Evolution & Computation Research Unit

Multi-layer networks
- For many modelling problems, multi-layer networks are used.
- Three layers are common: an input layer, a hidden layer, and an output layer.
- What do the hidden-node activations correspond to? An internal representation.
- For some problems, networks need to compute an "intermediate" representation of the data.
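As an illustration of the three-layer idea, here is a minimal sketch of a forward pass through such a network; the layer sizes, random weights, and sigmoid activation are assumptions made for the example, not tied to any particular model in this lecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 4, 3, 2                 # illustrative layer sizes
W_ih = rng.normal(0, 0.5, (n_hidden, n_in))     # input -> hidden weights
W_ho = rng.normal(0, 0.5, (n_out, n_hidden))    # hidden -> output weights

def forward(x):
    hidden = sigmoid(W_ih @ x)                  # the "intermediate" internal representation
    output = sigmoid(W_ho @ hidden)
    return hidden, output

hidden, output = forward(np.array([1.0, 0.0, 0.0, 1.0]))
```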

XOR network - step 1
- XOR is the same as OR but not AND:
  - Calculate OR
  - Calculate NOT AND
  - AND the results
[Figure: two hidden units labelled OR and NOT AND, feeding an output unit labelled AND]

XOR network - step 2
[Figure: the full XOR network, with input 1 and input 2 feeding hidden unit 1 (OR) and hidden unit 2 (NOT AND), plus a bias node, all connected to the output (AND) unit]
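A minimal sketch of this two-step construction, with hand-set weights and thresholds chosen purely for illustration (a trained network would find its own values): one hidden unit computes OR, the other NOT AND, and the output unit ANDs the two together.

```python
def step(total, threshold):
    """Threshold unit: fires (1.0) if its summed input reaches the threshold."""
    return 1.0 if total >= threshold else 0.0

def xor_net(i1, i2):
    h_or = step(1.0 * i1 + 1.0 * i2, 1.0)        # OR: fires if either input is on
    h_nand = step(-1.0 * i1 + -1.0 * i2, -1.5)   # NOT AND: fires unless both inputs are on
    return step(1.0 * h_or + 1.0 * h_nand, 2.0)  # AND of the two hidden units

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", int(xor_net(a, b)))    # 0 0 -> 0, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0
```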

Simple example (Smith 2003)
- Smith wanted to model a simple language-using population.
- He needed a model that learned vocabulary:
  - 3 "meanings": (1 0 0), (0 1 0), (0 0 1)
  - 6 possible signals: (0 0 0), (1 0 0), (1 1 0) …
- Used networks for reception and production, mapping between meanings and signals.
- After training, knowledge of language is stored in the weights.
- During reception/production, the internal representation is in the activations of the hidden nodes.
[Figure: the production and reception networks linking meaning and signal, with training and performance phases]
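A minimal sketch in the spirit of this slide, not Smith's actual code, and covering only the production direction: a small backprop-trained network that maps one-hot meanings onto bit-vector signals. The particular meaning-signal pairs, layer sizes, and learning rate are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

meanings = np.eye(3)                                      # (1 0 0), (0 1 0), (0 0 1)
signals = np.array([[1, 0, 0],                            # arbitrary target signals,
                    [1, 1, 0],                            # one per meaning
                    [0, 0, 1]], dtype=float)

rng = np.random.default_rng(1)
W1 = rng.normal(0, 0.5, (4, 3))   # meaning -> hidden weights
W2 = rng.normal(0, 0.5, (3, 4))   # hidden -> signal weights
lr = 0.5

for epoch in range(2000):                                 # plain online backprop
    for m, s in zip(meanings, signals):
        h = sigmoid(W1 @ m)                               # hidden activations: internal representation
        out = sigmoid(W2 @ h)
        d_out = (out - s) * out * (1 - out)               # squared-error gradient at the output
        d_h = (W2.T @ d_out) * h * (1 - h)                # error backpropagated to the hidden layer
        W2 -= lr * np.outer(d_out, h)
        W1 -= lr * np.outer(d_h, m)

# After training, the "knowledge of language" sits in W1 and W2 (the weights);
# producing a signal means running a meaning forward and reading off the output.
print(np.round(sigmoid(W2 @ sigmoid(W1 @ meanings[0]))))
```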

Can a network learn syntax? (Elman 1993)
- An important question for the evolution of language: modelling can tell us what we can do without.
- Can we model the acquisition of syntax using a neural network?
- One problem… sentences can be arbitrarily long.
- How much knowledge of grammar are we born with?

Representing time
- Imagine we presented words one at a time to a network.
- Would it matter what order the words were given? No: each word is a brand new experience.
- The net has no way of relating each experience to what has gone before.
- It needs some kind of working memory.
- Intuitively: each word needs to be presented along with what the network was "thinking about" when it heard the previous word.

The Simple Recurrent Net (SRN)
- At each time step, the input is a new experience plus a copy of the hidden-unit activations at the last time step.
[Figure: SRN architecture with input, hidden, and output layers, plus context units linked to the hidden layer by copy-back connections]
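A minimal sketch of the SRN update described above: the hidden layer receives the current input together with a copy of its own activations from the previous time step, held in the context units. Layer sizes, weights, and the toy input sequence are illustrative, not Elman's.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(2)
n_in, n_hidden, n_out = 5, 8, 5
W_in = rng.normal(0, 0.5, (n_hidden, n_in))      # input -> hidden weights
W_ctx = rng.normal(0, 0.5, (n_hidden, n_hidden)) # context -> hidden weights
W_out = rng.normal(0, 0.5, (n_out, n_hidden))    # hidden -> output weights

def srn_step(x, context):
    hidden = sigmoid(W_in @ x + W_ctx @ context) # current input plus last step's hidden state
    output = sigmoid(W_out @ hidden)
    return output, hidden                        # hidden is copied back as the next context

context = np.full(n_hidden, 0.5)                 # "blank" starting context
for x in np.eye(n_in):                           # a toy sequence of one-hot inputs
    output, context = srn_step(x, context)       # the copy-back happens here
```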

What inputs and outputs?
- How do we force the network to learn syntactic relations? Can we do it without an external "teacher"?
- Answer: the next-word prediction task.
  - Inputs: current word (and context)
  - Outputs: predicted next word
- The error signal is implicit in the data.
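A minimal sketch of how training pairs for the prediction task can be built: the target for each word is simply the word that follows it, so no external teacher is required. The vocabulary and the sample sentence (one of the example sentences used later in this lecture) are illustrative.

```python
vocab = ["boys", "who", "chase", "dogs", "see", "girls"]
sentence = ["boys", "who", "chase", "dogs", "see", "girls"]

def one_hot(word):
    vec = [0.0] * len(vocab)
    vec[vocab.index(word)] = 1.0          # one unit "on" per word
    return vec

# (input, target) pairs: the target for each word is just the next word,
# so the error signal is implicit in the sequence itself.
training_pairs = [(one_hot(current), one_hot(nxt))
                  for current, nxt in zip(sentence, sentence[1:])]
```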

Long-distance dependencies and hierarchy
- Elman's question: how much is innate?
- Many argue that long-distance dependencies and hierarchical embedding are "unlearnable" without an innate language faculty.
- How well can an SRN learn them? Examples:
  1. boys who chase dogs see girls
  2. cats chase dogs
  3. dogs see boys who cats who mary feeds chase
  4. mary walks

First experiments
- Each word is encoded as a single unit "on" in the input (a localist, one-hot representation).

Initial results
- How can we tell if the net has learned syntax? Check whether it predicts the correct number agreement.
- The net gets some things right, but makes many mistakes.
- It seems not to have learned long-distance dependencies (e.g. "boys who girl chase see dog").

Incremental input
- Elman tried teaching the network in stages. Five stages:
  1. 10,000 simple sentences (x 5)
  2. 7,500 simple + 2,500 complex (x 5)
  3. 5,000 simple + 5,000 complex (x 5)
  4. 2,500 simple + 7,500 complex (x 5)
  5. 10,000 complex sentences (x 5)
- Surprisingly, this training regime led to success!
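A minimal sketch of the staged regime listed above, assuming a toy grammar and a train_on stand-in for one pass of next-word-prediction training; neither is Elman's actual grammar or training code.

```python
import random

nouns = ["boys", "girls", "cats", "dogs"]
verbs = ["chase", "see", "feed", "walk"]

def generate_simple():                       # toy stand-in: noun verb noun
    return [random.choice(nouns), random.choice(verbs), random.choice(nouns)]

def generate_complex():                      # toy stand-in: one relative clause
    return [random.choice(nouns), "who", random.choice(verbs), random.choice(nouns),
            random.choice(verbs), random.choice(nouns)]

def train_on(sentence):
    pass                                     # stand-in for one next-word-prediction pass through the SRN

stages = [(10000, 0),                        # stage 1: simple only
          (7500, 2500),                      # stage 2
          (5000, 5000),                      # stage 3
          (2500, 7500),                      # stage 4
          (0, 10000)]                        # stage 5: complex only

for n_simple, n_complex in stages:
    corpus = ([generate_simple() for _ in range(n_simple)] +
              [generate_complex() for _ in range(n_complex)])
    for _ in range(5):                       # each stage's corpus is presented five times
        for sentence in corpus:
            train_on(sentence)
```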

Is this realistic?
- Elman reasons that this is in some ways like children's behaviour: children seem to learn to produce simple sentences first.
- Is this a reasonable suggestion? Where is the incremental input coming from?
- The developmental schedule appears to be a product of changing the input.

Another route to incremental learning
- Rather than the experimenter selecting simple, then complex sentences, could the change come from the network itself?
- Children's data isn't changing… children are changing.
- Elman gets the network to change throughout its "life".
- What is a reasonable way for the network to change? One possibility: memory.

Reducing the attention span of a network
- Destroy memory by setting the context nodes to 0.5.
- Five stages of learning (with both simple and complex sentences):
  1. Memory blanked every 3-4 words (x 12)
  2. Memory blanked every 4-5 words (x 5)
  3. Memory blanked every 5-6 words (x 5)
  4. Memory blanked every 6-7 words (x 5)
  5. No memory limitations (x 5)
- The network learned the task.
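A minimal sketch of the memory manipulation: the context units are reset to 0.5 every few words, and the blanking window widens across stages until memory is unrestricted. The srn_train_step function and the tiny corpus are illustrative stand-ins, not Elman's code.

```python
import random

def srn_train_step(word, context):
    return context                           # stand-in: would do one prediction-training update

def train_with_memory_limit(corpus, window, epochs, hidden_size=8):
    for _ in range(epochs):
        context = [0.5] * hidden_size        # blank context at the start
        since_blank = 0
        for sentence in corpus:
            for word in sentence:
                context = srn_train_step(word, context)
                since_blank += 1
                if window is not None and since_blank >= random.choice(window):
                    context = [0.5] * hidden_size   # wipe the network's working memory
                    since_blank = 0

corpus = [["cats", "chase", "dogs"],
          ["boys", "who", "chase", "dogs", "see", "girls"]]

# Five stages: the blanking window widens, then memory limits are removed entirely.
for window, epochs in [((3, 4), 12), ((4, 5), 5), ((5, 6), 5), ((6, 7), 5), (None, 5)]:
    train_with_memory_limit(corpus, window, epochs)
```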

Counter-intuitive conclusion: starting small
- A fully-functioning network cannot learn syntax.
- A network that is initially limited (but matures) learns well.
- This seems a strange result, suggesting that networks aren't good models of language learning after all.
- On the other hand…
  - Children mature during learning.
  - Infancy in humans is prolonged relative to other species.
  - Ultimate language ability seems to be related to how early learning starts, i.e. there is a critical period for language acquisition.

Next lecture
- We've seen how we can model aspects of language learning in simulations.
- What about evolution?
  - Cultural evolution
  - Individual learning
  - Biological evolution