James L. McClelland SS 100, May 31, 2011


Learning in Neural Networks, with Implications for Representation and Learning of Language James L. McClelland, SS 100, May 31, 2011

A Pattern Associator Network
[Figure: a pattern of activation representing a given input (e.g., the sight of a rose) feeds through a matrix of connections to a set of output units; each output unit's summed input determines p(a=1), producing the corresponding output pattern (e.g., the smell of a rose).]
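Below is a minimal sketch of the forward pass the figure describes: an input pattern passes through a matrix of connection weights, and each output unit's summed input determines p(a=1). The sizes, the random weights, and the logistic squashing function are illustrative assumptions, not details taken from the slide.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out = 8, 8                              # illustrative sizes, not from the slide
W = rng.normal(0.0, 0.1, size=(n_out, n_in))    # the "matrix of connections"

def forward(input_pattern, W):
    """Summed input to each output unit and the resulting p(a=1)."""
    summed_input = W @ input_pattern                  # net input to each output unit
    p_active = 1.0 / (1.0 + np.exp(-summed_input))    # logistic: probability the unit is active
    return summed_input, p_active

sight_of_rose = rng.integers(0, 2, size=n_in)         # a binary input pattern
summed, p = forward(sight_of_rose, W)
```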

Learning Rule for the Pattern Associator Network
For each output unit:
- Determine the activity of the unit based on its input and activation function.
- If the unit is active when the target is not: reduce the weight coming into the unit from each active input unit.
- If the unit is not active when the target is active: increase the weight coming into the unit from each active input unit.
Each connection weight adjustment is very small. Learning is gradual and cumulative. If a set of weights exists that can correctly assign the desired output for each input, the network will gradually home in on it. However, in many cases, no solution is actually possible with only one layer of modifiable weights.
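A sketch of this learning rule, assuming binary output units with a fixed threshold and a small learning rate (both of which are illustrative choices not specified on the slide):

```python
import numpy as np

def update_weights(W, input_pattern, target, lr=0.05, threshold=0.0):
    """One sweep of the error-correcting rule over all output units."""
    activity = (W @ input_pattern > threshold).astype(float)   # is each output unit active?
    for j in range(W.shape[0]):
        if activity[j] == 1.0 and target[j] == 0.0:
            # Active when the target is not: reduce weights from active input units.
            W[j] -= lr * input_pattern
        elif activity[j] == 0.0 and target[j] == 1.0:
            # Inactive when the target is active: increase weights from active input units.
            W[j] += lr * input_pattern
    return W
```

Multiplying the adjustment by the (binary) input pattern ensures that only weights from active input units change, as the rule specifies.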

Overcoming the Limitations of Associator Networks
- Without 'hidden units', many input-output mappings cannot be captured. However, with just one layer of units between input and output, it is possible to capture any deterministic input-output mapping.
- What was missing was a method for training connections on both sides of hidden units. In 1986, such a method was developed by David Rumelhart and others.
- The network uses units whose activation is a continuous, non-linear function of their summed input.
- The network is trained to produce the corresponding output (target) pattern for each input pattern. This is done by adjusting each weight in the network to reduce the sum of squared differences between the network's output activations and the corresponding target outputs: $\sum_i (t_i - a_i)^2$
- With this algorithm, neural networks can represent and learn any computable function. How to ensure networks generalize 'correctly' to unseen examples is a hard problem, in part because defining what counts as the correct generalization is itself unclear.
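Below is a hedged sketch of the kind of training described here: a network with one layer of hidden logistic units, trained by gradient descent to reduce $\sum_i (t_i - a_i)^2$. The layer sizes, learning rate, and random training patterns are placeholders, not details of any model from the lecture.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Placeholder sizes and random binary patterns -- purely illustrative.
n_in, n_hid, n_out = 4, 3, 2
X = rng.integers(0, 2, size=(8, n_in)).astype(float)    # input patterns
T = rng.integers(0, 2, size=(8, n_out)).astype(float)   # target patterns

W1 = rng.normal(0.0, 0.5, size=(n_hid, n_in))   # input -> hidden
W2 = rng.normal(0.0, 0.5, size=(n_out, n_hid))  # hidden -> output
lr = 0.5

for epoch in range(1000):
    for x, t in zip(X, T):
        h = sigmoid(W1 @ x)                    # continuous, non-linear hidden units
        a = sigmoid(W2 @ h)                    # output activations
        # Error derivatives for sum_i (t_i - a_i)^2 (constant factors folded into lr).
        delta_out = (a - t) * a * (1.0 - a)
        delta_hid = (W2.T @ delta_out) * h * (1.0 - h)
        W2 -= lr * np.outer(delta_out, h)      # adjust weights on both sides of the hidden units
        W1 -= lr * np.outer(delta_hid, x)
```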

Standard Approach to the Past Tense (and other Aspects of Language)
- We form the past tense by using a (simple) rule: add '-ed' (played, cooked, raided).
- If an item is an exception, the rule is blocked, so we say 'took' instead of 'taked'.
- If you've never seen an item before, you use the rule.
- If an item is an exception but you forget the exceptional past tense, you apply the rule.
Predictions:
- Regular inflection of 'nonce forms': This man is tupping. Yesterday he … This girl is blinging. Yesterday she …
- Over-regularization errors: goed, taked, bringed
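A minimal sketch of this symbolic 'rule plus blocking' account, with a deliberately tiny, illustrative exception list and no attempt at realistic spelling rules:

```python
EXCEPTIONS = {"take": "took", "go": "went", "bring": "brought"}   # tiny illustrative list

def past_tense(verb):
    """Apply the stored exception if it is retrieved; otherwise apply the rule (add -ed)."""
    if verb in EXCEPTIONS:
        return EXCEPTIONS[verb]   # the rule is blocked by the exception
    return verb + "ed"            # the regular rule, also used for nonce forms

# Over-regularization corresponds to failing to retrieve an exception:
# treating 'go' as if it were absent from EXCEPTIONS yields 'goed'.
```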

The Learning-Based, Neural Networks Approach
Language (like perception, etc.) arises from the interactions of neurons, each of which operates according to a common set of simple principles of processing, representation, and learning. Units and rules are useful for approximately describing what emerges from these interactions, but they have no mechanistic or explanatory role in language processing, language change, or language learning.

A Learning-Based, Connectionist Approach to the Past Tense
- Knowledge is in connections.
- Experience causes connections to change.
- Sensitivity to regularities emerges: the regular past tense, and sub-regularities.
- Knowledge of exceptions co-exists with knowledge of regular forms in the same connections.

The RM Model
- Learns from verb [root, past tense] pairs: [like, liked]; [love, loved]; [carry, carried]; [take, took].
- Present and past are represented as patterns of activation over units that stand for phonological features.
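A hedged sketch of the representational idea: each verb form becomes a binary vector over phonological feature units, and pairs of such vectors serve as input and target for the associator. The feature inventory and the feature sets below are toy placeholders; the actual RM model used a 'Wickelfeature' encoding based on phoneme triples.

```python
import numpy as np

# A toy feature inventory -- a stand-in for the RM model's actual Wickelfeature encoding.
FEATURES = ["voiced", "stop", "nasal", "fricative", "front_vowel", "back_vowel",
            "coronal", "labial", "word_final_d_t"]

def encode(feature_set):
    """Pattern of activation over units standing for phonological features."""
    return np.array([1.0 if f in feature_set else 0.0 for f in FEATURES])

# Hand-made, hypothetical feature sets for one [root, past tense] pair.
root_like  = encode({"voiced", "front_vowel", "coronal", "stop"})
past_liked = encode({"voiced", "front_vowel", "coronal", "stop", "word_final_d_t"})

# Training presents each root pattern as input and the corresponding past-tense
# pattern as the target, using the associator learning rule sketched earlier.
```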

Over-regularization errors in the RM network
[Figure: performance on the ten most frequent past tenses in English (felt, had, made, got, gave, took, came, went, looked, needed). Annotations: the network was trained with the top ten words only at first; a marker indicates where 400 more words were introduced.]

Additional characteristics
- The model exploits gangs of related exceptions: dig-dug, cling-clung, swing-swung.
- The 'regular pattern' infuses exceptions as well as regulars: say-said, do-did, have-had, keep-kept, sleep-slept, burn-burnt, teach-taught.

Elman's Simple Recurrent Network
- The task is to predict the next element of a sequence on the output units, given the current element on the input units. Each element is represented by a pattern of activation.
- In the network diagram, each box represents a set of units, and each dotted arrow represents all-to-all connections. The solid arrow indicates that the previous pattern on the hidden units is copied back to provide context for the next prediction.
- Learning occurs through connection weight adjustment, using an extended version of the error-correcting learning rule.
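A minimal numpy sketch of one prediction step in a network of this kind: the previous hidden pattern is copied into a context vector and fed back alongside the current input. The sizes, tanh hidden units, and softmax output are illustrative choices, not details from the slide.

```python
import numpy as np

rng = np.random.default_rng(2)

n_words, n_hid = 10, 6                               # illustrative vocabulary and hidden-layer sizes
W_in  = rng.normal(0.0, 0.1, size=(n_hid, n_words))  # input -> hidden
W_ctx = rng.normal(0.0, 0.1, size=(n_hid, n_hid))    # context -> hidden
W_out = rng.normal(0.0, 0.1, size=(n_words, n_hid))  # hidden -> output

def srn_step(element_index, context):
    """Predict the next element given the current one plus the copied-back context."""
    x = np.zeros(n_words)
    x[element_index] = 1.0                           # current element as a pattern of activation
    hidden = np.tanh(W_in @ x + W_ctx @ context)     # hidden state mixes input and context
    logits = W_out @ hidden
    prediction = np.exp(logits) / np.exp(logits).sum()   # distribution over possible next elements
    return prediction, hidden                        # the hidden pattern becomes the next context

context = np.zeros(n_hid)
for element in [3, 1, 4]:                            # a toy sequence of element indices
    prediction, context = srn_step(element, context)
```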

Hidden Unit Patterns for Elman Net Trained on Word Sequences

Key Features of Both Models
- No lexical entries and no rules.
- No problem of rule induction or grammar selection.
Note: While this approach has been highly controversial, and has not become dominant in AI or linguistics, it underlies a large body of work in psycholinguistics and neuropsychology, and remains under active exploration in many laboratories.

Questions from Sections
- What is the bridge between the two perspectives? Why should someone who is interested in symbolic systems have to know about both fields? Aren't they fundamentally opposites of each other, and if not, in what respects do they overlap?
- How has neuroscience (and findings from studying the brain) shaped and assisted research traditionally conducted under the psychology umbrella?
- Has computational linguistics really solved 'the biggest part of the problem of AI'?