Connectionist Models of Language Development: Grammar and the Lexicon
Steve R. Howell, McMaster University, 1999

Overview
- Description of the research plan
- Explanation of the research goals
- Examination of the inspiration for this research, in both the connectionist and language sub-fields
- Methods
- Results (preliminary)
- Discussion & future directions

Overall Research Plan
- Pursuit of an integrated, multi-level, connectionist model of language development
- “Multi-level” = dealing with several different levels or parts of the language task
- “Integrated” = non-modular; homogeneous functioning throughout the multi-level design

Research Goals
- A better understanding of the language development process
- The ability to test different interventions on a successful model, instead of on children, including possibly lesioning the model
- A functional language-learning model for AI and software (e.g. “chatterbots” on the net)

Connectionist Inspiration
- The work of Jeff Elman on models of grammar learning using Simple Recurrent Networks (SRNs)
- The work of Landauer et al. on the acquisition of semantic information (i.e. the lexicon) through the analysis of many weak word-to-word relations in real-world text

Language-Domain Inspiration
- Evidence against a sharp divide between the acquisition of the lexicon and the acquisition of grammar (e.g. Bates)
- The lexicon develops first, but grammar development overlaps with it and seems to proceed in step with it
- Hence the present focus on homogeneous mechanisms to explain the two

Method
- Computer simulation of a connectionist (neural network) model
- The base algorithm and structure are Elman’s (1990) Simple Recurrent Network
- Modifications include sub-word-level input, a multi-level architecture, and automated localist-to-distributed representation conversion

Diagram of SRN

Parts of an Elman SRN
- An input layer of units
- A (usually larger) hidden layer of units
- A context layer, a ‘memory’ connected to the hidden layer
- An output layer of units, the same size as the input layer
- Uses the back-propagation learning algorithm
- Uses a prediction task to provide a more plausible teaching signal
- The recurrent context units take a copy of the hidden units at each time step
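As a concrete illustration of these parts, here is a minimal NumPy sketch of the forward pass of an Elman-style SRN. The class name, sigmoid units, weight initialisation, and layer sizes are assumptions for illustration; the back-propagation training step mentioned on the slide is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SimpleRecurrentNetwork:
    """Forward pass of an Elman-style SRN (illustrative sketch)."""

    def __init__(self, n_in, n_hidden, n_out=None, seed=0):
        n_out = n_in if n_out is None else n_out   # Elman (1990): output layer same size as input
        rng = np.random.default_rng(seed)
        self.W_in = rng.normal(0.0, 0.1, (n_hidden, n_in))       # input -> hidden weights
        self.W_ctx = rng.normal(0.0, 0.1, (n_hidden, n_hidden))  # context -> hidden weights
        self.W_out = rng.normal(0.0, 0.1, (n_out, n_hidden))     # hidden -> output weights
        self.context = np.zeros(n_hidden)                        # the 'memory' of the previous step

    def step(self, x):
        # The hidden layer sees the current input plus the copied-back context.
        hidden = sigmoid(self.W_in @ x + self.W_ctx @ self.context)
        output = sigmoid(self.W_out @ hidden)   # prediction of the next input item
        self.context = hidden.copy()            # context units take a copy of the hidden units
        return hidden, output
```

In training, the output would be compared with the actual next input and the resulting error back-propagated through the weights.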

Modifications: Sub-word Input
- Letter triples (Mozer; Wickelgren) or artificial phonemes
- A recently completed simulation demonstrates the superiority of triple- or phoneme-level word representations over whole-word localist representations for grammar learning (phonics?)

Representations of Words
- Localist, as in Elman (1990)
- Binary distributed, as used for the triples
- Fully distributed, as used for the semantic encoding
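To make the three representation types concrete, here is a small illustrative sketch; the words, the six-unit width, and the specific bit patterns are invented for the example and are not taken from the original slide.

```python
import numpy as np

vocab = ["dog", "cat", "chased"]

# Localist (Elman, 1990): one unit per word, exactly one unit active.
localist = {w: np.eye(len(vocab))[i] for i, w in enumerate(vocab)}

# Binary distributed (triples): each word is a pattern of several on/off units,
# e.g. which letter triples it contains; the patterns below are arbitrary.
binary_distributed = {
    "dog": np.array([1, 0, 1, 0, 1, 0]),
    "cat": np.array([0, 1, 0, 1, 0, 1]),
}

# Fully distributed (semantic encoding): continuous values on every unit.
rng = np.random.default_rng(0)
fully_distributed = {w: rng.normal(size=6) for w in vocab}
```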

Route to the Multi-level Architecture
- The Elman SRN showed how word co-occurrence information can be used to learn word relationships (a simple grammar)
- What is learned is a mapping from the previous words (the context) to the predicted next word
- Even with a sub-word distributed representation, the prediction is still of the next word
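To illustrate this prediction task: each word in the training text simply serves as the teaching signal for the word before it. The sentence below is an invented example.

```python
sentence = "the dog chased the cat".split()

# Each input word is paired with the following word as its prediction target.
training_pairs = list(zip(sentence[:-1], sentence[1:]))
# [('the', 'dog'), ('dog', 'chased'), ('chased', 'the'), ('the', 'cat')]
```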

Elman (1990) Clustering Results

Sub-word Prediction
- If we use a ‘sliding window’ on the input text (e.g. five letters for three-letter triples), then we are predicting the next triple from the previous triples: true sub-word prediction
- e.g. “The dog chased the cat ...”
- Time 1: “The_d” = The, he_, e_d
- Time 2: “he_do” = he_, e_d, _do
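A short sketch of this sliding-window triple extraction, following the slide’s example; the underscore word-boundary marker and the helper function are assumptions for illustration.

```python
def letter_triples(window):
    """All overlapping three-letter triples inside a window of text."""
    return [window[i:i + 3] for i in range(len(window) - 2)]

text = "The_dog_chased_the_cat"    # '_' marks word boundaries
window = 5                         # a five-letter window holds three overlapping triples
for t in range(len(text) - window):
    seen = letter_triples(text[t:t + window])       # triples in the current window
    target = text[t + window - 2:t + window + 1]    # the next triple, to be predicted
    print(text[t:t + window], seen, "->", target)
# e.g. The_d ['The', 'he_', 'e_d'] -> _do
#      he_do ['he_', 'e_d', '_do'] -> dog
```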

Sub-word Advantages
- Richer representations, accessing more of the data inherent in the text or speech stream
- Makes prediction/internal representation easier
- Eliminates the need for artificial pre-processing of text into word vectors; letters are translated automatically into triple vectors

Sub-word Disadvantages
- Cannot output words easily; we just have a collection of triples
- Must stack a “clean-up” net on top in order to reach word representations from the existing triple representations
- Hence the multi-layer approach: combine prediction at two time-scales and levels of granularity, using the same method at both

Multi-layer SRN Diagram

Multi-layer SRN
Triples or letters layer:
- Input Layer 1
- Hidden Layer 1
- Context Layer 1
- Output Layer 1
- Learns to predict triples/phonemes
Word layer:
- Input Layer 2 = Hidden Layer 1
- Hidden Layer 2
- Context Layer 2
- Output Layer 2
- Predicts words from triples/phonemes
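A minimal sketch of how these two layers could be stacked, reusing the SimpleRecurrentNetwork class from the earlier sketch; all layer sizes are illustrative assumptions, not values from the original model.

```python
# Illustrative sizes: number of possible triples, hidden units, and words.
n_triples, n_hidden_1, n_hidden_2, n_words = 500, 100, 100, 200

triple_net = SimpleRecurrentNetwork(n_in=n_triples, n_hidden=n_hidden_1)                 # Layer 1
word_net = SimpleRecurrentNetwork(n_in=n_hidden_1, n_hidden=n_hidden_2, n_out=n_words)   # Layer 2

def multi_layer_step(triple_vector):
    # Layer 1 predicts the next triple/phoneme from its input and context.
    hidden_1, next_triple = triple_net.step(triple_vector)
    # Input Layer 2 = Hidden Layer 1: the word layer reads Layer 1's hidden state
    # and predicts the next word.
    _, next_word = word_net.step(hidden_1)
    return next_triple, next_word
```

Both layers use the same prediction mechanism, only at different time-scales and granularities, in keeping with the integrated, homogeneous design described earlier.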