Connectionist Sentence Comprehension and Production System
A model by Dr. Douglas Rohde, MIT
Presented by Dave Cooke, Nov. 6, 2004

Overview
o Introduction
  o A brief overview of artificial neural networks
  o The basic architecture
o Douglas Rohde's CSCP model
  o Overview
  o The Penglish language
  o Architecture
  o Semantic System
  o Comprehension, Prediction, and Production (CPP) System
  o Training
  o Testing
  o Conclusions
o Bibliography

A Brief Overview
o Basic definition of an artificial neural network:
  o A network of interconnected "neurons" inspired by the biological nervous system.
  o The function of an artificial neural network is to produce an output pattern from a given input.
o First described by Warren McCulloch and Walter Pitts in 1943 in their seminal paper "A Logical Calculus of the Ideas Immanent in Nervous Activity".

Artificial Neurons
[Diagram: the architecture of an artificial neuron. Artificial neurons are modeled after biological neurons.]
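
As a concrete stand-in for the diagram above, here is a minimal artificial neuron in Python. This is not from the original slides: the sigmoid activation and the example weights are illustrative assumptions.

```python
import numpy as np

def neuron(inputs, weights, bias):
    """A minimal artificial neuron: a weighted sum of the inputs is
    passed through a sigmoid activation function."""
    net = np.dot(weights, inputs) + bias   # weighted sum of inputs
    return 1.0 / (1.0 + np.exp(-net))      # squash the net input to (0, 1)

# Hypothetical example: three inputs with arbitrary illustrative weights
print(neuron(np.array([0.5, 1.0, -0.3]), np.array([0.4, -0.2, 0.7]), 0.1))
```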

Architecture -- Structure
o Network structure
  o There are many types of neural network structures, e.g., feedforward and recurrent.
o Feedforward
  o Can be single-layered or multi-layered.
  o Inputs are propagated forward to the output layer.
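
A minimal sketch of forward propagation through a two-layer feedforward network. The layer sizes and random weights are assumptions for illustration, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sizes: 3 inputs -> 4 hidden units -> 2 outputs
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)

def forward(x):
    """Inputs are propagated forward, layer by layer, to the output layer."""
    h = np.tanh(W1 @ x + b1)   # hidden layer activations
    return W2 @ h + b2         # output layer (linear)

print(forward(np.array([0.2, -0.1, 0.5])))
```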

Architecture -- Recurrent NN
o Recurrent neural networks
  o Operate on an input space and an internal state space: they have memory.
o Primary types of recurrent neural networks:
  o Simple recurrent
  o Fully recurrent
[Diagram: an example of a simple recurrent network (SRN).]
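
Since the SRN figure does not survive in this transcript, here is a minimal sketch of one forward step of an Elman-style SRN. The layer sizes and random weights are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 10 inputs, 20 hidden/context units, 5 outputs
n_in, n_hid, n_out = 10, 20, 5
W_xh = rng.normal(scale=0.1, size=(n_hid, n_in))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(n_hid, n_hid))  # context -> hidden
W_hy = rng.normal(scale=0.1, size=(n_out, n_hid))  # hidden -> output

def srn_step(x, context):
    """One time step of an Elman-style SRN: the hidden layer sees the
    current input plus a copy of its own previous state (the context),
    which is what gives the network its memory."""
    h = np.tanh(W_xh @ x + W_hh @ context)
    y = W_hy @ h
    return y, h                        # h becomes the next step's context

context = np.zeros(n_hid)
for x in rng.normal(size=(4, n_in)):   # a 4-step input sequence
    y, context = srn_step(x, context)
```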

Architecture -- Learning
o Learning in neural networks
  o Learning = change in connection weights.
o Supervised networks: the network is told the correct answer.
  o e.g., backpropagation, backpropagation through time, reinforcement learning
o Unsupervised networks: the network must find structure in the input on its own.
  o e.g., competitive learning, self-organizing (Kohonen) maps
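
To make "learning = change in connection weights" concrete, here is one supervised update for a single linear neuron (a delta-rule sketch, a simpler cousin of backpropagation; the learning rate and data are arbitrary assumptions).

```python
import numpy as np

def delta_rule_step(w, x, target, lr=0.1):
    """One supervised update: the network is told the correct answer,
    and the weights change so as to reduce the error."""
    y = w @ x                      # forward pass
    error = target - y             # supervised error signal
    return w + lr * error * x      # learning = change in connection weights

w = np.array([0.0, 0.0])
w = delta_rule_step(w, np.array([1.0, 0.5]), target=1.0)
```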

Architecture -- Learning (BPTT)
o Backpropagation Through Time (BPTT) is used in the CSCP model and in SRNs.
o In BPTT the network runs ALL of its forward passes, then performs ALL of the backward passes.
o Equivalent to unrolling the network backwards through time.
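
A minimal sketch of that forward-then-backward structure, reusing the SRN step above. The squared-error loss and tanh units are assumptions for illustration; this is not Rohde's implementation.

```python
import numpy as np

def bptt(xs, targets, W_xh, W_hh, W_hy):
    """All forward passes first, then all backward passes, as if the
    network were unrolled through time into one deep feedforward net."""
    T, n_hid = len(xs), W_hh.shape[0]
    hs = [np.zeros(n_hid)]                  # h[0] is the initial context
    for x in xs:                            # forward through ALL steps
        hs.append(np.tanh(W_xh @ x + W_hh @ hs[-1]))
    dW_xh, dW_hh, dW_hy = (np.zeros_like(W) for W in (W_xh, W_hh, W_hy))
    dh_next = np.zeros(n_hid)
    for t in reversed(range(T)):            # backward through ALL steps
        dy = W_hy @ hs[t + 1] - targets[t]  # squared-error output delta
        dW_hy += np.outer(dy, hs[t + 1])
        dh = W_hy.T @ dy + dh_next          # error from output and future steps
        dnet = dh * (1 - hs[t + 1] ** 2)    # back through the tanh
        dW_xh += np.outer(dnet, xs[t])
        dW_hh += np.outer(dnet, hs[t])
        dh_next = W_hh.T @ dnet             # carry error one step back in time
    return dW_xh, dW_hh, dW_hy
```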

The CSCP Model
o Connectionist Sentence Comprehension and Production model.
o Primary goal: learn to comprehend and produce sentences in the Penglish (Pseudo English) language.
o Secondary goal: to construct a model that accounts for a wide range of human sentence processing behaviours.

Basic Architecture
o A simple recurrent network is used.
o Penglish (Pseudo English) was used to train and test the model.
o Consists of two separate parts connected by a "message layer":
  o Semantic System (encoding/decoding system)
  o CPP System
o Backpropagation Through Time (BPTT) is the learning algorithm.
  o A method for learning temporal tasks.

Penglish
o Goal: to produce only sentences that are reasonably valid in English.
o Built around the framework of a stochastic context-free grammar (SCFG; see the sketch below).
o Given an SCFG it is easy to generate sentences, parse sentences, and perform optimal prediction.
o A subset of English; some of the grammatical structures used are:
  o 56 verb stems
  o 45 noun stems
  o adjectives, determiners, adverbs, subordinate clauses
  o several types of logical ambiguity
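
A toy stochastic context-free grammar in the spirit of Penglish. The rules, probabilities, and vocabulary here are invented for illustration; they are not Rohde's actual grammar.

```python
import random

GRAMMAR = {
    "S":   [(1.0, ["NP", "VP"])],
    "NP":  [(0.7, ["Det", "N"]), (0.3, ["Det", "Adj", "N"])],
    "VP":  [(0.6, ["V", "NP"]), (0.4, ["V"])],
    "Det": [(0.5, ["the"]), (0.5, ["a"])],
    "Adj": [(1.0, ["new"])],
    "N":   [(0.5, ["teacher"]), (0.5, ["book"])],
    "V":   [(0.5, ["gave"]), (0.5, ["ran"])],
}

def generate(symbol="S"):
    """Expand a symbol by sampling one of its rules by probability."""
    if symbol not in GRAMMAR:          # terminal: emit the word itself
        return [symbol]
    r, acc = random.random(), 0.0
    for p, rhs in GRAMMAR[symbol]:
        acc += p
        if r <= acc:
            return [w for s in rhs for w in generate(s)]
    return [w for s in GRAMMAR[symbol][-1][1] for w in generate(s)]

print(" ".join(generate()))  # e.g. "the teacher gave a book"
```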

Penglish
o Penglish sentences do not always sound entirely natural, even though constraints to avoid semantic violations were implemented.
o Example sentences:
  o (1) We had played a trumpet for you.
  o (2) A answer involves a nice school.
  o (3) The new teacher gave me a new book of baseball.
  o (4) Houses have had something the mother has forgotten.

The CSCP Model
[Diagram: the full CSCP model, showing the Semantic System and the CPP System; processing starts at the component that stores all propositions seen for the current sentence.]

Semantic System
[Diagram: the Semantic System. Propositions are loaded sequentially and stored in memory.]

Semantic System
[Diagram: the Semantic System, annotated with where the error measure is taken.]

Training (SS)
o Backpropagation.
o Trained separately from, and prior to, the rest of the model.
o The decoder uses standard single-step backpropagation.
o The encoder is trained using BPTT.
o The majority of the running time is spent in the decoding stage.

Training (SS)
[Diagram: the Semantic System during training, annotated with where the error is assessed.]

The CPP System
[Diagram: the CPP System, annotated with the error measure and the phonologically encoded word input.]

CPP System (cont.)
[Diagram: the CPP System predicting the next word. Processing starts by trying to predict the next word in the sentence; the goal is to produce the next word and pass it to the Word Input Layer.]

The CPP System - Training
[Diagram: BPTT in the CPP System, annotated with numbered steps: 1. where BPTT starts; 2. where error is backpropagated to; 3. where previously recorded output errors are injected; 4. BPTT.]

Training
o 16 Penglish training sets.
o Each set = 250,000 sentences; total = 4 million sentences.
o One pass of weight updates over a set = 1 epoch; 16 epochs in total.
o The learning rate started at 0.2 for the first epoch and was gradually reduced over the course of learning.
o After the Semantic System, the CPP System was similarly trained.
o Training began with sentences of limited complexity, and complexity was increased gradually.
o Training a single network took about 2 days on a 500 MHz Alpha; total training time was about two months.
o Overall, 3 networks were trained.
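
A schematic of the training schedule just described. The set count, set size, and starting learning rate come from the slide; the decay rule, the curriculum ramp, and the train_on_set call are hypothetical placeholders.

```python
N_SETS, SENTENCES_PER_SET = 16, 250_000   # from the slide: 4 million sentences

lr = 0.2                                  # starting learning rate (from the slide)
for epoch in range(N_SETS):               # one epoch per 250,000-sentence set
    max_complexity = min(1.0, 0.25 + epoch * 0.05)   # assumed curriculum ramp
    # train_on_set(epoch, lr, max_complexity)        # hypothetical training call
    lr *= 0.85                            # "gradually reduced": assumed decay
```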

Testing
o 50,000 test sentences.
o 33.8% of the testing sentences also appeared in one of the training sets.
o Nearly all of the sentences had 1 or 2 propositions.
o Three forms of measurement were used to assess comprehension:
  o Multiple-choice measure
  o Reading-time measure
  o Grammaticality-rating measure

Testing (Multiple Choice)
o Example: "When the owner let go, the dog ran after the mailman."
  o Expressed as [ran after, theme, ?]
o Possible answers:
  o mailman (correct answer)
  o owner, dog, girls, cats (distractors)
o Error measure: when applying four distractors, chance performance is 20% correct (one correct answer among five choices).
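
A sketch of how such a multiple-choice probe can be scored: the model's answer is taken to be whichever candidate filler scores highest. The scoring numbers here are made up; the real scores would come from the comprehension outputs.

```python
def multiple_choice(scores: dict) -> str:
    """Pick the candidate with the highest comprehension score."""
    return max(scores, key=scores.get)

candidates = ["mailman", "owner", "dog", "girls", "cats"]
scores = dict(zip(candidates, [0.62, 0.12, 0.10, 0.09, 0.07]))  # made-up scores
print(multiple_choice(scores))   # -> "mailman"; chance level is 1/5 = 20%
```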

Testing (Reading Time)
o Also known as simulated reading time.
o A weighted average of 4 components:
  o Components 1 and 2 measure the degree to which the current word was expected.
  o Component 3: the change in the message that occurred when the current word was read.
  o Component 4: the average level of activation in the message layer.
o The four components are multiplied by scaling factors so that each averages close to 1.0, and a weighted average is then taken.
o Ranges from 0.4 for easy words to 2.5 or more for very hard words.
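
A minimal sketch of that computation. The scale factors and weights in the example are placeholders, not Rohde's values.

```python
def simulated_reading_time(components, scales, weights):
    """Weighted average of the four scaled components described above.
    `scales` bring each component's average near 1.0; `weights` set each
    component's contribution. Both are assumed here."""
    scaled = [c * s for c, s in zip(components, scales)]
    return sum(w * x for w, x in zip(weights, scaled)) / sum(weights)

# Hypothetical example: four component values, unit scales, equal weights
print(simulated_reading_time([0.9, 1.1, 0.8, 1.2], [1, 1, 1, 1], [1, 1, 1, 1]))
```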

Testing (Grammaticality)
o The grammaticality method combines two components:
o (1) Prediction error (PE)
  o An indicator of syntactic complexity.
  o Based on the point in the sentence at which the worst two consecutive predictions occur.
o (2) Comprehension error (CE)
  o The average strict-criterion comprehension error rate on the sentence.
  o Intended to reflect the degree to which the sentence makes sense.
o Simulated ungrammaticality rating (SUR)
  o SUR = (PE - 8) x (CE + 0.5)
  o Combines the two components into a single measure of ungrammaticality.
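
The SUR formula as a one-line helper, transcribed directly from the slide's formula:

```python
def simulated_ungrammaticality_rating(pe, ce):
    """SUR = (PE - 8) * (CE + 0.5): higher prediction error and higher
    comprehension error both push the ungrammaticality rating up."""
    return (pe - 8) * (ce + 0.5)
```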

Conclusions
o General comprehension results
  o The final networks are able to provide complete, accurate answers:
  o Given NO choices: 77%
  o Given 5 choices: 92%
o Sentential complement ambiguity
  o Strict-criterion error rate: 13.5%
  o Multiple choice: 2%
o Subordinate clause ambiguity
  o Ex. "Although the teacher saw a book was taken in the school."
  o The intransitive, weak-bad, weak-good, strong-bad, and strong-good conditions all had error rates under 20% on multiple-choice questions.

Bibliography
1. Luger, G.F., Artificial Intelligence, 4th ed., Addison Wesley, 2002.
2. Russell, S. & Norvig, P., Artificial Intelligence, 2nd ed., Prentice Hall, 2003.
3. Picton, P., Neural Networks, 2nd ed., Palgrave, 2000.
4. Rohde, D., A Connectionist Model of Sentence Comprehension and Production, MIT, March 2002.
5. Elman, J.L., Finding Structure in Time, Cognitive Science, 14, 179-211, 1990.
6. Fausett, L., Fundamentals of Neural Networks, Pearson, 1994.