1 Introduction to the TLearn Simulator
- CS/PY 231 Lab Presentation # 5
- February 16, 2005
- Mount Union College

2 More Realistic Models
- So far, our perceptron activation function is quite simplistic:
  – f(x1, x2, ..., xn) = 1, if Σ xk·wk > θ
  – f(x1, x2, ..., xn) = 0, if Σ xk·wk < θ
  – (a Python sketch of this rule appears below)
- To more closely mimic actual neuronal function, our model needs to become more complex
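
As a concrete illustration, here is a minimal Python sketch of this 0/1 activation rule; the function name step_activation and the example weights and threshold are illustrative, not part of TLearn:

    def step_activation(inputs, weights, theta):
        """0/1 perceptron output: fire only if the weighted sum exceeds theta."""
        weighted_sum = sum(x * w for x, w in zip(inputs, weights))
        return 1 if weighted_sum > theta else 0

    # Example: a two-input perceptron computing AND with weights 1.0, 1.0 and theta = 1.5
    print(step_activation([1, 1], [1.0, 1.0], 1.5))  # prints 1
    print(step_activation([1, 0], [1.0, 1.0], 1.5))  # prints 0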

3 Problem: Output too Simplistic
- Perceptron output only changes when an input, weight, or theta changes
- Real neurons don't send a steady signal (a constant 1 output) that keeps flowing until the input stimulus changes
- An action potential is generated quickly when the threshold is reached, and then the charge dissipates rapidly

4 Problem: Output too Simplistic
- When a stimulus is present for a long time, the neuron fires again and again at a rapid rate
- When little or no stimulus is present, few if any signals are sent
- Over a fixed amount of time, neuronal activity is better described as a firing frequency (a lot of firing or a little) than as a 1 or 0 value

5 Problem: Output too Simplistic
- To model this, we allow our artificial neurons to produce a graded activity level as output (some real number)
- This doesn't affect the validity of the model (we could construct an equivalent network of 0/1 perceptrons)
- Advantage of this approach: the same results with a smaller network

6 Output Graph for 0/1 Perceptron
[Figure: step function; output (0 or 1) on the vertical axis vs. Σ xk·wk on the horizontal axis, jumping from 0 to 1 at the threshold θ]

7 Sigmoid Functions: More Realistic
- Actual neuronal activity patterns (observed by experiment) give rise to non-linear behavior between max & min
- Example: logistic function
  – f(x1, x2, ..., xn) = 1 / (1 + e^(-Σ xk·wk)), where e ≈ 2.71828...
- Example: arctangent function
  – f(x1, x2, ..., xn) = arctan(Σ xk·wk) / (π / 2)
- (both are sketched in Python below)
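
A small Python sketch of both example functions, assuming the same weighted-sum-of-inputs convention as the slides above (the function names are illustrative):

    import math

    def logistic_activation(inputs, weights):
        """Logistic output: 1 / (1 + e^(-sum of xk*wk)); always between 0 and 1."""
        s = sum(x * w for x, w in zip(inputs, weights))
        return 1.0 / (1.0 + math.exp(-s))

    def arctan_activation(inputs, weights):
        """Arctangent output, scaled by pi/2 as on the slide; lies between -1 and 1."""
        s = sum(x * w for x, w in zip(inputs, weights))
        return math.atan(s) / (math.pi / 2)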

8 Output Graph for Sigmoid Function
[Figure: smooth S-shaped curve; output (between 0 and 1) on the vertical axis vs. Σ xk·wk on the horizontal axis]

9 TLearn Activation Function
- The software simulator we will use in this course is called TLearn
- Each artificial neuron (node) in our networks will use the logistic function as its activation function
- This gives realistic network performance over a wide range of possible inputs

10 TLearn Software
- Developed by cognitive psychologists to study properties of connectionist models and learning
  – Kim Plunkett, Oxford experimental psychologist
  – Jeffrey Elman, U.C. San Diego cognitive psychologist
- Simulates massively parallel networks on serial computer platforms

11 Notational Conventions
- TLearn uses a slightly different notation than the one we have been using
- Input signals are treated as nodes in the network, and displayed on screen as squares
- Other nodes (representing neurons) are displayed as circles
- Input and output values can be any real numbers (decimals allowed)

12 Weight Adjustments: Learning
- TLearn uses a more sophisticated learning rule than the simple one seen last week
- Let tkp be the target (desired) output for node k on pattern p
- Let okp be the actual (obtained) output for node k on pattern p

13 Weight Adjustments: Learning
- The error for node k on pattern p (δkp) is the difference between the target output and the observed output, times the derivative of the activation function for node k
  – why? Don't ask! (actually, this value simulates actual observed learning)
- δkp = (tkp - okp) · [okp · (1 - okp)]

14 Weight Adjustments: Learning
- This error is used to calculate adjustments to the weights
- Let wkj be the weight on the connection from node j to node k (the backwards notation is what the authors use)
- Let Δwkj be the change required for wkj due to training
- Δwkj is determined by: the error for node k, the input from node j, and the learning rate (η)

15 Weight Adjustments: Learning
- Δwkj = η · δkp · ojp
- η is small (< 1, usually 0.05 to 0.5), to keep the weights from making wild swings that overshoot the goals for all patterns
- This actually makes sense... (see the sketch below)
  – a larger error (δkp) should make Δwkj larger
  – if ojp is large, it contributed a great deal to the error, so it should contribute a large value to the weight adjustment
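
Putting the error term and the weight change together, here is a minimal Python sketch of one delta-rule update for a single weight; the variable names are illustrative, and TLearn performs this calculation internally:

    def delta_rule_update(t_kp, o_kp, o_jp, w_kj, eta=0.1):
        """Return the adjusted weight w_kj after one training pattern.

        delta_kp = (t_kp - o_kp) * o_kp * (1 - o_kp)   # error times logistic derivative
        dw_kj    = eta * delta_kp * o_jp               # learning rate * error * input from node j
        """
        delta_kp = (t_kp - o_kp) * o_kp * (1 - o_kp)
        dw_kj = eta * delta_kp * o_jp
        return w_kj + dw_kj

    # Example: target 1.0, obtained output 0.75, input from node j was 1.0, current weight 0.2
    # delta_rule_update(1.0, 0.75, 1.0, 0.2, eta=0.1) returns roughly 0.2047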

16 Weight Adjustments: Learning
- The preceding is called the delta rule
- It is used in backpropagation training
  – error adjustments are propagated backwards from the output layer to previous layers when weight changes are calculated
- Luckily, the simulator will perform these calculations for you!
- Read more in Ch. 1 of Plunkett & Elman

17 TLearn Simulation Basics
- For each problem on which you will work, the simulator maintains a PROJECT description
- Each project consists of three text files:
  – .CF file: configuration information about the network's architecture
  – .DATA file: input for each of the network's training cases
  – .TEACH file: output for each training case

18 TLearn Simulation Basics
- Each file must contain information in EXACTLY the format TLearn expects, or else the simulation won't work
- Example: AND project from the Chapter 3 folder
  – 2 inputs, one output; output = 1 only if both inputs = 1

19 .DATA and .TEACH Files

20 .DATA File Format
- first line: distributed or localist
  – to start, we'll always use distributed
- second line: n = # of training cases
- next n lines: inputs for each training case
  – a list of v values, separated by spaces, where v = # of inputs in the network
  – (an example file is sketched below)
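
Based on the format just described, a .DATA file for the AND example (2 inputs, 4 training cases) would look something like the sketch below; the exact file in the Chapter 3 folder may differ slightly:

    distributed
    4
    0 0
    0 1
    1 0
    1 1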

21 .TEACH File Format
- first line: distributed or localist
  – must match the mode used in the .DATA file
- second line: n = # of training cases
- next n lines: outputs for each training case
  – a list of w values, separated by spaces, where w = # of outputs in the network
  – a value may be *, meaning the output is ignored during training for this pattern
  – (an example file is sketched below)
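
The matching .TEACH file for AND (one output per training case, output = 1 only when both inputs are 1) would look something like this sketch:

    distributed
    4
    0
    0
    0
    1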

22 .CF File

23 .CF File Format
- Three sections
- NODES: section
  – nodes = # of non-input units in the network
  – inputs = # of inputs to the network
  – outputs = # of output units
  – output node is ___  <== fill in which node is the output node
  – with more than 1 output node, the syntax changes to "output nodes are"

24 .CF File Format
- CONNECTIONS: section
  – groups = 0 (explained later)
  – 1 from i1-i2 (says that node # 1 gets values from input nodes i1 and i2)
  – 1 from 0 (says that node # 1 gets values from the bias node, explained below)
- input nodes are always named i1, i2, etc.
- non-input nodes are numbered 1, 2, etc.

25 .CF File Format
- SPECIAL: section
  – selected = 1 (special simulator results reporting)
  – weight-limit = 1.00 (range of random weight values to use in initial network creation)
- (a complete example file is sketched below)
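
Assembling the three sections, a .CF file for the two-input, one-node AND network might look like the sketch below; this is pieced together from the format rules on the last three slides, not copied from the actual project file:

    NODES:
    nodes = 1
    inputs = 2
    outputs = 1
    output node is 1
    CONNECTIONS:
    groups = 0
    1 from i1-i2
    1 from 0
    SPECIAL:
    selected = 1
    weight-limit = 1.00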

26 Bias Node
- TLearn units all have the same threshold
  – defined by the logistic function
- θ values are represented by a bias node
  – connected to all non-input nodes
  – its signal is always = 1
  – the weight of the connection is -θ
  – same as a perceptron with a threshold (example on board; a worked comparison follows below)
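
A quick worked comparison, in the notation used on the earlier slides, shows why the bias-node trick is equivalent to a threshold:

    perceptron with threshold θ:  fire when Σ xk·wk > θ
    bias-node version:            fire when Σ xk·wk + 1·(-θ) > 0

The two conditions are identical, so a bias input fixed at 1 with connection weight -θ plays the role of the threshold, and every unit can share the same fixed activation function.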

27 Network Architecture with Bias Node
[Figure: network diagram showing the bias node connected to every non-input node]

28 .CF File Example (Draw it!)
- NODES:
  nodes = 5
  inputs = 3
  outputs = 2
  output nodes are 4-5
- CONNECTIONS:
  groups = 0
  1-3 from i1-i3
  4-5 from 1-3
  1-5 from 0

29 Learning to Use TLearn
- Chapter 3 of the Plunkett and Elman text is a step-by-step description of several TLearn training sessions
- Best way to learn: hands-on! Try Lab Exercise # 5

30 Introduction to the TLearn Simulator
- CS/PY 231 Lab Presentation # 5
- February 16, 2005
- Mount Union College

