1 Connectionist Modelling Summer School Lecture Three

2 Using an Error Signal

Orthogonality Constraint
– Number of patterns is limited by the dimensionality of the network.
– Input patterns must be orthogonal to each other.
– Similarity effects.

Perceptron Convergence Rule
– Learning in a single weight network. Assume a teacher signal t_out.
– Adaptation of connection and threshold (Rosenblatt 1958).
– Note that the threshold always changes if the output is incorrect.
– Blame is apportioned to a connection in proportion to the activity of the input line.

[Figure: input neurons x, y, z connected to an output neuron; a weight w links input activation a_in to output activation a_out.]
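A minimal sketch of the Rosenblatt-style update just described, for a single threshold unit; the function name, the NumPy encoding, and the learning rate of 0.5 are illustrative assumptions rather than details from the slide:

```python
import numpy as np

def perceptron_step(w, theta, a_in, t_out, eta=0.5):
    """One application of the perceptron convergence rule (illustrative sketch)."""
    a_out = 1 if np.dot(w, a_in) > theta else 0   # thresholded output
    delta = t_out - a_out                         # error signal from the teacher
    # Blame is apportioned to each connection in proportion to its input activity;
    # the threshold changes whenever the output is incorrect.
    w = w + eta * delta * a_in
    theta = theta - eta * delta
    return w, theta, a_out
```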

3 Using an Error Signal

Perceptron Convergence Rule
"The perceptron convergence rule guarantees to find a solution to a mapping problem, provided a solution exists." (Minsky & Papert 1969)

An Example of Perceptron Learning: Boolean OR

Training set:
  Input    Output
  0 0      0
  1 0      1
  0 1      1
  1 1      1

Training the network (weights w20, w21; threshold θ; learning rate 0.5):
  In     Out   w20   w21   θ     a_out   δ     Δθ     Δw
  0 0    0     0.2   0.1   1.0   0       0      0      0
  1 0    1     0.2   0.1   1.0   0       1.0   -0.5    0.5
  0 1    1     0.7   0.1   0.5   0       1.0   -0.5    0.5
  1 1    1     0.7   0.6   0.0   1       0      0      0
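The table above can be reproduced by running the update rule once over the four OR patterns in the listed order. A sketch, assuming starting weights of 0.2 and 0.1, a threshold of 1.0, and a learning rate of 0.5 as in the table:

```python
import numpy as np

# Boolean OR training set: (input pattern, target output)
patterns = [([0, 0], 0), ([1, 0], 1), ([0, 1], 1), ([1, 1], 1)]

w = np.array([0.2, 0.1])   # w20, w21
theta = 1.0                # output threshold
eta = 0.5                  # learning rate

for a_in, t_out in patterns:
    a_in = np.array(a_in)
    a_out = 1 if np.dot(w, a_in) > theta else 0   # thresholded output
    delta = t_out - a_out                         # error signal
    print(a_in, t_out, w, theta, a_out, delta)    # one row of the table
    w = w + eta * delta * a_in                    # blame active input lines
    theta = theta - eta * delta                   # threshold moves if output is wrong
```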

4 Gradient Descent

Least Mean Square Error (LMS)
– Define the error measure as the square of the discrepancy between the actual output and the desired output (Widrow-Hoff 1960).
– Plot an error curve for a single weight network.
– Make weight adjustments by performing gradient descent – always move down the slope.

Calculating the Error Signal
– Note that Perceptron Convergence and LMS use similar learning algorithms – the Delta Rule.

Error Landscapes
– Gradient descent algorithms adapt by moving downhill in a multi-dimensional landscape – the error surface.
– Ball bearing analogy: in a smooth landscape, the bottom will always be reached. However, the bottom may not correspond to zero error.

[Figure: error plotted against weight value for a single weight network.]
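A sketch of the Widrow-Hoff (LMS) version of the delta rule for a single linear unit: the error for one pattern is E = ½(t_out − a_out)², and each weight is moved down the local slope of that error curve. Variable names and the learning rate are illustrative assumptions:

```python
import numpy as np

def lms_step(w, a_in, t_out, eta=0.1):
    """One gradient-descent (delta rule) update for a linear output unit."""
    a_out = np.dot(w, a_in)        # linear (non-thresholded) output
    delta = t_out - a_out          # discrepancy between desired and actual output
    # dE/dw_i = -delta * a_in_i, so stepping downhill gives the delta rule:
    w = w + eta * delta * a_in
    return w, 0.5 * delta ** 2     # updated weights and the squared error
```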

5 Past Tense Revisited

Vocabulary Discontinuity
– Up to 10 epochs: 8 irregulars + 2 regulars. Thereafter: 420 verbs, mostly regular.
– Justification: irregulars are more frequent than regulars.

Lack of Evidence
– The vocabulary spurt occurs at 2 years whereas overregularizations occur at 3 years. Furthermore, the vocabulary spurt consists mostly of nouns.
– Pinker and Prince (1988) show that regulars and irregulars are relatively balanced in early productive vocabularies.

6 Longitudinal evidence

Stages or phases in development?
– Initial error-free performance.
– Protracted period of overregularisation, but at low rates (typically < 10%).
– Gradual recovery from error.
– The rate of overregularisation is much lower than the rate of regularisation of regular verbs.

[Figure: longitudinal overregularisation data (1992).]

7 Longitudinal evidence

Error Characteristics
– High-frequency irregulars are robust to overregularisation.
– Some errors seem to be phonologically conditioned.
– Irregularisations.

8 Single system account

Multi-layered Perceptrons
– Hidden unit representation.
– Error correction technique.
– Plunkett & Marchman (1991).
– Type/token distinction.
– Continuous training set.
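As a generic illustration of the architecture named above (not the actual Plunkett & Marchman simulation), a two-layer network with a hidden unit representation can be trained by error correction, i.e. backpropagation of the output error; the layer shapes, sigmoid activation, and learning rate are assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_step(x, t, W1, W2, eta=0.1):
    """One error-correction step for a network with one hidden layer.

    x : input vector (e.g. an encoding of a verb stem)
    t : target vector (e.g. an encoding of its past tense form)
    """
    h = sigmoid(W1 @ x)                        # hidden unit representation
    y = sigmoid(W2 @ h)                        # network output
    err_out = (t - y) * y * (1 - y)            # output error signal
    err_hid = (W2.T @ err_out) * h * (1 - h)   # error propagated back to hidden units
    W2 = W2 + eta * np.outer(err_out, h)       # adjust weights to reduce squared error
    W1 = W1 + eta * np.outer(err_hid, x)
    return W1, W2, y
```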

9 Single system account

Incremental Vocabularies
– Plunkett & Marchman (1993).
– Initial small training set.
– Gradual expansion.

Overregularisation
– Initial error-free performance.
– Protracted period of overregularisation, but at low rates (typically < 5%).
– High-frequency irregulars are robust to overregularisation.
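A sketch of the incremental-vocabulary regime only (the learning update itself is omitted); the vocabulary sizes, expansion rate, and epoch count are illustrative assumptions, not the Plunkett & Marchman (1993) parameters:

```python
import random

full_vocabulary = [f"verb_{i}" for i in range(500)]   # placeholder verb labels
vocab_size = 20                                        # initial small training set

for epoch in range(100):
    training_set = full_vocabulary[:vocab_size]
    random.shuffle(training_set)
    for verb in training_set:
        pass   # present the verb and apply an error-correcting weight update here
    vocab_size = min(len(full_vocabulary), vocab_size + 5)   # gradual expansion
```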

