Published by Corey Golden. Modified over 2 years ago.

1
A Statistical Mechanical Analysis of Online Learning: Can Student be more Clever than Teacher?
Seiji MIYOSHI
Kobe City College of Technology
miyoshi@kobe-kosen.ac.jp

2
Background (1)
Batch Learning
– Examples are used repeatedly
– Correct answers for all examples
– Long time
– Large memory
Online Learning
– Examples used once are discarded
– Cannot give correct answers for all examples
– Large memory isn't necessary
– Time-variant teacher
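The contrast between the two regimes can be sketched in a few lines of Python. This is only an illustration under assumptions that are not on the slide: a noise-free linear teacher, Gaussian inputs with variance 1/N per component, and a squared-error gradient step. The function names (`make_example`, `gradient_step`, `online_learning`, `batch_learning`) are hypothetical.

```python
import random

def make_example(teacher, rng):
    """Draw one Gaussian input and label it with a linear teacher (assumed setup)."""
    n = len(teacher)
    x = [rng.gauss(0.0, n ** -0.5) for _ in range(n)]
    y = sum(t * xi for t, xi in zip(teacher, x))
    return x, y

def gradient_step(w, x, y, eta):
    """One squared-error gradient step on a single example."""
    err = y - sum(wi * xi for wi, xi in zip(w, x))
    return [wi + eta * err * xi for wi, xi in zip(w, x)]

def online_learning(teacher, steps, eta, rng):
    """Each example is used once and then discarded; nothing is stored."""
    w = [0.0] * len(teacher)
    for _ in range(steps):
        x, y = make_example(teacher, rng)
        w = gradient_step(w, x, y, eta)
    return w

def batch_learning(teacher, n_examples, epochs, eta, rng):
    """A fixed set of examples is kept in memory and swept repeatedly."""
    data = [make_example(teacher, rng) for _ in range(n_examples)]
    w = [0.0] * len(teacher)
    for _ in range(epochs):
        for x, y in data:
            w = gradient_step(w, x, y, eta)
    return w, len(data)
```

The online loop needs memory only for the weight vector, while the batch loop keeps all `n_examples` examples and revisits them `epochs` times, which is the trade-off the slide lists.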

3
Background (2)
[Figure labels: Teacher, Student]

4
Simple Perceptron
[Figure labels: Output, Inputs, Connection weights, +1]

5
Background (2)
[Figure labels: Teacher, Student]
Learnable Case

6
Background (3)
[Figure labels: Teacher, Student]
Unlearnable Case
(Inoue & Nishimori, Phys. Rev. E, 1997)
(Inoue, Nishimori & Kabashima, TANC-97, cond-mat/9708096, 1997)

7
Background (4)
Hebbian Learning
Perceptron Learning
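The update equations on this slide did not survive extraction. As a rough reminder, the two rules can be sketched in their common textbook forms for a sign-output student; this is an assumption about what the slide showed, not a transcription. The names `hebbian_update`, `perceptron_update`, and the learning rate `eta` are illustrative.

```python
def sgn(z):
    """Sign function; sgn(0) = +1 is a convention chosen here."""
    return 1.0 if z >= 0 else -1.0

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def hebbian_update(J, x, teacher_out, eta=1.0):
    """Hebbian rule: always move the student along teacher_out * x."""
    return [j + eta * teacher_out * xi for j, xi in zip(J, x)]

def perceptron_update(J, x, teacher_out, eta=1.0):
    """Perceptron rule: apply the same move only when the student's sign is wrong."""
    if sgn(dot(J, x)) == teacher_out:
        return list(J)  # student already agrees with the teacher, no change
    return hebbian_update(J, x, teacher_out, eta)
```

For example, `perceptron_update([1.0, 0.0], [1.0, 0.0], 1.0)` leaves the student unchanged because it already classifies the input as the teacher does, whereas `hebbian_update` would still add the input to the weights.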

8
Model (1)
[Figure labels: Moving Teacher, Student, True Teacher, A]

9
Model (2)
Length of Student
Length of Moving Teacher
[Figure labels: A, B, J]

10
Model (3)
[Figure labels: A, B, J]

11
Simple Perceptron
Linear Perceptron
[Figure labels: Output, Inputs, Connection weights]
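The difference between the two units is only the output function applied to the weighted sum of the inputs. A minimal sketch, assuming the usual definitions (sign output for the simple perceptron, identity output for the linear one); the optional `noise_std` argument is an illustrative addition for additive Gaussian output noise:

```python
import random

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def simple_perceptron(w, x):
    """Sign output: +1 if w.x >= 0, else -1 (the tie at 0 is a convention)."""
    return 1 if dot(w, x) >= 0 else -1

def linear_perceptron(w, x, noise_std=0.0, rng=random):
    """Identity output w.x, optionally with additive Gaussian output noise."""
    return dot(w, x) + rng.gauss(0.0, noise_std)
```

With `w = [1.0, -2.0]` and `x = [1.0, 1.0]` the weighted sum is -1.0, so the simple perceptron outputs -1 and the noise-free linear perceptron outputs -1.0.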

12
Model (3)
Linear Perceptrons with Noises
[Figure labels: A, B, J]

13
Model (4)
Squared Errors
Gradient Method
[Figure labels: A, B, J, f, g]
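One way to read this slide is as a chain of two gradient-descent learners: the moving teacher B descends the squared error against the true teacher A, and the student J descends the squared error against B, both on the same input. The sketch below follows that reading. The noise levels `sigmas`, the rate `eta_B`, the input scaling (variance 1/N per component), and the function names are assumptions made for illustration; the slide's own formulas for f and g were lost in extraction.

```python
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def learning_step(A, B, J, x, eta_B, eta_J, sigmas=(0.0, 0.0, 0.0), rng=random):
    """One online step: B follows A and J follows B, both on the same input x."""
    s_A, s_B, s_J = sigmas
    out_A = dot(A, x) + rng.gauss(0.0, s_A)  # true teacher output with noise
    out_B = dot(B, x) + rng.gauss(0.0, s_B)  # moving teacher output with noise
    out_J = dot(J, x) + rng.gauss(0.0, s_J)  # student output with noise
    g = eta_B * (out_A - out_B)  # gradient step on (out_A - out_B)^2 / 2
    f = eta_J * (out_B - out_J)  # gradient step on (out_B - out_J)^2 / 2
    B_new = [b + g * xi for b, xi in zip(B, x)]
    J_new = [j + f * xi for j, xi in zip(J, x)]
    return B_new, J_new

def run(N, steps, eta_B, eta_J, sigmas, seed=0):
    """Simulate the chain from zero initial weights for B and J."""
    rng = random.Random(seed)
    A = [rng.gauss(0.0, 1.0) for _ in range(N)]  # fixed true teacher
    B = [0.0] * N
    J = [0.0] * N
    for _ in range(steps):
        x = [rng.gauss(0.0, N ** -0.5) for _ in range(N)]
        B, J = learning_step(A, B, J, x, eta_B, eta_J, sigmas, rng)
    return A, B, J
```

Note that the student never sees the true teacher's output: it is trained only on the moving teacher's current output, computed before B is updated in that step.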

14
Error
Gaussian
Generalization Error
[Figure labels: A, B, J]
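The generalization error is the squared error averaged over inputs and noise. For linear perceptrons the output difference is itself Gaussian, so the average can be written in closed form. The sketch below checks this numerically under assumptions that are not recoverable from the slide: error defined as half the squared output difference, input components of variance 1/N, and independent output noise with standard deviations `sigma_A` and `sigma_J`. All function names are illustrative.

```python
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def generalization_error_mc(A, J, sigma_A, sigma_J, samples, rng):
    """Monte Carlo average of the squared error over Gaussian inputs and noise."""
    N = len(A)
    total = 0.0
    for _ in range(samples):
        x = [rng.gauss(0.0, N ** -0.5) for _ in range(N)]
        teacher = dot(A, x) + rng.gauss(0.0, sigma_A)
        student = dot(J, x) + rng.gauss(0.0, sigma_J)
        total += 0.5 * (teacher - student) ** 2
    return total / samples

def generalization_error_exact(A, J, sigma_A, sigma_J):
    """Closed form: half the variance of the zero-mean Gaussian output difference."""
    N = len(A)
    dist2 = sum((a - j) ** 2 for a, j in zip(A, J))
    return 0.5 * (dist2 / N + sigma_A ** 2 + sigma_J ** 2)
```

The closed form depends on the weights only through the squared distance between A and J, which is why a small set of order parameters (overlaps and lengths) is enough to track the error.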

15
Differential equations for order parameters

16
Model (4)
Squared Errors
Gradient Method
[Figure labels: A, B, J, f, g]

17
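Update of the moving teacher:
$$\mathbf{B}^{m+1} = \mathbf{B}^{m} + g^{m}\mathbf{x}^{m}$$
Taking the inner product with $\mathbf{A}$, where $\mathbf{A}\cdot\mathbf{B}^{m} = N r_B^{m}$ and $y^{m} = \mathbf{A}\cdot\mathbf{x}^{m}$, and repeating for $N\,dt$ steps:
$$\begin{aligned}
N r_B^{m+1} &= N r_B^{m} + g^{m} y^{m} \\
N r_B^{m+2} &= N r_B^{m+1} + g^{m+1} y^{m+1} \\
&\;\;\vdots \\
N r_B^{m+N dt} &= N r_B^{m+N dt-1} + g^{m+N dt-1} y^{m+N dt-1}
\end{aligned}$$
Summing these $N\,dt$ equations, with $\langle\cdot\rangle$ denoting the sample average:
$$\begin{aligned}
N r_B^{m+N dt} &= N r_B^{m} + N\,dt\,\langle g y\rangle \\
N (r_B + dr_B) &= N r_B + N\,dt\,\langle g y\rangle \\
\frac{dr_B}{dt} &= \langle g y\rangle
\end{aligned}$$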

18
Differential equations for order parameters

19
Sample Averages

20
Differential equations for order parameters

21
Analytical Solutions of Order Parameters

22
Differential equations for order parameters

23
Error
Gaussian
Generalization Error
[Figure labels: A, B, J]

24
Dynamical Behaviors of Generalization Errors
[Figure panels: η_J = 1.2; η_J = 0.3]

25
Dynamical Behaviors of R and l
[Figure panels: η_J = 1.2; η_J = 0.3]

26
Analytical Solutions of Order Parameters

27
Steady State

28
[Figure: plots with axis label η_J]

29
Conclusions
– The generalization errors of a model composed of a true teacher, a moving teacher, and a student, all of which are linear perceptrons with noise, have been obtained analytically using statistical mechanics.
– The generalization error of the student can be smaller than that of the moving teacher, even though the student uses only examples from the moving teacher.
