1 Statistical Mechanics of Online Learning for Ensemble Teachers Seiji Miyoshi Masato Okada Kobe City College of Tech. Univ. of Tokyo, RIKEN BSI.

2 SUMMARY
We analyze the generalization performance of a student in a model composed of linear perceptrons: a true teacher, K ensemble teachers, and the student. Calculating the generalization error of the student analytically using statistical mechanics in the framework of on-line learning, we prove that when the learning rate satisfies η < 1, the larger the number K and the richer the variety of the ensemble teachers, the smaller the generalization error. When η > 1, the properties are completely reversed. If the variety of the K teachers is rich enough, the direction cosine between the true teacher and the student becomes unity in the limit of η→0 and K→∞.

3 BACKGROUND (1/2)
Batch learning
– given examples are used more than once
– student comes to give correct answers for all examples
– requires a long time and a large memory
On-line learning
– examples once used are discarded
– student cannot give correct answers for all examples used in training
– a large memory is not necessary
– it is possible to follow a time-variant teacher

4 BACKGROUND (2/2)
In most cases in actual human society, a student can observe examples from two or more teachers who differ from each other.
PURPOSE
– To analyze the generalization performance of a model composed of a student, a true teacher, and K teachers (ensemble teachers) who exist around the true teacher
– To discuss the relationship between the number and variety of the ensemble teachers and the generalization error

5 MODEL (1/4)
– True teacher: A
– Ensemble teachers: B_1, B_2, ...
– Student J learns from B_1, B_2, ... in turn.
– J cannot learn from A directly.
– A, B_1, B_2, ..., J are linear perceptrons with noise.

6 MODEL (2/4)
– Output of true teacher: linear perceptron + Gaussian noise
– Outputs of ensemble teachers: linear perceptrons + Gaussian noises
– Output of student: linear perceptron + Gaussian noise

7 MODEL (3/4)
– Inputs:
– Initial value of student:
– True teacher:
– Ensemble teachers:
– N→∞ (thermodynamic limit)
– Order parameters
  – Length of student
  – Direction cosines
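The two order parameters named on this slide can be computed for finite-size vectors as in the minimal sketch below. The slide's own definitions are not preserved in the transcript, so the normalization of the student length by sqrt(N) is an assumption, and the function names `direction_cosine` and `student_length` are hypothetical.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def direction_cosine(a, b):
    """Cosine of the angle between weight vectors a and b."""
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def student_length(j):
    """Length of the student vector divided by sqrt(N).

    Dividing by sqrt(N) is an assumed convention, chosen so that the
    length stays of order one as N grows.
    """
    return math.sqrt(dot(j, j) / len(j))
```

For example, `direction_cosine(A, J)` measures how well the student J is aligned with the true teacher A, independently of the student's length.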

8 MODEL (4/4)
– Squared errors
– Gradient method, with update term f_k^m
– Student learns K ensemble teachers in turn.
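A minimal simulation sketch of the learning rule described on this slide: gradient descent on the squared error, with the K ensemble teachers presented in turn. The slide's equations are not preserved, so the details here are assumptions: input components drawn from N(0, 1/N), weight components drawn from N(0, 1), independently drawn teachers, and the function name `train_online` and its parameters.

```python
import math
import random

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def direction_cosine(a, b):
    """Cosine of the angle between vectors a and b."""
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def train_online(n=100, k=3, eta=0.3, steps=3000,
                 sigma_b=0.0, sigma_j=0.0, seed=0):
    """Train a linear student on K linear teachers, one example per step.

    Assumed setup (not taken from the slide): teachers are drawn
    independently, inputs have variance 1/N per component, and each
    example is used once and then discarded.
    """
    rng = random.Random(seed)
    teachers = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
    student = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = 1.0 / math.sqrt(n)
    for m in range(steps):
        x = [rng.gauss(0.0, scale) for _ in range(n)]
        b = teachers[m % k]                     # teachers are used in turn
        v = dot(b, x) + rng.gauss(0.0, sigma_b)        # noisy teacher output
        u = dot(student, x) + rng.gauss(0.0, sigma_j)  # noisy student output
        # Squared error 0.5 * (v - u)**2; a gradient step on the student
        # weights gives the update J <- J + f * x with f = eta * (v - u).
        f = eta * (v - u)
        student = [ji + f * xi for ji, xi in zip(student, x)]
    return teachers, student
```

With a single noise-free teacher (`k=1`), the student should align with that teacher, which can be checked with `direction_cosine(teachers[0], student)`.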

9 GENERALIZATION ERROR
A goal of statistical learning theory is to obtain the generalization error theoretically.
Generalization error = mean of the error over the distribution of new inputs
– Error
– Multiple Gaussian distribution
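The definition above (mean of the error over new inputs) can be illustrated by a Monte Carlo estimate. The sketch below assumes a squared error of the form 0.5 * (teacher output - student output)^2, inputs with independent N(0, 1/N) components, and independent Gaussian output noises; these choices and the function names are assumptions, since the slide's formulas are not preserved. Under these assumptions the average has a closed form for any fixed A and J, which the second function computes for comparison.

```python
import math
import random

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def generalization_error_mc(a, j, sigma_a=0.0, sigma_j=0.0,
                            samples=20000, seed=0):
    """Average the squared error over freshly drawn inputs."""
    rng = random.Random(seed)
    n = len(a)
    scale = 1.0 / math.sqrt(n)
    total = 0.0
    for _ in range(samples):
        x = [rng.gauss(0.0, scale) for _ in range(n)]
        teacher_out = dot(a, x) + rng.gauss(0.0, sigma_a)
        student_out = dot(j, x) + rng.gauss(0.0, sigma_j)
        total += 0.5 * (teacher_out - student_out) ** 2
    return total / samples

def generalization_error_exact(a, j, sigma_a=0.0, sigma_j=0.0):
    """Closed form under the same assumptions.

    (A - J) . x is Gaussian with variance |A - J|^2 / N, and the two
    independent noises add their variances.
    """
    n = len(a)
    diff2 = sum((ai - ji) ** 2 for ai, ji in zip(a, j))
    return 0.5 * (diff2 / n + sigma_a ** 2 + sigma_j ** 2)
```

The two functions should agree up to sampling error, which shrinks as `samples` grows.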

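10 Differential equations describing the dynamical behaviors of the order parameters have been obtained, based on self-averaging in the thermodynamic limit, as follows:
1. To simplify the analysis, the following auxiliary order parameters are introduced:
2. A is multiplied to both sides of the student update:
$$J^{m+1} = J^{m} + f_k^{m} x^{m}$$
$$N r_J^{m+1} = N r_J^{m} + f_k^{m} y^{m}$$
3. The same relation is written for each of the $N\,dt$ successive inputs, and the equations are added:
$$\begin{aligned}
N r_J^{m+1} &= N r_J^{m} + f_k^{m} y^{m} \\
N r_J^{m+2} &= N r_J^{m+1} + f_k^{m+1} y^{m+1} \\
&\;\;\vdots \\
N r_J^{m+N\,dt} &= N r_J^{m+N\,dt-1} + f_k^{m+N\,dt-1} y^{m+N\,dt-1}
\end{aligned}$$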

11 Simultaneous differential equations in deterministic forms, which describe dynamical behaviors of order parameters

12 Analytical solutions of order parameters

13 Dynamical behaviors of generalization error, R and l (η=0.3, K=3, R_B=0.7, σ_A^2=0.0, σ_B^2=0.1, σ_J^2=0.2)
[Figure: curves labeled "Student" and "Ensemble teachers"]
– Student becomes cleverer than a member of the ensemble teachers.
– The larger the variety of the ensemble teachers is, the nearer the student is to the true teacher.

14 Steady state analysis (t → ∞)
– If η < 0 or η > 2: generalization error and length of student diverge.
– If 0 < η < 2:
  – If η < 1, the more teachers exist or the richer the variety of teachers is, the cleverer the student can become.
  – If η > 1, the fewer teachers exist or the poorer the variety of teachers is, the cleverer the student can become.
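The divergence condition on the learning rate can be probed numerically with a finite-size simulation. This is a rough sketch under assumptions, not the analytical result: each teacher is built as B_k = R_B * A + sqrt(1 - R_B^2) * C_k with independent Gaussian C_k (one way to place teachers around A), inputs have N(0, 1/N) components, the student length is |J| / sqrt(N), and the name `student_length_after` is hypothetical.

```python
import math
import random

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def student_length_after(eta, n=100, k=3, r_b=0.7,
                         sigma_b2=0.1, sigma_j2=0.2, t_max=20, seed=0):
    """Return |J| / sqrt(N) after t_max * N online steps."""
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mix = math.sqrt(1.0 - r_b ** 2)
    # Assumed construction: each teacher has direction cosine about r_b
    # with the true teacher A.
    teachers = [[r_b * ai + mix * rng.gauss(0.0, 1.0) for ai in a]
                for _ in range(k)]
    j = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = 1.0 / math.sqrt(n)
    sd_b, sd_j = math.sqrt(sigma_b2), math.sqrt(sigma_j2)
    for m in range(t_max * n):
        x = [rng.gauss(0.0, scale) for _ in range(n)]
        b = teachers[m % k]                    # teachers are used in turn
        v = dot(b, x) + rng.gauss(0.0, sd_b)   # noisy teacher output
        u = dot(j, x) + rng.gauss(0.0, sd_j)   # noisy student output
        f = eta * (v - u)                      # gradient step on squared error
        j = [ji + f * xi for ji, xi in zip(j, x)]
    return math.sqrt(dot(j, j) / n)
```

Calling `student_length_after(0.3)` and `student_length_after(2.5)` contrasts a learning rate inside the range 0 < η < 2 with one outside it; in the second case the length should grow by many orders of magnitude.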

15 Steady value of generalization error, R and l (K=3, R_B=0.7, σ_A^2=0.0, σ_B^2=0.1, σ_J^2=0.2)
[Figure annotations: "Rich variety is good!" and "Poor variety is good!"]

16 Steady value of generalization error, R and l (q=0.49, R_B=0.7, σ_A^2=0.0, σ_B^2=0.1, σ_J^2=0.2)
[Figure annotations: "Many teachers are good!" and "Few teachers are good!"]
