Overfitting and Regularization: Chapters 11 and 12 on amlbook.com

1 Overfitting and Regularization: Chapters 11 and 12 on amlbook.com

2 Over-fitting is easy to recognize in 1D: fitting a 4th-order polynomial hypothesis to 5 data points drawn from a parabolic target function gives E_in = 0.
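
This is easy to reproduce numerically. Below is a minimal sketch (not from the slides; the target, noise level, and seed are assumed for illustration) showing that a 4th-order polynomial interpolates 5 points exactly, so E_in is essentially zero:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, 5)                   # 5 data points
    y = 1 + 9 * x**2 + rng.normal(0, 1, 5)      # parabolic target plus noise

    w = np.polyfit(x, y, deg=4)                 # 4th-order hypothesis: 5 coefficients for 5 points
    E_in = np.mean((np.polyval(w, x) - y)**2)   # exact interpolation, so E_in ~ 0
    print(E_in)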

3 The origin of over-fitting can be analyzed in 1D: the bias/variance dilemma.

4 Over-fitting is easy to avoid in 1D: results from HW2. [Plot: sum of squared deviations (E_in and E_val) vs. degree of polynomial.]
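
The idea behind the HW2 plot can be sketched as follows (assumed data and split sizes, not the actual HW2 code): fit polynomials of increasing degree on a training set; E_in keeps falling, while E_val on a held-out set turns back up once over-fitting sets in.

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.uniform(-1, 1, 40)
    y = 1 + 9 * x**2 + rng.normal(0, 1, x.size)
    x_tr, y_tr, x_val, y_val = x[:25], y[:25], x[25:], y[25:]

    for deg in range(1, 9):
        w = np.polyfit(x_tr, y_tr, deg)
        E_in  = np.mean((np.polyval(w, x_tr)  - y_tr)**2)    # per-point squared deviation
        E_val = np.mean((np.polyval(w, x_val) - y_val)**2)
        print(deg, E_in, E_val)   # pick the degree that minimizes E_val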

5 Using E_val to avoid over-fitting works in all dimensions, but the computation grows rapidly for large d. [Plot for d = 2: E_in, E_cv, and E_val as the terms of Φ5(x) are added successively.] The validation set needs to be large. Does this compromise training?
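
A sketch of that experiment for d = 2 (the target, data, and split below are assumed for illustration): build the 21 monomial terms of Φ5(x), add them one column at a time, and compare E_in with E_val on a held-out set.

    import numpy as np

    def phi5(X):
        # all monomials x1**i * x2**j with i + j <= 5 (21 columns, including the constant)
        cols = [X[:, 0]**i * X[:, 1]**j for i in range(6) for j in range(6 - i)]
        return np.column_stack(cols)

    rng = np.random.default_rng(2)
    X = rng.uniform(-1, 1, (60, 2))
    y = 1 + 2*X[:, 0] - 3*X[:, 1]**2 + rng.normal(0, 0.5, 60)   # assumed 2D target

    Z = phi5(X)
    Z_tr, y_tr, Z_val, y_val = Z[:40], y[:40], Z[40:], y[40:]
    for k in range(1, Z.shape[1] + 1):            # add terms of Phi_5(x) successively
        w, *_ = np.linalg.lstsq(Z_tr[:, :k], y_tr, rcond=None)
        E_in  = np.mean((Z_tr[:, :k]  @ w - y_tr)**2)
        E_val = np.mean((Z_val[:, :k] @ w - y_val)**2)
        print(k, E_in, E_val)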

6 What if we want to add higher-order terms to a linear model but don't have enough data for a validation set? Solution: augment the error function used to optimize the weights, e.g. E_aug(w) = E_in(w) + λ w^T w. This penalizes choices with large |w| and is called "weight decay".
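
A minimal sketch of that augmented error (squared-error E_in plus a weight-decay penalty; the function name is ours):

    import numpy as np

    def e_aug(Z, y, w, lam):
        e_in = np.mean((Z @ w - y)**2)     # ordinary squared-error E_in
        return e_in + lam * np.dot(w, w)   # weight-decay term penalizes large |w|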

7 Normal equations with weight decay are essentially unchanged: (Z^T Z + λI) w_reg = Z^T y
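
In code this is a one-line solve (a sketch; solving the linear system is preferable to forming an explicit inverse):

    import numpy as np

    def fit_weight_decay(Z, y, lam):
        d = Z.shape[1]
        # (Z^T Z + lambda*I) w_reg = Z^T y
        return np.linalg.solve(Z.T @ Z + lam * np.eye(d), Z.T @ y)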

8 The best value of λ is subjective. In this case λ = 0.0001 is large enough to suppress the swings, while the data are still the main factor determining the optimum weights.

9 Assignment 8 (due 11-13-14): Generate an in silico dataset y(x) = 1 + 9x^2 + N(0,1) with 5 randomly selected values of x between -1 and +1. Fit a 4th-degree polynomial to the data with and without regularization by choosing λ = 0, 0.0001, 0.001, 0.01, 1.0, and 10. Display the results as in slide 8 of the lecture on regularization.
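
A sketch of the computational part of the assignment (the random seed is assumed and the slide-8-style plotting is omitted):

    import numpy as np

    rng = np.random.default_rng(8)
    x = rng.uniform(-1, 1, 5)
    y = 1 + 9 * x**2 + rng.normal(0, 1, 5)          # y(x) = 1 + 9x^2 + N(0,1)

    Z = np.vander(x, 5, increasing=True)            # columns 1, x, x^2, x^3, x^4
    for lam in [0, 0.0001, 0.001, 0.01, 1.0, 10]:
        w_reg = np.linalg.solve(Z.T @ Z + lam * np.eye(5), Z.T @ y)
        print(lam, w_reg)                           # lambda = 0 is the unregularized fit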

