1 WEEK 2 SOFT COMPUTING & MACHINE LEARNING YOSI KRISTIAN Gradient Descent for Linear Regression

2 Gradient Descent for Single-Variable Linear Regression

3 Have some function $J(\theta_0, \theta_1)$. Want $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$. Outline: start with some $\theta_0, \theta_1$; keep changing $\theta_0, \theta_1$ to reduce $J(\theta_0, \theta_1)$ until we hopefully end up at a minimum.
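To make the outline concrete, here is a minimal Python sketch of the idea on a toy convex function $J(\theta) = \theta^2$; the function, starting point, and learning rate are illustrative assumptions, not from the slides:

```python
# Minimal gradient descent on a toy function J(theta) = theta^2,
# whose derivative is dJ/dtheta = 2 * theta.

def J(theta):
    return theta ** 2

def dJ(theta):
    return 2 * theta

theta = 5.0   # start with some theta (arbitrary choice)
alpha = 0.1   # learning rate (illustrative value)

for step in range(50):
    theta = theta - alpha * dJ(theta)  # keep changing theta to reduce J

print(theta, J(theta))  # theta ends up near the minimum at 0
```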

4 Illustration: surface plot of the cost $J(\theta_0, \theta_1)$ over the $(\theta_0, \theta_1)$ plane.

5 Illustration (continued): the surface plot of $J(\theta_0, \theta_1)$, descending step by step from the starting point.

6 The Algorithm. Gradient descent: repeat until convergence { $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$ (for $j = 0$ and $j = 1$) }. Correct (simultaneous update): compute $\text{temp}_0 := \theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)$ and $\text{temp}_1 := \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)$ first, then assign $\theta_0 := \text{temp}_0$, $\theta_1 := \text{temp}_1$. Incorrect: updating $\theta_0$ first and then computing $\theta_1$'s update from the already-changed $\theta_0$.
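A short Python sketch of the distinction; the coupled quadratic cost is an illustrative assumption, chosen only so the update order actually matters:

```python
# Illustrative coupled cost: J(t0, t1) = (t0 + t1 - 3)**2
def dJ_dt0(t0, t1):
    return 2 * (t0 + t1 - 3)

def dJ_dt1(t0, t1):
    return 2 * (t0 + t1 - 3)

alpha = 0.1
t0, t1 = 0.0, 0.0

# Correct: simultaneous update -- both partials use the OLD (t0, t1).
temp0 = t0 - alpha * dJ_dt0(t0, t1)
temp1 = t1 - alpha * dJ_dt1(t0, t1)
t0, t1 = temp0, temp1
print(t0, t1)   # (0.6, 0.6)

# Incorrect: sequential update -- the second partial sees the NEW t0.
u0, u1 = 0.0, 0.0
u0 = u0 - alpha * dJ_dt0(u0, u1)   # u0 becomes 0.6
u1 = u1 - alpha * dJ_dt1(u0, u1)   # uses updated u0, so u1 becomes 0.48
print(u0, u1)
```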

7 Algorithm explained: in $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$, $\alpha$ is the learning rate (how big a step we take), and $\frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$ is the derivative term (the direction and steepness of $J$ at the current parameters).

8 Effects of $\alpha$: if $\alpha$ is too small, gradient descent can be slow. If $\alpha$ is too large, gradient descent can overshoot the minimum; it may fail to converge, or even diverge.
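A quick way to see both failure modes, again on the toy cost $J(\theta) = \theta^2$; the specific $\alpha$ values are illustrative:

```python
def dJ(theta):          # derivative of J(theta) = theta**2
    return 2 * theta

def run(alpha, steps=20, theta=5.0):
    for _ in range(steps):
        theta -= alpha * dJ(theta)
    return theta

print(run(alpha=0.001))  # too small: after 20 steps, still far from 0 (slow)
print(run(alpha=0.45))   # reasonable: ends close to the minimum at 0
print(run(alpha=1.1))    # too large: |theta| grows every step (diverges)
```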

9 Fixed $\alpha$: gradient descent can converge to a local minimum even with the learning rate $\alpha$ held fixed. As we approach a local minimum the derivative term shrinks, so gradient descent automatically takes smaller steps; there is no need to decrease $\alpha$ over time.

10 Applying Gradient Descent to Linear Regression. Gradient descent algorithm: repeat until convergence { $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$ for $j = 0, 1$ }. Linear regression model: $h_\theta(x) = \theta_0 + \theta_1 x$ with cost $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})^2$.

11 Gradient descent function: working out the derivative term for the linear regression cost gives $\frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1) = \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})$ and $\frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1) = \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})\, x^{(i)}$.

12 Algorithm: repeat until convergence { $\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})$; $\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})\, x^{(i)}$ }, updating $\theta_0$ and $\theta_1$ simultaneously.
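Putting slides 10 to 12 together, a minimal NumPy sketch of the algorithm; the toy data and hyperparameters are illustrative assumptions:

```python
import numpy as np

# Toy data: y is roughly 2 + 3x (illustrative, not from the slides).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 4.9, 8.2, 11.1, 13.9])
m = len(x)

theta0, theta1 = 0.0, 0.0   # start with some theta
alpha = 0.05                # learning rate

for _ in range(2000):
    h = theta0 + theta1 * x                # hypothesis h_theta(x)
    err = h - y
    grad0 = err.sum() / m                  # dJ/dtheta0
    grad1 = (err * x).sum() / m            # dJ/dtheta1
    # simultaneous update of both parameters
    theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1

print(theta0, theta1)  # should approach roughly (2, 3)
```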

13 Remember the local minimum problem: on a general cost surface $J(\theta_0, \theta_1)$, gradient descent can end up in a different local minimum depending on where it starts.

14 It won't happen here: the squared-error cost for linear regression is a convex, bowl-shaped function, so it has no local minima other than the single global minimum; gradient descent (with a suitable $\alpha$) always converges to that global minimum.

15 "Batch" gradient descent. "Batch": each step of gradient descent uses all $m$ training examples (the sums above run over the entire training set).

16–24 Visualization: each of these slides pairs two plots. On the left, the hypothesis $h_\theta(x)$ for the current fixed $(\theta_0, \theta_1)$, drawn as a function of $x$ over the training data; on the right, the contour plot of the cost $J(\theta_0, \theta_1)$ as a function of the parameters. Across the slides, gradient descent steps the parameter point across the contours toward the minimum while the fitted line on the left progressively matches the data.

25 Homework: Create a program demonstrating gradient descent on a one-variable linear regression problem. Use the Diamond data. Input: 1 variable; output: 1 variable. Visualize your program (the MSE over iterations and the regression line). It must be possible to manually initialize $\theta_0$ and $\theta_1$.
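A possible starting skeleton for this homework; the file name diamond.csv, its column layout, and the hyperparameters are assumptions to be adapted to the actual Diamond data:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical file and column layout -- adjust to the real Diamond data.
data = np.loadtxt("diamond.csv", delimiter=",", skiprows=1)
x, y = data[:, 0], data[:, 1]

theta0, theta1 = 0.0, 0.0   # manually initializable parameters
alpha, iters = 0.01, 1000
mse_history = []

for _ in range(iters):
    err = theta0 + theta1 * x - y
    mse_history.append((err ** 2).mean())
    theta0, theta1 = (theta0 - alpha * err.mean(),
                      theta1 - alpha * (err * x).mean())

fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.plot(mse_history)                 # MSE per iteration
ax1.set(xlabel="iteration", ylabel="MSE")
ax2.scatter(x, y)                     # data and fitted regression line
ax2.plot(x, theta0 + theta1 * x, "r")
plt.show()
```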

26 Multiple features Linear Regression with multiple variables

27 Previously

28 Multiple Features

29 Multiple features (variables). Notation: $n$ = number of features; $x^{(i)}$ = the input (features) of the $i$-th training example; $x_j^{(i)}$ = the value of feature $j$ in the $i$-th training example.

30 Hypothesis. Previously: $h_\theta(x) = \theta_0 + \theta_1 x$. Now, with multiple features: $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n$.

31 Still the hypothesis: for convenience of notation, define $x_0 = 1$, so that $x = (x_0, x_1, \dots, x_n)$ and $\theta = (\theta_0, \theta_1, \dots, \theta_n)$, giving $h_\theta(x) = \theta^T x = \theta_0 x_0 + \theta_1 x_1 + \dots + \theta_n x_n$. This is multivariate linear regression.
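The $x_0 = 1$ trick in NumPy terms, as a small sketch; the feature and parameter values are illustrative:

```python
import numpy as np

x_raw = np.array([2104.0, 3.0])      # e.g. size, bedrooms (illustrative values)
theta = np.array([80.0, 0.1, 50.0])  # theta_0, theta_1, theta_2

x = np.concatenate(([1.0], x_raw))   # prepend x_0 = 1
h = theta @ x                        # h_theta(x) = theta^T x (dot product)
print(h)
```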

32 Gradient descent for multiple-variable linear regression Linear Regression with multiple variables

33 Hypothesis: $h_\theta(x) = \theta^T x = \theta_0 x_0 + \theta_1 x_1 + \dots + \theta_n x_n$ (with $x_0 = 1$). Parameters: the $(n+1)$-dimensional vector $\theta$. Simplified cost function: $J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})^2$. Gradient descent: repeat { $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)$ } (simultaneously update for every $j = 0, \dots, n$).

34 Gradient Descent. Previously ($n = 1$): repeat { $\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})$; $\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})\, x^{(i)}$ } (simultaneously update $\theta_0, \theta_1$). New algorithm ($n \ge 1$): repeat { $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})\, x_j^{(i)}$ } (simultaneously update $\theta_j$ for $j = 0, \dots, n$).
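The whole multi-variable update fits in a few vectorized lines; a minimal sketch with made-up data, where the design matrix X already carries the $x_0 = 1$ column:

```python
import numpy as np

# Illustrative data: 4 examples, 2 features, plus the x0 = 1 column.
X = np.array([[1.0, 2.0, 1.0],
              [1.0, 4.0, 2.0],
              [1.0, 6.0, 2.0],
              [1.0, 8.0, 3.0]])
y = np.array([5.0, 9.0, 11.0, 15.0])
m, n_plus_1 = X.shape

theta = np.zeros(n_plus_1)
alpha = 0.01

for _ in range(5000):
    err = X @ theta - y               # h_theta(x^(i)) - y^(i) for all i at once
    theta -= alpha * (X.T @ err) / m  # simultaneous update of every theta_j

print(theta)
```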

35 Gradient descent in practice I: Feature Scaling Linear Regression with multiple variables

36 Feature Scaling. Idea: make sure features are on a similar scale. E.g. $x_1$ = size (0–2000 feet²), $x_2$ = number of bedrooms (1–5); on these raw scales the contours of $J(\theta)$ (plotted over size and number of bedrooms) are elongated, and gradient descent converges slowly.

37 Feature Scaling: get every feature into approximately a $-1 \le x_i \le 1$ range.

38 Mean normalization: replace $x_i$ with $x_i - \mu_i$ (where $\mu_i$ is the mean of feature $i$) to make features have approximately zero mean (do not apply to $x_0 = 1$). Combined with scaling by the feature's range $s_i$: $x_i := \frac{x_i - \mu_i}{s_i}$. E.g. with an average size of 1000, $x_1 := \frac{\text{size} - 1000}{2000}$.
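A small NumPy sketch of mean normalization; the matrix values are illustrative, and the $x_0$ column of ones is added only after scaling so it stays exactly 1:

```python
import numpy as np

# Illustrative raw features: size in feet^2, number of bedrooms.
X = np.array([[2104.0, 3.0],
              [1416.0, 2.0],
              [1534.0, 3.0],
              [ 852.0, 2.0]])

mu = X.mean(axis=0)                  # per-feature mean mu_i
s = X.max(axis=0) - X.min(axis=0)    # per-feature range s_i (std also works)

X_scaled = (X - mu) / s              # zero-mean, similarly scaled features
print(X_scaled)

# Add the x0 = 1 column after scaling, so mean normalization never touches it.
X_design = np.column_stack([np.ones(len(X_scaled)), X_scaled])
```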

39 Choosing Learning Rate Linear Regression with multiple variables

40 Making sure gradient descent is working correctly: plot $J(\theta)$ against the number of iterations; it should decrease on every iteration. Example automatic convergence test: declare convergence if $J(\theta)$ decreases by less than some small $\varepsilon$ (e.g. $10^{-3}$) in one iteration.
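The convergence test written as a loop condition, sketched in Python; step_fn, cost_fn, and the threshold are placeholders to be supplied by the caller:

```python
def gradient_descent(step_fn, cost_fn, theta, eps=1e-3, max_iters=10_000):
    """Run step_fn until cost_fn decreases by less than eps in one iteration."""
    prev_cost = cost_fn(theta)
    for _ in range(max_iters):
        theta = step_fn(theta)        # one simultaneous parameter update
        cost = cost_fn(theta)
        if prev_cost - cost < eps:    # automatic convergence test
            break
        prev_cost = cost
    return theta
```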

41 Making sure gradient descent is working correctly: if $J(\theta)$ increases (or oscillates) as the number of iterations grows, gradient descent is not working; use a smaller $\alpha$. For sufficiently small $\alpha$, $J(\theta)$ should decrease on every iteration, but if $\alpha$ is too small, gradient descent can be slow to converge.

42 Summary: if $\alpha$ is too small, convergence is slow; if $\alpha$ is too large, $J(\theta)$ may not decrease on every iteration and may not converge. To choose $\alpha$, try a sequence of values spaced by roughly a factor of 3, e.g. $\dots, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, \dots$
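A sketch of that sweep, reusing the vectorized update from slide 34; the toy data and candidate values are illustrative:

```python
import numpy as np

X = np.array([[1.0, 0.2], [1.0, 0.5], [1.0, 0.9]])  # toy design matrix (x0 = 1)
y = np.array([1.0, 1.6, 2.4])
m = len(y)

def final_cost(alpha, iters=200):
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        err = X @ theta - y
        theta -= alpha * (X.T @ err) / m
    return ((X @ theta - y) ** 2).sum() / (2 * m)

for alpha in [0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1.0]:
    print(alpha, final_cost(alpha))   # watch where the cost stops improving
```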

43 Homework: Create a program demonstrating gradient descent on a multiple-variable linear regression problem. Use the Housing data. Input: 2 variables; output: 1 variable. It must be possible to manually initialize the $\theta$ parameters, and $\alpha$ must be customizable. Do the feature scaling.
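A possible skeleton combining the pieces from slides 34 and 38; the file name housing.csv, its column layout, and the hyperparameters are assumptions to be adapted to the actual Housing data:

```python
import numpy as np

# Hypothetical file and column layout -- adjust to the real Housing data.
data = np.loadtxt("housing.csv", delimiter=",", skiprows=1)
X_raw, y = data[:, :2], data[:, 2]          # 2 input variables, 1 output

mu, s = X_raw.mean(axis=0), X_raw.std(axis=0)
X = np.column_stack([np.ones(len(y)), (X_raw - mu) / s])  # scaling, then x0 = 1

theta = np.array([0.0, 0.0, 0.0])           # manually initializable
alpha, iters = 0.1, 1000                    # alpha is customizable

for _ in range(iters):
    err = X @ theta - y
    theta -= alpha * (X.T @ err) / len(y)   # simultaneous update

print(theta)
```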

44 Features and polynomial regression Linear Regression with multiple variables

45 Housing prices prediction

46 Polynomial regression: fit price ($y$) as a polynomial function of size ($x$), e.g. $h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2$. Defining new features $x_1 = x$, $x_2 = x^2$ (and so on) reduces this to multivariate linear regression; feature scaling then matters, because the powers of $x$ live on very different scales.
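A sketch of polynomial regression as plain multivariate linear regression on expanded, scaled features; the toy size/price data is an illustrative assumption:

```python
import numpy as np

x = np.linspace(0.0, 2.0, 20)               # toy "size" values
y = 1.0 + 2.0 * x - 0.5 * x ** 2            # toy "price" curve (illustrative)

# New features x1 = x, x2 = x^2, then scale them to similar ranges.
P = np.column_stack([x, x ** 2])
P = (P - P.mean(axis=0)) / P.std(axis=0)
X = np.column_stack([np.ones(len(x)), P])   # add x0 = 1 after scaling

theta = np.zeros(3)
alpha = 0.1
for _ in range(5000):
    err = X @ theta - y
    theta -= alpha * (X.T @ err) / len(y)   # same update rule as slide 34

print(theta)   # coefficients in the scaled feature space; X @ theta matches y
```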

47 Finally … Fin…

