# CSC 4510 – Machine Learning Dr. Mary-Angela Papalaskari Department of Computing Sciences Villanova University Course website: www.csc.villanova.edu/~map/4510/


CSC 4510 – Machine Learning, Dr. Mary-Angela Papalaskari, Department of Computing Sciences, Villanova University. Course website: www.csc.villanova.edu/~map/4510/

### 5: Multivariate Regression

CSC 4510 - M.A. Papalaskari - Villanova University

The slides in this presentation are adapted from Andrew Ng's ML course: http://www.ml-class.org/

### Regression topics so far

- Introduction to linear regression
- Intuition – least squares approximation
- Intuition – gradient descent algorithm
- Hands on: simple example using Excel
- How to apply gradient descent to minimize the cost function for regression
- Linear algebra refresher

### What's next?

- Multivariate regression
- Gradient descent revisited
  - Feature scaling and normalization
  - Selecting a good value for α
- Non-linear regression
- Solving for θ analytically (Normal Equation)
- Using Octave to solve regression problems

### What's next?

We are not in univariate regression anymore:

| $x_0$ | Size (feet²) | Number of bedrooms | Number of floors | Age of home (years) | Price (\$1000) |
|---|---|---|---|---|---|
| 1 | 2104 | 5 | 1 | 45 | 460 |
| 1 | 1416 | 3 | 2 | 40 | 232 |
| 1 | 1534 | 3 | 2 | 30 | 315 |
| 1 | 852 | 2 | 1 | 36 | 178 |


### Multiple features (variables)

| Size (feet²) | Number of bedrooms | Number of floors | Age of home (years) | Price (\$1000) |
|---|---|---|---|---|
| 2104 | 5 | 1 | 45 | 460 |
| 1416 | 3 | 2 | 40 | 232 |
| 1534 | 3 | 2 | 30 | 315 |
| 852 | 2 | 1 | 36 | 178 |
| … | … | … | … | … |

Notation:

- $n$ = number of features
- $x^{(i)}$ = input (features) of the $i$-th training example
- $x_j^{(i)}$ = value of feature $j$ in the $i$-th training example
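The notation above maps directly onto matrix indexing in Octave. The following is a minimal sketch, assuming the training set is stored with one example per row; the variable names (`X_raw`, `x_2`, `x_2_3`) are illustrative and not from the slides.

```octave
% Rows are training examples, columns are features
% (size, bedrooms, floors, age), taken from the housing table.
X_raw = [2104 5 1 45;
         1416 3 2 40;
         1534 3 2 30;
          852 2 1 36];
y = [460; 232; 315; 178];   % price in $1000

[m, n] = size(X_raw);       % m = number of examples, n = number of features
x_2    = X_raw(2, :)';      % x^(2): feature vector of the 2nd training example
x_2_3  = X_raw(2, 3);       % x_3^(2): value of feature 3 in the 2nd example
```

With this layout, row index `i` selects a training example and column index `j` selects a feature.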

### Multiple features (variables)

| Size (feet²) | Price (\$1000) |
|---|---|
| 2104 | 460 |
| 1416 | 232 |
| 1534 | 315 |
| 852 | 178 |
| … | … |

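### Multivariate linear regression

Hypothesis:

Previously: $h_\theta(x) = \theta_0 + \theta_1 x$

Now: $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$

For convenience of notation, define $x_0 = 1$. With $x = [x_0, x_1, \ldots, x_n]^T$ and $\theta = [\theta_0, \theta_1, \ldots, \theta_n]^T$, both in $\mathbb{R}^{n+1}$, the hypothesis becomes $h_\theta(x) = \theta^T x$.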
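As a rough illustration of the vectorized hypothesis, the sketch below evaluates $h_\theta(x) = \theta^T x$ on the housing data. The parameter values in `theta` are hypothetical, chosen only so the arithmetic is easy to follow; they are not fitted values.

```octave
% Housing features: size, bedrooms, floors, age.
X_raw = [2104 5 1 45;
         1416 3 2 40;
         1534 3 2 30;
          852 2 1 36];
m = size(X_raw, 1);          % number of training examples
X = [ones(m, 1), X_raw];     % prepend x_0 = 1 to every example

% Hypothetical parameters (n + 1 = 5 entries), for illustration only.
theta = [80; 0.1; 10; 5; -1];

h_first = theta' * X(1, :)'; % h_theta(x^(1)) = theta' * x for one example
h_all   = X * theta;         % all m predictions in a single product
```

Adding the column of ones is what lets $\theta_0$ be handled like every other parameter.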

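Hypothesis: $h_\theta(x) = \theta^T x = \theta_0 x_0 + \theta_1 x_1 + \cdots + \theta_n x_n$

Parameters: $\theta_0, \theta_1, \ldots, \theta_n$

Cost function:

$$J(\theta_0, \theta_1, \ldots, \theta_n) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

Gradient descent:

Repeat {

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \ldots, \theta_n)$$

} (simultaneously update for every $j = 0, \ldots, n$)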
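The cost function can be computed in a couple of lines once the data is in matrix form. This is a sketch under the assumption that `X` already contains the $x_0 = 1$ column; `theta` is set to all zeros purely as a starting point.

```octave
% Housing data with the x_0 = 1 column already prepended.
X = [1 2104 5 1 45;
     1 1416 3 2 40;
     1 1534 3 2 30;
     1  852 2 1 36];
y = [460; 232; 315; 178];
m = length(y);

theta = zeros(5, 1);            % starting point: all parameters zero

err = X * theta - y;            % h_theta(x^(i)) - y^(i) for every i
J   = (err' * err) / (2 * m);   % J(theta) = 1/(2m) * sum of squared errors
```

With `theta` all zeros every prediction is 0, so `J` is just the sum of the squared prices divided by `2*m`.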

### Gradient Descent

Previously ($n = 1$):

Repeat {

$$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$$

$$\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}$$

} (simultaneously update $\theta_0, \theta_1$)

New algorithm ($n \ge 1$):

Repeat {

$$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$

} (simultaneously update $\theta_j$ for $j = 0, \ldots, n$)
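The update rule for general $n$ can be sketched as a short loop. This example assumes a tiny data set lying on the line $y = 2x$; the learning rate `alpha = 0.1` and the iteration count are hypothetical choices that happen to work for this data, not recommendations.

```octave
% Toy data on the line y = 2x.
x = [0 1 2 3]';
y = [0 2 4 6]';
m = length(y);
X = [ones(m, 1), x];            % x_0 = 1 column plus the single feature

theta     = zeros(2, 1);
alpha     = 0.1;                % hypothetical learning rate
num_iters = 2000;               % hypothetical iteration count
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
  % Gradient for all j at once: (1/m) * sum_i (h(x^(i)) - y^(i)) * x_j^(i)
  grad  = (X' * (X * theta - y)) / m;
  % Updating the whole vector in one assignment is the simultaneous update.
  theta = theta - alpha * grad;
  J_history(iter) = sum((X * theta - y) .^ 2) / (2 * m);
end
```

Because `grad` is computed from the old `theta` before any component changes, the vector assignment updates every $\theta_j$ simultaneously.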

### Feature Scaling

Idea: Make sure features are on a similar scale.

E.g. $x_1$ = size (0-2000 feet²), $x_2$ = number of bedrooms (1-5). Rescale as

$$x_1 = \frac{\text{size (feet}^2)}{2000}, \qquad x_2 = \frac{\text{number of bedrooms}}{5}$$

Get every feature into approximately a $-1 \le x_i \le 1$ range.

### Mean normalization

Replace $x_i$ with $x_i - \mu_i$ to make features have approximately zero mean (do not apply to $x_0 = 1$).

E.g. with $x_1$ = size (0-2000 feet²) and $x_2$ = number of bedrooms (1-5):

$$x_1 = \frac{\text{size} - \mu_1}{2000}, \qquad x_2 = \frac{\text{bedrooms} - \mu_2}{5}$$

where $\mu_i$ is the average value of feature $i$ in the training set.
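Scaling and mean normalization can be combined in a few vectorized lines. The sketch below assumes each feature is divided by its observed range (max minus min); dividing by the standard deviation is a common alternative. The names `mu`, `s` and `X_norm` are illustrative.

```octave
% Two features from the housing table: size and number of bedrooms.
X_raw = [2104 5;
         1416 3;
         1534 3;
          852 2];

mu = mean(X_raw);                 % per-feature mean (row vector)
s  = max(X_raw) - min(X_raw);     % per-feature range (row vector)

% Broadcasting subtracts/divides each column by its own mu and s.
X_norm = (X_raw - mu) ./ s;

% x_0 = 1 is added after normalization, so it is left untouched.
X = [ones(size(X_norm, 1), 1), X_norm];
```

The same `mu` and `s` would need to be reused to transform any new example before making a prediction.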

### Gradient descent

- "Debugging": how to make sure gradient descent is working correctly.
- How to choose the learning rate α.

### Making sure gradient descent is working correctly

[Plot: cost versus no. of iterations]

- For sufficiently small α, $J(\theta)$ should decrease on every iteration.
- But if α is too small, gradient descent can be slow to converge.

Possible convergence test: declare convergence if $J(\theta)$ decreases by less than some small threshold $\varepsilon$ in one iteration.
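A convergence test of this kind might look like the following sketch. The threshold `epsilon = 1e-3`, the learning rate and the toy data are all assumptions made for illustration; a suitable threshold depends on the scale of the cost.

```octave
% Toy data on the line y = 2x.
x = [0 1 2 3]';
y = [0 2 4 6]';
m = length(y);
X = [ones(m, 1), x];

theta     = zeros(2, 1);
alpha     = 0.1;       % hypothetical learning rate
epsilon   = 1e-3;      % hypothetical convergence threshold
max_iters = 10000;     % safety cap on the number of iterations

J_prev    = sum((X * theta - y) .^ 2) / (2 * m);
converged = false;
for iter = 1:max_iters
  theta = theta - alpha * (X' * (X * theta - y)) / m;
  J     = sum((X * theta - y) .^ 2) / (2 * m);
  if J > J_prev
    error('J increased; alpha is probably too large');
  end
  if J_prev - J < epsilon
    converged = true;  % the cost barely changed in this iteration
    break;
  end
  J_prev = J;
end
```

A loose threshold can stop the loop well before `theta` is close to the minimizer, so plotting the cost against the iteration number remains a useful check.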

### Summary: Choosing α

- If α is too small: slow convergence.
- If α is too large: $J(\theta)$ may not decrease on every iteration; may not converge.

To choose α, try a range of values, e.g. …, 0.001, 0.01, 0.1, 1, …
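Trying several learning rates can be automated with a small experiment. The sketch below assumes the same toy data on the line $y = 2x$ and a hypothetical candidate list `alphas`; for each candidate it records the final cost and whether the cost decreased on every iteration.

```octave
% Toy data on the line y = 2x.
x = [0 1 2 3]';
y = [0 2 4 6]';
m = length(y);
X = [ones(m, 1), x];

alphas     = [0.001 0.01 0.1 1];   % hypothetical candidate learning rates
num_iters  = 50;
J_final    = zeros(size(alphas));
decreasing = false(size(alphas));

for k = 1:numel(alphas)
  theta = zeros(2, 1);
  J = zeros(num_iters, 1);
  for iter = 1:num_iters
    theta   = theta - alphas(k) * (X' * (X * theta - y)) / m;
    J(iter) = sum((X * theta - y) .^ 2) / (2 * m);
  end
  J_final(k)    = J(end);
  decreasing(k) = all(diff(J) < 0);  % did J drop on every iteration?
  fprintf('alpha = %5.3f  final J = %g  decreasing: %d\n', ...
          alphas(k), J_final(k), decreasing(k));
end
```

On this data the three smaller rates decrease the cost steadily, at different speeds, while `alpha = 1` makes it grow.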

### Housing prices prediction

### Polynomial regression

[Plot: Price (y) versus Size (x)]

### Choice of features

[Plot: Price (y) versus Size (x)]
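Polynomial regression and other feature choices reduce to ordinary linear regression on derived columns. The sketch below assumes two illustrative models, a cubic in size and a size-plus-square-root model; both are assumptions for the example, since the formulas on these slides did not survive in the transcript.

```octave
% House sizes (feet^2) from the housing table.
sz = [2104; 1416; 1534; 852];
m  = length(sz);

% Assumed cubic model: theta_0 + theta_1*size + theta_2*size^2 + theta_3*size^3
X_poly = [ones(m, 1), sz, sz .^ 2, sz .^ 3];

% Assumed alternative: theta_0 + theta_1*size + theta_2*sqrt(size)
X_sqrt = [ones(m, 1), sz, sqrt(sz)];
```

The columns of `X_poly` differ by many orders of magnitude, which is where feature scaling becomes important if gradient descent is used.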

### Gradient Descent

Normal equation: Method to solve for θ analytically.

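### Intuition

If 1D ($\theta \in \mathbb{R}$): $J(\theta) = a\theta^2 + b\theta + c$. Set $\frac{d}{d\theta} J(\theta) = 0$ and solve for $\theta$.

For $\theta \in \mathbb{R}^{n+1}$: set $\frac{\partial}{\partial \theta_j} J(\theta) = 0$ (for every $j$) and solve for $\theta_0, \theta_1, \ldots, \theta_n$.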
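One way to carry out the "solve for θ" step is sketched here for the squared-error cost used in these slides. It assumes $X$ is the $m \times (n+1)$ matrix whose rows are the training examples (each with $x_0 = 1$) and $y$ is the vector of target values:

$$
\frac{\partial}{\partial \theta_j} J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( \theta^T x^{(i)} - y^{(i)} \right) x_j^{(i)} = 0 \qquad (j = 0, \ldots, n)
$$

$$
\Longleftrightarrow \quad X^T (X\theta - y) = 0 \quad \Longleftrightarrow \quad X^T X \, \theta = X^T y
$$

When $X^T X$ is invertible, this gives $\theta = (X^T X)^{-1} X^T y$.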

Examples: $m = 4$.

| $x_0$ | Size (feet²) | Number of bedrooms | Number of floors | Age of home (years) | Price (\$1000) |
|---|---|---|---|---|---|
| 1 | 2104 | 5 | 1 | 45 | 460 |
| 1 | 1416 | 3 | 2 | 40 | 232 |
| 1 | 1534 | 3 | 2 | 30 | 315 |
| 1 | 852 | 2 | 1 | 36 | 178 |

Examples: $m = 5$.

| $x_0$ | Size (feet²) | Number of bedrooms | Number of floors | Age of home (years) | Price (\$1000) |
|---|---|---|---|---|---|
| 1 | 2104 | 5 | 1 | 45 | 460 |
| 1 | 1416 | 3 | 2 | 40 | 232 |
| 1 | 1534 | 3 | 2 | 30 | 315 |
| 1 | 852 | 2 | 1 | 36 | 178 |
| 1 | 3000 | 4 | 1 | 38 | 540 |

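$m$ examples $(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})$; $n$ features.

$$x^{(i)} = \begin{bmatrix} x_0^{(i)} \\ x_1^{(i)} \\ \vdots \\ x_n^{(i)} \end{bmatrix} \in \mathbb{R}^{n+1}, \qquad X = \begin{bmatrix} (x^{(1)})^T \\ (x^{(2)})^T \\ \vdots \\ (x^{(m)})^T \end{bmatrix}$$

E.g. if $x^{(i)} = \begin{bmatrix} 1 \\ x_1^{(i)} \end{bmatrix}$, then

$$X = \begin{bmatrix} 1 & x_1^{(1)} \\ 1 & x_1^{(2)} \\ \vdots & \vdots \\ 1 & x_1^{(m)} \end{bmatrix}$$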

$$\theta = (X^T X)^{-1} X^T y$$

$(X^T X)^{-1}$ is the inverse of the matrix $X^T X$.

Octave: `pinv(X'*X)*X'*y`
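As a small worked example of the Octave one-liner, the sketch below fits price against size only, using the four examples from the housing table. Restricting to one feature is a simplification made here to keep the numbers easy to check.

```octave
% Size (feet^2) and price ($1000) from the housing table.
sz = [2104; 1416; 1534; 852];
y  = [460; 232; 315; 178];
m  = length(y);

X = [ones(m, 1), sz];            % design matrix: x_0 = 1 and size

% Normal equation as written on the slide.
theta = pinv(X' * X) * X' * y;

% Predicted price (in $1000) for a hypothetical 2000 feet^2 house.
price_2000 = [1, 2000] * theta;
```

No learning rate or iteration count is needed. Octave's backslash operator, `X \ y`, solves the same least-squares problem and can serve as a cross-check.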

$m$ training examples, $n$ features.

| Gradient Descent | Normal Equation |
|---|---|
| Need to choose α. | No need to choose α. |
| Needs many iterations. | Don't need to iterate. |
| Works well even when $n$ is large. | Need to compute $(X^T X)^{-1}$. Slow if $n$ is very large. |

- Notes on supervised learning and regression: http://see.stanford.edu/materials/aimlcs229/cs229-notes1.pdf
- Octave: http://www.gnu.org/software/octave/
  - Wiki: http://www.octave.org/wiki/index.php?title=Main_Page
  - Documentation: http://www.gnu.org/software/octave/doc/interpreter/

### Exercise

For next class:

1. Download and install Octave. (Alternative: if you have MATLAB, you can use it instead.)
2. Verify that it is working by typing the following in an Octave command window: `x = [0 1 2 3]`, then `y = [0 2 4 6]`, then `plot(x,y)`. This example defines two vectors, x and y, and should display a plot showing a straight line (the line y = 2x). If you get an error at this point, it may be that gnuplot is not installed or cannot access your display. If you are unable to get this to work, you can still do the rest of this exercise, because it does not involve any plotting (just restart Octave). You might refer to the Octave wiki for installation help, but if you are stuck, you can get some help troubleshooting this on Friday afternoon, 3-4pm, in the software engineering lab (Mendel 159).
3. Create a few matrices and vectors, e.g. `A = [1 2; 3 4; 5 6]` and `V = [3 5 -1 0 7]`.
4. Try some of the elementary matrix and vector operations from our linear algebra slides (adding and multiplying between matrices, vectors and scalars).
5. Print out a log of your session.
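A few operations of the kind step 4 has in mind are sketched below, using the matrix and vector from step 3. The result names (`At`, `twoA`, `sumA`, `G`, `d`) are illustrative, and this is only one possible selection of operations.

```octave
A = [1 2; 3 4; 5 6];     % 3x2 matrix
V = [3 5 -1 0 7];        % 1x5 row vector

At   = A';               % transpose: 2x3
twoA = 2 * A;            % scalar times matrix
sumA = A + twoA;         % matrix addition (same dimensions required)
G    = A' * A;           % (2x3) * (3x2) gives a 2x2 matrix
d    = V * V';           % (1x5) * (5x1) gives a scalar: the dot product of V with itself

% Note: A * V is not defined, because the inner dimensions (2 and 1) differ.
```

For step 5, Octave's `diary on` and `diary off` commands are one way to record a session to a file.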
