
Regression UC Berkeley Fall 2004, E77 Copyright 2005, Andy Packard. This work is licensed under the Creative Commons.


1 Regression UC Berkeley Fall 2004, E77 http://jagger.me.berkeley.edu/~pack/e77 Copyright 2005, Andy Packard. This work is licensed under the Creative Commons Attribution-ShareAlike License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/2.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.

2 Info
Midterm next Friday (11/5), 1-2, if you actually enrolled in my section. Check BlackBoard to see the room location. If you have a conflict (as before), bring a letter, schedule printout, etc. to class next Monday so that we can make arrangements.
Review session Wednesday 11/3 evening. Check BlackBoard.
HW and Lab due this Friday (10/29).
Mid-course evaluation on BlackBoard. Do it by Thursday, 11/4 at noon. We can't see your answers, but we know if you've done it. Get an extra point.

3 Regression: Curve-fitting with minimum error
Given (x,y) data pairs (x1,y1), (x2,y2), ..., (xN,yN), and a prespecified collection of "simple" functions (for example: all linear functions), find one that "explains the data pairs with minimum error." For a given function f, the mismatch (error) is defined as
e1 = f(x1) - y1, e2 = f(x2) - y2, ..., ek = f(xk) - yk, ..., eN = f(xN) - yN.

4 Fitting data with a linear function
[Figure: X-Y scatter of data points with a fitted linear function; positive errors ei lie above the line and negative errors ei below it.]

5 Straight-line functions
How does it work if the function f is to be of the form f(x) = ax + b, for to-be-chosen parameters a and b? Given (x,y) data pairs (x1,y1), (x2,y2), ..., (xN,yN), for fixed values of a and b the mismatch (error) is
e1 = ax1 + b - y1, e2 = ax2 + b - y2, ..., eN = axN + b - yN.
Goal: by choosing a and b, make this mismatch to the data small.

6 Measuring the "amount" of mismatch
There are several ways to quantify the amount of mismatch. All have the property that if one component of the mismatch is "big", then the measure-of-mismatch is big. For convenience, we pick the sum of squares of the errors. This is motivated for a few reasons:
- It will lead to least squares problems, to which we have already been exposed. And, it makes sense. And...
- By making "reasonable" assumptions about the cause of the mismatch (independent, random, zero-mean, identically distributed, Gaussian additive "noise" in the measurement of y), it is the best measure of how likely a candidate function led to the data observed.

7 Euclidean norms of vectors
If v is an m-by-1 column (or row) vector, the "norm of v", denoted ||v||, is defined as the square root of the sum of squares of the components,
||v|| = sqrt(v1^2 + v2^2 + ... + vm^2),
generalizing the Pythagorean theorem. The norm of a vector is a measure of its length. Some facts:
||v|| = 0 if and only if every component of v is zero
||v + w|| <= ||v|| + ||w||
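As a quick illustration of these facts (a Python/NumPy sketch, since the course itself uses MATLAB; the vectors are made up for the example):

```python
import numpy as np

# Euclidean norm: square root of the sum of squares of the components.
v = np.array([3.0, 4.0])
w = np.array([1.0, -2.0])

norm_v = np.sqrt(np.sum(v**2))   # the 3-4-5 right triangle: norm is 5.0
print(norm_v)
print(np.linalg.norm(v))         # built-in equivalent, also 5.0

# Triangle inequality: ||v + w|| <= ||v|| + ||w||
print(np.linalg.norm(v + w) <= np.linalg.norm(v) + np.linalg.norm(w))
```

In MATLAB the same quantity is computed by norm(v).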

8 Straight-line functions
Given (x,y) data pairs (x1,y1), (x2,y2), ..., (xN,yN), stack the errors ek = axk + b - yk into the "e" vector,
e = [x1 1; x2 1; ...; xN 1][a; b] - [y1; y2; ...; yN],
and minimize ||e||. This says: "By choice of a and b, minimize the Euclidean norm of the mismatch."

9 The "Least Squares" Problem
If A is an n-by-m array, and b is an n-by-1 vector, let c* be the smallest possible (over all choices of m-by-1 vectors x) mismatch between Ax and b (i.e., pick x to make Ax as much like b as possible):
c* := min over all m-by-1 vectors x of ||Ax - b||
Here ":=" means "is defined as", the minimum is taken over all m-by-1 vectors x, and ||Ax - b|| is the length (i.e., norm) of the difference/mismatch between Ax and b.
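To make the definition concrete, here is a Python/NumPy sketch (the 3-by-2 system is hypothetical) computing c* for an overdetermined system that has no exact solution:

```python
import numpy as np

# A hypothetical 3-by-2 system: three equations, two unknowns, no exact solution.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 0.0])

# lstsq finds the x minimizing ||Ax - b||.
x, *_ = np.linalg.lstsq(A, b, rcond=None)
c_star = np.linalg.norm(A @ x - b)   # c* > 0: no x drives the mismatch to zero
```

Here x = [1/3, 1/3] and c* = 2/sqrt(3), about 1.155.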

10 Four cases for Least Squares
Recall the least squares formulation c* = min over x of ||Ax - b||. There are 4 scenarios:
c* = 0: the equation Ax = b has at least one solution
- only one x vector achieves this minimum
- many different x vectors achieve the minimum
c* > 0: the equation Ax = b has no solutions (in regression, this is almost always the case)
- only one x vector achieves this minimum
- many different x vectors achieve the minimum

11 The backslash operator
If A is an n-by-m array, and b is an n-by-1 vector, then
>> x = A\b
solves the "least squares" problem. Namely:
- If there is an x which solves Ax = b, then this x is computed.
- If there is no x which solves Ax = b, then an x which minimizes the mismatch between Ax and b is computed.
In the case where many x satisfy one of the criteria above, a smallest (in terms of vector norm) such x is computed. So, mismatch is handled first. Among all equally suitable x vectors that minimize the mismatch, choose a smallest one.
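For readers following along outside MATLAB, NumPy's np.linalg.lstsq plays the role of backslash for least squares (a sketch with made-up data points):

```python
import numpy as np

# Fit a line through (1,2), (2,3), (3,5): columns are [x, 1], as in A\b.
A = np.array([[1.0, 1.0],
              [2.0, 1.0],
              [3.0, 1.0]])
b = np.array([2.0, 3.0, 5.0])

x, *_ = np.linalg.lstsq(A, b, rcond=None)  # minimizes ||Ax - b||
slope, intercept = x                       # 1.5 and 1/3 for this data
```

When many x achieve the minimum mismatch, lstsq returns the minimum-norm one, matching the behavior described above.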

12 Straight-line functions
Given (x,y) data pairs (x1,y1), (x2,y2), ..., (xN,yN), stack the errors ek = axk + b - yk into the "e" vector,
e = [x1 1; x2 1; ...; xN 1][a; b] - [y1; y2; ...; yN],
and minimize ||e||. This says: "By choice of a and b, minimize the Euclidean norm of the mismatch."

13 Linear Regression Code

function [a,b] = linreg(Xdata,Ydata)
% Fits a linear function Y = aX + b
% to the data given by Xdata, Ydata.
% (Should verify Xdata and Ydata are
% column vectors of the same length.)
N = length(Xdata);
optpara = [Xdata ones(N,1)]\Ydata;
a = optpara(1);
b = optpara(2);
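The same function translated to Python/NumPy (a sketch mirroring the MATLAB linreg above; the test data is illustrative):

```python
import numpy as np

def linreg(xdata, ydata):
    """Least-squares fit of y = a*x + b, mirroring the MATLAB linreg."""
    x = np.asarray(xdata, dtype=float)
    A = np.column_stack([x, np.ones(len(x))])   # same as [Xdata ones(N,1)]
    a, b = np.linalg.lstsq(A, np.asarray(ydata, dtype=float), rcond=None)[0]
    return a, b

a, b = linreg([0, 1, 2, 3], [1, 3, 5, 7])  # data lying exactly on y = 2x + 1
```

Since the data lies exactly on a line, the fit recovers a = 2 and b = 1 with zero mismatch (the c* = 0 case).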

14 Quadratic functions
How does it work if the function f is to be of the form f(x) = ax^2 + bx + c, for to-be-chosen parameters a, b and c? For fixed values of a, b and c, the error at (xk,yk) is
ek = f(xk) - yk = a*xk^2 + b*xk + c - yk.

15 Polynomial functions
How does it work if the function f is to be of the form f(x) = a1*x^n + a2*x^(n-1) + ... + an*x + a(n+1), for to-be-chosen parameters a1, a2, ..., a(n+1)? For fixed values of a1, a2, ..., a(n+1), the error at (xk,yk) is
ek = f(xk) - yk = a1*xk^n + a2*xk^(n-1) + ... + an*xk + a(n+1) - yk.

16 Polynomial Regression Pseudo-Code

function p = polyreg(Xdata,Ydata,nOrd)
% Fits an nOrd'th order polynomial
% to the data given by Xdata, Ydata.
N = length(Xdata);
RM = zeros(N,nOrd+1);
RM(:,end) = ones(N,1);
for i=1:nOrd
    RM(:,end-i) = RM(:,end-i+1).*Xdata;
end
p = RM\Ydata;
p = p.';
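A Python/NumPy sketch of the same pseudo-code, building the regression matrix column-by-column exactly as the loop above does (numpy.polyfit would do the same job in one call; the sample data is made up):

```python
import numpy as np

def polyreg(xdata, ydata, n_ord):
    """Fit an n_ord'th order polynomial; coefficients from highest power down."""
    x = np.asarray(xdata, dtype=float)
    N = len(x)
    RM = np.zeros((N, n_ord + 1))
    RM[:, -1] = 1.0                       # constant column, like RM(:,end)
    for i in range(1, n_ord + 1):
        RM[:, -1 - i] = RM[:, -i] * x     # next-higher power of x
    p, *_ = np.linalg.lstsq(RM, np.asarray(ydata, dtype=float), rcond=None)
    return p

p = polyreg([-1, 0, 1, 2], [3, 1, 1, 3], 2)  # data from y = x^2 - x + 1
```

The returned p is [1, -1, 1], i.e. the coefficients of x^2 - x + 1 from highest power down, matching the ordering the MATLAB code produces.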

17 General "basis" functions
How does it work if the function f is to be of the form
f(x) = a1*b1(x) + a2*b2(x) + ... + an*bn(x)
for fixed functions b1, b2, ..., bn (called "basis" functions), and to-be-chosen parameters a1, a2, ..., an? For fixed values of a1, a2, ..., an, the error at (xk,yk) is
ek = a1*b1(xk) + a2*b2(xk) + ... + an*bn(xk) - yk.
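A Python/NumPy sketch of this general scheme (the name basisreg and the trigonometric basis are hypothetical, not from the slides): each basis function contributes one column of the regression matrix, and backslash-style least squares does the rest.

```python
import numpy as np

def basisreg(basis, xdata, ydata):
    """Least-squares fit of f(x) = sum_i a_i * b_i(x) for given basis functions."""
    x = np.asarray(xdata, dtype=float)
    A = np.column_stack([bi(x) for bi in basis])  # one column per basis function
    a, *_ = np.linalg.lstsq(A, np.asarray(ydata, dtype=float), rcond=None)
    return a

# Example basis {1, sin x, cos x}; data sampled from y = 2 + 3 sin x.
basis = [lambda x: np.ones_like(x), np.sin, np.cos]
xs = np.linspace(0.0, 6.0, 8)
a = basisreg(basis, xs, 2.0 + 3.0 * np.sin(xs))
```

Because the data lies in the span of the basis, the fit recovers a = [2, 3, 0] exactly; polynomial regression is the special case where the basis functions are powers of x.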

