
1 Intro to Linear Methods Reading: DH&S, Ch 5.{1-4,8} hip to be hyperplanar...

2 Then & now...
Last time: k-NN geometry; Bayesian decision theory -- Bayes optimal classifiers & Bayes error; NN in daily life.
Today: intro to linear methods -- formulation, math, and the linear regression problem.

3 Linear methods
Both methods we've seen so far are classification methods and use intersections of linear boundaries.
What about regression? What can we do with a single linear surface?
Not as much as you'd like... but still surprisingly a lot.
Linear regression is the proto-learning method.

4 Warning! Change of notation!
I usually write: x for the data vector; y for the class variable; y for the class vector.
In this section, your book uses: x for the data vector; y for the "augmented" data vector; g(x) for the class; b for the class vector.

5 Linear regression prelims
Basic idea: assume g(x) is a linear function of x (the form is sketched below).
Our job: find the best w to fit g(x) "as well as possible".

6-7 Linear regression prelims
Note: mind the math fonts -- that symbol could read as a double-u (w), an omega, or just a little curly squiggle...
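The slide's equation was rendered as an image and didn't survive the transcript; assuming the standard linear discriminant form from DH&S Ch 5.1, the model is:

\[ g(\mathbf{x}) = \mathbf{w}^t \mathbf{x} + w_0 \]

with weight vector w and bias (threshold) weight w_0.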

8 Linear regression prelims
By "as well as possible", we mean minimum squared error: pick the weights that minimize the loss function shown below.
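The loss function itself was an image; assuming the sum-of-squared-errors criterion of DH&S Ch 5.8, with targets b_i as introduced on slide 4, it reads:

\[ J(\mathbf{w}, w_0) = \sum_{i=1}^{n} \big( g(\mathbf{x}_i) - b_i \big)^2 \]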

9 Useful definitions
Definition: A trick is a clever mathematical hack.
Definition: A method is a trick you use more than once.

10 A helpful "method"
Recall the linear form of g(x) from slide 5.
Want to be able to easily write g(x) as a single inner product.
Introduce a "pseudo-feature" of x that is always equal to 1; now the bias folds into the weight vector (see below).
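Reconstructing the missing equations under the book's conventions (slide 4): the pseudo-feature is a constant first component equal to 1, giving the augmented vectors

\[ \mathbf{y} = \begin{bmatrix} 1 \\ \mathbf{x} \end{bmatrix}, \qquad \mathbf{a} = \begin{bmatrix} w_0 \\ \mathbf{w} \end{bmatrix}, \qquad g(\mathbf{x}) = \mathbf{a}^t \mathbf{y}. \]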

11 A helpful "method"
With the augmented vectors, g(x) = a^t y, and our "loss function" becomes a single matrix expression (see below).
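Filling in the missing chain with DH&S Ch 5.8 notation: stack the n augmented samples y_i^t as the rows of a matrix Y and the targets into a vector b, and the loss becomes

\[ J(\mathbf{a}) = \sum_{i=1}^{n} \big( \mathbf{a}^t \mathbf{y}_i - b_i \big)^2 = \lVert Y\mathbf{a} - \mathbf{b} \rVert^2. \]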

12 Minimizing loss
Finally, we can write the loss entirely in terms of the weight vector a (as above).
Want the "best" a: the weights that minimize the loss.
Q: how do you find the minimum of a function w.r.t. some parameter?

13 Minimizing loss
Back up to the 1-d case.
Suppose you had a function l(w) and wanted to find the w that minimizes l().
Standard answer: take the derivative, set it equal to 0, and solve.
To be sure of a min, check the 2nd derivative too...
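The slide's 1-d function is missing; as a hypothetical stand-in that mirrors the later vector case, take l(w) = (yw - b)^2 for constants y and b:

\[ \frac{dl}{dw} = 2y\,(yw - b) = 0 \;\Rightarrow\; w = b/y, \qquad \frac{d^2 l}{dw^2} = 2y^2 \geq 0 \ \text{(a minimum)}. \]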

14 5 minutes of math...
Some useful linear algebra identities for matrices A and B are shown below (the second requires invertible square matrices).
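The identities were rendered as images; the two the derivation ahead actually needs are presumably the product rules for transpose and inverse:

\[ (AB)^t = B^t A^t, \qquad (AB)^{-1} = B^{-1} A^{-1} \ \text{(for invertible square matrices)}. \]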

15 5 minutes of math...
What about derivatives of vectors/matrices? There's more than one kind...
For the moment, we'll need the derivative of a vector function with respect to a vector.
If x is a vector of variables, y is a vector of constants, and A is a matrix of constants, then the identities below hold.
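The stated identities didn't survive the transcript; the standard ones matching the slide's setup (x a variable vector, y a constant vector, A a constant matrix) are:

\[ \frac{\partial}{\partial \mathbf{x}} \big( \mathbf{y}^t \mathbf{x} \big) = \mathbf{y}, \qquad \frac{\partial}{\partial \mathbf{x}} \big( \mathbf{x}^t A \mathbf{x} \big) = (A + A^t)\,\mathbf{x}. \]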

16 Exercise
Derive the vector derivative expressions above.
Find an expression for the minimum squared error weight vector a in the loss function J(a) = ||Ya - b||^2.
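A sketch of where the exercise leads (consistent with the answer on slide 17): set the gradient of the squared-error loss to zero, which yields the normal equations.

\[ \nabla_{\mathbf{a}} J = 2\,Y^t (Y\mathbf{a} - \mathbf{b}) = 0 \;\Rightarrow\; Y^t Y\,\mathbf{a} = Y^t \mathbf{b} \;\Rightarrow\; \mathbf{a} = (Y^t Y)^{-1} Y^t \mathbf{b}. \]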

17 The LSE method
The quantity Y^t Y is called a Gram matrix; it is positive semidefinite and symmetric.
The quantity (Y^t Y)^{-1} Y^t is the pseudoinverse of Y. It may not exist in this form if the Gram matrix is not invertible.
The complete "learning algorithm" is 2 whole lines of Matlab code (a NumPy sketch is below).
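The two Matlab lines themselves aren't shown on the slide; here is a minimal NumPy sketch of the same LSE learning rule (the names Y, b, a follow the slide's notation; everything else is an assumption):

    import numpy as np

    def lse_fit(Y, b):
        # Y: n x (d+1) matrix whose rows are the augmented samples y_i^t
        # b: length-n vector of targets
        # pinv computes the pseudoinverse, which also handles the case
        # where the Gram matrix Y^t Y is not invertible.
        return np.linalg.pinv(Y) @ b

    # Usage: a = lse_fit(Y, b); predict with Y @ a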

