Linear Discriminant Functions Wen-Hung Liao, 11/25/2008.


1 Linear Discriminant Functions Wen-Hung Liao, 11/25/2008

2 Introduction: LDF Assume we know the proper form of the discriminant functions, instead of the underlying probability densities. Use samples to estimate the parameters of the classifier (a statistical or non-statistical approach). We will be concerned with discriminant functions that are either linear in the components of x, or linear in some given set of functions of x.

3 Why LDF? Simplicity vs. accuracy Attractive candidates for initial, trial classifiers Related to neural networks

4 Approach Find the LDF by minimizing a criterion function, using a gradient descent procedure for the minimization. Key issues: convergence properties and computational complexity. Example of a criterion function: the sample risk, or training error. (Not appropriate, why? Because a small training error does not guarantee a small test error.)

5 LDF and Decision Surfaces A linear discriminant function: g(x) = w^t x + w0, where w is the weight vector and w0 is the bias or threshold.

6 Two-Category Case Decision rule: decide ω1 if g(x) > 0, decide ω2 if g(x) < 0. In other words, x is assigned to ω1 if the inner product w^t x exceeds the threshold -w0.
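The two-category rule above can be sketched in NumPy; the weight vector w and bias w0 below are illustrative values, not from the slides.

```python
import numpy as np

# Sketch of a two-category linear discriminant in a 2-D feature space.
# w and w0 are assumed, illustrative values.
w = np.array([2.0, 1.0])   # weight vector
w0 = -4.0                  # bias (threshold)

def g(x):
    """Linear discriminant g(x) = w^t x + w0."""
    return w @ x + w0

def classify(x):
    """Decide class 1 if g(x) > 0, class 2 if g(x) < 0."""
    return 1 if g(x) > 0 else 2

print(classify(np.array([3.0, 2.0])))  # g = 6 + 2 - 4 = 4 > 0 -> 1
print(classify(np.array([0.0, 1.0])))  # g = 0 + 1 - 4 = -3 < 0 -> 2
```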

7 Decision Boundary A hyperplane H is defined by g(x) = 0. If x1 and x2 are both on the decision surface, then w^t x1 + w0 = w^t x2 + w0, so w^t (x1 - x2) = 0: w is normal to any vector lying in the hyperplane.

8 Distance Measure For any x, write x = x_p + r (w / ||w||), where x_p is the normal projection of x onto H and r is the algebraic distance. Since g(x_p) = 0, it follows that r = g(x) / ||w||.
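The projection and distance formulas above can be checked numerically; w and w0 below are illustrative.

```python
import numpy as np

# Sketch: algebraic distance r = g(x)/||w|| from x to the hyperplane
# H: g(x) = w^t x + w0 = 0, and the projection x_p of x onto H.
# Weights are assumed, illustrative values (||w|| = 5).
w = np.array([3.0, 4.0])
w0 = -5.0

def g(x):
    return w @ x + w0

def distance_and_projection(x):
    r = g(x) / np.linalg.norm(w)          # algebraic distance
    x_p = x - r * w / np.linalg.norm(w)   # from x = x_p + r * w/||w||
    return r, x_p

r, x_p = distance_and_projection(np.array([2.0, 1.0]))
# g(x) = 6 + 4 - 5 = 5 and ||w|| = 5, so r = 1; g(x_p) should be 0
```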

9 Multi-category Case General case: reduce the c-class problem to c-1 two-class problems (ωi vs. not-ωi), or use c(c-1)/2 linear discriminants, one for every pair of classes. Both reductions can leave ambiguous regions.

10 Use c linear discriminants gi(x) = wi^t x + wi0, i = 1, ..., c, and assign x to ωi if gi(x) > gj(x) for all j ≠ i (a linear machine).
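A minimal sketch of such a linear machine with c = 3 discriminants; the weight matrix W and biases b are illustrative values.

```python
import numpy as np

# Sketch of a "linear machine": c discriminants g_i(x) = w_i^t x + w_i0,
# assigning x to the class with the largest g_i(x). W and b are assumed,
# illustrative values (one weight vector w_i per row).
W = np.array([[ 1.0,  0.0],
              [ 0.0,  1.0],
              [-1.0, -1.0]])
b = np.array([0.0, 0.0, 0.5])

def classify(x):
    scores = W @ x + b                 # g_i(x), i = 1..c
    return int(np.argmax(scores)) + 1  # 1-based class index

print(classify(np.array([2.0, 0.0])))  # scores [2, 0, -1.5] -> class 1
```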

11 Distance Measure wi - wj is normal to Hij. The distance from x to Hij is given by (gi(x) - gj(x)) / ||wi - wj||.
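The pairwise-boundary distance can be sketched directly from the formula; the two discriminants below are illustrative.

```python
import numpy as np

# Sketch: distance from x to the boundary H_ij between classes i and j,
# (g_i(x) - g_j(x)) / ||w_i - w_j||. Weights are assumed, illustrative.
w_i, b_i = np.array([2.0, 0.0]), 0.0
w_j, b_j = np.array([0.0, 2.0]), 0.0

def dist_to_Hij(x):
    gi = w_i @ x + b_i
    gj = w_j @ x + b_j
    return (gi - gj) / np.linalg.norm(w_i - w_j)

print(dist_to_Hij(np.array([3.0, 1.0])))  # (6 - 2) / (2*sqrt(2)) = sqrt(2)
```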

12 Quadratic DF Add terms involving products of pairs of components of x to obtain the quadratic discriminant function: g(x) = w0 + Σi wi xi + Σi Σj wij xi xj. The separating surface defined by g(x) = 0 is a hyperquadric surface.
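A quadratic discriminant of this form can be evaluated as a sketch; the coefficient matrix W, vector w, and w0 below are illustrative.

```python
import numpy as np

# Sketch of a quadratic discriminant g(x) = w0 + w^t x + x^t W x,
# with assumed, illustrative coefficients (W symmetric).
W = np.array([[1.0, 0.0],
              [0.0, 2.0]])
w = np.array([0.0, 0.0])
w0 = -4.0

def g(x):
    return w0 + w @ x + x @ W @ x

# Here g(x) = x1^2 + 2*x2^2 - 4, so g(x) = 0 is a hyperquadric (an ellipse)
print(g(np.array([2.0, 0.0])))  # on the surface: 4 - 4 = 0
```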

13 Hyperquadric Surfaces If W = [wij] is not singular, then the linear terms in g(x) can be eliminated by translating the axes. Define a scaled matrix from W; depending on it, the separating surface is a hypersphere (scaled matrix proportional to the identity), a hyperellipsoid (eigenvalues all of the same sign), or a hyperhyperboloid (eigenvalues of mixed sign).

14 Generalized LDF Polynomial discriminant functions. Generalized LDF: g(x) = Σi ai yi(x) = a^t y, where the yi(x) are given functions of x.
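A generalized LDF can be sketched with an assumed polynomial mapping y(x) = [1, x, x^2]: g is then quadratic in x but still linear in the weights a.

```python
import numpy as np

# Sketch of a generalized LDF with an assumed polynomial mapping
# y(x) = [1, x, x^2], so g(x) = a^t y(x) is linear in a.
def y(x):
    return np.array([1.0, x, x * x])

a = np.array([-1.0, 0.0, 1.0])  # illustrative weights: g(x) = x^2 - 1

print(a @ y(2.0))  # g(2)   = 4 - 1 = 3
print(a @ y(0.5))  # g(0.5) = 0.25 - 1 = -0.75
```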

15 Augmented Vectors Augmented feature vector: y = (1, x1, ..., xd)^t. Augmented weight vector: a = (w0, w1, ..., wd)^t. Then g(x) = a^t y, mapping the d-dimensional x-space to a (d+1)-dimensional y-space.
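The augmentation trick is easy to verify numerically; the values of x, w, and w0 below are illustrative.

```python
import numpy as np

# Sketch of vector augmentation: y = [1, x1, ..., xd] and a = [w0, w1, ..., wd]
# fold the bias into the weight vector, so g(x) = w^t x + w0 = a^t y.
# Values are assumed, illustrative.
x = np.array([2.0, 3.0])
w = np.array([1.0, -1.0])
w0 = 0.5

y = np.concatenate(([1.0], x))  # augmented feature vector
a = np.concatenate(([w0], w))   # augmented weight vector

assert np.isclose(a @ y, w @ x + w0)  # both equal g(x) = -0.5
```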

16 2-Category Separable Case Look for a weight vector that classifies all of the samples correctly. If such a weight vector exists, the samples are said to be linearly separable.

17 Gradient Descent Procedure Define a criterion function J(a) that is minimized when a is a solution vector. Step 1: randomly pick a(1) and compute the gradient vector ∇J(a(1)). Step 2: obtain a(2) by moving some distance from a(1) in the direction of steepest descent: a(k+1) = a(k) - η(k) ∇J(a(k)).
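The procedure above can be sketched on a toy criterion with a known minimizer; J(a) = ||a - a*||^2, the starting point, and the fixed learning rate are all illustrative assumptions.

```python
import numpy as np

# Sketch of basic gradient descent a(k+1) = a(k) - eta * grad J(a(k)),
# on the toy criterion J(a) = ||a - a_star||^2 (an assumption for
# illustration; its minimizer is a_star).
a_star = np.array([1.0, -2.0])

def grad_J(a):
    return 2.0 * (a - a_star)

a = np.array([5.0, 5.0])  # a(1): arbitrary starting point
eta = 0.1                 # fixed learning rate (illustrative)
for _ in range(200):
    a = a - eta * grad_J(a)

# a converges toward the minimizer a_star = [1, -2]
```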

18 Setting the Learning Rate Second-order expansion of J(a) around a(k): J(a) ≈ J(a(k)) + ∇J^t (a - a(k)) + (1/2)(a - a(k))^t H (a - a(k)), where H is the Hessian. Substituting a = a(k) - η(k) ∇J gives J ≈ J(a(k)) - η(k) ||∇J||^2 + (1/2) η(k)^2 ∇J^t H ∇J, which is minimized when η(k) = ||∇J||^2 / (∇J^t H ∇J).

19 Newton Descent For nonsingular H, update a(k+1) = a(k) - H^{-1} ∇J. Converges in fewer steps than simple gradient descent, but each step is more difficult to compute.
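A sketch of the Newton update on a toy quadratic criterion (illustrative Q and b); for a quadratic J, a single Newton step reaches the minimizer exactly.

```python
import numpy as np

# Sketch of Newton descent a(k+1) = a(k) - H^{-1} grad J(a(k)) on the
# toy quadratic J(a) = (1/2) a^t Q a - b^t a (assumed for illustration).
# For quadratic J the Hessian H = Q is constant, and one step suffices.
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])       # Hessian H
b = np.array([1.0, 2.0])

a = np.array([10.0, -10.0])      # arbitrary starting point
grad = Q @ a - b                 # grad J(a)
a = a - np.linalg.solve(Q, grad) # Newton step (solve instead of invert)

# grad J(a) = Q a - b = 0 at the minimum
```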

20 Perceptron Criterion Function J_p(a) = Σ_{y ∈ Y(a)} (-a^t y), where Y(a) is the set of samples misclassified by a. Since ∇J_p = Σ_{y ∈ Y(a)} (-y), the update rule is a(k+1) = a(k) + η(k) Σ_{y ∈ Y(a(k))} y.
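The batch perceptron update can be sketched on a small, linearly separable set of augmented samples (illustrative data; class-2 samples are negated so a^t y > 0 is required for every y).

```python
import numpy as np

# Sketch of the batch perceptron rule on augmented, "normalized" samples:
# class-2 samples are negated, so a solution vector satisfies a^t y > 0
# for all y. Data are assumed, illustrative, and linearly separable.
Y = np.array([[ 1.0, 2.0,  1.0],   # class-1 samples (augmented)
              [ 1.0, 1.0,  2.0],
              [-1.0, 1.0, -1.0],   # class-2 samples, negated
              [-1.0, 2.0, -2.0]])

a = np.zeros(3)
eta = 1.0                          # fixed increment
for _ in range(100):               # epochs
    misclassified = [y for y in Y if a @ y <= 0]   # the set Y(a)
    if not misclassified:
        break                      # all samples correctly classified
    a = a + eta * np.sum(misclassified, axis=0)    # batch update

# a is now a solution vector: a^t y > 0 for every sample y
```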

21 Convergence Proof Refer to pages 229 to 232 of the textbook.

