# Linear discriminant analysis (LDA) Katarina Berta


Katarina Berta
katarinaberta@gmail.com | bk113255m@student.etf.rs

## Introduction

- Fisher's Linear Discriminant Analysis: paper from 1936 (link)
- A statistical technique for classification
- LDA = two classes; MDA (multiple discriminant analysis) = multiple classes
- Used in statistics, pattern recognition, and machine learning

## Purpose

Discriminant analysis classifies objects into two or more groups according to a linear combination of features.

- Feature selection: which set of features best determines group membership of an object? (dimension reduction)
- Classification: what classification rule or model best separates those groups?

## Method (1)

[Figure: projections of the two classes onto a line, illustrating good vs. bad separation]

## Method (2)

- Maximize the between-class scatter: the difference of the class mean values (m1 - m2)
- Minimize the within-class scatter: the covariance within each class
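The two goals combine into Fisher's criterion; a standard statement (notation is mine, not from the slides): choose the projection direction w that maximizes

```latex
J(\mathbf{w}) = \frac{\mathbf{w}^{\top} S_B \mathbf{w}}{\mathbf{w}^{\top} S_W \mathbf{w}},
\qquad
S_B = (\mathbf{m}_1 - \mathbf{m}_2)(\mathbf{m}_1 - \mathbf{m}_2)^{\top},
\qquad
S_W = \Sigma_1 + \Sigma_2
```

The maximizing direction is proportional to S_W^{-1}(m1 - m2): the projected class means move apart while the projected within-class spread stays small.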

## Formula

- Idea: use Bayes' theorem to assign object x to whichever class (group) i or j has the higher posterior probability
- Derivation: assume the class-conditional probability density functions are normally distributed, each with mean value m_i and covariance matrix Σ_i
- If the covariances differ, the decision rule is quadratic: QDA (quadratic discriminant analysis)
- If the covariances are equal (Σ_{y=0} = Σ_{y=1} = Σ), the quadratic terms cancel and the rule becomes linear: Fisher's linear discriminant (FLD)
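Under the equal-covariance assumption the two-class rule reduces to a single linear score; a standard statement of the result (notation chosen to match the later slides) is:

```latex
\mathrm{score}(\mathbf{x}) = \mathbf{x}^{\top} W + W_0,
\qquad
W = \Sigma^{-1}(\mathbf{m}_1 - \mathbf{m}_2),
\qquad
W_0 = \ln\frac{P(1)}{P(2)} - \frac{1}{2}\,(\mathbf{m}_1 + \mathbf{m}_2)^{\top} W
```

Object x is assigned to class 1 when the score is positive, and to class 2 otherwise.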

## Example

Factory for high-quality chip rings. Training set:

| Curvature | Diameter | Quality Control Result |
|-----------|----------|------------------------|
| 2.95 | 6.63 | Passed |
| 2.53 | 7.79 | Passed |
| 3.57 | 5.65 | Passed |
| 3.16 | 5.47 | Passed |
| 2.58 | 4.46 | Not Passed |
| 2.16 | 6.22 | Not Passed |
| 3.27 | 3.52 | Not Passed |

## Normalization of data

Training data (class 1 = passed, class 0 = not passed):

| X1 | X2 | class |
|------|------|-------|
| 2.95 | 6.63 | 1 |
| 2.53 | 7.79 | 1 |
| 3.57 | 5.65 | 1 |
| 3.16 | 5.47 | 1 |
| 2.58 | 4.46 | 0 |
| 2.16 | 6.22 | 0 |
| 3.27 | 3.52 | 0 |

Average (global mean): X1 = 2.888, X2 = 5.676

Mean-corrected data (Xo = X - average):

| X1o | X2o | class |
|--------|--------|-------|
| 0.060 | 0.951 | 1 |
| -0.357 | 2.109 | 1 |
| 0.679 | -0.025 | 1 |
| 0.269 | -0.209 | 1 |
| -0.305 | -1.218 | 0 |
| -0.732 | 0.547 | 0 |
| 0.386 | -2.155 | 0 |
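The mean-correction step can be sketched in a few lines of Python (a sketch using the training data from the example; variable names are my own):

```python
# Training data from the example: (curvature, diameter) per chip ring.
data = [(2.95, 6.63), (2.53, 7.79), (3.57, 5.65), (3.16, 5.47),
        (2.58, 4.46), (2.16, 6.22), (3.27, 3.52)]

n = len(data)
# Global mean of each feature column.
avg = tuple(sum(col) / n for col in zip(*data))
# Mean-corrected data: subtract the global mean from every row.
corrected = [(x1 - avg[0], x2 - avg[1]) for x1, x2 in data]
```

Run on the table above, `avg` comes out near (2.888, 5.676) and the first corrected row near (0.060, 0.951), matching the slide up to rounding.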

## Covariance

Covariance matrix of class i, computed from the mean-corrected data and divided by the class size:

- Covariance of class 1: C1 = [[0.166, -0.192], [-0.192, 1.349]]
- Covariance of class 2: C2 = [[0.259, -0.286], [-0.286, 2.142]]

Pooled covariance matrix (each entry is the class entries weighted by class size): C = (4*C1 + 3*C2) / 7 = [[0.206, -0.233], [-0.233, 1.689]]

Inverse of the pooled covariance matrix: S = C^(-1) = [[5.73, 0.79], [0.79, 0.70]] (approximately)
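The covariance step can be reproduced with NumPy (a sketch; note that, following the slide's convention, each class covariance is computed from the globally mean-corrected data and divided by the class size n_i):

```python
import numpy as np

X = np.array([[2.95, 6.63], [2.53, 7.79], [3.57, 5.65], [3.16, 5.47],
              [2.58, 4.46], [2.16, 6.22], [3.27, 3.52]])
y = np.array([1, 1, 1, 1, 0, 0, 0])   # 1 = passed, 0 = not passed

X0 = X - X.mean(axis=0)               # mean-corrected data (global mean)

def class_cov(label):
    Xk = X0[y == label]
    return Xk.T @ Xk / len(Xk)        # divide by n_i, as on the slide

C1, C2 = class_cov(1), class_cov(0)
C = (4 * C1 + 3 * C2) / 7             # pooled covariance, weighted by class sizes
S = np.linalg.inv(C)                  # inverse pooled covariance matrix
```

The entries of C1, C2, and C agree with the slide's values up to rounding.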

## Mean values

|         | N | P(i)  | m(X1) | m(X2) |         |
|---------|---|-------|-------|-------|---------|
| Class 1 | 4 | 0.571 | 3.05  | 6.38  | m1      |
| Class 2 | 3 | 0.429 | 2.67  | 4.73  | m2      |
| Sum     | 7 |       | 5.72  | 11.12 | m1 + m2 |

- N – number of objects; P(i) – prior probability of class i
- m1, m2 – mean vectors of class 1 and class 2, i.e. (m(X1), m(X2))
- m1 - m2 = (0.38, 1.65)
- W = S*(m1 - m2) = (3.4879, 1.4566), where S is the inverse pooled covariance matrix
- W0 = ln[P(1)/P(2)] - 1/2*(m1 + m2)*W = -17.7856
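Putting the pieces together, the weight vector W and the bias W0 can be computed end to end (a self-contained sketch of the slide's arithmetic; small differences against the slide's values are expected because the slide rounds intermediate results):

```python
import numpy as np

X = np.array([[2.95, 6.63], [2.53, 7.79], [3.57, 5.65], [3.16, 5.47],
              [2.58, 4.46], [2.16, 6.22], [3.27, 3.52]])
y = np.array([1, 1, 1, 1, 0, 0, 0])              # 1 = passed, 0 = not passed

X0 = X - X.mean(axis=0)                          # mean-corrected data
n1, n2 = (y == 1).sum(), (y == 0).sum()          # 4 passed, 3 not passed

C1 = X0[y == 1].T @ X0[y == 1] / n1              # class covariances
C2 = X0[y == 0].T @ X0[y == 0] / n2
S = np.linalg.inv((n1 * C1 + n2 * C2) / (n1 + n2))  # inverse pooled covariance

m1, m2 = X[y == 1].mean(axis=0), X[y == 0].mean(axis=0)
p1, p2 = n1 / len(y), n2 / len(y)                # prior probabilities

W = S @ (m1 - m2)                                # approx (3.49, 1.45)
W0 = np.log(p1 / p2) - 0.5 * (m1 + m2) @ W       # approx -17.75
```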

## Result

score = X*W + W0

| X1 | X2 | score | class |
|------|------|--------|-------|
| 2.95 | 6.63 | 2.149  | 1 |
| 2.53 | 7.79 | 2.380  | 1 |
| 3.57 | 5.65 | 2.887  | 1 |
| 3.16 | 5.47 | 1.189  | 1 |
| 2.58 | 4.46 | -2.285 | 0 |
| 2.16 | 6.22 | -1.203 | 0 |
| 3.27 | 3.52 | -1.240 | 0 |

## Prediction

- New chip: curvature = 2.81, diameter = 5.46
- W = S*(m1 - m2); score = X*W + W0 = -0.036
- Decision rule: if (score > 0) then class 1 else class 2
- score = -0.036 < 0 => class 2: the chip will not pass
- Prediction correct!
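The scoring and prediction steps can be sketched with the weights from the worked example (W and W0 are hard-coded here from the "Mean values" slide; a sketch, not the original computation):

```python
# Weights taken from the worked example on the slides.
W = (3.487916, 1.456612)
W0 = -17.7856

def score(x1, x2):
    """Linear discriminant score: positive means class 1 (passed)."""
    return x1 * W[0] + x2 * W[1] + W0

def predict(x1, x2):
    return 1 if score(x1, x2) > 0 else 0

# New chip from the prediction slide: borderline, but below zero.
s = score(2.81, 5.46)   # slightly negative -> will not pass
```

The first training chip, `predict(2.95, 6.63)`, comes out as class 1, matching the result table.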

## Pros & cons

Pros:
- Simple
- Fast and portable
- Still beats some algorithms (e.g. logistic regression) when its assumptions are met
- Good to use when beginning a project

Cons:
- Old algorithm; newer algorithms often give much better predictions
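For a quick start on a new project, the same chip-ring example can be run through scikit-learn (assuming scikit-learn is available; its internal estimate of the pooled covariance differs slightly from the slides' hand computation, but the decisions agree):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X = np.array([[2.95, 6.63], [2.53, 7.79], [3.57, 5.65], [3.16, 5.47],
              [2.58, 4.46], [2.16, 6.22], [3.27, 3.52]])
y = np.array([1, 1, 1, 1, 0, 0, 0])   # 1 = passed, 0 = not passed

clf = LinearDiscriminantAnalysis()    # default 'svd' solver, empirical priors
clf.fit(X, y)

print(clf.score(X, y))                # training accuracy
print(clf.predict([[2.81, 5.46]]))    # the new chip from the prediction slide
```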

## Conclusion

- FisherFace: one of the best algorithms for face recognition
- Often used for dimension reduction
- Basis for newer algorithms
- Good for the beginning of data-mining projects
- Though old, still worth trying

Thank you for your attention! Questions?