1 Linear Classification Problem Two approaches: -Fisher’s Linear Discriminant Analysis -Logistic regression model
2 Linear Classifiers
Classification for several groups, all with the same covariance matrix Σ:
Group 1: Pop 1 ~ N(μ1, Σ), …, Group k: Pop k ~ N(μk, Σ) (notice that the Σ's are equal).
Next we calculate the (Mahalanobis) distance from y to the center of each group,
d_i(y) = (y − μ_i)' Σ⁻¹ (y − μ_i),
and assign y to the closest group center. This is equivalent to using the linear discriminant function
L_i(y) = a_i'y + a_i0, where a_i = Σ⁻¹μ_i and a_i0 = −½ μ_i' Σ⁻¹ μ_i,
and assigning y to the group with the largest L_i(y).
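The discriminant rule above can be sketched numerically; all means, covariances, and the query point below are invented for illustration:

```python
import numpy as np

# Toy sketch of linear discriminant classification, assuming two groups
# with known means mu_i and a shared covariance Sigma (made-up numbers).
mu = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]   # group means
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])          # common covariance
Sigma_inv = np.linalg.inv(Sigma)

def L(y, mu_i):
    # Linear discriminant: L_i(y) = mu_i' Sigma^{-1} y - 0.5 mu_i' Sigma^{-1} mu_i
    a = Sigma_inv @ mu_i
    return a @ y - 0.5 * mu_i @ Sigma_inv @ mu_i

y = np.array([0.5, 0.2])
scores = [L(y, m) for m in mu]
group = int(np.argmax(scores))   # assign y to the group with the largest L_i(y)
print(group)
```

Maximizing L_i(y) here picks group 1, the group whose center is closest to y in Mahalanobis distance.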
3 Linear Classifiers
Bayes Rule: Let π1, …, πk be the prior probabilities for the k groups. Then L_i(y) becomes
a_i'y + a_i0 + ln π_i.
Misclassification Rate: Build a table of true y vs. predicted group;
MSR = (# misclassified) / (total # of observations).
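A small sketch of the Bayes-rule adjustment and the misclassification rate; the priors, scores, and labels below are all invented:

```python
import numpy as np

# Sketch: add the log prior ln(pi_i) to each discriminant score, then
# estimate the misclassification rate from true vs. predicted labels.
log_priors = np.log([0.7, 0.3])          # priors pi_1, pi_2 (made up)
scores = np.array([[1.2, 0.8],           # L_i(y) for three observations
                   [0.1, 0.9],
                   [0.4, 0.5]])
pred = np.argmax(scores + log_priors, axis=1)

true = np.array([0, 1, 1])
misclassification_rate = np.mean(pred != true)
print(pred, misclassification_rate)
```

Note how the large prior on group 1 pulls borderline observations toward it, which the misclassification table then reveals.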
7 Logistic Regression
Note that LDA is linear in x:
log[ P(G=1|x) / P(G=2|x) ] = α0 + α'x.
Linear logistic regression looks the same:
log[ P(G=1|x) / P(G=2|x) ] = β0 + β'x.
But the estimation procedure for the coefficients is different: LDA maximizes the joint likelihood [y, X]; logistic regression maximizes the conditional likelihood [y|X]. The two usually give similar predictions.
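The claim that LDA is linear in x can be checked numerically: with Gaussian classes sharing one variance, the log posterior odds has constant differences over an evenly spaced grid. Parameters below are made up:

```python
import numpy as np

# Numerical sketch: for 1-D Gaussian classes with common variance s2,
# log P(G=1|x)/P(G=2|x) is linear in x -- exactly the logistic form.
mu1, mu2, s2 = 1.0, 3.0, 1.5        # invented means and shared variance

def log_odds(x, pi1=0.5):
    # log N(x; mu1, s2) - log N(x; mu2, s2) + log(pi1 / (1 - pi1))
    return (-(x - mu1)**2 + (x - mu2)**2) / (2 * s2) + np.log(pi1 / (1 - pi1))

xs = np.array([0.0, 1.0, 2.0, 3.0])
vals = log_odds(xs)
diffs = np.diff(vals)               # constant differences => linear in x
print(vals, diffs)
```

The quadratic terms in the two Gaussian log-densities cancel because the variances are equal; only a linear function of x survives.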
8 Logistic Regression MLE
For the two-class case (y_i ∈ {0, 1}), the log-likelihood is
ℓ(β) = Σ_i [ y_i β'x_i − log(1 + exp(β'x_i)) ].
To maximize it, we need to solve the (non-linear) score equations
∂ℓ/∂β = Σ_i x_i ( y_i − p(x_i; β) ) = 0, where p(x; β) = exp(β'x) / (1 + exp(β'x)).
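The score equations are typically solved by Newton-Raphson (equivalently, iteratively reweighted least squares). A minimal sketch on synthetic data, with invented parameters throughout:

```python
import numpy as np

# Sketch: solve the logistic score equations sum_i x_i (y_i - p_i) = 0
# by Newton-Raphson. Data are synthetic; a column of 1s gives an intercept.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([-0.5, 2.0])            # made-up generating coefficients
p_true = 1 / (1 + np.exp(-X @ beta_true))
y = (rng.uniform(size=n) < p_true).astype(float)

beta = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    score = X.T @ (y - p)                    # gradient of the log-likelihood
    W = p * (1 - p)                          # variance weights
    H = X.T @ (X * W[:, None])               # observed/Fisher information
    beta = beta + np.linalg.solve(H, score)  # Newton step

print(beta)
```

At convergence the score vector is (numerically) zero, i.e. the score equations are solved.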
9 Regularized Logistic Regression
Ridge/LASSO logistic regression adds an L2/L1 penalty on the coefficients. Successful implementations exist with over 100,000 predictor variables. Discriminant analysis can also be regularized.
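A minimal sketch of the ridge (L2) variant: the Newton iteration from the previous slide, with a penalty term lam·β added to the score and lam·I to the information (intercept left unpenalized). Data and lam are invented:

```python
import numpy as np

# Sketch of ridge-regularized logistic regression: same score equations,
# but the gradient gains -lam * beta (intercept unpenalized).
rng = np.random.default_rng(1)
n, d = 100, 5
X = np.column_stack([np.ones(n), rng.normal(size=(n, d))])
y = (rng.uniform(size=n) < 0.5).astype(float)   # synthetic labels
lam = 1.0                                       # made-up penalty strength

beta = np.zeros(d + 1)
mask = np.array([0.0] + [1.0] * d)              # don't shrink the intercept
for _ in range(50):
    p = 1 / (1 + np.exp(-X @ beta))
    grad = X.T @ (y - p) - lam * mask * beta    # penalized score
    W = p * (1 - p)
    H = X.T @ (X * W[:, None]) + lam * np.diag(mask)
    beta = beta + np.linalg.solve(H, grad)

print(beta)
```

The penalty makes the problem strongly convex, which is part of why such fits stay stable even with very many predictors (LASSO would need a different solver, since the L1 penalty is not differentiable).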
10 Simple Two-Class Perceptron
Define h(x) = w'x. Classify as class 1 if h(x) > 0, class 2 otherwise.
Score function: # misclassification errors on the training data.
For training, replace the class-2 x_j's by −x_j; now we need h(x) > 0 for every training point.
Initialize the weight vector w.
Repeat one or more times:
  For each training data point x_i:
    If the point is correctly classified (w'x_i > 0), do nothing.
    Else update w ← w + x_i.
Guaranteed to converge to a separating hyperplane (if one exists).
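The algorithm above can be sketched directly; the four training points are made up and linearly separable:

```python
import numpy as np

# Sketch of the two-class perceptron: flip the sign of class-2 points,
# then add any misclassified point to the weight vector until all pass.
X = np.array([[2.0, 1.0], [1.0, 2.0],        # class 1 (invented)
              [-1.0, -1.5], [-2.0, -0.5]])   # class 2 (invented)
labels = np.array([1, 1, 2, 2])
Z = np.where((labels == 2)[:, None], -X, X)  # replace class-2 x_j by -x_j

w = np.zeros(2)                              # initialize weight vector
for _ in range(100):                         # repeat passes over the data
    updated = False
    for z in Z:
        if w @ z <= 0:                       # misclassified: need w'z > 0
            w = w + z                        # perceptron update
            updated = True
    if not updated:                          # all points correct: converged
        break

print(w, bool(np.all(Z @ w > 0)))
```

Because the data are separable, the loop terminates with a weight vector that puts every (sign-flipped) training point on the positive side, i.e. a separating hyperplane.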
12 “Optimal” Hyperplane The “optimal” hyperplane separates the two classes and maximizes the distance to the closest point from either class. Finding this hyperplane is a convex optimization problem. This notion plays an important role in support vector machines.
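A toy illustration of the idea (not the general convex-optimization solution): with a single point per class, the maximum-margin hyperplane is just the perpendicular bisector of the segment joining them. Both points are invented:

```python
import numpy as np

# Toy sketch: one point per class, so the "optimal" hyperplane w'x + b = 0
# is the perpendicular bisector of the segment between the two points.
x1 = np.array([1.0, 1.0])   # class 1 (made up)
x2 = np.array([3.0, 3.0])   # class 2 (made up)

w = x1 - x2                          # normal vector pointing toward class 1
b = -w @ (x1 + x2) / 2               # plane passes through the midpoint
margin = np.linalg.norm(x1 - x2) / 2 # distance from each point to the plane

# Signed distances: +margin for class 1, -margin for class 2.
d1 = (w @ x1 + b) / np.linalg.norm(w)
d2 = (w @ x2 + b) / np.linalg.norm(w)
print(d1, d2, margin)
```

Both points sit exactly `margin` away on opposite sides; in the general case the same role is played by the support vectors, which is where support vector machines get their name.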