1 Final Exam Review
CS479/679 Pattern Recognition
Dr. George Bebis

2 Final Exam Material
Midterm exam material
Feature Selection
Linear Discriminant Functions
Support Vector Machines
Expectation-Maximization Algorithm
Case studies are also included in the final exam.

3 Feature Selection
What is the goal of feature selection?
Select features with high discrimination power while ignoring, or paying less attention to, the rest.
What are the main steps in feature selection?
(1) Search the space of possible feature subsets. (2) Pick the subset that is optimal or near-optimal with respect to a certain evaluation criterion.

4 Feature Selection
What are the main search and evaluation strategies?
Search strategies: optimal, heuristic, randomized.
Evaluation strategies: filter, wrapper.
What is the main difference between filter and wrapper methods?
In filter methods, evaluation is independent of the classification algorithm; in wrapper methods, evaluation depends on the classification algorithm.

5 Feature Selection
You need to be familiar with:
Exhaustive and naïve search
Sequential Forward/Backward Selection (SFS/SBS)
Plus-L Minus-R Selection
Bidirectional Search
Sequential Floating Selection (SFFS and SFBS)
Feature selection using GAs
(A sketch of SFS follows this list.)
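To make the greedy search concrete, here is a minimal sketch of Sequential Forward Selection (SFS). The evaluation function score(subset) is a placeholder for whatever criterion is used (filter or wrapper); all names are illustrative, not from the slides.

    # Sequential Forward Selection (SFS): grow the subset greedily, adding at
    # each step the single feature that most improves the evaluation criterion.
    def sfs(all_features, score, k):
        selected = []
        while len(selected) < k:
            candidates = [f for f in all_features if f not in selected]
            best = max(candidates, key=lambda f: score(selected + [f]))
            selected.append(best)
        return selected

SBS is the mirror image: start from the full feature set and greedily remove the feature whose removal hurts the criterion least.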

6 Linear Discriminant Functions
General form of a linear discriminant: g(x) = wᵀx + w0
What is the form of the decision boundary?
The decision boundary g(x) = 0 is a hyperplane.
What is the meaning of w and w0?
The orientation of the hyperplane is determined by w (its normal vector) and its location by w0 (the bias), respectively.

7 Linear Discriminant Functions
What is the geometric interpretation of g(x)?
g(x) gives the signed distance of x from the decision boundary (hyperplane), up to the scale factor ||w||: r = g(x)/||w||.
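A small numeric sketch of slides 6 and 7 (the weight vector and sample are made-up toy values):

    import numpy as np

    # Linear discriminant g(x) = w^T x + w0 (toy values for illustration).
    w = np.array([3.0, 4.0])    # hyperplane normal; ||w|| = 5
    w0 = -5.0                   # bias/threshold
    x = np.array([2.0, 1.0])

    g = w @ x + w0              # g(x) = 3*2 + 4*1 - 5 = 5
    r = g / np.linalg.norm(w)   # signed distance from the hyperplane = 1.0
    label = 1 if g > 0 else 2   # the sign of g(x) decides the class
    print(g, r, label)          # 5.0 1.0 1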

8 Linear Discriminant Functions
How do we estimate w and w0?
By applying learning to a set of labeled training examples.
What is the effect of each training example?
Each training example places a constraint on the solution.
[Figure: the constraints shown both in solution space (α1, α2) and in feature space (y1, y2).]

9 Linear Discriminant Functions
Iterative optimization – what is the main idea?
Minimize some error function J(α) iteratively, stepping from the current estimate α(k) to α(k+1) along a search direction p(k), scaled by a learning rate η(k):
α(k+1) = α(k) + η(k) p(k)

10 Linear Discriminant Functions
Gradient descent method: the search direction is the negative gradient, α(k+1) = α(k) − η(k) ∇J(α(k)).
Newton method: the gradient is scaled by the inverse Hessian, α(k+1) = α(k) − H⁻¹ ∇J(α(k)).
Perceptron rule: gradient descent on the perceptron criterion, updating α using only the misclassified samples: α(k+1) = α(k) + η(k) Σ y over the misclassified y.
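A minimal sketch of the batch perceptron rule, assuming the samples have been augmented (y = [1, x]) and normalized (class-2 samples negated) so that a solution satisfies αᵀy > 0 for every sample; the data below are toy values:

    import numpy as np

    # Batch perceptron: add the sum of misclassified samples to the weights.
    def perceptron(Y, eta=1.0, max_iter=1000):
        a = np.zeros(Y.shape[1])
        for _ in range(max_iter):
            misclassified = Y[Y @ a <= 0]        # rows violating a^T y > 0
            if len(misclassified) == 0:          # all constraints satisfied
                return a
            a = a + eta * misclassified.sum(axis=0)
        return a

    # Toy separable data: rows are augmented samples, class-2 rows negated.
    Y = np.array([[1, 2, 2], [1, 3, 1], [-1, 0, -1], [-1, -1, 0]])
    print(perceptron(Y))    # converges to a solution such as [-3, 2, 1]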

11 Support Vector Machines
What is the capacity of a classifier? What is the VC dimension of a classifier? What is structural risk minimization?
Find solutions that (1) minimize the empirical risk and (2) have low VC dimension.
It can be shown that, with probability 1−δ, the true risk is bounded by the empirical risk plus a confidence term that grows with the VC dimension h and shrinks with the number of training samples n:
R(α) ≤ R_emp(α) + sqrt( [h(ln(2n/h) + 1) − ln(δ/4)] / n )

12 Support Vector Machines
What is the margin of separation? How is it defined?
The margin is the distance between the separating hyperplane and the closest training samples on either side; the samples lying exactly on the margin are the support vectors.
What is the relationship between VC dimension and margin of separation?
VC dimension is minimized by maximizing the margin of separation.

13 Support Vector Machines
What is the criterion being optimized by SVMs?
Maximize the margin 2/||w||, i.e., minimize (1/2)||w||² subject to y_i(wᵀx_i + w0) ≥ 1 for all training samples (x_i, y_i), y_i ∈ {−1, +1}.

14 Support Vector Machines
The SVM solution depends only on the support vectors:
w = Σ_i λ_i y_i x_i, where the Lagrange multipliers λ_i are non-zero only for the support vectors.
Soft margin classifier – tolerate “outliers” by introducing slack variables ξ_i ≥ 0 and penalizing their sum in the objective.
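An illustrative sketch using scikit-learn (a library choice of this review, not of the slides) to fit a soft-margin linear SVM and inspect its support vectors:

    import numpy as np
    from sklearn.svm import SVC

    # Toy two-class data in 2-D.
    X = np.array([[0, 0], [1, 1], [1, 0], [3, 3], [4, 3], [3, 4]])
    y = np.array([-1, -1, -1, 1, 1, 1])

    clf = SVC(kernel='linear', C=1.0)  # C controls the soft-margin penalty
    clf.fit(X, y)

    print(clf.support_vectors_)        # only these samples define the boundary
    print(clf.predict([[2, 2]]))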

15 Support Vector Machines
Non-linear SVM – what is the main idea?
Map the data to a high-dimensional space H, via some non-linear mapping Φ, where a linear decision boundary may exist.

16 Support Vector Machines
What is the kernel trick?
Compute the dot products in the high-dimensional space implicitly, through a kernel function evaluated in the original space, e.g., the polynomial kernel: K(x, y) = (x · y)^d
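A small sketch verifying the trick for the degree-2 polynomial kernel: its value equals an explicit dot product under the standard feature map Φ(x) = (x1², √2·x1x2, x2²) (the map is a textbook identity, not from the slides):

    import numpy as np

    def phi(x):
        # Explicit degree-2 feature map for 2-D input.
        return np.array([x[0]**2, np.sqrt(2) * x[0] * x[1], x[1]**2])

    x = np.array([1.0, 2.0])
    y = np.array([3.0, 4.0])

    k = (x @ y) ** 2            # kernel K(x, y) = (x . y)^2, no mapping needed
    explicit = phi(x) @ phi(y)  # same dot product, computed in the mapped space
    print(k, explicit)          # both print 121.0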

17 Support Vector Machines
Comments about SVMs:
SVM training is based on exact (convex) optimization, so there are no local optima.
Its complexity depends on the number of support vectors, not on the dimensionality of the transformed space.
Performance depends on the choice of the kernel and its parameters.

18 Expectation-Maximization (EM)
What is the EM algorithm?
An iterative method for maximum-likelihood estimation, i.e., maximizing p(D | θ).
When is EM useful?
It is most useful for problems where the data is incomplete, or can be thought of as incomplete.

19 Expectation-Maximization (EM)
What are the steps of the EM algorithm?
Initialization: choose an initial estimate θ(0).
Expectation step: compute the expected complete-data log-likelihood given the observed data and the current estimate, Q(θ; θ(t)) = E[ln p(D_complete | θ) | D_observed, θ(t)].
Maximization step: θ(t+1) = argmax_θ Q(θ; θ(t)).
Test for convergence: stop when ||θ(t+1) − θ(t)|| < ε.
Convergence properties of EM?
The solution depends on the initial estimate θ(0). There is no guarantee of finding the global maximum, but the iteration is stable: the likelihood never decreases, so EM converges to a local maximum.
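A schematic of the loop (pseudocode-level Python; e_step and m_step stand in for the model-specific computations, and θ is treated as a numpy array; a MoG-specific version is sketched after slide 21):

    import numpy as np

    # Generic EM skeleton: alternate E and M steps until theta stops changing.
    def em(data, theta0, e_step, m_step, eps=1e-6, max_iter=200):
        theta = np.asarray(theta0, dtype=float)
        for _ in range(max_iter):
            stats = e_step(data, theta)      # expected complete-data statistics
            theta_new = m_step(data, stats)  # maximize Q(theta; theta_t)
            if np.linalg.norm(theta_new - theta) < eps:  # convergence test
                return theta_new
            theta = theta_new
        return theta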

20 Expectation-Maximization (EM)
What is a Mixture of Gaussians (MoG)?
A weighted sum of Gaussian components, p(x) = Σ_k π_k N(x; μ_k, Σ_k) with Σ_k π_k = 1.
How are the MoG parameters estimated?
Using the EM algorithm.
How is EM used to estimate the MoG parameters?
Introduce “hidden” variables indicating which component generated each sample.

21 Expectation-Maximization (EM)
Can you interpret the EM steps for MoGs?
E-step: using the current parameters, compute for each sample the “responsibility” of each component, i.e., the posterior probability that the component generated that sample.
M-step: re-estimate each component’s mixing weight π_k, mean μ_k, and covariance Σ_k as responsibility-weighted averages over the samples.

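A minimal 1-D, two-component sketch of these steps in numpy (the toy data, initial values, and fixed iteration count are all illustrative):

    import numpy as np

    def normal_pdf(x, mu, var):
        return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

    # Toy 1-D data drawn from two modes.
    rng = np.random.default_rng(0)
    x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 300)])

    # Rough initial estimates theta(0).
    pi = np.array([0.5, 0.5])
    mu = np.array([-1.0, 1.0])
    var = np.array([1.0, 1.0])

    for _ in range(100):
        # E-step: responsibilities r[k, i] = P(component k | x_i, theta(t)).
        joint = pi[:, None] * normal_pdf(x[None, :], mu[:, None], var[:, None])
        r = joint / joint.sum(axis=0)

        # M-step: responsibility-weighted re-estimation of pi, mu, var.
        nk = r.sum(axis=1)
        pi = nk / len(x)
        mu = (r * x).sum(axis=1) / nk
        var = (r * (x - mu[:, None]) ** 2).sum(axis=1) / nk

    print(pi, mu, var)  # should approach weights 0.4/0.6, means -2/3, vars ~1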

