Pattern Recognition CS479/679 Pattern Recognition Dr. George Bebis

Final Exam Review, CS479/679 Pattern Recognition, Dr. George Bebis (2018-11-19)

Final Exam Material
Midterm Exam Material
Feature Selection
Linear Discriminant Functions
Support Vector Machines
Expectation-Maximization Algorithm
Case studies are also included in the final exam.

Feature Selection
What is the goal of feature selection? Select the features with the highest discrimination power while ignoring or de-emphasizing the rest.
What are the main steps in feature selection? Search the space of possible feature subsets and pick one that is optimal or near-optimal with respect to a chosen evaluation criterion.

Feature Selection What are the main search and evaluation strategies? Search strategies: Optimal, Heuristic, Randomized Evaluation strategies: filter, wrapper What is the main difference between filter and wrapper methods? In filter methods, evaluation is independent of the classification algorithm. In wrapper methods, evaluation depends on the classification algorithm.

Feature Selection
You need to be familiar with:
Exhaustive and naïve search
Sequential Forward/Backward Selection (SFS/SBS)
Plus-L Minus-R Selection
Bidirectional Search
Sequential Floating Selection (SFFS and SFBS)
Feature selection using genetic algorithms (GAs)
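As a study aid, the greedy logic of Sequential Forward Selection can be sketched in a few lines. This is a minimal illustration, not the course's reference implementation; the `score` function is a caller-supplied evaluation criterion (e.g., cross-validated accuracy for a wrapper method), and the toy criterion below is made up for the example.

```python
# A minimal sketch of Sequential Forward Selection (SFS), assuming a
# caller-supplied score(subset) function (e.g., cross-validated accuracy).
def sfs(n_features, score, k):
    """Greedily add the feature that most improves score until k are chosen."""
    selected = []
    remaining = list(range(n_features))
    while len(selected) < k:
        # Evaluate each remaining feature added to the current subset.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy criterion: prefer features with large indices (for illustration only).
print(sfs(5, lambda s: sum(s), 3))  # [4, 3, 2]
```

Sequential Backward Selection is the mirror image: start from all features and greedily remove the one whose removal hurts the score least.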

Linear Discriminant Functions
General form of a linear discriminant: g(x) = wᵀx + w0
What is the form of the decision boundary? The decision boundary g(x) = 0 is a hyperplane.
What is the meaning of w and w0? w determines the orientation of the hyperplane and w0 its location.

Linear Discriminant Functions
What is the geometric interpretation of g(x)? g(x)/||w|| is the signed distance of x from the decision boundary (hyperplane).
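This interpretation is easy to verify numerically. A small sketch, with w, w0, and x made up for the example:

```python
import numpy as np

# The signed distance of x from the hyperplane g(x) = w.x + w0 = 0
# is g(x) / ||w||; the sign tells us which side of the hyperplane x is on.
def signed_distance(w, w0, x):
    return (np.dot(w, x) + w0) / np.linalg.norm(w)

w = np.array([3.0, 4.0])   # ||w|| = 5
w0 = -5.0
print(signed_distance(w, w0, np.array([3.0, 4.0])))  # (9 + 16 - 5)/5 = 4.0
```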

Linear Discriminant Functions
How do we estimate w and w0? By learning from a set of labeled training examples.
What is the effect of each training example? Each example places a constraint on the solution region in parameter space. [Slide figure: solution space (α1, α2) vs. feature space (y1, y2)]

Linear Discriminant Functions
Iterative optimization: what is the main idea? Minimize an error function J(α) iteratively, updating α(k) to α(k+1) by moving along a search direction with a step size given by the learning rate.

Linear Discriminant Functions
Gradient descent method: α(k+1) = α(k) − η(k) ∇J(α(k))
Newton method: α(k+1) = α(k) − H⁻¹ ∇J(α(k)), where H is the Hessian of J
Perceptron rule: α(k+1) = α(k) + η(k) Σ y, summing over the currently misclassified samples y
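The batch perceptron rule above can be sketched directly. This is an illustrative toy, assuming sign-normalized augmented samples y_i (so a correct classification means a·y_i > 0); the data below is made up and linearly separable.

```python
import numpy as np

# Minimal sketch of the batch perceptron rule: repeatedly add the sum of
# misclassified samples to the weight vector until all satisfy a.y > 0.
def perceptron(Y, eta=1.0, max_iter=100):
    a = np.zeros(Y.shape[1])
    for _ in range(max_iter):
        misclassified = Y[Y @ a <= 0]            # samples with a.y <= 0
        if len(misclassified) == 0:
            return a                             # all constraints satisfied
        a = a + eta * misclassified.sum(axis=0)  # a(k+1) = a(k) + eta * sum of errors
    return a

# Augmented, sign-normalized 2-D samples (linearly separable).
Y = np.array([[1.0, 2.0, 1.0], [1.0, 1.0, 2.0], [-1.0, 1.0, -3.0]])
a = perceptron(Y)
print(np.all(Y @ a > 0))  # True
```

For separable data the perceptron rule is guaranteed to converge; for non-separable data it cycles, which is one motivation for the MSE-based procedures and for SVMs.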

Support Vector Machines
What is the capacity of a classifier? What is the VC dimension of a classifier?
What is structural risk minimization? Find solutions that (1) minimize the empirical risk and (2) have low VC dimension.
It can be shown that, with probability 1 − δ:
R(α) ≤ R_emp(α) + sqrt( [h(ln(2n/h) + 1) − ln(δ/4)] / n )
where h is the VC dimension and n is the number of training examples.

Support Vector Machines
What is the margin of separation? How is it defined? The margin is the distance between the separating hyperplane and the closest training examples; the examples lying on the margin are the support vectors.
What is the relationship between VC dimension and margin of separation? VC dimension is minimized by maximizing the margin of separation.

Support Vector Machines
What is the criterion being optimized by SVMs? Maximize the margin 2/||w||, i.e., minimize (1/2)||w||² subject to y_i(wᵀx_i + w0) ≥ 1 for every training example.

Support Vector Machines
The SVM solution depends only on the support vectors: w = Σ_i α_i y_i x_i, where α_i ≠ 0 only for the support vectors.
Soft margin classifier: tolerate "outliers" by introducing slack variables ξ_i and a penalty parameter C.
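The expansion w = Σ_i α_i y_i x_i can be checked with a tiny numerical sketch. The α values below are made up for illustration; in practice they come from solving the SVM dual problem, with α_i = 0 for every non-support vector.

```python
import numpy as np

# Sketch: reconstructing w from the dual solution, w = sum_i alpha_i * y_i * x_i.
X = np.array([[1.0, 1.0], [2.0, 2.0], [0.0, 0.0]])  # toy training points
y = np.array([1.0, 1.0, -1.0])                      # class labels (+1 / -1)
alpha = np.array([0.5, 0.0, 0.5])                   # alpha = 0 for non-support vectors

w = (alpha * y) @ X   # = sum_i alpha_i y_i x_i
print(w)              # [0.5 0.5]
```

Note that the middle point (with α = 0) contributes nothing: removing non-support vectors from the training set leaves the solution unchanged.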

Support Vector Machines
Non-linear SVM: what is the main idea? Map the data to a high-dimensional space H via a mapping Φ, where it becomes linearly separable.

Support Vector Machines
What is the kernel trick? Compute the dot products in the high-dimensional space using a kernel function evaluated in the original space, K(x,y) = Φ(x)·Φ(y); e.g., the polynomial kernel K(x,y) = (x·y)^d.
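The trick can be demonstrated concretely for the polynomial kernel with d = 2 in two dimensions, where the explicit feature map is known; the vectors x and y below are made up for the example.

```python
import numpy as np

# Kernel trick sketch: K(x, y) = (x.y)^2 equals an explicit dot product in the
# feature space phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2), so the high-dimensional
# mapping never has to be computed.
def poly_kernel(x, y, d=2):
    return np.dot(x, y) ** d

def phi(x):
    return np.array([x[0]**2, np.sqrt(2) * x[0] * x[1], x[1]**2])

x, y = np.array([1.0, 2.0]), np.array([3.0, 1.0])
print(poly_kernel(x, y))       # (3 + 2)^2 = 25.0
print(np.dot(phi(x), phi(y)))  # same value, computed in feature space
```

Evaluating the kernel costs one dot product in the original space, while the explicit map grows rapidly with d and the input dimension, which is exactly why the trick matters.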

Support Vector Machines
Comments about SVMs:
SVM training is a convex optimization problem, so there are no local optima.
Its complexity depends on the number of support vectors, not on the dimensionality of the transformed space.
Performance depends on the choice of the kernel and its parameters.

Expectation-Maximization (EM)
What is the EM algorithm? An iterative method for maximum-likelihood (ML) estimation, i.e., maximizing p(D|θ).
When is EM useful? It is most useful when the data is incomplete or can be thought of as incomplete.

Expectation-Maximization (EM)
What are the steps of the EM algorithm?
Initialization: choose θ(0)
Expectation step: compute the expected complete-data log-likelihood Q(θ; θ(t))
Maximization step: θ(t+1) = argmax_θ Q(θ; θ(t))
Test for convergence: stop when the change in the parameters (or the likelihood) falls below a threshold
Convergence properties of EM? The solution depends on the initial estimate θ(0); there is no guarantee of finding the global maximum, but the iteration is stable.

Expectation-Maximization (EM)
What is a Mixture of Gaussians (MoG)? A weighted sum of Gaussian components.
How are the MoG parameters estimated? Using the EM algorithm.
How is EM used to estimate the MoG parameters? Introduce "hidden" variables indicating which component generated each sample.

Expectation-Maximization (EM)
Can you interpret the EM steps for MoGs?
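As a review aid, the two EM steps for a mixture of Gaussians can be read off a short sketch: the E-step computes each sample's responsibilities (the posteriors of the hidden component labels), and the M-step re-estimates the means, variances, and mixing weights from the responsibility-weighted samples. This is a minimal 1-D, two-component illustration with made-up toy data and initialization, not the lecture's exact formulation.

```python
import numpy as np

# Minimal EM sketch for a 1-D mixture of two Gaussians.
def em_mog(x, iters=50):
    mu = np.array([x.min(), x.max()])   # initialization theta(0)
    sigma = np.array([1.0, 1.0])
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibilities = posterior of each hidden component label.
        p = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * sigma ** 2)) \
            / np.sqrt(2 * np.pi * sigma ** 2)
        r = p / p.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from responsibility-weighted samples.
        n = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / n
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / n) + 1e-9
        pi = n / len(x)
    return mu, sigma, pi

# Toy data: two well-separated clusters around 0 and 5.
x = np.concatenate([np.linspace(-0.5, 0.5, 20), np.linspace(4.5, 5.5, 20)])
mu, sigma, pi = em_mog(x)
print(mu)   # means converge close to 0 and 5
```

Note how the likelihood never decreases across iterations, but the final estimate still depends on the initialization, matching the convergence properties summarized above.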