Support Vector Classification (Linearly Separable Case, Primal): The hyperplane that solves the minimization problem realizes the maximal margin hyperplane.
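Support Vector Classification (Linearly Separable Case, Primal): The hyperplane $(w, b)$ that solves the minimization problem $\min_{w,b} \frac{1}{2}\|w\|_2^2$ subject to $y_i(\langle w, x_i\rangle + b) \ge 1$, $i = 1, \dots, \ell$, realizes the maximal margin hyperplane with geometric margin $\gamma = 1/\|w\|_2$.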
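The relation between the canonical constraints and the geometric margin can be checked numerically. The sketch below is only an illustration: the toy points, the hyperplane `(w, b)`, and the helper name `geometric_margin` are assumptions, not part of the slides.

```python
import math

def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any training point to the hyperplane <w, x> + b = 0."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm_w
        for x, y in zip(points, labels)
    )

# Hypothetical toy data: two positive and two negative points in the plane.
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
labels = [1, 1, -1, -1]

# A hyperplane in canonical form: the closest points satisfy y_i(<w, x_i> + b) = 1.
w, b = (0.5, 0.5), -1.0
margin = geometric_margin(w, b, points, labels)

# For a canonical hyperplane the geometric margin equals 1 / ||w||.
inverse_norm = 1.0 / math.sqrt(sum(wi * wi for wi in w))
```

Here `margin` and `inverse_norm` agree because the constraints hold with equality at the points (2, 2) and (0, 0).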
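Support Vector Classification (Linearly Separable Case, Dual Form): The dual problem of the previous minimization problem is $\max_{\alpha} \sum_{i=1}^{\ell}\alpha_i - \frac{1}{2}\sum_{i,j=1}^{\ell} y_i y_j \alpha_i \alpha_j \langle x_i, x_j\rangle$ subject to $\sum_{i=1}^{\ell} y_i\alpha_i = 0$, $\alpha_i \ge 0$. Applying the KKT optimality conditions, we have $w = \sum_{i=1}^{\ell} y_i\alpha_i x_i$. But where is $b$? Don't forget the complementarity condition $\alpha_i\,[\,y_i(\langle w, x_i\rangle + b) - 1\,] = 0$, $i = 1, \dots, \ell$.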
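Dual Representation of SVM (Key of Kernel Methods: $w = \sum_{i=1}^{\ell} y_i\alpha_i^* x_i$): The hypothesis is determined by $(\alpha^*, b^*)$: $h(x) = \operatorname{sgn}\bigl(\sum_{i=1}^{\ell} y_i\alpha_i^*\langle x_i, x\rangle + b^*\bigr)$, where only the support vectors ($\alpha_i^* > 0$) contribute to the sum.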
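Compute the Geometric Margin via Dual Solution: The geometric margin is $\gamma = 1/\|w^*\|_2$ and $\|w^*\|_2^2 = \sum_{i,j=1}^{\ell} y_i y_j\alpha_i^*\alpha_j^*\langle x_i, x_j\rangle$, hence we can compute $\gamma$ by using $\alpha^*$. Use KKT again (in dual)! Summing the complementarity conditions $\alpha_i^*[\,y_i(\langle w^*, x_i\rangle + b^*) - 1\,] = 0$ over $i$, and don't forget $\sum_{i=1}^{\ell} y_i\alpha_i^* = 0$, gives $\|w^*\|_2^2 = \sum_{i=1}^{\ell}\alpha_i^*$, so $\gamma = \bigl(\sum_{i=1}^{\ell}\alpha_i^*\bigr)^{-1/2}$.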
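For a sample with one positive and one negative point, the hard-margin dual can be solved by hand, which makes the KKT bookkeeping easy to follow. The following is a minimal sketch under that assumption; the two points and all variable names are hypothetical.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Hypothetical toy sample: one positive and one negative point.
x_pos, x_neg = (2.0, 2.0), (0.0, 0.0)

# The constraint sum_i y_i alpha_i = 0 forces alpha_pos = alpha_neg = a, so the dual
# objective reduces to 2a - 0.5 * a^2 * ||x_pos - x_neg||^2, maximized at a = 2 / d^2.
diff = tuple(p - n for p, n in zip(x_pos, x_neg))
d_squared = dot(diff, diff)
a = 2.0 / d_squared
alpha = [a, a]

# KKT stationarity: w = sum_i alpha_i y_i x_i.
w = tuple(a * p - a * n for p, n in zip(x_pos, x_neg))

# KKT complementarity on the positive support vector: <w, x_pos> + b = 1.
b = 1.0 - dot(w, x_pos)

# Geometric margin from the dual solution alone, using ||w||^2 = sum_i alpha_i.
margin_from_alpha = 1.0 / math.sqrt(sum(alpha))
margin_from_w = 1.0 / math.sqrt(dot(w, w))
```

Both margin values coincide, and they equal half the distance between the two points, as expected for a maximal margin hyperplane.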
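Soft Margin SVM (Nonseparable Case): If the data are not linearly separable, the primal problem is infeasible and the dual problem is unbounded above. Introduce a slack variable $\xi_i \ge 0$ for each training point and relax the constraints to $y_i(\langle w, x_i\rangle + b) \ge 1 - \xi_i$, $i = 1, \dots, \ell$. The inequality system is then always feasible, e.g. $w = 0$, $b = 0$, $\xi_i = 1$ for all $i$.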
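The role of the slack variables can be illustrated with a small sketch. The 1-D sample and the helper name `slack` are assumptions chosen for illustration; the function returns the smallest slack that makes each relaxed constraint hold.

```python
def slack(w, b, points, labels):
    """Smallest xi_i >= 0 satisfying y_i(<w, x_i> + b) >= 1 - xi_i for each point."""
    return [
        max(0.0, 1.0 - y * (sum(wi * xi for wi, xi in zip(w, x)) + b))
        for x, y in zip(points, labels)
    ]

# Hypothetical 1-D sample that no hyperplane separates: labels alternate along the line.
points = [(0.0,), (1.0,), (2.0,)]
labels = [1, -1, 1]

# The trivial choice w = 0, b = 0 needs xi_i = 1 everywhere, so the system is feasible.
xi_trivial = slack((0.0,), 0.0, points, labels)

# Any other (w, b) is feasible too; it just pays a different slack on each point.
xi_other = slack((1.0,), -0.5, points, labels)
```

A point with zero slack lies on or beyond its margin boundary, while a slack larger than 1 marks a misclassified point.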
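Two Different Measures of Training Error: 2-Norm Soft Margin: $\min_{w,b,\xi} \frac{1}{2}\|w\|_2^2 + \frac{C}{2}\sum_{i=1}^{\ell}\xi_i^2$ subject to $y_i(\langle w, x_i\rangle + b) \ge 1 - \xi_i$. 1-Norm Soft Margin: $\min_{w,b,\xi} \frac{1}{2}\|w\|_2^2 + C\sum_{i=1}^{\ell}\xi_i$ subject to $y_i(\langle w, x_i\rangle + b) \ge 1 - \xi_i$, $\xi_i \ge 0$.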
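2-Norm Soft Margin Dual Formulation: The Lagrangian for the 2-norm soft margin is $L(w, b, \xi, \alpha) = \frac{1}{2}\langle w, w\rangle + \frac{C}{2}\sum_{i=1}^{\ell}\xi_i^2 - \sum_{i=1}^{\ell}\alpha_i\,[\,y_i(\langle w, x_i\rangle + b) - 1 + \xi_i\,]$, where $\alpha_i \ge 0$. The partial derivatives with respect to the primal variables equal zero: $\partial L/\partial w = w - \sum_{i=1}^{\ell} y_i\alpha_i x_i = 0$, $\partial L/\partial b = \sum_{i=1}^{\ell} y_i\alpha_i = 0$, $\partial L/\partial \xi = C\xi - \alpha = 0$.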
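The substitution that turns the Lagrangian into a function of $\alpha$ alone can be written out as follows. This is a sketch that assumes the 2-norm penalty is scaled as $\frac{C}{2}\sum_i \xi_i^2$; a different scaling of $C$ would only rescale the $1/C$ term.

```latex
\begin{align*}
L(w,b,\xi,\alpha) &= \tfrac{1}{2}\langle w, w\rangle + \tfrac{C}{2}\sum_{i}\xi_i^2
  - \sum_{i}\alpha_i\bigl[y_i(\langle w, x_i\rangle + b) - 1 + \xi_i\bigr],
  \qquad \alpha_i \ge 0 \\
\frac{\partial L}{\partial w} = 0 &\;\Rightarrow\; w = \sum_{i} y_i\alpha_i x_i \\
\frac{\partial L}{\partial b} = 0 &\;\Rightarrow\; \sum_{i} y_i\alpha_i = 0 \\
\frac{\partial L}{\partial \xi} = 0 &\;\Rightarrow\; \xi = \tfrac{1}{C}\,\alpha \\
L &= \sum_{i}\alpha_i
  - \tfrac{1}{2}\sum_{i,j} y_i y_j\alpha_i\alpha_j\langle x_i, x_j\rangle
  - \tfrac{1}{2C}\sum_{i}\alpha_i^2
\end{align*}
```

The last line follows because the terms in $b$ vanish, $\langle w, w\rangle$ appears once with coefficient $\frac{1}{2}$ and once with coefficient $-1$, and the two $\xi$ terms combine to $\frac{1}{2C}\sum_i\alpha_i^2 - \frac{1}{C}\sum_i\alpha_i^2$.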
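Dual Maximization Problem for 2-Norm Soft Margin: Dual: $\max_{\alpha} \sum_{i=1}^{\ell}\alpha_i - \frac{1}{2}\sum_{i,j=1}^{\ell} y_i y_j\alpha_i\alpha_j\bigl(\langle x_i, x_j\rangle + \frac{1}{C}\delta_{ij}\bigr)$ subject to $\sum_{i=1}^{\ell} y_i\alpha_i = 0$, $\alpha_i \ge 0$, where $\delta_{ij} = 1$ if $i = j$ and $0$ otherwise. The corresponding KKT complementarity: $\alpha_i\,[\,y_i(\langle w, x_i\rangle + b) - 1 + \xi_i\,] = 0$, $i = 1, \dots, \ell$. Use the above conditions to find $b^*$.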
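Linear Machine in Feature Space: Let $\phi: X \to F$ be a nonlinear map from the input space to some feature space. The classifier will be in the form (primal): $f(x) = \sum_{j} w_j\phi_j(x) + b$. Make it in the dual form: $f(x) = \sum_{i=1}^{\ell}\alpha_i y_i\langle\phi(x_i), \phi(x)\rangle + b$.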
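Kernel: Represent Inner Product in Feature Space: The classifier will become $f(x) = \sum_{i=1}^{\ell}\alpha_i y_i K(x_i, x) + b$. Definition: A kernel is a function $K: X \times X \to \mathbb{R}$ such that for all $x, z \in X$, $K(x, z) = \langle\phi(x), \phi(z)\rangle$, where $\phi: X \to F$.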
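A concrete instance of the definition may help. The sketch below uses the homogeneous quadratic kernel on 2-D inputs, for which an explicit feature map is known; the choice of kernel and the sample vectors are illustrative assumptions.

```python
import math

def quadratic_kernel(x, z):
    """K(x, z) = <x, z>^2, evaluated in the input space."""
    return sum(xi * zi for xi, zi in zip(x, z)) ** 2

def phi(x):
    """Explicit feature map for 2-D inputs: (x1^2, sqrt(2) * x1 * x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

x, z = (1.0, 2.0), (3.0, -1.0)

# Kernel value computed without ever forming phi(x) or phi(z).
kernel_value = quadratic_kernel(x, z)

# The same number obtained as an inner product in the 3-D feature space.
feature_inner_product = sum(a * b for a, b in zip(phi(x), phi(z)))
```

The two quantities agree up to floating-point rounding, which is exactly the statement $K(x, z) = \langle\phi(x), \phi(z)\rangle$.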
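Introduce Kernel into Dual Formulation: Let $S = \{(x_1, y_1), \dots, (x_\ell, y_\ell)\}$ be a linearly separable training sample in the feature space implicitly defined by the kernel $K(x, z)$. The SV classifier is determined by $\alpha^*$ that solves $\max_{\alpha} \sum_{i=1}^{\ell}\alpha_i - \frac{1}{2}\sum_{i,j=1}^{\ell} y_i y_j\alpha_i\alpha_j K(x_i, x_j)$ subject to $\sum_{i=1}^{\ell} y_i\alpha_i = 0$, $\alpha_i \ge 0$.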
Kernel Technique Based on Mercer's Condition (1909): The value of the kernel function represents the inner product in the feature space. Kernel functions merge two steps: 1. map the input data from the input space to the feature space (which might be infinite-dimensional); 2. compute the inner product in the feature space.
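Mercer's Conditions Guarantee the Convexity of the QP: Let $X = \{x_1, \dots, x_n\}$ be a finite space and let $K(x, z)$ be a symmetric function on $X$. Then $K(x, z)$ is a kernel function if and only if the matrix $\mathbf{K} = \bigl(K(x_i, x_j)\bigr)_{i,j=1}^{n}$ is positive semi-definite.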
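The finite-space condition can be probed numerically by evaluating the quadratic form $v^\top \mathbf{K} v$ on a few vectors. The sketch below is only a spot check, not a proof of positive semi-definiteness; the sample points, the test vectors, and the two functions compared are illustrative assumptions.

```python
import math

def gram_matrix(kernel, points):
    return [[kernel(x, z) for z in points] for x in points]

def quadratic_form(K, v):
    """Return v' K v."""
    n = len(v)
    return sum(v[i] * K[i][j] * v[j] for i in range(n) for j in range(n))

def gaussian(x, z):
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)))

def negated_dot(x, z):
    # Symmetric, but not a kernel: its Gram matrix has negative diagonal entries.
    return -sum(a * b for a, b in zip(x, z))

points = [(0.0, 1.0), (1.0, 1.0), (2.0, 0.5)]
# Test vectors: the unit vectors plus a few mixed-sign combinations.
test_vectors = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, -1, 0), (1, -2, 1), (-1, 1, 1)]

gaussian_forms = [quadratic_form(gram_matrix(gaussian, points), v) for v in test_vectors]
negated_forms = [quadratic_form(gram_matrix(negated_dot, points), v) for v in test_vectors]
```

Every entry of `gaussian_forms` is nonnegative, consistent with the Gaussian function being a kernel, while `negated_forms` contains negative values, so that symmetric function fails the condition and would make the dual QP non-concave.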
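Introduce Kernel in Dual Formulation for 2-Norm Soft Margin: Consider the feature space implicitly defined by $K(x, z)$. Suppose $\alpha^*$ solves the QP problem: $\max_{\alpha} \sum_{i=1}^{\ell}\alpha_i - \frac{1}{2}\sum_{i,j=1}^{\ell} y_i y_j\alpha_i\alpha_j\bigl(K(x_i, x_j) + \frac{1}{C}\delta_{ij}\bigr)$ subject to $\sum_{i=1}^{\ell} y_i\alpha_i = 0$, $\alpha_i \ge 0$. Then the decision rule is defined by $h(x) = \operatorname{sgn}\bigl(\sum_{i=1}^{\ell} y_i\alpha_i^* K(x_i, x) + b^*\bigr)$. Use the KKT conditions to find $b^*$.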
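Introduce Kernel in Dual Formulation for 2-Norm Soft Margin: $b^*$ is chosen so that $y_i\bigl[\sum_{j=1}^{\ell} y_j\alpha_j^* K(x_j, x_i) + b^*\bigr] = 1 - \frac{\alpha_i^*}{C}$ for any $i$ with $\alpha_i^* \ne 0$. Because: $\alpha_i^*\bigl[\,y_i\bigl(\sum_{j=1}^{\ell} y_j\alpha_j^* K(x_j, x_i) + b^*\bigr) - 1 + \xi_i^*\,\bigr] = 0$ and $\alpha^* = C\xi^*$.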
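Geometric Margin in Feature Space for 2-Norm Soft Margin: The geometric margin in the feature space is defined by $\gamma = 1/\|w^*\|_2 = \bigl(\sum_{i=1}^{\ell}\alpha_i^* - \frac{1}{C}\|\alpha^*\|_2^2\bigr)^{-1/2}$. Why? Summing the KKT complementarity conditions over $i$ and using $\sum_{i=1}^{\ell} y_i\alpha_i^* = 0$ and $\xi^* = \alpha^*/C$ gives $\|w^*\|_2^2 = \sum_{i,j=1}^{\ell} y_i y_j\alpha_i^*\alpha_j^* K(x_i, x_j) = \sum_{i=1}^{\ell}\alpha_i^* - \frac{1}{C}\langle\alpha^*, \alpha^*\rangle$.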
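Discussion about C for 2-Norm Soft Margin: The only difference between "hard margin" and 2-norm soft margin is the objective function in the optimization problem: compare $K(x_i, x_j)$ with $K(x_i, x_j) + \frac{1}{C}\delta_{ij}$. A larger $C$ will give you a smaller margin in the feature space. A smaller $C$ will give you a better numerical condition, because a larger multiple of the identity is added to the kernel matrix.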
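The effect of $C$ on the margin can be seen in closed form for a sample with one positive and one negative point. The sketch below assumes the 2-norm penalty is scaled as $\frac{C}{2}\sum_i\xi_i^2$, so that the dual adds $\frac{1}{C}$ to the diagonal of the kernel matrix; the function name and the numbers are hypothetical.

```python
import math

def two_point_soft_margin(d_squared, C):
    """2-norm soft margin dual for one positive and one negative point.

    d_squared = K(x+, x+) + K(x-, x-) - 2 K(x+, x-) is the squared distance in
    feature space. With alpha_+ = alpha_- = a, the dual objective is
    2a - 0.5 * a^2 * (d_squared + 2 / C), maximized at a = 2 / (d_squared + 2 / C).
    """
    a = 2.0 / (d_squared + 2.0 / C)
    w_norm_squared = 2.0 * a - 2.0 * a * a / C  # sum(alpha) - <alpha, alpha> / C
    return a, 1.0 / math.sqrt(w_norm_squared)

d_squared = 8.0  # hypothetical: the points (2, 2) and (0, 0) with the linear kernel
margins = {C: two_point_soft_margin(d_squared, C)[1] for C in (0.1, 1.0, 10.0, 1000.0)}

# Hard-margin value for comparison: half the distance between the two points.
hard_margin = math.sqrt(d_squared) / 2.0
```

In this toy case the margin shrinks monotonically as `C` grows and approaches the hard-margin value from above, which matches the slide's remark that a larger $C$ gives a smaller margin.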