Published by Amia Erickson. Modified over 2 years ago.

1
Microsoft Research Ltd. Semidefinite Programming Machines. Thore Graepel and Ralf Herbrich, Microsoft Research Cambridge.

2
Overview: Invariant Pattern Recognition. Semidefinite Programming (SDP). From Support Vector Machines (SVMs) to Semidefinite Programming Machines (SDPMs). Experimental Illustration. Future Work.

3
Typical Invariances for Images: Translation, Rotation, Shear.

5
Toy Features for Handwritten Digits [figure: feature values for a digit image; only the value 0.58 survives in the transcript]

6
Warning: Highly Non-Linear [figure: axes φ1 and φ2]

8
Motivation: Classification Learning [figure: axes φ1(x) and φ2(x)]. Can we learn with infinitely many examples?

10
Motivation: Version Spaces [figure panels: Original patterns; Transformed patterns]

11
Semidefinite Programs (SDPs): Linear objective function. Positive semidefinite (psd) constraints. Infinitely many linear constraints.
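
The three properties on this slide can be read off a standard SDP form. The sketch below is an assumption about the notation (cost vector c, constraint matrices A_j and B, matching the symbols used on the later SVM slides), not a transcription of the slide's lost equation. The second line shows why one psd constraint stands for infinitely many linear constraints in w: there is one for every vector v.

```latex
\begin{aligned}
&\min_{\mathbf{w} \in \mathbb{R}^n} \ \mathbf{c}^\top \mathbf{w}
 \quad \text{subject to} \quad
 F(\mathbf{w}) := \sum_{j=1}^{n} w_j A_j - B \succeq 0, \\
&F(\mathbf{w}) \succeq 0
 \iff
 \forall \mathbf{v}: \
 \mathbf{v}^\top F(\mathbf{w})\, \mathbf{v}
 = \sum_{j=1}^{n} w_j \, \mathbf{v}^\top A_j \mathbf{v} - \mathbf{v}^\top B \mathbf{v} \ \ge\ 0 .
\end{aligned}
```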

12
SVM as a Quadratic Program: Given: a sample ((x_1, y_1), …, (x_m, y_m)). SVMs find the weight vector w that maximises the margin on the sample.
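
A minimal sketch of the quadratic program behind this slide, assuming the usual hard-margin form without a bias term: minimise ||w||^2 subject to y_i <w, x_i> >= 1. The toy sample and the two candidate weight vectors are hypothetical. The code only checks feasibility and computes the margin; it does not solve the QP.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def is_feasible(w, sample):
    """Hard-margin constraints: y_i * <w, x_i> >= 1 for every example."""
    return all(y * dot(w, x) >= 1.0 for x, y in sample)

def geometric_margin(w, sample):
    """Smallest signed distance y_i * <w, x_i> / ||w|| over the sample."""
    norm = math.sqrt(dot(w, w))
    return min(y * dot(w, x) / norm for x, y in sample)

# Hypothetical toy sample: two points per class in the plane.
sample = [((2.0, 0.0), +1), ((3.0, 1.0), +1),
          ((-2.0, 0.0), -1), ((-3.0, -1.0), -1)]

w_short = (0.5, 0.0)  # feasible, and one constraint holds with equality
w_long = (1.0, 0.0)   # feasible as well, but with a larger norm

print(is_feasible(w_short, sample), is_feasible(w_long, sample))
print(geometric_margin(w_short, sample), 1.0 / math.sqrt(dot(w_short, w_short)))
```

When a constraint is tight, the margin equals 1/||w||, which is why minimising the norm under these constraints maximises the margin.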

13
SVM as a Semidefinite Program (I): A (block-)diagonal matrix is psd if and only if all its blocks are psd. A_j := diag(g_{1,j}, …, g_{i,j}, …, g_{m,j}), B := [matrix missing from transcript]
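
The block-diagonal observation can be illustrated with 1x1 blocks. The sketch assumes g_{i,j} = y_i * x_{i,j} and B equal to the identity, which turns the i-th diagonal entry of sum_j w_j A_j - B into the SVM constraint y_i <w, x_i> - 1. The transcript does not preserve these definitions, so treat them as an illustration rather than the slide's own.

```python
def svm_constraint_diagonal(w, sample):
    """Diagonal of F(w) = sum_j w_j * A_j - B under the assumed g_ij and B = I."""
    return [y * sum(wj * xj for wj, xj in zip(w, x)) - 1.0 for x, y in sample]

def diagonal_is_psd(diagonal):
    """A diagonal matrix is psd iff each 1x1 block (entry) is non-negative."""
    return all(entry >= 0.0 for entry in diagonal)

# Hypothetical toy sample of (x_i, y_i) pairs.
sample = [((2.0, 0.0), +1), ((-2.0, 0.0), -1), ((3.0, 1.0), +1)]

print(diagonal_is_psd(svm_constraint_diagonal((0.5, 0.0), sample)))   # all margins >= 1
print(diagonal_is_psd(svm_constraint_diagonal((0.25, 0.0), sample)))  # a margin is violated
```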

15
SVM as a Semidefinite Program (II): Transform the quadratic objective into a linear one. This adds a new (n+1)×(n+1) block to A_j and B. Use Schur's complement lemma.
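
The step from a quadratic to a linear objective can be written out as follows. The auxiliary variable t is an assumed name, and the factor 1/2 is dropped because it does not change the minimiser. Since I_n is positive definite, the Schur complement lemma reduces the (n+1)×(n+1) psd condition to a scalar inequality.

```latex
\begin{aligned}
\min_{\mathbf{w}} \ \|\mathbf{w}\|^2
&\;\longrightarrow\;
\min_{\mathbf{w},\, t} \ t
\quad \text{subject to} \quad \|\mathbf{w}\|^2 \le t, \\
\begin{pmatrix} I_n & \mathbf{w} \\ \mathbf{w}^\top & t \end{pmatrix} \succeq 0
&\iff
t - \mathbf{w}^\top I_n^{-1} \mathbf{w} \;=\; t - \|\mathbf{w}\|^2 \;\ge\; 0 .
\end{aligned}
```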

16
Taylor Approximation of Invariance: Let T(x, θ) be an invariance transformation with parameter θ (e.g., angle of rotation). Taylor expansion about θ_0 = 0 gives a polynomial approximation to the trajectory.
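
As a hedged illustration, take T to be rotation of a point in the plane (a stand-in for image rotation, not the slide's own example). Its first and second derivatives at angle 0 are available in closed form, so the second-order Taylor trajectory can be compared with the exact one.

```python
import math

def rotate(x, theta):
    """Exact transformation T(x, theta): rotation by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

def taylor_rotate(x, theta):
    """Second-order Taylor expansion of T(x, theta) about theta = 0."""
    d1 = (-x[1], x[0])   # first derivative of the rotation at 0
    d2 = (-x[0], -x[1])  # second derivative of the rotation at 0
    return tuple(x[k] + theta * d1[k] + 0.5 * theta ** 2 * d2[k] for k in range(2))

def error(x, theta):
    """Euclidean distance between the exact and approximated points."""
    exact, approx = rotate(x, theta), taylor_rotate(x, theta)
    return math.hypot(exact[0] - approx[0], exact[1] - approx[1])

x = (1.0, 0.0)
for theta in (0.0, 0.1, 0.5, 1.0):
    print(theta, error(x, theta))
```

The error grows with the angle, so the polynomial trajectory is only trustworthy for small parameter values.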

17
Extension to Polynomials: Consider the polynomial trajectory x(θ): [equation missing from transcript]. Infinite number of constraints from training example (x^(0), …, x^(r), y): [equation missing from transcript]
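
One consistent way to write the two statements on this slide, with θ as the transformation parameter and the SVM margin normalisation of 1 assumed. The coefficient vectors x^{(j)} and the shorthand c_j are notation chosen for this sketch. Each value of θ contributes one linear constraint on w, and together they say that a polynomial in θ with coefficients linear in w is non-negative everywhere.

```latex
\begin{aligned}
\tilde{\mathbf{x}}(\theta) &:= \sum_{j=0}^{r} \theta^{j}\, \mathbf{x}^{(j)}, \\
\forall \theta \in \mathbb{R}: \quad
y\, \mathbf{w}^\top \tilde{\mathbf{x}}(\theta) - 1
&= \underbrace{\left( y\, \mathbf{w}^\top \mathbf{x}^{(0)} - 1 \right)}_{c_0}
 + \sum_{j=1}^{r} \underbrace{y\, \mathbf{w}^\top \mathbf{x}^{(j)}}_{c_j}\, \theta^{j}
 \;\ge\; 0 .
\end{aligned}
```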

18
Non-Negative Polynomials (I): Theorem (Nesterov, 2000): If r = 2l then (1) for every psd matrix P the polynomial p(θ) = θ^T P θ is non-negative everywhere; (2) for every non-negative polynomial p there exists a psd matrix P such that p(θ) = θ^T P θ. Here θ on the right-hand side stands for the vector of monomials (1, θ, …, θ^l)^T. Example: [missing from transcript]
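
Part (1) of the theorem can be tried out numerically. The sketch below expands v^T P v for v = (1, θ, …, θ^l) into polynomial coefficients; the matrix P is a hypothetical example, not the one from the slide.

```python
def quadratic_form_coefficients(P):
    """Coefficients c_0..c_2l of p(theta) = v^T P v with v = (1, theta, ..., theta^l)."""
    size = len(P)
    coeffs = [0.0] * (2 * size - 1)
    for i in range(size):
        for j in range(size):
            coeffs[i + j] += P[i][j]  # the term theta^i * P[i][j] * theta^j
    return coeffs

def evaluate(coeffs, theta):
    return sum(c * theta ** k for k, c in enumerate(coeffs))

# Hypothetical psd matrix (eigenvalues 0 and 2); here l = 1, so r = 2.
P = [[1.0, -1.0], [-1.0, 1.0]]
coeffs = quadratic_form_coefficients(P)
print(coeffs)  # coefficients of 1 - 2*theta + theta^2 = (1 - theta)^2
```

The resulting polynomial is a perfect square, which is the sum-of-squares structure that a psd matrix P guarantees.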

19
Non-Negative Polynomials (II): (1) follows directly from the psd definition. (2) follows from the sum-of-squares lemma. Note that (2) states mere existence: a polynomial of degree r has r+1 parameters, while the coefficient matrix P has (r+2)(r+4)/8 parameters. For r > 2, we have to introduce another r(r−2)/8 auxiliary variables to find P.
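
The counts on this slide follow from P being a symmetric (l+1)×(l+1) matrix with r = 2l. A small sketch (function name hypothetical) that recomputes them and checks the two closed forms:

```python
def parameter_counts(r):
    """Return (polynomial parameters, free entries of symmetric P, auxiliary variables)."""
    if r < 0 or r % 2 != 0:
        raise ValueError("r must be a non-negative even degree")
    l = r // 2
    poly = r + 1                     # coefficients c_0..c_r
    matrix = (l + 1) * (l + 2) // 2  # upper triangle of an (l+1) x (l+1) matrix
    return poly, matrix, matrix - poly

for r in (2, 4, 6):
    print(r, parameter_counts(r))
```

For r = 2 the matrix is fully determined by the polynomial, so no auxiliary variables are needed; the surplus appears from r = 4 onwards.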

20
Semidefinite Programming Machines: Extension of SVMs as a (non-trivial) SDP. The scalar diagonal entries g_{1,j}, …, g_{i,j}, …, g_{m,j} of A_j are replaced by blocks: A_j := diag(G_{1,j}, …, G_{i,j}, …, G_{m,j}), B := [matrix missing from transcript]

22
Example: Second-Order SDPMs. Second-order Taylor expansion: [equation missing from transcript]. Resulting polynomial in θ: [equation missing from transcript]. Set of constraint matrices: [equation missing from transcript]
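
For degree 2 the coefficient matrix is unique, so the constraint for one example can be checked directly. In this sketch x0, x1 and x2 are hypothetical coefficient vectors of 1, θ and θ^2 in the approximated trajectory (x2 would be half the second derivative), and the 2×2 matrix G is one consistent reconstruction, not a transcription of the slide.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def constraint_matrix(w, x0, x1, x2, y):
    """2x2 matrix G with (1, theta) G (1, theta)^T = y * <w, x(theta)> - 1."""
    off = 0.5 * y * dot(w, x1)
    return [[y * dot(w, x0) - 1.0, off], [off, y * dot(w, x2)]]

def is_psd_2x2(G):
    """A symmetric 2x2 matrix is psd iff its diagonal and determinant are non-negative."""
    a, b, c = G[0][0], G[0][1], G[1][1]
    return a >= 0.0 and c >= 0.0 and a * c - b * b >= 0.0

def polynomial(w, x0, x1, x2, y, theta):
    x_theta = [p0 + theta * p1 + theta ** 2 * p2 for p0, p1, p2 in zip(x0, x1, x2)]
    return y * dot(w, x_theta) - 1.0

# Hypothetical trajectory coefficients and label.
x0, x1, x2, y = (2.0, 0.0), (0.0, 1.0), (1.0, 0.0), +1

w_ok = (1.0, 0.5)   # polynomial theta^2 + 0.5*theta + 1, never negative
w_bad = (1.0, 3.0)  # polynomial theta^2 + 3*theta + 1, negative near theta = -1.5

print(is_psd_2x2(constraint_matrix(w_ok, x0, x1, x2, y)))
print(is_psd_2x2(constraint_matrix(w_bad, x0, x1, x2, y)))
```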

24
Non-Negative on Segment: Given a polynomial p of degree 2l, consider the polynomial q [definition missing from transcript]. Note that q is a polynomial of degree 4l. If q is positive everywhere, then p is positive everywhere in [−τ, +τ]. [figure: f(θ)]
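
The definition of q is not preserved in the transcript. One standard construction that matches the stated degree 4l is q(θ) = (1 + θ^2)^(2l) · p(2τθ / (1 + θ^2)), since the map θ -> 2τθ / (1 + θ^2) sends the real line onto [−τ, +τ]. The sketch below assumes that construction, which may differ from the slide's; τ denotes the half-width of the segment.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_pow(a, n):
    out = [1.0]
    for _ in range(n):
        out = poly_mul(out, a)
    return out

def segment_to_line(c, tau):
    """Coefficients of q(theta) = (1 + theta^2)^r * p(2*tau*theta / (1 + theta^2))."""
    r = len(c) - 1  # degree of p, r = 2l
    q = [0.0] * (2 * r + 1)
    for k, ck in enumerate(c):
        # c_k * (2*tau*theta)^k * (1 + theta^2)^(r - k)
        term = poly_mul(poly_pow([0.0, 2.0 * tau], k), poly_pow([1.0, 0.0, 1.0], r - k))
        for d, t in enumerate(term):
            q[d] += ck * t
    return q

# p(u) = 1 - u^2 is non-negative on [-1, 1] but negative outside it.
q = segment_to_line([1.0, 0.0, -1.0], tau=1.0)
print(q)  # coefficients of 1 - 2*theta^2 + theta^4 = (1 - theta^2)^2
```

In this example p is negative outside the segment while q is a perfect square, so the global condition on q captures exactly the segment condition on p.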

26
Truly Virtual Support Vectors: Dual complementarity yields the expansion: [equation missing from transcript]. The truly virtual support vectors are linear combinations of derivatives: [equation missing from transcript]
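
A plausible reading of the two missing equations, offered as an assumption: the weight vector expands over one point on each training trajectory, and that point is the polynomial trajectory evaluated at an optimal parameter θ_i^*. The dual variables α_i and the coefficient vectors x_i^{(j)} are notation chosen for this sketch.

```latex
\begin{aligned}
\mathbf{w} &= \sum_{i=1}^{m} \alpha_i\, y_i\, \tilde{\mathbf{x}}_i(\theta_i^{*}), \\
\tilde{\mathbf{x}}_i(\theta_i^{*}) &= \sum_{j=0}^{r} \left(\theta_i^{*}\right)^{j} \mathbf{x}_i^{(j)} .
\end{aligned}
```

Because each x_i^{(j)} is a Taylor coefficient, every such point is a linear combination of the original example and its derivatives with respect to the transformation parameter.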

28
Visualisation: USPS 1 vs. [digit missing from transcript], τ = 20°

29
Results: Experimental Setup. All 45 USPS classification tasks (1-v-1). 20 training images; 250 test images. Rotation is applied to all training images with τ = 10°. All results are averaged over 50 random training sets. Compared to SVM and virtual SVM.
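
The figure of 45 tasks is the number of unordered pairs among the ten USPS digit classes. A short sketch of that bookkeeping; the variable names are illustrative and the run count simply multiplies the two numbers from the slide.

```python
from itertools import combinations

digits = range(10)                      # USPS digit classes 0..9
tasks = list(combinations(digits, 2))   # one-versus-one pairs
training_sets_per_task = 50             # random training sets per task

total_runs = len(tasks) * training_sets_per_task
print(len(tasks), total_runs)
```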

30
Results: SDPM vs. SVM [figure: axes SVM error and SDPM error]

31
Results: SDPM vs. Virtual SVM [figure: axes VSVM error and SDPM error]

33
Results: Curse of Dimensionality [figure panels: 1 parameter; 2 parameters]

34
Extensions & Future Work: Multiple parameters θ_1, θ_2, …, θ_D. (Efficient) adaptation to kernel space. Semidefinite Perceptrons (NIPS poster with A. Kharechko and J. Shawe-Taylor). Sparsification by efficiently finding the example x and transformation θ with maximal information (idea of Neil Lawrence). Expectation propagation for BPMs (idea of Tom Minka).

35
Conclusions & Future Work: Learning from infinitely many examples. Truly virtual support vectors x_i(θ_i*). Multiple parameters θ_1, θ_2, …, θ_D. (Efficient) adaptation to kernel space. Semidefinite Perceptrons (NIPS poster with A. Kharechko and J. Shawe-Taylor).
