Support Vector Classifier (SVC)
[Figure: three labelled support vectors.]
a is a weighted linear combination of the support vectors

Some Equations
o Linear classification is by $y = f(a^T x + b)$
  o where $a$ is a weighting vector, $x$ is the test data, $b$ is an offset, and $f(\cdot)$ is a thresholding operation
o $a$ is a linear combination of SVs: $a = \sum_i w_i x_i$
o So $y = f(\sum_i w_i x_i^T x + b)$

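The two forms of the linear classifier above can be sketched in a few lines. This is a minimal illustration, not a trained SVC: the support vectors, weights and offset are hypothetical toy values, and the threshold is assumed to return +1 or -1.

```python
def threshold(score):
    """Thresholding operation f(.): assumed here to return +1 or -1."""
    return 1 if score >= 0 else -1

def dot(u, v):
    """Inner product u^T v of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def weight_vector(weights, support_vectors):
    """a = sum_i w_i x_i: weighted linear combination of the support vectors."""
    dim = len(support_vectors[0])
    return [sum(w * sv[d] for w, sv in zip(weights, support_vectors))
            for d in range(dim)]

def classify_linear(x, weights, support_vectors, b):
    """y = f(sum_i w_i x_i^T x + b), written directly in terms of the SVs."""
    score = sum(w * dot(sv, x) for w, sv in zip(weights, support_vectors)) + b
    return threshold(score)

# Hypothetical toy values, chosen only for illustration.
support_vectors = [[1.0, 1.0], [-1.0, -1.0]]
weights = [0.5, -0.5]
b = 0.0

a = weight_vector(weights, support_vectors)
x = [2.0, 0.5]
label_from_svs = classify_linear(x, weights, support_vectors, b)
label_from_a = threshold(dot(a, x) + b)   # same result via a^T x + b
```

Both routes give the same label, because summing the weighted inner products is the same as taking the inner product with the weighted sum.
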
Going Nonlinear
o Nonlinear classification is by $y = f(\sum_i w_i k(x_i, x))$
  o where $k(x_i, x)$ is some function of $x_i$ and $x$.
o e.g. RBF classification: $k(x_i, x) = \exp(-\|x_i - x\|^2 / (2\sigma^2))$
o Requires a matrix of distance measures (metrics) between each pair of images.

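A rough sketch of the RBF case follows. It assumes each image has already been reduced to a short feature vector and that the distance is Euclidean; the data, weights and sigma are made-up values, and the function names are illustrative only.

```python
import math

def rbf_kernel(xi, x, sigma):
    """exp(-||xi - x||^2 / (2 sigma^2)) for two equal-length vectors."""
    squared_distance = sum((p - q) ** 2 for p, q in zip(xi, x))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def kernel_matrix(data, sigma):
    """Kernel value for every pair of items (symmetric, ones on the diagonal)."""
    return [[rbf_kernel(p, q, sigma) for q in data] for p in data]

def classify_rbf(x, weights, support_vectors, sigma):
    """y = f(sum_i w_i k(x_i, x)), with f assumed to return +1 or -1."""
    score = sum(w * rbf_kernel(sv, x, sigma)
                for w, sv in zip(weights, support_vectors))
    return 1 if score >= 0 else -1

# Hypothetical toy data.
data = [[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]]
K = kernel_matrix(data, sigma=1.0)

near_first = classify_rbf([0.1, 0.0], [1.0, -1.0], data[:2], sigma=1.0)
near_second = classify_rbf([3.0, 3.9], [1.0, -1.0], data[:2], sigma=1.0)
```

Only the pairwise distances enter the kernel, which is why a matrix of distances between images is all the classifier needs.
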
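What is a Metric?
o Positive
  o $\mathrm{Dist}(A,B) \ge 0$
  o $\mathrm{Dist}(A,A) = 0$
o Symmetric
  o $\mathrm{Dist}(A,B) = \mathrm{Dist}(B,A)$
o Satisfy triangle inequality
  o $\mathrm{Dist}(A,B) + \mathrm{Dist}(B,C) \ge \mathrm{Dist}(A,C)$
[Figure: points A, B and C.]
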
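The three requirements can be checked numerically on a finite set of points. The sketch below is only a spot check on the sample given (it cannot prove that a function is a metric everywhere), and the two candidate distance functions are chosen for illustration.

```python
import itertools
import math

def is_metric_on(points, dist, tol=1e-12):
    """Check positivity, symmetry and the triangle inequality on a sample."""
    for a in points:
        if abs(dist(a, a)) > tol:                 # Dist(A,A) = 0
            return False
    for a, b in itertools.product(points, repeat=2):
        if dist(a, b) < -tol:                     # Dist(A,B) >= 0
            return False
        if abs(dist(a, b) - dist(b, a)) > tol:    # Dist(A,B) = Dist(B,A)
            return False
    for a, b, c in itertools.product(points, repeat=3):
        if dist(a, b) + dist(b, c) < dist(a, c) - tol:   # triangle inequality
            return False
    return True

def euclidean(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def squared_euclidean(p, q):
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
euclidean_ok = is_metric_on(points, euclidean)
squared_ok = is_metric_on(points, squared_euclidean)
```

The squared Euclidean distance fails on these three collinear points, since 1 + 1 < 4 violates the triangle inequality.
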
Partial Differential Equations
Model one image as it deforms to match another.
$dx/dt = V x(t)$
It's a bit like DCM, but with much bigger V matrices (about 10,000,000 x 10,000,000, instead of about 4 x 4).
$x(t+1) = e^V x(t)$

Matrix representations of diffeomorphisms
$x(1) = e^V x(0)$
$x(0) = e^{-V} x(1)$
For large $k$: $e^V \approx (I + V/k)^k$

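The large-k approximation and the inverse can be tried on a tiny matrix. This sketch assumes a 2 x 2 generator whose exponential is known in closed form (a rotation), so the approximation can be compared against it; real V matrices are far too large to handle this way.

```python
import math

def matmul(A, B):
    """Multiply two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def exp_limit(V, k):
    """Approximate e^V by (I + V/k)^k using k plain multiplications."""
    n = len(V)
    step = [[(1.0 if i == j else 0.0) + V[i][j] / k for j in range(n)]
            for i in range(n)]
    result = identity(n)
    for _ in range(k):
        result = matmul(result, step)
    return result

theta = 0.5
V = [[0.0, -theta], [theta, 0.0]]         # toy generator: e^V rotates by theta
neg_V = [[-entry for entry in row] for row in V]

forward = exp_limit(V, 10000)             # approximates e^V
backward = exp_limit(neg_V, 10000)        # approximates e^{-V}
round_trip = matmul(backward, forward)    # should be close to the identity
```

Negating V before exponentiating gives the inverse mapping, so no explicit matrix inversion is needed.
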
Compositions
Large deformations generated from compositions of small deformations:
$S_1 = S_{1/8} \circ S_{1/8} \circ S_{1/8} \circ S_{1/8} \circ S_{1/8} \circ S_{1/8} \circ S_{1/8} \circ S_{1/8}$
Recursive formulation:
$S_1 = S_{1/2} \circ S_{1/2}$, $S_{1/2} = S_{1/4} \circ S_{1/4}$, $S_{1/4} = S_{1/8} \circ S_{1/8}$
Small deformation approximation:
$S_{1/8} \approx I + V/8$

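The recursive formulation is the "scaling and squaring" idea: start from a small step and square it repeatedly. A minimal matrix sketch, assuming composition is matrix multiplication and using a toy 2 x 2 generator whose exponential is a known rotation:

```python
import math

def matmul(A, B):
    """Multiply two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def exp_by_squaring(V, n_squarings):
    """Start from S = I + V / 2**n_squarings, then square n_squarings times."""
    n = len(V)
    k = 2 ** n_squarings
    S = [[(1.0 if i == j else 0.0) + V[i][j] / k for j in range(n)]
         for i in range(n)]
    for _ in range(n_squarings):
        S = matmul(S, S)          # S_{2t} = S_t o S_t
    return S

theta = 0.5
V = [[0.0, -theta], [theta, 0.0]]         # toy generator: e^V rotates by theta

coarse = exp_by_squaring(V, 3)            # S_1 built from S_{1/8}, as above
fine = exp_by_squaring(V, 20)             # S_1 built from S_{1/1048576}
```

Three squarings replace seven sequential compositions, and twenty squarings stand in for about a million. The coarse result is visibly less accurate because I + V/8 is a cruder small-deformation approximation.
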
The shape metric
o Don't use the straight distance (i.e. $v^T v$)
o Distance = $v^T L^T L v$
o What's the best form of $L$?
  o Membrane Energy
  o Bending Energy
  o Linear Elastic Energy

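To see why the choice of L matters, here is a rough 1D sketch. It assumes a velocity field sampled on a regular grid with unit spacing, takes L to be a first-difference operator for a membrane-like energy and a second-difference operator for a bending-like energy, and ignores boundary conditions. These discretisations are illustrative assumptions, not the operators used in any particular registration package.

```python
def first_difference(v):
    """L for a membrane-like energy: penalises first derivatives."""
    return [v[i + 1] - v[i] for i in range(len(v) - 1)]

def second_difference(v):
    """L for a bending-like energy: penalises second derivatives."""
    return [v[i + 2] - 2.0 * v[i + 1] + v[i] for i in range(len(v) - 2)]

def energy(v, operator):
    """v^T L^T L v, computed as the squared norm of L v."""
    Lv = operator(v)
    return sum(value * value for value in Lv)

def straight(v):
    """The straight distance term v^T v."""
    return sum(value * value for value in v)

smooth = [0.0, 1.0, 2.0, 3.0, 4.0]   # large but smooth velocities
rough = [0.0, 2.0, 0.0, 2.0, 0.0]    # small but oscillating velocities

straight_smooth, straight_rough = straight(smooth), straight(rough)
membrane_smooth = energy(smooth, first_difference)
membrane_rough = energy(rough, first_difference)
bending_smooth = energy(smooth, second_difference)
bending_rough = energy(rough, second_difference)
```

The straight measure scores the oscillating field as the smaller one (8 against 30), whereas both smoothness-based measures score the smooth field as smaller.
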
Consistent registration
[Figure labels: A, B, C; A, B, C, µ]
Totally impractical for lots of scans.
Problem: How can the distance between e.g. A and B be computed? Inverse exponentiating is iterative and slow.
Register to a mean-shaped image.

Metrics from residuals
o Measures of difference between tensors.
o Relates to objective functions used for image registration.
o Can the same principles be used?

Over-fitting
[Figure label: Test data]
A simpler model can often do better...

Cross-validation
o Methods must be able to generalise to new data
o Various control parameters
  o More complexity -> better separation of training data
  o Less complexity -> better generalisation
o Optimal control parameters determined by cross-validation
  o Test with data not used for training
  o Use control parameters that work best for these data

Two-fold Cross-validation
Use half the data for training, and the other half for testing.

Two-fold Cross-validation
Then swap around the training and test data.

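The two-fold scheme amounts to two (training, test) index splits. A minimal sketch, assuming the samples are already in random order (in practice they would usually be shuffled first):

```python
def two_fold_splits(n_samples):
    """Return two (train_indices, test_indices) pairs with the halves swapped."""
    indices = list(range(n_samples))
    half = n_samples // 2
    first, second = indices[:half], indices[half:]
    return [(first, second), (second, first)]

splits = two_fold_splits(6)
for train, test in splits:
    # A classifier would be fitted on `train` and scored on `test` here.
    print("train:", train, "test:", test)
```

Every sample is used for testing exactly once, and never in the same round in which it was used for training.
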
Leave One Out Cross-validation
Use all data except one point for training. The one that was left out is used for testing.

Leave One Out Cross-validation
Then leave another point out. And so on...

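Leave-one-out can be sketched as a loop over held-out points. The classifier below is a deliberately simple nearest-class-mean rule on 1D data, used as a stand-in for an SVC; the data values are made up.

```python
def leave_one_out_splits(n_samples):
    """Yield (train_indices, held_out_index) for each sample in turn."""
    for held_out in range(n_samples):
        train = [i for i in range(n_samples) if i != held_out]
        yield train, held_out

def nearest_mean_predict(train_x, train_y, x):
    """Stand-in classifier: pick the label whose training mean is closest to x."""
    means = {}
    for label in set(train_y):
        values = [xi for xi, yi in zip(train_x, train_y) if yi == label]
        means[label] = sum(values) / len(values)
    return min(means, key=lambda label: abs(means[label] - x))

def loo_accuracy(xs, ys):
    """Fraction of held-out points that are classified correctly."""
    correct = 0
    for train, held_out in leave_one_out_splits(len(xs)):
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        if nearest_mean_predict(train_x, train_y, xs[held_out]) == ys[held_out]:
            correct += 1
    return correct / len(xs)

# Hypothetical 1D data; the point at 2.6 sits close to the other class.
xs = [1.0, 1.2, 0.8, 2.6, 3.0, 3.2, 2.8]
ys = [-1, -1, -1, -1, 1, 1, 1]
accuracy = loo_accuracy(xs, ys)
```

Here six of the seven held-out points are classified correctly; the point at 2.6 is the one that fails when it is left out of training.
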
Interpretation??
o Significance assessed from accuracy based on cross-validation.
o Main problems:
  o No simple interpretation.
  o Mechanism of classification is difficult to visualise
    o especially for nonlinear classifiers
  o Difficult to understand (not like blobs)
o May be able to use the separation to derive simple (and more publishable) hypotheses.

Group Theory
o Diffeomorphisms (smooth continuous one-to-one mappings) form a Group.
  o Closure
    o $A \circ B$ remains in the same group.
  o Associativity
    o $(A \circ B) \circ C = A \circ (B \circ C)$
  o Identity
    o Identity transform $I$ exists.
  o Inverse
    o $A^{-1}$ exists, and $A^{-1} \circ A = A \circ A^{-1} = I$
o It is a Lie Group.
  o The group of diffeomorphisms constitutes a smooth manifold.
  o The operations are differentiable.

Lie Groups
o Simple Lie Groups include various classes of affine transform matrices.
  o E.g. SO(2): Special Orthogonal 2D (rigid-body rotation in 2D).
  o Manifold is a circle
o Lie Algebra is exponentiated to give Lie group. For square matrices, this involves a matrix exponential.

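For SO(2) the exponentiation can be written out directly. The sketch below assumes the usual skew-symmetric generator for a 2D rotation and evaluates the matrix exponential with a truncated Taylor series, which is adequate for a small matrix like this one but is not how large problems would be handled.

```python
import math

def matmul(A, B):
    """Multiply two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(V, n_terms=30):
    """Matrix exponential as the truncated series sum_n V^n / n!."""
    n = len(V)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, n_terms):
        # term becomes V^order / order!
        term = [[entry / order for entry in row] for row in matmul(term, V)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

def so2_generator(theta):
    """Lie algebra element for a 2D rotation by theta (skew-symmetric)."""
    return [[0.0, -theta], [theta, 0.0]]

theta = math.pi / 3
R = exp_series(so2_generator(theta))                # Lie group element
det = R[0][0] * R[1][1] - R[0][1] * R[1][0]         # 1 for a rotation

# Closure: composing two rotations gives another rotation.
composed = matmul(exp_series(so2_generator(0.2)), exp_series(so2_generator(0.3)))
direct = exp_series(so2_generator(0.5))
```

The single parameter theta traces out the circle, which is the manifold of this group.
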
Relevance to Diffeomorphisms
o Parameterise with velocities, rather than displacements.
o Velocities are the Lie Algebra. These are exponentiated to a deformation by recursive application of tiny displacements, over a period of time = 0..1.
  o $A^{(1)} = A^{(1/2)} \circ A^{(1/2)}$
  o $A^{(1/2)} = A^{(1/4)} \circ A^{(1/4)}$
o Don't actually use matrices.
  o For tiny deformations, things are almost linear.
    o $x^{(1/1024)} \approx x^{(0)} + v_x/1024$
    o $y^{(1/1024)} \approx y^{(0)} + v_y/1024$
    o $z^{(1/1024)} \approx z^{(0)} + v_z/1024$
  o Recursive application by
    o $x^{(1/2)} = x^{(1/4)}(x^{(1/4)}, y^{(1/4)}, z^{(1/4)})$
    o $y^{(1/2)} = y^{(1/4)}(x^{(1/4)}, y^{(1/4)}, z^{(1/4)})$
    o $z^{(1/2)} = z^{(1/4)}(x^{(1/4)}, y^{(1/4)}, z^{(1/4)})$

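The same recursion can be sketched without matrices in one dimension. This is a simplified illustration: it assumes a 1D domain [0, 1] sampled on a regular grid, a made-up velocity field that vanishes at both ends, and linear interpolation for evaluating a deformation at non-grid positions.

```python
import math

def interpolate(samples, spacing, position):
    """Linearly interpolate grid samples at `position`, clamped to the domain."""
    last = len(samples) - 1
    t = min(max(position / spacing, 0.0), float(last))
    lo = min(int(t), last - 1)
    frac = t - lo
    return (1.0 - frac) * samples[lo] + frac * samples[lo + 1]

def exponentiate(velocity, spacing, n_squarings=10):
    """Turn a velocity field into a deformation by repeated self-composition."""
    scale = 2 ** n_squarings                      # 1024 when n_squarings is 10
    grid = [j * spacing for j in range(len(velocity))]
    # Tiny deformation: x(1/1024) ~ x(0) + v / 1024
    phi = [x + v / scale for x, v in zip(grid, velocity)]
    for _ in range(n_squarings):
        # phi(2t) = phi(t) o phi(t): evaluate phi at the positions phi maps to
        phi = [interpolate(phi, spacing, p) for p in phi]
    return phi

spacing = 0.01
grid = [j * spacing for j in range(101)]
velocity = [0.2 * math.sin(math.pi * x) for x in grid]   # hypothetical field

phi = exponentiate(velocity, spacing)
```

The end points stay fixed and the sampled mapping stays strictly increasing, so it remains one-to-one, which is the property the velocity parameterisation is meant to preserve.
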
Working with Diffeomorphisms
o Averaging Warps.
  o Distances on the manifold are given by geodesics.
  o Average of a number of deformations is a point on the manifold with the shortest sum of squared geodesic distances.
  o E.g. average position of London, Sydney and Honolulu.
o Inversion.
  o Negate the velocities, and exponentiate.
    o $x^{(1/1024)} \approx x^{(0)} - v_x/1024$
    o $y^{(1/1024)} \approx y^{(0)} - v_y/1024$
    o $z^{(1/1024)} \approx z^{(0)} - v_z/1024$
o Priors for registration
  o Based on smoothness of the velocities.
  o Velocities relate to distances from origin.

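Averaging on a manifold can be illustrated on the simplest curved example, a circle. This is only an analogy for the averaging of warps: it assumes points are given as angles, that geodesic distance is the wrapped angular difference, and that the points are clustered closely enough for the mean to be unique.

```python
import math

def wrap(angle):
    """Wrap an angle in radians to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def geodesic_mean(angles, n_iterations=50):
    """Iteratively reduce the sum of squared geodesic distances on a circle."""
    mean = angles[0]
    for _ in range(n_iterations):
        # Average the signed geodesic offsets from the current estimate
        update = sum(wrap(a - mean) for a in angles) / len(angles)
        mean = wrap(mean + update)
    return mean

degrees = [10.0, 20.0, 350.0]                     # made-up positions
angles = [math.radians(d) for d in degrees]

naive_mean = sum(degrees) / len(degrees)          # ignores the wrap-around
circle_mean = math.degrees(geodesic_mean(angles))
```

The naive average of the raw numbers lands far from all three points, while the geodesic mean sits among them, at about 6.67 degrees.
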