Ranking with High-Order and Missing Information. M. Pawan Kumar, Ecole Centrale Paris; Aseem Behl, Puneet Dokania, Pritish Mohapatra, C. V. Jawahar.


1 Ranking with High-Order and Missing Information. M. Pawan Kumar, Ecole Centrale Paris; Aseem Behl, Puneet Dokania, Pritish Mohapatra, C. V. Jawahar

2 PASCAL VOC, “Jumping” Classification: Features → Processing → Training → Classifier

3 PASCAL VOC, “Jumping” Classification ✗: Features → Processing → Training → Classifier. Think of a classifier!

4 PASCAL VOC, “Jumping” Ranking (classification ✗): Features → Processing → Training → Classifier. Think of a classifier!

5 Ranking vs. Classification. A perfect ordering of six samples (Rank 1 through Rank 6): Average Precision = 1

6 Ranking vs. Classification. The same six samples (Rank 1 through Rank 6): the perfect ordering has Average Precision = 1 and Accuracy = 1; the remaining values on the slide (0.92, 0.67, 0.81) show AP and accuracy diverging for imperfect orderings

7 Ranking vs. Classification. Ranking is not the same as classification. Average precision is not the same as accuracy. Should we use 0-1 loss-based classifiers, or AP loss-based rankers?
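The distinction the slides draw can be made concrete with a few lines of Python; the label lists below are illustrative stand-ins for ranked images, not data from the talk.

```python
# Minimal sketch: average precision (AP) of a ranked list of labels,
# where +1 marks a positive (e.g. "jumping") and -1 a negative.

def average_precision(ranked_labels):
    """AP = mean of precision@k taken at the positions of the positives."""
    num_pos = sum(1 for y in ranked_labels if y == +1)
    hits, precisions = 0, []
    for k, y in enumerate(ranked_labels, start=1):
        if y == +1:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / num_pos

# Perfect ranking: all positives above all negatives.
print(average_precision([+1, +1, +1, -1, -1, -1]))  # 1.0

# One positive pushed to the bottom: a 0-1 classifier may barely notice,
# but AP drops to (1 + 1 + 1/2)/3, roughly 0.83.
print(average_precision([+1, +1, -1, -1, -1, +1]))
```

This is why the slides argue that a classifier tuned for 0-1 accuracy is the wrong tool when the evaluation measure is AP.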

8 Outline: Optimizing Average Precision (AP-SVM) [Yue, Finley, Radlinski and Joachims, SIGIR 2007]; High-Order Information; Missing Information

9 Problem Formulation. Single input X: Φ(x_i) for all i ∈ P (positives), Φ(x_k) for all k ∈ N (negatives)

10 Problem Formulation. Single output R: R_ik = +1 if i is ranked better than k, −1 if k is ranked better than i

11 Problem Formulation. Scoring function: s_i(w) = wᵀΦ(x_i) for all i ∈ P; s_k(w) = wᵀΦ(x_k) for all k ∈ N; S(X,R;w) = Σ_{i∈P} Σ_{k∈N} R_ik (s_i(w) − s_k(w))

12 Ranking at Test-Time. R(w) = argmax_R S(X,R;w): sort the samples according to their individual scores s_i(w)
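As a sanity check, the scoring function from slide 11 and the sort-based prediction can be sketched in a few lines; the scores and the nested-list layout of R are illustrative choices, not from the talk.

```python
# Sketch of the ranking model: per-sample scores s_i(w) = w^T Phi(x_i),
# joint score S(X,R;w) = sum_{i in P} sum_{k in N} R_ik (s_i - s_k),
# and the test-time prediction, which reduces to sorting by score.

def joint_score(pos_scores, neg_scores, R):
    """S(X,R;w) for a ranking matrix R with entries in {+1, -1}."""
    return sum(R[i][k] * (si - sk)
               for i, si in enumerate(pos_scores)
               for k, sk in enumerate(neg_scores))

def predict_ranking(pos_scores, neg_scores):
    """Maximiser of S: R_ik = +1 iff positive i scores above negative k."""
    return [[+1 if si > sk else -1 for sk in neg_scores] for si in pos_scores]

pos, neg = [2.0, 0.5], [1.0, -1.0]     # illustrative scores
R_hat = predict_ranking(pos, neg)
print(R_hat)                            # [[1, 1], [-1, 1]]
print(joint_score(pos, neg, R_hat))     # 6.0
```

Choosing R_ik to match the sign of (s_i − s_k) maximises each term of the sum independently, which is exactly why sorting by score is optimal at test time.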

13 Learning Formulation. Loss function: Δ(R*,R(w)) = 1 − AP of ranking R(w). Non-convex; the parameter cannot be regularized

14 Learning Formulation. Upper bound of the loss function: Δ(R*,R(w)) = Δ(R*,R(w)) + S(X,R(w);w) − S(X,R(w);w)

15 Learning Formulation. Upper bound of the loss function: Δ(R*,R(w)) ≤ Δ(R*,R(w)) + S(X,R(w);w) − S(X,R*;w), since S(X,R(w);w) ≥ S(X,R*;w)

16 Learning Formulation. Upper bound of the loss function: Δ(R*,R(w)) ≤ max_R [S(X,R;w) + Δ(R*,R)] − S(X,R*;w). Convex; the parameter can be regularized: min_w ||w||² + Cξ, s.t. S(X,R;w) + Δ(R*,R) − S(X,R*;w) ≤ ξ, for all R

17 Optimization for Learning. Cutting-plane computation: max_R [S(X,R;w) + Δ(R*,R)]. Sort positive samples according to scores s_i(w); sort negative samples according to scores s_k(w); find the best rank of each negative sample independently

18 Optimization for Learning. Cutting-plane computation, training time: with the standard algorithm, AP-SVM is about 5x slower than the 0-1 SVM; with the efficient computation of Mohapatra, Jawahar and Kumar (NIPS 2014), it is slightly faster

19 Experiments. PASCAL VOC 2011 action classes: Jumping, Phoning, Playing Instrument, Reading, Riding Bike, Riding Horse, Running, Taking Photo, Using Computer, Walking; 10 ranking tasks. Poselet features; cross-validation

20 AP-SVM vs. SVM. PASCAL VOC ‘test’ dataset, difference in AP: better in 8 classes, tied in 2 classes

21 AP-SVM vs. SVM. Folds of PASCAL VOC ‘trainval’ dataset, difference in AP: AP-SVM is statistically better in 3 classes; SVM is statistically better in 0 classes

22 Outline: Optimizing Average Precision; High-Order Information (HOAP-SVM) [Dokania, Behl, Jawahar and Kumar, ECCV 2014]; Missing Information

23 [image-only slide]

24 [image-only slide]

25 High-Order Information. People in the same image perform similar actions and strike similar poses; objects are of the same or similar sizes; “friends” have similar habits. How can we use this information for ranking, not just classification?

26 Problem Formulation. Input x = {x₁, x₂, x₃}; output y ∈ {−1,+1}³. Joint feature vector Ψ(x,y) = [Ψ₁(x,y); Ψ₂(x,y)]: unary features and pairwise features
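A minimal sketch of such a joint feature vector: a unary part (each person's feature scaled by their label) stacked with a pairwise part (a feature that fires when two people agree). The per-person features and the agreement encoding are illustrative assumptions, not the talk's actual Ψ.

```python
# Joint feature vector Psi(x,y) = [Psi1(x,y) ; Psi2(x,y)] for a group of
# three people, returned as one flat list.

def joint_feature(xs, y):
    # Psi1: each person's feature vector multiplied by their label.
    unary = [yi * f for xi, yi in zip(xs, y) for f in xi]
    # Psi2: one indicator per pair, 1.0 when the two labels agree.
    pairwise = []
    for a in range(len(xs)):
        for b in range(a + 1, len(xs)):
            pairwise.append(1.0 if y[a] == y[b] else 0.0)
    return unary + pairwise

xs = [[0.2, 1.0], [0.1, 0.9], [0.8, 0.3]]   # made-up per-person features
print(joint_feature(xs, (+1, +1, -1)))
# [0.2, 1.0, 0.1, 0.9, -0.8, -0.3, 1.0, 0.0, 0.0]
```

With this encoding, wᵀΨ(x,y) rewards labellings in which similar people (e.g. those striking similar poses) receive the same label, which is the high-order information the slides describe.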

27 Learning Formulation. Input x = {x₁, x₂, x₃}; output y ∈ {−1,+1}³. Loss: Δ(y*,y) = fraction of incorrectly classified persons

28 Optimization for Learning. Input x = {x₁, x₂, x₃}; output y ∈ {−1,+1}³. max_y [wᵀΨ(x,y) + Δ(y*,y)]: graph cuts (if supermodular), otherwise LP relaxation or exhaustive search

29 Classification. Input x = {x₁, x₂, x₃}; output y ∈ {−1,+1}³. max_y wᵀΨ(x,y): graph cuts (if supermodular), otherwise LP relaxation or exhaustive search

30 Ranking? Input x = {x₁, x₂, x₃}; output y ∈ {−1,+1}³. Use the difference of max-marginals

31 Max-Marginal for the Positive Class. mm₊(i;w) = max_{y: yᵢ=+1} wᵀΨ(x,y): the best possible score when person i is labelled positive. Convex in w

32 Max-Marginal for the Negative Class. mm₋(i;w) = max_{y: yᵢ=−1} wᵀΨ(x,y): the best possible score when person i is labelled negative. Convex in w

33 Ranking (HOB-SVM). Use the difference of max-marginals: s_i(w) = mm₊(i;w) − mm₋(i;w). Difference-of-convex in w
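A tiny sketch of the difference-of-max-marginals score: with only three people, the max over labellings with yᵢ clamped can be done by exhaustive search over the 2³ labellings (the slides use graph cuts or LP relaxation for larger problems). The unary and pairwise values below are made-up, not learned parameters.

```python
# Difference-of-max-marginals ranking score s_i(w) = mm+(i;w) - mm-(i;w),
# computed by exhaustive search over labellings of a three-person group.
from itertools import product

def joint(y, unary, pairwise):
    """w^T Psi(x,y): unary terms plus a reward for each agreeing pair."""
    score = sum(u * yi for u, yi in zip(unary, y))
    score += sum(pairwise for a in range(len(y)) for b in range(a + 1, len(y))
                 if y[a] == y[b])
    return score

def max_marginal(i, label, unary, pairwise):
    """mm(i;w): best joint score over labellings with y_i fixed to label."""
    return max(joint(y, unary, pairwise)
               for y in product([-1, +1], repeat=len(unary)) if y[i] == label)

unary, pairwise = [1.5, 0.2, -0.8], 0.3   # illustrative values
for i in range(3):
    s_i = max_marginal(i, +1, unary, pairwise) - max_marginal(i, -1, unary, pairwise)
    print(i, s_i)   # ranking score for person i
```

Note how the pairwise reward smooths the scores: person 1's weak unary evidence (0.2) is reinforced when its neighbours are confidently positive, which is the ranking benefit of high-order information.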

34 Ranking. s_i(w) = mm₊(i;w) − mm₋(i;w). Why not optimize AP directly? High-Order AP-SVM (HOAP-SVM)

35 Problem Formulation. Single input X: Φ(x_i) for all i ∈ P, Φ(x_k) for all k ∈ N

36 Problem Formulation. Single output R: R_ik = +1 if i is ranked better than k, −1 if k is ranked better than i

37 Problem Formulation. Scoring function: s_i(w) = mm₊(i;w) − mm₋(i;w) for all i ∈ P; s_k(w) = mm₊(k;w) − mm₋(k;w) for all k ∈ N; S(X,R;w) = Σ_{i∈P} Σ_{k∈N} R_ik (s_i(w) − s_k(w))

38 Ranking at Test-Time. R(w) = argmax_R S(X,R;w): sort the samples according to their individual scores s_i(w)

39 Learning Formulation. Loss function: Δ(R*,R(w)) = 1 − AP of ranking R(w)

40 Learning Formulation. Upper bound of the loss function: min_w ||w||² + Cξ, s.t. S(X,R;w) + Δ(R*,R) − S(X,R*;w) ≤ ξ, for all R

41 Optimization for Learning. Difference-of-convex program, solved with CCCP: the linearization step uses dynamic graph cuts [Kohli and Torr, ECCV 2006] and is very efficient; the update step is equivalent to solving an AP-SVM problem

42 Experiments. PASCAL VOC 2011 action classes: Jumping, Phoning, Playing Instrument, Reading, Riding Bike, Riding Horse, Running, Taking Photo, Using Computer, Walking; 10 ranking tasks. Poselet features; cross-validation

43 HOB-SVM vs. AP-SVM. PASCAL VOC ‘test’ dataset, difference in AP: better in 4, worse in 3, tied in 3 classes

44 HOB-SVM vs. AP-SVM. Folds of PASCAL VOC ‘trainval’ dataset, difference in AP: HOB-SVM is statistically better in 0 classes; AP-SVM is statistically better in 0 classes

45 HOAP-SVM vs. AP-SVM. PASCAL VOC ‘test’ dataset, difference in AP: better in 7, worse in 2, tied in 1 class

46 HOAP-SVM vs. AP-SVM. Folds of PASCAL VOC ‘trainval’ dataset, difference in AP: HOAP-SVM is statistically better in 4 classes; AP-SVM is statistically better in 0 classes

47 Outline: Optimizing Average Precision; High-Order Information; Missing Information (Latent-AP-SVM) [Behl, Jawahar and Kumar, CVPR 2014]

48 Fully Supervised Learning

49 Weakly Supervised Learning. Rank images by relevance to ‘jumping’

50 Two Approaches. (1) Use a latent structured SVM with AP loss: unintuitive prediction, loose upper bound on the loss, NP-hard optimization for cutting planes. (2) Carefully design a Latent-AP-SVM: intuitive prediction, tight upper bound on the loss, optimal and efficient cutting-plane computation

51 Results

52 Questions? Code + Data Available

