Ranking with High-Order and Missing Information. M. Pawan Kumar, Ecole Centrale Paris; Aseem Behl, Puneet Dokania, Pritish Mohapatra, C. V. Jawahar.


1 Ranking with High-Order and Missing Information. M. Pawan Kumar, Ecole Centrale Paris; Aseem Behl, Puneet Dokania, Pritish Mohapatra, C. V. Jawahar

2 PASCAL VOC, “Jumping” Classification: Features → Processing → Training → Classifier

3 PASCAL VOC, “Jumping” Classification ✗: Features → Processing → Training → Classifier. Think of a classifier!

4 PASCAL VOC, “Jumping” Ranking (classification ✗): Features → Processing → Training → Classifier. Think of a classifier!

5 Ranking vs. Classification. A perfect ordering of six samples (Rank 1 through Rank 6): Average Precision = 1

6 Ranking vs. Classification. The same six samples (Rank 1 through Rank 6): the perfect ordering has Average Precision = 1 and Accuracy = 1; the remaining values on the slide (0.92, 0.67, 0.81) show AP and accuracy diverging for imperfect orderings

7 Ranking vs. Classification. Ranking is not the same as classification. Average precision is not the same as accuracy. Should we use 0-1 loss-based classifiers, or AP loss-based rankers?
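The distinction the slides draw can be made concrete with a few lines of Python; the label lists below are illustrative stand-ins for ranked images, not data from the talk.

```python
# Minimal sketch: average precision (AP) of a ranked list of labels,
# where +1 marks a positive (e.g. "jumping") and -1 a negative.

def average_precision(ranked_labels):
    """AP = mean of precision@k taken at the positions of the positives."""
    num_pos = sum(1 for y in ranked_labels if y == +1)
    hits, precisions = 0, []
    for k, y in enumerate(ranked_labels, start=1):
        if y == +1:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / num_pos

# Perfect ranking: all positives above all negatives.
print(average_precision([+1, +1, +1, -1, -1, -1]))  # 1.0

# One positive pushed to the bottom: a 0-1 classifier may barely notice,
# but AP drops to (1 + 1 + 1/2)/3, roughly 0.83.
print(average_precision([+1, +1, -1, -1, -1, +1]))
```

This is why the slides argue that a classifier tuned for 0-1 accuracy is the wrong tool when the evaluation measure is AP.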

8 Outline: Optimizing Average Precision (AP-SVM) [Yue, Finley, Radlinski and Joachims, SIGIR 2007]; High-Order Information; Missing Information

9 Problem Formulation. Single input X: Φ(x_i) for all i ∈ P (positives), Φ(x_k) for all k ∈ N (negatives)

10 Problem Formulation. Single output R: R_ik = +1 if i is ranked better than k, −1 if k is ranked better than i

11 Problem Formulation. Scoring function: s_i(w) = wᵀΦ(x_i) for all i ∈ P; s_k(w) = wᵀΦ(x_k) for all k ∈ N; S(X,R;w) = Σ_{i∈P} Σ_{k∈N} R_ik (s_i(w) − s_k(w))

12 Ranking at Test-Time. R(w) = argmax_R S(X,R;w): sort the samples according to their individual scores s_i(w)
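As a sanity check, the scoring function from slide 11 and the sort-based prediction can be sketched in a few lines; the scores and the nested-list layout of R are illustrative choices, not from the talk.

```python
# Sketch of the ranking model: per-sample scores s_i(w) = w^T Phi(x_i),
# joint score S(X,R;w) = sum_{i in P} sum_{k in N} R_ik (s_i - s_k),
# and the test-time prediction, which reduces to sorting by score.

def joint_score(pos_scores, neg_scores, R):
    """S(X,R;w) for a ranking matrix R with entries in {+1, -1}."""
    return sum(R[i][k] * (si - sk)
               for i, si in enumerate(pos_scores)
               for k, sk in enumerate(neg_scores))

def predict_ranking(pos_scores, neg_scores):
    """Maximiser of S: R_ik = +1 iff positive i scores above negative k."""
    return [[+1 if si > sk else -1 for sk in neg_scores] for si in pos_scores]

pos, neg = [2.0, 0.5], [1.0, -1.0]     # illustrative scores
R_hat = predict_ranking(pos, neg)
print(R_hat)                            # [[1, 1], [-1, 1]]
print(joint_score(pos, neg, R_hat))     # 6.0
```

Choosing R_ik to match the sign of (s_i − s_k) maximises each term of the sum independently, which is exactly why sorting by score is optimal at test time.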

13 Learning Formulation. Loss function: Δ(R*,R(w)) = 1 − AP of ranking R(w). Non-convex; the parameter cannot be regularized

14 Learning Formulation. Upper bound of the loss function: Δ(R*,R(w)) = Δ(R*,R(w)) + S(X,R(w);w) − S(X,R(w);w)

15 Learning Formulation. Upper bound of the loss function: Δ(R*,R(w)) ≤ Δ(R*,R(w)) + S(X,R(w);w) − S(X,R*;w), since S(X,R(w);w) ≥ S(X,R*;w)

16 Learning Formulation. Upper bound of the loss function: Δ(R*,R(w)) ≤ max_R [S(X,R;w) + Δ(R*,R)] − S(X,R*;w). Convex; the parameter can be regularized: min_w ||w||² + Cξ, s.t. S(X,R;w) + Δ(R*,R) − S(X,R*;w) ≤ ξ, for all R

17 Optimization for Learning. Cutting-plane computation: max_R [S(X,R;w) + Δ(R*,R)]. Sort positive samples according to scores s_i(w); sort negative samples according to scores s_k(w); find the best rank of each negative sample independently

18 Optimization for Learning. Cutting-plane computation, training time: with the standard algorithm, AP-SVM is about 5x slower than the 0-1 SVM; with the efficient computation of Mohapatra, Jawahar and Kumar (NIPS 2014), it is slightly faster

19 Experiments. PASCAL VOC 2011 action classes: Jumping, Phoning, Playing Instrument, Reading, Riding Bike, Riding Horse, Running, Taking Photo, Using Computer, Walking; 10 ranking tasks. Poselet features; cross-validation

20 AP-SVM vs. SVM. PASCAL VOC ‘test’ dataset, difference in AP: better in 8 classes, tied in 2 classes

21 AP-SVM vs. SVM. Folds of PASCAL VOC ‘trainval’ dataset, difference in AP: AP-SVM is statistically better in 3 classes; SVM is statistically better in 0 classes

22 Outline: Optimizing Average Precision; High-Order Information (HOAP-SVM) [Dokania, Behl, Jawahar and Kumar, ECCV 2014]; Missing Information

23 [image-only slide]

24 [image-only slide]

25 High-Order Information. People in the same image perform similar actions and strike similar poses; objects are of the same or similar sizes; “friends” have similar habits. How can we use this information for ranking, not just classification?

26 Problem Formulation. Input x = {x₁, x₂, x₃}; output y ∈ {−1,+1}³. Joint feature vector Ψ(x,y) = [Ψ₁(x,y); Ψ₂(x,y)]: unary features and pairwise features
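A minimal sketch of such a joint feature vector: a unary part (each person's feature scaled by their label) stacked with a pairwise part (a feature that fires when two people agree). The per-person features and the agreement encoding are illustrative assumptions, not the talk's actual Ψ.

```python
# Joint feature vector Psi(x,y) = [Psi1(x,y) ; Psi2(x,y)] for a group of
# three people, returned as one flat list.

def joint_feature(xs, y):
    # Psi1: each person's feature vector multiplied by their label.
    unary = [yi * f for xi, yi in zip(xs, y) for f in xi]
    # Psi2: one indicator per pair, 1.0 when the two labels agree.
    pairwise = []
    for a in range(len(xs)):
        for b in range(a + 1, len(xs)):
            pairwise.append(1.0 if y[a] == y[b] else 0.0)
    return unary + pairwise

xs = [[0.2, 1.0], [0.1, 0.9], [0.8, 0.3]]   # made-up per-person features
print(joint_feature(xs, (+1, +1, -1)))
# [0.2, 1.0, 0.1, 0.9, -0.8, -0.3, 1.0, 0.0, 0.0]
```

With this encoding, wᵀΨ(x,y) rewards labellings in which similar people (e.g. those striking similar poses) receive the same label, which is the high-order information the slides describe.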

27 Learning Formulation. Input x = {x₁, x₂, x₃}; output y ∈ {−1,+1}³. Loss: Δ(y*,y) = fraction of incorrectly classified persons

28 Optimization for Learning. Input x = {x₁, x₂, x₃}; output y ∈ {−1,+1}³. max_y [wᵀΨ(x,y) + Δ(y*,y)]: graph cuts (if supermodular), otherwise LP relaxation or exhaustive search

29 Classification. Input x = {x₁, x₂, x₃}; output y ∈ {−1,+1}³. max_y wᵀΨ(x,y): graph cuts (if supermodular), otherwise LP relaxation or exhaustive search

30 Ranking? Input x = {x₁, x₂, x₃}; output y ∈ {−1,+1}³. Use the difference of max-marginals

31 Max-Marginal for the Positive Class. mm₊(i;w) = max_{y: yᵢ=+1} wᵀΨ(x,y): the best possible score when person i is labelled positive. Convex in w

32 Max-Marginal for the Negative Class. mm₋(i;w) = max_{y: yᵢ=−1} wᵀΨ(x,y): the best possible score when person i is labelled negative. Convex in w

33 Ranking (HOB-SVM). Use the difference of max-marginals: s_i(w) = mm₊(i;w) − mm₋(i;w). Difference-of-convex in w
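A tiny sketch of the difference-of-max-marginals score: with only three people, the max over labellings with yᵢ clamped can be done by exhaustive search over the 2³ labellings (the slides use graph cuts or LP relaxation for larger problems). The unary and pairwise values below are made-up, not learned parameters.

```python
# Difference-of-max-marginals ranking score s_i(w) = mm+(i;w) - mm-(i;w),
# computed by exhaustive search over labellings of a three-person group.
from itertools import product

def joint(y, unary, pairwise):
    """w^T Psi(x,y): unary terms plus a reward for each agreeing pair."""
    score = sum(u * yi for u, yi in zip(unary, y))
    score += sum(pairwise for a in range(len(y)) for b in range(a + 1, len(y))
                 if y[a] == y[b])
    return score

def max_marginal(i, label, unary, pairwise):
    """mm(i;w): best joint score over labellings with y_i fixed to label."""
    return max(joint(y, unary, pairwise)
               for y in product([-1, +1], repeat=len(unary)) if y[i] == label)

unary, pairwise = [1.5, 0.2, -0.8], 0.3   # illustrative values
for i in range(3):
    s_i = max_marginal(i, +1, unary, pairwise) - max_marginal(i, -1, unary, pairwise)
    print(i, s_i)   # ranking score for person i
```

Note how the pairwise reward smooths the scores: person 1's weak unary evidence (0.2) is reinforced when its neighbours are confidently positive, which is the ranking benefit of high-order information.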

34 Ranking. s_i(w) = mm₊(i;w) − mm₋(i;w). Why not optimize AP directly? High-Order AP-SVM (HOAP-SVM)

35 Problem Formulation. Single input X: Φ(x_i) for all i ∈ P, Φ(x_k) for all k ∈ N

36 Problem Formulation. Single output R: R_ik = +1 if i is ranked better than k, −1 if k is ranked better than i

37 Problem Formulation. Scoring function: s_i(w) = mm₊(i;w) − mm₋(i;w) for all i ∈ P; s_k(w) = mm₊(k;w) − mm₋(k;w) for all k ∈ N; S(X,R;w) = Σ_{i∈P} Σ_{k∈N} R_ik (s_i(w) − s_k(w))

38 Ranking at Test-Time. R(w) = argmax_R S(X,R;w): sort the samples according to their individual scores s_i(w)

39 Learning Formulation. Loss function: Δ(R*,R(w)) = 1 − AP of ranking R(w)

40 Learning Formulation. Upper bound of the loss function: min_w ||w||² + Cξ, s.t. S(X,R;w) + Δ(R*,R) − S(X,R*;w) ≤ ξ, for all R

41 Optimization for Learning. Difference-of-convex program, solved with CCCP: the linearization step uses dynamic graph cuts [Kohli and Torr, ECCV 2006] and is very efficient; the update step is equivalent to solving an AP-SVM problem

42 Experiments. PASCAL VOC 2011 action classes: Jumping, Phoning, Playing Instrument, Reading, Riding Bike, Riding Horse, Running, Taking Photo, Using Computer, Walking; 10 ranking tasks. Poselet features; cross-validation

43 HOB-SVM vs. AP-SVM. PASCAL VOC ‘test’ dataset, difference in AP: better in 4, worse in 3, tied in 3 classes

44 HOB-SVM vs. AP-SVM. Folds of PASCAL VOC ‘trainval’ dataset, difference in AP: HOB-SVM is statistically better in 0 classes; AP-SVM is statistically better in 0 classes

45 HOAP-SVM vs. AP-SVM. PASCAL VOC ‘test’ dataset, difference in AP: better in 7, worse in 2, tied in 1 class

46 HOAP-SVM vs. AP-SVM. Folds of PASCAL VOC ‘trainval’ dataset, difference in AP: HOAP-SVM is statistically better in 4 classes; AP-SVM is statistically better in 0 classes

47 Outline: Optimizing Average Precision; High-Order Information; Missing Information (Latent-AP-SVM) [Behl, Jawahar and Kumar, CVPR 2014]

48 Fully Supervised Learning

49 Weakly Supervised Learning. Rank images by relevance to ‘jumping’

50 Two Approaches. (1) Use a latent structured SVM with AP loss: unintuitive prediction, loose upper bound on the loss, NP-hard optimization for cutting planes. (2) Carefully design a Latent-AP-SVM: intuitive prediction, tight upper bound on the loss, optimal and efficient cutting-plane computation

51 Results

52 Questions? Code + Data Available

