Published by Makayla Sinclair. Modified over 3 years ago.

1
Self-Paced Learning for Semantic Segmentation
M. Pawan Kumar

2

3
Self-Paced Learning for Latent Structural SVM
Daphne Koller, Benjamin Packer, M. Pawan Kumar

4
Aim
To learn accurate parameters for latent structural SVM
Input $x$
Output $y \in \mathcal{Y}$ (e.g., Deer)
Hidden variable $h \in \mathcal{H}$
$\mathcal{Y}$ = {Bison, Deer, Elephant, Giraffe, Llama, Rhino}

5
Aim
To learn accurate parameters for latent structural SVM
Feature $\Phi(x,y,h)$ (HOG, BoW)
Prediction: $(y^*, h^*) = \arg\max_{y \in \mathcal{Y},\, h \in \mathcal{H}} w^\top \Phi(x,y,h)$
Parameters $w$
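
The prediction rule on this slide can be illustrated with a minimal sketch. It assumes the joint feature vectors for one input have already been computed and stored in a dictionary keyed by (y, h); that layout, the function name `predict`, and the toy numbers are hypothetical and not part of the slides.

```python
def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(p * q for p, q in zip(a, b))

def predict(w, phi):
    """Return the (y, h) pair that maximizes w . Phi(x, y, h).

    phi: dict mapping (y, h) -> joint feature vector for one fixed input x
    (a hypothetical precomputed layout, assumed for illustration).
    """
    return max(phi, key=lambda yh: dot(w, phi[yh]))

# Toy input with two labels and two latent values (made-up numbers).
phi = {
    ("Deer", 0): [1.0, 0.0],
    ("Deer", 1): [0.0, 2.0],
    ("Bison", 0): [0.5, 0.5],
    ("Bison", 1): [1.0, 0.5],
}
w = [1.0, 1.0]
y_star, h_star = predict(w, phi)
```

Enumerating every (y, h) pair is only practical for small label and latent spaces; structured problems would need a problem-specific inference routine instead.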

6
Motivation
Real Numbers, Imaginary Numbers, $e^{i\pi} + 1 = 0$
"Math is for losers!!"
FAILURE … BAD LOCAL MINIMUM

7
Motivation
Real Numbers, Imaginary Numbers, $e^{i\pi} + 1 = 0$
"Euler was a Genius!!"
SUCCESS … GOOD LOCAL MINIMUM

8
Motivation
Start with easy examples, then consider hard ones
Easy vs. Hard
Expensive
Easy for human
Easy for machine
Simultaneously estimate easiness and parameters
Easiness is a property of data sets, not single instances

9
Outline
Latent Structural SVM
Concave-Convex Procedure
Self-Paced Learning
Experiments

10
Latent Structural SVM
Training samples $x_i$
Ground-truth label $y_i$
Loss function $\Delta(y_i, y_i(w), h_i(w))$
Felzenszwalb et al., 2008; Yu and Joachims, 2009

11
Latent Structural SVM
$(y_i(w), h_i(w)) = \arg\max_{y \in \mathcal{Y},\, h \in \mathcal{H}} w^\top \Phi(x_i,y,h)$
$\min_w \|w\|^2 + C \sum_i \Delta(y_i, y_i(w), h_i(w))$
Non-convex objective
Minimize an upper bound

12
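Latent Structural SVM
$(y_i(w), h_i(w)) = \arg\max_{y \in \mathcal{Y},\, h \in \mathcal{H}} w^\top \Phi(x_i,y,h)$
$\min_{w,\xi} \|w\|^2 + C \sum_i \xi_i$
subject to $\max_{h_i} w^\top \Phi(x_i,y_i,h_i) - w^\top \Phi(x_i,y,h) \ge \Delta(y_i,y,h) - \xi_i$ for all $y$, $h$
Still non-convex
Difference of convex
CCCP Algorithm - converges to a local minimum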
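
The step from the loss to the slack-based objective can be spelled out as follows. This is a standard argument rather than something shown on the slide, and the symbols $\Phi$, $\Delta$ and $\xi_i$ are the reconstructed notation used in this transcript.

```latex
\begin{align}
\Delta(y_i, y_i(w), h_i(w))
  &\le \Delta(y_i, y_i(w), h_i(w))
     + w^\top \Phi(x_i, y_i(w), h_i(w))
     - \max_{h} w^\top \Phi(x_i, y_i, h) \\
  &\le \underbrace{\max_{y,h} \left[ w^\top \Phi(x_i, y, h) + \Delta(y_i, y, h) \right]}_{\text{convex in } w}
     - \underbrace{\max_{h} w^\top \Phi(x_i, y_i, h)}_{\text{convex in } w}
\end{align}
```

The first inequality holds because $(y_i(w), h_i(w))$ maximizes the score over all pairs, so its score is at least that of the best completion of $y_i$. Each maximum is a pointwise maximum of linear functions of $w$, hence convex; the bound is therefore a difference of convex functions, and at the optimum $\xi_i$ equals it.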

13
Outline
Latent Structural SVM
Concave-Convex Procedure
Self-Paced Learning
Experiments

14
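Concave-Convex Procedure
Start with an initial estimate $w_0$
Update $h_i = \arg\max_{h \in \mathcal{H}} w_t^\top \Phi(x_i,y_i,h)$
Update $w_{t+1}$ by solving a convex problem:
$\min_{w,\xi} \|w\|^2 + C \sum_i \xi_i$
subject to $w^\top \Phi(x_i,y_i,h_i) - w^\top \Phi(x_i,y,h) \ge \Delta(y_i,y,h) - \xi_i$ for all $y$, $h$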
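
The two alternating updates can be sketched on a toy problem. This is a minimal illustration under several assumptions: a 0/1 loss, a small enumerable set of (y, h) pairs, and plain subgradient descent swapped in for the convex solver (the slides do not say which solver is used). All function names and the toy feature map are hypothetical.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def joint_features(values, labels=("A", "B")):
    """Toy Phi (assumed): put the scalar phi(x, h) in the block of label y."""
    return {(y, h): [v if k == j else 0.0 for k in range(len(labels))]
            for j, y in enumerate(labels) for h, v in enumerate(values)}

def impute_latent(w, phi, y_true):
    """h_i = argmax_h w . Phi(x_i, y_i, h) for the ground-truth label."""
    candidates = [h for (y, h) in phi if y == y_true]
    return max(candidates, key=lambda h: dot(w, phi[(y_true, h)]))

def most_violated(w, phi, y_true):
    """argmax over (y, h) of w . Phi(x_i, y, h) + Delta, with a 0/1 loss."""
    return max(phi, key=lambda yh: dot(w, phi[yh]) + (yh[0] != y_true))

def solve_convex(w, data, latent, C, steps=200, lr=0.05):
    """Subgradient descent on ||w||^2 + C * sum_i xi_i with h_i held fixed."""
    w = list(w)
    for _ in range(steps):
        grad = [2.0 * wj for wj in w]
        for (phi, y_true), h in zip(data, latent):
            yh = most_violated(w, phi, y_true)
            pos, neg = phi[(y_true, h)], phi[yh]
            slack = dot(w, neg) + (yh[0] != y_true) - dot(w, pos)
            if slack > 0:  # constraint violated: add the hinge subgradient
                grad = [g + C * (n - p) for g, n, p in zip(grad, neg, pos)]
        w = [wj - lr * g for wj, g in zip(w, grad)]
    return w

def cccp(data, w0, C=1.0, iters=5):
    """Alternate latent imputation and the convex parameter update."""
    w = list(w0)
    for _ in range(iters):
        latent = [impute_latent(w, phi, y) for phi, y in data]
        w = solve_convex(w, data, latent, C)
    return w

# Two toy samples: positive evidence for label "A", negative for "B".
data = [(joint_features([2.0, 1.0]), "A"), (joint_features([-1.0, -2.0]), "B")]
w = cccp(data, [0.0, 0.0])
```

A fixed number of outer iterations stands in for a convergence check; a fuller version would stop when the objective no longer decreases.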

15
Concave-Convex Procedure
Looks at all samples simultaneously
Hard samples will cause confusion
Start with easy samples, then consider hard ones

16
Outline
Latent Structural SVM
Concave-Convex Procedure
Self-Paced Learning
Experiments

17
Self-Paced Learning
REMINDER
Simultaneously estimate easiness and parameters
Easiness is a property of data sets, not single instances

18
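Self-Paced Learning
Start with an initial estimate $w_0$
Update $h_i = \arg\max_{h \in \mathcal{H}} w_t^\top \Phi(x_i,y_i,h)$
Update $w_{t+1}$ by solving a convex problem:
$\min_{w,\xi} \|w\|^2 + C \sum_i \xi_i$
subject to $w^\top \Phi(x_i,y_i,h_i) - w^\top \Phi(x_i,y,h) \ge \Delta(y_i,y,h) - \xi_i$ for all $y$, $h$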

19
Self-Paced Learning
$\min_{w,\xi} \|w\|^2 + C \sum_i \xi_i$
subject to $w^\top \Phi(x_i,y_i,h_i) - w^\top \Phi(x_i,y,h) \ge \Delta(y_i,y,h) - \xi_i$ for all $y$, $h$

20
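Self-Paced Learning
$\min_{w,\xi,v} \|w\|^2 + C \sum_i v_i \xi_i$
subject to $w^\top \Phi(x_i,y_i,h_i) - w^\top \Phi(x_i,y,h) \ge \Delta(y_i,y,h) - \xi_i$ for all $y$, $h$
$v_i \in \{0,1\}$
Trivial solution: $v_i = 0$ for all $i$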

21
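Self-Paced Learning
$\min_{w,\xi,v} \|w\|^2 + C \sum_i v_i \xi_i - \sum_i v_i / K$
subject to $w^\top \Phi(x_i,y_i,h_i) - w^\top \Phi(x_i,y,h) \ge \Delta(y_i,y,h) - \xi_i$ for all $y$, $h$
$v_i \in \{0,1\}$
[Figure panels: Large K, Medium K, Small K]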

22
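Self-Paced Learning
$\min_{w,\xi,v} \|w\|^2 + C \sum_i v_i \xi_i - \sum_i v_i / K$
subject to $w^\top \Phi(x_i,y_i,h_i) - w^\top \Phi(x_i,y,h) \ge \Delta(y_i,y,h) - \xi_i$ for all $y$, $h$
$v_i \in [0,1]$
[Figure panels: Large K, Medium K, Small K]
Biconvex problem
Alternating Convex Search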
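
With $w$ and the slacks held fixed, the objective is linear in each $v_i$ with coefficient $C\xi_i - 1/K$, so one half of the alternating search has a closed form: select sample $i$ exactly when $C\xi_i < 1/K$. The sketch below shows that selection step only; the function name `select_easy` and the slack values are made up for illustration.

```python
def select_easy(slacks, C, K):
    """Closed-form update of v for fixed w: v_i = 1 iff C * xi_i < 1 / K."""
    return [1 if C * xi < 1.0 / K else 0 for xi in slacks]

slacks = [0.1, 0.6, 2.0]                      # hypothetical slack values xi_i
large_K = select_easy(slacks, C=1.0, K=2.0)   # threshold 0.5: fewest samples
medium_K = select_easy(slacks, C=1.0, K=1.0)  # threshold 1.0
small_K = select_easy(slacks, C=1.0, K=0.25)  # threshold 4.0: all samples
```

This matches the Large K / Medium K / Small K panels: a large K admits only the samples with the smallest slack, and decreasing K admits progressively harder ones.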

23
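Self-Paced Learning
Start with an initial estimate $w_0$
Update $h_i = \arg\max_{h \in \mathcal{H}} w_t^\top \Phi(x_i,y_i,h)$
Update $w_{t+1}$ by solving a biconvex problem:
$\min_{w,\xi,v} \|w\|^2 + C \sum_i v_i \xi_i - \sum_i v_i / K$
subject to $w^\top \Phi(x_i,y_i,h_i) - w^\top \Phi(x_i,y,h) \ge \Delta(y_i,y,h) - \xi_i$ for all $y$, $h$
Decrease $K \leftarrow K/\mu$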
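
The full loop can be sketched on a toy problem with one deliberately mislabeled sample. This is a simplified illustration, not the authors' implementation: it assumes a 0/1 loss, performs a single selection step followed by a single parameter solve per outer iteration instead of a full alternating convex search, and uses subgradient descent as a stand-in solver. The names `spl`, `solve_selected`, `joint_features` and the annealing factor `mu` are assumptions.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def joint_features(values, labels=("A", "B")):
    """Toy Phi (assumed): put the scalar phi(x, h) in the block of label y."""
    return {(y, h): [v if k == j else 0.0 for k in range(len(labels))]
            for j, y in enumerate(labels) for h, v in enumerate(values)}

def impute_latent(w, phi, y_true):
    candidates = [h for (y, h) in phi if y == y_true]
    return max(candidates, key=lambda h: dot(w, phi[(y_true, h)]))

def slack(w, phi, y_true, h):
    """xi_i for fixed w and h_i, using a 0/1 loss."""
    worst = max(dot(w, f) + (y != y_true) for (y, _), f in phi.items())
    return worst - dot(w, phi[(y_true, h)])

def solve_selected(w, data, latent, v, C, steps=200, lr=0.05):
    """Subgradient descent on ||w||^2 + C * sum_i v_i xi_i for fixed v, h."""
    w = list(w)
    for _ in range(steps):
        grad = [2.0 * wj for wj in w]
        for (phi, yt), h, vi in zip(data, latent, v):
            if not vi:
                continue  # samples currently judged hard are skipped
            yh = max(phi, key=lambda k: dot(w, phi[k]) + (k[0] != yt))
            pos, neg = phi[(yt, h)], phi[yh]
            if dot(w, neg) + (yh[0] != yt) - dot(w, pos) > 0:
                grad = [g + C * (n - p) for g, n, p in zip(grad, neg, pos)]
        w = [wj - lr * g for wj, g in zip(w, grad)]
    return w

def spl(data, w0, C=1.0, K=1.0, mu=1.3, iters=5):
    w, v = list(w0), []
    for _ in range(iters):
        latent = [impute_latent(w, phi, yt) for phi, yt in data]
        v = [1 if C * slack(w, phi, yt, h) < 1.0 / K else 0
             for (phi, yt), h in zip(data, latent)]
        w = solve_selected(w, data, latent, v, C)
        K /= mu  # anneal K so that harder samples are admitted later
    return w, K, v

data = [
    (joint_features([2.0, 1.0]), "A"),
    (joint_features([-1.0, -2.0]), "B"),
    (joint_features([-1.0, -2.0]), "A"),  # mislabeled on purpose: a hard sample
]
w1, K1, v1 = spl(data, [0.1, -0.1], iters=1)
```

In the first pass the mislabeled sample has the largest slack and is left out, so the parameters are fit to the two consistent samples only.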

24
Outline
Latent Structural SVM
Concave-Convex Procedure
Self-Paced Learning
Experiments

25
Object Detection
Feature $\Phi(x,y,h)$ - HOG
Input $x$ - Image
Output $y \in \mathcal{Y}$
Latent $h$ - Box
$\Delta$ - 0/1 Loss
$\mathcal{Y}$ = {Bison, Deer, Elephant, Giraffe, Llama, Rhino}

26
Object Detection
Mammals Dataset
271 images, 6 classes
90/10 train/test split
4 folds

27-30
Object Detection
[Figure panels: CCCP, Self-Paced]

31
Object Detection
[Figure panels: Objective value, Test error]

32
Handwritten Digit Recognition
MNIST Dataset
Feature $\Phi(x,y,h)$ - PCA + Projection
Input $x$ - Image
Output $y \in \mathcal{Y}$, $\mathcal{Y}$ = {0, 1, …, 9}
Latent $h$ - Rotation
$\Delta$ - 0/1 Loss

33-36
Handwritten Digit Recognition
[Figure panels: CCCP vs. SPL; marker legend: Significant Difference]

37
Motif Finding
Feature $\Phi(x,y,h)$ - Ng and Cardie, ACL 2002
Input $x$ - DNA Sequence
Output $y \in \mathcal{Y}$, $\mathcal{Y}$ = {0, 1}
Latent $h$ - Motif Location
$\Delta$ - 0/1 Loss

38
Motif Finding
UniProbe Dataset
40,000 sequences
50/50 train/test split
5 folds

39
Motif Finding
[Figure: Average Hamming Distance of Inferred Motifs; highlighted: SPL]

40
Motif Finding
[Figure: Objective Value; highlighted: SPL]

41
Motif Finding
[Figure: Test Error; highlighted: SPL]

42
Noun Phrase Coreference
Feature $\Phi(x,y,h)$ - Yu and Joachims, ICML 2009
Input $x$ - Nouns
Output $y$ - Clustering
Latent $h$ - Spanning Forest over Nouns

43
Noun Phrase Coreference
MUC6 Dataset
60 documents
50/50 train/test split
1 predefined fold

44
Noun Phrase Coreference
[Figure panels: MITRE Loss, Pairwise Loss; marker legend: Significant Improvement, Significant Decrement]

45-46
Noun Phrase Coreference
[Figure panels: MITRE Loss, Pairwise Loss; highlighted: SPL]

47
Summary
Automatic Self-Paced Learning
Concave-Biconvex Procedure
Generalization to other latent models
– Expectation-Maximization
– E-step remains the same
– M-step includes indicator variables $v_i$
Kumar, Packer and Koller, NIPS 2010
