Published by Makayla Sinclair. Modified over 3 years ago.

1
**Self-Paced Learning for Semantic Segmentation**

M. Pawan Kumar

3
**Self-Paced Learning for Latent Structural SVM**

M. Pawan Kumar, Benjamin Packer, Daphne Koller

4
**Aim**

To learn accurate parameters for a latent structural SVM. Input: $x$. Output: $y \in \mathcal{Y}$. Hidden variable: $h \in \mathcal{H}$. Example: $y$ = “Deer”, with $\mathcal{Y}$ = {“Bison”, “Deer”, “Elephant”, “Giraffe”, “Llama”, “Rhino”}.

5
**Aim**

To learn accurate parameters for a latent structural SVM. Joint feature vector: $\Phi(x, y, h)$ (HOG, BoW). Parameters: $w$. Prediction: $(y^*, h^*) = \operatorname{argmax}_{y \in \mathcal{Y},\, h \in \mathcal{H}} w^\top \Phi(x, y, h)$.
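The prediction rule on this slide can be illustrated with a minimal sketch that searches all (y, h) pairs exhaustively. The feature map `toy_feature`, the weights and the input are hypothetical stand-ins, not the HOG or BoW features used in the talk.

```python
from itertools import product

def dot(w, f):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(w, f))

def toy_feature(x, y, h):
    # Hypothetical joint feature map: label y owns slot y of the vector,
    # and the hidden variable h selects one entry of the input x.
    f = [0.0, 0.0]
    f[y] = x[h]
    return f

def predict(w, x, labels, hiddens, feature):
    """Return the pair (y, h) with the highest linear score w . feature(x, y, h)."""
    return max(product(labels, hiddens),
               key=lambda yh: dot(w, feature(x, yh[0], yh[1])))

# Assumed toy values: two labels, three candidate hidden positions.
y_star, h_star = predict([1.0, -1.0], [0.2, 0.9, 0.5], [0, 1], [0, 1, 2], toy_feature)
print(y_star, h_star)
```

Exhaustive search is only practical when the label and hidden spaces are small; structured problems would normally need a problem-specific inference routine.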

6
**Motivation**

Math is for losers!! Real numbers, imaginary numbers, $e^{i\pi} + 1 = 0$. FAILURE … BAD LOCAL MINIMUM

7
**Motivation**

Euler was a genius!! Real numbers, imaginary numbers, $e^{i\pi} + 1 = 0$. SUCCESS … GOOD LOCAL MINIMUM

8
**Motivation**

Start with “easy” examples, then consider “hard” ones. Simultaneously estimate easiness and parameters. Easiness is a property of data sets, not single instances. Easy vs. hard: expensive; easy for human; easy for machine.

9
**Outline**

Latent Structural SVM. Concave-Convex Procedure. Self-Paced Learning. Experiments.

10
**Latent Structural SVM**

Felzenszwalb et al., 2008; Yu and Joachims, 2009. Training samples: $x_i$. Ground-truth labels: $y_i$. Loss function: $\Delta(y_i, y_i(w), h_i(w))$.

11
**Latent Structural SVM**

Prediction: $(y_i(w), h_i(w)) = \operatorname{argmax}_{y \in \mathcal{Y},\, h \in \mathcal{H}} w^\top \Phi(x_i, y, h)$.

Objective: $\min_w \|w\|^2 + C \sum_i \Delta(y_i, y_i(w), h_i(w))$.

Non-convex objective: minimize an upper bound.
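The upper bound mentioned here can be derived in two steps. The sketch below writes $\Phi$ for the joint feature vector and $\Delta$ for the loss; both symbols are assumptions, since the transcript does not preserve the original glyphs. The first inequality holds because the predicted pair maximizes the score over all $(y, h)$, including every pair with $y = y_i$; the second replaces the predicted pair by a maximum over all pairs.

```latex
\begin{align*}
\Delta\bigl(y_i, y_i(w), h_i(w)\bigr)
  &\le \Delta\bigl(y_i, y_i(w), h_i(w)\bigr)
     + w^\top \Phi\bigl(x_i, y_i(w), h_i(w)\bigr)
     - \max_{h \in \mathcal{H}} w^\top \Phi(x_i, y_i, h) \\
  &\le \max_{y \in \mathcal{Y},\, h \in \mathcal{H}}
       \bigl[\Delta(y_i, y, h) + w^\top \Phi(x_i, y, h)\bigr]
     - \max_{h \in \mathcal{H}} w^\top \Phi(x_i, y_i, h)
\end{align*}
```

The final expression is the smallest slack $\xi_i$ that satisfies the margin constraints of the latent structural SVM for sample $i$.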

12
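**Latent Structural SVM**

$(y_i(w), h_i(w)) = \operatorname{argmax}_{y \in \mathcal{Y},\, h \in \mathcal{H}} w^\top \Phi(x_i, y, h)$

$\min_{w,\, \xi} \|w\|^2 + C \sum_i \xi_i$

subject to $\max_{h_i} w^\top \Phi(x_i, y_i, h_i) - w^\top \Phi(x_i, y, h) \ge \Delta(y_i, y, h) - \xi_i$ for all $y \in \mathcal{Y}$, $h \in \mathcal{H}$.

Still non-convex, but a difference of convex functions. The CCCP algorithm converges to a local minimum.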
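As a rough illustration of the constraint above, the slack of one sample can be computed as a convex term (loss-augmented maximum) minus another convex term (best score for the true label), which is what makes the objective a difference of convex functions. The feature map, loss and numbers below are hypothetical toy choices.

```python
def dot(w, f):
    return sum(a * b for a, b in zip(w, f))

def toy_feature(x, y, h):
    # Hypothetical joint feature map: label y owns slot y, h picks an entry of x.
    f = [0.0, 0.0]
    f[y] = x[h]
    return f

def zero_one(y_true, y, h):
    # 0/1 loss on the label; the hidden variable is ignored.
    return 0.0 if y == y_true else 1.0

def latent_slack(w, x, y_true, labels, hiddens, feature, loss):
    """Smallest slack satisfying all margin constraints for one sample."""
    # Convex in w: a maximum of affine functions of w.
    augmented = max(loss(y_true, y, h) + dot(w, feature(x, y, h))
                    for y in labels for h in hiddens)
    # Also convex in w; it enters with a minus sign, hence the non-convexity.
    best_true = max(dot(w, feature(x, y_true, h)) for h in hiddens)
    return augmented - best_true

w, x = [1.0, -1.0], [0.2, 0.9, 0.5]
slack_if_label_0 = latent_slack(w, x, 0, [0, 1], [0, 1, 2], toy_feature, zero_one)
slack_if_label_1 = latent_slack(w, x, 1, [0, 1], [0, 1, 2], toy_feature, zero_one)
print(slack_if_label_0, slack_if_label_1)
```

With these toy weights the sample is fit with margin when its label is 0 and is badly violated when its label is 1, so the second slack is much larger than the first.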

14
**Concave-Convex Procedure**

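Start with an initial estimate $w_0$.

Update $h_i = \operatorname{argmax}_{h \in \mathcal{H}} w_t^\top \Phi(x_i, y_i, h)$.

Update $w_{t+1}$ by solving a convex problem:

$\min_{w,\, \xi} \|w\|^2 + C \sum_i \xi_i$

subject to $w^\top \Phi(x_i, y_i, h_i) - w^\top \Phi(x_i, y, h) \ge \Delta(y_i, y, h) - \xi_i$ for all $y \in \mathcal{Y}$, $h \in \mathcal{H}$.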
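The two alternating steps can be sketched as follows. This is a toy illustration under stated assumptions: the convex problem is only approximated with plain subgradient descent (a swapped-in solver, not necessarily the one used in the talk), and the feature map, data and hyperparameters are hypothetical.

```python
def dot(w, f):
    return sum(a * b for a, b in zip(w, f))

def toy_feature(x, y, h):
    # Hypothetical joint feature map: label y owns slot y, h picks an entry of x.
    f = [0.0, 0.0]
    f[y] = x[h]
    return f

def zero_one(y_true, y, h):
    return 0.0 if y == y_true else 1.0

LABELS, HIDDENS = [0, 1], [0, 1]

def violation(w, x, y, h_i):
    """Slack of one sample with h_i fixed, and its subgradient w.r.t. w."""
    pairs = [(yy, hh) for yy in LABELS for hh in HIDDENS]
    y_hat, h_hat = max(pairs, key=lambda p: zero_one(y, p[0], p[1])
                       + dot(w, toy_feature(x, p[0], p[1])))
    f_hat, f_true = toy_feature(x, y_hat, h_hat), toy_feature(x, y, h_i)
    slack = zero_one(y, y_hat, h_hat) + dot(w, f_hat) - dot(w, f_true)
    return slack, [a - b for a, b in zip(f_hat, f_true)]

def cccp(data, w0, C=1.0, outer=3, inner=30, lr=0.1):
    w = list(w0)
    for _ in range(outer):
        # Step 1: impute the hidden variables with the current parameters.
        hs = [max(HIDDENS, key=lambda h: dot(w, toy_feature(x, y, h)))
              for x, y in data]
        # Step 2: approximately solve the convex problem in w.
        for _ in range(inner):
            grad = [2.0 * wi for wi in w]  # gradient of ||w||^2
            for (x, y), h in zip(data, hs):
                slack, g = violation(w, x, y, h)
                if slack > 0:
                    grad = [gj + C * dj for gj, dj in zip(grad, g)]
            w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

data = [([1.0, 0.1], 0), ([-1.0, -0.2], 1)]
w = cccp(data, w0=[0.0, 0.0])
print(w)
```

Every sample contributes to every update here, which is the behaviour the next slide criticises.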

15
**Concave-Convex Procedure**

Looks at all samples simultaneously. “Hard” samples will cause confusion. Start with “easy” samples, then consider “hard” ones.

17
**Self-Paced Learning REMINDER**

Simultaneously estimate easiness and parameters. Easiness is a property of data sets, not single instances.

18
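**Self-Paced Learning**

Start with an initial estimate $w_0$.

Update $h_i = \operatorname{argmax}_{h \in \mathcal{H}} w_t^\top \Phi(x_i, y_i, h)$.

Update $w_{t+1}$ by solving a convex problem:

$\min_{w,\, \xi} \|w\|^2 + C \sum_i \xi_i$

subject to $w^\top \Phi(x_i, y_i, h_i) - w^\top \Phi(x_i, y, h) \ge \Delta(y_i, y, h) - \xi_i$ for all $y \in \mathcal{Y}$, $h \in \mathcal{H}$.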

19
**Self-Paced Learning**

$\min_{w,\, \xi} \|w\|^2 + C \sum_i \xi_i$

subject to $w^\top \Phi(x_i, y_i, h_i) - w^\top \Phi(x_i, y, h) \ge \Delta(y_i, y, h) - \xi_i$ for all $y \in \mathcal{Y}$, $h \in \mathcal{H}$.

20
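**Self-Paced Learning**

Introduce an indicator $v_i \in \{0, 1\}$ for each sample:

$\min_{w,\, \xi,\, v} \|w\|^2 + C \sum_i v_i \xi_i$

subject to $w^\top \Phi(x_i, y_i, h_i) - w^\top \Phi(x_i, y, h) \ge \Delta(y_i, y, h) - \xi_i$ for all $y \in \mathcal{Y}$, $h \in \mathcal{H}$.

Trivial solution: setting every $v_i = 0$ discards all samples.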

21
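**Self-Paced Learning**

$\min_{w,\, \xi,\, v} \|w\|^2 + C \sum_i v_i \xi_i - \frac{1}{K} \sum_i v_i$

subject to $w^\top \Phi(x_i, y_i, h_i) - w^\top \Phi(x_i, y, h) \ge \Delta(y_i, y, h) - \xi_i$ for all $y \in \mathcal{Y}$, $h \in \mathcal{H}$.

[Illustrations: large $K$, medium $K$, small $K$.]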

22
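**Self-Paced Learning**

Relax the indicators to $v_i \in [0, 1]$:

$\min_{w,\, \xi,\, v} \|w\|^2 + C \sum_i v_i \xi_i - \frac{1}{K} \sum_i v_i$

subject to $w^\top \Phi(x_i, y_i, h_i) - w^\top \Phi(x_i, y, h) \ge \Delta(y_i, y, h) - \xi_i$ for all $y \in \mathcal{Y}$, $h \in \mathcal{H}$.

Biconvex problem, solved by alternating convex search.

[Illustrations: large $K$, medium $K$, small $K$.]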
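For fixed parameters, the objective above is linear in each $v_i$ with coefficient $C \xi_i - 1/K$, so the minimizing value is 1 when $C \xi_i < 1/K$ and 0 otherwise. The sketch below applies that rule to some assumed slack values; the function name `select_easy` and the numbers are hypothetical.

```python
def select_easy(slacks, C, K):
    """For fixed w, pick v_i minimizing C * v_i * xi_i - v_i / K over [0, 1].

    The term is linear in v_i, so the optimum sits at an endpoint:
    v_i = 1 when C * xi_i < 1 / K, and v_i = 0 otherwise.
    """
    return [1 if C * xi < 1.0 / K else 0 for xi in slacks]

slacks = [0.0, 0.3, 2.1]  # assumed per-sample slacks for some fixed w

large_K = select_easy(slacks, C=1.0, K=10.0)   # threshold 0.1
medium_K = select_easy(slacks, C=1.0, K=2.0)   # threshold 0.5
small_K = select_easy(slacks, C=1.0, K=0.4)    # threshold 2.5
print(large_K, medium_K, small_K)
```

A large $K$ keeps only the samples that are already fit almost perfectly, while a small $K$ admits nearly everything, which matches the large, medium and small $K$ progression on the slide.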

23
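**Self-Paced Learning**

Start with an initial estimate $w_0$.

Update $h_i = \operatorname{argmax}_{h \in \mathcal{H}} w_t^\top \Phi(x_i, y_i, h)$.

Update $w_{t+1}$ by solving a biconvex problem:

$\min_{w,\, \xi,\, v} \|w\|^2 + C \sum_i v_i \xi_i - \frac{1}{K} \sum_i v_i$

subject to $w^\top \Phi(x_i, y_i, h_i) - w^\top \Phi(x_i, y, h) \ge \Delta(y_i, y, h) - \xi_i$ for all $y \in \mathcal{Y}$, $h \in \mathcal{H}$.

Decrease $K \leftarrow K / \mu$.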
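The full loop can be sketched on toy data as below. Several simplifications are assumptions rather than the authors' procedure: each outer iteration does a single selection step followed by subgradient descent instead of running alternating convex search to convergence, the annealing factor is called `mu`, and the feature map, data, initial weights and hyperparameters are hypothetical. The third sample is deliberately labelled against its appearance so that it plays the role of a “hard” example.

```python
def dot(w, f):
    return sum(a * b for a, b in zip(w, f))

def toy_feature(x, y, h):
    # Hypothetical joint feature map: label y owns slot y, h picks an entry of x.
    f = [0.0, 0.0]
    f[y] = x[h]
    return f

def zero_one(y_true, y, h):
    return 0.0 if y == y_true else 1.0

LABELS, HIDDENS = [0, 1], [0, 1]

def violation(w, x, y, h_i):
    """Slack of one sample with h_i fixed, and its subgradient w.r.t. w."""
    pairs = [(yy, hh) for yy in LABELS for hh in HIDDENS]
    y_hat, h_hat = max(pairs, key=lambda p: zero_one(y, p[0], p[1])
                       + dot(w, toy_feature(x, p[0], p[1])))
    f_hat, f_true = toy_feature(x, y_hat, h_hat), toy_feature(x, y, h_i)
    slack = zero_one(y, y_hat, h_hat) + dot(w, f_hat) - dot(w, f_true)
    return slack, [a - b for a, b in zip(f_hat, f_true)]

def self_paced(data, w0, C=1.0, K=1.5, mu=1.5, outer=3, inner=30, lr=0.1):
    w, history = list(w0), []
    for _ in range(outer):
        # Impute the hidden variables with the current parameters.
        hs = [max(HIDDENS, key=lambda h: dot(w, toy_feature(x, y, h)))
              for x, y in data]
        # Selection step: keep samples whose weighted slack is below 1/K.
        v = [1 if C * violation(w, x, y, h)[0] < 1.0 / K else 0
             for (x, y), h in zip(data, hs)]
        history.append(sum(v))
        # Parameter step: subgradient descent on ||w||^2 + C * sum_i v_i * xi_i.
        for _ in range(inner):
            grad = [2.0 * wi for wi in w]
            for (x, y), h, v_i in zip(data, hs, v):
                slack, g = violation(w, x, y, h)
                if v_i and slack > 0:
                    grad = [gj + C * dj for gj, dj in zip(grad, g)]
            w = [wi - lr * gi for wi, gi in zip(w, grad)]
        K /= mu  # anneal: a smaller K admits harder samples next round
    return w, history

data = [([1.0, 0.1], 0), ([-1.0, -0.2], 1), ([0.3, 0.1], 1)]
w, history = self_paced(data, w0=[0.5, -0.5])
print(history)  # number of selected samples per outer iteration
```

On this toy problem the two consistent samples are used from the start, and the contradictory one is admitted only after $K$ has been annealed twice.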

25
**Object Detection**

Input $x$: image. Output $y \in \mathcal{Y}$, with $\mathcal{Y}$ = {“Bison”, “Deer”, “Elephant”, “Giraffe”, “Llama”, “Rhino”}. Latent $h$: box. Loss $\Delta$: 0/1 loss. Feature $\Phi(x, y, h)$: HOG.

26
**Object Detection**

Mammals dataset: 271 images, 6 classes. 90/10 train/test split, 4 folds.

27
**Object Detection**

[Slides 27 to 30: image comparisons of CCCP and Self-Paced; figures not preserved.]

31
**Object Detection**

[Plots: objective value and test error.]

32
**Handwritten Digit Recognition**

Input $x$: image. Output $y \in \mathcal{Y}$, with $\mathcal{Y} = \{0, 1, \dots, 9\}$. Latent $h$: rotation. Loss $\Delta$: 0/1 loss. Dataset: MNIST. Feature $\Phi(x, y, h)$: PCA + projection.

33
**Handwritten Digit Recognition**

[Slides 33 to 36: plots comparing SPL and CCCP; a marker denotes a significant difference. Plots not preserved.]

37
**Motif Finding**

Input $x$: DNA sequence. Output $y \in \mathcal{Y}$, with $\mathcal{Y} = \{0, 1\}$. Latent $h$: motif location. Loss $\Delta$: 0/1 loss. Feature $\Phi(x, y, h)$: Ng and Cardie, ACL 2002.

38
**Motif Finding**

UniProbe dataset: 40,000 sequences. 50/50 train/test split, 5 folds.

39
**Motif Finding**

[Plots: average Hamming distance of inferred motifs (SPL).]

40
**Motif Finding**

[Plot: objective value (SPL).]

41
**Motif Finding**

[Plot: test error (SPL).]

42
**Noun Phrase Coreference**

Input $x$: nouns. Output $y$: clustering. Latent $h$: spanning forest over nouns. Feature $\Phi(x, y, h)$: Yu and Joachims, ICML 2009.

43
**Noun Phrase Coreference**

MUC6 dataset: 60 documents. 50/50 train/test split, 1 predefined fold.

44
**Noun Phrase Coreference**

[Plots: MITRE loss and pairwise loss; markers denote significant improvement and significant decrement.]

45
**Noun Phrase Coreference**

[Slides 45 and 46: plots of MITRE loss and pairwise loss (SPL).]

47
**Summary**

Automatic self-paced learning. Concave-biconvex procedure. Generalization to other latent models, for example expectation-maximization: the E-step remains the same, and the M-step includes the indicator variables $v_i$.

Kumar, Packer and Koller, NIPS 2010
