Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11.

1 Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11

2 Goal: Detect all instances of objects (cars, faces, cats)

3 Object model: last class. Statistical Template in Bounding Box – Object is some (x,y,w,h) in image – Features defined wrt bounding box coordinates. Image, Template Visualization. Images from Felzenszwalb

4 Last class: sliding window detection … …

5 Last class: statistical template. Object model = log linear model of parts at fixed positions. +3 +2 -2 -2.5 = -0.5 > 7.5? Non-object. +4 +1 +0.5 +3 +0.5 = 10.5 > 7.5? Object

6 When are statistical templates useful? Caltech 101 Average Object Images

7 Deformable objects Images from Caltech-256 Slide Credit: Duan Tran

8 Deformable objects Images from D. Ramanan’s dataset Slide Credit: Duan Tran

9 Object models: this class Articulated parts model – Object is configuration of parts – Each part is detectable Images from Felzenszwalb

10 Parts-based Models. Define object by collection of parts modeled by 1. Appearance 2. Spatial configuration. Slide credit: Rob Fergus

11 How to model spatial relations? Star-shaped model Root Part

12 How to model spatial relations? Star-shaped model (Root, Part)

13 How to model spatial relations? Tree-shaped model

14 How to model spatial relations? Fergus et al. ’03, Fei-Fei et al. ’03, Leibe et al. ’04, ’08, Crandall et al. ’05, Fergus et al. ’05, Crandall et al. ’05, Felzenszwalb & Huttenlocher ’05, Bouchard & Triggs ’05, Carneiro & Lowe ’06, Csurka ’04, Vasconcelos ’00. From [Carneiro & Lowe, ECCV’06]. Complexities shown: O(N^6), O(N^2), O(N^3), O(N^2). Many others...

15 Today’s class 1. Tree-shaped model – Example: Pictorial structures (Felzenszwalb and Huttenlocher 2005) 2. Optimization with Dynamic Programming 3. Distance Transforms

16 Pictorial Structures Model. Part = oriented rectangle. Spatial model = relative size/orientation. Felzenszwalb and Huttenlocher 2005

17 Part representation Background subtraction

18 Pictorial Structures Model Appearance likelihood Geometry likelihood

19 Modeling the Appearance. Any appearance model could be used – HOG templates, etc. – Here: rectangles fit to background-subtracted binary map. Train a detector for each part independently. Appearance likelihood, Geometry likelihood

20 Pictorial Structures Model. Minimize Energy (-log of the likelihood): Appearance Cost + Geometry Cost
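The energy on this slide was an image and did not survive in the transcript. A hedged reconstruction in the commonly used pictorial structures notation follows; the symbols l_i (location of part i), m_i (appearance cost), d_ij (geometry cost), E (set of tree edges) and K (number of parts) are assumed notation, not taken from the slide.

```latex
L^{*} = \arg\min_{L = (l_1, \dots, l_K)}
        \left( \sum_{i=1}^{K} m_i(l_i)
             + \sum_{(i,j) \in E} d_{ij}(l_i, l_j) \right)
```

The first sum is the appearance cost of placing each part, and the second is the geometry cost of each connected pair of parts.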

21 Optimization – Brute Force. H T F. Appearance Cost, Deformation Cost
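A minimal sketch of the brute-force search, under assumptions that are not in the slides: locations are 1-D integer indices, the edges are H-T and T-F, the deformation cost is a squared difference, and the function name `brute_force` and the cost values are made up for illustration.

```python
from itertools import product

def brute_force(app, edges, deform):
    """Try every joint assignment of locations and keep the cheapest one.

    app:    dict part -> list of N appearance costs (one per location)
    edges:  list of (part_a, part_b) pairs that share a deformation cost
    deform: function (loc_a, loc_b) -> deformation cost
    """
    parts = list(app)
    n = len(app[parts[0]])
    best_cost, best_cfg = float("inf"), None
    # N**K joint configurations for K parts with N locations each.
    for locs in product(range(n), repeat=len(parts)):
        cfg = dict(zip(parts, locs))
        cost = sum(app[p][cfg[p]] for p in parts)
        cost += sum(deform(cfg[a], cfg[b]) for a, b in edges)
        if cost < best_cost:
            best_cost, best_cfg = cost, cfg
    return best_cost, best_cfg

# Illustrative (made-up) appearance costs for 4 candidate locations per part.
app = {"H": [5, 1, 4, 6], "T": [3, 4, 0, 2], "F": [7, 6, 5, 1]}
edges = [("H", "T"), ("T", "F")]
cost, cfg = brute_force(app, edges, lambda a, b: (a - b) ** 2)
```

The loop visits every joint configuration, which is why this approach scales exponentially in the number of parts.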

22 Optimization H T F

23 Optimization - Dynamic Programming H T F

24 H T F

25 H T F

26 H T F
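The dynamic programming steps on these slides can be sketched as follows. This is an illustration under assumptions not stated in the slides: T is taken as the root with H and F as its children, locations are 1-D integer indices, and the name `dp_star` and the cost values are hypothetical.

```python
def dp_star(app, root, children, deform):
    """Min-sum dynamic programming on a depth-one tree (root plus leaves).

    For each root location, each child independently picks its best location;
    the root then picks the location with the lowest accumulated cost.
    """
    n = len(app[root])
    total = list(app[root])            # accumulated cost per root location
    argmins = {c: [] for c in children}
    for c in children:
        for lr in range(n):
            # Best child location given the root at lr: O(N) work.
            best = min(range(n), key=lambda lc: app[c][lc] + deform(lr, lc))
            argmins[c].append(best)
            total[lr] += app[c][best] + deform(lr, best)
    best_root = min(range(n), key=total.__getitem__)
    cfg = {root: best_root}
    for c in children:                 # backtrack: read off stored argmins
        cfg[c] = argmins[c][best_root]
    return total[best_root], cfg

# Illustrative (made-up) appearance costs for 4 candidate locations per part.
app = {"H": [5, 1, 4, 6], "T": [3, 4, 0, 2], "F": [7, 6, 5, 1]}
cost, cfg = dp_star(app, "T", ["H", "F"], lambda a, b: (a - b) ** 2)
```

Each child is minimized separately for every root location, so the work grows with the number of parts times N squared rather than with N to the power of the number of parts.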

27 Optimization - Complexity. H T F. Each part can take N locations. Cost of the minimization for one l_T? For all l_T? Overall (for K parts)?
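The questions on this slide can be worked through as below. The message notation (m_H for the appearance cost of H, d_HT for the deformation cost between H and T) is assumed for illustration and does not appear in the transcript.

```latex
\begin{align*}
\text{one parent location } l_T:\quad
  & \min_{l_H}\bigl[m_H(l_H) + d_{HT}(l_H, l_T)\bigr]
  && \text{costs } O(N) \\
\text{all } N \text{ parent locations}:\quad
  & && O(N^2) \\
\text{tree with } K \text{ parts } (K-1 \text{ edges}):\quad
  & && O(K N^2) \\
\text{brute force over all configurations}:\quad
  & && O(N^K)
\end{align*}
```

For suitable deformation costs, a distance transform replaces the O(N^2) step per edge with an O(N) one, which would give O(KN) overall.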

28 Distance Transform. For each pixel p, how far away is the nearest pixel q of set G? – G is often the set of edge pixels. In the figure, G is the set of black pixels, with example pairs (p1, q1) and (p2, q2)
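The definition can be made concrete with a direct, unoptimized sketch. It assumes Euclidean distance and a non-empty set G given as truthy entries of a 2-D list; the function name is hypothetical.

```python
import math

def distance_transform_brute(grid):
    """For every pixel, return the Euclidean distance to the nearest pixel in G.

    grid: list of rows; truthy entries belong to G (assumed non-empty).
    Cost is O(number of pixels * |G|), so this is for illustration only.
    """
    g = [(r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if v]
    return [
        [min(math.hypot(r - gr, c - gc) for gr, gc in g) for c in range(len(row))]
        for r, row in enumerate(grid)
    ]

dt = distance_transform_brute([[0, 0, 1],
                               [0, 0, 0]])
```

Pixels in G get distance 0, and every other pixel gets the distance to its closest member of G.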

29 Computing the Distance Transform: 1-Dimension
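The figure for this slide is not in the transcript, so the following is only one standard way to compute a 1-D distance transform, not necessarily the one shown. It assumes the distance is |p - q| on integer positions; the function name is hypothetical.

```python
def dt_1d_two_pass(in_set):
    """1-D distance transform under |p - q| using a forward and a backward pass.

    in_set: list of booleans/0-1 flags, truthy where the position belongs to G.
    Runs in O(N) time.
    """
    inf = float("inf")
    d = [0 if s else inf for s in in_set]
    # Forward pass: nearest member of G to the left.
    for i in range(1, len(d)):
        d[i] = min(d[i], d[i - 1] + 1)
    # Backward pass: nearest member of G to the right.
    for i in range(len(d) - 2, -1, -1):
        d[i] = min(d[i], d[i + 1] + 1)
    return d

result = dt_1d_two_pass([0, 0, 1, 0, 0, 0, 1, 0])
```

Each position is touched twice, so the cost is linear in the number of positions regardless of how many belong to G.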

30 Computing the Distance Transform: Extending to N-Dimensions. Savings can be extended from the 1-D to the N-D case if the deformation cost is separable
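The separability condition can be illustrated with a squared Euclidean deformation cost in 2-D. This cost and the symbols D and m are assumptions chosen for the example, since the slide's own formula is not in the transcript.

```latex
\begin{align*}
D(x, y) &= \min_{x', y'} \Bigl[ (x - x')^2 + (y - y')^2 + m(x', y') \Bigr] \\
        &= \min_{x'} \Bigl[ (x - x')^2
           + \underbrace{\min_{y'} \bigl[ (y - y')^2 + m(x', y') \bigr]}_{\text{1-D transform along } y \text{ for fixed } x'} \Bigr]
\end{align*}
```

The inner minimization is a 1-D transform along one axis, and the outer minimization is a second 1-D transform along the other axis applied to its result.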

31 Distance Transform - Applications. Set distances – e.g. Hausdorff distance. Image processing – e.g. blurring. Robotics – motion planning. Alignment – edge images – motion tracks – audio warping. Deformable part models (of course!)

32 Generalized Distance Transform Original form: General form: – m(q): arbitrary function sampled at discrete points – For each p, find a nearby q with small m(q) – For original DT: – For part models: For some deformation costs,
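The formulas on this slide were images and are missing from the transcript. A hedged reconstruction in the usual notation is given below; d(p, q) for the distance or deformation cost and the subscripted DT symbols are assumed, while m(q) is the slide's own symbol.

```latex
\begin{align*}
\text{Original form:}\quad & DT_G(p) = \min_{q \in G} d(p, q) \\
\text{General form:}\quad  & DT_m(p) = \min_{q} \bigl[ m(q) + d(p, q) \bigr] \\
\text{Original DT as a special case:}\quad
  & m(q) = \begin{cases} 0 & q \in G \\ \infty & \text{otherwise} \end{cases} \\
\text{Part models (assumed reading):}\quad
  & m(q) = \text{cost of placing the child part at } q,\;
    d = \text{deformation cost}
\end{align*}
```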

33 How do we do it? Key idea: Construct lower “envelope” of data. Given envelope, compute f(p) in O(N) time. Goal: Compute envelope in O(N) time

34 Computing the Lower Envelope. Key idea: Keep track of intersection points – f_i(x), f_j(x) only intersect once (let i<j). Start(1) = -Inf, End(1) = Inf

35 Computing the Lower Envelope. Start(1) = -Inf, End(1) = 3; Start(2) = 3, End(2) = Inf

36 Computing the Lower Envelope. Start(1) = -Inf, End(1) = 3; Start(2) = 3, End(2) = 4; Start(3) = 4, End(3) = Inf

37 Computing the Lower Envelope. Start(1) = -Inf, End(1) = 3; Start(2) = 3, End(2) = 4; Start(3) = 4, End(3) = Inf

38 Computing the Lower Envelope. Start(1) = -Inf, End(1) = 3; Start(2) = 3, End(2) = 4; Start(3) = [], End(3) = []; Start(4) = 4, End(4) = Inf

39 Computing the Lower Envelope. Start(1) = -Inf, End(1) = 3; Start(2) = 3, End(2) = 4; Start(3) = [], End(3) = []; Start(4) = 4, End(4) = Inf

40 Computing the Lower Envelope. Start(1) = -Inf, End(1) = 3; Start(2) = 3, End(2) = 4; Start(3) = [], End(3) = []; Start(4) = 4, End(4) = Inf

41 Computing the Lower Envelope. Start(1) = -Inf, End(1) = 3; Start(2) = [], End(2) = []; Start(3) = [], End(3) = []; Start(4) = 3, End(4) = Inf

42 Computing the Lower Envelope. Start(1) = -Inf, End(1) = 2.6; Start(2) = [], End(2) = []; Start(3) = [], End(3) = []; Start(4) = 2.6, End(4) = Inf

43 Computing the Lower Envelope. Start(1) = -Inf, End(1) = 2.6; Start(2) = [], End(2) = []; Start(3) = [], End(3) = []; Start(4) = 2.6, End(4) = 3.5; Start(5) = 3.5, End(5) = Inf

44 Is This O(N)? Yes! N parabolas are added – For each addition, may remove up to O(N) parabolas. But each parabola can only be deleted once. Thus, O(N) additions and at most O(N) deletions

45 What distances can we use? Quadratic (with linear shift), Abs. diff, Min-composition, Bounded (requires extra bookkeeping)
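The lower-envelope procedure walked through on the preceding slides can be sketched as follows for a quadratic cost (p - q)^2 + m(q) on integer positions. This follows the widely used formulation of the 1-D squared-distance transform; the names `v` and `z` are illustrative choices, and the sketch assumes all input values are finite (use a large number instead of infinity).

```python
import math

def dt_1d_squared(m):
    """Return d[p] = min over q of (p - q)**2 + m[q], in O(N) time.

    v[k] holds the position of the k-th parabola in the lower envelope;
    z[k], z[k + 1] are the Start/End boundaries of the range where it is lowest.
    """
    n = len(m)
    v = [0] * n
    z = [0.0] * (n + 1)
    z[0], z[1] = -math.inf, math.inf
    k = 0
    for q in range(1, n):
        while True:
            p = v[k]
            # Intersection of the parabola rooted at q with the one rooted at p.
            s = ((m[q] + q * q) - (m[p] + p * p)) / (2 * q - 2 * p)
            if s <= z[k]:
                k -= 1          # parabola p is hidden: delete it (at most once)
            else:
                break
        k += 1
        v[k] = q
        z[k], z[k + 1] = s, math.inf
    d = [0] * n
    k = 0
    for q in range(n):          # read the envelope off from left to right
        while z[k + 1] < q:
            k += 1
        d[q] = (q - v[k]) ** 2 + m[v[k]]
    return d

values = [4, 1, 7, 0, 3, 9]
result = dt_1d_squared(values)
```

Every parabola is pushed once and popped at most once, which is the amortized argument this slide makes for linear time.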

46 Back to the Pictorial structures model. Problem: May need to infer more than one configuration. a) May be more than one object – Report scores for every root position, apply NMS. b) Optimal solution may be incorrect – Sampling: sample root node, then each node given parent, until all parts are sampled

47 Sample poses from likelihood and choose best match with Chamfer distance to map
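The slide does not define the Chamfer distance it uses. As a hedged illustration, one common one-directional form averages, over the points of the sampled pose, the distance to the nearest point of the map; the function name and the point-list representation are assumptions.

```python
import math

def chamfer_distance(template_pts, map_pts):
    """Mean distance from each template point to its nearest map point.

    Both arguments are non-empty lists of (x, y) tuples. In practice the inner
    minimum is usually read from a precomputed distance transform of the map.
    """
    total = 0.0
    for tx, ty in template_pts:
        total += min(math.hypot(tx - mx, ty - my) for mx, my in map_pts)
    return total / len(template_pts)

score = chamfer_distance([(0, 0), (3, 0)], [(0, 0), (0, 1), (3, 4)])
```

A lower value means the sampled pose lies closer to the map, so candidate poses can be ranked by this score.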

48 Results for person matching

49 Results for person matching - Mistakes. Ambiguous background-subtracted image

50 Recently enhanced pictorial structures BMVC 2009

51 Deformable Latent Parts Model. Useful parts discovered during training. Detections, Template Visualization. Felzenszwalb et al. 2008

52 Things to Remember. Rather than searching for whole object, can locate parts that compose the object – Better encoding of spatial variation. Models can be broken down into part appearance and spatial configuration – Wide variety of models. Efficient optimization is often tricky, but many tricks available – Dynamic programming, Distance transforms

