Ankit Gupta CSE 590V: Vision Seminar. Goal Articulated pose estimation ( ) recovers the pose of an articulated object which consists of joints and rigid.

Presentation transcript:

1 Ankit Gupta CSE 590V: Vision Seminar

2 Goal: Articulated pose estimation recovers the pose of an articulated object, which consists of joints and rigid parts. Slide taken from authors, Yang et al.

3 Classic Approach. Part representation: head, torso, arm, leg; location, rotation, scale (Marr & Nishihara 1978). Slide taken from authors, Yang et al.

4 Classic Approach. Part representation: head, torso, arm, leg; location, rotation, scale (Marr & Nishihara 1978). Pictorial structure: unary templates, pairwise springs (Fischler & Elschlager 1973; Felzenszwalb & Huttenlocher 2005). Slide taken from authors, Yang et al.

5 Classic Approach. Part representation: head, torso, arm, leg; location, rotation, scale (Marr & Nishihara 1978). Pictorial structure: unary templates, pairwise springs (Fischler & Elschlager 1973; Felzenszwalb & Huttenlocher 2005). Also: Andriluka et al.; Eichner et al.; Johnson & Everingham 2010; Sapp et al.; Singh et al.; Tran & Forsyth 2010; Epshtein & Ullman 2007; Ferrari et al.; Lan & Huttenlocher 2005; Ramanan 2007; Sigal & Black 2006; Wang & Mori 2008. Slide taken from authors, Yang et al.

6 Pictorial Structures for Object Recognition Pedro F. Felzenszwalb, Daniel P. Huttenlocher IJCV, 2005

7 Pictorial structure for Face

8 Pictorial Structure Model Slide taken from authors, Yang et al.

9 Pictorial Structure Model Slide taken from authors, Yang et al.

10 Pictorial Structure Model Slide taken from authors, Yang et al.
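
For reference, the pictorial structure score is commonly written in the form below, following the notation of Yang et al. The symbol names are assumptions here, since the slide text carries no formula: V is the set of parts, E the set of tree edges, φ(I, l_i) an appearance feature (e.g. HOG) extracted at location l_i, and ψ the spring feature on the relative offset of two connected parts.

```latex
S(I, L) = \sum_{i \in V} w_i \cdot \phi(I, l_i)
        + \sum_{(i,j) \in E} w_{ij} \cdot \psi(l_i - l_j),
\qquad
\psi(l_i - l_j) = \begin{bmatrix} dx & dx^2 & dy & dy^2 \end{bmatrix}^{\top}
```

The first sum corresponds to the unary templates and the second to the pairwise springs named on the earlier slides.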

11 Using this Model

12 Train phase / Test phase

13 Train phase. Given: images (I); known locations of the parts (L). Need to learn: unary templates; spatial features.

14 Train phase. Given: images (I); known locations of the parts (L). Need to learn: unary templates; spatial features. Standard structural SVM formulation; standard solvers available (SVMStruct).

15 Train phase. Given: images (I); known locations of the parts (L). Need to learn: unary templates; spatial features. Standard structural SVM formulation; standard solvers available (SVMStruct). Test phase. Given: image (I). Need to compute: part locations (L). Algorithm: L* = arg max_L S(I, L).

16 Train phase. Given: images (I); known locations of the parts (L). Need to learn: unary templates; spatial features. Standard structural SVM formulation; standard solvers available (SVMStruct). Test phase. Given: image (I). Need to compute: part locations (L). Algorithm: L* = arg max_L S(I, L). Standard inference problem; for tree graphs, it can be computed exactly using belief propagation.
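
The test-phase arg max over a tree can be sketched as max-sum message passing (the max-product form of belief propagation). This is a minimal illustration, not the authors' implementation: the function name `infer_tree`, the dense score tables, and the assumption that every part shares the same K candidate locations are all hypothetical simplifications.

```python
import numpy as np

def infer_tree(unary, pairwise, parent):
    """Exact max-sum inference on a tree-structured part model.

    unary[i]    : (K,) array, appearance score of part i at each of K locations
    pairwise[i] : (K, K) array; pairwise[i][a, b] is the spring score for part i
                  at location a with its parent at location b (unused for root)
    parent[i]   : parent index of part i, -1 for the root (part 0);
                  parts are assumed ordered so that parent[i] < i
    Returns (best_score, best_locations).
    """
    n = len(unary)
    total = [np.array(u, dtype=float) for u in unary]  # unary + child messages
    argbest = [None] * n
    # Upward pass: each part sends its best score to its parent.
    for i in range(n - 1, 0, -1):
        cand = total[i][:, None] + pairwise[i]  # rows: child loc, cols: parent loc
        argbest[i] = cand.argmax(axis=0)        # best child loc per parent loc
        total[parent[i]] += cand.max(axis=0)
    # Downward pass: pick the root, then backtrack through stored arg maxes.
    loc = [0] * n
    loc[0] = int(total[0].argmax())
    best = float(total[0][loc[0]])
    for i in range(1, n):
        loc[i] = int(argbest[i][loc[parent[i]]])
    return best, loc
```

With K locations and n parts this costs O(nK²) for dense pairwise tables; the quadratic spring form is what allows faster distance-transform implementations in practice.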

17 Yi Yang & Deva Ramanan University of California, Irvine

18 Problems with previous methods: wide variations. In-plane rotation, foreshortening, scaling, out-of-plane rotation, intra-category variation, aspect ratio. Slide taken from authors, Yang et al.

19 Problems with previous methods: wide variations. In-plane rotation, foreshortening, scaling, out-of-plane rotation, intra-category variation, aspect ratio. Naïve brute-force evaluation is expensive. Slide taken from authors, Yang et al.

20 Our Method – “Mini-Parts” Key idea: “mini part” model can approximate deformations Slide taken from authors, Yang et al.

21 Example: Arm Approximation Slide taken from authors, Yang et al.

22 Pictorial Structure Model Slide taken from authors, Yang et al.

23 The Flexible Mixture Model Slide taken from authors, Yang et al.

24 Our Flexible Mixture Model Slide taken from authors, Yang et al.

25 Our Flexible Mixture Model Slide taken from authors, Yang et al.

26 Our Flexible Mixture Model Slide taken from authors, Yang et al.

27 Co-occurrence “Bias” Slide taken from authors, Yang et al.

28 Co-occurrence “Bias”: Example. Let part i: eyes, mixture m_i ∈ {open, closed}; part j: mouth, mixture m_j ∈ {smile, frown}.

29 Co-occurrence “Bias”: Example. Let part i: eyes, mixture m_i ∈ {open, closed}; part j: mouth, mixture m_j ∈ {smile, frown}. Compare b(closed eyes, smiling mouth) vs. b(open eyes, smiling mouth).

30 Co-occurrence “Bias”: Example. Let part i: eyes, mixture m_i ∈ {open, closed}; part j: mouth, mixture m_j ∈ {smile, frown}. Learnt: b(closed eyes, smiling mouth) < b(open eyes, smiling mouth).
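
The eyes/mouth example can be made concrete with a small sketch. All numbers below are made up for illustration, and `best_pair` is a hypothetical helper, not part of the authors' code; the only property taken from the slide is that the learnt bias for (closed eyes, smiling mouth) is lower than for (open eyes, smiling mouth).

```python
# Hypothetical co-occurrence biases b(m_i, m_j); values are illustrative only.
bias = {
    ("open", "smile"): 0.8,
    ("open", "frown"): 0.3,
    ("closed", "smile"): -0.5,  # lower than ("open", "smile"), as on the slide
    ("closed", "frown"): 0.1,
}

def best_pair(eye_scores, mouth_scores, bias):
    """Pick the mixture pair maximizing appearance scores plus co-occurrence bias."""
    candidates = {
        (e, m): eye_scores[e] + mouth_scores[m] + bias[(e, m)]
        for e in eye_scores
        for m in mouth_scores
    }
    return max(candidates, key=candidates.get)

# The eye template slightly prefers "closed", but the bias tips the joint choice.
eye_scores = {"open": 1.0, "closed": 1.2}
mouth_scores = {"smile": 2.0, "frown": 0.5}
print(best_pair(eye_scores, mouth_scores, bias))  # ('open', 'smile')
```

The bias acts as a prior over which mixture types tend to appear together, so a weak local preference can be overridden by a strong pairwise one.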

31 Using this Model

32 Test phaseTrain phase

33 Train phase. Given: images (I); known locations of the parts (L). Need to learn: unary templates; spatial features; co-occurrence. Standard structural SVM formulation; standard solvers available (SVMStruct).

34 Train phase. Given: images (I); known locations of the parts (L). Need to learn: unary templates; spatial features; co-occurrence. Standard structural SVM formulation; standard solvers available (SVMStruct). Test phase. Given: image (I). Need to compute: part locations (L); part mixtures (M). Algorithm: (L*, M*) = arg max_{L,M} S(I, L, M). Standard inference problem; for tree graphs, it can be computed exactly using belief propagation.
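
To make the joint objective S(I, L, M) concrete, here is a toy sketch that scores a configuration and finds the arg max by exhaustive search. The score layout (per-part bias, template response, co-occurrence bias, and spring weights on [dx, dx², dy, dy²]) follows the form published by Yang and Ramanan; the function names, data layout, and precomputed response tables are assumptions made for illustration. Exhaustive search is used only to keep the example short; on a tree the same arg max would be computed by message passing over (location, mixture) states.

```python
import itertools
import numpy as np

def psi(li, lj):
    """Spring feature on the relative offset: [dx, dx^2, dy, dy^2]."""
    dx, dy = li[0] - lj[0], li[1] - lj[1]
    return np.array([dx, dx * dx, dy, dy * dy], dtype=float)

def fmm_score(L, M, unary, b_part, b_pair, w_pair, edges):
    """Score of one configuration (hypothetical data layout).

    L[i], M[i]     : (x, y) location and mixture index of part i
    unary[i][m]    : dict mapping location -> template response for mixture m
    b_part[i][m]   : per-part mixture bias
    b_pair[(i, j)] : (Mi, Mj) array of co-occurrence biases
    w_pair[(i, j)] : (Mi, Mj, 4) array of spring weights
    """
    s = sum(b_part[i][M[i]] + unary[i][M[i]][L[i]] for i in range(len(L)))
    for i, j in edges:
        s += b_pair[(i, j)][M[i], M[j]]
        s += w_pair[(i, j)][M[i], M[j]] @ psi(L[i], L[j])
    return float(s)

def argmax_exhaustive(locs, n_mix, **model):
    """Brute-force (score, L*, M*); only feasible for toy problems."""
    n = len(n_mix)
    return max(
        ((fmm_score(L, M, **model), L, M)
         for L in itertools.product(locs, repeat=n)
         for M in itertools.product(*[range(k) for k in n_mix])),
        key=lambda t: t[0],
    )
```

The cost of the exhaustive search grows exponentially in the number of parts, which is why the tree structure matters for practical inference.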

35 Results

36 Achieving articulation Slide taken from authors, Yang et al.

37 Achieving articulation Slide taken from authors, Yang et al.

38 Qualitative Results

39

40

41 Failure cases To be put

42 Benchmark Datasets. PARSE (full-body): nan/papers/parse/index.html. BUFFY (upper-body): gg/data/stickmen/index.html. Slide taken from authors, Yang et al.

43 Quantitative Results on PARSE. 1 second per image. Image Parse test set. Columns: Method, Head, Torso, U. Legs, L. Legs, U. Arms, L. Arms, Total. Methods: Ramanan, Andriluka, Johnson, Singh, Johnson, Our Model. Metric: % of correctly localized limbs. Slide taken from authors, Yang et al.

44 Quantitative Results on BUFFY. All previous work uses explicitly articulated models. Subset of Buffy test set. Columns: Method, Head, Torso, U. Arms, L. Arms, Total. Methods: Tran, Andriluka, Eichner, Sapp 2010a, Sapp 2010b, Our Model. Metric: % of correctly localized limbs. Slide taken from authors, Yang et al.

45 More Parts and Mixtures Help. 14 parts (joints) vs. 27 parts (joints + midpoints). Slide taken from authors, Yang et al.

46 Discussion Possible limitations? Something other than human pose estimation? Can do more useful things with the Kinect? Can this encode occlusions well?

47 References Code and benchmark datasets available

