1.Introduction 2.Article [1] Real Time Motion Capture Using a Single TOF Camera (2010) 3.Article [2] Real Time Human Pose Recognition In Parts Using a.

1

2 1. Introduction 2. Article [1]: Real Time Motion Capture Using a Single TOF Camera (2010) 3. Article [2]: Real-Time Human Pose Recognition in Parts from Single Depth Images (2011)

3 Fig From [2]

4 Why do we need this? Robotics; smart surveillance; virtual reality; motion analysis; gaming (Kinect).

5 Kinect for the Microsoft Xbox 360 console: “You are the controller”. Launched 4 November 2010. Sold over 8M units in its first 60 days on the market (Guinness world record). http://www.youtube.com/watch?v=p2qlHoxPioM

6

7 Mocap using markers: expensive. Multi-view camera systems: limited applicability. Monocular: simplified problems.

8 Time-of-flight (TOF) camera: dense depth; high frame rate (100 Hz); robust to lighting, shadows, and other problems.

9

10 2.1 previous work 2.2 What’s new? 2.3 Overview 2.4 results 2.5 limitations & future work 2.6 Evaluation

11 Many, many articles (Moeslund et al. 2006 alone covered 350 articles). (2006) (2006) (1998)

12 TOF technology. Propagating information up the kinematic chain. Probabilistic model using the unscented transform. Multiple GPUs.

13 1. Probabilistic model 2. Algorithm overview: model-based hill climbing search; evidence propagation; full algorithm.

14 15 body parts arranged in a DAG (directed acyclic graph). Variables: pose, speed, range scan. DBN: dynamic Bayesian network.

15 Dynamic Bayesian network (DBN). Assumptions: use ray casting to evaluate the distance from the measurement. Goal: find the most likely states, given the previous frame's MAP estimate, i.e.: Fig From [1]

16 1. Hill climbing search (HC) 2. Evidence propagation (EP)

17 Sample a grid around the current estimate, evaluate the likelihood at each grid point, and choose the best point. Use coarse-to-fine grids. Fig From [1]
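
The grid search on this slide can be sketched in a few lines. This is only a toy illustration, not the implementation from [1]: the function name `grid_hill_climb`, its parameters, and the quadratic stand-in likelihood in the usage note are all assumptions made for the example, and a full 3^n grid is only practical for a handful of dimensions.

```python
from itertools import product


def grid_hill_climb(likelihood, start, step=1.0, shrink=0.5, levels=5, max_moves=100):
    """Coarse-to-fine grid hill climbing (hypothetical helper, not from [1]).

    At each level, evaluate the likelihood on a grid around the current
    estimate, move to the best grid point while it improves, then shrink
    the grid spacing and repeat.
    """
    center = list(start)
    score = likelihood(center)
    for _ in range(levels):
        for _ in range(max_moves):  # bounded so the loop always terminates
            # Grid: every coordinate shifted by -step, 0, or +step.
            grid = [
                [c + o for c, o in zip(center, offsets)]
                for offsets in product((-step, 0.0, step), repeat=len(center))
            ]
            candidate = max(grid, key=likelihood)
            candidate_score = likelihood(candidate)
            if candidate_score <= score:
                break  # local optimum at this grid resolution
            center, score = candidate, candidate_score
        step *= shrink  # refine the grid
    return center, score
```

For example, with the stand-in likelihood `lambda p: -((p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2)` and start `[0.0, 0.0]`, the search settles on the peak at (1, -2). Like any hill climber, it can stall on a local optimum.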

18 The good: simple; fast; runs in parallel on GPUs. The bad: local optima; ridges, plateaus, alleys; can lose track when motion is fast or occlusions occur.

19 Evidence propagation also has 3 stages: 1. Body part detection (C. Plagemann et al. 2010) 2. Probabilistic inverse kinematics 3. Data association and inference

20 Bottom-up approach: 1. Locate interest points with AGEX (Accumulative Geodesic Extrema). 2. Find orientation. 3. Classify the head, feet, and hands using local shape descriptors. Fig From [3]
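
Step 1 can be illustrated on a small graph. The sketch below assumes the body surface is given as a weighted, undirected adjacency dictionary and that extrema are picked greedily as the node farthest (in geodesic distance) from all points found so far; the function names and the graph representation are hypothetical and are not taken from [3].

```python
import heapq


def geodesic_distances(graph, sources):
    """Multi-source Dijkstra: shortest path length from the nearest source."""
    dist = {node: float("inf") for node in graph}
    heap = []
    for s in sources:
        dist[s] = 0.0
        heap.append((0.0, s))
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            nd = d + weight
            if nd < dist[neighbour]:
                dist[neighbour] = nd
                heapq.heappush(heap, (nd, neighbour))
    return dist


def accumulative_extrema(graph, start, count):
    """Greedily pick `count` nodes, each farthest from all points found so far."""
    found = [start]
    extrema = []
    for _ in range(count):
        dist = geodesic_distances(graph, found)
        reachable = {n: d for n, d in dist.items() if d != float("inf")}
        farthest = max(reachable, key=reachable.get)
        extrema.append(farthest)
        found.append(farthest)  # later extrema must also be far from this one
    return extrema
```

On a star-shaped graph with a long "leg" and a shorter "arm" attached to a central node, the first extremum is the tip of the leg and the second is the tip of the arm.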

21 Results: Fig From [3]

22 Assume correspondence. Need new MAP conditioned on. Problem: isn’t linear! Solution: linearize with the unscented Kalman filter. Easy to determine.
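
The unscented transform behind that linearization step can be shown in one dimension. This is a minimal sketch under stated assumptions: a scalar Gaussian state, the classic three sigma points with spread parameter `kappa`, and a hypothetical function name; it is not the multi-dimensional filter used in [1].

```python
import math


def unscented_transform_1d(f, mean, variance, kappa=2.0):
    """Propagate a 1-D Gaussian (mean, variance) through a nonlinear f.

    Uses three sigma points: the mean and two points placed symmetrically
    at +/- sqrt((1 + kappa) * variance). Returns the approximate mean and
    variance of f(x).
    """
    n = 1  # state dimension in this toy example
    spread = math.sqrt((n + kappa) * variance)
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    values = [f(p) for p in points]
    out_mean = sum(w * v for w, v in zip(weights, values))
    out_var = sum(w * (v - out_mean) ** 2 for w, v in zip(weights, values))
    return out_mean, out_var
```

For a linear function the result is exact: `unscented_transform_1d(lambda x: 2 * x + 1, 1.0, 4.0)` gives a mean of 3 and a variance of 16 (up to rounding). For a nonlinear function such as `x ** 2` it reproduces the true mean without computing any derivative, which is the appeal over a first-order linearization.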

23 X’>X best ?

24 Experiments: 28 real depth image sequences. Ground truth: tracking markers. Error is measured between the real marker position and the estimated position, separating perfect tracks from faulty tracking. Compared 3 algorithms: EP, HC, HC+EP.

25 Best: HC+EP; worst: EP. Runs close to real time. HC: 6 frames per second. HC+EP: 4-6 frames per second. Fig From [1]

26 HC HC+EP Lose track Extreme case – 27: Fig From [1]

27 Limitations: manual initialization; cannot track more than one person at a time; using temporal data consumes more time and creates a reinitialization problem. Future work: improving the speed; combining with color cameras; fully automatic model initialization; tracking more than one person.

28 Well written. Self-contained. Novel combination of existing parts. New technology. Achieves its goals (real time). Missing examples on the probabilistic model. Not clear how is defined. Extensively validated: data set and code available. Not enough visual examples in the article. No comparison to different algorithms.

29

30 2.1 previous work 2.2 What’s new? 2.3 Overview 2.4 results 2.5 limitations & future work 2.6 Evaluation

31 Same as Article [1].

32 Using no temporal information: robust and fast (200 frames per second). Object recognition approach. Per-pixel classification. Large and highly varied training dataset. Fig From [2]

33 1. Database construction 2. Body part inference and joint proposals: Goals: computational efficiency and robustness

34 Pose estimation research has often had to overcome a lack of training data. Why? Huge color and texture variability. Computer simulations don’t produce the range of volitional motions of a human subject.

35 Fig From [2]

36

37 1. Body part labeling 2. Depth image features 3. Randomized decision forests 4. Joint position proposals

38 31 body parts labeled. The problem can now be solved by efficient classification algorithms. Fig From [2]

39 Simple depth comparison features, Eq. (1): d_I(x) is the depth at pixel x in image I; normalizing the offsets by depth makes the features depth invariant. Computational efficiency: no preprocessing. Fig From [2]
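
The depth comparison feature of [2] has the form f(I, x) = d_I(x + u / d_I(x)) - d_I(x + v / d_I(x)), where u and v are 2-D pixel offsets. A minimal sketch follows, assuming the depth image is a list of rows with strictly positive depths, offsets are rounded to the nearest pixel, and probes that fall outside the image return a large constant; the names `depth_at`, `depth_feature`, and `LARGE_DEPTH` are made up for this example.

```python
LARGE_DEPTH = 1e6  # assumed stand-in for background / off-image probes


def depth_at(image, x, y):
    """Depth at (x, y), or LARGE_DEPTH when the probe leaves the image."""
    row, col = int(round(y)), int(round(x))
    if 0 <= row < len(image) and 0 <= col < len(image[0]):
        return image[row][col]
    return LARGE_DEPTH


def depth_feature(image, pixel, u, v):
    """Difference of two depth probes whose offsets are scaled by 1 / depth."""
    x, y = pixel
    d = depth_at(image, x, y)  # assumed > 0, otherwise the division fails
    first = depth_at(image, x + u[0] / d, y + u[1] / d)
    second = depth_at(image, x + v[0] / d, y + v[1] / d)
    return first - second
```

Dividing the offsets by the depth at x means a body part twice as far from the camera is probed with offsets half as long in pixels, so the feature responds to the same physical neighbourhood. Each feature costs three image reads and a few arithmetic operations, which is why no preprocessing is needed.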

40 How does it work? Each node = a feature. Classify pixel x: Fig From [2]
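
Classifying a pixel with a forest can be sketched as follows. The tree layout (nested dictionaries with `theta`, `tau`, `left`, `right`, and `leaf` keys), the "go left when the feature is below the threshold" convention, and the function name are assumptions made for illustration; only the overall idea (descend each tree, then average the leaf distributions) follows the slide.

```python
def classify_pixel(forest, feature_value):
    """Average the leaf distributions reached by one pixel in every tree.

    forest: list of trees, each a nested dict (hypothetical layout).
    feature_value: callable mapping a node's feature parameters `theta`
        to the feature response for the pixel being classified.
    """
    totals = {}
    for tree in forest:
        node = tree
        while "leaf" not in node:
            # Compare the node's feature against its threshold and descend.
            go_left = feature_value(node["theta"]) < node["tau"]
            node = node["left"] if go_left else node["right"]
        for label, prob in node["leaf"].items():
            totals[label] = totals.get(label, 0.0) + prob / len(forest)
    return totals
```

The returned dictionary is a distribution over body part labels for that pixel; taking its largest entry gives the per-pixel label.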

41 Training algorithm: 1M images, 2000 pixels per image (H = entropy). Training 3 trees to depth 20 on 1M images takes about 1 day on a 1000-core cluster. 1M images × 2000 pixels × 2000 × 50 = 2 × 10^14.
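
The entropy H mentioned on this slide drives the choice of split at each node: the candidate that most reduces label entropy wins. The sketch below assumes Shannon entropy in bits and a single feature with several candidate thresholds; the function names and the sample layout (pairs of feature value and label) are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(samples, threshold):
    """Entropy reduction from splitting (feature_value, label) pairs.

    Assumes `samples` is non-empty.
    """
    left = [label for value, label in samples if value < threshold]
    right = [label for value, label in samples if value >= threshold]
    everything = [label for _, label in samples]
    n = len(samples)
    return (entropy(everything)
            - len(left) / n * entropy(left)
            - len(right) / n * entropy(right))


def best_threshold(samples, thresholds):
    """Pick the candidate threshold with the largest information gain."""
    return max(thresholds, key=lambda t: information_gain(samples, t))
```

Every candidate split has to be scored against every training pixel, which is where the very large operation count on the slide comes from.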

42 Fig From [2] Trained tree:

43 Local mode finding approach based on mean shift with a weighted Gaussian kernel. Density estimator: Fig From [4]
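
Mode finding with a weighted Gaussian kernel can be sketched as a fixed-point iteration. The sketch assumes a kernel of the form w_i * exp(-||x - x_i||^2 / b^2) with a single bandwidth b, and points given as coordinate tuples; the function name and defaults are illustrative and the per-pixel weights used in [2] are not reproduced here.

```python
import math


def mean_shift_mode(points, weights, start, bandwidth, iterations=50, tol=1e-6):
    """Climb to a local mode of a weighted Gaussian kernel density estimate."""
    current = list(start)
    for _ in range(iterations):
        numerator = [0.0] * len(current)
        denominator = 0.0
        for point, weight in zip(points, weights):
            sq_dist = sum((a - b) ** 2 for a, b in zip(point, current))
            k = weight * math.exp(-sq_dist / bandwidth ** 2)
            denominator += k
            for i, coord in enumerate(point):
                numerator[i] += k * coord
        if denominator == 0.0:
            break  # every kernel weight underflowed; nothing to move toward
        updated = [value / denominator for value in numerator]
        shift = math.sqrt(sum((a - b) ** 2 for a, b in zip(updated, current)))
        current = updated
        if shift < tol:
            break  # converged
    return current
```

Starting the iteration from several high-probability pixels and keeping the distinct end points yields a set of candidate joint positions; a distant outlier has almost no influence because its kernel weight is vanishingly small.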

44 Experiments: 8800 frames of real depth images. 5000 synthetic depth images. Also evaluated on the Article [1] dataset. Measures: 1. Classification accuracy (confusion matrix). 2. Joint accuracy: mean average precision (mAP); results within D = 0.1 m count as true positives (TP).
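
The true-positive rule for joint accuracy can be sketched as below. It assumes proposals are ranked by confidence and that only the first proposal within D of the ground truth counts as a true positive, with later ones counted as false positives; the function name and the proposal layout (confidence, position) are hypothetical.

```python
import math


def label_proposals(proposals, truth, max_dist=0.1):
    """Flag each joint proposal as a true positive (True) or not (False).

    proposals: list of (confidence, (x, y, z)) in meters.
    Returns flags in descending order of confidence.
    """
    ranked = sorted(proposals, key=lambda p: p[0], reverse=True)
    matched = False
    flags = []
    for _, position in ranked:
        hit = (not matched) and math.dist(position, truth) <= max_dist
        flags.append(hit)
        matched = matched or hit
    return flags
```

These flags, accumulated over all test frames, are the input from which a precision-recall curve and the average precision per joint can be computed.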

45 Fig From [2]

46 High correlation between results on real and synthetic data. Tree depth is the most effective parameter. Fig From [2]

47 Comparing the algorithm on: the real set (red), mAP 0.731; the ground truth set (blue), mAP 0.914; mAP 0.984 for the upper body. Fig From [2]

48 Comparing the algorithm to ideal nearest neighbor (NN) matching and to a realistic NN variant, chamfer NN. Fig From [2]

49 Comparison to Article [1]: run on the same dataset; better results (even without temporal data); runs 10x faster. Fig From [2]

50 Full rotations and multiple people. Right-left ambiguity. mAP of 0.655 (good for our uses). Result video. Fig From [2]

51 Faster proposals: using simple bottom-up clustering instead of mean shift. Mean shift: 50 fps, 0.731 mAP. Simple clustering: 200 fps, 0.677 mAP.

52 Future work: a better synthesis pipeline. Is there an efficient approach that directly regresses joint positions? (Already done in follow-up work: Efficient offset regression of body joint positions.)

53 Well written. Self-contained. Novel combination of existing parts. New technology. Achieves its goals (real time). Extensively validated: used in a real console. Many result graphs and examples (plus a separate PDF of supplementary material). Broad comparison to other algorithms. Data set and code not available.

54 [1] Real Time Motion Capture Using a Single TOF Camera (V. Ganapathi et al., 2010). [2] Real-Time Human Pose Recognition in Parts from Single Depth Images (Shotton et al. & Xbox Incubation, 2011). [3] Real time identification and localization of body parts from depth images (C. Plagemann et al., 2010). [4] Computer Graphics course (046746), Technion.

55

