Po-Hsiang Chen Advisor: Sheng-Jyh Wang 2/13/2012.


1 Po-Hsiang Chen Advisor: Sheng-Jyh Wang 2/13/2012

2 Shotton, J., A. Fitzgibbon, et al. (2011). "Real-Time Human Pose Recognition in Parts from Single Depth Images." Microsoft Research Cambridge & Xbox Incubation. CVPR 2011 Best Paper.
Freedman, B., A. Shpunt, et al. (2008). Depth mapping using projected patterns, US 2010/ A1. PrimeSense patent.

3 Outline
What is Kinect?
Kinect Architecture
From IR to depth image
  History of Structured Light
  PrimeSense Invented Structured Light
From depth image to joint positions
  Body Part Inference
  Joint Proposals
Experiments and Results
Conclusion
References

4 What is Kinect?

5 Motion sensing input device by Microsoft
Depth camera technology developed by PrimeSense (invented in 2005)
Software technology developed by Rare
First announced at E3 as “Project Natal”
Windows SDK releases: /en-us/kinectforwindows/discover/features.aspx


7 Kinect Architecture

8 IR → (Structured Light) → Depth Image → (Random Decision Forest) → Body Parts → (Mean Shift) → Joint Position

9 From IR to depth image


11 Main Problem
To recover shape from multiple views, we need correspondences between the images.
The matching (correspondence) problem is hard: occlusions, texture, colors, etc.
Solution: structured light
Idea: simplify matching
Strategy: use illumination to create your own correspondences

12 Basic Principle
Use a projector to create unambiguous correspondences.
Light projection: if we project a single point, matching is unique.

13 Line projection (line scan)
For calibrated cameras, the epipolar geometry is known.
Project a line instead of a single point.

14 Project multiple stripes or grids
Which stripe matches which? The correspondence problem again.

15 Answer 1: Assume surface continuity (ordering constraint)

16 Answer 2: Coloured stripes (De Bruijn)
Difficult to use on coloured surfaces
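A De Bruijn sequence makes every window of n consecutive stripes unique, so a short run of observed colours identifies its position in the pattern. The sketch below uses the standard recursive construction; the three-colour alphabet and the window length of 3 are assumptions chosen for illustration, not values from the slides.

```python
def de_bruijn(k, n):
    """Return a cyclic De Bruijn sequence B(k, n) as a list of symbols 0..k-1.

    Every possible word of length n over k symbols appears exactly once
    as a (cyclic) window of the returned sequence, which has length k**n.
    """
    a = [0] * (k * n)
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return sequence


# Hypothetical colour alphabet for the projected stripes.
COLOURS = ["red", "green", "blue"]
stripe_colours = [COLOURS[s] for s in de_bruijn(len(COLOURS), 3)]
```

With 3 colours and windows of 3 stripes this gives 27 stripes, each identifiable from its own colour and those of its two neighbours.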

17 Answer 2: Coloured dots (M-array)
Difficult to use on coloured surfaces

18 Answer 3: Pattern dots (M-array)
Difficult for industrial manufacturing

19 Answer 4: Time-coded light patterns (time multiplexing)
Use a sequence of binary patterns: N stripes need only about log2(N) images.
Each stripe has a unique binary illumination code.
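To make the log2(N) claim concrete, here is a minimal sketch that builds plain binary stripe patterns and decodes the on/off sequence seen at one camera pixel. The function names are hypothetical, and real systems often prefer Gray codes, which this sketch does not use.

```python
import numpy as np


def binary_stripe_patterns(num_stripes):
    """Return a 0/1 array of shape (num_images, num_stripes).

    Row k is the k-th projected image; column j, read top to bottom
    (most significant bit first), is the binary code of stripe j.
    """
    num_images = max(1, (num_stripes - 1).bit_length())  # ceil(log2(N))
    stripe_ids = np.arange(num_stripes)
    shifts = np.arange(num_images - 1, -1, -1)
    return (stripe_ids[None, :] >> shifts[:, None]) & 1


def decode_stripe(bits):
    """Recover the stripe index from the bit sequence observed at a pixel."""
    index = 0
    for bit in bits:
        index = (index << 1) | int(bit)
    return index


patterns = binary_stripe_patterns(8)       # 3 images are enough for 8 stripes
stripe = decode_stripe(patterns[:, 5])     # a pixel lit by stripe 5 decodes to 5
```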

20 All of the above are categorized as discrete methods.
There are many more continuous structured light methods, such as phase shifting.
Salvi, J., S. Fernandez, et al. (2010). "A state of the art in structured light patterns for surface profilometry." Pattern Recognition 43(8):

21 All of the above are human-designed patterns.
Random speckle: structured light using randomly generated patterns
May obtain denser depth information by solving the correspondence problem

22 A projector is just the inverse of a camera.
One projector and one camera are enough for triangulation.
Calibration is needed.

23 US 2010/
Projector-camera system
Already calibrated structure
δZ results in δX in 32
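The link between a depth change δZ and an image shift δX can be sketched with a common simplified triangulation model. This is an illustration under stated assumptions, not the method of the patent: the pattern is assumed to have been recorded at a known reference depth, the shift is assumed to follow shift = f · b · (1/Z − 1/Z_ref), and the focal length and baseline values below are hypothetical.

```python
def shift_from_depth(depth_m, focal_px, baseline_m, ref_depth_m):
    """Pattern shift in pixels, relative to the reference plane (assumed model)."""
    return focal_px * baseline_m * (1.0 / depth_m - 1.0 / ref_depth_m)


def depth_from_shift(shift_px, focal_px, baseline_m, ref_depth_m):
    """Invert the model above: recover depth from the observed pattern shift."""
    inverse_depth = 1.0 / ref_depth_m + shift_px / (focal_px * baseline_m)
    return 1.0 / inverse_depth


# Hypothetical calibration values, for illustration only.
FOCAL_PX = 580.0
BASELINE_M = 0.075
REF_DEPTH_M = 2.0

shift = shift_from_depth(1.5, FOCAL_PX, BASELINE_M, REF_DEPTH_M)
depth = depth_from_shift(shift, FOCAL_PX, BASELINE_M, REF_DEPTH_M)
```

Under this model a point on the reference plane produces zero shift, and closer points produce a positive shift.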

24 US 2010/
Structured Light 1: pseudo-random distribution
Local: random
Global: gray level decreases
Can make a rough estimate in a low-resolution image

25 US 2010/
Structured Light 2: quasi-periodic pattern
Five-fold symmetry
Results in distinct peaks in the frequency domain
Contains no unit cell that repeats over the spatial domain
Used to reduce noise and ambient light from the environment


27 US 2010/

28 US 2010/
Uses a special (“astigmatic”) lens with different focal lengths in the x- and y-directions
Orientation of the circle indicates depth

29 From depth image to joint positions

30 Shotton, J., A. Fitzgibbon, et al. (2011). "Real-Time Human Pose Recognition in Parts from Single Depth Images." Microsoft Research Cambridge & Xbox Incubation
Treats body segmentation as a per-pixel classification task (no pairwise term or CRF is used)
The algorithm runs in 5 ms per frame on the Xbox GPU
Novelty: intermediate body parts representation

31 Body part labeling
31 body parts
Distinct parts for left and right allow the classifier to disambiguate the left and right sides of the body

32 Depth image features
d_I(x) is the depth at pixel x in image I
θ = (u, v) describes the offsets u and v
Each feature needs to read at most 3 image pixels and perform at most 5 arithmetic operations
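In Shotton et al. the feature is the depth difference f_θ(I, x) = d_I(x + u / d_I(x)) − d_I(x + v / d_I(x)), where dividing the offsets by d_I(x) makes the feature depth invariant. A minimal sketch follows; the helper names, the (row, column) pixel convention, and the constant returned for probes that fall outside the image are assumptions of this example.

```python
import numpy as np

# Large depth returned for probes outside the image (value assumed here).
BACKGROUND_DEPTH = 1e6


def probe(depth, pixel):
    """Read the depth at a (row, col) position, rounding to the nearest pixel."""
    r, c = int(round(pixel[0])), int(round(pixel[1]))
    if 0 <= r < depth.shape[0] and 0 <= c < depth.shape[1]:
        return float(depth[r, c])
    return BACKGROUND_DEPTH


def depth_feature(depth, x, u, v):
    """Depth comparison feature: three pixel reads, a handful of arithmetic ops."""
    x = np.asarray(x, dtype=float)
    d = probe(depth, x)                      # read 1: depth at the pixel itself
    near = probe(depth, x + np.asarray(u) / d)   # read 2: first offset probe
    far = probe(depth, x + np.asarray(v) / d)    # read 3: second offset probe
    return near - far


# Toy depth image: a surface at 2 m with a step to 4 m from column 6 onward.
depth = np.full((10, 10), 2.0)
depth[:, 6:] = 4.0
response = depth_feature(depth, (5, 5), (0, 4), (0, 0))   # probes columns 7 and 5
```

The feature responds strongly near depth discontinuities and returns zero on flat regions, which is what makes such a cheap test informative.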

33 Fast and effective multi-class classifier
Each split node consists of a feature f_θ and a threshold τ
At the leaf node reached in tree t, a learned distribution over body part labels is given
Final classification: average the distributions of all trees in the forest
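The traversal and averaging described on this slide can be sketched as follows. The dictionary layout of the trees, the "go left when the feature is below τ" convention, and the toy feature function are all assumptions made for illustration.

```python
import numpy as np


def classify_with_tree(node, sample, feature_fn):
    """Walk from the root to a leaf and return the leaf's class distribution."""
    while "leaf" not in node:
        go_left = feature_fn(sample, node["theta"]) < node["tau"]
        node = node["left"] if go_left else node["right"]
    return np.asarray(node["leaf"], dtype=float)


def classify_with_forest(trees, sample, feature_fn):
    """Average the per-tree leaf distributions to get the final distribution."""
    return np.mean([classify_with_tree(t, sample, feature_fn) for t in trees], axis=0)


# Toy forest with two classes; theta simply indexes into the sample vector.
def toy_feature(sample, theta):
    return sample[theta]


tree_a = {"theta": 0, "tau": 0.5,
          "left": {"leaf": [0.9, 0.1]}, "right": {"leaf": [0.2, 0.8]}}
tree_b = {"theta": 1, "tau": 0.0,
          "left": {"leaf": [0.6, 0.4]}, "right": {"leaf": [0.1, 0.9]}}

posterior = classify_with_forest([tree_a, tree_b], [0.3, 1.0], toy_feature)
```

In the pose recognition setting the sample would be a pixel of a depth image and the feature function a depth comparison, but the control flow stays the same.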

34 Multiple classifiers work together
Committees: e.g. averaging the predictions of a set of individual models, or majority votes
Boosting: classifiers trained in sequence, e.g. AdaBoost
Decision tree: binary selection corresponding to the traversal of a tree

35 Three major aspects
A splitting criterion
A stop-splitting rule
A rule to assign each leaf to a specific class
Decision forest: a committee of decision trees

36 Fast and effective multi-class classifier
Each split node consists of a feature f_θ and a threshold τ
At the leaf node reached in tree t, a learned distribution over body part labels is given
Final classification: average the distributions of all trees in the forest
How to train?

37 Training
Each tree is trained on a different set of images
2000 example pixels are picked from each image
Algorithm

38 Algorithm (cont.)
Shannon entropy given Z on Y
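A common way to use Shannon entropy when training a tree is to pick, at each node, the split with the largest information gain: the entropy of the labels at the node minus the size-weighted entropy of the two children. The sketch below assumes that criterion and a small set of candidate thresholds; the function names are hypothetical.

```python
import numpy as np


def shannon_entropy(labels, num_classes):
    """Entropy (in bits) of the normalized label histogram."""
    labels = np.asarray(labels, dtype=int)
    if labels.size == 0:
        return 0.0
    counts = np.bincount(labels, minlength=num_classes)
    p = counts[counts > 0] / labels.size
    return float(-(p * np.log2(p)).sum())


def information_gain(labels, go_left, num_classes):
    """Entropy at the node minus the weighted entropy of the left/right subsets."""
    labels = np.asarray(labels, dtype=int)
    go_left = np.asarray(go_left, dtype=bool)
    gain = shannon_entropy(labels, num_classes)
    for subset in (labels[go_left], labels[~go_left]):
        gain -= subset.size / labels.size * shannon_entropy(subset, num_classes)
    return gain


def best_threshold(feature_values, labels, thresholds, num_classes):
    """Return (threshold, gain) for the candidate with the largest gain."""
    feature_values = np.asarray(feature_values, dtype=float)
    gains = [information_gain(labels, feature_values < t, num_classes)
             for t in thresholds]
    best = int(np.argmax(gains))
    return thresholds[best], gains[best]


tau, gain = best_threshold([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1],
                           [0.05, 0.5, 0.85], num_classes=2)
```

In this toy example the threshold 0.5 separates the two classes perfectly, so it yields the full 1 bit of gain.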

39 Algorithm (cont.)
Training takes a lot of effort: 3 trees of depth 20 trained from 1 million images take about a day on a 1000-core cluster
Where does the training data come from?

40 Depth imaging
Simplifies the task of background subtraction
Most important: easy to synthesize!
Take real images → learn synthesis parameters → generate lots of training data

41 IR → (Structured Light) → Depth Image → (Random Decision Forest) → Body Parts → (Mean Shift) → Joint Position

42 From the previous section, use mean shift with a weighted Gaussian kernel

43 Kernel density estimator
Discrete points → continuous function
Calculate the gradient at the initial point and shift
Iterate until convergence
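The mode-seeking loop can be sketched as a weighted mean shift with a Gaussian kernel: each iteration moves the estimate to the kernel-weighted mean of the points, which follows the gradient of the density estimate. The bandwidth, tolerance, and toy data below are assumptions; Shotton et al. weight each pixel by its body part probability and squared depth, whereas this example simply takes the weights as an input.

```python
import numpy as np


def mean_shift_mode(points, weights, start, bandwidth, tol=1e-6, max_iter=200):
    """Find a mode of the weighted Gaussian kernel density, starting from `start`."""
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    x = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        sq_dist = np.sum((points - x) ** 2, axis=1)
        kernel = weights * np.exp(-sq_dist / bandwidth ** 2)
        new_x = (kernel[:, None] * points).sum(axis=0) / kernel.sum()
        if np.linalg.norm(new_x - x) < tol:   # shift is negligible: converged
            return new_x
        x = new_x
    return x


# Toy 3D points: a tight cluster around (0, 0, 2) plus one distant outlier.
points = [[0.0, 0.0, 2.0], [0.02, 0.0, 2.0], [-0.02, 0.0, 2.0],
          [0.0, 0.02, 2.0], [0.0, -0.02, 2.0], [1.0, 1.0, 3.0]]
weights = np.ones(len(points))
mode = mean_shift_mode(points, weights, start=[0.05, 0.05, 2.0], bandwidth=0.1)
```

Because the kernel decays quickly with distance, the outlier has almost no influence and the estimate settles on the cluster centre.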

44 Experiments and Results

45 [Figure panels: Synthetic, Real]

46 Failure

47 Training parameters vs. classification accuracy

48 Comparisons

49 Conclusion

50 Depth images may contain enough information to solve human pose problems
Depth images are color and texture invariant, which greatly simplifies the correspondence problem
A deep combined model with sufficient training data can become a good classifier even with simple features
Buy a Kinect for the lab

51 References

52 Shotton, J., A. Fitzgibbon, et al. (2011). "Real-Time Human Pose Recognition in Parts from Single Depth Images." Microsoft Research Cambridge & Xbox Incubation
Freedman, B., A. Shpunt, et al. (2008). Depth mapping using projected patterns, US 2010/ A1
Freedman, B., A. Shpunt, et al. (2008). Distance-Varying Illumination and Imaging Techniques for Depth Mapping, US 2010/ A1

53 Salvi, J., S. Fernandez, et al. (2010). "A state of the art in structured light patterns for surface profilometry." Pattern Recognition 43(8):
Albitar, I., P. Graebling, et al. (2007). "Robust structured light coding for 3D reconstruction," IEEE.
Scharstein, D. and R. Szeliski (2003). "High-accuracy stereo depth maps using structured light," IEEE.
Breiman, L. (2001). "Random forests." Machine learning 45(1):
Amit, Y. and D. Geman (1997). "Shape quantization and recognition with randomized trees." Neural computation 9(7):

54 John MacCormick, "How does the Kinect work?" users.dickinson.edu/~jmac/selected-talks/kinect.pdf
"Structured Light", structured.pdf
the-anandtech-review/2
Chen, Y. S. and B. T. Chen (2003). "Measuring of a three-dimensional surface by use of a spatial distance computation." Applied optics 42(11):

