1
DISCRIMINATIVELY TRAINED DENSE SURFACE NORMAL ESTIMATION ANDREW SHARP
2
OUTLINE
- Problem
- Why
- Approach
- Overall Implementation
- Experimental Evaluation
- Alternative Methods
- Remaining Work
- Discussion
3
PROBLEM STATEMENT
How can we estimate dense surface normals from a single RGB image?
4
WHY DO THIS?
- My research is focused on semi-autonomous robotic manipulator behaviors
- One of my applications is Variable Normal Surface Virtual Fixtures
- If the object is known ahead of time, calculating offset surfaces is easy
- Calculating surface normals from sensor data would greatly expand applicability
5
WHY IS IT DIFFICULT?
- Trying to extract 3D data from a 2D image
- Ground truth data is required for learning approaches
- Previous methods relied on knowledge of the underlying physics of light and shading, which is not applicable to general problems
- Multiple datasets are now available thanks to cheap and effective depth sensors such as the Kinect, RealSense, etc.
6
APPROACH
- Pixel-based labeling
  - Context-based representations
  - Classifier typically noisy and does not follow object boundaries
- Segment-based labeling
  - Based on feature statistics in segments
  - Segments are label-consistent
  - Only one segmentation used
- Joint approach
  - Segment representation converted to a pixel representation
  - Representation of a pixel is the same as that of the segment it belongs to
  - Equivalent to a weighted segment-based approach
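The joint approach above can be sketched in a few lines: each pixel's own features are concatenated with the features of the segment containing it, so a per-pixel classifier implicitly sees segment statistics. This is an illustrative sketch; the function and argument names (`joint_features`, `seg_ids`, etc.) are hypothetical, not from the paper's code.

```python
import numpy as np

def joint_features(pixel_feats, seg_ids, seg_feats):
    """Build the joint pixel/segment representation (illustrative sketch).

    pixel_feats: (N, Dp) per-pixel features
    seg_ids:     (N,) index of the segment each pixel belongs to
    seg_feats:   (S, Ds) per-segment features
    Returns an (N, Dp + Ds) array: each pixel keeps its own features
    plus a copy of its segment's features.
    """
    return np.concatenate([pixel_feats, seg_feats[seg_ids]], axis=1)
```

Because segment features are simply broadcast to every member pixel, a classifier trained on this representation behaves like a weighted segment-based approach, as the slide notes.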
7
TRAINING
- To simplify the regression problem, normals are clustered using K-means after being projected onto the unit half-sphere
- Each normal is represented as a weighted sum of cluster centers using local coding
- Learning is formulated as a regression into local coding coordinates
- Standard AdaBoost is extended to build a strong classifier as a sum of weak classifiers for continuous ground truth labels
L. Ladicky (https://www.inf.ethz.ch/personal/ladickyl/normals_eccv14poster.pdf)
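The clustering and local-coding steps above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the half-sphere flip, the cosine-similarity K-means, and the `m`-nearest-center weighting are my assumptions about reasonable details, and all names are hypothetical.

```python
import numpy as np

def cluster_normals(normals, k=8, iters=25, seed=0):
    """K-means on unit normals flipped onto one half-sphere (sketch)."""
    rng = np.random.default_rng(seed)
    # Flip normals so they all lie on the half-sphere with z >= 0.
    n = np.where(normals[:, 2:3] < 0, -normals, normals)
    centers = n[rng.choice(len(n), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each normal to the nearest center by cosine similarity.
        labels = np.argmax(n @ centers.T, axis=1)
        for j in range(k):
            m = n[labels == j].sum(axis=0)
            norm = np.linalg.norm(m)
            if norm > 0:
                centers[j] = m / norm      # re-normalize back onto the sphere
    return centers

def local_coding(normal, centers, m=3):
    """Represent a normal by weights over its m nearest cluster centers."""
    sims = centers @ normal
    idx = np.argsort(sims)[::-1][:m]       # m most similar centers
    w = np.clip(sims[idx], 1e-12, None)    # keep weights positive
    return idx, w / w.sum()                # weights sum to 1
```

The regression target for a pixel is then the vector of local-coding weights rather than a raw 3D normal, which is what the extended AdaBoost learns.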
8
PREDICTION
- The most probable triangle of the Delaunay triangulation of cluster centers is found by maximizing the classifier response (equation shown on slide)
- Local coding coefficients are found as an expected value of the probabilistic interpretation (equation shown on slide)
- The normal is recovered by projecting the weighted sum of cluster centers back onto the unit sphere
L. Ladicky (https://www.inf.ethz.ch/personal/ladickyl/normals_eccv14poster.pdf)
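The final recovery step above is simple enough to sketch directly: take the predicted local-coding weights over the selected cluster centers, form their weighted sum, and project it back onto the unit sphere. The function name and signature are illustrative assumptions.

```python
import numpy as np

def recover_normal(centers, idx, weights):
    """Recover a unit normal from local-coding output (sketch).

    centers: (K, 3) unit cluster centers
    idx:     indices of the selected centers (e.g. a triangle's vertices)
    weights: matching local-coding coefficients, summing to 1
    """
    v = weights @ centers[idx]          # weighted sum of selected centers
    return v / np.linalg.norm(v)        # project back onto the unit sphere
```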
9
GROUND TRUTH ACQUISITION
- Obtain the underlying 3D scene with laser scanners, commodity depth sensors, or stereo cameras
- Denoise with Total Generalized Variation (http://www.uni-graz.at/imawww/optcon/projects/bredies/tgv.html)
- Normals are then computed on the 3D point cloud for each point over a local 3D spatial neighborhood
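The last step, computing a normal per point from a local neighborhood, is commonly done by PCA: the normal is the direction of least variance among a point's nearest neighbors. The sketch below illustrates that idea only; it omits the TGV denoising and uses a brute-force neighbor search, and all names are hypothetical.

```python
import numpy as np

def point_normals(points, k=10):
    """Per-point normals via PCA over the k nearest neighbors (sketch)."""
    # Brute-force pairwise distances; fine for small clouds.
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    nbrs = np.argsort(d, axis=1)[:, :k]          # k nearest (includes self)
    normals = np.empty_like(points, dtype=float)
    for i, nn in enumerate(nbrs):
        centered = points[nn] - points[nn].mean(axis=0)
        # Right singular vector of the smallest singular value =
        # direction of least variance = the surface normal (up to sign).
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        normals[i] = vt[-1]
    return normals
```

Note the sign ambiguity: PCA gives a normal up to ±, which is why methods like this one cluster on a half-sphere or orient normals toward the sensor viewpoint.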
10
OVERALL IMPLEMENTATION
- No geometric priors (vanishing points, Manhattan world)
- Strong classifier consists of 5000 weak classifiers
- Pixel and multiple segment representations concatenated into one vector
- 4 dense features (Texton, SIFT, LTP, Self-Similarity)
- 4×4 segmentation methods (MeanShift, SLIC, Normalized Cut, GC-based)
- BOW representations over context rectangles and segments
- Evaluated for 5 color spaces (Grey, Lab, Luv, Rgb, Opponent)
L. Ladicky (https://www.inf.ethz.ch/personal/ladickyl/normals_eccv14.pdf)
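The "strong classifier as a sum of weak classifiers" structure can be illustrated with a toy residual-fitting booster over decision stumps. This is only a sketch of the general boosting idea, not the paper's extended AdaBoost: the threshold grid, the full-step residual update, and all names here are my simplifications.

```python
import numpy as np

def fit_stumps(X, y, rounds=30):
    """Toy strong regressor: a sum of threshold stumps fit to residuals."""
    stumps, resid = [], y.astype(float).copy()
    for _ in range(rounds):
        best = None
        for f in range(X.shape[1]):
            # Candidate thresholds at a few percentiles of each feature.
            for t in np.percentile(X[:, f], [25, 50, 75]):
                mask = X[:, f] > t
                if mask.all() or not mask.any():
                    continue
                lo, hi = resid[~mask].mean(), resid[mask].mean()
                err = ((resid - np.where(mask, hi, lo)) ** 2).sum()
                if best is None or err < best[0]:
                    best = (err, f, t, lo, hi)
        if best is None:
            break
        _, f, t, lo, hi = best
        stumps.append((f, t, lo, hi))
        resid -= np.where(X[:, f] > t, hi, lo)   # fit the next stump to what's left
    return stumps

def predict(stumps, X):
    """Strong prediction = sum of all weak (stump) predictions."""
    out = np.zeros(len(X))
    for f, t, lo, hi in stumps:
        out += np.where(X[:, f] > t, hi, lo)
    return out
```

The paper's version differs in the weak-learner family and loss, but the additive structure, thousands of cheap weak classifiers summed into one strong regressor, is the same.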
11
EXPERIMENTAL EVALUATION
- NYU2 dataset (indoor scenes)
  - 795 training & 654 test images, resolution 640 × 480
  - Depth data obtained by a Kinect sensor
  - Training took 3 weeks on five 8-core machines
- KITTI dataset (outdoor scenes)
  - 194 training & 195 test images
  - Depth data obtained by a Velodyne laser scanner
  - Training took 5 days on five 8-core machines
12
EXPERIMENTAL RESULTS
- Limited quantitative success
- Reference variation in the ground truth data
L. Ladicky (https://www.inf.ethz.ch/personal/ladickyl/normals_eccv14.pdf)
13
ALTERNATIVE METHODS
- Not a well-researched area
- Early approaches relied on strong assumptions about:
  - The underlying physics of light and shading
  - Knowledge of the locations of light sources
  - Material properties
- Other approaches used 3D primitives, iterative optimization, and an SVM-based detector to determine surface normals (https://www.cs.cmu.edu/~dfouhey/dfouhey_primitives.pdf)
14
REMAINING WORK
- Significantly outperforms the state of the art, but is it actually useful?
- Authors' future research plans:
  - Single-view or 3D volumetric reconstruction, as a geometric prior for their regularization
- Other possible future work:
  - Other learning approaches such as neural nets
  - Use this information to augment point cloud analysis of surface normals
  - Possibility of integrating Point Cloud Library's Viewpoint Feature Histogram (VFH) for ground truth analysis
15
OUTLINE
- Problem
- Why
- Approach
- Overall Implementation
- Experimental Evaluation
- Alternative Methods
- Remaining Work
- Discussion
16
DISCUSSION
- Method efficiency / training & testing speeds
- Training on artificially generated data
- Deep models
Thanks!