Computing the Stereo Matching Cost with a Convolutional Neural Network

Presentation transcript:

Computing the Stereo Matching Cost with a Convolutional Neural Network
Žbontar & LeCun, CVPR 2015

Markus Herb
Seminar: Recent Trends in 3D Computer Vision
Supervisor: Benjamin Busam

Contents
- Stereo Vision
- Background
- State-of-the-Art
  - Stereo Pipeline
  - Matching Cost Computation
- MC-CNN
  - Network Architecture
  - Network Training & Prediction
- Results
- Conclusion & Future Work

Markus Herb, Computing Stereo Matching Cost with CNNs, November 12, 2015

Stereo Vision
- Capture scene with two cameras from different viewpoints
- Reconstruct 3D depth information from input images alone
- Cheap, passive sensors only
- Sophisticated processing needed
- Large number of applications in robotics, medical, …
Figure: left view, right view, and disparity map. Images from Middlebury 2014 Dataset [3].

Background: Epipolar Geometry
Figure: epipolar line l and epipolar line l'. Images by Hartley et al. [4].

Background: Rectification and Disparity
- Original images: epipolar lines in arbitrary directions
- Perform rectification for easier matching
- Rectified images: all epipolar lines horizontal and aligned
- Matches defined by horizontal offset (disparity)
- Depth inversely proportional to disparity
Images by Loop et al. [5].
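
The inverse relation between depth and disparity on this slide can be made concrete with the standard rectified-stereo formula Z = f * B / d. The sketch below assumes a focal length f in pixels and a baseline B in metres; the function name and the camera values are hypothetical and do not come from the slides.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Return depth in metres for a rectified pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: 700 px focal length, 0.5 m baseline.
for d in (10, 20, 40):
    # Doubling the disparity halves the depth.
    print(d, depth_from_disparity(d, 700.0, 0.5))
```

Because depth depends on 1/d, a fixed disparity error costs more depth accuracy for distant points (small d) than for nearby ones.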

State-of-the-Art: Stereo Pipeline
Pipeline stages:
1. Matching Cost Computation
2. Cost Aggregation
3. Disparity Computation
4. Disparity Refinement
- Compute matching cost at each pixel (x,y) for each disparity d
- Disparity Space Image (DSI): store cost in 3D cost volume C(x,y,d)
- Aggregate and optimize cost in DSI
- Compute and refine disparity map D(x,y) from DSI
Taxonomy according to Scharstein et al. [6].
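
A minimal sketch of the first and third pipeline stages, assuming grayscale images stored as nested lists and the convention that left pixel x matches right pixel x - d. It uses plain absolute differences as the matching cost and skips aggregation and refinement entirely; the function names are illustrative only.

```python
def ad_cost_volume(left, right, max_disp):
    """Build C[y][x][d] = |left[y][x] - right[y][x - d]| for rectified images."""
    height, width = len(left), len(left[0])
    inf = float("inf")
    volume = [[[inf] * (max_disp + 1) for _ in range(width)] for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for d in range(max_disp + 1):
                if x - d >= 0:  # the right-image pixel must lie inside the image
                    volume[y][x][d] = abs(left[y][x] - right[y][x - d])
    return volume


def winner_take_all(volume):
    """Pick, per pixel, the disparity with the lowest cost: D(x, y)."""
    return [[min(range(len(costs)), key=costs.__getitem__) for costs in row]
            for row in volume]


# Toy rectified pair: from x = 2 onward the left row equals the right row shifted by 2.
right = [[10, 50, 90, 30, 70, 20]]
left = [[0, 0, 10, 50, 90, 30]]
disparity = winner_take_all(ad_cost_volume(left, right, max_disp=3))
print(disparity)
```

Any other matching cost can be dropped into the innermost line without changing the rest of the pipeline, which is the slot a learned cost would fill.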

State-of-the-Art: Matching Cost Computation
- Absolute Differences (AD) / Squared Differences (SD)
- Normalized Cross Correlation (NCC) (e.g. [7])
- Census Transform [8]
  - Binary feature vector of intensity comparisons
  - Cost given by Hamming distance
- Mutual Information (MI) [9]
  - Cost based on joint entropy of image intensities
- Combined costs, e.g. AD + Census [10]
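
To illustrate the census cost listed above, here is a small sketch assuming a 3x3 window and the convention that a bit is 1 where the neighbour is darker than the centre pixel. The bit ordering and the helper names are choices made for this example, not details taken from [8].

```python
def census(image, x, y, radius=1):
    """Bit string for the window around (x, y): 1 where a neighbour is darker than the centre."""
    centre = image[y][x]
    bits = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dx == 0 and dy == 0:
                continue  # the centre pixel is not compared with itself
            bits = (bits << 1) | (1 if image[y + dy][x + dx] < centre else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two census descriptors."""
    return bin(a ^ b).count("1")


patch = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
brighter = [[v + 40 for v in row] for row in patch]  # same structure, different exposure
flipped = [row[::-1] for row in patch[::-1]]         # reversed intensity ordering

print(hamming(census(patch, 1, 1), census(brighter, 1, 1)))
print(hamming(census(patch, 1, 1), census(flipped, 1, 1)))
```

The descriptor depends only on the ordering of intensities, so a uniform brightness change leaves the cost untouched, whereas AD would report a difference of 40 at every pixel.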

MC-CNN: Network Architecture
- Network computes patch similarity
- Input: patches from left and right image at fixed disparity
- Parallel paths for both patches with tied weights
- ReLU after each layer
- Softmax output for good/bad match
- Matching cost given by bad-match output
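
The data flow described on this slide can be sketched as a toy forward pass. This is not the architecture from [1]: it assumes one fully connected layer per branch instead of the real convolutional layers, uses random untrained weights, and the layer sizes and all names are hypothetical. It only shows tied weights, ReLU, concatenation, and reading the cost from the bad-match softmax output.

```python
import math
import random


def relu_layer(weights, biases, inputs):
    """Fully connected layer followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matching_cost(left_patch, right_patch, branch, head):
    """Return the 'bad match' probability for two flattened patches."""
    # Tied weights: the same branch parameters process both patches.
    left_feat = relu_layer(branch["w"], branch["b"], left_patch)
    right_feat = relu_layer(branch["w"], branch["b"], right_patch)
    joint = left_feat + right_feat  # concatenate the two feature vectors
    logits = [sum(w * v for w, v in zip(row, joint)) + b
              for row, b in zip(head["w"], head["b"])]
    good, bad = softmax(logits)
    return bad


rng = random.Random(0)
PATCH, HIDDEN = 9, 4  # 3x3 patches and 4 features per branch (toy sizes)
branch = {"w": [[rng.uniform(-1, 1) for _ in range(PATCH)] for _ in range(HIDDEN)],
          "b": [0.0] * HIDDEN}
head = {"w": [[rng.uniform(-1, 1) for _ in range(2 * HIDDEN)] for _ in range(2)],
        "b": [0.0, 0.0]}
left_patch = [rng.random() for _ in range(PATCH)]
right_patch = [rng.random() for _ in range(PATCH)]
cost = matching_cost(left_patch, right_patch, branch, head)
print(cost)
```

With untrained weights the printed value carries no meaning; training is what makes the bad-match output low for true correspondences.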

MC-CNN: Network Training & Prediction
Training with ground truth data from datasets (KITTI, Middlebury)
- Training data
  - Positive examples: matching patches + small random offset
  - Negative examples: matching patches + large random offset
- Cost prediction & disparity computation
  - Network computes Disparity Space Image
  - Postprocessing of Disparity Space Image [11, 12]
  - Minimize cost in DSI along disparity for disparity map
  - Additional refinements and consistency checks
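
One way to read the positive/negative sampling rule is sketched below. It assumes the offsets are applied horizontally to the ground-truth match position in the right image; the offset bounds (pos_max, neg_low, neg_high) are illustrative values and names chosen for this example, not settings quoted from the slides.

```python
import random


def sample_pair_centres(x, true_disp, rng, pos_max=1, neg_low=4, neg_high=10):
    """Return right-image x coordinates for one positive and one negative example."""
    match_x = x - true_disp  # ground-truth correspondence in the right image
    # Positive: stay within a small offset of the true match.
    pos_offset = rng.randint(-pos_max, pos_max)
    # Negative: move clearly away from the true match, to either side.
    neg_offset = rng.choice([-1, 1]) * rng.randint(neg_low, neg_high)
    return match_x + pos_offset, match_x + neg_offset


rng = random.Random(0)
pos_x, neg_x = sample_pair_centres(x=120, true_disp=30, rng=rng)
print(pos_x, neg_x)
```

Keeping a gap between the positive and negative offset ranges avoids labelling near-correct patches as bad matches.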

Results: Disparity Space Image
Figure panels: Left Image, DSI after CNN, DSI after Postprocessing. Input images from KITTI 2015 Dataset [17].

Results: Disparity Map
Figure panels: Left Image, Disparity after CNN, Disparity after Postprocessing. Input images from KITTI 2015 Dataset [17].

Results: KITTI 2012 Benchmark
- State-of-the-art stereo and optical flow benchmark
- Autonomous driving scenario
- Ground truth obtained using 3D LiDAR

Rank | Method          | Out-Noc | Avg-Noc | Runtime | Environment
1    | MC-CNN-acrt [2] | 2.43%   | 0.7px   | 67s     | GTX Titan X
2    | Displets [14]   | 2.47%   |         | 265s    | > 8 core @ 3.0GHz
3    | MC-CNN [1]      | 2.61%   | 0.8px   | 100s    | GTX Titan
4    | PRSM [15]       | 2.78%   |         | 300s    | 1 core @ 2.5GHz
5    | SPS-StFl [16]   | 2.83%   |         | 35s     | 1 core @ 3.5GHz

KITTI 2012 [13] ranking on Nov 5, 2015, 3px threshold, all pixels.
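
The two error columns can be approximated as follows, assuming the outlier rate counts pixels whose disparity error exceeds the threshold and the average is the mean absolute disparity error over pixels with ground truth. Marking missing ground truth with None is a convention chosen for this sketch; the benchmark's own evaluation code may differ in details.

```python
def bad_pixel_rate(estimated, ground_truth, threshold_px=3.0):
    """Percentage of valid pixels whose disparity error exceeds threshold_px."""
    bad = valid = 0
    for est_row, gt_row in zip(estimated, ground_truth):
        for est, gt in zip(est_row, gt_row):
            if gt is None:  # no ground truth at this pixel (assumed convention)
                continue
            valid += 1
            if abs(est - gt) > threshold_px:
                bad += 1
    if valid == 0:
        raise ValueError("no pixels with ground truth")
    return 100.0 * bad / valid


def mean_abs_error(estimated, ground_truth):
    """Average absolute disparity error in pixels over valid pixels."""
    errors = [abs(est - gt)
              for est_row, gt_row in zip(estimated, ground_truth)
              for est, gt in zip(est_row, gt_row) if gt is not None]
    if not errors:
        raise ValueError("no pixels with ground truth")
    return sum(errors) / len(errors)


estimated = [[10, 20, 35, 7]]
ground_truth = [[11, 25, None, 7]]  # third pixel has no LiDAR measurement
print(bad_pixel_rate(estimated, ground_truth))
print(mean_abs_error(estimated, ground_truth))
```

Passing threshold_px=2.0 gives the analogous "bad 2.0" style of measure.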

Conclusions & Future Work
- Entirely novel approach for matching cost computation
- First method to introduce CNNs into stereo pipeline
- Post-processing needed for competitive results
- Top results in major stereo benchmarks
- Large computational effort
- Revised journal version
  - Improved feature extraction
  - Accurate and fast network variants
- Potential future work
  - Multiscale approaches
  - Larger training sets (synthetic ground truth data)

References
[1] Žbontar & LeCun: Computing the stereo matching cost with a convolutional neural network, CVPR 2015
[2] Žbontar & LeCun: Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches, arXiv preprint arXiv:1510.05970
[3] Scharstein et al.: High-resolution stereo datasets with subpixel-accurate ground truth, GCPR 2014
[4] Hartley & Zisserman: Multiple View Geometry in Computer Vision, 2003
[5] Loop & Zhang: Computing rectifying homographies for stereo vision, CVPR 1999
[6] Scharstein et al.: A taxonomy and evaluation of dense two-frame stereo correspondence algorithms, SMBV Workshops 2001
[7] Sinha et al.: Efficient high-resolution stereo matching using local plane sweeps, CVPR 2014
[8] Zabih & Woodfill: Non-parametric local transforms for computing visual correspondence, ECCV 1994
[9] Kim et al.: Visual correspondence using energy minimization and mutual information, ICCV 2003
[10] Mei et al.: On building an accurate stereo matching system on graphics hardware, ICCV Workshops 2011
[11] Zhang et al.: Cross-based local stereo matching using orthogonal integral images, TCSVT 2009
[12] Hirschmüller: Stereo processing by semiglobal matching and mutual information, PAMI 2008
[13] Geiger et al.: Are we ready for autonomous driving? The KITTI Vision Benchmark Suite, CVPR 2012
[14] Güney et al.: Displets: Resolving stereo ambiguities using object knowledge, CVPR 2015
[15] Vogel et al.: 3D Scene Flow estimation with a piecewise rigid scene model, IJCV 2015
[16] Yamaguchi et al.: Efficient joint segmentation, occlusion labeling, stereo and flow estimation, ECCV 2014
[17] Menze & Geiger: Object scene flow for autonomous vehicles, CVPR 2015

Q&A Session

Revised Journal Architectures [2]
Figure panels: Accurate, Fast.

Postprocessing: Cross-Based Cost Aggregation (CBCA)
Figure panels: Left Image, Disparity after CNN, Disparity after CBCA. Input images from KITTI 2015 Dataset [17].

Postprocessing: Semiglobal Matching (SGM)
Figure panels: Left Image, Disparity after CBCA, Disparity after SGM. Input images from KITTI 2015 Dataset [17].
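
For readers unfamiliar with semiglobal matching [12], the sketch below shows the usual path-wise recurrence for a single left-to-right pass over one scanline: a small penalty p1 for disparity changes of one and a larger penalty p2 for bigger jumps. Full SGM sums several such passes over different directions; the penalty values and the function name here are illustrative assumptions, not settings from the talk.

```python
def aggregate_left_to_right(costs, p1, p2):
    """Aggregate one scanline of costs[x][d] along the left-to-right path."""
    aggregated = [list(costs[0])]  # the first pixel has no predecessor
    for x in range(1, len(costs)):
        prev = aggregated[-1]
        prev_min = min(prev)
        row = []
        for d, cost in enumerate(costs[x]):
            candidates = [prev[d], prev_min + p2]  # same disparity, or a large jump
            if d > 0:
                candidates.append(prev[d - 1] + p1)  # change by one disparity level
            if d + 1 < len(prev):
                candidates.append(prev[d + 1] + p1)
            # Subtracting prev_min keeps the values from growing along the path.
            row.append(cost + min(candidates) - prev_min)
        aggregated.append(row)
    return aggregated


# The middle pixel weakly prefers d = 2, while its neighbours clearly prefer d = 0.
costs = [[0, 5, 5], [3, 5, 2], [0, 5, 5]]
aggregated = aggregate_left_to_right(costs, p1=1, p2=4)
print(aggregated)
```

In this toy scanline the raw minimum at the middle pixel is d = 2, but after aggregation d = 0 has the lowest cost there, which is the smoothing effect the before/after panels illustrate.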

Middlebury 2014 Dataset [3] Results
Figure columns: Image, Disparity, bad 2.0 error.