Stanford CS223B Computer Vision, Winter 2006 Lecture 8 Structure From Motion Professor Sebastian Thrun CAs: Dan Maynes-Aminzade, Mitul Saha, Greg Corrado.

Presentation transcript:

Stanford CS223B Computer Vision, Winter 2006 Lecture 8 Structure From Motion Professor Sebastian Thrun CAs: Dan Maynes-Aminzade, Mitul Saha, Greg Corrado Slides by: Gary Bradski, Intel Research and Stanford SAIL

Sebastian Thrun Stanford University CS223B Computer Vision
Structure From Motion
[Figure labels: camera, features]
Recover: structure (feature locations), motion (camera extrinsics)

Structure From Motion (1) [Tomasi & Kanade 92]

Structure From Motion (2) [Tomasi & Kanade 92]

Structure From Motion (3) [Tomasi & Kanade 92]

Structure From Motion (4a): Images Marc Pollefeys

Structure From Motion (4b) Marc Pollefeys

Structure From Motion
• Problem 1:
  – Given n points $p_{ij} = (x_{ij}, y_{ij})$ in m images
  – Reconstruct structure: 3-D locations $P_j = (x_j, y_j, z_j)$
  – Reconstruct camera positions (extrinsics) $M_i = (A_i, b_i)$
• Problem 2:
  – Establish correspondence: $c(p_{ij})$

SFM: General Formulation
[Figure: projection diagram with labels f, Z, X, O, -x]

SFM: Bundle Adjustment
[Figure: projection diagram with labels f, Z, X, O, -x]

Bundle Adjustment
• SFM = nonlinear least squares problem
• Minimize through
  – Gradient Descent
  – Conjugate Gradient
  – Gauss-Newton
  – Levenberg-Marquardt (!)
• Prone to local minima
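A minimal sketch of bundle adjustment as nonlinear least squares, minimized with a basic Levenberg-Marquardt loop. Everything here is an illustrative assumption rather than the lecture's own formulation: the axis-angle camera parameterization, the focal length f = 1, the forward-difference Jacobian, and the helper names `rodrigues`, `residuals`, `numeric_jacobian`, and `levenberg_marquardt`.

```python
import numpy as np

def rodrigues(w):
    """Rotation matrix for an axis-angle vector w of shape (3,)."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def residuals(params, obs, m, n, f=1.0):
    """Stacked reprojection errors; params holds 6 numbers per camera
    (rotation vector, translation) followed by 3 numbers per point."""
    cams = params[:6 * m].reshape(m, 6)
    pts = params[6 * m:].reshape(n, 3)
    res = []
    for i in range(m):
        Xc = pts @ rodrigues(cams[i, :3]).T + cams[i, 3:]   # points in camera frame
        proj = f * Xc[:, :2] / Xc[:, 2:3]                    # pinhole projection
        res.append((proj - obs[i]).ravel())
    return np.concatenate(res)

def numeric_jacobian(fun, x, eps=1e-6):
    """Forward-difference Jacobian of fun at x."""
    r0 = fun(x)
    J = np.empty((r0.size, x.size))
    for k in range(x.size):
        dx = np.zeros_like(x)
        dx[k] = eps
        J[:, k] = (fun(x + dx) - r0) / eps
    return J

def levenberg_marquardt(fun, x0, iters=50, lam=1e-3):
    """Damped Gauss-Newton; a step is accepted only if it lowers the cost."""
    x = x0.copy()
    cost = np.sum(fun(x) ** 2)
    for _ in range(iters):
        r = fun(x)
        J = numeric_jacobian(fun, x)
        H = J.T @ J + lam * np.eye(x.size)
        step = np.linalg.lstsq(H, -J.T @ r, rcond=None)[0]
        new_cost = np.sum(fun(x + step) ** 2)
        if new_cost < cost:
            x, cost, lam = x + step, new_cost, lam * 0.5
        else:
            lam *= 10.0
    return x, cost

# Synthetic scene: 3 cameras, 8 points roughly 5 units in front of the cameras.
rng = np.random.default_rng(0)
m, n = 3, 8
cams_true = np.hstack([0.1 * rng.normal(size=(m, 3)),
                       0.2 * rng.normal(size=(m, 3)) + [0, 0, 5]])
pts_true = rng.uniform(-1, 1, size=(n, 3))
x_true = np.concatenate([cams_true.ravel(), pts_true.ravel()])
obs = residuals(x_true, np.zeros((m, n, 2)), m, n).reshape(m, n, 2)

fun = lambda x: residuals(x, obs, m, n)
x0 = x_true + 0.05 * rng.normal(size=x_true.size)   # perturbed starting point
cost0 = np.sum(fun(x0) ** 2)
x_hat, cost = levenberg_marquardt(fun, x0)
print(cost0, cost)
```

The start is deliberately close to the truth. From a random start the same loop can stall in a poor local minimum, which is the weakness the slide points out.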

Count # Constraints vs # Unknowns
• m camera poses
• n points
• 2mn point constraints
• 6m + 3n unknowns
• Suggests: need $2mn \ge 6m + 3n$
• But: Can we really recover all parameters???

How Many Parameters Can’t We Recover?
Place Your Bet! We can recover all but…

Count # Constraints vs # Unknowns
• m camera poses
• n points
• 2mn point constraints
• 6m + 3n unknowns
• Suggests: need $2mn \ge 6m + 3n$
• But: Can we really recover all parameters???
  – Can’t recover origin, orientation (6 params)
  – Can’t recover scale (1 param)
• Thus, we need $2mn \ge 6m + 3n - 7$
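The counting argument can be checked numerically. The sketch below finds the smallest n that satisfies the inequality for a given number of cameras m. The function name `min_points` and its keyword arguments are hypothetical, and the count is only a necessary condition, not a guarantee that a reconstruction exists.

```python
def min_points(m, params_per_camera=6, gauge=7):
    """Smallest n with 2*m*n >= params_per_camera*m + 3*n - gauge.

    gauge is the number of parameters that cannot be recovered
    (origin and orientation: 6, scale: 1).
    """
    if m < 2:
        raise ValueError("the count is only meaningful for m >= 2 cameras")
    n = 1
    while 2 * m * n < params_per_camera * m + 3 * n - gauge:
        n += 1
    return n

for m in (2, 3, 4):
    print(m, min_points(m))
```

By this count, two cameras need at least five points, and three or four cameras need at least four.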

Are we done?
• No, bundle adjustment has many local minima.

The “Trick Of The Day”
• Replace Perspective by Orthographic Geometry
• Replace Euclidean Geometry by Affine Geometry
• Solve SFM linearly (“closed” form, globally optimal)
• Post-Process to make solution Euclidean
• Post-Process to make solution perspective
By Tomasi and Kanade, 1992

Orthographic Camera Model
Limit of Pinhole Model:
[Equation labels: Extrinsic Parameters, Rotation, Orthographic Projection]

Orthographic Projection
Limit of Pinhole Model:
[Equation label: Orthographic Projection]

The Orthographic SFM Problem
[Equation: objective] subject to [equation: constraints]

The Affine SFM Problem
[Equation: objective] subject to [equation: constraints] (drop the constraints)
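The slide equations did not survive in this transcript. One common way to write the two problems, using the symbols $p_{ij}$, $P_j$, $A_i$, $b_i$ from the problem statement, is sketched below. The exact form is an assumption and not necessarily what the slides showed: $A_i$ is taken to be a $2 \times 3$ matrix and $b_i$ a 2-vector.

```latex
% Orthographic SFM: rows of each A_i are orthonormal (two rows of a rotation)
\min_{A_i,\, b_i,\, P_j} \; \sum_{i=1}^{m} \sum_{j=1}^{n}
    \left\| p_{ij} - (A_i P_j + b_i) \right\|^2
\quad \text{subject to} \quad A_i A_i^{\top} = I_2 , \quad i = 1, \dots, m

% Affine SFM: same objective, constraints dropped
\min_{A_i,\, b_i,\, P_j} \; \sum_{i=1}^{m} \sum_{j=1}^{n}
    \left\| p_{ij} - (A_i P_j + b_i) \right\|^2
```

Under this reading each affine camera has 6 + 2 = 8 free parameters.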

Count # Constraints vs # Unknowns
• m camera poses
• n points
• 2mn point constraints
• 8m + 3n unknowns
• Suggests: need $2mn \ge 8m + 3n$
• But: Can we really recover all parameters???

How Many Parameters Can’t We Recover?
Place Your Bet! We can recover all but…

The Answer is (at least): 12

Points for Solving Affine SFM Problem
• m camera poses
• n points
• Need to have: $2mn \ge 8m + 3n - 12$
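A short worked instance of the inequality for small m. Reading the 12 as an invertible $3 \times 3$ matrix (9 parameters) plus a 3-D translation (3 parameters) is an interpretation, not something this slide states.

```latex
m = 2: \quad 4n \ge 16 + 3n - 12 \;\Longrightarrow\; n \ge 4
\\[4pt]
m = 3: \quad 6n \ge 24 + 3n - 12 \;\Longrightarrow\; 3n \ge 12 \;\Longrightarrow\; n \ge 4
```

The count is a necessary condition only.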

Affine SFM
Fix coordinate system by making $p_0$ = origin
Proof:
Rank Theorem: Q has rank 3

The Rank Theorem
[Figure: matrix with dimension labels "n elements" and "2m elements"]

Singular Value Decomposition

Affine Solution to Orthographic SFM
Also gives the optimal affine reconstruction under noise
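A minimal sketch of the rank-3 factorization via SVD on synthetic, noise-free data. It assumes a 2m x n measurement matrix Q with one column per point. It also centers on the centroid of the points, whereas the slides fix the origin at a chosen point, so this is a swapped-in convention. The function name `affine_factorization` is hypothetical.

```python
import numpy as np

def affine_factorization(Q):
    """Factor a (2m, n) measurement matrix as Q ~ A @ P + b[:, None].

    Returns A (2m, 3) stacked affine cameras, P (3, n) affine structure,
    b (2m,) per-row translations, and the singular values s.
    """
    b = Q.mean(axis=1)                       # image centroid per row
    Qc = Q - b[:, None]                      # centered matrix: rank <= 3 without noise
    U, s, Vt = np.linalg.svd(Qc, full_matrices=False)
    root = np.sqrt(s[:3])
    A = U[:, :3] * root                      # scale the three leading columns
    P = root[:, None] * Vt[:3]               # scale the three leading rows
    return A, P, b, s

# Synthetic affine cameras and points (all values are made up for the demo).
rng = np.random.default_rng(0)
m, n = 5, 20
A_true = rng.normal(size=(2 * m, 3))
P_true = rng.normal(size=(3, n))
b_true = rng.normal(size=2 * m)
Q = A_true @ P_true + b_true[:, None]

A, P, b, s = affine_factorization(Q)
print(s[:5])                                 # only three values are non-negligible
print(np.abs(A @ P + b[:, None] - Q).max())  # reconstruction error
```

A and P match the true cameras and points only up to an invertible 3 x 3 matrix. With noisy measurements, truncating to the three largest singular values gives the best rank-3 approximation in the least-squares sense.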

Back To Orthographic Projection
Find C and d for which constraints are met
Search in 12-dim space (instead of $8m + 3n - 12$)
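One way to find C is sketched below. It does not search: it solves linearly for the symmetric matrix L = C Cᵀ from the row-orthonormality constraints, then takes a Cholesky factor. This is a swapped-in technique, not necessarily the procedure the lecture used. It handles only C (not the translation d) and assumes noise-free input. The name `metric_upgrade` is hypothetical.

```python
import numpy as np

def metric_upgrade(A_hat):
    """A_hat: (2m, 3) stacked affine cameras.

    Returns a 3x3 matrix C such that every 2x3 block of A_hat @ C has
    (approximately) orthonormal rows.
    """
    def coeffs(u, v):
        # Coefficients of u^T L v in the 6 unique entries of symmetric L:
        # [L00, L01, L02, L11, L12, L22]
        return [u[0] * v[0], u[0] * v[1] + u[1] * v[0], u[0] * v[2] + u[2] * v[0],
                u[1] * v[1], u[1] * v[2] + u[2] * v[1], u[2] * v[2]]

    rows, rhs = [], []
    for i in range(A_hat.shape[0] // 2):
        a1, a2 = A_hat[2 * i], A_hat[2 * i + 1]
        rows += [coeffs(a1, a1), coeffs(a2, a2), coeffs(a1, a2)]
        rhs += [1.0, 1.0, 0.0]               # unit length, unit length, orthogonal
    l = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    return np.linalg.cholesky(L)             # raises LinAlgError if L is not positive definite

# Demo: orthographic cameras distorted by a known affine transformation.
rng = np.random.default_rng(0)
m = 6
A_true = np.vstack([np.linalg.qr(rng.normal(size=(3, 3)))[0][:2] for _ in range(m)])
C_true = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.1, 0.0, 1.5]])
A_hat = A_true @ np.linalg.inv(C_true)       # what an affine reconstruction might return

C = metric_upgrade(A_hat)
A_metric = A_hat @ C
print(A_metric[:2] @ A_metric[:2].T)         # close to the 2x2 identity
```

The recovered C equals `C_true` only up to a rotation, which is the orientation that cannot be recovered in any case. With noisy data L may fail to be positive definite, and a nonlinear refinement would then be needed.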

Back To Projective Geometry
[Equation labels: Orthographic (in the limit), Projective]

Back To Projective Geometry
[Figure: projection diagram with labels f, Z, X, O, -x]
Optimize, using orthographic solution as starting point

The “Trick Of The Day”
• Replace Perspective by Orthographic Geometry
• Replace Euclidean Geometry by Affine Geometry
• Solve SFM linearly (“closed” form, globally optimal)
• Post-Process to make solution Euclidean
• Post-Process to make solution perspective
By Tomasi and Kanade, 1992

Structure From Motion
• Problem 1:
  – Given n points $p_{ij} = (x_{ij}, y_{ij})$ in m images
  – Reconstruct structure: 3-D locations $P_j = (x_j, y_j, z_j)$
  – Reconstruct camera positions (extrinsics) $M_i = (A_i, b_i)$
• Problem 2:
  – Establish correspondence: $c(p_{ij})$

The Correspondence Problem
[Figure panels: View 1, View 2, View 3]

Correspondence: Solution 1
• Track features (e.g., optical flow)
• …but fails when images are taken from widely different poses

Correspondence: Solution 2
• Start with random solution A, b, P
• Compute soft correspondence: $p(c \mid A, b, P)$
• Plug soft correspondence into SFM
• Reiterate
See Dellaert/Seitz/Thorpe/Thrun, Machine Learning Journal, 2003
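A minimal sketch of the "compute soft correspondence" step for one image. It assumes an isotropic Gaussian noise model, so each observed feature gets weights proportional to exp(-distance² / 2σ²) over the projected points. This is an illustrative simplification and is not taken from the cited paper. In particular it does not enforce one-to-one matching. The name `soft_correspondence` and the parameter `sigma` are hypothetical.

```python
import numpy as np

def soft_correspondence(observed, predicted, sigma=1.0):
    """Soft assignment of unlabeled features to projected 3-D points.

    observed:  (k, 2) image features with unknown identity
    predicted: (n, 2) projections of the current points under the current A, b
    Returns a (k, n) matrix of weights; each row sums to 1.
    """
    diff = observed[:, None, :] - predicted[None, :, :]
    d2 = (diff ** 2).sum(axis=2)                    # squared image distances
    logw = -d2 / (2.0 * sigma ** 2)
    logw -= logw.max(axis=1, keepdims=True)         # stabilize the exponentials
    w = np.exp(logw)
    return w / w.sum(axis=1, keepdims=True)

# Demo: three well-separated projections, observed in shuffled order.
predicted = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
observed = predicted[[2, 0, 1]] + 0.01
W = soft_correspondence(observed, predicted, sigma=0.1)
print(W.round(3))
print(W.argmax(axis=1))
```

In an EM-style loop these weights would replace hard matches when re-estimating A, b, and P, and the two steps would alternate until the estimate stops changing.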

Example

Results: Cube

Animation

Tomasi’s Benchmark Problem

Reconstruction with EM

3-D Structure

Correspondence: Alternative Approach
• RANSAC [Fischler/Bolles] = Random sample consensus
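The RANSAC loop itself is easiest to see on a toy problem. The sketch below fits a 2-D line to data with gross outliers. It is a generic illustration, not an SFM correspondence solver, where the model would be a geometric relation between views rather than a line. The function name `ransac_line`, the tolerance, and the iteration count are assumptions.

```python
import numpy as np

def ransac_line(x, y, iters=200, tol=0.05, seed=0):
    """Fit y = slope * x + intercept while ignoring outliers.

    Repeatedly fits a line to a random pair of points, counts the points
    within tol of it (the consensus set), keeps the largest set, and
    refits on that set by least squares.
    """
    rng = np.random.default_rng(seed)
    best = np.zeros(x.size, dtype=bool)
    for _ in range(iters):
        i, j = rng.choice(x.size, size=2, replace=False)
        if x[i] == x[j]:
            continue                                  # vertical pair, skip
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        inliers = np.abs(y - (slope * x + intercept)) < tol
        if inliers.sum() > best.sum():
            best = inliers
    slope, intercept = np.polyfit(x[best], y[best], 1)
    return slope, intercept, best

# Demo: 70 points on y = 2x + 1 and 30 gross outliers.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)
y = 2.0 * x + 1.0
y[:30] = rng.uniform(-20, 40, 30)
slope, intercept, inlier_mask = ransac_line(x, y)
print(slope, intercept, inlier_mask.sum())
```

The same sample, score, keep-best pattern applies to correspondence: candidate matches that disagree with the best model are discarded as outliers.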

Summary SFM
• Problem
  – Determine feature locations (= structure)
  – Determine camera extrinsics (= motion)
• Two Principal Solutions
  – Bundle adjustment (nonlinear least squares, local minima)
  – SVD (through orthographic approximation, affine geometry)
• Correspondence
  – (RANSAC)
  – Expectation Maximization