Noah Snavely.

Large-scale structure from motion: Dubrovnik, Croatia. 4,619 images (out of an initial 57,845). Total reconstruction time: 23 hours. Number of cores: 352.

Structure from motion: given many images, how can we (a) figure out where they were all taken from, and (b) build a 3D model of the scene? This is (roughly) the structure from motion problem.

Structure from motion. [Figure: reconstruction, side and top views.] Input: images with points in correspondence, $p_{i,j} = (u_{i,j}, v_{i,j})$. Output: structure, a 3D location $x_i$ for each point $p_i$; and motion, the camera parameters $R_j$, $t_j$ (and possibly $K_j$). Objective function: minimize reprojection error.

Camera calibration and triangulation. Suppose we know a set of 3D points and have matches between these points and an image. How can we compute the camera parameters? Conversely, suppose we have several cameras with known parameters, each of which observes a point. How can we compute the 3D location of that point?
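Both questions on this slide have classic linear answers, and a minimal sketch may make them concrete. The code below assumes noise-free matches and 3x4 projection matrices; the function names `triangulate`, `resection`, and `to_pixels` are illustrative and do not come from the slides. Each function stacks the projection constraints into a homogeneous linear system and takes the null vector from an SVD.

```python
import numpy as np

def to_pixels(P, X):
    """Project 3D point X with a 3x4 projection matrix P."""
    h = P @ np.append(X, 1.0)
    return h[:2] / h[2]

def triangulate(proj_mats, pixels):
    """Known cameras, unknown point: solve for the 3D point."""
    rows = []
    for P, (u, v) in zip(proj_mats, pixels):
        rows.append(u * P[2] - P[0])   # u * (row 3 . X) = row 1 . X
        rows.append(v * P[2] - P[1])   # v * (row 3 . X) = row 2 . X
    _, _, Vt = np.linalg.svd(np.stack(rows))
    X = Vt[-1]                         # null vector = homogeneous point
    return X[:3] / X[3]

def resection(points3d, pixels):
    """Known points, unknown camera: solve for P (up to scale)."""
    rows = []
    zero = np.zeros(4)
    for X, (u, v) in zip(points3d, pixels):
        Xh = np.append(X, 1.0)
        rows.append(np.concatenate([Xh, zero, -u * Xh]))
        rows.append(np.concatenate([zero, Xh, -v * Xh]))
    _, _, Vt = np.linalg.svd(np.stack(rows))
    return Vt[-1].reshape(3, 4)

# Two hypothetical cameras one unit apart along the x axis.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.5, 0.2, 4.0])
X_est = triangulate([P1, P2], [to_pixels(P1, X_true), to_pixels(P2, X_true)])

# Recover camera 2 from eight known (synthetic) 3D points and their images.
rng = np.random.default_rng(0)
points3d = rng.uniform(-1.0, 1.0, (8, 3)) + np.array([0.0, 0.0, 5.0])
P_est = resection(points3d, [to_pixels(P2, X) for X in points3d])
```

With noisy matches these linear estimates would normally serve only as a starting point for the non-linear refinement discussed later in the deck.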

Structure from motion: SfM solves both of these problems at once. It is a kind of chicken-and-egg problem (but solvable).

Photo Tourism: here’s how the same set of photos appears in our photo explorer. Our system takes the set of photos and automatically determines the relative positions and orientations from which each photo was taken. We can then load the photos into our immersive 3D browser, where the user can visualize and explore the photos using spatial relationships.

First step: how to get correspondence? Feature detection and matching

Feature detection: detect features using SIFT [Lowe, IJCV 2004]. We begin by detecting SIFT features in each photo, resulting in a set of feature locations and 128-byte feature descriptors.

Feature matching: match features between each pair of images. Next, we match features across each pair of photos using approximate nearest neighbor matching.
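As a rough illustration of descriptor matching, the sketch below uses exact brute-force nearest neighbors rather than the approximate search mentioned on the slide, and it adds a distance-ratio test to reject ambiguous matches. The ratio test, the `ratio=0.8` threshold, and the name `match_features` are assumptions made for this example, not details taken from the slides.

```python
import numpy as np

def match_features(desc_a, desc_b, ratio=0.8):
    """Return (index_in_a, index_in_b) pairs that pass the ratio test."""
    # Pairwise Euclidean distances between all descriptors in a and b.
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    matches = []
    for i, row in enumerate(dists):
        order = np.argsort(row)
        best, second = order[0], order[1]
        # Keep the match only if the best candidate is clearly better
        # than the runner-up.
        if row[best] < ratio * row[second]:
            matches.append((i, int(best)))
    return matches

# Synthetic 128-dimensional descriptors: image A sees noisy copies of
# features 3, 0 and 4 from image B.
rng = np.random.default_rng(0)
desc_b = rng.random((5, 128))
desc_a = desc_b[[3, 0, 4]] + 0.01 * rng.standard_normal((3, 128))

matches = match_features(desc_a, desc_b)
```

Brute-force search costs time proportional to the product of the two descriptor counts, which is why large collections tend to use an approximate index instead.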

Feature matching: refine matching using RANSAC to estimate a fundamental matrix between each pair. The matches are refined using RANSAC, a robust model-fitting technique, to estimate a fundamental matrix between each pair of matching images; only matches consistent with that fundamental matrix are kept. We then link connected components of pairwise feature matches together to form correspondences across multiple images. Once we have correspondences, we run structure from motion to recover the camera and scene geometry.
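The step of linking pairwise matches into multi-image correspondences can be sketched with a union-find structure. This is a minimal illustration under stated assumptions: features are identified by hypothetical `(image_id, feature_id)` tuples, the name `build_tracks` is invented for the example, and the matches are taken to have already passed the RANSAC filter.

```python
def build_tracks(pairwise_matches):
    """Group matched features into connected components (tracks)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Merge the components of the two features in every match.
    for a, b in pairwise_matches:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect features by the root of their component.
    tracks = {}
    for node in list(parent):
        tracks.setdefault(find(node), []).append(node)
    return sorted(sorted(track) for track in tracks.values())

# Feature 5 in image 0 matches feature 2 in image 1, which matches
# feature 7 in image 2; a second, unrelated match links images 0 and 2.
matches = [((0, 5), (1, 2)), ((1, 2), (2, 7)), ((0, 9), (2, 1))]
tracks = build_tracks(matches)
```

Each resulting track corresponds to one candidate 3D point observed in several images.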

Image connectivity graph (graph layout produced using the Graphviz toolkit: http://www.graphviz.org/)

[Figure: three cameras with poses (R1, t1), (R2, t2), (R3, t3) observing 3D points x1 to x7; projections p1,1, p1,2, p1,3 marked in the images.]

Structure from motion: minimize f(R, T, P). Structure from motion solves the following problem: given a set of images of a static scene with 2D points in correspondence (shown in the figure as color-coded points), find a set of 3D points P and a rotation R and position t for each camera that explain the observed correspondences. In other words, when we project a point into any of the cameras, the reprojection error between the projected and observed 2D points is low. This can be formulated as an optimization problem in which we find the rotations R, positions t, and 3D point locations P that minimize the sum of squared reprojection errors f. This is a non-linear least squares problem and can be solved with algorithms such as Levenberg-Marquardt. However, because the problem is non-linear, it is susceptible to local minima, so it is important to initialize the parameters of the system carefully. In addition, we need to be able to deal with erroneous correspondences. [Figure: Cameras 1, 2, and 3 with poses (R1, t1), (R2, t2), (R3, t3) observing points p1, p3, p4.]

Problem size: Trevi Fountain collection. 466 input photos + more than 100,000 3D points = very large optimization problem.
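A quick back-of-the-envelope count shows why this is called very large. The sketch below uses the parameter counts given on a later slide (3 per 3D point, 6 per camera) and treats 100,000 as the number of points, although the slide only gives that as a lower bound. The dense-matrix memory figure is an illustrative assumption about a naive solver with 8-byte floats, not a claim about any particular system.

```python
n_cameras = 466
n_points = 100_000          # lower bound from the slide

params_per_camera = 6       # rotation (3) + translation (3)
params_per_point = 3        # x, y, z

n_params = n_cameras * params_per_camera + n_points * params_per_point

# A dense n_params x n_params normal-equations matrix of 8-byte floats.
dense_bytes = n_params ** 2 * 8
dense_gigabytes = dense_bytes / 1e9

print(n_params)                  # 302796
print(round(dense_gigabytes))    # 733
```

Almost all of those parameters belong to points, each seen by only a few cameras, which is the sparsity that bundle adjustment software exploits.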

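SfM objective function: given a point $x_i$ and a camera with rotation $R_j$ and translation $t_j$, let $P(x_i, R_j, t_j)$ be the predicted image location and $(u_{i,j}, v_{i,j})$ the observed image location. Minimize the sum of squared reprojection errors over all observations:

$$g(X, R, T) = \sum_{(i,j)\ \text{observed}} \left\| P(x_i, R_j, t_j) - \begin{bmatrix} u_{i,j} \\ v_{i,j} \end{bmatrix} \right\|^2$$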
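The objective can be written out in a few lines. This sketch assumes a simple pinhole model in which a world point maps to camera coordinates as R x + t and is then divided by its depth, with focal length `f` and no distortion. That convention, the names `project` and `reprojection_error`, and the dictionary of observations keyed by (point, camera) are assumptions for the example rather than the model used in the slides.

```python
import numpy as np

def project(x, R, t, f=1.0):
    """Predicted image location of 3D point x in camera (R, t)."""
    xc = R @ x + t                 # world -> camera coordinates
    return f * xc[:2] / xc[2]      # perspective division

def reprojection_error(points, cameras, observations):
    """Sum of squared differences between predicted and observed locations."""
    total = 0.0
    for (i, j), uv in observations.items():
        R, t = cameras[j]
        residual = project(points[i], R, t) - np.asarray(uv)
        total += float(residual @ residual)
    return total

points = [np.array([0.0, 0.0, 4.0]), np.array([1.0, -1.0, 5.0])]
cameras = [(np.eye(3), np.zeros(3)),
           (np.eye(3), np.array([-1.0, 0.0, 0.0]))]

# Perfect observations give zero error.
observations = {(i, j): project(points[i], *cameras[j])
                for i in range(2) for j in range(2)}
g_exact = reprojection_error(points, cameras, observations)

# Shift one observation by 0.1 in u: the error becomes 0.1 squared.
observations[(0, 0)] = observations[(0, 0)] + np.array([0.1, 0.0])
g_noisy = reprojection_error(points, cameras, observations)
```

Only pairs present in `observations` contribute, which mirrors the fact that each point is visible in just a subset of the cameras.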

Solving structure from motion: minimizing g is difficult. g is non-linear due to rotations and perspective division. There are lots of parameters: 3 for each 3D point, 6 for each camera. It is difficult to initialize. There is a gauge ambiguity: the error is invariant to a similarity transform (translation, rotation, uniform scale). Many techniques use non-linear least-squares (NLLS) optimization (bundle adjustment), and Levenberg-Marquardt is one common algorithm for NLLS. See Lourakis, The Design and Implementation of a Generic Sparse Bundle Adjustment Software Package Based on the Levenberg-Marquardt Algorithm, http://www.ics.forth.gr/~lourakis/sba/ and http://en.wikipedia.org/wiki/Levenberg-Marquardt_algorithm
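To give a feel for Levenberg-Marquardt, here is a deliberately small sketch that refines a single 3D point seen by two fixed cameras. It is not a bundle adjuster: the cameras are held fixed, the Jacobian is approximated by forward differences, and the damping schedule (divide by 10 on success, multiply by 10 on failure) is one common textbook choice. All names and constants are assumptions made for the example.

```python
import numpy as np

def levenberg_marquardt(residual, x0, iters=50, lam=1e-3, eps=1e-6):
    """Minimize the sum of squared residuals with a basic damped loop."""
    x = np.asarray(x0, dtype=float)
    r = residual(x)
    for _ in range(iters):
        # Forward-difference Jacobian of the residual vector.
        J = np.empty((r.size, x.size))
        for k in range(x.size):
            step = np.zeros_like(x)
            step[k] = eps
            J[:, k] = (residual(x + step) - r) / eps
        # Damped normal equations: (J^T J + lam I) delta = -J^T r.
        delta = np.linalg.solve(J.T @ J + lam * np.eye(x.size), -J.T @ r)
        r_new = residual(x + delta)
        if r_new @ r_new < r @ r:
            x, r, lam = x + delta, r_new, lam * 0.1   # accept, trust more
        else:
            lam *= 10.0                               # reject, damp more
    return x

# Two cameras with identity rotation, one unit apart along x.
cam_t = [np.zeros(3), np.array([-1.0, 0.0, 0.0])]
x_true = np.array([0.5, 0.2, 4.0])

def project(x, t):
    xc = x + t
    return xc[:2] / xc[2]

observed = [project(x_true, t) for t in cam_t]

def residual(x):
    # Stacked reprojection residuals for both cameras.
    return np.concatenate([project(x, t) - uv
                           for t, uv in zip(cam_t, observed)])

x_hat = levenberg_marquardt(residual, x0=[0.0, 0.0, 3.0])
```

Starting from a depth of 3 instead of 4, the loop settles on the true point. With a poor starting guess the same loop can stall in a local minimum, which is the initialization concern raised on the slide.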

Extensions to SfM: can also solve for intrinsic parameters (focal length, radial distortion, etc.). Can use a more robust function than squared error, to avoid fitting to outliers. For more information, see Triggs et al., “Bundle Adjustment – A Modern Synthesis”, Vision Algorithms 2000.
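The slide does not say which robust function is meant. As one possible example, the sketch below compares plain squared error with a Huber-style loss, which is quadratic for small residuals and linear for large ones. The choice of Huber, the threshold `delta=1.0`, and the function names are assumptions for illustration.

```python
def squared_loss(r):
    return r * r

def huber_loss(r, delta=1.0):
    """Quadratic near zero, linear beyond delta; continuous at |r| = delta."""
    a = abs(r)
    if a <= delta:
        return a * a
    return delta * (2.0 * a - delta)

# Three small residuals and one gross outlier.
residuals = [0.1, -0.2, 0.3, 10.0]

total_squared = sum(squared_loss(r) for r in residuals)
total_huber = sum(huber_loss(r) for r in residuals)
```

Under squared error the single outlier contributes 100 of the total 100.14, while under the Huber loss it contributes 19 of 19.14, so it pulls far less on the fit.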

Incremental structure from motion: to help get good initializations for all of the parameters of the system, we reconstruct the scene incrementally, starting from two photographs and the points they observe.

Photo Explorer

Questions?