Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely.

Presentation transcript:

Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely

Readings Szeliski, Chapter 7.1 – 7.4

Announcements: Project 2b due on Tuesday by 10:59pm. Final project proposals due today at 11:59pm. Course survey.

Alpha blending. Encoding blend weights: I(x,y) = (αR, αG, αB, α). The color at pixel p, where images I1, I2, and I3 overlap, is the sum of their premultiplied colors divided by the sum of their α values. Implement this in two steps: 1. accumulate: add up the (α-premultiplied) RGBα values at each pixel; 2. normalize: divide each pixel's accumulated RGB by its α value. Q: what if α = 0? Optional: see Blinn (CGA, 1994) for details: http://ieeexplore.ieee.org/iel1/38/7531/ pdf?isNumber=7531&prod=JNL&arnumber=310740&arSt=83&ared=87&arAuthor=Blinn%2C+J.F.
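The accumulate-then-normalize recipe can be sketched in a few lines of NumPy. This is a minimal illustration, not the course's reference code: the function name `blend_premultiplied` is hypothetical, the images are assumed to be already warped into a common frame, and leaving pixels with α = 0 black is just one possible answer to the slide's question.

```python
import numpy as np

def blend_premultiplied(images):
    """Blend H x W x 4 float images stored as premultiplied (aR, aG, aB, a)."""
    acc = np.zeros_like(images[0], dtype=np.float64)
    for img in images:
        acc += img  # step 1: accumulate premultiplied RGB and alpha
    alpha = acc[..., 3:4]
    rgb = np.zeros_like(acc[..., :3])
    # Step 2: normalize. Where alpha == 0 no image covers the pixel, so the
    # division is skipped and the pixel stays black (an assumption).
    np.divide(acc[..., :3], alpha, out=rgb, where=alpha > 0)
    return rgb
```

For example, a fully opaque red pixel blended with a fully opaque blue pixel gives (0.5, 0, 0.5), and a pixel covered by no image stays (0, 0, 0).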

Blend weights. How should we set the alpha values of I1, I2, I3? Simplest choice: set all alpha values to one (gives discontinuities). Better choice: use feathering to ramp alpha values to zero near the edges. (Figure: images I1, I2, I3 overlapping at pixel p, with a feathered alpha channel.)
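One simple way to build a feathered alpha channel is to ramp alpha linearly with the distance to the nearest image border. The sketch below assumes a rectangular, unwarped image; the function name `feather_alpha` and the `ramp` width parameter are illustrative choices, not something the slides specify.

```python
import numpy as np

def feather_alpha(height, width, ramp):
    """Alpha map: 0 on the border, rising linearly to 1 at `ramp` pixels inside."""
    ys = np.arange(height)
    xs = np.arange(width)
    dy = np.minimum(ys, height - 1 - ys)  # distance to nearest top/bottom edge
    dx = np.minimum(xs, width - 1 - xs)   # distance to nearest left/right edge
    dist = np.minimum(dy[:, None], dx[None, :])
    return np.clip(dist / ramp, 0.0, 1.0)
```

Multiplying an image's RGB channels by this map and storing the map itself as the fourth channel yields the premultiplied (αR, αG, αB, α) encoding used for blending.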

What about more than two views? The geometry of three views is described by a 3 x 3 x 3 tensor called the trifocal tensor. The geometry of four views is described by a 3 x 3 x 3 x 3 tensor called the quadrifocal tensor. After this it starts to get complicated…

New approach: These matrices, tensors, etc., model geometry as a matrix that depends (in complicated ways) on the camera parameters alone. Instead, we will explicitly model both cameras and points.

Large-scale structure from motion: Dubrovnik, Croatia. 4,619 images (out of an initial 57,845). Total reconstruction time: 23 hours. Number of cores: 352.

Questions?

Structure from motion. Given many images, how can we a) figure out where they were all taken from? b) build a 3D model of the scene? This is (roughly) the structure from motion problem.

Structure from motion. Input: images with points in correspondence, p_{i,j} = (u_{i,j}, v_{i,j}). Output: structure (a 3D location x_i for each point p_i) and motion (camera parameters R_j, t_j, and possibly K_j). Objective function: minimize reprojection error. (Figure: reconstruction, side and top views.)
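To make the objective concrete, here is a small sketch of the reprojection error for a set of cameras and points. It assumes a plain pinhole model in which a point is mapped to camera coordinates as R x + t, the camera looks along +z, and every camera has an intrinsics matrix K; conventions differ between systems, and the function and argument names are illustrative.

```python
import numpy as np

def project(K, R, t, X):
    """Project 3D point X into a camera with intrinsics K and pose (R, t)."""
    x_cam = R @ X + t          # world -> camera coordinates (assumed convention)
    x_img = K @ x_cam          # camera -> homogeneous pixel coordinates
    return x_img[:2] / x_img[2]

def reprojection_error(cameras, points, observations):
    """Sum of squared pixel residuals.

    cameras: dict j -> (K_j, R_j, t_j)
    points: dict i -> 3D location x_i
    observations: dict (i, j) -> observed pixel p_ij = (u_ij, v_ij)
    """
    total = 0.0
    for (i, j), p_ij in observations.items():
        K, R, t = cameras[j]
        residual = project(K, R, t, points[i]) - np.asarray(p_ij, dtype=float)
        total += float(residual @ residual)
    return total
```

Only pairs (i, j) that actually appear in `observations` contribute, which mirrors the fact that each point is visible in only some of the images.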

Camera calibration and triangulation. Suppose we know 3D points and have matches between these points and an image: how can we compute the camera parameters? Suppose we have known camera parameters for several cameras, each of which observes a point: how can we compute the 3D location of that point?
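The second question (triangulation) has a standard linear solution, sketched below. It assumes each camera is given as a 3 x 4 projection matrix P_j = K_j [R_j | t_j] and minimizes an algebraic error via SVD rather than the reprojection error, so in practice it would typically serve as an initial guess. The function name `triangulate` is illustrative.

```python
import numpy as np

def triangulate(proj_mats, pixels):
    """Linear (DLT) triangulation of one 3D point from two or more views.

    proj_mats: list of 3x4 projection matrices
    pixels: list of observed (u, v) pixel coordinates, one per camera
    """
    rows = []
    for P, (u, v) in zip(proj_mats, pixels):
        # Each observation gives two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    _, _, vt = np.linalg.svd(A)
    X_h = vt[-1]               # right singular vector of the smallest singular value
    return X_h[:3] / X_h[3]    # back from homogeneous coordinates
```

The first question (recovering a camera from known 3D points) can be set up the same way, with the entries of P as the unknowns instead of the point.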

Structure from motion: SfM solves both of these problems at once. A kind of chicken-and-egg problem (but solvable).

Photo Tourism

First step: how to get correspondence? Feature detection and matching

Feature detection Detect features using SIFT [Lowe, IJCV 2004]

Feature matching Match features between each pair of images

Feature matching Refine matching using RANSAC to estimate fundamental matrix between each pair
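As a rough sketch of this refinement step, the code below runs RANSAC around the normalized eight-point algorithm and scores hypotheses with the Sampson distance. These are common choices, not necessarily the ones used in the lecture's system; the function names, the iteration count, and the 1-pixel threshold are all assumptions.

```python
import numpy as np

def _normalize(pts):
    """Hartley normalization: zero mean, average distance sqrt(2)."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2) / np.mean(np.linalg.norm(pts - mean, axis=1))
    T = np.array([[scale, 0, -scale * mean[0]],
                  [0, scale, -scale * mean[1]],
                  [0, 0, 1]])
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    return (T @ pts_h.T).T, T

def eight_point(x1, x2):
    """Estimate F such that x2_h^T F x1_h = 0 from N >= 8 matches (N x 2 arrays)."""
    n1, T1 = _normalize(x1)
    n2, T2 = _normalize(x2)
    A = np.column_stack([n2[:, 0:1] * n1, n2[:, 1:2] * n1, n1])
    _, _, vt = np.linalg.svd(A)
    F = vt[-1].reshape(3, 3)
    u, s, vt = np.linalg.svd(F)
    F = u @ np.diag([s[0], s[1], 0.0]) @ vt  # enforce rank 2
    return T2.T @ F @ T1                     # undo the normalization

def sampson_distance(F, x1, x2):
    """First-order approximation of squared geometric error, per match."""
    h1 = np.column_stack([x1, np.ones(len(x1))])
    h2 = np.column_stack([x2, np.ones(len(x2))])
    Fx1 = h1 @ F.T   # row i is F @ h1[i]
    Ftx2 = h2 @ F    # row i is F.T @ h2[i]
    num = np.sum(h2 * Fx1, axis=1) ** 2
    den = Fx1[:, 0] ** 2 + Fx1[:, 1] ** 2 + Ftx2[:, 0] ** 2 + Ftx2[:, 1] ** 2
    return num / den

def ransac_fundamental(x1, x2, iters=500, thresh=1.0, seed=0):
    """Return (F, inlier_mask); thresh is in pixels (assumed value)."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(x1), dtype=bool)
    for _ in range(iters):
        sample = rng.choice(len(x1), size=8, replace=False)
        F = eight_point(x1[sample], x2[sample])
        inliers = sampson_distance(F, x1, x2) < thresh ** 2
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 8:
        raise ValueError("too few inliers to estimate a fundamental matrix")
    return eight_point(x1[best], x2[best]), best  # refit on all inliers
```

Matches outside the returned inlier mask are discarded, and image pairs with too few inliers can be treated as not matching at all.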

Image connectivity graph (graph layout produced using the Graphviz toolkit)
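A connectivity graph of this kind can be represented with images as nodes and an edge between every pair that has enough verified matches. The sketch below is a plain-Python illustration; the `min_matches` threshold of 16 and the function names are hypothetical, not values taken from the lecture.

```python
from collections import defaultdict

def connectivity_graph(pair_matches, min_matches=16):
    """pair_matches: dict (image_a, image_b) -> number of verified matches."""
    graph = defaultdict(set)
    for (a, b), count in pair_matches.items():
        if count >= min_matches:  # keep only well-supported pairs (assumed threshold)
            graph[a].add(b)
            graph[b].add(a)
    return graph

def connected_components(graph, images):
    """Group images that are linked, directly or indirectly, by matches."""
    seen, components = set(), []
    for start in images:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(graph.get(node, ()))
        seen |= comp
        components.append(comp)
    return components
```

Images in different components share no chain of matches, so they cannot be placed in the same reconstruction.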

Structure from motion. (Figure: Cameras 1, 2, and 3, with poses (R1, t1), (R2, t2), (R3, t3), observe 3D points X1 to X7; p_{1,1}, p_{1,2}, p_{1,3} are the projections of X1 in the three images.) Minimize f(R, T, P): a non-linear least squares problem.

Problem size. What are the variables? How many variables per camera? How many variables per point? Trevi Fountain collection: 466 input photos + > 100,000 3D points = very large optimization problem.
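A quick back-of-the-envelope count shows the scale. The sketch assumes 3 unknowns per point (its 3D position) and 7 per camera (3 for rotation, 3 for translation, 1 for an unknown focal length); with fully known intrinsics the per-camera count would be 6. Since the slide gives "> 100,000" points, the result is a lower bound.

```python
def sfm_unknowns(num_cameras, num_points, params_per_camera=7, params_per_point=3):
    """Number of scalar unknowns in the joint camera-and-point optimization."""
    return num_cameras * params_per_camera + num_points * params_per_point

# Trevi Fountain collection: 466 photos and (at least) 100,000 points.
print(sfm_unknowns(466, 100_000))                        # 303262
print(sfm_unknowns(466, 100_000, params_per_camera=6))   # 302796
```

Almost all of the unknowns come from the points rather than the cameras.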

Incremental structure from motion

Photo Explorer

Demo

Questions?