Presentation is loading. Please wait.

Presentation is loading. Please wait.

Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely.

Similar presentations


Presentation on theme: "Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely."— Presentation transcript:

1 Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely

2 Readings Szeliski, Chapter 7.1 – 7.4

3 Announcements Project 2b due on Tuesday by 10:59pm Final project proposals due today at 11:59pm Course survey

4 Encoding blend weights: I(x,y) = (  R,  G,  B,  ) color at p = Implement this in two steps: 1. accumulate: add up the (  premultiplied) RGB  values at each pixel 2. normalize: divide each pixel’s accumulated RGB by its  value Q: what if  = 0? Alpha Blending Optional: see Blinn (CGA, 1994) for details: http://ieeexplore.ieee.org/iel1/38/7531/00310740.pdf?isNumb er=7531&prod=JNL&arnumber=310740&arSt=83&ared=87&a rAuthor=Blinn%2C+J.Fhttp://ieeexplore.ieee.org/iel1/38/7531/00310740.pdf?isNumb er=7531&prod=JNL&arnumber=310740&arSt=83&ared=87&a rAuthor=Blinn%2C+J.F. I1I1 I2I2 I3I3 p

5 How should we set the alpha values of I 1, I 2, I 3 ? Simplest choice: set all alpha values to one (gives discontinuities) Better choice: use feathering to ramp alpha values to zero near the edges Blend weights I1I1 I2I2 I3I3 p Feathered alpha channel

6 What about more than two views? The geometry of three views is described by a 3 x 3 x 3 tensor called the trifocal tensor The geometry of four views is described by a 3 x 3 x 3 x 3 tensor called the quadrifocal tensor After this it starts to get complicated…

7 New approach These matrices, tensors, etc, model geometry as a matrix that depends (in complicated ways) on the camera parameters alone Instead, we will explicitly model both cameras and points

8 Large-scale structure from motion Dubrovnik, Croatia. 4,619 images (out of an initial 57,845). Total reconstruction time: 23 hours Number of cores: 352

9 Questions?

10 Structure from motion Given many images, how can we a) figure out where they were all taken from? b) build a 3D model of the scene? This is (roughly) the structure from motion problem

11 Structure from motion Input: images with points in correspondence p i,j = (u i,j,v i,j ) Output structure: 3D location x i for each point p i motion: camera parameters R j, t j possibly K j Objective function: minimize reprojection error Reconstruction (side) (top)

12 Camera calibration and triangulation Suppose we know 3D points – And have matches between these points and an image – How can we compute the camera parameters? Suppose we have know camera parameters, each of which observes a point – How can we compute the 3D location of that point?

13 Structure from motion SfM solves both of these problems at once A kind of chicken-and-egg problem – (but solvable)

14 Photo Tourism

15 First step: how to get correspondence? Feature detection and matching

16 Feature detection Detect features using SIFT [Lowe, IJCV 2004]

17 Feature detection Detect features using SIFT [Lowe, IJCV 2004]

18 Feature matching Match features between each pair of images

19 Feature matching Refine matching using RANSAC to estimate fundamental matrix between each pair

20 Image connectivity graph (graph layout produced using the Graphviz toolkit: http://www.graphviz.org/)http://www.graphviz.org/

21 Structure from motion Camera 1 Camera 2 Camera 3 R 1,t 1 R 2,t 2 R 3,t 3 X1X1 X4X4 X3X3 X2X2 X5X5 X6X6 X7X7 minimize f (R, T, P)f (R, T, P) p 1,1 p 1,2 p 1,3 non-linear least squares

22 Problem size What are the variables? How many variables per camera? How many variables per point? Trevi Fountain collection 466 input photos + > 100,000 3D points = very large optimization problem

23 Incremental structure from motion

24

25

26 Photo Explorer

27 Demo

28

29

30

31

32 Questions?


Download ppt "Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely."

Similar presentations


Ads by Google