Lecture 9: Feature Extraction and Motion Estimation. Slides by Michael Black, Clark F. Olson, and Jean Ponce.


Motion
Rather than using two cameras, we can extract information about the environment by moving a single camera. Some motion problems are similar to stereo: correspondence and reconstruction. A new problem is motion estimation. Sometimes another problem is also present: segmentation, i.e., which image regions correspond to rigidly moving objects?

Structure From Motion
Given m pictures of n points, can we recover the three-dimensional configuration of these points (structure) and the camera configurations (motion)? Some textbooks treat motion largely from the perspective of small camera motions. We will not be so limited! (Figure: points X_j imaged as x_1j, x_2j, x_3j by cameras P_1, P_2, P_3.)

Several questions must be answered:
- What image points should be matched? (feature selection)
- What are the correct matches between the images? (feature tracking; unlike stereo, there is no epipolar constraint)
- Given the matches, what is the camera motion?
- Given the matches, where are the points?
Simplifying assumption: the scene is static, i.e., objects don't move relative to each other.

Feature Selection
We could track all image pixels, but this requires excessive computation. We want to select features that are easy to find in other images. Edges are easy to localize in the direction across the edge, but not along it: the aperture problem! Corner points (with gradients in multiple directions) can be precisely located.

Corner Detection
We should easily recognize the point by looking through a small window; shifting the window in any direction should give a large change in intensity.
- "Flat" region: no change in any direction.
- "Edge": no change along the edge direction.
- "Corner": significant change in all directions.
Source: A. Efros
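The shifting-window intuition above can be checked numerically. The sketch below (my own minimal illustration, not code from the lecture) builds synthetic flat, edge, and corner patches and evaluates the windowed sum of squared differences E(u, v) for a few shifts: the flat patch gives no change anywhere, the edge gives no change along the edge direction, and the corner changes in every direction.

```python
import numpy as np

def shift_error(img, u, v, win):
    """E(u, v): sum of squared intensity differences between a window
    and the same window shifted by (u, v)."""
    r0, r1, c0, c1 = win
    a = img[r0:r1, c0:c1]
    b = img[r0 + v:r1 + v, c0 + u:c1 + u]
    return float(((a - b) ** 2).sum())

n = 20
flat = np.zeros((n, n))
edge = np.zeros((n, n)); edge[:, 10:] = 1.0        # vertical edge at column 10
corner = np.zeros((n, n)); corner[10:, 10:] = 1.0  # corner at (10, 10)

win = (6, 14, 6, 14)  # window covering the structure
for name, img in [("flat", flat), ("edge", edge), ("corner", corner)]:
    errs = {(u, v): shift_error(img, u, v, win)
            for u, v in [(1, 0), (0, 1), (1, 1)]}
    print(name, errs)
```

Only the corner patch produces a large E for every shift direction, which is exactly why corners are the points worth tracking.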

Corner Detection
Basic idea: find image patches with gradients in multiple directions. (Figure: input image and the corners selected.)

Second moment matrix
M is a 2 x 2 matrix of image derivatives, averaged in the neighborhood of a point:

M = sum over the window of [ Ix^2   Ix*Iy ; Ix*Iy   Iy^2 ]

where Ix and Iy denote the image derivatives in the x and y directions.

Corner detection
Classification of image points using the eigenvalues λ1, λ2 of M:
- "Flat" region: λ1 and λ2 are small; E is almost constant in all directions.
- "Edge": λ1 >> λ2 (or λ2 >> λ1); E changes only across the edge.
- "Corner": λ1 and λ2 are both large, with λ1 ~ λ2; E increases in all directions.

Harris Corner Detector
1) Compute the matrix M for each image window to get its cornerness score.
2) Find points whose surrounding window gave a large corner response.
3) Take the points of local maxima, i.e., perform non-maximum suppression.
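The steps above can be sketched in a few lines of numpy. This is a bare-bones illustration under my own simplifying assumptions (box averaging instead of Gaussian weighting, the common score R = det(M) - k*trace(M)^2 with k = 0.05, and a synthetic test image); a production detector would also threshold and do non-maximum suppression.

```python
import numpy as np

def harris_response(img, k=0.05, win=2):
    """Cornerness R = det(M) - k * trace(M)^2 at every pixel,
    with M averaged over a (2*win+1)^2 neighborhood."""
    Iy, Ix = np.gradient(img.astype(float))   # derivatives along rows (y) and cols (x)
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy

    def box(a):
        # box-filter the derivative products to form the entries of M
        out = np.zeros_like(a)
        h, w = a.shape
        for r in range(win, h - win):
            for c in range(win, w - win):
                out[r, c] = a[r - win:r + win + 1, c - win:c + win + 1].sum()
        return out

    Sxx, Syy, Sxy = box(Ixx), box(Iyy), box(Ixy)
    det = Sxx * Syy - Sxy * Sxy
    trace = Sxx + Syy
    return det - k * trace ** 2

# synthetic image: a bright square, whose four corners should score highest
img = np.zeros((40, 40))
img[10:30, 10:30] = 1.0
R = harris_response(img)
print(np.unravel_index(np.argmax(R), R.shape))  # lands near a corner of the square
```

Note how the score separates the three cases on the previous slide: flat regions give R ≈ 0, edges give R < 0 (det ≈ 0, trace large), and only corners give a large positive R.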

Harris Corner Detector (figures, step by step): input images; cornerness scores; thresholded scores; local maxima; corners output.

Harris Detector Properties
Rotation invariant? Yes: the eigenvalues of M are unchanged by rotation.
Scale invariant? No: at a larger scale, all points along a zoomed-in corner are classified as edges.

Automatic Scale Selection
Intuition: find the scale that gives a local maximum of some function f in both position and scale.

Choosing a Detector
What do you want it for?
- Precise localization in x-y: Harris
- Good localization in scale: Difference of Gaussian
- Flexible region shape: MSER
The best choice is often application dependent:
- Harris-/Hessian-Laplace/DoG work well for many natural categories.
- MSER works well for buildings and printed things.
Why choose? Get more points by running more detectors.
There have been extensive evaluations/comparisons [Mikolajczyk et al., IJCV'05, PAMI'05]; all detectors/descriptors shown here work well.

Feature Tracking
Determining the corresponding features is similar to stereo vision. Problem: the epipolar lines are unknown, so a matching point could be anywhere in the image. If the motion between images is small, we can search only in a small neighborhood; otherwise, a large search space is necessary, and coarse-to-fine search is used to reduce computation time.

Feature Tracking
Challenges:
- Figure out which features can be tracked.
- Efficiently track across frames.
- Some points may change appearance over time (e.g., due to rotation, moving into shadows, etc.).
- Drift: small errors can accumulate as the appearance model is updated.
- Points may appear or disappear: we need to be able to add/delete tracked points.
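The small-motion search described above can be sketched as exhaustive SSD (sum of squared differences) template matching in a small neighborhood. This is my own minimal illustration, not the lecture's tracker: function names and the synthetic shifted frame are assumptions, and real trackers (e.g., KLT) use gradient-based refinement rather than brute-force search.

```python
import numpy as np

def track_patch(prev, curr, r, c, size=5, radius=3):
    """Find the displacement of the patch centered at (r, c) in `prev`
    by exhaustive SSD search over a (2*radius+1)^2 window in `curr`."""
    h = size // 2
    tpl = prev[r - h:r + h + 1, c - h:c + h + 1]
    best, best_ssd = (0, 0), np.inf
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            cand = curr[r + dr - h:r + dr + h + 1, c + dc - h:c + dc + h + 1]
            ssd = ((cand - tpl) ** 2).sum()
            if ssd < best_ssd:
                best, best_ssd = (dr, dc), ssd
    return best

rng = np.random.default_rng(0)
prev = rng.random((30, 30))
curr = np.roll(prev, shift=(2, -1), axis=(0, 1))  # second frame shifted by (2, -1)
print(track_patch(prev, curr, 15, 15))  # recovers the shift (2, -1)
```

The `radius` parameter is exactly the "small neighborhood" assumption: if the true motion exceeds it, the search fails, which is why coarse-to-fine pyramids are used for large motions.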

Feature Matching Example
The set of vectors from each image location to the corresponding location in the subsequent image is called a motion field.

Feature Matching Example
If the camera motion is a pure translation, the motion vectors all converge at the "focus of expansion".
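This focus-of-expansion property can be verified numerically. In the sketch below (a minimal illustration under assumed values: focal length f = 1, translation t, and random points in front of the camera), every image-motion vector induced by a pure translation is collinear with the line from the focus of expansion (f*tx/tz, f*ty/tz) to the point, so the 2D cross product of the two directions vanishes.

```python
import numpy as np

f = 1.0                        # assumed focal length
t = np.array([0.2, 0.1, 0.5])  # assumed camera translation (tx, ty, tz)
foe = f * t[:2] / t[2]         # focus of expansion: (f*tx/tz, f*ty/tz)

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (10, 3)) + np.array([0, 0, 5])  # points in front of camera

def project(P):
    """Pinhole projection of n x 3 points onto the image plane."""
    return f * P[:, :2] / P[:, 2:3]

x0 = project(X)        # image points before the motion
x1 = project(X - t)    # after: the points shift by -t in the camera frame

flow = x1 - x0         # motion vectors
radial = x0 - foe      # directions from the focus of expansion
cross = flow[:, 0] * radial[:, 1] - flow[:, 1] * radial[:, 0]
print(np.abs(cross).max())  # ~0: every motion vector lies on a line through the FOE
```

For forward motion (tz > 0), the vectors point away from the focus of expansion, which is why it appears as the point the scene "expands" from.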

Ambiguity
The relative position between the cameras has six degrees of freedom (six parameters):
- Translation in x, y, z
- Rotation about x, y, z
Problem: the images look exactly the same if everything is scaled by a constant factor. For example:
- Cameras twice as far apart
- Scene twice as big and twice as far away
We can therefore recover only 5 parameters; the scale can't be determined unless it is known in advance.

Scale Ambiguity (figure)

Structure From Motion
Given a set of corresponding points in two or more images, compute the camera parameters and the 3D point coordinates. (Figure: cameras 1-3 with unknown rotations and translations R_1,t_1, R_2,t_2, R_3,t_3 observing unknown points.) Slide credit: Noah Snavely

Solving for Structure and Motion
Total number of unknowns:
- 5 camera motion parameters
- n point depths (where n is the number of points matched)
Total number of equations: 2n (each point match constrains both the row and the column).
We can (in principle) solve for the unknowns if 2n ≥ 5 + n, i.e., n ≥ 5. Usually, many more matches than necessary are used; this improves robustness to noise.

Solving for Structure and Motion
Once the motion is known, dense matching is possible using the epipolar constraint.

Multiple Images
If there are more than two images, similar ideas apply:
- Perform matching between all images.
- Use the constraints given by the matches to estimate structure and motion.
For m images and n points, we have:
- 6(m-1) - 1 + n = 6m - 7 + n unknowns
- 2(m-1)n = 2mn - 2n constraints
We can (in principle) solve when n is at least (6m-7)/(2m-3).
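The counting argument above reduces to one line of arithmetic. The helper below (a hypothetical name, not from the slides) solves 6m - 7 + n ≤ 2(m-1)n for the smallest integer n; for m = 2 it reproduces the two-view bound n ≥ 5.

```python
from math import ceil

def min_points(m):
    """Minimum number of point matches for m views:
    unknowns 6m - 7 + n must not exceed constraints 2(m-1)n."""
    # 6m - 7 + n <= (2m - 2) n  =>  n >= (6m - 7) / (2m - 3)
    return ceil((6 * m - 7) / (2 * m - 3))

for m in (2, 3, 4):
    print(m, min_points(m))
```

Note that the bound drops quickly: already with three views, four point matches suffice in principle.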

Bundle Adjustment
A non-linear method for refining structure and motion by minimizing the reprojection error: the sum over all cameras i and points j of the squared distance between the predicted projection P_i X_j and the measured image point x_ij. (Figure: predicted projections P_1 X_j, P_2 X_j, P_3 X_j versus measured points x_1j, x_2j, x_3j.)
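The objective being minimized can be written out directly. The sketch below shows only the reprojection-error function, with assumed calibrated pinhole cameras (R, t) and noise-free synthetic measurements so the error is exactly zero at the true solution; a real bundle adjuster would minimize this objective over all cameras and points with a non-linear least-squares method such as Levenberg-Marquardt.

```python
import numpy as np

def project(R, t, X):
    """Pinhole projection of 3D points X (n x 3) by a camera with rotation R, translation t."""
    Xc = X @ R.T + t
    return Xc[:, :2] / Xc[:, 2:3]

def reprojection_error(cams, X, obs):
    """Sum of squared distances between predicted projections P_i X_j
    and measured image points x_ij: the bundle-adjustment objective."""
    err = 0.0
    for (R, t), x in zip(cams, obs):
        err += ((project(R, t, X) - x) ** 2).sum()
    return err

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, (8, 3)) + np.array([0, 0, 6])  # points in front of the cameras
cams = [(np.eye(3), np.zeros(3)),
        (np.eye(3), np.array([0.5, 0.0, 0.0]))]       # second camera translated
obs = [project(R, t, X) for R, t in cams]             # noise-free measurements

print(reprojection_error(cams, X, obs))  # 0.0 at the true structure and motion
```

Perturbing any camera pose or 3D point makes the error strictly positive, which is what the optimizer exploits when refining an initial structure-from-motion estimate.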

Stereo Ego-motion
One application of structure from motion is to determine the path of a robot by examining the images that it takes. The use of stereo provides several advantages:
- The scale is known, since we can compute scene depths.
- There is more information (depth) for matching points.

Stereo Ego-motion
The stereo ego-motion loop:
1. Feature selection in the first stereo pair.
2. Stereo matching in the first stereo pair.
3. Feature tracking into the second stereo pair.
4. Stereo matching in the second stereo pair.
5. Motion estimation using the 3D feature positions.
6. Repeat with new images until done.

Ego-motion Steps
(Figures: features selected; features matched in the right image; features tracked in the left image; features tracked in the right image.)

Stereo Ego-motion
(Figure: the "Urbie" robot; odometry track vs. actual track (GPS) vs. estimated track.)

Advanced Feature Matching
(Figures: left image, right image, and the left image after affine optimization.)