776 Computer Vision Jan-Michael Frahm Spring 2012.

1 776 Computer Vision Jan-Michael Frahm Spring 2012

2

3 Feature point extraction Image regions are homogeneous, edge-like, or corner-like. Find points for which the cornerness measure is maximal, i.e. maximize the smallest eigenvalue of the structure tensor M.

4 Comparing image regions Compare intensities pixel-by-pixel between I(x,y) and I′(x,y). Dissimilarity measure: the Sum of Squared Differences (SSD).

5 Comparing image regions Compare intensities pixel-by-pixel between I(x,y) and I′(x,y). Similarity measure: Zero-mean Normalized Cross-Correlation (ZNCC).
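The two measures can be sketched in a few lines of NumPy (a minimal illustration, not tied to any particular implementation; the patch values are invented for the example):

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: a dissimilarity measure (0 for identical patches)."""
    d = a.astype(float) - b.astype(float)
    return float(np.sum(d * d))

def zncc(a, b):
    """Zero-mean normalized cross-correlation: a similarity measure in [-1, 1]."""
    a = a.astype(float) - a.mean()
    b = b.astype(float) - b.mean()
    return float(np.sum(a * b) / np.sqrt(np.sum(a * a) * np.sum(b * b)))

patch = np.arange(9.0).reshape(3, 3)
brighter = 2.0 * patch + 10.0          # the same patch under a gain and offset change
print(ssd(patch, patch))               # 0.0
print(round(zncc(patch, brighter), 3)) # 1.0: ZNCC is invariant to gain and offset
```

The gain/offset invariance is why ZNCC is preferred when the two images differ in exposure, as in the gain-adaptive tracking slides later on.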

6 Feature point extraction Approximate the SSD for a small displacement Δ: o Take the image difference and square it per pixel o Sum the squared differences over the window to obtain the SSD

7 Harris corner detector Use a small local window and maximize the "cornerness". Only use local maxima; obtain subpixel accuracy through second-order surface fitting. Select the strongest features over the whole image and over each tile (e.g. 1000/image, 2/tile).
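The cornerness computation can be sketched in pure NumPy (a toy example: a simple 3x3 box window stands in for the usual Gaussian weighting, k = 0.04 is a typical choice, and the test image is a synthetic bright square):

```python
import numpy as np

def box3(a):
    """3x3 window sum via zero padding and shifted slices (stand-in for a Gaussian window)."""
    p = np.pad(a, 1)
    h, w = a.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3))

def harris_response(img, k=0.04):
    """Cornerness R = det(M) - k * trace(M)^2 from the 2x2 structure tensor M."""
    Iy, Ix = np.gradient(img.astype(float))
    Sxx, Syy, Sxy = box3(Ix * Ix), box3(Iy * Iy), box3(Ix * Iy)
    return Sxx * Syy - Sxy * Sxy - k * (Sxx + Syy) ** 2

img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0                  # bright square: 4 corners, 4 edges, flat interior
R = harris_response(img)
print(R[5, 5] > 0, R[5, 10] < 0, R[10, 10] == 0)  # corner, edge, flat region
```

Non-maximum suppression and the per-image/per-tile selection described above would then keep only the strongest positive responses.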

8 Simple matching For each corner in image 1, find the corner in image 2 that is most similar (using SSD or NCC), and vice versa. Only compare geometrically compatible points. Keep mutual best matches. What transformations does this work for?

9 Feature matching: example ZNCC scores between features 1-5 of image 1 (rows) and features 1-5 of image 2 (columns):
 0.96 -0.40 -0.16 -0.39  0.19
-0.05  0.75 -0.47  0.51  0.72
-0.18 -0.39  0.73  0.15 -0.75
-0.27  0.49  0.16  0.79  0.21
 0.08  0.50 -0.45  0.28  0.99
What transformations does this work for? What level of transformation do we need?
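Mutual-best matching on a score table like this one is a few lines of NumPy (a sketch; the matrix entries are the ZNCC scores from this slide):

```python
import numpy as np

def mutual_best(S):
    """Keep only mutual best matches: i's best column j must also have i as its best row."""
    best12 = S.argmax(axis=1)  # best match in image 2 for each feature of image 1
    best21 = S.argmax(axis=0)  # best match in image 1 for each feature of image 2
    return [(i, int(best12[i])) for i in range(S.shape[0]) if best21[best12[i]] == i]

S = np.array([[ 0.96, -0.40, -0.16, -0.39,  0.19],
              [-0.05,  0.75, -0.47,  0.51,  0.72],
              [-0.18, -0.39,  0.73,  0.15, -0.75],
              [-0.27,  0.49,  0.16,  0.79,  0.21],
              [ 0.08,  0.50, -0.45,  0.28,  0.99]])
print(mutual_best(S))  # [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
```

Here every feature's best match is mutual, so all five correspondences survive; with ambiguous scores, non-mutual pairs would be dropped.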

10 Feature tracking Identify features and track them over video o Small difference between consecutive frames o Potentially large difference overall Standard approach: KLT (Kanade-Lucas-Tomasi)

11 Feature Tracking Establish correspondences between identical salient points in multiple images

12 Good features to track Use the same window for feature selection as for the tracking itself. Compute the motion assuming it is small (differentiate). An affine model is also possible, but a bit harder (a 6x6 instead of a 2x2 system).

13 Example Simple displacement is sufficient between consecutive frames, but not for comparing against the reference template

14 Example

15 Synthetic example

16 Good features to keep tracking Perform affine alignment between first and last frame Stop tracking features with too large errors

17 Brightness constancy assumption Optical flow (small motion), 1D example; the estimate can be improved by iterative refinement
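In 1D the linearized constraint Ix·u + It ≈ 0 gives u ≈ -It/Ix per pixel; a least-squares estimate over all pixels plus iterative refinement recovers a sub-pixel shift. A toy NumPy sketch with a synthetic Gaussian signal (all values invented for illustration):

```python
import numpy as np

x = np.linspace(0.0, 10.0, 200)
true_u = 0.3
I1 = np.exp(-(x - 5.0 - true_u) ** 2)       # frame at t+1: the template shifted by true_u

u = 0.0
for _ in range(5):                          # iterative refinement
    warped = np.exp(-(x - 5.0 - u) ** 2)    # frame at t, warped by the current estimate
    Ix = np.gradient(warped, x)             # spatial derivative
    It = I1 - warped                        # temporal difference
    u -= np.sum(Ix * It) / np.sum(Ix * Ix)  # least-squares update over all pixels
print(round(u, 3))
```

Each pass re-warps by the current estimate, so the remaining displacement shrinks and the linearization becomes more accurate.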

18 Brightness constancy assumption Optical flow (small motion), 2D example: 2 unknowns but only 1 constraint per pixel. The isophotes I(t) = I and I(t+1) = I illustrate the "aperture" problem.

19 Motion estimation The aperture problem Algorithm: at each pixel, compute the displacement by solving the 2x2 system built from M. M is singular if all gradient vectors point in the same direction, e.g. along an edge; it is of course trivially singular if the summation is over a single pixel or there is no texture, i.e. only the normal flow is available (aperture problem). Corners and textured areas are OK. Slide credit: S. Seitz, R. Szeliski

20 Optical flow How to deal with aperture problem? Assume neighbors have same displacement (3 constraints if color gradients are different) Slide credit: S. Seitz, R. Szeliski

21 Motion estimation 21 SSD Surface – Textured area Slide credit: S. Seitz, R. Szeliski

22 Motion estimation 22 SSD Surface -- Edge Slide credit: S. Seitz, R. Szeliski

23 Motion estimation 23 SSD – homogeneous area Slide credit: S. Seitz, R. Szeliski

24 Lucas-Kanade Assume neighbors have same displacement least-squares:
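The least-squares step amounts to solving the 2x2 normal equations M [u v]^T = -[sum(Ix*It), sum(Iy*It)]^T over the window. A minimal NumPy sketch on a synthetic blob with known sub-pixel motion (the image and the motion values are invented for the example):

```python
import numpy as np

ys, xs = np.mgrid[0:40, 0:40].astype(float)

def frame(dx, dy):
    """Gaussian blob centered at (20 + dx, 20 + dy)."""
    return np.exp(-((xs - 20.0 - dx) ** 2 + (ys - 20.0 - dy) ** 2) / 8.0)

I0, I1 = frame(0.0, 0.0), frame(0.2, -0.1)   # true motion: (u, v) = (0.2, -0.1)
Iy, Ix = np.gradient(I0)
It = I1 - I0
M = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
              [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
u, v = np.linalg.solve(M, b)                 # one Lucas-Kanade step
print(round(u, 2), round(v, 2))
```

The textured blob makes M well-conditioned; on an edge or a flat region, M would be (near-)singular, exactly as the aperture-problem slides describe.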

25 Revisiting the small motion assumption Is this motion small enough? o Probably not; it's much larger than one pixel (2nd-order terms dominate) o How might we solve this problem? * From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

26 Reduce the resolution! * From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

27 Coarse-to-fine optical flow estimation Build Gaussian pyramids of image I t-1 and image I; the displacement halves at each coarser level: u = 10 pixels, 5 pixels, 2.5 pixels, 1.25 pixels. Slides from Bradski and Thrun

28 Coarse-to-fine optical flow estimation At each level of the Gaussian pyramids of image I t-1 and image I: run iterative L-K, then warp & upsample, and repeat at the next finer level. Slides from Bradski and Thrun
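The coarse-to-fine scheme is easiest to see in 1D: blur-and-downsample both signals, estimate at the coarsest level, then at each finer level double the estimate and refine with iterative warped L-K (a toy NumPy sketch; the signal and its 10-pixel shift are invented, and linear interpolation stands in for proper image warping):

```python
import numpy as np

def downsample(s):
    """One pyramid level: blur with a [1 2 1]/4 kernel, then keep every other sample."""
    k = np.array([1.0, 2.0, 1.0]) / 4.0
    return np.convolve(s, k, mode="same")[::2]

def lk_1d(a, b, u0=0.0, iters=10):
    """Iterative 1D Lucas-Kanade: refine u so that b(x) ~ a(x - u)."""
    n = np.arange(len(a), dtype=float)
    u = u0
    for _ in range(iters):
        warped = np.interp(n - u, n, a)           # warp a by the current estimate
        Ix = np.gradient(warped)
        It = b - warped
        u -= np.sum(Ix * It) / (np.sum(Ix * Ix) + 1e-12)
    return u

def coarse_to_fine(a, b, levels=4):
    pyramid = [(a, b)]
    for _ in range(levels - 1):
        a, b = downsample(a), downsample(b)
        pyramid.append((a, b))
    u = 0.0
    for la, lb in reversed(pyramid):              # coarsest level first
        u = lk_1d(la, lb, u0=2.0 * u)             # displacement doubles per finer level
    return u

n = np.arange(256, dtype=float)
a = np.exp(-((n - 100.0) / 12.0) ** 2)
b = np.exp(-((n - 110.0) / 12.0) ** 2)            # the same bump shifted by 10 samples
print(round(coarse_to_fine(a, b), 2))
```

At the coarsest level the 10-sample shift has shrunk to 1.25 samples, which is inside the operating range of the gradient method; each finer level then only corrects a small residual.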

29 Gain-Adaptive KLT-Tracking Video with fixed gain vs. video with auto-gain. Data-parallel implementation on the GPU [Sinha, Frahm, Pollefeys, Genc MVA'07]. Simultaneous tracking and radiometric calibration [Kim, Frahm, Pollefeys ICCV'07]: but not data-parallel, hence hard to accelerate on the GPU. Block-Jacobi iterations [Zach, Gallup, Frahm CVGPU'08]: data-parallel, very efficient on the GPU.

30 Gain Estimation Camera-reported (blue) and estimated (red) gains [Zach, Gallup, Frahm CVGPU'08]

31 Limits of the gradient method Fails when the intensity structure within the window is poor. Fails when the displacement is large (the typical operating range is a motion of 1 pixel); linearization of brightness is suitable only for small displacements. Also, brightness is not strictly constant in images; this is actually less problematic than it appears, since we can pre-filter images to make them look similar. Slide credit: S. Seitz, R. Szeliski

32 Limitations of Yosemite The only sequence used for quantitative evaluation. Limitations: very simple and synthetic; small, rigid motion; minimal motion discontinuities/occlusions. (Figure: Image 7, Image 8, Yosemite ground-truth flow, flow color coding.) Slide credit: S. Seitz, R. Szeliski

33 Limitations of Yosemite The only sequence used for quantitative evaluation. Current challenges: non-rigid motion, real sensor noise, complex natural scenes, motion discontinuities. We need more challenging and more realistic benchmarks. (Figure: Image 7, Image 8, Yosemite ground-truth flow, flow color coding.) Slide credit: S. Seitz, R. Szeliski

34 Motion estimation Realistic synthetic imagery Randomly generate scenes with "trees" and "rocks"; significant occlusions, motion, texture, and blur. Rendered using Mental Ray and a "lens shader" plugin. (Examples: Rock, Grove.) Slide credit: S. Seitz, R. Szeliski

35 Motion estimation Modified stereo imagery Recrop and resample ground-truth stereo datasets to have motion appropriate for optical flow. (Examples: Venus, Moebius.) Slide credit: S. Seitz, R. Szeliski

36 Dense flow with hidden texture Paint the scene with textured fluorescent paint. Take 2 images: one in visible light, one in UV light. Move the scene in very small steps using a robot. Generate ground truth by tracking the UV images. (Figure: setup, visible/UV lights, image, cropped.) Slide credit: S. Seitz, R. Szeliski

37 Experimental results Algorithms: Pyramid LK: OpenCV-based implementation of Lucas-Kanade on a Gaussian pyramid. Black and Anandan: authors' implementation. Bruhn et al.: our implementation. MediaPlayer™: code used for video frame-rate upsampling in Microsoft MediaPlayer. Zitnick et al.: authors' implementation. Slide credit: S. Seitz, R. Szeliski

38 Motion estimation Experimental results Slide credit: S. Seitz, R. Szeliski

39 Conclusions Difficulty: the data is substantially more challenging than Yosemite. Diversity: substantial variation in difficulty across the various datasets. Motion ground truth vs. interpolation: the best algorithms for one are not the best for the other. Comparison with stereo: the performance of existing flow algorithms appears weak. Slide credit: S. Seitz, R. Szeliski

40 Motion representations How can we describe this scene? Slide credit: S. Seitz, R. Szeliski

41 Block-based motion prediction Break the image up into square blocks. Estimate a translation for each block. Use this to predict the next frame and code the difference (MPEG-2). Slide credit: S. Seitz, R. Szeliski
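Exhaustive block matching can be sketched as follows (a toy NumPy version: the block size, search radius, and images are invented, the motion vector points from the current block back to its source in the previous frame, and the residual coding that a real codec performs is omitted):

```python
import numpy as np

def block_motion(prev, cur, bs=8, search=4):
    """For each bs x bs block of `cur`, find the integer offset (dy, dx) within
    +/- search whose block in `prev` minimizes the SSD."""
    H, W = cur.shape
    motion = {}
    for by in range(0, H - bs + 1, bs):
        for bx in range(0, W - bs + 1, bs):
            block = cur[by:by + bs, bx:bx + bs]
            best, best_err = (0, 0), np.inf
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = by + dy, bx + dx
                    if 0 <= y <= H - bs and 0 <= x <= W - bs:
                        err = np.sum((block - prev[y:y + bs, x:x + bs]) ** 2)
                        if err < best_err:
                            best, best_err = (dy, dx), err
            motion[(by, bx)] = best
    return motion

prev = np.zeros((16, 16)); prev[6:10, 6:10] = 1.0
cur = np.zeros((16, 16)); cur[8:12, 9:13] = 1.0   # the square moved down 2, right 3
print(block_motion(prev, cur)[(8, 8)])             # (-2, -3): source lies up-left in prev
```

Real encoders speed this up with hierarchical or diamond search instead of the exhaustive scan shown here.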

42 Layered motion Break the image sequence up into "layers", then describe each layer's motion. Slide credit: S. Seitz, R. Szeliski

43 Layered motion Advantages: can represent occlusions/disocclusions; each layer's motion can be smooth; provides a video segmentation for semantic processing. Difficulties: how do we determine the correct number of layers? how do we assign pixels? how do we model the motion? Slide credit: S. Seitz, R. Szeliski

44 Motion estimation Layers for video summarization Slide credit: S. Seitz, R. Szeliski

45 Background modeling (MPEG-4) Convert masked images into a background sprite for layered video coding. Slide credit: S. Seitz, R. Szeliski

46 What are layers? [Wang & Adelson, 1994] Each layer consists of intensities, alphas, and velocities. Slide credit: S. Seitz, R. Szeliski

47 How do we form them? Slide credit: S. Seitz, R. Szeliski

48 How do we estimate the layers? 1. Compute coarse-to-fine flow. 2. Estimate affine motion in blocks (regression). 3. Cluster with k-means. 4. Assign pixels to the best-fitting affine region. 5. Re-estimate affine motions in each region… Slide credit: S. Seitz, R. Szeliski
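Steps 2-4 can be sketched on a toy flow field (NumPy; the per-block mean translation stands in for the full affine regression, a hand-rolled 2-means replaces a general k-means, and the flow values are invented):

```python
import numpy as np

# Toy flow field: the left half of the image moves (1, 0), the right half (0, 1).
flow = np.zeros((16, 16, 2))
flow[:, :8] = (1.0, 0.0)
flow[:, 8:] = (0.0, 1.0)

# Step 2: estimate motion per 4x4 block (mean flow stands in for affine regression)
params = np.array([flow[y:y + 4, x:x + 4].reshape(-1, 2).mean(axis=0)
                   for y in range(0, 16, 4) for x in range(0, 16, 4)])

# Step 3: 2-means on the block motion parameters (crude init: first and last block)
centers = params[[0, -1]]
for _ in range(10):
    d = np.linalg.norm(params[:, None] - centers[None], axis=2)
    label = d.argmin(axis=1)
    centers = np.array([params[label == k].mean(axis=0) for k in range(2)])

# Step 4: assign each pixel to the layer whose motion best explains its flow
d = np.linalg.norm(flow[..., None, :] - centers[None, None], axis=3)
layers = d.argmin(axis=2)
print(centers)
```

Step 5 would then re-fit an affine motion inside each recovered region and iterate.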

49 Layer synthesis For each layer: stabilize the sequence with the affine motion, then compute the median value at each pixel. Determine occlusion relationships. Slide credit: S. Seitz, R. Szeliski
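The per-pixel temporal median over the stabilized sequence is what suppresses pixels belonging to other (occluding) layers; a toy NumPy sketch with invented frame values:

```python
import numpy as np

# Stabilized background layer (constant 10.0) with a foreground object sweeping
# across it; the object covers any given pixel in at most 2 of the 5 frames.
frames = []
for t in range(5):
    f = np.full((8, 8), 10.0)
    f[2:4, t:t + 2] = 99.0
    frames.append(f)
background = np.median(np.stack(frames), axis=0)
print(background[2, 2])  # 10.0: the sweeping foreground never wins the median
```

The median needs each pixel to show the layer's own content in a majority of frames, which the stabilization step is meant to guarantee.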

50 Results

51 Fitting

52 Fitting We've learned how to detect edges, corners, and blobs. Now what? We would like to form a higher-level, more compact representation of the features in the image by grouping multiple features according to a simple model. (Image: 9300 Harris Corners Pkwy, Charlotte, NC.) Slide credit: S. Lazebnik

53 Fitting Choose a parametric model to represent a set of features: a simple model such as a line or a circle, or a complicated model such as a car. Source: K. Grauman

54 Fitting: Issues Noise in the measured feature locations. Extraneous data: clutter (outliers), multiple lines. Missing data: occlusions. Case study: line detection. Slide credit: S. Lazebnik

55 Fitting: Overview If we know which points belong to the line, how do we find the “optimal” line parameters? o Least squares What if there are outliers? o Robust fitting, RANSAC What if there are many lines? o Voting methods: RANSAC, Hough transform What if we’re not even sure it’s a line? o Model selection Slide credit: S. Lazebnik

56 Least squares line fitting Data: (x1, y1), …, (xn, yn). Line equation: yi = m xi + b. Find (m, b) to minimize the sum of squared vertical residuals; the normal equations give the least-squares solution to XB = Y. Slide credit: S. Lazebnik
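The normal equations are solved directly by np.linalg.lstsq; a minimal sketch with invented noisy data:

```python
import numpy as np

# Fit y = m*x + b: stack X = [x 1] and solve X^T X [m b]^T = X^T y,
# which np.linalg.lstsq does in a numerically stable way.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0 + np.array([0.05, -0.03, 0.02, -0.04, 0.01])  # noisy line
X = np.column_stack([x, np.ones_like(x)])
(m, b), *_ = np.linalg.lstsq(X, y, rcond=None)
print(round(m, 2), round(b, 2))
```

The recovered slope and intercept land close to the true (2, 1) despite the noise.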

57 Problem with "vertical" least squares Not rotation-invariant; fails completely for vertical lines. Slide credit: S. Lazebnik
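The standard fix, total least squares, minimizes perpendicular rather than vertical distances: the line normal (a, b) for a x + b y = d is the eigenvector of the centered data's scatter matrix with the smallest eigenvalue, so a perfectly vertical line poses no problem. A sketch (NumPy, with invented points on the line x = 3):

```python
import numpy as np

pts = np.array([[3.0, 0.0], [3.0, 1.0], [3.0, 2.0], [3.0, 3.0]])  # vertical line x = 3
centered = pts - pts.mean(axis=0)
w, V = np.linalg.eigh(centered.T @ centered)  # eigenvalues in ascending order
normal = V[:, 0]                              # eigenvector of the smallest eigenvalue
d = normal @ pts.mean(axis=0)                 # line: a*x + b*y = d
print(normal, d)
```

Up to sign, the recovered normal is (1, 0) with d = 3, i.e. exactly x = 3, where the "vertical" formulation y = m x + b has no finite solution at all.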

