776 Computer Vision Jan-Michael Frahm Spring 2012.

1 776 Computer Vision Jan-Michael Frahm Spring 2012

2

3 Feature point extraction Image regions are homogeneous, edge-like, or corner-like. Find points for which the cornerness measure is maximal, i.e. maximize the smallest eigenvalue of the structure tensor M.

4 Comparing image regions Compare intensities pixel-by-pixel between I(x,y) and I′(x,y). Dissimilarity measure: the Sum of Squared Differences (SSD).

5 Comparing image regions Compare intensities pixel-by-pixel between I(x,y) and I′(x,y). Similarity measure: Zero-mean Normalized Cross-Correlation (ZNCC).
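The two measures can be sketched in a few lines of NumPy (a minimal illustration, not tied to any particular implementation; the patch values are invented for the example):

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: a dissimilarity measure (0 for identical patches)."""
    d = a.astype(float) - b.astype(float)
    return float(np.sum(d * d))

def zncc(a, b):
    """Zero-mean normalized cross-correlation: a similarity measure in [-1, 1]."""
    a = a.astype(float) - a.mean()
    b = b.astype(float) - b.mean()
    return float(np.sum(a * b) / np.sqrt(np.sum(a * a) * np.sum(b * b)))

patch = np.arange(9.0).reshape(3, 3)
brighter = 2.0 * patch + 10.0          # the same patch under a gain and offset change
print(ssd(patch, patch))               # 0.0
print(round(zncc(patch, brighter), 3)) # 1.0: ZNCC is invariant to gain and offset
```

The gain/offset invariance is why ZNCC is preferred when the two images differ in exposure, as in the gain-adaptive tracking slides later on.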

6 Feature point extraction Approximate the SSD for a small displacement Δ: o Take the image difference and square it per pixel o Sum the squared differences over the window to obtain the SSD

7 Harris corner detector Use a small local window and maximize the "cornerness". Only use local maxima; obtain subpixel accuracy through second-order surface fitting. Select the strongest features over the whole image and over each tile (e.g. 1000/image, 2/tile).
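The cornerness computation can be sketched in pure NumPy (a toy example: a simple 3x3 box window stands in for the usual Gaussian weighting, k = 0.04 is a typical choice, and the test image is a synthetic bright square):

```python
import numpy as np

def box3(a):
    """3x3 window sum via zero padding and shifted slices (stand-in for a Gaussian window)."""
    p = np.pad(a, 1)
    h, w = a.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3))

def harris_response(img, k=0.04):
    """Cornerness R = det(M) - k * trace(M)^2 from the 2x2 structure tensor M."""
    Iy, Ix = np.gradient(img.astype(float))
    Sxx, Syy, Sxy = box3(Ix * Ix), box3(Iy * Iy), box3(Ix * Iy)
    return Sxx * Syy - Sxy * Sxy - k * (Sxx + Syy) ** 2

img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0                  # bright square: 4 corners, 4 edges, flat interior
R = harris_response(img)
print(R[5, 5] > 0, R[5, 10] < 0, R[10, 10] == 0)  # corner, edge, flat region
```

Non-maximum suppression and the per-image/per-tile selection described above would then keep only the strongest positive responses.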

8 Simple matching For each corner in image 1, find the corner in image 2 that is most similar (using SSD or NCC), and vice versa. Only compare geometrically compatible points. Keep mutual best matches. What transformations does this work for?

9 Feature matching: example ZNCC scores between features 1-5 of image 1 (rows) and features 1-5 of image 2 (columns):
 0.96 -0.40 -0.16 -0.39  0.19
-0.05  0.75 -0.47  0.51  0.72
-0.18 -0.39  0.73  0.15 -0.75
-0.27  0.49  0.16  0.79  0.21
 0.08  0.50 -0.45  0.28  0.99
What transformations does this work for? What level of transformation do we need?
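Mutual-best matching on a score table like this one is a few lines of NumPy (a sketch; the matrix entries are the ZNCC scores from this slide):

```python
import numpy as np

def mutual_best(S):
    """Keep only mutual best matches: i's best column j must also have i as its best row."""
    best12 = S.argmax(axis=1)  # best match in image 2 for each feature of image 1
    best21 = S.argmax(axis=0)  # best match in image 1 for each feature of image 2
    return [(i, int(best12[i])) for i in range(S.shape[0]) if best21[best12[i]] == i]

S = np.array([[ 0.96, -0.40, -0.16, -0.39,  0.19],
              [-0.05,  0.75, -0.47,  0.51,  0.72],
              [-0.18, -0.39,  0.73,  0.15, -0.75],
              [-0.27,  0.49,  0.16,  0.79,  0.21],
              [ 0.08,  0.50, -0.45,  0.28,  0.99]])
print(mutual_best(S))  # [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
```

Here every feature's best match is mutual, so all five correspondences survive; with ambiguous scores, non-mutual pairs would be dropped.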

10 Feature tracking Identify features and track them over video o Small difference between consecutive frames o Potentially large difference overall Standard approach: KLT (Kanade-Lucas-Tomasi)

11 Feature Tracking Establish correspondences between identical salient points in multiple images

12 Good features to track Use the same window for feature selection as for the tracking itself. Compute the motion assuming it is small (differentiate). An affine model is also possible, but a bit harder (a 6x6 instead of a 2x2 system).

13 Example Simple displacement is sufficient between consecutive frames, but not for comparing against the reference template

14 Example

15 Synthetic example

16 Good features to keep tracking Perform affine alignment between first and last frame Stop tracking features with too large errors

17 Brightness constancy assumption Optical flow (small motion), 1D example; the estimate can be improved by iterative refinement
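In 1D the linearized constraint Ix·u + It ≈ 0 gives u ≈ -It/Ix per pixel; a least-squares estimate over all pixels plus iterative refinement recovers a sub-pixel shift. A toy NumPy sketch with a synthetic Gaussian signal (all values invented for illustration):

```python
import numpy as np

x = np.linspace(0.0, 10.0, 200)
true_u = 0.3
I1 = np.exp(-(x - 5.0 - true_u) ** 2)       # frame at t+1: the template shifted by true_u

u = 0.0
for _ in range(5):                          # iterative refinement
    warped = np.exp(-(x - 5.0 - u) ** 2)    # frame at t, warped by the current estimate
    Ix = np.gradient(warped, x)             # spatial derivative
    It = I1 - warped                        # temporal difference
    u -= np.sum(Ix * It) / np.sum(Ix * Ix)  # least-squares update over all pixels
print(round(u, 3))
```

Each pass re-warps by the current estimate, so the remaining displacement shrinks and the linearization becomes more accurate.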

18 Brightness constancy assumption Optical flow (small motion), 2D example: 2 unknowns but only 1 constraint per pixel. The isophotes I(t) = I and I(t+1) = I illustrate the "aperture" problem.

19 Motion estimation The aperture problem Algorithm: at each pixel, compute the displacement by solving the 2x2 system built from M. M is singular if all gradient vectors point in the same direction, e.g. along an edge; it is of course trivially singular if the summation is over a single pixel or there is no texture, i.e. only the normal flow is available (aperture problem). Corners and textured areas are OK. Slide credit: S. Seitz, R. Szeliski

20 Optical flow How to deal with aperture problem? Assume neighbors have same displacement (3 constraints if color gradients are different) Slide credit: S. Seitz, R. Szeliski

21 Motion estimation 21 SSD Surface – Textured area Slide credit: S. Seitz, R. Szeliski

22 Motion estimation 22 SSD Surface -- Edge Slide credit: S. Seitz, R. Szeliski

23 Motion estimation 23 SSD – homogeneous area Slide credit: S. Seitz, R. Szeliski

24 Lucas-Kanade Assume neighbors have same displacement least-squares:
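The least-squares step amounts to solving the 2x2 normal equations M [u v]^T = -[sum(Ix*It), sum(Iy*It)]^T over the window. A minimal NumPy sketch on a synthetic blob with known sub-pixel motion (the image and the motion values are invented for the example):

```python
import numpy as np

ys, xs = np.mgrid[0:40, 0:40].astype(float)

def frame(dx, dy):
    """Gaussian blob centered at (20 + dx, 20 + dy)."""
    return np.exp(-((xs - 20.0 - dx) ** 2 + (ys - 20.0 - dy) ** 2) / 8.0)

I0, I1 = frame(0.0, 0.0), frame(0.2, -0.1)   # true motion: (u, v) = (0.2, -0.1)
Iy, Ix = np.gradient(I0)
It = I1 - I0
M = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
              [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
u, v = np.linalg.solve(M, b)                 # one Lucas-Kanade step
print(round(u, 2), round(v, 2))
```

The textured blob makes M well-conditioned; on an edge or a flat region, M would be (near-)singular, exactly as the aperture-problem slides describe.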

25 Revisiting the small motion assumption Is this motion small enough? o Probably not; it's much larger than one pixel (2nd-order terms dominate) o How might we solve this problem? * From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

26 Reduce the resolution! * From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

27 Coarse-to-fine optical flow estimation Build Gaussian pyramids of image I t-1 and image I; the displacement halves at each coarser level: u = 10 pixels, 5 pixels, 2.5 pixels, 1.25 pixels. Slides from Bradski and Thrun

28 Coarse-to-fine optical flow estimation At each level of the Gaussian pyramids of image I t-1 and image I: run iterative L-K, then warp & upsample, and repeat at the next finer level. Slides from Bradski and Thrun
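The coarse-to-fine scheme is easiest to see in 1D: blur-and-downsample both signals, estimate at the coarsest level, then at each finer level double the estimate and refine with iterative warped L-K (a toy NumPy sketch; the signal and its 10-pixel shift are invented, and linear interpolation stands in for proper image warping):

```python
import numpy as np

def downsample(s):
    """One pyramid level: blur with a [1 2 1]/4 kernel, then keep every other sample."""
    k = np.array([1.0, 2.0, 1.0]) / 4.0
    return np.convolve(s, k, mode="same")[::2]

def lk_1d(a, b, u0=0.0, iters=10):
    """Iterative 1D Lucas-Kanade: refine u so that b(x) ~ a(x - u)."""
    n = np.arange(len(a), dtype=float)
    u = u0
    for _ in range(iters):
        warped = np.interp(n - u, n, a)           # warp a by the current estimate
        Ix = np.gradient(warped)
        It = b - warped
        u -= np.sum(Ix * It) / (np.sum(Ix * Ix) + 1e-12)
    return u

def coarse_to_fine(a, b, levels=4):
    pyramid = [(a, b)]
    for _ in range(levels - 1):
        a, b = downsample(a), downsample(b)
        pyramid.append((a, b))
    u = 0.0
    for la, lb in reversed(pyramid):              # coarsest level first
        u = lk_1d(la, lb, u0=2.0 * u)             # displacement doubles per finer level
    return u

n = np.arange(256, dtype=float)
a = np.exp(-((n - 100.0) / 12.0) ** 2)
b = np.exp(-((n - 110.0) / 12.0) ** 2)            # the same bump shifted by 10 samples
print(round(coarse_to_fine(a, b), 2))
```

At the coarsest level the 10-sample shift has shrunk to 1.25 samples, which is inside the operating range of the gradient method; each finer level then only corrects a small residual.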

29 Gain-Adaptive KLT-Tracking Video with fixed gain vs. video with auto-gain. Data-parallel implementation on the GPU [Sinha, Frahm, Pollefeys, Genc MVA'07]. Simultaneous tracking and radiometric calibration [Kim, Frahm, Pollefeys ICCV'07]: but not data-parallel, hence hard to accelerate on the GPU. Block-Jacobi iterations [Zach, Gallup, Frahm CVGPU'08]: data-parallel, very efficient on the GPU.

30 Gain Estimation Camera-reported (blue) and estimated (red) gains [Zach, Gallup, Frahm CVGPU'08]

31 Limits of the gradient method Fails when the intensity structure within the window is poor. Fails when the displacement is large (the typical operating range is a motion of 1 pixel); linearization of brightness is suitable only for small displacements. Also, brightness is not strictly constant in images; this is actually less problematic than it appears, since we can pre-filter images to make them look similar. Slide credit: S. Seitz, R. Szeliski

32 Limitations of Yosemite The only sequence used for quantitative evaluation. Limitations: very simple and synthetic; small, rigid motion; minimal motion discontinuities/occlusions. (Figure: Image 7, Image 8, Yosemite ground-truth flow, flow color coding.) Slide credit: S. Seitz, R. Szeliski

33 Limitations of Yosemite The only sequence used for quantitative evaluation. Current challenges: non-rigid motion, real sensor noise, complex natural scenes, motion discontinuities. We need more challenging and more realistic benchmarks. (Figure: Image 7, Image 8, Yosemite ground-truth flow, flow color coding.) Slide credit: S. Seitz, R. Szeliski

34 Motion estimation Realistic synthetic imagery Randomly generate scenes with "trees" and "rocks"; significant occlusions, motion, texture, and blur. Rendered using Mental Ray and a "lens shader" plugin. (Examples: Rock, Grove.) Slide credit: S. Seitz, R. Szeliski

35 Motion estimation Modified stereo imagery Recrop and resample ground-truth stereo datasets to have motion appropriate for optical flow. (Examples: Venus, Moebius.) Slide credit: S. Seitz, R. Szeliski

36 Dense flow with hidden texture Paint the scene with textured fluorescent paint. Take 2 images: one in visible light, one in UV light. Move the scene in very small steps using a robot. Generate ground truth by tracking the UV images. (Figure: setup, visible/UV lights, image, cropped.) Slide credit: S. Seitz, R. Szeliski

37 Experimental results Algorithms: Pyramid LK: OpenCV-based implementation of Lucas-Kanade on a Gaussian pyramid. Black and Anandan: authors' implementation. Bruhn et al.: our implementation. MediaPlayer™: code used for video frame-rate upsampling in Microsoft MediaPlayer. Zitnick et al.: authors' implementation. Slide credit: S. Seitz, R. Szeliski

38 Motion estimation Experimental results Slide credit: S. Seitz, R. Szeliski

39 Conclusions Difficulty: the data is substantially more challenging than Yosemite. Diversity: substantial variation in difficulty across the various datasets. Motion ground truth vs. interpolation: the best algorithms for one are not the best for the other. Comparison with stereo: the performance of existing flow algorithms appears weak. Slide credit: S. Seitz, R. Szeliski

40 Motion representations How can we describe this scene? Slide credit: S. Seitz, R. Szeliski

41 Block-based motion prediction Break the image up into square blocks. Estimate a translation for each block. Use this to predict the next frame and code the difference (MPEG-2). Slide credit: S. Seitz, R. Szeliski
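Exhaustive block matching can be sketched as follows (a toy NumPy version: the block size, search radius, and images are invented, the motion vector points from the current block back to its source in the previous frame, and the residual coding that a real codec performs is omitted):

```python
import numpy as np

def block_motion(prev, cur, bs=8, search=4):
    """For each bs x bs block of `cur`, find the integer offset (dy, dx) within
    +/- search whose block in `prev` minimizes the SSD."""
    H, W = cur.shape
    motion = {}
    for by in range(0, H - bs + 1, bs):
        for bx in range(0, W - bs + 1, bs):
            block = cur[by:by + bs, bx:bx + bs]
            best, best_err = (0, 0), np.inf
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = by + dy, bx + dx
                    if 0 <= y <= H - bs and 0 <= x <= W - bs:
                        err = np.sum((block - prev[y:y + bs, x:x + bs]) ** 2)
                        if err < best_err:
                            best, best_err = (dy, dx), err
            motion[(by, bx)] = best
    return motion

prev = np.zeros((16, 16)); prev[6:10, 6:10] = 1.0
cur = np.zeros((16, 16)); cur[8:12, 9:13] = 1.0   # the square moved down 2, right 3
print(block_motion(prev, cur)[(8, 8)])             # (-2, -3): source lies up-left in prev
```

Real encoders speed this up with hierarchical or diamond search instead of the exhaustive scan shown here.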

42 Layered motion Break the image sequence up into "layers", then describe each layer's motion. Slide credit: S. Seitz, R. Szeliski

43 Layered motion Advantages: can represent occlusions/disocclusions; each layer's motion can be smooth; provides a video segmentation for semantic processing. Difficulties: how do we determine the correct number of layers? how do we assign pixels? how do we model the motion? Slide credit: S. Seitz, R. Szeliski

44 Motion estimation Layers for video summarization Slide credit: S. Seitz, R. Szeliski

45 Background modeling (MPEG-4) Convert masked images into a background sprite for layered video coding. Slide credit: S. Seitz, R. Szeliski

46 What are layers? [Wang & Adelson, 1994] Each layer consists of intensities, alphas, and velocities. Slide credit: S. Seitz, R. Szeliski

47 How do we form them? Slide credit: S. Seitz, R. Szeliski

48 How do we estimate the layers? 1. Compute coarse-to-fine flow. 2. Estimate affine motion in blocks (regression). 3. Cluster with k-means. 4. Assign pixels to the best-fitting affine region. 5. Re-estimate affine motions in each region… Slide credit: S. Seitz, R. Szeliski
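Steps 2-4 can be sketched on a toy flow field (NumPy; the per-block mean translation stands in for the full affine regression, a hand-rolled 2-means replaces a general k-means, and the flow values are invented):

```python
import numpy as np

# Toy flow field: the left half of the image moves (1, 0), the right half (0, 1).
flow = np.zeros((16, 16, 2))
flow[:, :8] = (1.0, 0.0)
flow[:, 8:] = (0.0, 1.0)

# Step 2: estimate motion per 4x4 block (mean flow stands in for affine regression)
params = np.array([flow[y:y + 4, x:x + 4].reshape(-1, 2).mean(axis=0)
                   for y in range(0, 16, 4) for x in range(0, 16, 4)])

# Step 3: 2-means on the block motion parameters (crude init: first and last block)
centers = params[[0, -1]]
for _ in range(10):
    d = np.linalg.norm(params[:, None] - centers[None], axis=2)
    label = d.argmin(axis=1)
    centers = np.array([params[label == k].mean(axis=0) for k in range(2)])

# Step 4: assign each pixel to the layer whose motion best explains its flow
d = np.linalg.norm(flow[..., None, :] - centers[None, None], axis=3)
layers = d.argmin(axis=2)
print(centers)
```

Step 5 would then re-fit an affine motion inside each recovered region and iterate.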

49 Layer synthesis For each layer: stabilize the sequence with the affine motion, then compute the median value at each pixel. Determine occlusion relationships. Slide credit: S. Seitz, R. Szeliski
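The per-pixel temporal median over the stabilized sequence is what suppresses pixels belonging to other (occluding) layers; a toy NumPy sketch with invented frame values:

```python
import numpy as np

# Stabilized background layer (constant 10.0) with a foreground object sweeping
# across it; the object covers any given pixel in at most 2 of the 5 frames.
frames = []
for t in range(5):
    f = np.full((8, 8), 10.0)
    f[2:4, t:t + 2] = 99.0
    frames.append(f)
background = np.median(np.stack(frames), axis=0)
print(background[2, 2])  # 10.0: the sweeping foreground never wins the median
```

The median needs each pixel to show the layer's own content in a majority of frames, which the stabilization step is meant to guarantee.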

50 Results

51 Fitting

52 Fitting We've learned how to detect edges, corners, and blobs. Now what? We would like to form a higher-level, more compact representation of the features in the image by grouping multiple features according to a simple model. (Image: 9300 Harris Corners Pkwy, Charlotte, NC.) Slide credit: S. Lazebnik

53 Fitting Choose a parametric model to represent a set of features: a simple model such as a line or a circle, or a complicated model such as a car. Source: K. Grauman

54 Fitting: Issues Noise in the measured feature locations. Extraneous data: clutter (outliers), multiple lines. Missing data: occlusions. Case study: line detection. Slide credit: S. Lazebnik

55 Fitting: Overview If we know which points belong to the line, how do we find the “optimal” line parameters? o Least squares What if there are outliers? o Robust fitting, RANSAC What if there are many lines? o Voting methods: RANSAC, Hough transform What if we’re not even sure it’s a line? o Model selection Slide credit: S. Lazebnik

56 Least squares line fitting Data: (x1, y1), …, (xn, yn). Line equation: yi = m xi + b. Find (m, b) to minimize the sum of squared vertical residuals; the normal equations give the least-squares solution to XB = Y. Slide credit: S. Lazebnik
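The normal equations are solved directly by np.linalg.lstsq; a minimal sketch with invented noisy data:

```python
import numpy as np

# Fit y = m*x + b: stack X = [x 1] and solve X^T X [m b]^T = X^T y,
# which np.linalg.lstsq does in a numerically stable way.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0 + np.array([0.05, -0.03, 0.02, -0.04, 0.01])  # noisy line
X = np.column_stack([x, np.ones_like(x)])
(m, b), *_ = np.linalg.lstsq(X, y, rcond=None)
print(round(m, 2), round(b, 2))
```

The recovered slope and intercept land close to the true (2, 1) despite the noise.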

57 Problem with "vertical" least squares Not rotation-invariant; fails completely for vertical lines. Slide credit: S. Lazebnik
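The standard fix, total least squares, minimizes perpendicular rather than vertical distances: the line normal (a, b) for a x + b y = d is the eigenvector of the centered data's scatter matrix with the smallest eigenvalue, so a perfectly vertical line poses no problem. A sketch (NumPy, with invented points on the line x = 3):

```python
import numpy as np

pts = np.array([[3.0, 0.0], [3.0, 1.0], [3.0, 2.0], [3.0, 3.0]])  # vertical line x = 3
centered = pts - pts.mean(axis=0)
w, V = np.linalg.eigh(centered.T @ centered)  # eigenvalues in ascending order
normal = V[:, 0]                              # eigenvector of the smallest eigenvalue
d = normal @ pts.mean(axis=0)                 # line: a*x + b*y = d
print(normal, d)
```

Up to sign, the recovered normal is (1, 0) with d = 3, i.e. exactly x = 3, where the "vertical" formulation y = m x + b has no finite solution at all.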

