1 Feature Tracking and Optical Flow
11/07/11. Computer Vision, CS 143, Brown. James Hays. Many slides adapted from Derek Hoiem, Lana Lazebnik, Silvio Savarese, who in turn adapted slides from Steve Seitz, Rick Szeliski, Martial Hebert, Mark Pollefeys, and others.
2 Recap: Stereo
Input: two views. Output: dense depth measurements. Outline: if the views are not rectified, estimate the epipolar geometry; then use a stereo correspondence algorithm to match patches across images while respecting the epipolar constraints. Depth is a function of disparity: z = (baseline × f) / (x − x′).
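The depth-from-disparity relation can be sanity-checked numerically. A minimal sketch, with made-up baseline, focal-length, and disparity values (the function name is mine):

```python
def depth_from_disparity(baseline, focal_length, disparity):
    """Depth from the slide's relation: z = (baseline * f) / (x - x'),
    where the disparity x - x' is measured in pixels."""
    return (baseline * focal_length) / disparity

# Assumed values: 0.1 m baseline, 500 px focal length, 10 px disparity.
z = depth_from_disparity(0.1, 500.0, 10.0)   # 5.0 m
```

Note the inverse relation: halving the disparity doubles the estimated depth, which is why depth resolution degrades for distant points.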
3 Recap: Epipoles
Point x in the left image corresponds to epipolar line l′ in the right image. The epipolar line passes through the epipole (the intersection of the cameras' baseline with the image plane).
4 Recap: Fundamental Matrix
The fundamental matrix maps a point in one image to a line in the other. If x and x′ correspond to the same 3D point X, then x′ᵀ F x = 0. F can be estimated from 8 (x, x′) matches by simply setting up a system of linear equations.
5 8-point algorithm
Solve a system of homogeneous linear equations: write down the system of equations.
6 8-point algorithm
Solve a system of homogeneous linear equations: write down the system of equations, then solve for f from Af = 0.
8 8-point algorithm
Solve a system of homogeneous linear equations: write down the system of equations, solve for f from Af = 0, then enforce the det(F) = 0 constraint using SVD.
Matlab:
[U, S, V] = svd(F);
S(3,3) = 0;
F = U*S*V';
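The same rank-2 projection can be sketched in NumPy; `F_est` below is an arbitrary stand-in for the full-rank matrix that a linear 8-point solve typically returns:

```python
import numpy as np

def enforce_rank2(F):
    """Project an estimated fundamental matrix onto the det(F) = 0
    (rank-2) manifold by zeroing its smallest singular value,
    mirroring the Matlab on the slide."""
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0                      # zero the smallest singular value
    return U @ np.diag(S) @ Vt

# Arbitrary full-rank 3x3 matrix standing in for an 8-point estimate.
F_est = np.array([[2.0, 0.1, 0.3],
                  [0.2, 1.5, 0.4],
                  [0.1, 0.2, 1.0]])
F = enforce_rank2(F_est)            # now rank 2, det(F) ~ 0
```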
9 Recap: Structure from Motion
Input: an arbitrary number of uncalibrated views. Output: camera parameters and sparse 3D points.
10 Photosynth
Noah Snavely, Steven M. Seitz, Richard Szeliski, "Photo tourism: Exploring photo collections in 3D," SIGGRAPH 2006.
11 Can we do SfM and stereo?
Yes, numerous systems do. One example of the state of the art:
12 This class: recovering motion
Feature tracking: extract visual features (corners, textured areas) and "track" them over multiple frames. Optical flow: recover image motion at each pixel from spatio-temporal image brightness variations. Two problems, one registration method: B. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Proceedings of the International Joint Conference on Artificial Intelligence, pp. 674–679, 1981.
13 Feature tracking
Many problems, such as structure from motion, require matching points. If motion is small, tracking is an easy way to get them.
14 Feature tracking: challenges
Figure out which features can be tracked, and efficiently track them across frames. Some points may change appearance over time (e.g., due to rotation, moving into shadows, etc.). Drift: small errors can accumulate as the appearance model is updated. Points may appear or disappear: we need to be able to add/delete tracked points.
15 Feature tracking
Given two subsequent frames I(x, y, t) and I(x, y, t+1), estimate the point translation. Key assumptions of the Lucas-Kanade tracker: brightness constancy (the projection of the same point looks the same in every frame), small motion (points do not move very far), spatial coherence (points move like their neighbors).
16 The brightness constancy constraint
Brightness constancy equation: I(x, y, t) = I(x + u, y + v, t + 1). Take the Taylor expansion of I(x + u, y + v, t + 1) at (x, y, t) to linearize the right side: I(x + u, y + v, t + 1) ≈ I(x, y, t) + I_x·u + I_y·v + I_t, where I_x and I_y are the image derivatives along x and y, and I_t is the difference over frames. Hence, I_x·u + I_y·v + I_t ≈ 0, i.e., ∇I·[u v]ᵀ + I_t = 0. The brightness constancy equation is equivalent to minimizing the squared error of I(t) and I(t+1). (A Taylor series represents a function as an infinite sum of terms calculated from the values of the function's derivatives at a single point.)
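The linearization can be checked numerically: for a smooth pattern under small sub-pixel motion, the residual I_x·u + I_y·v + I_t should be close to zero. A sketch with an arbitrary sinusoidal pattern and a motion of (0.4, −0.3) pixels:

```python
import numpy as np

# Frame at time t and the same pattern shifted by (u, v) at time t+1.
y, x = np.mgrid[0:64, 0:64].astype(float)
u_true, v_true = 0.4, -0.3                       # small sub-pixel motion
I0 = np.sin(0.2 * x) + np.cos(0.15 * y)          # I(x, y, t)
I1 = np.sin(0.2 * (x - u_true)) + np.cos(0.15 * (y - v_true))  # I(x, y, t+1)

# Spatial derivatives of frame t and the temporal difference.
Ix = np.gradient(I0, axis=1)
Iy = np.gradient(I0, axis=0)
It = I1 - I0

# Linearized brightness constancy: Ix*u + Iy*v + It should be ~0,
# while It alone is not (the motion really did change the image).
residual = Ix * u_true + Iy * v_true + It
```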
17 How does this make sense?
What do the static image gradients have to do with motion estimation?
18 The brightness constancy constraint
Can we use this equation to recover image motion (u, v) at each pixel? How many equations and unknowns per pixel? One equation (this is a scalar equation!), two unknowns (u, v). The component of the motion perpendicular to the gradient (i.e., parallel to the edge) cannot be measured: if (u, v) satisfies the equation, so does (u + u′, v + v′) whenever ∇I·[u′ v′]ᵀ = 0.
23 Solving the ambiguity…
B. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Proceedings of the International Joint Conference on Artificial Intelligence, pp. 674–679, 1981. How to get more equations for a pixel? Spatial coherence constraint: assume the pixel's neighbors have the same (u, v). If we use a 5x5 window, that gives us 25 equations per pixel.
25 Matching patches across images
Overconstrained linear system A d = b, with one row [I_x I_y] and one entry −I_t per pixel. The least squares solution for d = [u v]ᵀ is given by the normal equations (AᵀA) d = Aᵀb. The summations in AᵀA and Aᵀb are over all pixels in the K x K window.
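A minimal NumPy sketch of this least-squares solve for a single window (the function name, test pattern, and window position are my own choices; one linear solve, no iteration, no pyramid):

```python
import numpy as np

def lucas_kanade_window(I0, I1, cx, cy, k=5):
    """Solve the overconstrained system from the slide for one k x k
    window: k*k equations [Ix Iy] d = -It, two unknowns d = (u, v),
    via least squares (equivalently, the normal equations)."""
    r = k // 2
    win = (slice(cy - r, cy + r + 1), slice(cx - r, cx + r + 1))
    Ix = np.gradient(I0, axis=1)[win].ravel()
    Iy = np.gradient(I0, axis=0)[win].ravel()
    It = (I1 - I0)[win].ravel()
    A = np.stack([Ix, Iy], axis=1)      # k*k rows, 2 columns
    d, *_ = np.linalg.lstsq(A, -It, rcond=None)
    return d                            # estimated (u, v)

# Synthetic check: a textured pattern translated by (0.4, -0.3).
y, x = np.mgrid[0:64, 0:64].astype(float)
I0 = np.sin(0.5 * x) + np.cos(0.6 * y)
I1 = np.sin(0.5 * (x - 0.4)) + np.cos(0.6 * (y + 0.3))
u, v = lucas_kanade_window(I0, I1, 32, 32)
```

With k = 5 this is exactly the 25-equations, 2-unknowns system from the previous slide.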
26 Conditions for solvability
The optimal (u, v) satisfies the Lucas-Kanade equation. When is this solvable? I.e., what are good points to track? AᵀA should be invertible. AᵀA should not be too small due to noise: eigenvalues λ1 and λ2 of AᵀA should not be too small. AᵀA should be well-conditioned: λ1/λ2 should not be too large (λ1 = larger eigenvalue). Does this remind you of anything? The criteria for the Harris corner detector.
27 M = AᵀA is the second moment matrix! (Harris corner detector…)
Eigenvectors and eigenvalues of AᵀA relate to edge direction and magnitude. The eigenvector associated with the larger eigenvalue points in the direction of fastest intensity change; the other eigenvector is orthogonal to it. Suppose M = vvᵀ. What are the eigenvectors and eigenvalues? Start by considering Mv.
28 Low-texture region
Gradients have small magnitude: small λ1, small λ2.
29 Edge
Gradients are very large or very small: large λ1, small λ2.
30 High-texture region
Gradients are different and have large magnitudes: large λ1, large λ2.
33 Dealing with larger movements: iterative refinement
1. Initialize (x′, y′) = (x, y) at the original position
2. Compute (u, v) by solving the Lucas-Kanade equations (using the second moment matrix of the feature patch in the first image)
3. Shift the window by (u, v): x′ = x′ + u; y′ = y′ + v
4. Recalculate I_t = I(x′, y′, t+1) − I(x, y, t)
5. Repeat steps 2-4 until the change is small
Use interpolation for subpixel values.
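The refinement loop above can be sketched directly from those steps; `bilinear` supplies the subpixel interpolation, and all names, the test pattern, and iteration count are my own choices:

```python
import numpy as np

def bilinear(I, ys, xs):
    """Sample image I at fractional coordinates by bilinear interpolation
    (coordinates clamped to the image border)."""
    y0 = np.clip(np.floor(ys).astype(int), 0, I.shape[0] - 2)
    x0 = np.clip(np.floor(xs).astype(int), 0, I.shape[1] - 2)
    dy, dx = ys - y0, xs - x0
    return ((1 - dy) * (1 - dx) * I[y0, x0] + (1 - dy) * dx * I[y0, x0 + 1]
            + dy * (1 - dx) * I[y0 + 1, x0] + dy * dx * I[y0 + 1, x0 + 1])

def track_point(I0, I1, cx, cy, k=9, iters=10):
    """Iterative refinement from the slide: repeatedly solve for (u, v),
    shift the window, and recompute I_t with subpixel interpolation.
    Gradients come from the feature patch in the first image."""
    r = k // 2
    yy, xx = np.mgrid[cy - r:cy + r + 1, cx - r:cx + r + 1].astype(float)
    Ix = np.gradient(I0, axis=1)[cy - r:cy + r + 1, cx - r:cx + r + 1].ravel()
    Iy = np.gradient(I0, axis=0)[cy - r:cy + r + 1, cx - r:cx + r + 1].ravel()
    A = np.stack([Ix, Iy], axis=1)
    T = I0[cy - r:cy + r + 1, cx - r:cx + r + 1].ravel()   # template patch
    u = v = 0.0
    for _ in range(iters):
        It = bilinear(I1, yy + v, xx + u).ravel() - T      # recomputed I_t
        du, dv = np.linalg.lstsq(A, -It, rcond=None)[0]
        u, v = u + du, v + dv                              # shift window
    return u, v

# Synthetic check: textured pattern shifted by more than one pixel.
y, x = np.mgrid[0:64, 0:64].astype(float)
I0 = np.sin(0.5 * x) + np.cos(0.6 * y)
I1 = np.sin(0.5 * (x - 1.3)) + np.cos(0.6 * (y + 0.8))
u, v = track_point(I0, I1, 32, 32)
```

On this smooth synthetic pattern, a few iterations recover the (1.3, −0.8) shift even though a single Lucas-Kanade step would undershoot it.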
34 Dealing with larger movements: coarse-to-fine registration
Build Gaussian pyramids of image 1 (t) and image 2 (t+1). Run iterative L-K at the coarsest level, upsample the result, and run iterative L-K again at the next finer level, repeating down to full resolution (image J registered to image I).
35 Shi-Tomasi feature tracker
Find good features using the eigenvalues of the second-moment matrix (e.g., the Harris detector, or a threshold on the smallest eigenvalue). Key idea: "good" features to track are the ones whose motion can be estimated reliably. Track from frame to frame with Lucas-Kanade; this amounts to assuming a translation model for frame-to-frame feature movement. Check the consistency of tracks by affine registration to the first observed instance of the feature: an affine model is more accurate for larger displacements, and comparing to the first frame helps to minimize drift. J. Shi and C. Tomasi. Good Features to Track. CVPR 1994.
36 Tracking example
J. Shi and C. Tomasi. Good Features to Track. CVPR 1994.
37 Summary of KLT tracking
Find a good point to track (Harris corner). Use the intensity second moment matrix and the difference across frames to find the displacement. Iterate, and use coarse-to-fine search to deal with larger movements. When creating long tracks, check the appearance of the registered patch against the appearance of the initial patch to find points that have drifted.
38 Implementation issues
Window size: a small window is more sensitive to noise and may miss larger motions (without a pyramid); a large window is more likely to cross an occlusion boundary (and it's slower). 15x15 to 31x31 seems typical. Weighting the window: it is common to apply weights so that the center matters more (e.g., with a Gaussian).
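A sketch of the suggested center-weighting; such weights could multiply each row of the linear system (equivalently, a weighted least squares). The function name and choice of sigma are mine:

```python
import numpy as np

def gaussian_window(k, sigma):
    """k x k normalized weights that emphasize the window center,
    as suggested on the slide; sigma is a free design choice."""
    r = np.arange(k) - k // 2
    g = np.exp(-r**2 / (2 * sigma**2))
    w = np.outer(g, g)               # separable 2D Gaussian
    return w / w.sum()

W = gaussian_window(15, 3.0)         # a typical 15x15 window
```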
39 Optical flow
A vector field function of the spatio-temporal image brightness variations. Picture courtesy of Selim Temizer, Learning and Intelligent Systems (LIS) Group, MIT.
40 Motion and perceptual organization
Sometimes, motion is the only cue.
41 Motion and perceptual organization
Even "impoverished" motion data can evoke a strong percept. G. Johansson, "Visual Perception of Biological Motion and a Model for Its Analysis," Perception and Psychophysics 14, 1973.
43 Uses of motion
Estimating 3D structure. Segmenting objects based on motion cues. Learning and tracking dynamical models. Recognizing events and activities. Improving video quality (motion stabilization).
44 Motion field
The motion field is the projection of the 3D scene motion into the image. What would the motion field of a non-rotating ball moving towards the camera look like?
45 Optical flow
Definition: optical flow is the apparent motion of brightness patterns in the image. Ideally, optical flow would be the same as the motion field. Have to be careful: apparent motion can be caused by lighting changes without any actual motion. Think of a uniform rotating sphere under fixed lighting vs. a stationary sphere under moving illumination.
46 Lucas-Kanade optical flow
Same as Lucas-Kanade feature tracking, but for each pixel. As we saw, it works better for textured pixels. Operations can be done one frame at a time, rather than pixel by pixel: efficient.
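A direct (and deliberately slow) sketch of dense Lucas-Kanade: the per-window least-squares solve is simply repeated at every interior pixel. Names, window size, the conditioning threshold, and the test pattern are my own choices:

```python
import numpy as np

def dense_lk(I0, I1, k=7):
    """Dense Lucas-Kanade sketch: solve the 2x2 normal equations
    independently at every interior pixel (no pyramid, no weighting).
    Pixels whose second moment matrix is near-singular are skipped."""
    Ix = np.gradient(I0, axis=1)
    Iy = np.gradient(I0, axis=0)
    It = I1 - I0
    r = k // 2
    H, W = I0.shape
    flow = np.zeros((H, W, 2))
    for yc in range(r, H - r):
        for xc in range(r, W - r):
            win = (slice(yc - r, yc + r + 1), slice(xc - r, xc + r + 1))
            A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)
            M = A.T @ A
            if np.linalg.eigvalsh(M)[0] > 1e-6:   # only trackable pixels
                flow[yc, xc] = np.linalg.solve(M, A.T @ (-It[win].ravel()))
    return flow

# Synthetic check: textured pattern translated by (0.4, -0.3).
y, x = np.mgrid[0:48, 0:48].astype(float)
I0 = np.sin(0.5 * x) + np.cos(0.6 * y)
I1 = np.sin(0.5 * (x - 0.4)) + np.cos(0.6 * (y + 0.3))
flow = dense_lk(I0, I1)
```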
48 Iterative refinement
Iterative Lucas-Kanade algorithm: (1) estimate the displacement at each pixel by solving the Lucas-Kanade equations; (2) warp I(t) towards I(t+1) using the estimated flow field (basically, just interpolation); (3) repeat until convergence. *From Khurram Hassan-Shafique, CAP5415 Computer Vision, 2003.
49 Coarse-to-fine optical flow estimation
Build Gaussian pyramids of image 1 (t) and image 2 (t+1). Run iterative L-K at the coarsest level, then warp & upsample to initialize the next finer level, and run iterative L-K again, repeating down to full resolution (image J registered to image I).
50 Coarse-to-fine optical flow estimation
With Gaussian pyramids of image 1 and image 2, a motion of u = 10 pixels at full resolution becomes u = 5, u = 2.5, and u = 1.25 pixels at successively coarser levels.
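The halving of apparent motion across levels is just repeated downsampling by 2. A minimal sketch, where 2x2 averaging stands in for the blur-and-subsample step of a proper Gaussian pyramid:

```python
import numpy as np

def downsample(I):
    """Halve resolution by 2x2 averaging (a crude stand-in for
    Gaussian blur followed by subsampling)."""
    return 0.25 * (I[0::2, 0::2] + I[1::2, 0::2]
                   + I[0::2, 1::2] + I[1::2, 1::2])

# A 10-pixel shift at full resolution appears as 5, 2.5, and 1.25 pixels
# at coarser levels, which is what makes coarse levels tractable for L-K.
shifts = [10.0 / (2 ** level) for level in range(4)]
small = downsample(np.ones((8, 8)))   # 8x8 -> 4x4
```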
51 Example
*From Khurram Hassan-Shafique, CAP5415 Computer Vision, 2003
52 Multi-resolution registration
*From Khurram Hassan-Shafique, CAP5415 Computer Vision, 2003
53 Optical Flow Results
*From Khurram Hassan-Shafique, CAP5415 Computer Vision, 2003
54 Optical Flow Results
*From Khurram Hassan-Shafique, CAP5415 Computer Vision, 2003
55 Errors in Lucas-Kanade
The motion is large. Possible fix: keypoint matching.
A point does not move like its neighbors. Possible fix: region-based matching.
Brightness constancy does not hold. Possible fix: gradient constancy.
Iterative refinement: (1) estimate the velocity at each pixel using one iteration of Lucas-Kanade estimation; (2) warp one image toward the other using the estimated flow field; (3) refine the estimate by repeating the process.
56 State-of-the-art optical flow
Start with something similar to Lucas-Kanade, + gradient constancy, + energy minimization with a smoothing term, + region matching, + keypoint matching (long-range). Region-based + pixel-based + keypoint-based. Large displacement optical flow, Brox et al., CVPR 2009.
57 Stereo vs. optical flow
Similar dense matching procedures. Why don't we typically use epipolar constraints for optical flow? B. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Proceedings of the International Joint Conference on Artificial Intelligence, pp. 674–679, 1981.
58 Summary
Major contributions from Lucas, Tomasi, Kanade: tracking feature points; optical flow. Key ideas: by assuming brightness constancy, a truncated Taylor expansion leads to simple and fast patch matching across frames; coarse-to-fine registration.