Motion ECE 847: Digital Image Processing Stan Birchfield


1 Motion ECE 847: Digital Image Processing Stan Birchfield
Clemson University

2 Outline Motion basics Lucas-Kanade algorithm Horn-Schunck algorithm

3 What if you only had one eye?
Depth perception is possible by moving the eye. S. Birchfield, Clemson Univ., ECE 847,

4 Head movements for depth perception
Some animals move their heads to generate parallax (e.g., the pigeon and the praying mantis). S. Birchfield, Clemson Univ., ECE 847,

5 Parallax Motion parallax – the apparent displacement of an object viewed along two different lines of sight S. Birchfield, Clemson Univ., ECE 847,

6 Difference between stereo and 2-frame motion
Stereo: images taken at the same time; the scene is guaranteed to be static; the epipolar constraint restricts the search to 1D. 2-frame motion: images taken at different times; objects in the scene may have moved independently; the epipolar constraint is no longer guaranteed. Of course, motion involves a video sequence of images. S. Birchfield, Clemson Univ., ECE 847,

7 Motion field and optical flow
Motion field – The actual 3D motion projected onto image plane Optical flow – The “apparent” motion of the brightness pattern in an image (sometimes called optic flow) S. Birchfield, Clemson Univ., ECE 847,

8 When are the two different?
Barber pole illusion Television / movie (no motion field) Rotating ping-pong ball (no optical flow) S. Birchfield, Clemson Univ., ECE 847,

9 Optical flow breakdown
Perhaps due to the aperture problem, discussed later. * From Marc Pollefeys COMP

10 Another illusion from G. Bradski, CS223B

11 Motion field Point in world: Projection onto image:
Point in the world: x; its projection onto the image: p. The motion of the point in the world (dx/dt) is composed of camera translation and rotation; the corresponding image motion is dp/dt. (Equations shown as figures on the slide.) S. Birchfield, Clemson Univ., ECE 847,

12 Motion field (cont.) Image velocity of projection: Expanding yields:
The image velocity of the projection, dp/dt, expands into a translational and a rotational component. Note that the rotational component gives no depth information; only the translational component depends on depth. (Equations shown as figures on the slide.) S. Birchfield, Clemson Univ., ECE 847,

13 Motion field (cont.) Two special cases:
Two special cases (equations shown as figures on the slide). In the first, the motion field is a radial field emanating from the instantaneous epipole; in the second, the instantaneous epipole is at infinity. S. Birchfield, Clemson Univ., ECE 847,

14 Optical Flow Assumptions: Brightness Constancy
* Slide from Michael Black, CS

15 Optical Flow Assumptions:
* Slide from Michael Black, CS

16 Optical Flow Assumptions:
* Slide from Michael Black, CS

17 Optical flow constraint equation
Brightness constancy assumption: the brightness of a pixel does not change as it moves with image velocity (u, v), which is to be estimated. This leads to the optical flow constraint equation, relating the unknown image velocity to the known image gradient and temporal derivative. S. Birchfield, Clemson Univ., ECE 847,

18 Recall: Taylor series A function f(x) can be approximated near x=x0 by
Linear approximation: f(x) ≈ f(x0) + (x − x0) f′(x0). If f is a function of 2 variables: f(x, y) ≈ f(x0, y0) + (x − x0) fx + (y − y0) fy. If f is a function of 3 variables, a third partial-derivative term is added in the same way. S. Birchfield, Clemson Univ., ECE 847,

19 Deriving the optical flow constraint equation
Apply the brightness constancy assumption to the image at time t and the image at time t + Δt, take a Taylor series expansion, and put the two together. S. Birchfield, Clemson Univ., ECE 847,

20 Deriving the optical flow constraint equation
From the previous slide: divide both sides by Δt and take the limit to obtain the optical flow constraint equation; more compactly, ∇I · u + It = 0. S. Birchfield, Clemson Univ., ECE 847,
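As a compact reconstruction of the derivation on the two slides above (the equations themselves appeared only as figures in the original deck), in LaTeX notation:

I(x + u\,\Delta t,\; y + v\,\Delta t,\; t + \Delta t) = I(x, y, t)                                      % brightness constancy
I(x, y, t) + I_x u\,\Delta t + I_y v\,\Delta t + I_t\,\Delta t + \text{h.o.t.} = I(x, y, t)             % Taylor expansion
I_x u + I_y v + I_t = 0 \quad\Longleftrightarrow\quad \nabla I \cdot \mathbf{u} + I_t = 0               % divide by \Delta t, let \Delta t \to 0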

21 Aperture problem
From the previous slide, the constraint is one equation in two unknowns (an underconstrained problem). Key idea: any function looks linear through a small `aperture'. The figure shows the isophotes of I(x,y,t) and I(x,y,t+Δt), the true motion, the gradient direction, and another possible answer. We can only compute the component of motion in the direction of the gradient. S. Birchfield, Clemson Univ., ECE 847,
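One common way to write that gradient-direction component (the normal flow); this formula is a standard consequence of the constraint equation, not text from the original slide:

u_n = \frac{\mathbf{u} \cdot \nabla I}{\|\nabla I\|} = \frac{-I_t}{\|\nabla I\|}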

22 Aperture Problem Exposed
Motion along just an edge is ambiguous from G. Bradski, CS223B

23 Two approaches to overcome aperture problem
Horn-Schunck (1980): assume neighboring pixels are similar; add a regularization term to enforce smoothness; compute (u,v) for every pixel in the image → dense optical flow. Lucas-Kanade (1981): assume neighboring pixels are the same; use the additional equations to solve for the motion of a pixel; compute (u,v) for a small number of pixels (features), each feature treated independently of the others → sparse optical flow. S. Birchfield, Clemson Univ., ECE 847,

24 Outline Motion basics Lucas-Kanade algorithm Horn-Schunck algorithm

25 Lucas-Kanade Recall that the scalar equation was derived by linearizing the function about the current estimate (dropping higher-order terms). Reinterpret: the total displacement that we want is u; the incremental displacement that we can compute is uΔ. This is one iteration of Newton's method, so we iterate and accumulate, starting from an initial estimate (which could be zero). S. Birchfield, Clemson Univ., ECE 847,

26 Recall: Newton’s method
Goal: find a root (zero) of the function f(x) using the first-order approximation (tangent line). Start with an initial guess x0 and iterate: use the derivative to find the root under the assumption that the function is linear (the figure shows f(x0) and f′(x0) = Δy/Δx), and use the result x1 = x0 + Δx as the estimate for the next iteration. S. Birchfield, Clemson Univ., ECE 847,

27 Recall: Newton’s method
Same procedure, second step: from x1, use f(x1) and f′(x1) to compute Δx and take x2 = x1 + Δx as the next estimate. S. Birchfield, Clemson Univ., ECE 847,

28 Recall: Newton’s method
Goal: find an extremum (min or max) of the function f(x). Key idea: an extremum occurs where f′(x) = 0, so replace f with f′ in the previous procedure. Solution: iterate. Note: now we need the 2nd derivative. S. Birchfield, Clemson Univ., ECE 847,

29 Recall: Gauss-Newton method
Goal: find an extremum (min or max) of a function of the form f(x) = (e(x))². Key idea: when f(x) has this special form, the derivatives of f are easy to compute: f′(x) = 2 e(x) e′(x) and f″(x) = 2 (e′(x))² + 2 e(x) e″(x). Ignoring the 2nd-derivative term yields f″(x) ≈ 2 (e′(x))². Solution: iterate. Note: now we only need the 1st derivative. S. Birchfield, Clemson Univ., ECE 847,
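The three update rules referred to on slides 26-29 (shown only as figures in the original), written out:

x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}       % Newton's method for a root of f
x_{k+1} = x_k - \frac{f'(x_k)}{f''(x_k)}     % Newton's method for an extremum of f
x_{k+1} = x_k - \frac{e(x_k)}{e'(x_k)}       % Gauss-Newton for f = e^2, using f'' \approx 2(e')^2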

30 1D Lucas-Kanade tracking
The figure shows the two signals I(x) and J(x), the spatial derivative Ix, the temporal derivative It, and the incremental displacement uΔ. S. Birchfield, Clemson Univ., ECE 847,

31 1D Lucas-Kanade tracking
First iteration: given the two signals I(x) and J(x) and the position x. S. Birchfield, Clemson Univ., ECE 847,

32 1D Lucas-Kanade tracking
First iteration: compute the temporal derivative It(1) = J(x) − I(x). S. Birchfield, Clemson Univ., ECE 847,

33 1D Lucas-Kanade tracking
First iteration: compute the spatial derivative Ix = dI/dx. S. Birchfield, Clemson Univ., ECE 847,

34 1D Lucas-Kanade tracking
First iteration: compute the incremental displacement uΔ(1) from Ix and It(1), and add it to the total displacement. S. Birchfield, Clemson Univ., ECE 847,

35 1D Lucas-Kanade tracking
First iteration: shift I(x) to the right by uΔ, giving I(x − u) … S. Birchfield, Clemson Univ., ECE 847,

36 1D Lucas-Kanade tracking
… or, equivalently, shift J(x) to the left by uΔ, giving J(x + u). S. Birchfield, Clemson Univ., ECE 847,

37 1D Lucas-Kanade tracking
Second iteration: repeat the same steps (the spatial derivative can be reused): compute a new temporal derivative It(2), a new incremental displacement uΔ(2), and update the total displacement. S. Birchfield, Clemson Univ., ECE 847,

38 1D Lucas-Kanade tracking
… until convergence: the total displacement u aligns J(x + u) with I(x). S. Birchfield, Clemson Univ., ECE 847,
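The 1D iteration of slides 30-38 as a short sketch. This is illustrative only, not the course code: the function name, tolerance, and NumPy-based linear interpolation are my own choices, and boundary handling is omitted.

import numpy as np

def lucas_kanade_1d(I, J, x, max_iters=20, tol=1e-3):
    """Estimate the displacement u such that J(x + u) approximately equals I(x)."""
    u = 0.0                                   # initial estimate (could be zero)
    Ix = 0.5 * (I[x + 1] - I[x - 1])          # spatial derivative (reused every iteration)
    for _ in range(max_iters):
        xf = x + u                            # sample J at the non-integer position x + u
        x0 = int(np.floor(xf))
        a = xf - x0
        Jx = (1 - a) * J[x0] + a * J[x0 + 1]  # linear interpolation (no boundary checks)
        It = Jx - I[x]                        # temporal derivative at the current estimate
        if abs(Ix) < 1e-12:
            break                             # no texture: the 1D aperture problem
        du = -It / Ix                         # incremental displacement u_delta
        u += du                               # accumulate the total displacement
        if abs(du) < tol:
            break
    return u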

39 2D Lucas-Kanade in a nutshell
Instead of solving for a scalar uΔ, we now solve for the vector uΔ = (uΔ, vΔ). Given two consecutive images I and J from a sequence: take the pixel-wise difference between them → It; compute the gradient of one image → Ix and Iy; sum over a window to get Z (a 2x2 matrix) and e (a 2x1 vector); solve Z uΔ = e for uΔ; update u = uΔ(1) + …; shift the image; repeat. What are Z and e? Why do we need to sum over a window? S. Birchfield, Clemson Univ., ECE 847,

40 Lucas-Kanade derivation
The scalar equation with two unknowns is an underconstrained system. Introduce a constraint: assume neighboring pixels have the same motion, and stack the constraint equation over a window around the pixel (n = number of pixels in the window). The resulting system can be solved directly using least squares, or (better yet) … S. Birchfield, Clemson Univ., ECE 847,

41 Lucas-Kanade derivation
Multiply both sides by A^T, giving Z uΔ = e, where Z = A^T A (a 2x2 matrix) and e = A^T b (a 2x1 vector), with the sums taken over the region of neighboring pixels. S. Birchfield, Clemson Univ., ECE 847,
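Written out (a standard reconstruction of the quantities on the slide, using the sign convention that b stacks −It over the window W):

Z = A^{\top}A = \begin{bmatrix} \sum_W I_x^2 & \sum_W I_x I_y \\ \sum_W I_x I_y & \sum_W I_y^2 \end{bmatrix},
\qquad
e = A^{\top}b = -\begin{bmatrix} \sum_W I_x I_t \\ \sum_W I_y I_t \end{bmatrix},
\qquad
Z\,\mathbf{u}_{\Delta} = e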

42 RGB version of Lucas-Kanade
For color images, we can get more equations by using all color channels E.g., for 7x7 window we have 49*3=147 equations! But color channels are highly correlated, so rarely helps much S. Birchfield, Clemson Univ., ECE 847,

43 Lucas-Kanade pseudocode
The pseudocode operates on a region R, typically specified by a pixel x = (x, y) and the width of the square window. (The gradient computation is OK to move out of the loop.) A sketch follows below. S. Birchfield, Clemson Univ., ECE 847,
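Since the pseudocode itself did not survive in this transcript, here is an illustrative sketch of the per-feature loop in Python. The function names, window width, iteration limits, and condition-number test are my own choices; gradients are central differences and sub-pixel values are sampled with the bilinear formula from the interpolation slides below.

import numpy as np

def bilinear(img, xf, yf):
    """Sample img at float coordinates (xf, yf) by bilinear interpolation (no boundary checks)."""
    x0 = np.floor(xf).astype(int)
    y0 = np.floor(yf).astype(int)
    a, b = xf - x0, yf - y0
    return ((1 - a) * (1 - b) * img[y0, x0] + a * (1 - b) * img[y0, x0 + 1]
            + (1 - a) * b * img[y0 + 1, x0] + a * b * img[y0 + 1, x0 + 1])

def track_feature(I, J, x, y, win=7, max_iters=20, tol=0.01, u0=0.0, v0=0.0):
    """Track the feature at (x, y) from image I to image J; return the new position or None."""
    r = win // 2
    ys, xs = np.mgrid[-r:r + 1, -r:r + 1]                  # window offsets
    # spatial gradients over the window of I (computed once, reused every iteration)
    Ix = 0.5 * (bilinear(I, x + xs + 1, y + ys) - bilinear(I, x + xs - 1, y + ys))
    Iy = 0.5 * (bilinear(I, x + xs, y + ys + 1) - bilinear(I, x + xs, y + ys - 1))
    Z = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    if np.linalg.cond(Z) > 1e6:
        return None                                        # ill-conditioned: not a good feature
    Iw = bilinear(I, x + xs, y + ys)                       # template window in I
    u, v = u0, v0                                          # total displacement (initial estimate)
    for _ in range(max_iters):
        It = bilinear(J, x + u + xs, y + v + ys) - Iw      # temporal difference at current estimate
        e = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
        du, dv = np.linalg.solve(Z, e)                     # solve the 2x2 system Z u_delta = e
        u, v = u + du, v + dv
        if du * du + dv * dv < tol * tol:
            break
    return x + u, y + v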

44 (solving 2x2 equation is easy: invert Z by hand)
Can x and y be floats? S. Birchfield, Clemson Univ., ECE 847,

45 Lucas-Kanade details
Need to: (1) loop over frames and features, (2) interpolate, (3) declare features lost, (4) detect good features, (5) smooth the image to handle large motions. S. Birchfield, Clemson Univ., ECE 847,

46 1. Loop over frames and features
We have only described the procedure for a single pixel (using its surrounding window) and two consecutive image frames. We need to wrap this in two for loops: over the frames in the sequence, and over the features (sparse pixels) in the image. S. Birchfield, Clemson Univ., ECE 847,

47 Loop over frames and features
In the code, features is a 2D array of floats indexed over features and frames; a colon selects an entire row (like Matlab). Conceptually, a region is constructed from the pixel and the window width; in reality this does nothing, and both parameters are just passed to LucasKanade. S. Birchfield, Clemson Univ., ECE 847,

48 2. Interpolate After the very first iteration, x and y can be floats; the same holds for u and v. We could simply round to the nearest integer before computing, but the results are more accurate if we interpolate. S. Birchfield, Clemson Univ., ECE 847,

49 Linear interpolation (a 1D function sampled at x0−1, x0, x0+1, x0+2)
f(x) ≈ f(x0) + (x − x0)(f(x0+1) − f(x0)) = f(x0) + a (f(x0+1) − f(x0)) = (1 − a) f(x0) + a f(x0+1), where a = x − x0. S. Birchfield, Clemson Univ., ECE 847,

50 Bilinear interpolation
The 2D function is sampled at the four nearest pixels f00, f10, f01, f11, with fractional offsets a (in x) and b (in y); note that these are the pixel squares. Interpolating in y first: f(x0, y) ≈ (1−b) f00 + b f01 and f(x0+1, y) ≈ (1−b) f10 + b f11, so f(x, y) ≈ (1−a)[(1−b) f00 + b f01] + a[(1−b) f10 + b f11] = (1−a)(1−b) f00 + (1−a) b f01 + a(1−b) f10 + a b f11. S. Birchfield, Clemson Univ., ECE 847,

51 Bilinear interpolation
Interpolating in x first gives the same result: f(x, y0) ≈ (1−a) f00 + a f10 and f(x, y0+1) ≈ (1−a) f01 + a f11, so f(x, y) ≈ (1−b)[(1−a) f00 + a f10] + b[(1−a) f01 + a f11] = (1−b)(1−a) f00 + (1−b) a f10 + b(1−a) f01 + b a f11. S. Birchfield, Clemson Univ., ECE 847,

52 Bilinear interpolation
Simply compute a doubly weighted average of the four nearest pixels. Be careful to handle the boundaries correctly (details omitted here). S. Birchfield, Clemson Univ., ECE 847,
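A direct transcription of the bilinear formula from the previous slides, with boundary handling done by clamping (one reasonable choice among several; the slide leaves the details open):

import numpy as np

def interp_bilinear(img, x, y):
    """Doubly weighted average of the four nearest pixels of a 2D array img at float (x, y)."""
    h, w = img.shape
    x0 = int(np.clip(np.floor(x), 0, w - 2))     # clamp so x0+1 and y0+1 stay inside the image
    y0 = int(np.clip(np.floor(y), 0, h - 2))
    a = min(max(x - x0, 0.0), 1.0)               # fractional offset in x
    b = min(max(y - y0, 0.0), 1.0)               # fractional offset in y
    f00, f10 = img[y0, x0], img[y0, x0 + 1]
    f01, f11 = img[y0 + 1, x0], img[y0 + 1, x0 + 1]
    return (1 - a) * (1 - b) * f00 + a * (1 - b) * f10 + (1 - a) * b * f01 + a * b * f11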

53 3. Declare features lost Problem: If feature changes appearance too much, it has probably drifted onto another (occluding) surface in the scene Solution: Check the residue by computing SSD of feature window in both I and J If residue is too high, then declare feature lost, and do not continue to track S. Birchfield, Clemson Univ., ECE 847,

54 4. Detect good features Lucas-Kanade could be used to compute motion for every pixel in the image. But the motion of nearby pixels is similar, and pixel motion cannot be computed in areas of little (or no) texture. Solution: first detect good features (pixel windows), then track only those features. S. Birchfield, Clemson Univ., ECE 847,

55 What is a good feature? To solve this equation, Z = A^T A must be invertible. In the real world, essentially all matrices are (numerically) invertible, so invertibility alone is not a useful test; instead, we must ensure that Z = A^T A is well-conditioned (not close to singular). S. Birchfield, Clemson Univ., ECE 847,

56 Eigenvalues of Z
Z is the gradient covariance matrix (the same matrix as in the Harris operator; also called the Hessian). Recall from PCA: the (square roots of the) eigenvalues give the lengths of the best-fitting ellipse; large eigenvalues mean large gradient vectors; a small ratio means information in all directions. We want both eigenvalues λ1 and λ2 to be large, and the ratio λ1/λ2 not too large. Note: λ1 ≥ λ2. S. Birchfield, Clemson Univ., ECE 847,

57 Finding good features
Three cases: (1) λ1 and λ2 small: not enough texture for tracking. (2) λ1 large, λ2 small: on an intensity edge; aperture problem, only the motion perpendicular to the edge can be found. (3) λ1 and λ2 large: a good feature to track (“corner”). M. Pollefeys, Note: Even though tracking is a two-frame problem, we can determine good features from only one frame. S. Birchfield, Clemson Univ., ECE 847,

58 Three “cornerness” measures
Threshold the minimum eigenvalue (we will use this) (Tomasi-Kanade 1991). Or use the trace and determinant (Harris and Stephens 1988), known as the Harris corner detector or Plessey operator. Or treat the eigenvalues like resistors in parallel: det(Z) / trace(Z) = 1 / (1/λ1 + 1/λ2). All 3 perform about the same. S. Birchfield, Clemson Univ., ECE 847,
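The three measures computed from the entries of Z for a single window; a minimal sketch following the slide's definitions (k = 0.04 as recommended two slides below; the function and variable names are mine):

import numpy as np

def cornerness(Z, k=0.04):
    """Return (min eigenvalue, Harris response, det/trace) for a 2x2 gradient covariance matrix Z."""
    a, b, c = Z[0, 0], Z[0, 1], Z[1, 1]
    tr, det = a + c, a * c - b * b
    lam_min = 0.5 * (tr - np.sqrt(max(tr * tr - 4.0 * det, 0.0)))   # Tomasi-Kanade: threshold this
    harris = det - k * tr * tr                                      # Harris-Stephens / Plessey
    parallel = det / tr if tr > 0 else 0.0                          # resistors in parallel: 1/(1/l1 + 1/l2)
    return lam_min, harris, parallel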

59 Tomasi-Kanade cornerness
In practice, the eigenvalues cannot be arbitrarily large, because image values lie in [0, 255]. Therefore, we can ignore the maximum eigenvalue and simply threshold the minimum eigenvalue; this is the simplest approach. Recall the closed-form expression for the eigenvalues of the 2x2 matrix Z. S. Birchfield, Clemson Univ., ECE 847,

60 Harris corner detector
It is easy to show that det(Z) = λ1 λ2 and trace(Z) = λ1 + λ2. Use the measure det(Z) − k·trace²(Z). Advantage: no need to actually compute the eigenvalues (no square root). The second term reduces the effect of having a small eigenvalue together with a large dominant eigenvalue. Recommended: k = 0.04. S. Birchfield, Clemson Univ., ECE 847,

61 Resistors in parallel The alternative is to treat the eigenvalues like resistors in parallel; the combined resistance is bounded by the individual resistances. S. Birchfield, Clemson Univ., ECE 847,

62 Comparison of “cornerness”
Conclusion: They are all very similar S. Birchfield, Clemson Univ., ECE 847,

63 Finding good features To find good features, scan image
evaluate the “cornerness” measure at each pixel, threshold, and perform non-maximal suppression (a 3x3 window is all you need, Tomasi-Kanade); one can also enforce a minimum distance between features. A sketch follows below. S. Birchfield, Clemson Univ., ECE 847,
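An illustrative scan-threshold-suppress loop using the minimum-eigenvalue measure; the box window, SciPy filters, and threshold value are implementation choices of mine, not the course code:

import numpy as np
from scipy.ndimage import uniform_filter, maximum_filter

def good_features(I, win=7, thresh=1.0):
    """Return (row, col) coordinates of good features using the minimum-eigenvalue measure."""
    Iy, Ix = np.gradient(I.astype(float))                 # image gradients (rows, then cols)
    # entries of Z averaged over the window (averaging only rescales the threshold)
    Sxx = uniform_filter(Ix * Ix, win)
    Syy = uniform_filter(Iy * Iy, win)
    Sxy = uniform_filter(Ix * Iy, win)
    tr = Sxx + Syy
    det = Sxx * Syy - Sxy * Sxy
    lam_min = 0.5 * (tr - np.sqrt(np.maximum(tr * tr - 4.0 * det, 0.0)))   # cornerness map
    # keep pixels above threshold that are local maxima in a 3x3 neighborhood
    is_max = lam_min == maximum_filter(lam_min, size=3)
    rows, cols = np.nonzero((lam_min > thresh) & is_max)
    return list(zip(rows, cols))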

64 Flashback: Moravec interest operator
Moravec (1970s) proposed to measure “cornerness” by the SSD of a window after shifting it right, left, down, and up … but this is not rotationally symmetric. Expanding using a Taylor series and substituting, our old friend Z appears again. S. Birchfield, Clemson Univ., ECE 847,

65 Example (s=0.1) from Sebastian Thrun, CS223B Computer Vision, Winter 2005

66 Example (s=0.01) from Sebastian Thrun, CS223B Computer Vision, Winter 2005

67 Example (s=0.001) from Sebastian Thrun, CS223B Computer Vision, Winter 2005

68 5. Smooth the image Motion between consecutive image frames can be large. Solution: smooth the image first. Note: smoothing is already part of the gradient computation, so just use a large σ. But a large σ leads to less accuracy. Solution: sequentially decrease σ; a pyramid makes this computationally efficient. S. Birchfield, Clemson Univ., ECE 847,

69 Revisiting the small motion assumption
Is this motion small enough? Probably not—it’s much larger than one pixel (2nd order terms dominate) How might we solve this problem? * From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

70 Reduce the resolution! * From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

71 Coarse-to-fine optical flow estimation
The figure shows Gaussian pyramids of images It-1 and I: a motion of u = 10 pixels at full resolution becomes 5, 2.5, and 1.25 pixels at successively coarser levels. (slides from Bradski and Thrun)

72 Coarse-to-fine optical flow estimation
Run iterative L-K at the coarsest level of the Gaussian pyramids of images I and J, then repeatedly warp & upsample and run iterative L-K again at the next finer level. (slides from Bradski and Thrun)
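A sketch of that coarse-to-fine loop, building on the track_feature sketch given after the pseudocode slide; the pyramid construction (blur-and-halve), the number of levels, and the smoothing σ are my own illustrative choices:

import numpy as np
from scipy.ndimage import gaussian_filter

def build_pyramid(img, levels=4, sigma=1.0):
    """Gaussian pyramid: level 0 is full resolution, each coarser level is blurred and halved."""
    pyr = [img.astype(float)]
    for _ in range(levels - 1):
        pyr.append(gaussian_filter(pyr[-1], sigma)[::2, ::2])
    return pyr

def track_feature_pyramid(I, J, x, y, levels=4):
    """Coarse-to-fine Lucas-Kanade for one feature located at (x, y) in image I."""
    pI, pJ = build_pyramid(I, levels), build_pyramid(J, levels)
    u = v = 0.0                                   # displacement estimate at the current level
    for lvl in range(levels - 1, -1, -1):         # coarsest to finest
        s = 2 ** lvl
        res = track_feature(pI[lvl], pJ[lvl], x / s, y / s, u0=u, v0=v)   # warm-start with (u, v)
        if res is None:
            return None
        u, v = res[0] - x / s, res[1] - y / s     # displacement found at this level
        if lvl > 0:
            u, v = 2 * u, 2 * v                   # scale up for the next finer level
    return x + u, y + v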

73 One final problem It is not always easy to tell whether to discard a feature from frame-to-frame tracking, because distortions may occur gradually over several frames (the frog in the pot). Therefore, we need to compare feature appearance over longer periods of time. The translation assumption is fine between consecutive frames, but long periods of time introduce other types of deformations. Solution: use an affine model of motion. Lucas-Kanade easily generalizes to other motion models, but requires a different derivation. S. Birchfield, Clemson Univ., ECE 847,

74 Alternative derivation
Compute the translation assuming it is small, and differentiate. Affine motion is also possible, but a bit harder (a 6x6 system instead of 2x2). M. Pollefeys,

75 Example Simple displacement is sufficient between consecutive frames, but not to compare to reference template M. Pollefeys,

76 Example M. Pollefeys,

77 Synthetic example M. Pollefeys,

78 Good features to keep tracking
Perform affine alignment between first and last frame Stop tracking features with too large errors M. Pollefeys,

79 Feature matching vs. tracking
Image-to-image correspondences are key to passive triangulation-based 3D reconstruction. Matching: extract features independently and then match by comparing descriptors. Tracking: extract features in the first image and then try to find the same feature back in the next view. M. Pollefeys,

80 Outline Motion basics Lucas-Kanade algorithm Horn-Schunck algorithm

81 Horn-Schunck Overcome the aperture problem by assuming that neighboring pixels have similar displacement. Minimize a functional consisting of a data term and a smoothness term (a global approach to compute dense optical flow). S. Birchfield, Clemson Univ., ECE 847,
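The functional being minimized appeared only as a figure on the slide; this is the standard Horn-Schunck form, with λ weighting the smoothness term (a later slide notes that λ is sometimes defined differently):

E(u, v) = \iint \underbrace{\left( I_x u + I_y v + I_t \right)^2}_{\text{data}}
  + \lambda \underbrace{\left( \|\nabla u\|^2 + \|\nabla v\|^2 \right)}_{\text{smoothness}} \, dx\, dy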

82 Problem The goal is to minimize the functional to find u and v.
But now u and v are themselves functions of x and y, so in reality we have a function of functions. How do we do this?

83 Calculus of variations
A function maps values to real numbers To find the value that minimizes a function, use calculus A functional maps functions to real numbers To find the function that minimizes a functional, use calculus of variations S. Birchfield, Clemson Univ., ECE 847,

84 “The calculus” The function y = f(x) is stationary where df(x)/dx = 0
S. Birchfield, Clemson Univ., ECE 847,

85 Calculus of variations
The integral of a functional ∫ F(x, y, y′) dx is stationary where the Euler-Lagrange equation d/dx (∂F/∂y′) − ∂F/∂y = 0 holds. Here F is an explicit function (the “Lagrangian”), y is the dependent variable (position, voltage, …), and x is the independent variable (space or time). Classic examples: the “principle of least action” with F = KE − PE, and the brachistochrone problem. S. Birchfield, Clemson Univ., ECE 847,

86 Calculus of variations
General case: n dependent variables and m independent variables. In the Euler-Lagrange equations, multiple independent variables give multiple terms, and multiple dependent variables give multiple equations. S. Birchfield, Clemson Univ., ECE 847,

87 COV applied to pendulum
Newtonian approach: a mass m hangs at angle θ from a rod of length l, under gravity g; the arclength is lθ, the velocity lθ′, and the acceleration lθ″. Use Newton's 2nd law F = ma: −mg sin θ = m l θ″, so g sin θ + l θ″ = 0. Sanity checks: θ = 0 → θ″ = 0; θ = π/2 → lθ″ = −g; θ = −π/2 → lθ″ = g. S. Birchfield, Clemson Univ., ECE 847,

88 COV applied to pendulum
Euler-Lagrange approach: the independent variable is time t and the dependent variable is the angle θ. 1) Formulate the Lagrangian. 2) Take derivatives. 3) Plug into the Euler-Lagrange equations → same result! S. Birchfield, Clemson Univ., ECE 847,
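The three steps written out (a standard derivation consistent with the slide's setup, taking F = KE − PE and measuring the height from the lowest point):

F(\theta, \dot\theta) = \tfrac{1}{2} m l^2 \dot\theta^{\,2} - m g l (1 - \cos\theta)
\frac{\partial F}{\partial \theta} = -m g l \sin\theta, \qquad
\frac{\partial F}{\partial \dot\theta} = m l^2 \dot\theta, \qquad
\frac{d}{dt}\frac{\partial F}{\partial \dot\theta} = m l^2 \ddot\theta
\frac{d}{dt}\frac{\partial F}{\partial \dot\theta} - \frac{\partial F}{\partial \theta}
 = m l^2 \ddot\theta + m g l \sin\theta = 0
 \;\Longrightarrow\; l\,\ddot\theta + g \sin\theta = 0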

89 COV applied to optical flow
Identify the explicit function F (the integrand of the Horn-Schunck functional), the independent variables (x and y), and the dependent variables (u and v); then write the corresponding Euler-Lagrange equations. S. Birchfield, Clemson Univ., ECE 847,

90 COV applied to optical flow
Expand: Take derivatives: S. Birchfield, Clemson Univ., ECE 847,

91 COV applied to optical flow
Plug back in, where the bars denote local means, and rearrange. Approximate the Laplacian by the difference between the local mean and the central value (a difference of Gaussians). What if λ = 0? S. Birchfield, Clemson Univ., ECE 847,
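For reference, the Euler-Lagrange equations of the Horn-Schunck functional and the approximation described on this slide (a standard reconstruction; \bar{u}, \bar{v} are the local means and κ is the constant from the discretization):

I_x (I_x u + I_y v + I_t) = \lambda \nabla^2 u, \qquad
I_y (I_x u + I_y v + I_t) = \lambda \nabla^2 v
\nabla^2 u \approx \kappa (\bar{u} - u), \qquad \nabla^2 v \approx \kappa (\bar{v} - v)
(I_x^2 + \lambda\kappa)\, u + I_x I_y\, v = \lambda\kappa\, \bar{u} - I_x I_t, \qquad
I_x I_y\, u + (I_y^2 + \lambda\kappa)\, v = \lambda\kappa\, \bar{v} - I_y I_t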

92 COV applied to optical flow
Lucas-Kanade solves 2N x 2N system as N 2x2 systems Horn-Schunck solves 2N x 2N system as sparse linear system S. Birchfield, Clemson Univ., ECE 847,

93 COV applied to optical flow
Solve the equation: this is a pair of equations for each pixel. The combined system of equations is 2N x 2N, where N is the number of pixels in the image. Because the matrix is sparse, iterative methods (Jacobi iterations, Gauss-Seidel, successive over-relaxation) are more appropriate. Iterate the resulting per-pixel update. S. Birchfield, Clemson Univ., ECE 847,

94 Stationary iterative methods
The problem is to solve a sparse linear system in which the matrix is split as (diagonal) − (lower) − (upper). Jacobi method: simply solve for each element using the existing estimates. Gauss-Seidel is “sloppy Jacobi”: use updates as soon as they are available. Successive over-relaxation (SOR) speeds convergence by extrapolating the Gauss-Seidel iterate; ω = 1 reduces to Gauss-Seidel. S. Birchfield, Clemson Univ., ECE 847,

95 Horn-Schunck Algorithm
S. Birchfield, Clemson Univ., ECE 847,

96 Horn-Schunck Algorithm
Note that this λ is defined differently. S. Birchfield, Clemson Univ., ECE 847,
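Since the algorithm listing itself appeared only as a figure, here is an illustrative dense Horn-Schunck solver using the Jacobi-style update implied by the preceding slides; the derivative estimates, averaging kernel, iteration count, and the α (λ) convention are my choices:

import numpy as np
from scipy.ndimage import convolve

def horn_schunck(I1, I2, alpha=1.0, n_iters=100):
    """Dense optical flow (u, v) between grayscale float images I1 and I2."""
    I1, I2 = I1.astype(float), I2.astype(float)
    Iy, Ix = np.gradient(0.5 * (I1 + I2))         # simple spatial derivative estimates
    It = I2 - I1                                  # temporal derivative
    # averaging kernel used for the local means (plays the role of the Laplacian approximation)
    avg = np.array([[1, 2, 1], [2, 0, 2], [1, 2, 1]], dtype=float) / 12.0
    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    denom = alpha ** 2 + Ix ** 2 + Iy ** 2
    for _ in range(n_iters):
        u_bar = convolve(u, avg)
        v_bar = convolve(v, avg)
        t = (Ix * u_bar + Iy * v_bar + It) / denom
        u = u_bar - Ix * t                        # per-pixel solution of the 2x2 system
        v = v_bar - Iy * t
    return u, v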

97 Lucas-Kanade meets Horn-Schunck
An IJCV 2005 paper by Bruhn et al. shows that local methods (Lucas-Kanade) and global methods (Horn-Schunck) complement one another. (Figures: flow field overlaid on the image; magnitude of the flow field.) S. Birchfield, Clemson Univ., ECE 847,

98 Spatiotemporal volumes

