CS 376b Introduction to Computer Vision 04 / 01 / 2008 Instructor: Michael Eckmann.


1 CS 376b Introduction to Computer Vision 04 / 01 / 2008 Instructor: Michael Eckmann

2 Michael Eckmann - Skidmore College - CS 376b - Spring 2008
Today's Topics
Comments/Questions
Motion
– matching points between frames to compute a sparse motion field
– use of motion vectors in MPEG compression

3 Motion
The motion field is defined as: a 2D array of 2D vectors representing the motion of 3D scene points. Each vector can be drawn from the image position of a point at time t to its position at time t+dt.
The focus of expansion (FOE) is defined as: the point in the image plane from which motion field vectors diverge. (This is usually the point toward which the camera is moving.)
The focus of contraction (FOC) is defined as: the point in the image plane toward which motion field vectors converge. (This is usually the point from which the camera is moving away.)
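The FOE definition above suggests one way to locate it: every motion vector lies on a line that should pass through the FOE, so the FOE can be estimated as the least-squares intersection of those lines. The following is a minimal sketch under that assumption (a purely diverging or converging field). The function name `estimate_foe` and the sample points are hypothetical and do not come from the slides or the text.

```python
def estimate_foe(points, vectors):
    """Least-squares intersection of the lines through each point along its vector.

    points:  list of (x, y) image positions
    vectors: list of (dx, dy) motion vectors at those positions
    Returns (x, y), or None when all vectors are parallel (e.g. pure panning).
    """
    s_aa = s_ab = s_bb = s_ac = s_bc = 0.0
    for (px, py), (dx, dy) in zip(points, vectors):
        # A point (x, y) on the line through (px, py) with direction (dx, dy)
        # satisfies dy * x - dx * y = dy * px - dx * py.
        a, b = dy, -dx
        c = dy * px - dx * py
        s_aa += a * a
        s_ab += a * b
        s_bb += b * b
        s_ac += a * c
        s_bc += b * c
    det = s_aa * s_bb - s_ab * s_ab
    if abs(det) < 1e-12:
        return None
    # Solve the 2x2 normal equations with Cramer's rule.
    x = (s_ac * s_bb - s_ab * s_bc) / det
    y = (s_aa * s_bc - s_ab * s_ac) / det
    return x, y


# Hypothetical vectors that all point away from (5, 3).
sample_points = [(6, 3), (5, 5), (7, 5)]
sample_vectors = [(0.5, 0.0), (0.0, 1.0), (0.6, 0.6)]
foe = estimate_foe(sample_points, sample_vectors)
```

The same computation finds a FOC, since converging vectors lie on the same lines; telling the two apart requires checking whether the vectors point away from or toward the estimate.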

4 Motion: figure 9.2 from Shapiro and Stockman

5 Motion
To compute motion field vectors we'll need to detect and locate interest points with high accuracy.

6 Motion
To correspond points from one image to the next:
– take the neighborhood of an interesting point and look in some small window of the next frame (assuming motion is limited to some speed/distance) and find the best matching neighborhood.
How might you compare neighborhoods?

7 Motion
To correspond points from one image to the next, take the neighborhood of an interesting point and look in some small window of the next frame (assuming motion is limited to some speed/distance) and find the best matching neighborhood.
– Can use normalized cross-correlation of the neighborhoods.
– The Cauchy-Schwarz inequality (see page 162 in text) states that the normalized dot product of two vectors is <= 1. (Note that the normalized cross-correlation of two neighborhoods is equivalent to the normalized dot product.)
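The matching step described above can be sketched in a few lines. This is an illustration only, assuming grayscale frames stored as lists of rows; the names `ncc`, `patch` and `best_match`, the neighborhood half-width `half` and the search radius `search` are all hypothetical choices, not values from the slides.

```python
import math


def ncc(a, b):
    """Normalized dot product of two equal-sized neighborhoods (lists of rows)."""
    va = [p for row in a for p in row]
    vb = [p for row in b for p in row]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    return dot / norm if norm > 0 else 0.0


def patch(img, r, c, half):
    """Square neighborhood of side 2*half+1 centered at (r, c)."""
    return [row[c - half:c + half + 1] for row in img[r - half:r + half + 1]]


def best_match(img1, img2, r, c, half=1, search=2):
    """Search a window of img2 for the neighborhood best matching img1 at (r, c).

    Returns (score, (dr, dc)) where (dr, dc) is the estimated motion vector.
    """
    ref = patch(img1, r, c, half)
    rows, cols = len(img2), len(img2[0])
    best = (-1.0, (0, 0))
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            rr, cc = r + dr, c + dc
            # Skip candidates whose neighborhood would fall off the image.
            if half <= rr < rows - half and half <= cc < cols - half:
                score = ncc(ref, patch(img2, rr, cc, half))
                if score > best[0]:
                    best = (score, (dr, dc))
    return best


def blank(n):
    return [[0] * n for _ in range(n)]


# Hypothetical frames: a 3x3 pattern moves down 1 row and right 2 columns.
pattern = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
frame1, frame2 = blank(8), blank(8)
for i in range(3):
    for j in range(3):
        frame1[2 + i][2 + j] = pattern[i][j]  # centered at (3, 3)
        frame2[3 + i][4 + j] = pattern[i][j]  # centered at (4, 5)
score, motion = best_match(frame1, frame2, 3, 3)
```

By the Cauchy-Schwarz bound the score never exceeds 1, and it reaches 1 only when the two neighborhoods are proportional, which is why the exact copy of the pattern wins the search here.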

8 Motion
To compute motion field vectors we can detect
– high interest points (have high energy in many directions)
– edges are problematic here (only one direction)
– corners (two directions)
– anything that can be located accurately in a later image
– centroids of moving regions after segmentation could be tracked as well.
Discussion on the board: why corners are more accurately located than edges.

9 Motion
The text describes an interest operator to detect high interest points
– the smallest variance in the vertical, horizontal, and 2 diagonal directions in a neighborhood must be above some threshold
– how accurately can such a point be located?
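A rough sketch of such an operator is given below. It is not a transcription of the algorithm in the text: it assumes the directional "variance" is the variance of the intensities along the line of pixels through the center in each of the four directions, and the names `interest`, `interest_points`, `half` and `threshold` are hypothetical.

```python
def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def interest(img, r, c, half=2):
    """Smallest directional variance around (r, c).

    Directions: horizontal, vertical, and the two diagonals, each sampled
    over 2*half+1 pixels centered on (r, c).
    """
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
    offsets = range(-half, half + 1)
    return min(
        variance([img[r + k * dr][c + k * dc] for k in offsets])
        for dr, dc in directions
    )


def interest_points(img, threshold, half=2):
    """All pixels whose smallest directional variance exceeds the threshold."""
    rows, cols = len(img), len(img[0])
    return [
        (r, c)
        for r in range(half, rows - half)
        for c in range(half, cols - half)
        if interest(img, r, c, half) > threshold
    ]


# Hypothetical 9x9 test images: a vertical edge and a corner.
edge = [[100 if c >= 4 else 0 for c in range(9)] for r in range(9)]
corner = [[100 if r >= 4 and c >= 4 else 0 for c in range(9)] for r in range(9)]
edge_score = interest(edge, 4, 4)
corner_score = interest(corner, 4, 4)
```

On the edge image the variance along the edge direction is zero, so the minimum is zero and the point is rejected; at the corner every direction sees an intensity change, so the minimum stays well above zero.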

10 Motion: from algorithm 9.2 in Shapiro and Stockman

11 Motion
Let's consider some places where this kind of detection and matching scheme might break down.
– What assumptions need to be true to make the scheme work?

12 Motion
Let's consider some places where this kind of detection and matching scheme might break down.
– occlusion
– speed of motion larger than expected (best match is outside of window)
– large changes in neighborhood due to possible viewpoint changes
– rotation changes in scale and small translations are o.k.

13 Motion
MPEG
– to compress video (an image sequence)
– replaces a 16x16 image block with a motion vector describing the motion of that block, so in a later frame a 16x16 block is represented as a vector
– only the vector is stored if the blocks are identical (or very close); if they differ by too much, encode the difference
These 16x16 blocks and motion vectors are computed between, say, frames f_i and f_(i+3).
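The block-to-vector idea above can be illustrated with an exhaustive block search. This is only a sketch: it assumes a sum-of-absolute-differences (SAD) cost and a small search radius, neither of which is specified on the slide, and real encoders are free to search differently. The names `sad` and `block_motion_vector` are hypothetical, and the demo uses 4x4 blocks on 8x8 frames to stay small.

```python
def sad(prev, cur, pr, pc, cr, cc, size):
    """Sum of absolute differences between a block of prev and a block of cur."""
    return sum(
        abs(cur[cr + i][cc + j] - prev[pr + i][pc + j])
        for i in range(size)
        for j in range(size)
    )


def block_motion_vector(prev, cur, r, c, size=16, search=4):
    """Find the displacement into prev that best predicts the block of cur at (r, c).

    Returns ((dr, dc), cost): the block of cur with top-left corner (r, c) is
    best matched by the block of prev with top-left corner (r + dr, c + dc).
    """
    rows, cols = len(prev), len(prev[0])
    best_cost, best_vec = None, (0, 0)
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            pr, pc = r + dr, c + dc
            # Only consider candidate blocks that lie fully inside prev.
            if 0 <= pr <= rows - size and 0 <= pc <= cols - size:
                cost = sad(prev, cur, pr, pc, r, c, size)
                if best_cost is None or cost < best_cost:
                    best_cost, best_vec = cost, (dr, dc)
    return best_vec, best_cost


# Hypothetical frames: a bright 2x2 object moves down 1 row and right 2 columns.
prev_frame = [[0] * 8 for _ in range(8)]
cur_frame = [[0] * 8 for _ in range(8)]
for i in range(2):
    for j in range(2):
        prev_frame[2 + i][2 + j] = 200
        cur_frame[3 + i][4 + j] = 200
vec, cost = block_motion_vector(prev_frame, cur_frame, 2, 4, size=4, search=2)
```

A cost of zero corresponds to the "only the vector is stored" case; a nonzero cost is the situation where the difference between the blocks would also need to be encoded.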

14 Motion: from figure 9.7 Shapiro and Stockman

15 Motion
Suppose we are given a video of something (a TV show, movie, news report, etc.)
– we can use the techniques just discussed to determine the camera motion by looking for a FOC, FOE or panning
– determine when a scene changes (large change in histogram between frames and not some typical motion field due to camera motion)
– can segment the video into scenes this way
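The histogram test for scene changes can be sketched as follows. The bin count, the normalized difference measure and the threshold are illustrative assumptions rather than values from the slides or the text, and the names `histogram`, `histogram_difference` and `scene_cuts` are hypothetical. Frames are assumed to be equal-sized grayscale images with values in 0..255.

```python
def histogram(frame, bins=8, max_value=256):
    """Count pixels per intensity bin for a grayscale frame (list of rows)."""
    hist = [0] * bins
    for row in frame:
        for p in row:
            hist[p * bins // max_value] += 1
    return hist


def histogram_difference(f1, f2, bins=8):
    """Fraction of pixels that change bin between two equal-sized frames (0..1)."""
    h1, h2 = histogram(f1, bins), histogram(f2, bins)
    total = sum(h1)
    return sum(abs(a - b) for a, b in zip(h1, h2)) / (2 * total)


def scene_cuts(frames, threshold=0.5):
    """Indices i where frame i starts a new scene relative to frame i-1."""
    return [
        i
        for i in range(1, len(frames))
        if histogram_difference(frames[i - 1], frames[i]) > threshold
    ]


# Hypothetical sequence: two dark frames followed by two bright frames.
dark_a = [[10] * 4 for _ in range(4)]
dark_b = [[12] * 4 for _ in range(4)]
bright_a = [[240] * 4 for _ in range(4)]
bright_b = [[238] * 4 for _ in range(4)]
cuts = scene_cuts([dark_a, dark_b, bright_a, bright_b])
```

A histogram ignores where pixels are, so ordinary camera or object motion within a scene changes it only slightly, while a cut to a different scene usually changes it a lot. Splitting the video at the returned indices gives the scene segmentation.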

16 Motion: from Shapiro and Stockman figure 9.18 (originally from Zhang et al. (1993) with permission from Springer-Verlag). The top 2 are from the same scene, the bottom one is from a different scene.

17 Motion: from Shapiro and Stockman figure 9.19

18 Programming Assignment
How's it going? You should work on having it done by the end of next week, as I anticipate another assignment shortly after that, to be due the last week of classes.

