3D Computer Vision and Video Computing 3D Vision Lecture 16 Visual Motion (I) CSC Capstone Fall 2004 Zhigang Zhu, NAC 8/203A

1 3D Computer Vision and Video Computing
3D Vision, Lecture 16: Visual Motion (I)
CSC Capstone, Fall 2004
Zhigang Zhu, NAC 8/203A
http://www-cs.engr.ccny.cuny.edu/~zhu/Capstone2004/Capstone_Sequence2004.html
Cover image/video credits: Rick Szeliski, MSR

2 Outline of Motion
- Problems and Applications (Lecture Motion I)
  - The importance of visual motion
  - Problem Statement
- The Motion Field of Rigid Motion (Lecture Motion I)
  - Basics – Notations and Equations
  - Three Important Special Cases: Translation, Rotation and Moving Plane
  - Motion Parallax
- Optical Flow (Lecture Motion II)
  - Optical flow equation and the aperture problem
  - Estimating optical flow
  - 3D motion & structure from optical flow
- Feature-based Approach (Lecture Motion II)
  - Two-frame algorithm
  - Multi-frame algorithm
  - Structure from motion – Factorization method
- Advanced Topics (Lecture Motion II; and beyond)
  - Spatio-Temporal Image and Epipolar Plane Image
  - Video Mosaicing and Panorama Generation
  - Motion-based Segmentation and Layered Representation

3 The Importance of Visual Motion
- Structure from Motion
  - Apparent motion is a strong visual cue for 3D reconstruction
    - More than a multi-camera stereo system
- Recognition by motion (only)
  - Biological visual systems use visual motion to infer properties of the 3D world with little a priori knowledge of it
    - Blurred image sequence
- Visual Motion = Video! [Go to CVPR sites for Workshops]
  - Video Coding and Compression: MPEG 1, 2, 4, 7…
  - Video Mosaicing and Layered Representation for IBR
  - Surveillance (Human Tracking and Traffic Monitoring)
  - HCI using Human Gesture (video camera)
  - Automated Production of Video Instruction Program (VIP)
  - Video Texture for Image-based Rendering
  - …

4 Human Tracking
W4: Visual Surveillance of Human Activity
From: Prof. Larry Davis, University of Maryland
http://www.umiacs.umd.edu/users/lsd/vsam.html
Tracking moving subjects from video of a stationary camera…

5 Blurred Sequence
An up-sampling from images of resolution 15x20 pixels
From: James W. Davis, MIT Media Lab
http://vismod.www.media.mit.edu/~jdavis/MotionTemplates/motiontemplates.html
Recognition by Actions: recognize an object from its motion even if we cannot distinguish it in any single image…

6 Video Mosaicing
Stereo Mosaics from a single video sequence
From: Z. Zhu, E. M. Riseman, A. R. Hanson, Parallel-perspective stereo mosaics, The Eighth IEEE International Conference on Computer Vision, Vancouver, Canada, July 2001, vol I, 345-352.
http://www-cs.engr.ccny.cuny.edu/~zhu/StereoMosaic.html
Video of a moving camera = multi-frame stereo with multiple cameras…

7 Video in Classroom/Auditorium
- Demo: Bellcore Autoauditorium
  - A Fully Automatic, Multi-Camera System that Produces Videos Without a Crew
  - http://www.autoauditorium.com/
An application in e-learning: analyzing the motion of people as well as controlling the motion of the camera…

8 Vision Based Interaction
Microsoft Research: Vision based Interface, by Matthew Turk (Demo)
Motion and Gesture as Advanced Human-Computer Interaction (HCI)…

9 Video Texture
Video Textures are derived from video by using the finite duration input clip to generate a smoothly playing infinite video.
From: Arno Schödl, Richard Szeliski, David H. Salesin, and Irfan Essa. Video textures. Proceedings of SIGGRAPH 2000, pages 489-498, July 2000.
http://www.gvu.gatech.edu/perception/projects/videotexture/
Image (video)-based rendering: realistic synthesis without “vision”…

10 Problem Statement
- Two Subproblems
  - Correspondence: Which elements of a frame correspond to which elements in the next frame?
  - Reconstruction: Given a number of correspondences, and possibly knowledge of the camera’s intrinsic parameters, how do we recover the 3-D motion and structure of the observed world?
- Main Difference between Motion and Stereo
  - Correspondence: the disparities between consecutive frames are much smaller due to dense temporal sampling
  - Reconstruction: the visual motion could be caused by multiple motions (instead of a single 3D rigid transformation)
- The Third Subproblem, and Fourth…
  - Motion Segmentation: which regions of the image plane correspond to different moving objects?
  - Motion Understanding: lip reading, gesture, expression, event…

11 Approaches
- Two Subproblems
  - Correspondence:
    - Differential Methods -> dense measure (optical flow)
    - Matching Methods -> sparse measure
  - Reconstruction: more difficult than stereo, since
    - Structure as well as motion (the 3D transformation between frames) needs to be recovered
    - Small baseline causes large errors
- The Third Subproblem
  - Motion Segmentation: a chicken-and-egg problem
    - Which should be solved first, matching or segmentation?
    - Segmentation for matching elements
    - Matching for segmentation

12 The Motion Field of Rigid Objects
- Motion:
  - 3D Motion (R, T):
    - camera motion (static scene)
    - or single object motion
    - Only one rigid, relative motion between the camera and the scene (object)
  - Image motion field:
    - 2D vector field of velocities of the image points induced by the relative motion
- Data: Image sequence
  - Many frames
    - captured at time t = 0, 1, 2, …
  - Basics: only consider two consecutive frames
    - We consider a reference frame and its consecutive frame
  - Image motion field
    - can be viewed as a disparity map of the two frames captured at two consecutive camera locations (assuming we have a moving camera)
[Figure: Motion Field of a Video Sequence (Translation)]

13 The Motion Field of Rigid Objects
- Notations
  - P = (X, Y, Z)^T: 3-D point in the camera reference frame
  - p = (x, y, f)^T: the projection of the scene point in the pinhole camera
- Relative motion between P and the camera
  - T = (Tx, Ty, Tz)^T: translation component of the motion
  - ω = (ωx, ωy, ωz)^T: the angular velocity
- Note:
  - How to connect this with stereo geometry (with R, T)?
  - Image velocity v = ?
[Figure: pinhole camera with optical center O, axes X, Y, Z and focal length f; scene point P with velocity V, image point p with velocity v]

15 Basic Equations of Motion Field
- Notes:
  - Take the time derivative of both sides of the projection equation
  - The motion field is the sum of two components
    - Translational part: depends on depth Z
    - Rotational part: carries no depth information
  - Assume known intrinsic parameters
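
One way to write out the derivation that the notes describe is sketched below. It is a reconstruction under an assumed sign convention, not a copy of the lecture's own equations: T and ω are taken to be the motion of the scene point relative to the camera, which makes the signs agree with the FOE/FOC statements in Special Case 1.

```latex
% Assumption: the velocity of P in the camera frame is
\mathbf{V} = \dot{\mathbf{P}} = \mathbf{T} + \boldsymbol{\omega} \times \mathbf{P}

% Projection equation and its time derivative
\mathbf{p} = f\,\frac{\mathbf{P}}{Z}
\qquad\Longrightarrow\qquad
\mathbf{v} = \dot{\mathbf{p}} = f\,\frac{Z\,\mathbf{V} - V_z\,\mathbf{P}}{Z^{2}}

% Components: translational part (scaled by 1/Z) + rotational part (no Z)
\begin{aligned}
v_x &= \frac{f\,T_x - x\,T_z}{Z}
      + f\,\omega_y - \omega_z\,y - \frac{\omega_x\,x\,y}{f} + \frac{\omega_y\,x^{2}}{f} \\
v_y &= \frac{f\,T_y - y\,T_z}{Z}
      - f\,\omega_x + \omega_z\,x + \frac{\omega_y\,x\,y}{f} - \frac{\omega_x\,y^{2}}{f}
\end{aligned}
```

Only the first fraction in each component contains Z, which is why depth can be recovered from the translational part alone. If T and ω are instead defined as the camera's own motion, so that V = -T - ω × P, every term above changes sign.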

16 Motion Field vs. Disparity
- Correspondence and Point Displacements

Stereo                   | Motion
Disparity                | Motion field
Displacement – (dx, dy)  | Differential concept – velocity (vx, vy), i.e. time derivative (dx/dt, dy/dt)
No such constraint       | Consecutive frames must be close to guarantee a good discrete approximation
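
The last row of the comparison can be checked numerically. The sketch below is illustrative only: it assumes the point's velocity in the camera frame is V = T + ω × P (a sign convention chosen here, not stated in the lecture), and the function names are hypothetical. It compares the analytic image velocity with the finite displacement between two frames separated by a time step dt.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def project(P, f):
    """Pinhole projection of P = (X, Y, Z) to image coordinates (x, y)."""
    X, Y, Z = P
    return (f * X / Z, f * Y / Z)


def motion_field(P, T, w, f):
    """Instantaneous image velocity (vx, vy) of the scene point P.

    Assumed convention: V = T + w x P is the velocity of P in the camera frame.
    """
    Z = P[2]
    x, y = project(P, f)
    wx, wy, wz = w
    vx = (f * T[0] - x * T[2]) / Z + f * wy - wz * y - wx * x * y / f + wy * x * x / f
    vy = (f * T[1] - y * T[2]) / Z - f * wx + wz * x + wy * x * y / f - wx * y * y / f
    return (vx, vy)


def finite_displacement(P, T, w, f, dt):
    """Image displacement (dx, dy) after moving P for time dt at constant V."""
    V = tuple(t + c for t, c in zip(T, cross(w, P)))
    P2 = tuple(p + dt * v for p, v in zip(P, V))
    x1, y1 = project(P, f)
    x2, y2 = project(P2, f)
    return (x2 - x1, y2 - y1)
```

Dividing the displacement by dt approaches the motion field as dt shrinks; with a large dt the two differ noticeably, which is the reason consecutive frames need to be close.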

17 Special Case 1: Pure Translation
- Pure Translation (ω = 0)
- Radial Motion Field (Tz ≠ 0)
  - Vanishing point p0 = (x0, y0)^T:
    - motion direction
  - FOE (focus of expansion)
    - Vectors away from p0 if Tz < 0
  - FOC (focus of contraction)
    - Vectors towards p0 if Tz > 0
  - Depth estimation
    - depth inversely proportional to magnitude of motion vector v, and also proportional to distance from p to p0
- Parallel Motion Field (Tz = 0)
  - Depth estimation:
    - depth inversely proportional to magnitude of motion vector v
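
The depth relations above can be illustrated with a small sketch. It assumes T is the translation of the scene relative to the camera, so that the field at image point p with depth Z is v = (f·Tx - x·Tz, f·Ty - y·Tz) / Z; under that assumption the vectors point away from p0 when Tz < 0, as stated in the list. The function names are hypothetical.

```python
import math


def foe(T, f):
    """Vanishing point p0 = (f*Tx/Tz, f*Ty/Tz) of the translation direction (Tz != 0)."""
    return (f * T[0] / T[2], f * T[1] / T[2])


def translational_flow(p, Z, T, f):
    """Motion field at image point p = (x, y) with depth Z under pure translation.

    Assumed convention: T is the velocity of the scene relative to the camera.
    """
    x, y = p
    return ((f * T[0] - x * T[2]) / Z, (f * T[1] - y * T[2]) / Z)


def depth_from_flow(p, v, T, f):
    """Recover depth Z from a single motion vector when T and f are known."""
    speed = math.hypot(v[0], v[1])
    if T[2] == 0:
        # Parallel field: |v| = f * |(Tx, Ty)| / Z
        return f * math.hypot(T[0], T[1]) / speed
    # Radial field: |v| = |Tz| * |p - p0| / Z
    x0, y0 = foe(T, f)
    return abs(T[2]) * math.hypot(p[0] - x0, p[1] - y0) / speed
```

For example, with T = (0.2, 0.1, -1.0) and f = 1 the vanishing point is p0 = (-0.2, -0.1); a point imaged at p = (0.3, 0.4) with depth 5 moves with v = (0.1, 0.1), directly away from p0, and `depth_from_flow` returns the depth 5 from that vector.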

18 Special Case 2: Pure Rotation
- Pure Rotation (T = 0)
  - Does not carry 3D information
- Motion Field (approximation)
  - Small motion
  - A quadratic polynomial in image coordinates (x, y, f)^T
- Image Transformation between two frames (accurate)
  - Motion can be large
  - Homography (3x3 matrix) for all points
- Image mosaicing from a rotating camera
  - 360 degree panorama
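
A minimal sketch of why pure rotation carries no 3D information: if scene points are rotated by R in the camera frame (an assumed convention; the helper names are hypothetical), the new image position depends only on the old image position, R and f, never on depth. In the calibrated coordinates p = (x, y, f)^T used here, the homography is R itself.

```python
import math


def rot_y(theta):
    """Rotation matrix for an angle theta (radians) about the Y axis."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, 0.0, s), (0.0, 1.0, 0.0), (-s, 0.0, c))


def matvec(M, v):
    """Product of a 3x3 matrix (tuple of rows) with a 3-vector."""
    return tuple(sum(M[i][j] * v[j] for j in range(3)) for i in range(3))


def project(P, f):
    """Pinhole projection of P = (X, Y, Z) to image coordinates (x, y)."""
    return (f * P[0] / P[2], f * P[1] / P[2])


def warp_by_rotation(p, R, f):
    """Map image point p = (x, y) to its new position using only R and f."""
    q = matvec(R, (p[0], p[1], f))
    return (f * q[0] / q[2], f * q[1] / q[2])
```

Two scene points on the same viewing ray, for instance (0.2, -0.1, 2) and (2, -1, 20), land on the same image point after a rotation of 0.5 rad, and that point equals `warp_by_rotation((0.1, -0.05), R, 1.0)`. The mapping is exact for large rotations, which is what makes panorama mosaicing from a rotating camera possible.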

19 Special Case 3: Moving Plane
- Planes are common in the man-made world
- Motion Field (approximation)
  - Given small motion
  - A quadratic polynomial in image coordinates
- Image Transformation between two frames (accurate)
  - Any amount of motion (arbitrary)
  - Homography (3x3 matrix) for all points
    - Only has 8 independent parameters (write it out!)
  - See Topic 5 Camera Models
- Image Mosaicing for a planar scene
  - Aerial image sequence
  - Video of blackboard
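
Writing the homography out, as the slide suggests, might look as follows. This is a sketch under stated assumptions rather than the lecture's own derivation: the plane is described by a unit normal n and a distance d ≠ 0 from the optical center, the finite motion between the two frames is P' = R P + T, and the entries a_ij are simply the elements of the resulting matrix A.

```latex
% Every point on the plane satisfies n^T P = d, so T can be rewritten as T (n^T P) / d
\mathbf{n}^{\top}\mathbf{P} = d
\quad\Longrightarrow\quad
\mathbf{P}' = R\,\mathbf{P} + \mathbf{T}
            = \Bigl(R + \tfrac{1}{d}\,\mathbf{T}\,\mathbf{n}^{\top}\Bigr)\mathbf{P}
            = A\,\mathbf{P}

% With p = (x, y, f)^T = f P / Z, the image points are related by
\mathbf{p}' = f\,\frac{A\,\mathbf{p}}{(A\,\mathbf{p})_{3}}

% Component form
x' = f\,\frac{a_{11}x + a_{12}y + a_{13}f}{a_{31}x + a_{32}y + a_{33}f},
\qquad
y' = f\,\frac{a_{21}x + a_{22}y + a_{23}f}{a_{31}x + a_{32}y + a_{33}f}
```

Multiplying all nine entries of A by the same nonzero constant leaves x' and y' unchanged, so only 9 - 1 = 8 parameters are independent.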

20 Special Cases: A Summary
- Pure Translation
  - Vanishing point and FOE (focus of expansion)
  - Only translation contributes to depth estimation
- Pure Rotation
  - Does not carry 3D information
  - Motion field: a quadratic polynomial in image coordinates, or
  - Transform: Homography (3x3 matrix R) for all points
  - Image mosaicing from a rotating camera
- Moving Plane
  - Motion field is a quadratic polynomial in image coordinates, or
  - Transform: Homography (3x3 matrix A) for all points
  - Image mosaicing for a planar scene

21 Motion Parallax
- [Observation 1] The relative motion field of two instantaneously coincident points
  - Does not depend on the rotational component of motion
  - Points towards (away from) the vanishing point of the translation direction
- [Observation 2] The motion field of two frames after rotation compensation
  - only includes the translation component
  - points towards (away from) the vanishing point p0 (the instantaneous epipole)
  - the length of each motion vector is inversely proportional to the depth, and also proportional to the distance from point p to the vanishing point p0 of the translation direction
  - Question: how to remove rotation?
    - Active vision: rotation known approximately?
[Figure: Motion Field of a Video Sequence (Translation)]
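
Observation 1 can be illustrated with a short sketch. It assumes the motion field convention V = T + ω × P for the velocity of a scene point in the camera frame (an assumption made here, with hypothetical function names) and takes two points that project to the same image location p at different depths Z1 and Z2.

```python
def motion_field(p, Z, T, w, f):
    """Image velocity at image point p = (x, y) for a scene point at depth Z.

    Assumed convention: V = T + w x P is the velocity of the point in the camera frame.
    """
    x, y = p
    wx, wy, wz = w
    # Rotational part: depends on p, w and f only, never on Z.
    rot_x = f * wy - wz * y - wx * x * y / f + wy * x * x / f
    rot_y = -f * wx + wz * x + wy * x * y / f - wx * y * y / f
    # Translational part: scaled by inverse depth.
    return ((f * T[0] - x * T[2]) / Z + rot_x,
            (f * T[1] - y * T[2]) / Z + rot_y)


def relative_flow(p, Z1, Z2, T, w, f):
    """Difference of the motion vectors of two points that coincide in the image."""
    v1 = motion_field(p, Z1, T, w, f)
    v2 = motion_field(p, Z2, T, w, f)
    return (v1[0] - v2[0], v1[1] - v2[1])
```

Because the rotational terms are identical for both points, they cancel in the difference, leaving a vector proportional to (1/Z1 - 1/Z2) that lies along the line through p and the vanishing point p0 = (f·Tx/Tz, f·Ty/Tz). Calling `relative_flow` with different values of `w` therefore returns the same result up to rounding.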

22 Summary
- Importance of visual motion (apparent motion)
  - Many applications…
  - Problems:
    - correspondence, reconstruction, segmentation, understanding in x-y-t space
- Image motion field of rigid objects
  - Time derivative of both sides of the projection equation
- Three important special cases
  - Pure translation – FOE
  - Pure rotation – no 3D information, but leads to mosaicing
  - Moving plane – homography with arbitrary motion
- Motion parallax
  - Only depends on translational component of motion

23 Next
Visual Motion (II)
- Optical Flow, and
- Estimating and Using the Motion Fields

