Structure-from-Motion Algorithm to Capture 3D Information from a Sequence of Video Images By Rishabh Malhotra Supervisor: Dr. Kunio Takaya TRLabs / University of Saskatchewan New Media EE 990 Seminar Dec 04, 2003

Outline of the Presentation
- Introduction: Problem Definition
- The Structure-from-Motion Algorithm with an illustration
- Conclusion
- Applications

Defining the Problem
What is SfM? The computation of 3D geometry (structure) from a sequence of 2D image frames (motion).
- At least two views are required.
- Two dimensions are already available in each image; the third dimension (depth) must be found.
- The depth information obtained yields a 3D model of the visible part of the object only, hence the term 2.5D model.
- Motion of the object changes the intensity at the same pixels of the object; this change is used to calculate the depth information, which leads to the 3D structure.

For Example: What is VRML?
- VRML (Virtual Reality Modeling Language) is a specification for displaying 3D objects on the WWW; it is the 3D equivalent of HTML.
- Viewing requires a VRML browser or a VRML plug-in for a Web browser, e.g. the Cortona plug-in from Parallel Graphics.
- It produces a hyperspace (or world), a 3D space that appears on the display screen and through which the viewer can figuratively move.
The example world was developed in VRML:
- The scene (sphere and spot light) remains fixed. Only the camera moves, in pure translational motion, showing the different regions of shadow on the sphere.
- [Figure panels: scene moves to the left (or camera moves to the right); camera, spot light and sphere are collinear; scene moves to the right (or camera moves to the left)]

Surface and Depth (Third Dimension) Estimation
- The camera moves a very small distance to the right.
- The same pixel shows different intensity values in the two frames.
- Conclusion: the surface is concave and must be given a lower elevation (third dimension).
- Assumptions: relatively large sphere radius; very small camera displacement.

Phong Lighting and Shading Model
The concept:
- The Phong reflection model describes how light reflects from surfaces.
- Phong lighting: an empirical model for calculating the illumination at a point on a surface.
- Phong shading: a form of interpolated shading for approximating curved surfaces. Instead of interpolating intensities, it linearly interpolates the vertex normals across the facet and applies the Phong lighting model at every pixel (normal-vector interpolation shading).
- Why is it needed here? For depth estimation at a location on the object.
[Figure: examples of images made using Phong's model]
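
The difference between interpolating intensities and interpolating normals can be sketched in a few lines. This is only an illustration under simplifying assumptions: a single edge between two vertices, a diffuse-only (Lambertian) term standing in for the full lighting model, and the hypothetical helper names `interpolate_normal` and `lambert`, none of which come from the presentation.

```python
import numpy as np

def interpolate_normal(n0, n1, t):
    """Linearly interpolate between two vertex normals, then renormalize."""
    n = (1.0 - t) * np.asarray(n0, dtype=float) + t * np.asarray(n1, dtype=float)
    return n / np.linalg.norm(n)

def lambert(normal, light_dir):
    """Diffuse-only intensity max(0, N.L) for unit vectors (stand-in for the full model)."""
    return max(0.0, float(np.dot(normal, light_dir)))

# Two vertex normals at the ends of an edge, and a light direction between them.
n0 = np.array([0.0, 0.0, 1.0])
n1 = np.array([1.0, 0.0, 0.0])
light = np.array([1.0, 0.0, 1.0]) / np.sqrt(2.0)

# Intensity interpolation: shade the two vertices, then average the intensities.
gouraud_mid = 0.5 * lambert(n0, light) + 0.5 * lambert(n1, light)

# Normal interpolation (Phong shading): interpolate the normal, then shade.
phong_mid = lambert(interpolate_normal(n0, n1, 0.5), light)

print(gouraud_mid, phong_mid)
```

At the midpoint the interpolated normal points straight at the light, so normal interpolation recovers a brightness peak that intensity interpolation smooths away.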

Phong Lighting and Shading Model (continued)
(Source: Akenine-Moller & Haines)
The model has three components:
- Ambient light
- Diffuse reflection
- Specular reflection
It uses four vectors: from the source (L), to the viewer (V), the normal (N), and the perfect reflector (R).
The Phong equation is annotated with the following terms: ambient intensity, diffuse reflection, specular reflection, shininess coefficient, and a quadratic attenuation term.
As the shininess coefficient (γ) is increased, the reflected light is concentrated in a narrower region centred on the direction of the perfect reflector.
[Figure: an example showing the ambient, diffuse and specular components]
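
The equation itself did not survive in this transcript. A commonly used textbook form that matches the labelled terms is I = ka*Ia + (1 / (a + b*d + c*d^2)) * Il * (kd*(L.N) + ks*(R.V)^γ), and the sketch below evaluates it. The coefficient names (`ka`, `kd`, `ks`, `Ia`, `Il`, `a`, `b`, `c`) and the function name `phong_intensity` are assumptions made for illustration; the exact form shown on the slide may differ.

```python
import numpy as np

def normalize(v):
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def phong_intensity(N, L, V, d, ka, kd, ks, gamma,
                    Ia=1.0, Il=1.0, a=1.0, b=0.0, c=0.0):
    """Scalar Phong intensity at one surface point (assumed textbook form).

    N, L, V: normal, direction to the light, direction to the viewer.
    d: distance to the light, used by the attenuation term 1 / (a + b*d + c*d^2).
    """
    N, L, V = normalize(N), normalize(L), normalize(V)
    n_dot_l = float(np.dot(N, L))
    # Perfect reflector: L mirrored about the normal N.
    R = 2.0 * n_dot_l * N - L
    diffuse = kd * max(0.0, n_dot_l)
    # No specular highlight when the light is behind the surface.
    specular = ks * max(0.0, float(np.dot(R, V))) ** gamma if n_dot_l > 0 else 0.0
    attenuation = 1.0 / (a + b * d + c * d * d)
    return ka * Ia + attenuation * Il * (diffuse + specular)

# Viewer slightly off the mirror direction: a larger gamma narrows the highlight.
for gamma in (5, 50):
    print(gamma, phong_intensity([0, 0, 1], [0, 0, 1], [0.5, 0, 0.866],
                                 d=1.0, ka=0.1, kd=0.5, ks=0.4, gamma=gamma))
```

Raising `gamma` with the viewer off the mirror direction lowers the returned intensity, which is the narrowing of the highlight described above.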

The Structure-from-Motion Algorithm
1) Gradient Vector Flow (GVF)
2) Image Segmentation using 2D Wavelets
3) Motion Vector Estimation using Berkeley MPEG Tools
4) Phong Lighting and Shading Model
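
Step 1: Gradient Vector Flow Calculation
The gradient is a vector quantity: a 2D first-derivative measure of change.
By definition, the gradient of an image $f(x, y)$ of continuous spatial coordinates $x$ and $y$ is
$$\nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) = (G_x, G_y)$$
Hence,
magnitude of $\nabla f$: $|\nabla f| = \sqrt{G_x^2 + G_y^2}$
direction of $\nabla f$: $\theta = \arctan(G_y / G_x)$
where $G_x = \partial f / \partial x$ and $G_y = \partial f / \partial y$.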
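
The gradient magnitude and direction can be sketched numerically with NumPy finite differences. Note that this computes only the plain image gradient defined on this slide; a full gradient vector flow field additionally diffuses the gradient of an edge map, which is not shown here. The function name `image_gradient` is hypothetical.

```python
import numpy as np

def image_gradient(img):
    """Finite-difference gradient of a 2D grayscale image.

    Returns (gx, gy, magnitude, direction), with direction in radians.
    """
    img = np.asarray(img, dtype=float)
    # np.gradient returns derivatives per axis: axis 0 is rows (y), axis 1 is columns (x).
    gy, gx = np.gradient(img)
    magnitude = np.hypot(gx, gy)
    # arctan2 keeps the correct quadrant and tolerates gx == 0.
    direction = np.arctan2(gy, gx)
    return gx, gy, magnitude, direction

# A horizontal ramp: intensity grows by 1 per pixel along x and is constant along y.
ramp = np.tile(np.arange(5, dtype=float), (4, 1))
gx, gy, mag, ang = image_gradient(ramp)
print(mag)
```

Using `arctan2` rather than a plain arctangent of the ratio avoids division by zero where the x-derivative vanishes.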

Results using Gradient Vector Flow Calculation
The simplest image: a sphere (an oversimplified image, as it has no edges).
[Figure panels: original image (100 x 100 pixels); gradient vector flow map; zoomed-in view]

Step 2: Image Segmentation using 2D Wavelets
Segmentation is the process of separating objects from the background, and from each other, by deciding which pixels belong to each object.
The wavelet transform is applied to the vector potential defined in a 2D image:
a) Sub-band filtering applied to the vector potential can produce contour images at different scales.
b) The Mallat or Haar wavelet is considered for image segmentation.
GVF Snake Method
- Active contours, or snakes, are computer-generated curves that move within images to find object boundaries.
- A GVF snake can start far from the boundary and will converge into boundary concavities.
[Figure: the GVF field for a U-shaped object; these vectors pull an active contour toward the boundary of the object]
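
As a rough illustration of sub-band filtering, the sketch below performs one level of a 2D Haar decomposition on a plain grayscale array. It assumes an averaging normalization and even image dimensions, and it operates on image intensities rather than on the vector potential mentioned above; the function name `haar2d_level` is hypothetical and this is not the presenter's implementation.

```python
import numpy as np

def haar2d_level(img):
    """One level of a 2D Haar decomposition (averaging normalization).

    Returns four half-size sub-bands (ll, lh, hl, hh).
    """
    img = np.asarray(img, dtype=float)
    if img.shape[0] % 2 or img.shape[1] % 2:
        raise ValueError("image dimensions must be even")
    # Filter along x: average and difference of adjacent column pairs.
    lo = (img[:, 0::2] + img[:, 1::2]) / 2.0
    hi = (img[:, 0::2] - img[:, 1::2]) / 2.0
    # Filter along y: average and difference of adjacent row pairs.
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0  # coarse approximation
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0  # changes along y (horizontal edges)
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0  # changes along x (vertical edges)
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0  # diagonal detail
    return ll, lh, hl, hh

# A vertical step between the first and second columns shows up only in hl.
step = np.tile([0.0, 1.0, 1.0, 1.0], (4, 1))
ll, lh, hl, hh = haar2d_level(step)
print(hl)
```

Applying the same function again to `ll` gives the next coarser scale, which is how contour images at different scales can be obtained.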

Image Segmentation using Edge Detection for a More Complicated Image: a Face
[Figure panels: original image (640 x 480 pixels); edge detection using the x-direction Sobel operator (threshold: 153)]
Other popular image segmentation methods include:
- Edge detection
- Segmentation based on color
- Region growing and shrinking
- Clustering
- Morphological filtering
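
An x-direction Sobel edge detector with a threshold might look like the sketch below. It assumes the threshold is compared against the absolute value of the raw, unscaled filter response on a 0 to 255 grayscale image; the slide does not say how its value of 153 was applied, so that choice and the function name `sobel_x_edges` are assumptions.

```python
import numpy as np

# Standard 3x3 Sobel kernel responding to intensity changes along x.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

def sobel_x_edges(gray, threshold=153):
    """Boolean edge map from the x-direction Sobel response.

    Border pixels are left unmarked because the 3x3 window does not fit there.
    """
    gray = np.asarray(gray, dtype=float)
    h, w = gray.shape
    response = np.zeros((h, w))
    # Accumulate the weighted, shifted copies of the image over the interior.
    for dy in range(3):
        for dx in range(3):
            response[1:h - 1, 1:w - 1] += SOBEL_X[dy, dx] * gray[dy:dy + h - 2, dx:dx + w - 2]
    return np.abs(response) > threshold

# A dark-to-bright vertical step is detected on both sides of the transition.
img = np.zeros((5, 6))
img[:, 3:] = 255
edges = sobel_x_edges(img)
print(edges.astype(int))
```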

Step 3: Motion Vector Estimation using Berkeley MPEG Tools
Block Matching Algorithm (BMA)
The BMA partitions the current frame into small, fixed-size blocks and matches them in the previous frame in order to estimate the displacement of each block (its motion vector) between two successive frames.
Previous approaches (w is the maximum displacement searched):
1. Full Search Algorithm: most precise matching, but a computational complexity of (2w+1)^2
2. Conjugate Direction Search: complexity reduced noticeably, to 3+2w
3. Modified Logarithmic Search: efficient and fast, 2+7log(w)
What is needed?
- A novel motion vector prediction technique
- A highly localized search pattern
- A computational constraint explicitly incorporated into the cost measure
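
The full search variant can be sketched as an exhaustive scan of every displacement within the search range. The sketch assumes the sum of absolute differences (SAD) as the matching cost, which the presentation does not specify, and the function name `full_search` is hypothetical; it does not reproduce the Berkeley MPEG Tools.

```python
import numpy as np

def full_search(cur, ref, top, left, block=8, w=4):
    """Exhaustive block matching for one block of the current frame.

    Tries every displacement (dy, dx) with |dy|, |dx| <= w, i.e. up to
    (2w+1)^2 candidates, and returns (dy, dx, sad) for the lowest SAD.
    """
    cur = np.asarray(cur, dtype=float)
    ref = np.asarray(ref, dtype=float)
    target = cur[top:top + block, left:left + block]
    best = (0, 0, np.inf)
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue
            sad = np.abs(target - ref[y:y + block, x:x + block]).sum()
            if sad < best[2]:
                best = (dy, dx, sad)
    return best

# Synthetic test: the current frame is the reference shifted by 2 rows and -3 columns.
rng = np.random.default_rng(0)
ref = rng.random((32, 32))
cur = np.roll(ref, shift=(2, -3), axis=(0, 1))
dy, dx, sad = full_search(cur, ref, top=12, left=12)
print(dy, dx, sad)
```

The returned vector points from the block in the current frame to its best match in the reference frame, so a frame shifted down by 2 rows and left by 3 columns yields a vector of (-2, 3).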

Step 3: Motion Vector Estimation using Berkeley MPEG Tools (continued)
Motion estimation technique using the block matching algorithm:
- Goal: find the best block from an earlier frame with which to construct an area of the current frame.
- Motion vector: the displacement of the closest matching block in the reference frame (past or future) for a block in the current frame.
[Flowchart: Frame #1, Frame #2, Frame #3; is valid?; apply the block-matching algorithm to compute motion vectors; translate motion vectors into motion predictions]

Conclusion
As a result of the series of processing steps, a 2.5-dimensional figure of the object is produced, similar to a carving in relief.

Applications
1. 3D Model Reconstruction
– 3D motion matching
– Camera calibration
– 3D vision
2. Stereo Television
– Conversion of ordinary 2D films to stereo movies for display on a stereo TV.
– Expected to become available as the next generation of television.

Thank You
Questions?

Shininess Coefficient and Specular Component
[Figure]

Step 3: Motion Vector Estimation using Berkeley MPEG Tools
MPEG Encoding Frame Types
Frame sequence: I1 B1 B2 B3 P1 B4 B5 B6 P2 B7 B8 B9 I2
Type | Name | Description
I | Intra | Encode complete image, similar to JPEG
P | Forward Predicted | Motion relative to previous I and Ps
B | Bidirectionally Predicted | Motion relative to previous & future Is & Ps