Published by Janis Mitchell. Modified over 6 years ago.
1
Steps Towards the Convergence of Graphics, Vision, and Video
P. Anandan Microsoft Research
2
Graphics : The Traditional View
[Diagram: Model → Synthetic Camera → Image (output)]
3
Vision : The Traditional View
[Diagram: Real Scene → Real Cameras → Model (output)]
4
Video : The Traditional View
[Diagram: Encode → transform coefficients, motion vectors → Decode → Display]
5
Summary…
Graphics renders models: simulates (approximates) physics
Vision builds models from images: “inverse physics and geometry”
…and video is for viewing
6
And most videos are badly made
Collection is easy, but quality is poor
Unstable; bad camera shots; long pauses between interesting content
In the linear form, access to interesting content is hard
Signal-level compression is unselective
Content manipulation is extremely hard
7
New View: A Picture = 1000 Words
Why throw away all that information we collect in images and video?
Use them to tell stories
Geometry is useful, but it is not the central goal
8
Application Scenarios
Communication
Commerce
Entertainment (movies, games)
Education and training
9
Convergence of Graphics, Vision, and Video Processing
10
Four Steps Towards Convergence
Representations and tools
Software infrastructure
Hardware pipeline
Applications
Focus of this talk: Representations and tools
11
Geometry/image continuum
Common Data Model
[Figure from Kang/Szeliski/Anandan ICIP paper: the geometry/image continuum, spanning image-centric to geometry-centric representations. Labels: Light field; Lumigraph; Concentric mosaics; Sprites with depth; Layered Depth Images; View-dependent texture; Fixed geometry; Polygon rendering + texture mapping; Interpolation; Warping]
We will be developing an imaging model that captures this spectrum and permits easy use of all these techniques. Finding a common platform that accommodates the entire spectrum gives us the flexibility to use each technique in it and the efficiency to mix representations without a performance penalty.
12
Image-Based Rendering
Use images as rendering primitives
Panoramic images
Light Field and Lumigraph
Concentric mosaics
Sprites with Depth (3D reconstruction)
…mostly new view generation
13
Image Based Modeling (and Rendering)
The Geometry Centric Approach
Images → Geometry + Texture → Rendered Images
14
The Lightfield and the Lumigraph
The collection of all the light rays
through space (3D)
at all times (1D)
over all colors (3D)
along different directions (2D)
Also various subsets of this: the Lumigraph
15
Layered Sprites ... + ...
16
Stereo Imaging and Depth Maps
[Diagram: 3-D Scene → Collection of images → Stereo → “Model” of scene]
17
The notion of “disparity”
[Figure: a 3-D point projects to $u_L$ in the left image and $u_R$ in the right image]
Disparity encodes depth $z$:
Disparity, $d = u_L - u_R$
$z \propto d^{-1}$
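The inverse relation between disparity and depth can be made concrete with a short sketch. It assumes a rectified stereo pair and the usual pinhole relation z = f·B/d. The focal length `focal_px` and baseline `baseline_m` are hypothetical calibration values that do not appear in the talk, which only states that z is proportional to 1/d.

```python
import numpy as np

def depth_from_disparity(disparity, focal_px, baseline_m):
    """Convert disparity (pixels) to depth (metres) via z = f * B / d.

    focal_px and baseline_m are assumed calibration values for a
    rectified stereo pair; they are not taken from the slides.
    """
    d = np.asarray(disparity, dtype=float)
    z = np.full(d.shape, np.inf)
    valid = d > 0  # zero disparity corresponds to a point at infinity
    z[valid] = focal_px * baseline_m / d[valid]
    return z

# Hypothetical image coordinates of three matched points.
u_left = np.array([120.0, 80.0, 64.0])
u_right = np.array([100.0, 70.0, 64.0])
depth = depth_from_disparity(u_left - u_right, focal_px=800.0, baseline_m=0.1)
```

Halving the disparity doubles the estimated depth, which is why small matching errors matter most for distant points.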
18
Stereo
Matching criterion: measures similarity of pixels
Aggregation method: how the error function is computed
Winner selection: computing the results
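The three components can be illustrated with a minimal block-matching sketch. The specific choices here are assumptions made for illustration, not the methods used in the talk: absolute intensity difference as the matching criterion, a square box window for aggregation, and winner-take-all selection. The function names are hypothetical.

```python
import numpy as np

def box_sum(img, radius):
    """Aggregation: sum values over a (2r+1) x (2r+1) window, edge-padded."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def block_match(left, right, max_disp, radius=1):
    """Return an integer disparity map for a rectified grayscale pair."""
    left = left.astype(float)
    right = right.astype(float)
    h, w = left.shape
    cost = np.full((max_disp + 1, h, w), np.inf)
    for d in range(max_disp + 1):
        # Matching criterion: compare left pixel u with right pixel u - d.
        diff = np.abs(left[:, d:] - right[:, :w - d])
        cost[d, :, d:] = box_sum(diff, radius)
    # Winner selection: keep the disparity with the lowest aggregated cost.
    return np.argmin(cost, axis=0)

# Synthetic pair: the left image is the right image shifted by 3 pixels.
rng = np.random.default_rng(0)
right = rng.random((20, 30))
left = np.roll(right, 3, axis=1)
disp = block_match(left, right, max_disp=5)
```

On this synthetic pair the recovered disparity is 3 from column 3 onward. The first three columns contain wrapped-around pixels and have no valid match at that disparity.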
19
Problems with stereo
Depth discontinuities
Lack of texture (depth ambiguity)
Non-rigid effects (highlights, reflection, translucency)
20
View-dependent geometry: Concept
[Figure labels: correct global geometry, Gglobal; GVC; VC; C1 to C8]
21
Video-Based Rendering
Image-Based Rendering: render from (real-world) images for efficiency, quality, and photo-realism
Video-Based Rendering: use video instead of still images; generate computer video instead of computer graphics
22
Virtual Viewpoint Video
Capture multiple synchronized video streams
23
Acquisition setup
[Figure labels: cameras; concentrators; hard disks; controlling laptop]
Our video capture system consists of 8 cameras, each with a resolution of 1024 by 768 capturing at 15 frames per second. Each group of 4 cameras is synchronized using a device called a concentrator, which pipes all the uncompressed video data to a bank of hard disks via a fiber optic cable. The two concentrators are themselves synchronized, and are controlled by a single laptop.
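A rough estimate of the uncompressed data rate follows from the figures above. The bytes-per-pixel value is an assumption (8-bit raw sensor data), since the talk does not state the pixel format; with 3 bytes per pixel the totals would triple.

```python
NUM_CAMERAS = 8
WIDTH, HEIGHT = 1024, 768
FPS = 15
BYTES_PER_PIXEL = 1  # assumption: 8-bit raw sensor data, not stated in the talk

def raw_rate_bytes_per_s(num_cameras, width, height, fps, bytes_per_pixel):
    """Uncompressed video data rate in bytes per second."""
    return num_cameras * width * height * fps * bytes_per_pixel

total = raw_rate_bytes_per_s(NUM_CAMERAS, WIDTH, HEIGHT, FPS, BYTES_PER_PIXEL)
per_concentrator = total // 2  # each concentrator handles 4 of the 8 cameras

print(f"{total / 1e6:.1f} MB/s total, "
      f"{per_concentrator / 1e6:.1f} MB/s per concentrator")
```

Under this assumption the system writes about 94.4 MB/s in total, which helps explain the bank of hard disks and the fiber optic link.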
24
Representation
MAIN LAYER: Color, Depth
BOUNDARY LAYER: Color, Depth, Matting information
26
Other VBR Examples
Facial animation: Video Rewrite, …
Layer/matte extraction: Video Matting, …
Dynamic (stochastic) elements: Video Textures, …
3-D world navigation
27
Video Matting
Pull dynamic α-matte from video with complex backgrounds
[Chuang et al., UW, SIGGRAPH’2002]
28
Video Matting Background modification examples
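Once an alpha matte has been pulled, replacing the background reduces to the standard compositing equation C = αF + (1 − α)B, applied per pixel and per frame. The sketch below shows only that final step and assumes the foreground colors and matte are already available; the function name `composite_over` is hypothetical, and the matte extraction itself is not shown.

```python
import numpy as np

def composite_over(foreground, alpha, background):
    """Composite a foreground over a new background: C = a*F + (1 - a)*B.

    foreground, background: float arrays of shape (H, W, 3)
    alpha: float array of shape (H, W) with values in [0, 1]
    """
    a = np.asarray(alpha, dtype=float)[..., np.newaxis]
    return a * foreground + (1.0 - a) * background

# Tiny illustrative frame: white foreground over a black replacement background.
fg = np.ones((2, 2, 3))
new_bg = np.zeros((2, 2, 3))
alpha = np.array([[1.0, 0.5],
                  [0.25, 0.0]])
out = composite_over(fg, alpha, new_bg)
```

Fractional alpha values along hair and motion-blurred edges are what let the new background show through naturally.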
29
Video Textures
How can we turn a short video clip into an infinite amount of continuous video?
dynamic elements in 3D games and presentations
alternative to 3D graphics animation?
[Schödl, Szeliski, Salesin, Essa, SIGGRAPH 2000]
30
Video Textures
Find cyclic structure in the video
(Optional) region-based analysis
Play frames with random shuffle
Smooth over discontinuities (morph)
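The first and third steps above can be sketched as follows. This is a simplified illustration, not the published implementation: frame similarity is a plain L2 distance, a jump from frame i to frame j is made likely when frame j resembles frame i+1, and the last frame is excluded as a jump target so playback never reaches a dead end. Region-based analysis and morphing are omitted, and `sigma` is a hypothetical smoothing parameter.

```python
import numpy as np

def transition_probabilities(frames, sigma):
    """Build a row-stochastic matrix P where P[i, j] ~ exp(-D[i+1, j] / sigma)."""
    n = len(frames)
    flat = frames.reshape(n, -1).astype(float)
    # D[i, j]: L2 distance between frames i and j (O(n^2) memory; small clips only).
    dist = np.linalg.norm(flat[:, None, :] - flat[None, :, :], axis=2)
    # Row i uses D[i+1, j]; the last frame is dropped as a target (no successor).
    probs = np.exp(-dist[1:, :-1] / sigma)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs

def play(probs, start, length, rng):
    """Generate a frame index sequence by sampling transitions at random."""
    order = [start]
    for _ in range(length - 1):
        order.append(int(rng.choice(probs.shape[1], p=probs[order[-1]])))
    return order

# Synthetic cyclic clip: 24 frames of a 4x4 pattern with a period of 8 frames.
t = np.arange(24)
grid = np.linspace(0, np.pi, 16).reshape(4, 4)
frames = np.sin(2 * np.pi * t[:, None, None] / 8 + grid[None, :, :])

probs = transition_probabilities(frames, sigma=0.1)
sequence = play(probs, start=0, length=50, rng=np.random.default_rng(0))
```

A small `sigma` keeps playback close to the original frame order and to exact loops, while a larger value allows more random jumps at the cost of visible discontinuities.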
31
Interactive fish
This is the result that we get. We processed 8 minutes of fish video to generate the fish animation. The current goal is marked with the red dot, and the bar shows the current frame.
32
Video as a Space-time Volume
33
Video as a movie
34
The “flipbook” paradigm
35
(vs.) The Space-time Cube
36
Space-time video geometry
37
[Figure: space-time volume with axes X, Y, and T]
38
Automatic EpiStrip analysis and extraction
Space-Time volume
EPIs: Epipolar plane images
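As a minimal sketch of the idea, a video can be stacked into a (T, Y, X) array, and an epipolar plane image is then a slice at a fixed scanline y. The example assumes the simplest setting, in which scene points move horizontally across frames so that each one traces a straight line in the EPI. The function names are hypothetical, and the automatic strip extraction itself is not shown.

```python
import numpy as np

def make_volume(frames):
    """Stack equally sized frames into a space-time volume of shape (T, Y, X)."""
    return np.stack(frames, axis=0)

def epi(volume, y):
    """Epipolar plane image: the X-T slice of the volume at scanline y."""
    return volume[:, y, :]

# Synthetic sequence: one bright point on row 5 moving 2 pixels per frame.
frames = []
for t in range(6):
    frame = np.zeros((8, 16))
    frame[5, 1 + 2 * t] = 1.0
    frames.append(frame)

volume = make_volume(frames)
slice_y5 = epi(volume, y=5)              # shape (T, X) = (6, 16)
trace = np.argmax(slice_y5, axis=1)      # x position of the point in each frame
```

In the slice the moving point appears as a straight line, here advancing 2 pixels per frame. Under sideways camera translation, points at different depths produce lines of different slope.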
39
EpiTubes and layers
[Figure labels: Original data; Automatically extracted EpiTube; Re-synthesized layer]
40
Layer extraction from sequence
[Figure labels: Original sequence; Front-most layer, the dodecahedron; Background layer]
41
Specularities and Highlights.
[Figure labels: Specular reflections; Highlights]
42
Taxonomy of Specular EPI strips.
[Figure labels: Specularity across multiple strips; Highlight with varying colors]
EPI trace of features AND specularities = Extracted true feature trace + Extracted specularity
43
Specularity in the Lightfield volume
44
Original = Diffuse + Specular
45
Original = Diffuse + Specular
46
Summary
Rather than explicitly modeling the dynamic 3D appearance/behavior of the world, learn it from lots of sample data
The closer you stick to the original data, the greater the realism
The more you (automatically) model/abstract, the greater the control
47
Summary Great synergy exists between g
48
MPEG4 and MPEG7