2D & 3D VIDEO PROCESSING FOR IMMERSIVE APPLICATIONS Emerging Convergence of Video, Vision & Graphics Harpreet S. Sawhney Rakesh Kumar.


2 2D & 3D VIDEO PROCESSING FOR IMMERSIVE APPLICATIONS Emerging Convergence of Video, Vision & Graphics Harpreet S. Sawhney Rakesh Kumar

3 ACKNOWLEDGEMENTS Collaborative work with: Hai Tao, Yanlin Guo, Steve Hsu, Supun Samarasekera, Keith Hanna, Aydin Arpa, Rick Wildes

4 TECHNICAL SUCCESS OF CONVERGENCE TECHNOLOGIES
–PC-based near real-time mosaicing
–Automated video enhancement: VHS-to-DVD
–Iris recognition, active vision
–Image-based modeling for entertainment
–Real-time video insertion

5 Immersive and Interactive Telepresence: Modes of Operation
Observation Mode: User observes a remote site from any perspective. User “walks” through the site to view activities of interest “up close”. Example: security, facility guards, sports & entertainment
Conversation Mode: Users talk and observe one another as if in the same room. Users walk around yet maintain eye contact. Example: immersive tele-conferencing
Interaction Mode: Remote users share a common work space. Users observe each other’s hands as they manipulate shared objects, such as war room wall displays. Example: mission planning, remote surgery

6 Quality of Service for Tele-presence: Critical Issues
High quality for immersive experience
–Artifact-free recovery of 3D shape from video streams
–Efficient 3D video representation and compression
–High-quality rendering of new views using 3D shape and video streams
–Bandwidth available in the Next Generation Internet
Low latency for interactive applications
–Real-time 3D geometry recovery at the content server end
–Real-time new view rendering at the browser client end
–Adaptive stream management to handle user requests and network loads
–Error resilience and concealment to fill in missing packets

7 Convergence Technologies … for immersive & interactive visual applications...
–Vision algorithms: high-quality 3D shape recovery and dynamic scene analysis
–ASICs, high-performance hardware: real-time video processing
–Compact, low-cost cameras: CMOS cameras
–Low-latency and high-quality compression: error resilience
–Real-time view synthesis: standard platforms, e.g. PCs
–Immersive displays

8 Vision algorithm performance over time [Chart: algorithm complexity plotted against time, with the year 2000 marked. Milestones labeled on the chart: 2D Stabilization; Mosaicing for entertainment & surveillance; 2D Video Insertion; Real-time insertion in Live TV; Coarse 3D Depth Recovery; Face Finding for Iris Recognition; Video registration to 3D site models; Geo-registration visual databases; High Quality 3D shape extraction; Immersive Telepresence]

9 HW Performance/Size/Cost over time
Sarnoff ACADIA ASIC performance
–100 MHz system clock; processes 100 million pixels/sec in each processing element
–10 billion operations/sec total IC performance
–800 MB/sec SDRAM interface using 64-bit bus
Enables building smart 3D cameras for immersive applications.
[Chart labels: VFE; VFE; ACADIA ASIC; 2000]
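The ACADIA figures above can be cross-checked with simple arithmetic. The sketch below assumes one bus transfer per clock cycle, an SDRAM interface running at the 100 MHz system clock, and decimal megabytes; the slide states none of these, and the function names are illustrative only.

```python
def bus_bandwidth_mb_per_s(bus_width_bits, clock_hz, transfers_per_clock=1):
    """Peak bandwidth in decimal MB/s for a parallel bus.

    transfers_per_clock=1 is an assumption; the slide does not state it.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * clock_hz * transfers_per_clock / 1e6


def ops_per_clock(total_ops_per_s, clock_hz):
    """Average number of operations completed per clock across the chip."""
    return total_ops_per_s / clock_hz


print(bus_bandwidth_mb_per_s(64, 100e6))  # 800.0
print(ops_per_clock(10e9, 100e6))         # 100.0
```

Under these assumptions, a 64-bit bus at 100 MHz gives the quoted 800 MB/sec, and 10 billion operations/sec corresponds to about 100 operations per clock cycle across the chip.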

10 Application Performance
Parametric Motion: Stabilization & Mosaicing
–720x Hz OR 720x Hz
Pyramid based Fusion: Dynamic Range, Focus Enhancement
–720x Hz OR 720x Hz
Stereo Depth Extraction
–720x240 field, 32 disparity levels in 4 ms (250 Hz)
–720x240 field, 60 disparity levels in 10 ms (100 Hz)
–60 disparities on 1k x 1k images at 55 ms (18 Hz)
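The stereo timings above convert to frame rates as 1000 / latency in ms. As a rough consistency check, the sketch below also estimates pixel-disparity evaluations per second. It assumes every pixel is tested at every disparity level and reads "1k x 1k" as 1000 x 1000; neither is stated on the slide.

```python
def rate_hz(latency_ms):
    """Frames (or fields) per second if each one takes latency_ms to process."""
    return 1000.0 / latency_ms


def disparity_evals_per_s(width, height, levels, latency_ms):
    """Pixel-disparity evaluations per second.

    Assumes an exhaustive search: every pixel is tested at every level.
    """
    return width * height * levels * rate_hz(latency_ms)


for width, height, levels, ms in [(720, 240, 32, 4),
                                  (720, 240, 60, 10),
                                  (1000, 1000, 60, 55)]:
    evals = disparity_evals_per_s(width, height, levels, ms)
    print(f"{width}x{height}, {levels} levels: "
          f"{rate_hz(ms):.0f} Hz, {evals:.2e} evaluations/s")
```

Under these assumptions, all three configurations come out at roughly one billion pixel-disparity evaluations per second.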

11 Sarnoff Compression Technology … Required algorithm components for tele-presence are emerging... [Chart: algorithm complexity plotted against time. Technologies labeled on the chart: Still Image Compression; Pyramid & Wavelet based Encoding; MPEG2: Encoding and Transmission; Just Noticeable Difference (JND): MPEG2 Encoding and Quality Measurement; MPEG2 multiplexing service; VideoPhone: H.263 Low Latency; MPEG4, Progressive Encoding. Companies and products labeled on the chart: DIREC-TV & HDTV; Tektronix; ICTV; LG Electronics; E-vue]

12 A FRAMEWORK FOR VIDEO PROCESSING
ALIGN: 2D & 3D MODELS OF MOTION & STRUCTURE; MODEL-BASED IMAGE SEQUENCE ALIGNMENT
TEST: WARP/RENDER WITH 2D/3D MODELS; TEST ALIGNMENT QUALITY
SYNTHESIZE: CREATE OUTPUT REPRESENTATIONS

13 Highlights of Sarnoff’s Video Analysis Technologies … framework applied to create immersive representations...
Core Vision Algorithms for (Real-time) Motion & 3D Video Analysis
–2D Immersive & Layered Representations
–Stereo & Video Sequence Enhancement
–Multi-camera Immersive Dynamic Rendering
–Model-centric Video Visualization
[Further diagram labels: Spherical Mosaics; Dynamic & Synopsis Mosaics; Hi-Q IBR based mixed resolution synthesis; Video Quality Enhancement for efficient compression; Dynamic model & video visualization; Geo-registration with reference image database; Hi-Q Depth extraction; Image-based rendering with dynamic depth]

14 SPHERICAL MOSAICS Sarnoff Library Video: captures almost the complete sphere with 380 frames. TOPOLOGY INFERENCE & LOCAL-TO-GLOBAL ALIGNMENT [Sawhney, Hsu, Kumar ECCV98; Szeliski, Shum SIGGRAPH98]

15 SPHERICAL TOPOLOGY EVOLUTION

16 SPHERICAL MOSAIC Sarnoff Library

17 ACTIVE FOCUS OF ATTENTION WFOV/NFOV CONTROL

18 DYNAMIC MOSAICS [Panels: Original Video; Video Stream with deleted moving object; Dynamic Mosaic Video]

19 SYNOPSIS MOSAICS

20 ALIGNMENT & SYNTHESIS FOR HI-RES STEREO SYNTHESIS: A HIGH END APPLICATION OF IBMR [Sawhney, Guo, Hanna, Kumar, Zhou, Adkins SIGGRAPH2001] [Panels: Low-Res Left; Synthesized High-Res Left; Original High-Res Right]

21 THE PROBLEM SCENARIO INPUT OUTPUT Left Eye (Typically 1.5K) Right Eye (Typically 6K)

22 3D & Motion Alignment Based Stereo Sequence Processing [Diagram: left and right sequences over neighboring frames (t-2 … t+2 in one panel, t-1 … t+3 in the other), linked by "stereo" arrows between the two views and "flow" arrows between frames]
Highlights:
–Scintillation effect is reduced.
–Occlusion regions are better handled.

23 SYNTHESIS RESULT ON REAL FOOTAGE

24 IMPLICATIONS FOR IMMERSIVE IBMR CAMERA CONFIGURATIONS [Figure labels: Lo-res camera; Hi-res camera] A multi-resolution camera configuration allows 3D capture at the highest resolution, as well as a large user-controlled range of zooms, without the need for zoom control on the cameras.

25 Model-Centric Video Visualization OR Video-Centric Model Visualization [Hsu,Supun,Kumar,Sawhney CVPR00] [Panels: Original Video; Site model; Geo-registration of video to site model; Re-projection of video after merging with model]

26 Video to Site Model Alignment Model to frame alignment REFINE Correspondence-less exterior orientation from 3D-2D line pairs

27 Oriented Energy Pyramid [Panels: 0°; 45°; 90°; 135°]
Goal: a representation that indicates edge strength in the image at various orientations and scales
–Orientation selectivity: reduces false matches
–Coarse-to-fine: increases capture range
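A minimal sketch of an oriented energy pyramid at the four orientations named above. The slide does not say which filters are used, so this substitutes simple stand-ins: a central-difference derivative steered to each angle and squared, plus 2x2 block averaging between levels. A real system would more likely use proper oriented band-pass filters and a Gaussian pyramid. All function names here are illustrative.

```python
import math


def gradients(img):
    """Central-difference x and y gradients with clamped borders."""
    h, w = len(img), len(img[0])
    gx = [[(img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]) / 2.0
           for x in range(w)] for y in range(h)]
    gy = [[(img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]) / 2.0
           for x in range(w)] for y in range(h)]
    return gx, gy


def oriented_energy(img, angle_deg):
    """Squared derivative taken along the direction angle_deg.

    With this convention a vertical edge (intensity changing along x)
    responds most strongly at 0 degrees and not at all at 90 degrees.
    """
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    gx, gy = gradients(img)
    return [[(c * gx[y][x] + s * gy[y][x]) ** 2
             for x in range(len(img[0]))] for y in range(len(img))]


def downsample(img):
    """Halve resolution by averaging 2x2 blocks (stand-in for a Gaussian pyramid)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def oriented_energy_pyramid(img, levels=3, angles=(0, 45, 90, 135)):
    """Return a list (fine to coarse) of {angle: energy image} dicts."""
    pyramid = []
    for _ in range(levels):
        pyramid.append({a: oriented_energy(img, a) for a in angles})
        img = downsample(img)
    return pyramid


# Example: an 8x8 image containing a single vertical step edge.
step = [[0.0 if x < 4 else 1.0 for x in range(8)] for _ in range(8)]
pyr = oriented_energy_pyramid(step)
print(len(pyr), len(pyr[1][0]), len(pyr[2][0]))  # 3 4 2
```

Because each orientation channel responds only to edges of a matching direction, comparing channels of the same angle prevents, for example, a horizontal model edge from being matched to a vertical image edge.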

28 Pose Refinement Algorithm …iterative coarse-to-fine adjustment of pose... [Animation placeholder: gradual improvement of alignment during the coarse-to-fine iterations (regsite_animation.avi)]
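The coarse-to-fine idea can be illustrated on a much simpler problem than camera pose. The sketch below estimates an integer 1D shift between two signals and is not the pose refinement algorithm itself. It searches only ±1 sample at each pyramid level, starting at the coarsest level and doubling the estimate when moving to the next finer one. The signals, the pair-averaging pyramid, and the mean-squared-difference score are all assumptions chosen for illustration.

```python
import math


def downsample(sig):
    """Halve resolution by averaging adjacent pairs of samples."""
    return [(sig[2 * i] + sig[2 * i + 1]) / 2.0 for i in range(len(sig) // 2)]


def mean_sq_diff(ref, obs, shift):
    """Mean squared difference between ref[i] and obs[i + shift] over the overlap."""
    total, count = 0.0, 0
    for i in range(len(ref)):
        j = i + shift
        if 0 <= j < len(obs):
            total += (ref[i] - obs[j]) ** 2
            count += 1
    return total / count if count else float("inf")


def coarse_to_fine_shift(ref, obs, levels=3):
    """Estimate the shift of obs relative to ref with a +/-1 search per level."""
    pyramid = [(ref, obs)]
    for _ in range(levels - 1):
        r, o = pyramid[-1]
        pyramid.append((downsample(r), downsample(o)))

    shift = 0
    for level, (r, o) in enumerate(reversed(pyramid)):  # coarsest level first
        if level > 0:
            shift *= 2  # one coarse sample spans two finer samples
        candidates = (shift - 1, shift, shift + 1)
        shift = min(candidates, key=lambda s: mean_sq_diff(r, o, s))
    return shift


# Example: a Gaussian bump and a copy displaced by 5 samples.
ref = [math.exp(-((i - 13) / 3.0) ** 2 / 2.0) for i in range(32)]
obs = [ref[i - 5] if i >= 5 else 0.0 for i in range(32)]
print(coarse_to_fine_shift(ref, obs))  # 5
```

A ±1 search at full resolution alone could never reach a shift of 5; starting at the coarsest of three levels extends the reachable range to several samples, which is the sense in which coarse-to-fine processing increases capture range.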

29 Geo-Registration: Video to Reference Database Alignment [Wildes et al. ICCV01] [Panels: Current Video; 3D Reference Imagery]

30 Registration: Radical Appearance Changes

31 Dynamic 3D Capture & Rendering …global modeling is not feasible...
–Recovering depth from local views
–Depth refinement across multiple local views
–New view synthesis using multiple local views
–Cross view depth checking
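The slide does not say how cross view depth checking is performed. One common form, shown here purely as an assumed illustration, is a left-right consistency test. A left-view pixel at column x with disparity d should land on right-view column x - d, and the right view's own disparity there should agree within a tolerance. The sign convention and the names below are assumptions.

```python
def cross_check(disp_left, disp_right, tol=1):
    """Flag left-view pixels whose disparity is confirmed by the right view.

    Assumed convention: left pixel (y, x) with integer disparity d
    corresponds to right pixel (y, x - d).
    """
    h, w = len(disp_left), len(disp_left[0])
    valid = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            d = disp_left[y][x]
            xr = x - d
            # The match must fall inside the right image, and the right
            # view's disparity there must agree within the tolerance.
            if 0 <= xr < w and abs(disp_right[y][xr] - d) <= tol:
                valid[y][x] = True
    return valid


left = [[1, 1, 1, 1, 1]]
right = [[1, 1, 1, 9, 1]]
print(cross_check(left, right))  # [[False, True, True, True, False]]
```

In the example, the first pixel fails because its match falls outside the right image, and the last fails because the right view reports a very different disparity. Pixels that fail such a test are typical candidates for occlusion handling or for refinement from another view.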

32 3D Shape/Depth Estimation from Multiple Views of a Scene [Figure label: Stereo Pair] Estimation of high-quality, artifact-free depth maps co-registered with video imagery for rendering new views. Must work both outdoors and indoors.

33 Multi-baseline depth estimation: requirements [Tao,Sawhney,Kumar WACV00, ICCV01] [Figure labels: Depth maps; New view rendering; A traditional stereo algorithm; Global matching method; Thin structures; Accurate boundaries]

34 New view rendering using local depth estimation [Panels: Local flow estimation (1992); Multi-window plane+parallax algorithm (1998); Color segmentation based stereo algorithm (2000)]

35 Main ideas
Motivations
–Handle textureless regions
–Handle object boundaries accurately
–Enforce global visibility constraints
–Hypothesize reasonable depths for unmatched regions
Solutions
–Global matching method: an analysis-by-synthesis approach
–Representation: smooth depth representation in homogeneous regions
–Search method: neighborhood depth hypotheses generation
–Efficient algorithm: incremental warping
–Scene constraints: prior functions
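One simple way to obtain a smooth depth representation inside a homogeneous color segment is to fit a single plane to the disparity samples that fall in that segment. The planar model d = a*x + b*y + c is an assumption made for this sketch, since the slide does not give the parameterization, and the helper names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(samples):
    """Least-squares fit of d = a*x + b*y + c to (x, y, d) samples.

    Solves the 3x3 normal equations with Cramer's rule.
    """
    sxx = sxy = sx = syy = sy = n = 0.0
    sxd = syd = sd = 0.0
    for x, y, d in samples:
        sxx += x * x
        sxy += x * y
        sx += x
        syy += y * y
        sy += y
        n += 1
        sxd += x * d
        syd += y * d
        sd += d
    m = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxd, syd, sd]
    det = det3(m)
    if abs(det) < 1e-12:
        raise ValueError("degenerate segment: samples must not be collinear")
    coeffs = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]  # replace one column with the right-hand side
        coeffs.append(det3(mc) / det)
    return tuple(coeffs)


# Example: disparities sampled from a known plane inside one segment.
segment = [(x, y, 0.5 * x - 0.25 * y + 2.0) for x in range(4) for y in range(4)]
a, b, c = fit_plane(segment)
print(round(a, 6), round(b, 6), round(c, 6))  # 0.5 -0.25 2.0
```

A per-segment model like this gives every pixel in a textureless region a depth, even where local matching is ambiguous, and keeps depth discontinuities aligned with segment boundaries.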

36 Color Segmentation [Comaniciu 97] [Panels: Original image (frame 12); Original image (left); Color segmentation]

37 New view rendering using local depth estimation: color segmentation based stereo algorithm [Panels: Left image; True depth; New view rendering]

38 Depth computation from 3 views [Panels: Video frame 11; Video frame 12; Video frame 13; Color segmentation (frame 12); Depth map (frame 12)]

39 Multiple View Depth Recovery and New View Rendering New view rendering from multiple views. New view rendering from a single view. left: from frame 212, right: from frame 215

40 Multiple view depth recovery and new view rendering [Panels: Original 14 video frames (frames 04-17); Depth maps of frames 12 and 15; New view rendering (71 frames)]

41 Immersive Visualization of a Dynamic Event
–Temporally consistent motion and 3D shape extraction
–Scintillation-free dynamic high-quality rendering

42 AN IMMERSIVE IBMR GRAND CHALLENGE

43 AND IF WE DO IT RIGHT
