Presented by Yehuda Dar. Advanced Topics in Computer Vision (048921), Winter 2011-2012.


1 Presented by Yehuda Dar. Advanced Topics in Computer Vision (048921), Winter 2011-2012

2 Video Compression Basics. Fundamental tradeoff among: bit-rate, distortion, computational complexity.

3 Video Compression Basics. Utilized redundancies: spatial, temporal, psycho-visual, statistical.

4 H.264 Overview

5 H.264 Redundancy Utilization
Redundancy      Utilization   Means
Spatial         High          Transform coding; intra coding (spatial prediction)
Temporal        High          Motion estimation & compensation
Psycho-visual   Medium        YCbCr color space; 4:2:0 sampling; DC / AC coefficients quantization
Statistical     High          Entropy coding

6 Compression using Computer Vision. Motivation: better utilization of the psycho-visual redundancy; application-specific compression methods; exploring new approaches.

7 A Review of: A Scheme for Attentional Video Compression. R. Gupta and S. Chaudhury, PAMI 2011

8 Method Outline: salient region detection; foveated video coding; integration into H.264. Figure: foveated image coding demonstration, from Guo & Zhang, Trans. Image Process., 2010.

9 Saliency Map Step 1: Creating a 3D Feature Map
Feature type   Calculation method                                   Based on
Global         Color spatial variance                               Liu et al, CVPR 2007
Local          Center-surround multi-scale ratio of dissimilarity   Huang et al, ICPR 2010
Rarity         Pulse-DCT                                            Yu et al, ICDL 2009

10 Relevance Vector Machine (RVM). Used here as a binary classifier. Advantages over the support vector machine (SVM): provides posterior probabilities; better generalization ability; faster decisions.

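11 Saliency Map Step 2: Unify Features using RVM. Training procedure for macroblocks (MBs): the global, local and rarity feature values are averaged over the MB to form the RVM sample; the ground-truth pixels of the MB are counted to set its label, 'salient' / 'non-salient'.

12 Saliency Map Step 2: Unify Features using RVM. Trained RVM usage: a new input is fed to the RVM, which outputs a binary label ('salient' / 'non-salient') and a probability that serves as the relative saliency.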
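
The training procedure of slides 11-12 can be sketched as follows. This is only an illustration of how per-MB samples might be assembled, not the authors' code: the function names, the 2D-list representation of the maps, and the 0.5 majority threshold used for labelling are all assumptions. The RVM itself is not shown.

```python
MB = 16  # macroblock size in pixels (H.264 uses 16x16 macroblocks)


def block_mean(plane, top, left, size=MB):
    """Average of the 2D list `plane` over one size x size block."""
    total = sum(plane[y][x]
                for y in range(top, top + size)
                for x in range(left, left + size))
    return total / (size * size)


def mb_samples(global_map, local_map, rarity_map, truth_mask,
               size=MB, min_salient_fraction=0.5):
    """Return one (feature_vector, label) pair per macroblock.

    The three feature maps and the binary ground-truth mask are equal-sized
    2D lists. The label threshold `min_salient_fraction` is an assumed value.
    """
    height, width = len(truth_mask), len(truth_mask[0])
    samples = []
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            # 3D feature vector: mean global, local and rarity value in the MB
            features = tuple(block_mean(m, top, left, size)
                             for m in (global_map, local_map, rarity_map))
            # Count ground-truth salient pixels (mask values are 0 or 1)
            salient_fraction = block_mean(truth_mask, top, left, size)
            label = 1 if salient_fraction >= min_salient_fraction else 0
            samples.append((features, label))
    return samples
```

The resulting pairs would then be passed to whatever RVM implementation is available; at test time the same feature vector is built for a new MB and the classifier's posterior probability is read as the relative saliency.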

13 Saliency Map: Result Comparison. Figure panels: input; global; local [Huang et al, ICPR 2010]; rarity [Yu et al, ICDL 2009]; proposed; [Harel et al, NIPS 2006]; [Bruce & Tsotsos, NIPS 2006]. Figures from Gupta & Chaudhury, PAMI 2011.

14 Saliency Map: ROC Curve. Curves: proposed; [Harel et al, NIPS 2006]. Figure from Gupta & Chaudhury, PAMI 2011.

15 Integration Into H.264: Calculation of Saliency Values. The saliency map is recalculated only when it changes significantly. Mutual information between successive frames indicates changes in saliency. Figures from Gupta & Chaudhury, PAMI 2011.
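
A minimal sketch of the mutual-information test, assuming 8-bit luma frames stored as 2D lists and a histogram with 16 intensity bins. The bin count, the function names and the threshold rule are illustrative assumptions; the slide only states that mutual information between successive frames indicates saliency changes.

```python
import math
from collections import Counter


def mutual_information(frame_a, frame_b, levels=16):
    """Mutual information (in bits) between co-located pixels of two frames.

    Frames are equal-sized 2D lists of 8-bit values; intensities are
    quantized into `levels` bins before the histograms are built.
    """
    step = 256 // levels
    pairs = [(a // step, b // step)
             for row_a, row_b in zip(frame_a, frame_b)
             for a, b in zip(row_a, row_b)]
    n = len(pairs)
    joint = Counter(pairs)
    marg_a = Counter(a for a, _ in pairs)
    marg_b = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (marg_a[a] * marg_b[b]))
    return mi


def needs_new_saliency_map(prev_frame, cur_frame, threshold_bits):
    """Assumed rule: recompute the map when the frames share little information."""
    return mutual_information(prev_frame, cur_frame) < threshold_bits
```

A low value suggests the scene content has changed enough that the previous saliency map may no longer apply; a suitable `threshold_bits` would have to be chosen empirically.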

16 Integration Into H.264: Propagation of Saliency Values. For inter-coded MBs, the saliency value is a weighted average of the saliency values of the MBs pointed to by the motion vector. Figures from Gupta & Chaudhury, PAMI 2011.
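
The propagation step might look like the sketch below. It assumes whole-pixel motion vectors, one saliency value per MB of the reference frame, and weights proportional to the overlap area between the referenced block and each MB it covers. The slide does not state how the weights are chosen, so the area weighting and all names here are assumptions.

```python
MB = 16  # macroblock size in pixels


def propagate_saliency(ref_saliency, mb_row, mb_col, mv_x, mv_y, size=MB):
    """Saliency of an inter-coded MB from the reference frame's MB saliencies.

    ref_saliency: 2D list, one value per MB of the reference frame.
    (mb_row, mb_col): position of the current MB, in MB units.
    (mv_x, mv_y): motion vector in whole pixels.
    """
    rows, cols = len(ref_saliency), len(ref_saliency[0])
    top = mb_row * size + mv_y    # top-left corner of the referenced block
    left = mb_col * size + mv_x
    weighted_sum = 0.0
    total_area = 0
    # The referenced block overlaps at most four MBs of the reference frame
    for r in range(top // size, (top + size - 1) // size + 1):
        for c in range(left // size, (left + size - 1) // size + 1):
            if not (0 <= r < rows and 0 <= c < cols):
                continue  # this part of the block lies outside the frame
            overlap_h = min(top + size, (r + 1) * size) - max(top, r * size)
            overlap_w = min(left + size, (c + 1) * size) - max(left, c * size)
            area = overlap_h * overlap_w
            weighted_sum += area * ref_saliency[r][c]
            total_area += area
    return weighted_sum / total_area if total_area else 0.0
```

For example, a block displaced by half an MB horizontally covers two reference MBs equally, so its saliency becomes the plain mean of their two values.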

17 Integration Into H.264: Salient-Adaptive Quantization. Non-uniform bit allocation: smaller saliency value => coarser quantization.
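
One way to realize "smaller saliency => coarser quantization" is to raise the quantization parameter (QP) of less salient MBs. The linear mapping, the base QP of 28 and the maximum offset of 8 below are illustrative assumptions, not values from the paper; only the 0..51 QP range is taken from H.264.

```python
def saliency_to_qp(saliency, base_qp=28, max_qp_offset=8):
    """Map a relative saliency in [0, 1] to an H.264 QP.

    The most salient MBs keep `base_qp`; the least salient ones get
    `base_qp + max_qp_offset`, i.e. a coarser quantizer and fewer bits.
    """
    saliency = min(1.0, max(0.0, saliency))          # clamp to [0, 1]
    qp = base_qp + round((1.0 - saliency) * max_qp_offset)
    return min(51, max(0, qp))                       # stay in the valid QP range
```

With these defaults, `saliency_to_qp(1.0)` gives 28 and `saliency_to_qp(0.0)` gives 36.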

18 Integration Into H.264. Figure from Gupta & Chaudhury, PAMI 2011.

19 Paper Evaluation. Novelty: methods for the saliency map and for saliency value propagation. Assumption: all the MBs in P-frames are inter-coded (problematic). Writing level: good; partially self-contained.

20 Paper Evaluation. Feasibility: higher complexity than H.264 encoders; not for real-time encoders; useful at low bit-rates; objects entering the scene may be considered unimportant. Experimental evaluation: saliency: visual comparison: good; ROC curve comparison: partial. Compression: none (authors' future direction).

21 Future Directions. Improving encoding complexity: a less complex saliency method. Better treatment of objects entering the scene: using mutual information of frame areas. Treating intra-coded MBs in P-frames.

22 A Review of: 3D Models Coding and Morphing for Efficient Video Compression. F. Galpin, R. Balter, L. Morin, K. Deguchi, CVPR 2004

23 Method Outline: 3D model extraction; 3D model-based video coding; reconstruction using adaptive geometric morphing.

24 3D Models Stream Generation. Figure from Galpin et al, CVPR 2004.

25 Stream Compression. Three data types to compress: 3D model, texture images, camera parameters.

26 Texture Image Compression. Reconstruction process: figure from Galpin et al, CVPR 2004.

27 3D Model Compression. The 3D model originates in a decimated depth map. Compressed by: wavelet transform; depth-adaptive quantization. Figures from Galpin et al, CVPR 2004.

28 Video Reconstruction: Texture Fading. Figure from Galpin et al, CVPR 2004.

29 Video Reconstruction: Texture Fading. Figure panels: without texture fading; with texture fading. Figures from Galpin et al, CVPR 2004.

30 Video Reconstruction: Geometric Morphing. Improving 3D model interpolation. Figure from Galpin et al, CVPR 2004.

31 Video Reconstruction: Geometric Morphing. Figure panels: regular interpolation; interpolation with geometric morphing. Figures from Galpin et al, CVPR 2004.

32 Result Comparison with H.264

33 Paper Evaluation. Novelty: compression using an unknown 3D model. Assumptions: static scene; moving monocular camera; neglected camera rotation; GOP intrinsic parameters are fixed. Writing level: good; not self-contained.

34 Paper Evaluation. Feasibility: only for static-scene video; high encoder/decoder complexity; unsuitable for real-time use; useful at very low bit-rates. Experimental evaluation: sufficient visual comparison with H.264; no run-time information.

35 Future Directions. Treat moving objects. Improve complexity, at least for real-time decoding.

36 Approach Comparison
                       Attention   3D model
Video type             Any         Static scene
Bit-rates useful at    Low         Very low
Encoder complexity     High        High
Decoder complexity     Regular     High
Integration in H.264   Possible    Unsuitable
Overall evaluation     Promising   Inferior

