Overview of the Scalable Video Coding Extension of the H.264/AVC Standard
Kai-Chao Yang, NTHU, Taiwan, 2007/8


1 Overview of the Scalable Video Coding Extension of the H.264/AVC Standard
  Kai-Chao Yang, NTHU, Taiwan, 2007/8

2 Outline
  - Introduction
    - Problems
    - Definition
    - Functionality
    - Goal
    - Competition
    - Applications
    - Targets
  - History of SVC
  - Structure of SVC
    - Temporal Scalability
    - Spatial Scalability
    - Quality Scalability
    - Combined Scalability
  - Profiles of SVC
  - Conclusions

3 Introduction - problem
  - Non-scalable video streaming
    - Multiple video streams are needed for heterogeneous clients
  [Figure: separate streams at 8 Mb/s, 6 Mb/s, 4 Mb/s, 1 Mb/s, and 512 kb/s]

4 Introduction - definition
  - Scalable video stream
  - Scalability
    - Removal of parts of the video bit-stream to adapt to the various needs of end users and to varying terminal capabilities or network conditions
  [Figure: a stream made of sub-streams 1, 2, ..., n; reconstruction from a subset of sub-streams k1, k2, ..., ki; high quality vs. low quality]

5 Introduction - functionality
  - Functionality of SVC
    - Graceful degradation when the "right" parts of the bit-stream are lost
    - Bit-rate adaptation to match the channel throughput
    - Format adaptation for backward-compatible extension
    - Power adaptation for a trade-off between runtime and quality

6 Introduction - mode
  - Example
  - Scalability mode
    - Fidelity reduction (SNR scalability)
    - Picture size reduction (spatial scalability)
    - Frame rate reduction (temporal scalability)
    - Sharpness reduction (frequency scalability)
    - Selection of content (ROI or object-based scalability)
  [Figure: bit-planes of the residual, starting from the most significant bit; the base layer is followed by enhancement layers 1 to 5]
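The bit-plane figure on this slide can be illustrated with a small sketch. It assumes a single non-negative residual value and a hypothetical plane count of 6; the function names are illustrative and not part of any codec API. Each additional bit-plane received refines the reconstruction, which is the idea behind fidelity (SNR) scalability.

```python
def split_bitplanes(value, num_planes):
    """Return the bits of a non-negative integer, most significant plane first."""
    return [(value >> p) & 1 for p in range(num_planes - 1, -1, -1)]


def reconstruct(bits, planes_kept):
    """Rebuild a value from the first planes_kept bit-planes.

    Planes that were dropped (lost or truncated) are treated as zero.
    """
    num_planes = len(bits)
    value = 0
    for i in range(planes_kept):
        value |= bits[i] << (num_planes - 1 - i)
    return value


residual = 45  # hypothetical residual magnitude, binary 101101
bits = split_bitplanes(residual, 6)

# Reconstruction quality as more enhancement bit-planes arrive.
approximations = [reconstruct(bits, kept) for kept in range(7)]
print(approximations)
```

Truncating the stream after any plane still yields a usable, coarser value, so the error shrinks as planes are added.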

7 Structure of SVC
  [Figure: encoder block diagram with spatial decimation, temporal scalable coding, prediction, base layer coding, SNR scalable coding, and multiplex]

8 Temporal Scalability
  - Hierarchical prediction structures
  [Figure: three prediction structures over a GOP: hierarchical B pictures, non-dyadic hierarchical prediction, and hierarchical prediction with zero delay]
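As a rough illustration of the dyadic hierarchical B picture structure, the sketch below assigns a temporal level to each picture and extracts a lower frame rate by dropping the higher levels. It assumes a power-of-two GOP size and display-order indices; the function names are hypothetical.

```python
def temporal_level(frame_index, gop_size):
    """Temporal level of a picture in a dyadic hierarchy.

    Pictures at multiples of gop_size are level 0; each halving of the
    distance between pictures adds one level.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    max_level = gop_size.bit_length() - 1
    offset = frame_index % gop_size
    if offset == 0:
        return 0
    # Number of trailing zero bits of the offset inside the GOP.
    trailing_zeros = (offset & -offset).bit_length() - 1
    return max_level - trailing_zeros


def extract_temporal_substream(num_frames, gop_size, max_level):
    """Indices of the pictures kept when all levels above max_level are dropped."""
    return [i for i in range(num_frames)
            if temporal_level(i, gop_size) <= max_level]


levels = [temporal_level(i, 8) for i in range(9)]
quarter_rate = extract_temporal_substream(17, 8, 1)
print(levels)
print(quarter_rate)
```

With a GOP size of 8, keeping levels 0 and 1 leaves every fourth picture, i.e. a quarter of the full frame rate.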

9 Temporal Scalability
  - Video coding experiment with H.264/MPEG4-AVC
    - Foreman, CIF 30 Hz @ 1320 kbps
    - Performance as a function of N
    - Cascaded QP assignment: QP(P) = QP(B0) - 3 = QP(B1) - 4 = QP(B2) - 5
  [Figure: I/P/B0/B1/B2 prediction structures for N = 1, N = 2, N = 4, and N = 8]
  This slide is copied from JVT-W132-Talk
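The cascaded QP assignment can be sketched as follows. The sketch assumes the slide's relation reads as equalities, i.e. QP(B0) = QP(P) + 3, QP(B1) = QP(P) + 4, QP(B2) = QP(P) + 5, with P pictures at temporal level 0 and B0, B1, B2 at levels 1, 2, 3. The function name and the example QP of 30 are illustrative.

```python
def cascaded_qp(qp_key, temporal_level):
    """QP for a picture at the given temporal level.

    Assumed offsets, taken from the slide: +3, +4, +5 for levels 1, 2, 3,
    which is qp_key + 2 + temporal_level for every level above 0.
    """
    if temporal_level < 0:
        raise ValueError("temporal_level must be non-negative")
    if temporal_level == 0:
        return qp_key
    return qp_key + 2 + temporal_level


# Hypothetical key-picture QP of 30 for a hierarchy with N = 8.
qps = [cascaded_qp(30, level) for level in range(4)]
print(qps)
```

Pictures higher in the hierarchy are referenced by fewer other pictures, so they are quantized more coarsely.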

10 Spatial Scalability
  [Figure: spatial scalable encoder. Labels: spatial decimation; H.264/AVC compatible coder (H.264/AVC MCP and intra-prediction, base layer coding of texture and motion); H.264/AVC compatible base layer bit-stream; hierarchical MCP and intra-prediction; inter-layer prediction of intra, motion, and residual data; multiplex; scalable bit-stream]

11 Spatial Scalability
  - Similar to MPEG-2, H.263, and MPEG-4
  - Arbitrary resolution ratio
  - The same coding order in all spatial layers
  - Combination with temporal scalability
  - Inter-layer prediction
  [Figure: spatial layers 0 and 1 with temporal levels 0, 1, and 2, and intra pictures]

12 Spatial Scalability
  - The prediction signals are formed by
    - MCP inside the enhancement layer (temporal), suited to small motion and high spatial detail
    - Up-sampling from the lower layer (spatial)
    - Average of the above two predictions (temporal + spatial)
  - Inter-layer prediction
    - Three kinds of inter-layer prediction
      - Inter-layer motion prediction
      - Inter-layer residual prediction
      - Inter-layer intra prediction
    - Base mode MB
      - Only the residual is transmitted, with no additional side information

13 Spatial Scalability
  - Inter-layer motion prediction
    - base_mode_flag = 1
      - The reference layer is inter-coded
      - Data are derived from the reference layer
        - MB partitioning
        - Reference indices
        - MVs
    - motion_pred_flag
      - 1: MV predictors are obtained from the reference layer
      - 0: MV predictors are obtained by conventional spatial predictors
  [Figure: 8x8 partitions in the reference layer with MVs (x1, y1) and (x2, y2) map to a 16x16 region with MVs (2x1, 2y1) and (2x2, 2y2)]
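For the dyadic case shown in the figure, the derivation of motion data from the reference layer can be sketched as below. This is a simplified illustration that assumes a resolution ratio of 2 in both directions; the `Partition` type and function names are hypothetical, and the normative derivation for arbitrary ratios is more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    width: int
    height: int
    mv: tuple      # motion vector (x, y)
    ref_idx: int   # reference index


def upsample_partition(base, ratio=2):
    """Scale a reference-layer partition and its MV to the enhancement layer.

    The reference index is copied unchanged.
    """
    return Partition(
        width=base.width * ratio,
        height=base.height * ratio,
        mv=(base.mv[0] * ratio, base.mv[1] * ratio),
        ref_idx=base.ref_idx,
    )


def mv_predictor(motion_pred_flag, base_mv, spatial_predictor, ratio=2):
    """Choose the MV predictor as described for motion_pred_flag."""
    if motion_pred_flag:
        # Predictor comes from the scaled reference-layer MV.
        return (base_mv[0] * ratio, base_mv[1] * ratio)
    # Conventional spatial predictor from neighboring blocks.
    return spatial_predictor


base = Partition(width=8, height=8, mv=(3, -1), ref_idx=0)
enhanced = upsample_partition(base)
print(enhanced)
```

An 8x8 partition in the reference layer becomes a 16x16 partition in the enhancement layer, with the motion vector doubled.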

14 Spatial Scalability
  - Inter-layer residual prediction
    - residual_pred_flag = 1
    - Predictor
      - Block-wise up-sampling by a bi-linear filter from the corresponding 8x8 sub-MB in the reference layer
      - Transform block basis
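The block-wise bi-linear up-sampling can be illustrated with the sketch below. It is a generic bilinear interpolation with sample positions clamped at the block edges, so that no samples outside the block are used. It is not the normative SVC filter, whose phases and integer arithmetic are defined in the standard. The function names are illustrative.

```python
def upsample_1d(samples, ratio=2):
    """Linearly interpolate a row of samples to ratio times its length."""
    n = len(samples)
    out = []
    for i in range(n * ratio):
        # Position of output sample i in input coordinates (sample centers).
        pos = (i + 0.5) / ratio - 0.5
        # Clamp so that interpolation never reaches outside the block.
        pos = min(max(pos, 0.0), n - 1.0)
        left = int(pos)
        right = min(left + 1, n - 1)
        frac = pos - left
        out.append((1 - frac) * samples[left] + frac * samples[right])
    return out


def upsample_block(block, ratio=2):
    """Separable bilinear up-sampling: rows first, then columns."""
    rows = [upsample_1d(row, ratio) for row in block]
    cols = [upsample_1d(list(col), ratio) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


residual_block = [[0, 4], [8, 12]]  # hypothetical 2x2 residual block
upsampled = upsample_block(residual_block)
for row in upsampled:
    print(row)
```

Working block by block keeps the interpolation from mixing residual samples that belong to different transform blocks.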

15 Spatial Scalability
  - Inter-layer intra prediction
    - base_mode_flag = 1
    - The reference layer is intra-coded
    - Up-sampling from the reference layer
      - Luma: one-dimensional 4-tap FIR filter
      - Chroma: bi-linear filter

16 Spatial Scalability
  - Past spatial scalable video
    - Inter-layer intra prediction requires complete decoding of the base layer
    - Multiple motion compensation loops and deblocking filters are needed
    - Full decoding + inter-layer prediction: complexity > simulcast
  - Single-loop decoding
    - Inter-layer intra prediction is restricted to MBs for which the co-located base layer block is intra-coded

17 Spatial Scalability
  - Single-loop vs. multi-loop decoding
  [Figure labels: Inter, IBP]
  This slide is copied from http://iphome.hhi.de/wiegand/assets/pdfs/H264AVC_SVC.pdf

18 Spatial Scalability
  - Generalized spatial scalability in SVC
    - Arbitrary ratio
      - Neither the horizontal nor the vertical resolution can decrease from one layer to the next
    - Cropping
      - Containing new regions
      - Higher quality of interesting regions
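The resolution constraint stated on this slide can be checked with a short sketch. The function name is hypothetical, and the layer list is assumed to be ordered from the base layer upward as (width, height) pairs.

```python
def valid_spatial_layers(resolutions):
    """True if neither width nor height decreases from one layer to the next."""
    return all(
        w_next >= w_prev and h_next >= h_prev
        for (w_prev, h_prev), (w_next, h_next)
        in zip(resolutions, resolutions[1:])
    )


# QCIF -> CIF -> 4CIF: dyadic ratios, allowed.
dyadic = [(176, 144), (352, 288), (704, 576)]
# Non-dyadic ratio (1.5 horizontally, about 1.67 vertically): still allowed.
arbitrary = [(640, 360), (960, 600)]
# Width shrinks while height grows: not allowed.
invalid = [(352, 288), (320, 480)]

print(valid_spatial_layers(dyadic),
      valid_spatial_layers(arbitrary),
      valid_spatial_layers(invalid))
```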

19 Spatial Scalability
  - Encoder control (JSVM)
    - Base layer
      - p0' is optimized for the base layer
    - Enhancement layer
      - p1' is optimized for the enhancement layer
      - Decisions of p1 depend on p0
    - Efficient base layer coding but inefficient enhancement layer coding

20 Spatial Scalability
  - Encoder control (optimization)
    - Base layer
      - Considers enhancement layer coding
      - Eliminates choices of p0 that disadvantage enhancement layer coding
    - Enhancement layer
      - No change
    - Weighting factor w
      - w = 0: JSVM encoder control
      - w = 1: single-loop encoder control (base layer is not controlled)
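The role of w can be sketched with a weighted Lagrangian cost. The slide's formulas did not survive, so the cost form below is an assumption modeled on the joint encoder control described in the overview paper listed in the references: a base layer cost D0 + lambda0 * R0 and an enhancement layer cost D1 + lambda1 * (R0 + R1), blended by w. All names and numbers are hypothetical.

```python
def joint_cost(w, d0, r0, d1, r1, lam0, lam1):
    """Weighted cost of a base layer decision p0 (assumed form).

    d0, r0: base layer distortion and rate for p0.
    d1, r1: enhancement layer distortion and rate given p0.
    """
    base_cost = d0 + lam0 * r0
    enhancement_cost = d1 + lam1 * (r0 + r1)
    return (1 - w) * base_cost + w * enhancement_cost


def best_decision(candidates, w, lam0=1.0, lam1=1.0):
    """Pick the candidate p0 with the lowest weighted cost."""
    return min(
        candidates,
        key=lambda name: joint_cost(w, *candidates[name], lam0, lam1),
    )


# Hypothetical candidates: name -> (d0, r0, d1, r1).
candidates = {
    "A": (10, 5, 8, 6),  # good for the base layer alone
    "B": (12, 5, 4, 5),  # worse base layer, better enhancement layer
}

print(best_decision(candidates, w=0.0))
print(best_decision(candidates, w=1.0))
```

With w = 0 only the base layer cost matters, matching the JSVM-style control. With w = 1 the decision is driven entirely by the enhancement layer.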

21 Quality Scalability
  - Coarse-grain quality scalability (CGS)
    - A special case of spatial scalability
      - Identical sizes for base and enhancement layers
      - Smaller quantization step sizes for the residual in higher enhancement layers
    - Designed for only a few selected bit-rate points
      - Number of supported bit-rate points = number of layers
    - Switching can only occur at IDR access units

22 Quality Scalability
  - Medium-grain quality scalability (MGS)
    - More enhancement layers are supported
      - Refinement quality layers of the residual
    - Key pictures
      - Drift control
    - Switching can occur at any access unit
    - CGS + key pictures + refinement quality layers

23 Quality Scalability
  - Drift control
    - Drift: the effect caused by unsynchronized MCP at the encoder and decoder sides
    - Trade-off of MCP in quality SVC
      - Coding efficiency vs. drift

24 Quality Scalability
  - MPEG-4 quality scalability with FGS
    - The base layer is stored and used for MCP of following pictures
    - Refinement data are not used for MCP
    - Drift: drift-free
    - Complexity: low
    - Efficiency: efficient base layer but inefficient enhancement layer
  [Figure: base layer and refinement (possibly lost or truncated)]

25 Quality Scalability
  - MPEG-2 quality scalability (without FGS)
    - Only 1 reference picture is stored and used for MCP of following pictures
    - Drift: both base layer and enhancement layer
      - Frequent intra updates are necessary
    - Complexity: low
    - Efficiency: efficient enhancement layer but inefficient base layer
  [Figure: base layer and refinement (possibly lost or truncated)]

26 Quality Scalability
  - 2-loop prediction
    - Several closed encoder loops run at different bit-rate points in a layered structure
    - Drift: enhancement layer
    - Complexity: high
    - Efficiency: efficient base layer and moderately efficient enhancement layer
  [Figure: base layer and refinement (possibly lost or truncated)]

27 Quality Scalability
  - SVC concepts
    - Key picture
      - Trade-off between coding efficiency and drift
      - MPEG-4 FGS: all key pictures
      - MPEG-2 quality scalability: no key pictures
  [Figure: base layer and refinement (possibly lost or truncated)]

28 Quality Scalability
  - Drift control with hierarchical prediction
    - Key pictures
      - The base layer is stored and used for the MCP of following pictures
    - Other pictures
      - The enhancement layer is stored and used for the MCP of following pictures
    - GOP size adjusts the trade-off between enhancement layer coding efficiency and drift
  [Figure: base layer and refinement (possibly lost or truncated) for a hierarchy of P, B1, and B2 pictures]
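The key picture rule can be sketched as a simple selection of which reconstruction is stored for MCP. The sketch assumes key pictures are the pictures at temporal level 0 of a dyadic hierarchy, so that one key picture occurs per GOP; the function names and string labels are illustrative.

```python
def stored_reference(is_key_picture):
    """Which reconstruction is kept as the MCP reference for following pictures."""
    # Key pictures use only the base layer, so a lost refinement cannot
    # cause drift that propagates past the next key picture.
    return "base" if is_key_picture else "enhancement"


def reference_plan(num_frames, gop_size):
    """Reference choice per picture, with key pictures at multiples of gop_size."""
    return [stored_reference(i % gop_size == 0) for i in range(num_frames)]


plan = reference_plan(9, 4)
print(plan)
```

A larger GOP means fewer key pictures: more pictures benefit from the better enhancement layer reference, but drift can accumulate over a longer span.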

29 Combined Scalability
  - SVC encoder structure
  [Figure: encoder with dependency layers, each using temporal decomposition; quality layers within a dependency layer share the same motion/prediction information]

30 Combined Scalability
  - Dependency and quality refinement layers
  [Figure: dependency layers D = 0, D = 1, and D = 2, each with quality layers Q = 0, Q = 1, and Q = 2, forming the scalable bit-stream]

31 Combined Scalability
  [Figure: layer structure with temporal levels T0, T1, and T2, dependency layers D0 and D1, and quality layers Q0 and Q1]

32 Combined Scalability
  - Bit-stream format
    - NAL unit header | NAL unit header extension | NAL unit payload
    - P (priority_id): indicates the importance of a NAL unit
    - T (temporal_id): indicates the temporal level
    - D (dependency_id): indicates the spatial/CGS layer
    - Q (quality_id): indicates the MGS/FGS layer
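Because every NAL unit carries T, D, and Q, a sub-stream can be extracted by discarding NAL units, without re-encoding. The sketch below assumes the header fields have already been parsed into a hypothetical `NalUnit` record and uses a simple component-wise threshold. A real extractor also has to respect inter-layer dependencies and may use priority_id.

```python
from typing import NamedTuple


class NalUnit(NamedTuple):
    temporal_id: int    # T
    dependency_id: int  # D
    quality_id: int     # Q
    payload: bytes


def extract_substream(nal_units, max_t, max_d, max_q):
    """Keep only the NAL units at or below the requested operating point."""
    return [
        nal for nal in nal_units
        if nal.temporal_id <= max_t
        and nal.dependency_id <= max_d
        and nal.quality_id <= max_q
    ]


# Hypothetical stream: 2 temporal levels, 2 dependency layers, 2 quality layers.
stream = [
    NalUnit(t, d, q, b"")
    for t in range(2) for d in range(2) for q in range(2)
]

base_only = extract_substream(stream, max_t=0, max_d=0, max_q=0)
full_rate_base = extract_substream(stream, max_t=1, max_d=0, max_q=0)
print(len(stream), len(base_only), len(full_rate_base))
```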

33 Combined Scalability
  - Bit-stream switching
    - Inside a dependency layer
      - Switching everywhere
    - Outside a dependency layer
      - Switching up only at IDR access units
      - Switching down everywhere if using multiple-loop decoding
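The switching rules can be written as a small predicate. The function name and arguments are hypothetical. One point is an assumption not stated on the slide: the sketch also allows switching down at an IDR access unit when single-loop decoding is used.

```python
def can_switch(current_d, target_d, at_idr, multi_loop_decoding=False):
    """Whether a decoder may switch dependency layers at this access unit."""
    if target_d == current_d:
        # Inside a dependency layer (quality/temporal changes): anywhere.
        return True
    if target_d > current_d:
        # Switching up: only at IDR access units.
        return at_idr
    # Switching down: anywhere with multiple-loop decoding.
    # Assumption: also possible at IDR access units otherwise.
    return multi_loop_decoding or at_idr


print(can_switch(0, 1, at_idr=False))
print(can_switch(0, 1, at_idr=True))
print(can_switch(1, 0, at_idr=False, multi_loop_decoding=True))
```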

34 Profiles of SVC
  - Scalable Baseline
    - For conversational and surveillance applications requiring low decoding complexity
    - Spatial scalability: fixed ratio (1, 1.5, or 2) and MB-aligned cropping
    - Temporal and quality scalability: arbitrary
    - No interlaced coding tools
    - B-slices, weighted prediction, CABAC, and 8x8 luma transform
    - The base layer conforms to the Baseline profile of H.264/AVC

35 Profiles of SVC
  - Scalable High
    - For broadcast, streaming, and storage
    - Spatial, temporal, and quality scalability: arbitrary
    - The base layer conforms to the High profile of H.264/AVC
  - Scalable High Intra
    - Scalable High + all IDR pictures

36 References
  - H. Schwarz, D. Marpe, and T. Wiegand, "Overview of the Scalable Video Coding Extension of the H.264/AVC Standard," CSVT 2007.
  - T. Wiegand, "Scalable Video Coding," Joint Video Team, doc. JVT-W132, San Jose, USA, April 2007.
  - T. Wiegand, "Scalable Video Coding," Digital Image Communication, Course at Technical University of Berlin, 2006. (Available at http://iphome.hhi.de/wiegand/dic.htm)
  - H. Schwarz, D. Marpe, and T. Wiegand, "Constrained Inter-Layer Prediction for Single-Loop Decoding in Spatial Scalability," Proc. of ICIP'05.

