Video Transmission Adopting Scalable Video Coding over Time-varying Networks Chun-Su Park, Nam-Hyeong Kim, Sang-Hee Park, Goo-Rak Kwon, and Sung-Jea Ko
Video Transmission Adopting Scalable Video Coding over Time-varying Networks Chun-Su Park, Nam-Hyeong Kim, Sang-Hee Park, Goo-Rak Kwon, and Sung-Jea Ko, Senior Member, IEEE IEEE Transactions on Consumer Electronics 2006/08/08 S.K.Chang
Outline: Introduction of SVC; SVC Layer-Switching Problem; Proposed System; Experiment; Conclusion
Introduction of SVC: Scalable video coding. The video signal is encoded once but can be decoded from partial streams, at the rate and resolution required by a given application. Such scalability has important applications in many areas, especially in heterogeneous networks where downstream client capabilities and network conditions are not known in advance.
Introduction of SVC: Scalable video coding in H.264. Base layer compatibility with H.264: the base-layer sub-bit-stream is compatible with H.264. Layered coding scheme with switchable inter-layer prediction mechanisms: the information of the enhancement layer pictures can be predicted from the coded information of base layer pictures at the same time instant. Fine granular quality scalability: a new slice type, the progressive refinement slice, provides a quality enhancement layer that can be truncated at any arbitrary point. Usage and extension of the syntax structure of H.264: SVC adopts the network abstraction layer (NAL) unit concept of H.264; specifically, the NAL units of the lowest layer are compatible with those of H.264.
Introduction of SVC: Scalable video coding. The scalability features of SVC are built on a pyramidal structure. Temporal scalability is accomplished through a GOP-like structure: a series of B frames is coded between two P frames. The P frames (currently referred to as “key pictures”), together with the very first I frame (IDR frame), form the first temporal scalability layer. B frames in the JSVM are coded using a hierarchical structure, in which dyadic decomposition is used to construct temporal layers of increasing temporal resolution. B-frame coding induces an additional coding delay equal to the number of intervening B frames between P frames, assuming instantaneous acquisition and encoding.
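The dyadic temporal decomposition described on this slide can be illustrated with a small sketch. It assumes a power-of-two GOP size and numbers the key pictures as layer 0; the function name `temporal_layer` is hypothetical and not taken from the JSVM.

```python
def temporal_layer(frame_index: int, gop_size: int) -> int:
    """Temporal layer of a frame in a dyadic hierarchical-B GOP.

    Layer 0 holds the key pictures (frame indices that are multiples of
    gop_size). Each additional layer doubles the temporal resolution.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    pos = frame_index % gop_size
    if pos == 0:
        return 0  # key picture (I or P frame)
    levels = gop_size.bit_length() - 1            # log2(gop_size)
    trailing_zeros = (pos & -pos).bit_length() - 1
    # More trailing zero bits in the position means a coarser layer.
    return levels - trailing_zeros


# Layers of frames 0..8 for a GOP of 8 pictures.
layers = [temporal_layer(i, 8) for i in range(9)]
```

Dropping every frame whose layer exceeds a chosen threshold halves the frame rate once per removed layer, which is how a partial stream at a lower temporal resolution is obtained.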
Introduction of SVC: Scalable video coding. Spatial and coarse-grain scalability are based on creating a refinement (in terms of both texture and motion) of the base layer, which is used to predict the enhancement layer at either an increased spatial resolution or an increased quality (lower QP). FGS coding is performed by repeatedly reducing the quantizer step size and applying an entropy coding process similar to sub-bit-plane coding.
Introduction of SVC: Temporal-spatial scalability
SVC Layer-Switching Problem: When the channel bandwidth is fixed, the number of transmitted layers does not change and the anchor picture for the current GOP always exists. If the decoding layer changes because the channel bandwidth varies, the required anchor picture may not exist and the decoder cannot operate properly. High-to-low layer switching causes no problem, since an anchor picture for the GOP always exists at the low layer. In low-to-high layer switching, the reference picture required for the GOP does not exist at the high layer.
SVC Layer-Switching Problem: Instantaneous decoder refresh (IDR) pictures are used in SVC. An IDR picture contains only I or SI slices and causes the decoder to mark all reference pictures as “unused for reference” immediately after decoding the IDR picture. All following coded pictures in decoding order can then be decoded without any picture prior to the IDR picture. When the number of layers changes, the server inserts an IDR picture into the bit-stream so that the decoder can decode the following pictures without using previously transmitted pictures. However, this breaks the GOP structure, and the server has to re-encode the pictures of the sequence that follow the IDR picture in decoding order. When one bit-stream is shared by multiple users, a change to the transmitted bit-stream caused by a single user affects all users in the network and wastes channel bandwidth. IDR pictures therefore reduce coding efficiency and prevent prompt stream adaptation.
Proposed System (I): Without advance notification (no additional information on layer switching is transmitted to the decoder). (Figure label: Decoding Start.) The high-pass pictures discarded in the decoding process waste the computational resources of the local device and the bandwidth of the network.
Proposed System (I): This method does not decrease the quality of decoded pictures, but it delays layer switching by one GOP.
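The one-GOP switching delay of this first method can be sketched as a small simulation. This is an illustrative model, not the authors' implementation: the function name `decoded_layers_method1` is hypothetical, and it assumes that every layer is decodable at stream start and that the anchor picture for a newly received layer becomes available one GOP after that layer first arrives.

```python
def decoded_layers_method1(received):
    """Layer decoded in each GOP when up-switching is delayed by one GOP.

    received[g] is the highest layer delivered to the decoder in GOP g.
    """
    decoded = []
    prev_received = None
    for layers in received:
        if prev_received is None:
            # Assumption: the stream starts with an IDR picture at every layer.
            decoded.append(layers)
        elif layers > prev_received:
            # Low-to-high switch: the anchor picture at the higher layer is
            # missing, so keep decoding the old layer for this GOP and
            # discard the new layer's high-pass pictures.
            decoded.append(prev_received)
        else:
            # Unchanged layer or high-to-low switch: anchor always exists.
            decoded.append(layers)
        prev_received = layers
    return decoded


result = decoded_layers_method1([0, 0, 1, 1, 0, 1, 1])
```

In this model the decoder reaches layer 1 one GOP after layer 1 first arrives, while every switch down takes effect immediately.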
Proposed System (II): Error concealment for bit-streams containing completely missing pictures or layers is supported in the SVC codec. There are four error concealment methods for frame loss: picture copy (PC), temporal direct motion vector generation (TD), motion and residual up-sampling (BlSkip), and reconstruction base layer up-sampling (RU). Among these four methods, the RU method can be performed using only the picture at the lowest layer. In the RU method, the picture at the lowest layer is reconstructed and up-sampled with the AVC 6-tap filter to replace the lost picture at the high layer. We adopt the RU method to obtain the required reference pictures in the previous GOP.
Because it interpolates the required low-pass pictures, this method does not cause layer-switching delay. However, the quality of decoded pictures is degraded by the difference between the original and up-sampled pictures.
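The up-sampling step behind the RU method can be sketched as follows. This is a simplified illustration, not the JSVM routine: it applies the H.264/AVC half-sample filter taps (1, -5, 20, 20, -5, 1)/32 separably to an 8-bit luma picture stored as a list of rows, clamps at the picture borders, and ignores the sample phase alignment and chroma handling that a real codec needs. The function names are hypothetical.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # H.264/AVC half-sample filter, sum = 32


def upsample_row(row):
    """Double the length of a row: copy samples, interpolate half positions."""
    n = len(row)
    out = []
    for i in range(n):
        out.append(row[i])  # integer-position sample is copied
        acc = 0
        for k, tap in enumerate(TAPS):
            j = min(max(i - 2 + k, 0), n - 1)  # clamp index at the borders
            acc += tap * row[j]
        # Round, normalize by 32, and clip to the 8-bit range.
        out.append(min(max((acc + 16) >> 5, 0), 255))
    return out


def upsample_picture(pic):
    """Dyadic 2x up-sampling: filter rows first, then columns."""
    rows = [upsample_row(r) for r in pic]
    cols = [upsample_row(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


flat = upsample_picture([[100, 100], [100, 100]])
```

A flat picture stays flat after up-sampling, while real content loses detail, which is the source of the quality degradation noted on this slide.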
Proposed System: The low-pass pictures can be encoded in either intra or inter mode. High-to-low layer switching can be performed at any GOP boundary, regardless of the type of the low-pass picture in the current GOP.
Experiment: We implemented the proposed algorithms in the JSVM 4.0 software. The “BUS”, “FOREMAN”, “SOCCER”, and “HARBOUR” sequences are used in the simulations. The “BUS” and “FOREMAN” sequences are split into 2 layers: base layer (BL) and 1st enhancement layer (1st EL), coded at QCIF@15 fps and CIF@30 fps, respectively. The “SOCCER” and “HARBOUR” sequences are split into 3 layers: BL, 1st EL, and 2nd EL, coded at QCIF@15 fps, CIF@30 fps, and 4CIF@60 fps, respectively.
Experiment: If the intra period is equal to 16, the low-pass pictures in all GOPs are always encoded in intra mode. If the intra period is equal to 32, the low-pass pictures are encoded in intra and inter modes alternately. If the intra period is equal to -1, the low-pass picture in each GOP is encoded in inter mode. All low-pass pictures at the base layer are always reconstructed, since we use multi-loop decoding.
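The relation between the intra period and the coding mode of the low-pass pictures can be reproduced with a short sketch. It assumes a GOP size of 16 frames (not stated on the slide, but consistent with the three cases listed) and that the k-th low-pass picture sits at frame index k times the GOP size; the function name `low_pass_mode` is hypothetical.

```python
def low_pass_mode(key_index: int, gop_size: int = 16, intra_period: int = 16) -> str:
    """Coding mode of the key_index-th low-pass picture (key_index >= 1).

    intra_period == -1 means that no periodic intra pictures are inserted.
    """
    if intra_period == -1:
        return "inter"
    frame = key_index * gop_size  # assumed frame index of the low-pass picture
    return "intra" if frame % intra_period == 0 else "inter"


modes_16 = [low_pass_mode(k, intra_period=16) for k in range(1, 5)]
modes_32 = [low_pass_mode(k, intra_period=32) for k in range(1, 5)]
modes_off = [low_pass_mode(k, intra_period=-1) for k in range(1, 5)]
```

Under these assumptions, an intra period of 16 gives intra low-pass pictures in every GOP, 32 gives every second one, and -1 gives none.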
Experiment: First, we measure the PSNR of the up-sampled anchor pictures.
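For reference, PSNR between an original picture and its up-sampled reconstruction is commonly computed as 10 log10(peak^2 / MSE). The sketch below assumes 8-bit samples (peak 255) and pictures stored as equally sized lists of rows; it is not the measurement code used in the experiments, and the function name `psnr` is hypothetical.

```python
import math


def psnr(original, reconstructed, peak=255):
    """PSNR in dB between two equally sized pictures (lists of rows)."""
    squared_errors = [
        (o - r) ** 2
        for row_o, row_r in zip(original, reconstructed)
        for o, r in zip(row_o, row_r)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical pictures
    return 10 * math.log10(peak ** 2 / mse)


same = psnr([[10, 20], [30, 40]], [[10, 20], [30, 40]])
worst = psnr([[0, 0], [0, 0]], [[255, 255], [255, 255]])
```

A higher value means that the up-sampled anchor picture is closer to the original.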
Conclusion: In this paper, we have investigated the problems of layer switching in SVC and proposed two layer-switching algorithms. Method 1 does not affect the quality of decoded pictures, but it delays layer switching by one GOP. Method 2 introduces no layer-switching delay; however, drift error can propagate continuously. Experimental results show that the PSNR performance of the proposed method depends on the intra period. With selective control between Method 1 and Method 2, the proposed methods can be applied efficiently to video streaming services over various networks.