Presentation on theme: "SWE 423: Multimedia Systems"— Presentation transcript:
1 SWE 423: Multimedia Systems Chapter 7: Data Compression (5)
2 Outline
Introduction to H.261; H.261 Image Preparation; H.261 Coding Algorithms; H.263; H.261 & H.263 Properties; H.261 vs. H.263; H.264
3 Introduction to H.261
ISDN (Integrated Services Digital Network) is, and was, the network behind H.261. It is a circuit-switched telephone network system designed to allow digital transmission of voice and data over ordinary copper telephone wires.
In ISDN there are two types of channels, B (for "Bearer") and D (for "Delta"). B channels carry data (which may include voice); D channels are intended for signaling and control but can also carry data.
In a narrow-band ISDN connection, exactly two B channels and one D channel are available. One or both B channels can transfer video data in addition to speech, which requires both ends to use the same video coding scheme.
4 Introduction to H.261
The primary applications of ISDN were video phones and video conferencing. Such dialogue applications require coding and decoding to be carried out in real time; the maximum combined signal delay is 150 ms.
In 1984, Study Group XV of the CCITT formed a committee to draw up a standard for compressing moving pictures. The resulting standard, H.261 “Video CoDec for Audiovisual Services at p×64 kbit/s”, was finalized after five years and accepted in December 1990. North America adopted it with slight modifications.
Because it targets data rates of p×64 kbit/s, the recommendation is also known as p×64.
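The p×64 naming can be made concrete with a little arithmetic. The sketch below assumes the usual figures of 64 kbit/s per ISDN B channel and p in the range 1 to 30; neither number is stated on the slide, and the function name is purely illustrative.

```python
# Aggregate channel rate for the "p x 64" family.
# Assumptions (not from the slides): one B channel carries 64 kbit/s,
# and p ranges over 1..30.
B_CHANNEL_KBITS = 64


def p64_rate_kbits(p: int) -> int:
    """Return the aggregate rate in kbit/s when p bearer channels are used."""
    if not 1 <= p <= 30:
        raise ValueError("p is assumed to lie in the range 1..30")
    return p * B_CHANNEL_KBITS


for p in (1, 2, 30):
    print(f"p = {p:2d}: {p64_rate_kbits(p)} kbit/s")
```

With p = 2, the two B channels of a narrow-band connection give 128 kbit/s, which has to carry video and speech together.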
5 H.261: Image Preparation
Unlike JPEG, H.261 defines a very precise image format.
The refresh frequency at the input must be 30000/1001 (about 29.97) frames/s. During encoding, lower frame rates (10 or 15 frames/s) are possible.
Images cannot be presented at the coder input using interlaced scanning.
The image is encoded as a luminance signal Y and chrominance difference signals Cb and Cr, according to the CCIR 601 sub-sampling scheme (2:1:1). This scheme was later adopted by MPEG.
6 H.261 Image Preparation
Two resolution formats are supported, both with an aspect ratio of 4:3.
Common Intermediate Format (CIF) is optional. Its luminance (Y) component has 288 lines of 352 pixels each. As per the 2:1:1 requirement, each chrominance component is sub-sampled to 144 lines of 176 pixels each.
Quarter CIF (QCIF) must be implemented by all H.261 codecs. It has exactly half the resolution of CIF in each dimension, for all components: the luminance component has 144 lines of 176 pixels each, and each chrominance component has 72 lines of 88 pixels each.
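To see why compression is unavoidable at these resolutions, the uncompressed data rate can be estimated. This is a rough sketch: the 8 bits per sample is an assumption (the slides do not state a sample depth), the chrominance planes are taken as half the luminance width and height, and the function names are illustrative.

```python
from fractions import Fraction

FORMATS = {"CIF": (352, 288), "QCIF": (176, 144)}  # luminance width, height
BITS_PER_SAMPLE = 8                 # assumption: 8-bit samples
INPUT_RATE = Fraction(30000, 1001)  # frames/s at the coder input


def samples_per_frame(width: int, height: int) -> int:
    """Y samples plus two chrominance planes at half width and half height."""
    return width * height + 2 * (width // 2) * (height // 2)


def raw_kbits_per_second(name: str, frame_rate=INPUT_RATE) -> float:
    """Uncompressed data rate in kbit/s for the named format."""
    width, height = FORMATS[name]
    bits = samples_per_frame(width, height) * BITS_PER_SAMPLE
    return float(bits * frame_rate / 1000)


for name in FORMATS:
    print(f"{name}: {raw_kbits_per_second(name):.0f} kbit/s uncompressed")
```

Under these assumptions CIF needs roughly 36 Mbit/s uncompressed, several hundred times the 64 kbit/s of a single B channel.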
7 H.261 Image Preparation
H.261 divides the Y, Cb and Cr components into blocks of 8×8 pixels. A macro block combines 4 blocks of the Y matrix with 1 block each from the Cb and Cr components. A group of blocks consists of 3×11 (33) macro blocks. Hence, CIF consists of 12 groups and QCIF of 3 groups.
For CIF, Y is 352×288 and Cb and Cr are each 176×144. Therefore Y has 1584 blocks and Cb and Cr have 396 blocks each; the number of macro blocks is 1584/4 = 396; and the number of groups is 396/33 = 12. Similarly, QCIF works out to 3 groups.
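The block, macro block and group counts above can be checked mechanically. The following sketch simply redoes the slide's arithmetic; the function and key names are illustrative.

```python
def h261_structure(width: int, height: int) -> dict:
    """Count 8x8 blocks, macro blocks and groups of blocks for a Y plane size."""
    y_blocks = (width // 8) * (height // 8)
    # Each chrominance plane is half the width and half the height of Y.
    chroma_blocks_each = (width // 16) * (height // 16)
    macro_blocks = y_blocks // 4   # 4 Y blocks (+ 1 Cb + 1 Cr) per macro block
    groups = macro_blocks // 33    # 3 x 11 macro blocks per group of blocks
    return {
        "y_blocks": y_blocks,
        "chroma_blocks_each": chroma_blocks_each,
        "macro_blocks": macro_blocks,
        "groups": groups,
    }


print("CIF ", h261_structure(352, 288))
print("QCIF", h261_structure(176, 144))
```

For QCIF this gives 396 Y blocks, 99 macro blocks and 3 groups, matching the figure quoted on the slide.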
8 H.261 Coding Algorithms
H.261 uses two different modes of coding: intra-frame coding and inter-frame coding. The standard does not specify any criteria for choosing one or the other; that decision is taken during encoding.
9 H.261 Coding Algorithms
Intra-frame coding considers only data from the image being coded. As in JPEG, each block of 8×8 pixels is transformed into 64 coefficients using the DCT. DC coefficients are quantized differently from AC coefficients. Entropy encoding using variable-length code words is then performed.
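The transform and quantization steps of intra-frame coding can be sketched in a few lines. This is a deliberately slow, direct evaluation of the 8×8 DCT-II, not an optimized codec routine. The quantizer step sizes (`dc_step`, `ac_step`) are illustrative assumptions chosen only to show that DC and AC coefficients are treated differently, and the entropy coding stage is omitted.

```python
import math

N = 8  # blocks are 8x8 pixels


def dct_8x8(block):
    """Return the 64 DCT-II coefficients of an 8x8 block (a list of 8 rows)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[v][u] = 0.25 * c(u) * c(v) * total
    return coeffs


def quantize(coeffs, dc_step=8, ac_step=16):
    """Quantize the DC and AC coefficients with different (assumed) step sizes."""
    return [[round(coeffs[v][u] / (dc_step if (u, v) == (0, 0) else ac_step))
             for u in range(N)] for v in range(N)]


flat = [[100] * N for _ in range(N)]   # a uniformly grey block
levels = quantize(dct_8x8(flat))
print(levels[0])  # first row of quantized levels; only the DC term is non-zero
```

A flat block puts all of its energy into the single DC coefficient, which is why smooth image regions compress so well after the transform.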
10 H.261 Coding Algorithms
Inter-frame coding considers data from other images. A prediction method is used to find the most similar macro block in the preceding image. The motion vector is the position of that previous macro block relative to the current macro block.
According to H.261, the encoder need not determine a motion vector; it may consider only the differences between macro blocks located at the same position in successive images.
The motion vector is processed and entropy encoded using variable-length code words. The DPCM-coded macro block is transformed using the DCT if and only if its value exceeds a certain threshold; it is then linearly quantized and entropy encoded using variable-length code words.
An optical low-pass filter can optionally be inserted between the DCT and the entropy encoding to delete any remaining high-frequency noise.
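One common way to find "the most similar macro block" is an exhaustive block-matching search that minimizes the sum of absolute differences (SAD). H.261 does not prescribe a search method, so the sketch below is only one possible encoder strategy; the function names, the SAD criterion and the default search range of ±15 pixels are assumptions for illustration.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a block in cur and a block in ref."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))


def best_motion_vector(cur, ref, cx, cy, size=16, search=15):
    """Exhaustively search ref around (cx, cy); return ((dx, dy), cost).

    (dx, dy) is the position of the best-matching block in the previous
    image relative to the current block, as in the definition above.
    """
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, cx, cy, cx, cy, size)  # same-position fallback
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate block must lie inside the picture
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if cost < best_cost:
                best_cost, best = cost, (dx, dy)
    return best, best_cost


# Toy frames: the current frame is the previous one moved 2 right and 1 down.
ref = [[7 * x + 13 * y for x in range(16)] for y in range(16)]
cur = [[ref[y - 1][x - 2] if y >= 1 and x >= 2 else 0 for x in range(16)]
       for y in range(16)]
print(best_motion_vector(cur, ref, 6, 6, size=4, search=3))  # ((-2, -1), 0)
```

The vector points back to where the content came from, so content that moved right and down yields a negative vector. A cost of zero means the residual for that block is empty.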
11 H.263
H.263 was developed in 1996 to replace H.261 for many applications. It was designed for low-bit-rate transmission but is also suitable for higher-bit-rate applications. It provides one of the most efficient video compression techniques available.
12 H.263
H.263 includes four negotiable options that improve performance, achieving the same quality as H.261 with less than half as many bits.
Syntax-based arithmetic coding: defines the use of arithmetic coding instead of variable-length coding.
Forward and backward frame prediction: can increase the frame rate without changing the bit rate by coding two images as one unit.
Unrestricted motion vectors: make it possible for motion vectors to point outside image boundaries, which is useful for small images with motion in the direction of the edges.
Advanced prediction: uses the overlapped block motion compensation (OBMC) technique for P-frame luminance, an algorithm that obtains motion vectors from blocks next to the current macro block and uses them together with the current macro block to achieve a more accurate prediction and a smaller bit stream. It requires the use of unrestricted motion vectors.
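For a motion vector to point outside the image, the decoder needs some value for pixels that do not exist. A common convention, assumed here rather than taken from the slides, is to substitute the nearest edge pixel by clamping the coordinates. The sketch below shows that idea with illustrative function names.

```python
def fetch_clamped(frame, x, y):
    """Return frame[y][x], substituting the nearest edge pixel when outside."""
    height, width = len(frame), len(frame[0])
    cx = min(max(x, 0), width - 1)
    cy = min(max(y, 0), height - 1)
    return frame[cy][cx]


def predict_block(ref, bx, by, mv, size=16):
    """Build the predicted block at (bx, by) from ref displaced by mv = (dx, dy).

    The displaced block may lie partly or wholly outside the picture.
    """
    dx, dy = mv
    return [[fetch_clamped(ref, bx + dx + i, by + dy + j) for i in range(size)]
            for j in range(size)]


frame = [[y * 4 + x for x in range(4)] for y in range(4)]  # 4x4 toy picture
print(predict_block(frame, 0, 0, (-1, 0), size=2))  # [[0, 0], [4, 4]]
```

In the usage line the left column of the predicted block falls outside the picture, so it repeats the values of the leftmost real column.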
13 H.261 & H.263 Properties
The data stream contains information for error correction, although the use of external error correction standards (e.g. H.223) is recommended.
Each image in H.261 includes a 5-bit image number that can be used as a temporal reference; H.263 uses 8-bit image numbers.
During decoding, a command can be sent to the decoder to “freeze” the last video frame.
It is possible to switch between still images and moving images using an additional command sent by the coder.
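Because the image number has a fixed width, it wraps around (after 32 values with 5 bits, after 256 with 8 bits). A receiver that uses it as a temporal reference therefore has to compare numbers modulo the counter size. The sketch below shows one plausible way to do that; the function name and the idea of deriving a skipped-picture count from it are illustrative assumptions.

```python
def frames_elapsed(prev_number: int, cur_number: int, bits: int) -> int:
    """Picture periods between two image numbers, allowing for wrap-around."""
    modulus = 1 << bits            # 32 for 5-bit numbers, 256 for 8-bit numbers
    return (cur_number - prev_number) % modulus


# 5-bit numbering: 30 -> 1 wraps past 31 and 0.
print(frames_elapsed(30, 1, bits=5))   # 3
# 8-bit numbering: the same pair of numbers is read very differently.
print(frames_elapsed(30, 1, bits=8))   # 227
```

An elapsed count of n between consecutive coded images suggests that n - 1 pictures were not transmitted, which is how lower frame rates such as 10 or 15 frames/s show up in the stream.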
14 H.261 vs. H.263
Both use similar coding algorithms, with some enhancements and error correction in H.263.
H.263 uses half-pixel precision for motion compensation, while H.261 uses full-pixel precision with a “loop filter”.
Some parts of the hierarchical structure of the data stream are now optional, so the codec can be configured for a lower data rate or better error recovery.
H.263 includes four negotiable options to improve performance (achieving the same quality as H.261 with less than half as many bits): unrestricted motion vectors, syntax-based arithmetic coding, advanced prediction, and forward and backward frame prediction, similar to MPEG’s P and B frames.
H.263 supports three more resolutions (SQCIF, 4CIF, and 16CIF) in addition to the two supported by H.261 (QCIF and CIF). SQCIF is approximately half the resolution of QCIF; 4CIF and 16CIF are 4 and 16 times the resolution of CIF respectively. Support for 4CIF and 16CIF means the codec could compete with other higher-bit-rate video coding standards such as the MPEG standards.
In video compression, motion compensation describes a picture in terms of where each section of that picture came from in a previous picture.
A first approach would be to simply subtract a reference frame from a given frame. The difference is called the residual and usually contains less energy (or information) than the original frame, so it can be encoded at a lower bit rate with the same quality. The decoder can reconstruct the original frame by adding the reference frame again.
A more sophisticated approach is to approximate the motion of the whole scene and the objects of a video sequence. The motion is described by some parameters that have to be encoded in the bit stream. The pixels of the predicted frame are approximated by appropriately translated pixels of the reference frame. This gives much better residuals than a simple subtraction. However, the bit rate occupied by the parameters of the motion model must not become too large.
Usually, the frames are processed in groups. One frame (usually the first) is encoded without motion compensation, just as a normal image. This frame is called an I-frame (intra-coded frame, MPEG terminology) or I-picture. The other frames are called P-frames or P-pictures and are each predicted from the I-frame or P-frame that comes (temporally) immediately before. A prediction scheme is described, for instance, as IPPPP, meaning that a group consists of one I-frame followed by four P-frames.
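The "first approach" (plain frame subtraction) combined with an IPPPP grouping can be sketched as follows. This toy version is lossless and uses no motion model, transform or quantization, so it only illustrates the bookkeeping: the first frame is kept as is, and every later frame is stored as a residual against its predecessor. A real codec would predict from the decoded reference rather than the original. All names are illustrative.

```python
def subtract(frame, reference):
    """Residual: element-wise difference between a frame and its reference."""
    return [[a - b for a, b in zip(fr, rr)] for fr, rr in zip(frame, reference)]


def add(reference, residual):
    """Reconstruction: element-wise sum of the reference and the residual."""
    return [[a + b for a, b in zip(rr, dr)] for rr, dr in zip(reference, residual)]


def encode_group(frames):
    """Code one group as an I-frame followed by P-frame residuals."""
    coded = [("I", frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        coded.append(("P", subtract(cur, prev)))
    return coded


def decode_group(coded):
    """Rebuild every frame by adding each residual to the previous result."""
    frames = [coded[0][1]]
    for _, residual in coded[1:]:
        frames.append(add(frames[-1], residual))
    return frames


# Five tiny 2x2 "frames" whose brightness rises by 1 each time.
group = [[[10 + t, 20 + t], [30 + t, 40 + t]] for t in range(5)]
coded = encode_group(group)
print("".join(kind for kind, _ in coded))   # IPPPP
print(coded[1][1])                          # [[1, 1], [1, 1]]
```

The residuals hold much smaller values than the frames themselves, which is the "less energy" property that makes them cheaper to encode.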
15 H.264
The edition of H.264 described here was published in March 2005 (the standard was first approved in 2003) and represents an evolution of the existing video coding standards (H.261, H.262, and H.263).
It was developed in response to the growing need for higher compression of moving pictures for various applications such as videoconferencing, digital storage media, television broadcasting, Internet streaming, and communication, and to enable the use of the coded video representation in a flexible manner for a wide variety of network environments.
It also allows motion video to be manipulated as a form of computer data and to be stored on various storage media, transmitted and received over existing and future networks, and distributed on existing and future broadcasting channels.
16 H.264
The revision modifies the video coding standard to:
add four new profiles (High, High 10, High 4:2:2, and High 4:4:4);
improve video quality capability;
extend the range of applications addressed by the standard (for example, by including support for a greater range of picture sample precision and higher-resolution chroma formats);
define new types of supplemental data to further broaden the applicability of the video coding standard.