Video/Audio Compression


1 Video/Audio Compression
Multimedia Video/Audio Compression T.Sharon-A.Frank

2 Hybrid coding
Images: JPEG
Video/Audio: M-JPEG, MPEG (1, 2, 4)
Other codings: H.26x

3 Video Coding Requirements
Random access
Fast forward/reverse searches
Reverse playback
Audio-visual synchronization
Robustness to errors
Low coding/decoding delay
Editability
Format flexibility
Cost tradeoffs

4 Video Compression
Spatial (intra-frame) compression:
  Compresses each frame in isolation, treating it as a bitmapped image.
  Based on quantization of DCT coefficients.
Temporal (inter-frame) compression:
  Compresses sequences of frames by storing only the differences between them.
  Records the displacement of an object plus the changed pixels in the area exposed by its movement.
  Based on Motion Compensation (MC).

5 Spatial Compression
Image compression applied to each frame.
Can therefore be lossless or lossy, but lossless compression rarely produces sufficiently high compression ratios for the volume of data.
Lossy compression implies a loss of quality if the video is decompressed and then recompressed.
Ideally, work with uncompressed video during post-production.

6 Temporal Compression
Key frames are spatially compressed only.
Key frames are often regularly spaced (e.g., every 12 frames).
Difference frames store only the differences between the frame and the preceding frame or the most recent key frame.
Difference frames can be efficiently spatially compressed.
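
The key-frame and difference-frame idea can be sketched in a few lines. This is only an illustration under simplifying assumptions: a frame is a flat list of pixel values, differences are taken against the preceding frame, and no spatial compression is applied afterwards. The names `encode`, `decode` and `key_interval` are hypothetical.

```python
from typing import List, Tuple

Frame = List[int]  # a frame flattened to a list of pixel values (assumption)


def encode(frames: List[Frame], key_interval: int = 12) -> List[Tuple[str, Frame]]:
    """Store every key_interval-th frame whole and the rest as pixel differences."""
    encoded = []
    previous: Frame = []
    for index, frame in enumerate(frames):
        if index % key_interval == 0:
            encoded.append(("K", list(frame)))  # key frame: stored in full
        else:
            # difference frame: per-pixel change relative to the preceding frame
            diff = [cur - prev for cur, prev in zip(frame, previous)]
            encoded.append(("D", diff))
        previous = frame
    return encoded


def decode(encoded: List[Tuple[str, Frame]]) -> List[Frame]:
    """Rebuild full frames by adding each difference to the previous frame."""
    frames: List[Frame] = []
    for kind, data in encoded:
        if kind == "K":
            frames.append(list(data))
        else:
            frames.append([prev + d for prev, d in zip(frames[-1], data)])
    return frames
```

In a mostly static scene the difference frames are dominated by zeros, which is why they compress well spatially.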

7 Motion-JPEG (M-JPEG)
Purely spatial compression.
Apply JPEG compression to each video frame.
Compression ratios: 2:1 to 12:1 (lossy); up to 5:1 is considered broadcast quality.
No standard, but the MJPEG-A format is widely supported.
Excellent when there are rapid scene changes in the video.
Easy to edit.

8 Video Compression
Coding of video is carried out in a series of steps:
Divide the image into blocks:
  16x16 luminance
  8x8 chrominance (color)
Use DCT-based techniques for spatial redundancy removal (intra-frame compression).
Use MC (Motion Compensation) techniques for temporal redundancy removal (inter-frame compression).
Final stage is two-dimensional run-length coding.
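
The DCT step for a single 8x8 block might look like the sketch below. It uses the textbook 2-D DCT-II formula and a single uniform quantization step; the step size of 16 and the function names are assumptions made for illustration. Real codecs use per-coefficient quantization tables and fast transforms.

```python
import math

N = 8  # block size


def dct_2d(block):
    """Direct (slow) 2-D DCT-II of an N x N block of sample values."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out


def quantize(coeffs, step=16):
    """Uniform quantization: small coefficients collapse to zero."""
    return [[round(value / step) for value in row] for row in coeffs]
```

For a completely flat block all the energy ends up in the single top-left (DC) coefficient, so every other quantized value is zero. This is the redundancy that the later run-length stage exploits.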

9 Three consecutive video frames

10 Motion Compensation
Motion compensation compensates for inter-frame differences.
Real-time communication consideration: only the closest previous frame is used for prediction, to reduce the encoding delay.
[Figure: previous frame, current frame, best match]

11 Motion Compensation Algorithm
Sends the new location of the block.
If the block has changed by more than a certain threshold, resends the whole block.
Refreshes the whole image once in a while.
[Figure: previous frame, current frame, best match]
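
A minimal sketch of this algorithm, assuming an exhaustive search over a small window and the sum of absolute differences (SAD) as the match criterion. The slides name neither choice, and the block size, search range and threshold values are illustrative only.

```python
def sad(prev, cur, px, py, cx, cy, size):
    """Sum of absolute differences between a block of cur and a block of prev."""
    return sum(abs(cur[cy + j][cx + i] - prev[py + j][px + i])
               for j in range(size) for i in range(size))


def best_match(prev, cur, cx, cy, size=4, search=2):
    """Return (cost, dx, dy) of the best-matching block in the previous frame."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = cx + dx, cy + dy
            if 0 <= px <= width - size and 0 <= py <= height - size:
                cost = sad(prev, cur, px, py, cx, cy, size)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best


def encode_block(prev, cur, cx, cy, size=4, search=2, threshold=10):
    """Send a motion vector if the match is good enough, else the whole block."""
    cost, dx, dy = best_match(prev, cur, cx, cy, size, search)
    if cost > threshold:
        return ("block", [row[cx:cx + size] for row in cur[cy:cy + size]])
    return ("vector", (dx, dy))
```

The periodic refresh of the whole image is not shown; it would amount to sending every block in full at a fixed interval.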

12 Frame Types in Compressed Video
Key frame: compression is based on the content of this frame.
Difference/delta frame: compression is based on the last key frame.

13 Bi-directional Motion Compensated Interpolation

14 MPEG Dynamics
Delicate balance between intra-frame and inter-frame coding.
Two basic techniques:
  Transform-domain, DCT-based compression for the reduction of spatial redundancy (intra-frame).
  Block-based bi-directional MC for the reduction of temporal redundancy (inter-frame).

15 The MPEG Standard
Three types of MPEG-2 frames processed by the viewing program:
  I (Intracoded) frames: self-contained still pictures, coded in a JPEG-like manner.
  P (Predictive) frames: block-by-block difference with the last frame.
  B (Bidirectional) frames: differences with the last and next frame.

16 Use of MPEG Image Types
<I> Intra-picture/frame/image
  Access points for random access
  Moderate compression
<P> Predicted pictures
  Coded with reference to a past picture
  Used as reference for future predicted pictures
<B> Bi-directional prediction (interpolated pictures)
  Require past and future references for prediction
  Highest compression

17 MPEG GOPs
Group of Pictures (GOP):
  Repeating sequence of I-, P- and B-pictures.
  Always begins with an I-picture.
Display order: frames in the order in which they will be displayed.
Bitstream order: re-ordered so that every P- or B-picture comes after the frames it depends on, allowing reconstruction of the complete frames.

18 A Typical MPEG Picture Display Order
[Figure: display order I B B B P B B B I, with forward prediction; 25fps (9 I/P, 17B)]

19 A Typical MPEG Picture Bitstream Order
Transmitting order: 1, 5, 2, 3, 4, 9, 6, 7, …
[Figure: I B B B P B B B I, with forward prediction and bi-directional prediction]
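
The re-ordering from display order to bitstream order can be sketched as follows, under the assumption that each B-picture depends on the nearest I- or P-picture on either side of it. The function name `bitstream_order` is hypothetical. Pictures are numbered from 1 in display order.

```python
def bitstream_order(display_types):
    """Return display-order picture numbers re-ordered for transmission.

    Each I- or P-picture (an "anchor") is emitted before the B-pictures that
    precede it in display order, because those B-pictures need it as a
    future reference.
    """
    order = []
    pending_b = []
    for number, kind in enumerate(display_types, start=1):
        if kind == "B":
            pending_b.append(number)   # hold back until the next anchor is sent
        else:
            order.append(number)       # anchor goes first
            order.extend(pending_b)    # then the B-pictures that depend on it
            pending_b.clear()
    order.extend(pending_b)            # trailing B-pictures, if any
    return order
```

For the display sequence I B B B P B B B I this returns 1, 5, 2, 3, 4, 9, 6, 7, 8, whose first eight entries agree with the transmitting order listed on the slide.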

20 MPEG Standards
MPEG-1
  352x240 at 30 fps.
  Quality is slightly below standard VCR videos.
MPEG-2
  720x480 and 1280x720 at 60 fps, with full CD-quality audio.
  Sufficient for television (including HDTV).
  Used on DVD-ROMs.
MP3
  Audio compression.
  Reduces digital sound files by a 12:1 ratio with virtually no loss in quality.

21 MPEG-1 Compression
Source Input Format (SIF):
  4:2:0 chrominance sub-sampling
  352x240 pixel frame
MPEG-1 compressed SIF video at 30 frames per second has a data rate of 1.86 Mbps (CD video: 40 mins of video at that rate).
MPEG-1 can be scaled up to larger frames, but cannot handle interlacing.
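
To put the 1.86 Mbps figure in context, the arithmetic below estimates the uncompressed SIF data rate and the storage needed for 40 minutes of compressed video. The 8 bits per sample is an assumption that the slide does not state. With 4:2:0 sub-sampling there are on average 1.5 samples per pixel.

```python
WIDTH, HEIGHT, FPS = 352, 240, 30
BITS_PER_SAMPLE = 8            # assumption: not given on the slide
COMPRESSED_BPS = 1_860_000     # 1.86 Mbps, from the slide

# 4:2:0 keeps one luminance sample per pixel plus two chrominance planes
# at quarter resolution: 1 + 2 * (1/4) = 1.5 samples per pixel.
bits_per_frame = WIDTH * HEIGHT * 3 // 2 * BITS_PER_SAMPLE
uncompressed_bps = bits_per_frame * FPS
compression_ratio = uncompressed_bps / COMPRESSED_BPS


def storage_bytes(bits_per_second, minutes):
    """Bytes needed to store a stream of the given rate and duration."""
    return bits_per_second * minutes * 60 // 8
```

Under these assumptions the uncompressed rate is about 30.4 Mbps, so MPEG-1 is achieving roughly 16:1, and 40 minutes at 1.86 Mbps occupies about 558 million bytes.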

22 MPEG Profiles & Levels
Profiles define subsets of the features of the data stream.
Levels define parameters such as frame size and data rate.
Each profile may be implemented at one or more levels.
Notation: e.g.

23 MPEG-2 Main Profile & Level
MPEG-2 Main Profile at Main Level, used for DVD video:
  CCIR 601 scanning
  4:2:0 chrominance sub-sampling
  15 Mbits per second
  Most elaborate representation of MPEG-2 compressed data.

24 MPEG-4 (1)
Refinement of MPEG-1 compression:
  I-pictures compressed by quantizing and Huffman coding DCT coefficients.
  Improved motion compensation leads to better quality than MPEG-1 at the same bit rates.
Designed to support a range of multimedia data at bit rates from 10 Kbps to >1.8 Mbps.
Applications from mobile phones to HDTV.
Video codec becoming popular for Internet use; it is incorporated in QuickTime, RealMedia and DivX.

25 MPEG-4 (2)
Standard defines an encoding for multimedia streams made up of different sorts of objects: video, still images, animation, 3-D models…
Higher profiles divide a scene into arbitrarily shaped video objects, where each one may be compressed and transmitted separately; the scene is composed at the receiving end by combining them.
SP and ASP profiles are restricted to rectangular objects, usually complete frames.

26 MPEG-4 Profiles & Levels
Simple Profile (SP), suitable for low-bandwidth streaming over the Internet:
  P-pictures only
  Efficient decompression, suitable for PDAs, etc.
  64 kbps, 176x144 pixel frame.
Advanced Simple Profile (ASP), suitable for broadband streaming:
  B-pictures
  Global Motion Compensation
  Sub-pixel motion compensation
  Kbps, full CCIR 601 frame.

27 DV Compression
Starts with chrominance sub-sampling of CCIR 601.
Constant data rate of 25 Mbits per second; higher quality than MJPEG at the same rate.
Apply DCT, quantization, run-length and Huffman coding on the zig-zag sequence (like JPEG) to 8x8 blocks of pixels.
If there is little or no difference between fields (almost static frame), apply DCT to a block containing alternate lines from odd and even fields.
If there is motion between fields, apply DCT to two 8x4 blocks (one from each field) separately, leading to more efficient compression of frames with motion.
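
The zig-zag sequence and the run-length step can be illustrated as below. This follows the JPEG-style scan of an 8x8 block of quantized coefficients. The (zero run, value) pair format and the "EOB" end-of-block marker are simplifications chosen for the sketch, not the exact DV bitstream syntax, and the Huffman stage is omitted.

```python
def zigzag_indices(n=8):
    """(row, column) positions of an n x n block in zig-zag scan order."""
    indices = []
    for s in range(2 * n - 1):              # s = row + column of one anti-diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            diagonal.reverse()              # even diagonals run bottom-left to top-right
        indices.extend(diagonal)
    return indices


def run_length(values):
    """Encode a sequence as (number of preceding zeros, non-zero value) pairs."""
    pairs = []
    zeros = 0
    for value in values:
        if value == 0:
            zeros += 1
        else:
            pairs.append((zeros, value))
            zeros = 0
    pairs.append("EOB")                     # trailing zeros are implied by the marker
    return pairs


def encode_block(block):
    """Zig-zag scan a quantized block, then run-length encode the result."""
    return run_length([block[r][c] for r, c in zigzag_indices(len(block))])
```

The zig-zag order visits low-frequency coefficients first, so the zeros left by quantization cluster at the end of the sequence and collapse into the end-of-block marker.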

28 DVI (Digital Video Interactive)
Developed by General Electric.
Uses specialized processors for compression.
Hardware-only codec; lossless transforms.
Compression rate: 80:1 to 160:1.
10 sec video clip is compressed to ~2MB.
Intel: software version of DVI algorithms, marketed as Indeo (a software-only codec):
  There is also an audio version of Indeo.
  Latest version uses a hybrid wavelet transform for its compression algorithm.

29 Cinepak
Developed by Apple and SuperMac.
Outputs 320x240 (quarter screen) at 15 fps with good quality, at a data rate that even slow single-speed and 2x CD-ROM players can deliver.
Software-only codec supported by Microsoft’s Video for Windows and Apple’s QuickTime.
Better color definition than other codecs, so good for natural video without graphics or animation.

30 QuickTime
Developed by Apple but is now cross-platform.
Supports Cinepak, Indeo, M-JPEG and MPEG-1, and is extensible to support future codecs, such as DVCAM.
Synchronizes all types of digital media. For example, video frames are dropped if necessary for synchronization with audio.

31 Video For Windows
Microsoft (therefore, not cross-platform).
Uses the generic AVI (audio video interleaved) format, which is provided by MCI (media control interface).
Supports a number of compression methods in real time or non-real time, with or without hardware assistance:
  Cinepak, Indeo, Microsoft Video-1.

32 ActiveMovie (API from Microsoft)
Now called DirectShow (supports DVD).
Solves problems of VfW and QuickTime.
Cross-platform.
Supports codecs supported by VfW as well as MPEG audio, WAV audio, MPEG video, and Apple QuickTime video.
Fully integrated with DirectX technology, allowing use of DirectX components and more graphics card features.

33 Video Streaming Players
RealVideo (from RealNetworks)
  G2 Player also plays RealAudio.
  Uses a variety of compression techniques.
RealProducer (also from RealNetworks)
  Allows you to create streaming audio and video.
  Free software just like G2!

34 H.261 (Px64)
Video compression for videoconferences
Compression in real time
Targeted to ISDN
Compressed data stream: p*64 Kbits/s (p = 1, …, 30)
2 resolutions:
  Common Intermediate Format (CIF)
  Quarter CIF (QCIF)

35 H.261 (Px64) Resolutions
Common Intermediate Format (CIF)
Quarter CIF (QCIF)

36 Image Preparation
Uncompressed CIF:
  One frame = 288*352*8 + 2*144*176*8 = 1,216,512 bits
  30 fps
  Bandwidth = 1,216,512*30 ≈ 36.5 Mbits/s
Uncompressed QCIF = 9.1 Mbits/s
ISDN channels: 64 Kbits/s to 2 Mbits/s => bit reduction required
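
The frame-size arithmetic on this slide can be reproduced with a short calculation. The sketch assumes 8 bits per sample and two chrominance planes at half the luminance resolution in each direction, as the slide's formula implies. The QCIF dimensions of 176x144 are an assumption based on QCIF being a quarter of CIF.

```python
def frame_bits(width, height, bits_per_sample=8):
    """Bits in one frame: full-size luminance plus two half-size chrominance planes."""
    luminance = width * height * bits_per_sample
    chrominance = 2 * (width // 2) * (height // 2) * bits_per_sample
    return luminance + chrominance


def bandwidth_bps(width, height, fps):
    """Uncompressed bit rate at the given frame rate."""
    return frame_bits(width, height) * fps


CIF = (352, 288)
QCIF = (176, 144)  # assumption: half of CIF in each dimension

cif_bps = bandwidth_bps(*CIF, fps=30)
qcif_bps = bandwidth_bps(*QCIF, fps=30)
```

Here `cif_bps` comes to 36,495,360 bits/s and `qcif_bps` to 9,123,840 bits/s, both well above the 2 Mbits/s upper end of the ISDN range.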

37 Desktop Videophone Applications
Channel capacity (p=1) = 64 Kbits/s
  QCIF at 10 fps --> 3 Mbits/s
  Required compression ratio = 3 Mbits/s / 64 Kbits/s ≈ 47
Channel capacity (p=10) = 640 Kbits/s
  CIF at 30 fps --> 36.5 Mbits/s
  Required compression ratio = 36.5 Mbits/s / 640 Kbits/s ≈ 57
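
The required compression ratio is simply the uncompressed rate divided by the channel capacity p*64 Kbits/s. A small helper makes this explicit. The function name is hypothetical, and the inputs are the rounded rates of about 3 Mbits/s for QCIF at 10 fps and about 36.5 Mbits/s for CIF at 30 fps.

```python
def required_ratio(uncompressed_bps, p):
    """Compression ratio needed to fit a stream into p ISDN channels of 64 Kbits/s."""
    if not 1 <= p <= 30:
        raise ValueError("H.261 defines p in the range 1..30")
    channel_bps = p * 64_000
    return uncompressed_bps / channel_bps


qcif_ratio = required_ratio(3.0e6, p=1)     # QCIF at 10 fps over one channel
cif_ratio = required_ratio(36.5e6, p=10)    # CIF at 30 fps over ten channels
```

These evaluate to roughly 47 and 57 respectively.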

38 Audio Compression
In general, lossy methods are required because of the complex and unpredictable nature of audio data.
A CD-quality, stereo, 3-minute song requires over 25 Mbytes.
  Data rate exceeds the bandwidth of a dial-up Internet connection.
The difference in the way we perceive sound and image means that a different approach from image compression is needed.
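
The size of a CD-quality song can be checked with a quick calculation. The sampling parameters are the standard CD audio values (44.1 kHz, 16 bits, 2 channels) and the 56 Kbits/s dial-up rate is a typical figure. None of these numbers appear on the slide, so treat them as assumptions.

```python
SAMPLE_RATE_HZ = 44_100   # assumption: standard CD sampling rate
BYTES_PER_SAMPLE = 2      # assumption: 16-bit samples
CHANNELS = 2              # stereo
DIAL_UP_BPS = 56_000      # assumption: typical dial-up modem rate

bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
bit_rate_bps = bytes_per_second * 8


def song_bytes(seconds):
    """Uncompressed size in bytes of a CD-quality stereo recording."""
    return bytes_per_second * seconds


three_minutes = song_bytes(3 * 60)
times_too_fast = bit_rate_bps / DIAL_UP_BPS
```

With these values a 3-minute song is 31,752,000 bytes, consistent with "over 25 Mbytes", and the data rate of about 1.41 Mbits/s is some 25 times what the assumed dial-up link can carry.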

39 Audio Compression Techniques

40 Standards of Speech Encoding

41 Basic Steps of Audio Encoding
[Figure: block diagram with the stages: uncompressed audio data, filter banks (32 sub-bands), quantization, multiplexer, entropy coder, psychoacoustical model (control), compressed audio data]

42 MP3
MP3 = MPEG-1 Audio, Layer 3
Three layers of audio compression in MPEG-1 (MPEG-2 essentially identical).
From Layer 1 to Layer 3, the encoding process increases in complexity and the data rate for the same quality decreases.
  E.g., for the same quality: 192 kbps at Layer 1, 128 kbps at Layer 2, 64 kbps at Layer 3.
10:1 compression ratio at high quality.
Variable bit rate coding (VBR).
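
As a rough check on these figures, the sketch below converts each layer's bit rate into a compression ratio relative to uncompressed CD audio. The 1,411,200 bits/s reference (44.1 kHz, 16 bits, stereo) is an assumption. The slide does not say what source rate its ratios refer to, or whether the quoted bit rates are per channel.

```python
CD_BPS = 44_100 * 16 * 2   # assumption: uncompressed stereo CD audio, 1,411,200 bits/s

LAYER_BPS = {              # bit rates quoted on the slide for the same quality
    "Layer 1": 192_000,
    "Layer 2": 128_000,
    "Layer 3": 64_000,
}


def compression_ratio(compressed_bps, source_bps=CD_BPS):
    """How many times smaller the compressed stream is than the source."""
    return source_bps / compressed_bps


ratios = {layer: compression_ratio(bps) for layer, bps in LAYER_BPS.items()}
```

Under this assumption the ratios come out at about 7.35:1, 11:1 and 22:1, so the 10:1 to 12:1 range quoted in these slides sits close to the 128 kbps case.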

43 Voice Quality - QoS
The Objective: Provide unfailing, ubiquitous, toll-quality service.
The Challenge: Eliminate the impact of delay-insensitive traffic on real-time traffic.
[Figure: one-way delay (ms; ticks at 160, 200, 400) against packet loss (%; ticks at 1, 5, 10), with a low and a high threshold separating regions labelled acceptable operation, marginal acceptance, service level agreement violation, and area of unacceptable operation]

44 QoS Parameters Delay Budgets

