CSCD 443/533 Advanced Networks, Fall 2016. Lecture 19: Compression of Video and Audio

Topics
– Compression technology
– Motivation
– Human attributes that make it possible
– Audio compression
– Video compression
– Performance

Motivation: Why Compress?
Why do we need to compress streaming media? Look at one instance:
– 640 x 480 pixel frames
– 24 bits of color per pixel
– 30 frames per second
– With no compression, the video alone takes over 200 Mbps to transmit (640 x 480 x 24 x 30 is about 221 Mbps)
– Do you have a 200 Mbps link?
– We need massive compression to be able to view streaming video and audio over our current networks
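The arithmetic behind the "over 200 Mbps" figure can be checked with a short sketch. The function name `raw_bitrate_bps` is made up for this illustration; the numbers are the ones on the slide.

```python
# Raw (uncompressed) video bitrate for the frame format on the slide.
WIDTH, HEIGHT = 640, 480      # pixels per frame
BITS_PER_PIXEL = 24           # 8 bits each for red, green, blue
FRAMES_PER_SECOND = 30


def raw_bitrate_bps(width, height, bits_per_pixel, fps):
    """Bits per second needed to send every pixel of every frame."""
    return width * height * bits_per_pixel * fps


bps = raw_bitrate_bps(WIDTH, HEIGHT, BITS_PER_PIXEL, FRAMES_PER_SECOND)
print(f"{bps} bit/s = {bps / 1e6:.1f} Mbps")  # 221184000 bit/s = 221.2 Mbps
```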

Motivation: Why Compress?
What does compression buy us?
– Uncompressed DVD-resolution video: 221 Mbps
– Compressed DVD video: 4 Mbps
– Roughly a 50:1 compression ratio (221 / 4 is about 55)!
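To put the ratio in storage terms, here is a small sketch using the two bitrates from the slide. The two-hour running time is an assumption chosen only for illustration, and `gigabytes` is a hypothetical helper.

```python
RAW_MBPS = 221.0         # uncompressed rate from the slide
COMPRESSED_MBPS = 4.0    # compressed DVD rate from the slide

ratio = RAW_MBPS / COMPRESSED_MBPS   # 55.25, i.e. "roughly 50:1"


def gigabytes(mbps, seconds):
    """Storage in GB (10**9 bytes) for a constant-bitrate stream."""
    return mbps * 1e6 * seconds / 8 / 1e9


TWO_HOURS = 2 * 60 * 60  # assumed movie length, not from the lecture

print(f"ratio      : {ratio:.2f}:1")
print(f"raw        : {gigabytes(RAW_MBPS, TWO_HOURS):.1f} GB")
print(f"compressed : {gigabytes(COMPRESSED_MBPS, TWO_HOURS):.1f} GB")
```

Under that assumption the raw stream needs on the order of 200 GB, while the compressed one fits in a few GB.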

Why Compress? In a Nutshell
– To reduce the file size
– To deliver the stream to the user
– To conserve storage space
Choosing a compression rate is a balance between:
– Quality of the media
– Available bandwidth

So, Why Compress?
– Delivering video over the Web means compromises, mostly trading image quality for lower bit rates
– In general, video and audio are compressed, stuffed into a container, and delivered to you via the Web
– If done well, you won't notice the missing bits or the delivery of the media
– Next: individual formats, codecs, and tradeoffs

Definitions
Define:
– Streaming file format (like .mov, .avi, or .flv)
– Codec
– Compression

Definitions
File Format
– The particular way information is stored in a file
– Known as a "container" for streaming media
Codec
– Short for coder/decoder (often also read as compressor/decompressor): the algorithm that encodes and decodes the media
Compression
– Reduces file size by removing redundant or imperceptible audio or video information

Format vs. Codec
Example
– Flash Video (FLV) is a file format
– H.264, On2 VP6, and Sorenson Spark are codecs used inside the Flash video file

Container File Formats
Purpose of container formats
– Function as "black boxes" for holding a variety of media formats
– Good container formats can handle files compressed with a variety of different codecs
– In a perfect world, you could put any codec in any container format. Unfortunately, there are some incompatibilities
Examples
– MPEG-2, Advanced Systems Format (ASF) from Microsoft, AVI, QuickTime (MOV), MP4, Flash (FLV), RealMedia

Multimedia Container Files
– Multimedia file extensions: .mov, .ogg, .wmv, .flv, .mp4, .mpeg
– Essentially, videos are packaged into encapsulation containers, or wrapper formats, that contain all the information needed to present the video
– You can think of file formats as containers that hold all this information, very similar to a .zip, .sit, or .rar file

Differences in Containers
Why do you think certain formats are popular?
– Popular support: how widely supported is the format?
– File size: larger is not better for streaming files
– Support for advanced codec functionality: older formats such as AVI do not support newer codec features like B-frames or VBR audio
– Support for advanced content such as chapters, subtitles, meta-tags, and user data

Compression

Compression: Two Types
– Lossless: keeps all bits
– Lossy: removes bits

Lossy Compression
– Lossy compression schemes reduce file size by discarding some amount of data during encoding, before the media is sent over the Internet
– Once the data is received by the client, the codec reconstructs an approximation of the original; the discarded information cannot be recovered exactly
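The difference between the two types can be seen in a toy sketch. It uses Python's standard `zlib` module for the lossless case and a crude "drop the low bits" quantizer as a stand-in for a lossy codec; real audio and video codecs are far more selective about what they discard, and the sample data here is synthetic.

```python
import zlib

# Synthetic stand-in for media samples (one byte each).
samples = bytes((i * 7) % 251 for i in range(1000))

# Lossless: the round trip returns exactly the original bytes.
packed = zlib.compress(samples)
assert zlib.decompress(packed) == samples

# Lossy (toy): keep only which 16-wide bucket each sample falls in.
STEP = 16
quantized = bytes(s // STEP for s in samples)

# The decoder can only guess the middle of each bucket.
restored = bytes(q * STEP + STEP // 2 for q in quantized)
max_error = max(abs(a - b) for a, b in zip(samples, restored))

print("identical after lossy round trip:", restored == samples)
print("largest per-sample error:", max_error)
```

The lossy version needs only 4 bits per sample instead of 8, at the price of an error of up to half a step in every sample.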

Video Lossy Compression
Image compression
– A lossy image format samples the image and discards color and contrast information the viewer is unlikely to miss

Can you really see the difference?

Video Lossy Compression
Why can you do lossy compression?
Spatial and temporal redundancy
– Pixel values are not independent; they are correlated with their neighbors both within the same frame and across frames
– The value of a pixel is largely predictable given the values of neighboring pixels
Psychovisual redundancy
– The human eye has a limited response to fine spatial detail
– It is less sensitive to detail near object edges or around shot changes
– Impairments introduced by bit rate reduction should not be visible to the human viewer
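Spatial redundancy can be made concrete with a small sketch: if each pixel is predicted from its left neighbor, only the (small) prediction error needs to be stored. The scan line below is synthetic, a slow brightness ramp with a little texture, and left-neighbor prediction is a simplification chosen for illustration rather than what any particular codec does.

```python
import zlib

# One synthetic scan line: a slow ramp plus a small repeating texture.
row = [x // 4 + (x % 3) for x in range(1000)]          # values 0..251

# Predict each pixel from its left neighbour; keep the error modulo 256.
residual = [row[0]] + [(row[i] - row[i - 1]) % 256 for i in range(1, len(row))]

# The decoder undoes the prediction by accumulating the errors.
rebuilt = [residual[0]]
for r in residual[1:]:
    rebuilt.append((rebuilt[-1] + r) % 256)
assert rebuilt == row                                   # nothing was lost

print("distinct pixel values   :", len(set(row)))
print("distinct residual values:", len(set(residual)))
print("zlib size of pixels     :", len(zlib.compress(bytes(row))))
print("zlib size of residuals  :", len(zlib.compress(bytes(residual))))
```

Because neighboring pixels are correlated, the residuals take only a handful of distinct values, which an entropy coder can represent with very few bits.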

Audio Lossy Compression
Audio compression
– A lossy codec discards frequencies at the high and low ends of the spectrum and attempts to locate and remove audio data the listener will not miss
– More on this later
Nice description and example programs: http://www.videograbber.net/compress-audio-file.html

Audio Streaming Formats
Many formats and standards exist for streaming audio
– RealNetworks' RealAudio, streaming MP3, Macromedia's Flash and Director Shockwave, Microsoft's Windows Media, and Apple's QuickTime
– Other recognized formats include Liquid Audio, MP3, MIDI, WAV, and AU

Audio Lossy Compression
– First, the player decompresses the audio file as it downloads to your computer
– Then it fills in the missing information according to the instructions set by the codec
– The compressed file is unintelligible to a listener
– The decompressed file is intelligible but of lower quality than the original

MP3 Audio Lossy Compression
Example: MP3
– The MP3 lossy audio compression algorithm takes advantage of a perceptual limitation of human hearing: auditory masking
– It was discovered in the late 1800s that a tone could be rendered inaudible by another tone of lower frequency
– Masking describes how your brain perceives similar sounds

MP3 Audio Lossy Compression
Uncompressed audio, like that on CDs, stores more data than your brain can actually process. For example:
– If two notes are very similar and very close together, your brain may perceive only one of them
– If two sounds are different but one is much louder than the other, your brain may never perceive the quieter signal

MP3 Audio Lossy Compression
The study of these auditory phenomena is called psychoacoustics
– The phenomena can be accurately described in tables and charts
– These form mathematical models representing human hearing patterns
– The models can be stored in the codec as reference tables
Article on psychoacoustics: http://www.uaudio.com/blog/how-the-ear-works/

MP3 Audio Lossy Compression
MP3 encoding tools
– Analyze the incoming source signal
– Break it down into mathematical patterns
– Compare these patterns to the psychoacoustic models stored in the encoder itself
The encoder can then discard most of the data that doesn't match the stored models, keeping that which does. This shrinks the file by discarding a great deal of extra data.
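The "compare against a model, then discard" idea can be sketched with a deliberately crude masking rule. Everything numeric here is a made-up assumption for illustration (the 20 dB margin, the 1.2 frequency ratio, the tone list); a real MP3 psychoacoustic model uses measured masking curves per critical band, not a single rule like this.

```python
def audible(components, max_ratio=1.2, margin_db=20.0):
    """Keep tones that are not drowned out by a much louder nearby tone.

    components: list of (frequency_hz, level_db) pairs.
    max_ratio, margin_db: hypothetical stand-ins for a masking model.
    """
    kept = []
    for freq, level in components:
        masked = any(
            other_level - level >= margin_db
            and max(freq, other_freq) / min(freq, other_freq) <= max_ratio
            for other_freq, other_level in components
            if (other_freq, other_level) != (freq, level)
        )
        if not masked:
            kept.append((freq, level))
    return kept


tones = [
    (1000.0, 70.0),   # loud tone
    (1100.0, 40.0),   # quiet tone right next to it: masked
    (5000.0, 40.0),   # equally quiet, but far away in frequency: kept
]
print(audible(tones))  # [(1000.0, 70.0), (5000.0, 40.0)]
```

The quiet 1100 Hz tone is dropped because a tone 30 dB louder sits close to it, while the equally quiet 5000 Hz tone survives.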

MP3 Audio Lossy Compression
The MP3 encoding process is a two-pass system
Step 1 (lossy)
– Run all psychoacoustic models, discarding data
– Then compress what's left to shrink storage space
Step 2 (lossless)
– Huffman coding, which does not discard any data
– Lets you store what's left in a smaller amount of space
Step 2a
– Break the resulting audio stream into frames assembled into a bitstream, with header information preceding each data frame
– Headers contain "meta-data" specific to that frame
– Such as an ID, bitrate, sampling frequency, padding, type of frame, and MPEG-1 or MPEG-2
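Step 2 can be illustrated with a generic Huffman coder. This is a textbook sketch built from symbol counts, not the fixed Huffman tables that the MP3 standard actually specifies; the symbol stream is invented to stand in for quantized values where small numbers dominate.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code: frequent symbols get shorter bit strings."""
    freq = Counter(symbols)
    if len(freq) == 1:                      # degenerate one-symbol input
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie_breaker, {symbol: code_so_far}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)     # two least frequent subtrees
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]


def decode(bits, code):
    """Invert the code; works because no code word is a prefix of another."""
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out


data = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2    # 16 symbols, skewed counts
code = huffman_code(data)
encoded = "".join(code[s] for s in data)

assert decode(encoded, code) == data             # lossless round trip
print(code)
print(len(encoded), "bits instead of", 2 * len(data), "with a fixed 2-bit code")
```

No data is discarded in this step: the decoder recovers the exact symbol sequence, it just takes fewer bits to store.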

Basic Structure of an Audio Encoder
[Figure: block diagram of the audio encoder, annotated "Limit values to audible tones"]
Note: a decoder works in just the opposite manner

Processes of an Audio Encoder
– Mapping Block: divides audio inputs into 32 equal-width frequency subbands (samples)
– Psychoacoustic Block: calculates the masking threshold for each subband

Processes of an Audio Encoder
– Bit-Allocation Block (in the Quantizer block): allocates bits using the outputs of the Mapping and Psychoacoustic blocks
– Quantizer & Coding Block: scales and quantizes (reduces) the samples
– Frame Packing Block: formats the samples with headers into an encoded stream
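The "scale and quantize" step can be sketched for a single subband. This is a plain uniform quantizer with one scale factor, written for illustration; the function names, the sample values, and the bit counts are assumptions, and the real MPEG audio quantizer and its bit-allocation loop are more elaborate.

```python
def encode_subband(samples, bits):
    """Scale samples to [-1, 1], then map each to one of 2**bits levels."""
    scale = max(abs(s) for s in samples) or 1.0   # scale factor for the band
    levels = 2 ** bits
    step = 2.0 / levels
    indices = [min(levels - 1, int((s / scale + 1.0) / step)) for s in samples]
    return scale, indices


def decode_subband(scale, indices, bits):
    """Rebuild each sample as the midpoint of its quantization level."""
    step = 2.0 / 2 ** bits
    return [((i + 0.5) * step - 1.0) * scale for i in indices]


subband = [0.5, -0.25, 0.1, -0.5]                  # hypothetical samples

for bits in (2, 4, 8):                             # bits "allocated" to the band
    scale, idx = encode_subband(subband, bits)
    rebuilt = decode_subband(scale, idx, bits)
    worst = max(abs(a - b) for a, b in zip(subband, rebuilt))
    print(f"{bits} bits -> worst error {worst:.4f}")
```

Giving a subband more bits shrinks the quantization error; the bit-allocation block's job is to spend bits where the psychoacoustic model says the error would otherwise be audible.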

Video Encoding, Standards

MPEG Organization
– Moving Picture Experts Group
– Established in 1988
– Standards fall under the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC)
– Official name: ISO/IEC JTC1 SC29 WG11
– Responsible for the MPEG standards

Evolution of MPEG
MPEG-1
– Initial audio/video compression standard
– Used by VCDs in the 1990s
– MP3 = MPEG-1 Audio Layer 3
– Target bitrate of 1.5 Mb/s at 352x240 resolution
– Only supports progressive pictures, no interlaced pictures

Evolution of MPEG
MPEG-2
– Current de facto standard, widely used in DVD and digital TV
– Support in current hardware implies that it will be here for a long time; the transition to HDTV has taken over 10 years and is not finished yet
– Different profiles and levels allow for quality control

Evolution of MPEG
MPEG-3
– Originally developed for HDTV, but abandoned when MPEG-2 was determined to be sufficient
MPEG-4
– Includes support for AV "objects", 3D content, low bitrate encoding, and DRM
– In practice, provides equal quality to MPEG-2 at a lower bitrate, but often fails to deliver outright better quality
– MPEG-4 Part 10 is H.264, which is used in HD DVD and Blu-ray

MPEG-2 technical specification
– Part 1, Systems: describes synchronization and multiplexing of video and audio
– Part 2, Video: compression codec for interlaced and non-interlaced video signals
– Part 3, Audio: compression codec for perceptual coding of audio signals; a multichannel-enabled extension of MPEG-1 audio
– Part 4: describes procedures for testing compliance
– Part 5: describes systems for software simulation
– Part 6: describes extensions for DSM-CC (Digital Storage Media Command and Control)
– Part 7: Advanced Audio Coding (AAC)
– Part 8: deleted
– Part 9: extension for real-time interfaces
– Part 10: conformance extensions for DSM-CC

MPEG Video: spatial domain processing
The spatial domain is handled similarly to JPEG
– Convert RGB values to the YUV colorspace: one grey (luminance) component and two color components. YUV comes from television; RGB comes from graphics processing
– Split the frame into 8x8 blocks
– Apply a 2-D Discrete Cosine Transform (DCT) to each block; the DCT is similar to the Fourier transform used in signal processing
– Quantize the DCT coefficients
– Apply run-length and entropy coding
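The first two steps can be sketched as follows. The conversion uses the classic analog YUV weights (0.299, 0.587, 0.114 for luminance); MPEG itself stores a scaled and offset variant (YCbCr), so treat the exact constants as an illustrative assumption. `split_blocks` is a hypothetical helper that assumes the frame dimensions are multiples of the block size.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to luminance (Y) and two color differences."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # the "grey" component
    u = 0.492 * (b - y)                     # blue minus luminance
    v = 0.877 * (r - y)                     # red minus luminance
    return y, u, v


def split_blocks(plane, size=8):
    """Cut a 2-D list (rows of samples) into size x size blocks."""
    height, width = len(plane), len(plane[0])
    return [
        [row[left:left + size] for row in plane[top:top + size]]
        for top in range(0, height, size)
        for left in range(0, width, size)
    ]


print(rgb_to_yuv(128, 128, 128))   # grey pixel: U and V are (close to) zero
print(rgb_to_yuv(255, 0, 0))       # pure red: large positive V

# A 16x16 luminance plane splits into four 8x8 blocks.
plane = [[(x + y) % 256 for x in range(16)] for y in range(16)]
blocks = split_blocks(plane)
print(len(blocks), "blocks of", len(blocks[0]), "x", len(blocks[0][0]))
```

Separating luminance from color is what later lets the encoder treat the color components more coarsely than the brightness the eye is most sensitive to.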

DCT Transform on Blocks
Reduction in the number of bits
– For typical blocks from natural images, the distribution of coefficients is non-uniform
– The DCT has the property that, for a typical image, most of the visually significant information is concentrated in just a few DCT coefficients
– Many other coefficients are near zero
– Near-zero coefficients are discarded: quantization rounds them to zero
– The remaining coefficients are quantized and Huffman coded
Quantization: discrete values get rounded
Huffman coding: more frequent symbols are represented with fewer bits
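The energy-compaction property is easy to observe on a smooth block. The sketch below is a direct (slow) 8x8 DCT-II in pure Python with a single uniform quantizer step; the step size of 16 and the test block are assumptions for illustration, whereas JPEG and MPEG use per-coefficient quantization tables.

```python
import math

N = 8  # block size used by JPEG and MPEG


def dct_2d(block):
    """Direct 2-D DCT-II of an 8x8 block (O(N^4), fine for a demo)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * total   # 0.25 = 2/N for N = 8
    return out


# A smooth block: brightness changes gently, as in most natural image areas.
block = [[100 + 2 * x + y for y in range(N)] for x in range(N)]

coeffs = dct_2d(block)
STEP = 16                                            # hypothetical quantizer step
quantized = [[round(value / STEP) for value in row] for row in coeffs]
nonzero = sum(1 for row in quantized for q in row if q != 0)

print("DC coefficient:", round(coeffs[0][0], 3))
print("non-zero coefficients after quantization:", nonzero, "of", N * N)
```

Almost all 64 quantized coefficients come out as zero, which is exactly what makes the following run-length and Huffman stages effective.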

MPEG Video: time domain processing
A totally new ballgame (this concept doesn't exist in JPEG)
General idea
– Use motion vectors to specify how a 16x16 macroblock translates between the reference frames and the current frame
– Then code the difference between the reference block and the actual block
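The general idea can be sketched as "predict from the reference frame, store only the difference". The frames below are tiny synthetic 8x8 arrays and the block size is 4 to keep the example readable (MPEG uses 16x16 macroblocks); the helper names are made up for this illustration.

```python
def get_block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def residual(current, reference, top, left, mv, size):
    """Difference between the actual block and its motion-compensated prediction."""
    dy, dx = mv
    actual = get_block(current, top, left, size)
    predicted = get_block(reference, top + dy, left + dx, size)
    return [[a - p for a, p in zip(ra, rp)] for ra, rp in zip(actual, predicted)]


def reconstruct(reference, top, left, mv, res, size):
    """Decoder side: prediction from the reference frame plus the residual."""
    dy, dx = mv
    predicted = get_block(reference, top + dy, left + dx, size)
    return [[p + r for p, r in zip(rp, rr)] for rp, rr in zip(predicted, res)]


# Reference frame, and a current frame where everything moved 1 pixel right.
reference = [[10 * y + x for x in range(8)] for y in range(8)]
current = [[row[0]] + row[:-1] for row in reference]

SIZE = 4
mv = (0, -1)                       # "look one pixel to the left in the reference"
res = residual(current, reference, 2, 2, mv, SIZE)

print(res)                         # all zeros: the motion vector explains the block
assert reconstruct(reference, 2, 2, mv, res, SIZE) == get_block(current, 2, 2, SIZE)
```

When the motion vector is a good match, the residual is (near) zero and costs almost nothing to encode after the DCT and quantization steps.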

GOP (Group of Pictures)
– A GOP is a set of consecutive frames that can be decoded without any other reference frames
– Usually 12 or 15 frames
– Starts with an I-frame

MPEG Video: time domain processing
Group of Pictures (GOP)
I-frames
– Can be reconstructed without any reference to other frames, like still pictures
P-frames
– Forward predicted from the preceding I-frame or P-frame
– Code differences, such as movement
– Typically placed two to four frames after their reference frame
B-frames
– Predicted both forward and backward, from the reference frames before and after them
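The frame types can be laid out in a short sketch. The GOP length of 12 comes from the earlier slide; the spacing of 3 frames between reference (I or P) frames is an assumed, commonly used value, and both function names are hypothetical. The second function shows why B-frames introduce decoding delay: a B-frame cannot be decoded until the later reference frame it depends on has arrived.

```python
def gop_pattern(n=12, m=3):
    """Frame types in display order: n frames per GOP, a reference every m frames."""
    return "".join(
        "I" if i == 0 else "P" if i % m == 0 else "B"
        for i in range(n)
    )


def decode_order(pattern):
    """Display indices in the order a decoder needs them.

    Each I/P frame is moved ahead of the B-frames that precede it in
    display order, because those B-frames are predicted from it.
    """
    order, waiting_b = [], []
    for index, frame_type in enumerate(pattern):
        if frame_type == "B":
            waiting_b.append(index)
        else:
            order.append(index)
            order.extend(waiting_b)
            waiting_b = []
    # Trailing B-frames would wait for the next GOP's I-frame.
    order.extend(waiting_b)
    return order


pattern = gop_pattern()
print(pattern)                  # IBBPBBPBBPBB
print(decode_order(pattern))    # [0, 3, 1, 2, 6, 4, 5, 9, 7, 8, 10, 11]
```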

MPEG Processing GOP

MPEG GOP

Final Comments on Prediction
Only use a motion vector if a "close" match can be found
– Evaluate "closeness" with Mean Squared Error or another metric
– Can't search all possible blocks, so a smart search algorithm is needed
– If no suitable match is found, just code the macroblock as an intra-coded block (I-block)
– If a scene change is detected, start fresh
Don't want too many P or B frames in a row
– Prediction error will keep propagating until the next I-frame
– Delay in decoding
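The match-or-fall-back decision can be sketched as below. For simplicity it searches every offset in a small window, which is exactly the brute-force approach the slide says real encoders avoid; the window size, the MSE threshold of 25, the 4x4 block size, and the synthetic frames are all assumptions for illustration.

```python
def get_block(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]


def mse(a, b):
    """Mean squared error between two equally sized blocks."""
    count = sum(len(row) for row in a)
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / count


def choose_coding(current, reference, top, left, size, search=2, max_mse=25.0):
    """Return ("inter", motion_vector) for a close match, else ("intra", None)."""
    target = get_block(current, top, left, size)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search, search + 1):          # exhaustive small window
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue                           # candidate falls off the frame
            err = mse(target, get_block(reference, t, l, size))
            if best is None or err < best[0]:
                best = (err, (dy, dx))
    if best is None or best[0] > max_mse:
        return ("intra", None)                     # no close match: code as I-block
    return ("inter", best[1])


reference = [[10 * y + x for x in range(8)] for y in range(8)]
shifted = [[row[0]] + row[:-1] for row in reference]      # scene moved 1 px right
scene_change = [[200] * 8 for _ in range(8)]              # unrelated content

print(choose_coding(shifted, reference, 2, 2, 4))         # ('inter', (0, -1))
print(choose_coding(scene_change, reference, 2, 2, 4))    # ('intra', None)
```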

MPEG-2 Usefulness
– Multimedia communications
– Webcasting
– Broadcasting
– Video on demand
– Interactive digital media
– Telecommunications
– Mobile communications


Summary
– Video and audio have become a huge part of our daily interaction with the Internet
– New codecs and file formats are being proposed all the time
– The number of devices with different needs is driving the push for more efficient ways to compress and deliver streaming media

End
Lab tomorrow: stay tuned, it will be put up today
