Presentation on theme: "Introduction to Multimedia" — Presentation transcript:
Slide 1: Introduction to Multimedia
Slide 2: Introduction
- Multimedia Description
- Why multimedia systems?
- Classification of Media
- Multimedia Systems
- Data Stream Characteristics
Slide 3: Multimedia Description
- Multimedia is an integration of continuous media (e.g., audio, video) and discrete media (e.g., text, graphics, images) through which digital information can be conveyed to the user in an appropriate way.
- Multi: many, much, multiple.
- Medium: an intervening substance through which something is transmitted or carried.
Slide 4: Why Multimedia Computing?
- Application driven, e.g., medicine, sports, entertainment, education.
- Information can often be better represented using audio/video/animation rather than text, images, and graphics alone.
- Information is distributed using computer and telecommunication networks.
- Integration of multiple media places demands on:
  - computation power
  - storage requirements
  - networking requirements
Slide 5: Multimedia Information Systems
- Technical challenges:
  - Sheer volume of data: need to manage huge volumes of data.
  - Timing requirements among components of data computation and communication: must work within given timing constraints; real-time performance is required.
  - Integration requirements: need to process traditional media (text, images) as well as continuous media (audio/video). Media are not always independent of each other; synchronization among the media may be required.
Slide 6: High Data Volume of Multimedia Information
Slide 7: Technology Incentive
- Growth in computational capacity:
  - multimedia workstations with audio/video processing capability
  - dramatic increase in CPU processing power
  - dedicated compression engines for audio, video, etc.
- Rise in storage capacity:
  - large-capacity disks (several gigabytes)
  - increase in storage bandwidth, e.g., disk array technology
- Surge in available network bandwidth:
  - high-speed fiber-optic networks (gigabit networks)
  - fast packet-switching technology
Slide 8: Application Areas
- Residential services:
  - video-on-demand
  - video phone/conferencing systems
  - multimedia home shopping (multimedia catalogs, product demos and presentations)
  - self-paced education
- Business services:
  - corporate training
  - desktop multimedia conferencing, multimedia e-mail
Slide 9: Application Areas (cont.)
- Education:
  - distance education: multimedia repository of class videos
  - access to digital multimedia libraries over high-speed networks
- Science and technology:
  - computational visualization and prototyping
  - astronomy, environmental science
- Medicine:
  - diagnosis and treatment, e.g., multimedia databases that support queries on scanned images, X-rays, assessments, responses, etc.
Slide 10: Classification of Media
- Perception medium: How do humans perceive information from a computer?
  - Through seeing: text, images, video.
  - Through hearing: music, noise, speech.
- Representation medium: How is the computer information encoded?
  - Using formats for representing information: ASCII (text), JPEG (image), MPEG (video).
- Presentation medium: Through which medium is information delivered by the computer, or introduced into the computer?
  - Via I/O tools and devices: paper, screen, speakers (output media); keyboard, mouse, camera, microphone (input media).
Slide 11: Classification of Media (cont.)
- Storage medium: Where will the information be stored?
  - Storage media: floppy disk, hard disk, tape, CD-ROM, etc.
- Transmission medium: Over what medium will the information be transmitted?
  - Information carriers that enable continuous data transmission (networks): wire, coaxial cable, fiber optics.
- Information exchange medium: Which information carrier will be used for information exchange between different places?
  - Direct transmission using computer networks.
  - Combined use of storage and transmission media (e.g., electronic mail).
Slide 12: Media Concepts
- Each medium defines:
  - Representation values: determine the information representation of different media.
    - Continuous representation values (e.g., electromagnetic waves).
    - Discrete representation values (e.g., text characters in digital form).
  - Representation space: determines the surrounding where the media are presented.
    - Visual representation space (e.g., paper, screen).
    - Acoustic representation space (e.g., stereo).
Slide 13: Media Concepts (cont.)
- Representation dimensions of a representation space are:
  - Spatial dimensions:
    - two-dimensional (2D graphics)
    - three-dimensional (holography)
  - Temporal dimensions:
    - Time-independent (e.g., a document): discrete media. Information consists of a sequence of individual elements without a time component.
    - Time-dependent (e.g., a movie): continuous media. Information is expressed not only by its individual values but also by their time of occurrence.
Slide 14: Multimedia Systems
- Qualitative and quantitative evaluation of multimedia systems:
  - Combination of media: continuous and discrete.
  - Levels of media independence: some media types (audio/video) may be tightly coupled, others may not.
  - Computer-supported integration: timing, spatial, and semantic synchronization.
  - Communication capability.
Slide 15: Data Streams
- In distributed multimedia communication systems, data of discrete and continuous media are broken into individual units (packets) and transmitted.
- Data stream: a sequence of individual packets transmitted in a time-dependent fashion.
- Transmission of information carrying different media leads to data streams with varying features:
  - Asynchronous
  - Synchronous
  - Isochronous
Slide 16: Data Stream Characteristics
- Asynchronous transmission mode: provides communication with no time restriction. Packets reach the receiver as quickly as possible, e.g., protocols for e-mail transmission.
- Synchronous transmission mode: defines a maximum end-to-end delay for each packet of a data stream. May require intermediate storage. E.g., an audio connection established over a network.
- Isochronous transmission mode: defines both a maximum and a minimum end-to-end delay for each packet of a data stream. The delay jitter of individual packets is bounded, and intermediate storage requirements are reduced. E.g., transmission of video over a network.
Slide 17: Data Stream Characteristics (cont.)
- Data stream characteristics for continuous media can be based on:
  - Time intervals between complete transmissions of consecutive packets:
    - Strongly periodic data streams: constant time interval.
    - Weakly periodic data streams: intervals follow a periodic function with finite period.
    - Aperiodic data streams.
  - Data size (amount of data in consecutive packets):
    - Strongly regular data streams: constant amount of data.
    - Weakly regular data streams: data size varies periodically with time.
    - Irregular data streams.
  - Continuity:
    - Continuous data streams.
    - Discrete data streams.
Slide 18: Classification based on time intervals
[diagram: strongly periodic data stream (constant interval T), weakly periodic data stream (repeating intervals T1, T2, T3), aperiodic data stream]
Slide 19: Classification based on packet size
[diagram: strongly regular data stream (constant packet size D), weakly regular data stream (periodically varying sizes D1, D2, D3), irregular data stream]
Slide 20: Classification based on continuity
[diagram: continuous data stream (packets D1-D4 transmitted without gaps), discrete data stream (gaps between packets D1-D4)]
Slide 21: Broadband Multimedia Communications: Audio/Image/Video Representation
Slide 22: Introduction
- Basic Sound Concepts
- Computer Representation of Sound
- Basic Image Concepts
- Image Representation and Formats
- Video Signal Representation
- Color Encoding
- Computer Video Format
Slide 23: Basic Sound Concepts
- Acoustics: the study of the generation, transmission, and reception of sound waves.
- Sound is produced by the vibration of matter.
  - During vibration, pressure variations are created in the surrounding air molecules.
  - The pattern of oscillation creates a waveform; the wave is made up of pressure differences.
  - The waveform repeats the same shape at intervals, called a period.
    - Periodic sound sources (more periodic, more musical): musical instruments, wind, etc.
    - Aperiodic sound sources (less periodic): unpitched percussion, sneeze, cough.
Slide 24: Basic Sound Concepts (cont.)
- Sound transmission: sound is transmitted by molecules bumping into each other; it is a continuous wave that travels through the air.
- Sound is detected by measuring the pressure level at a point.
- Receiving: a microphone in a sound field moves according to the varying pressure exerted on it; a transducer converts this energy into energy of another form, a voltage level (electrical energy).
- Sending: a speaker transforms electrical energy into sound waves.
Slide 25: Frequency of a Sound Wave
[diagram: air pressure vs. time, showing the period and the amplitude of the wave]
- Frequency is the reciprocal of the period.
Slide 26: Basic Sound Concepts (cont.)
- Wavelength: the distance traveled in one cycle. At 20 Hz the wavelength is about 56 feet; at 20 kHz it is about 0.7 inch.
- Frequency: the number of periods per second (measured in hertz, cycles/second); the reciprocal of the period.
  - Human hearing frequency range: 20 Hz to 20 kHz; voice is about 500 Hz to 2 kHz.
  - Infrasound: 0-20 Hz
  - Human range: 20 Hz-20 kHz
  - Ultrasound: 20 kHz-1 GHz
  - Hypersound: 1 GHz-10 THz
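The wavelength figures above follow from wavelength = speed of sound / frequency. A quick sketch (assuming sound travels at roughly 343 m/s in air at room temperature; the conversion constants are approximate):

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at ~20 °C

def wavelength_m(frequency_hz: float) -> float:
    """Wavelength in meters for a given frequency in Hz: lambda = v / f."""
    return SPEED_OF_SOUND_M_S / frequency_hz

FEET_PER_METER = 3.281
INCHES_PER_METER = 39.37

print(wavelength_m(20) * FEET_PER_METER)        # ~56 feet at 20 Hz
print(wavelength_m(20_000) * INCHES_PER_METER)  # ~0.7 inch at 20 kHz
```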
Slide 27: Basic Sound Concepts (cont.)
- Amplitude of a sound: the measure of the displacement of the air-pressure wave from its mean (quiescent) state.
- Subjectively heard as loudness; measured in decibels:
  - 0 dB: essentially no sound heard
  - 35 dB: quiet home
  - 70 dB: noisy street
  - 120 dB: discomfort
Slide 28: Computer Representation of Audio
- A transducer converts pressure to voltage levels.
- The analog signal is converted into a digital stream by discrete sampling: discretization both in time and in amplitude (quantization).
- The computer measures the amplitude of the waveform at regular time intervals to produce a series of numbers (samples).
Slide 29: Computer Representation of Audio (cont.)
- Sampling rate: the rate at which a continuous wave is sampled (measured in hertz). CD standard: 44,100 Hz; telephone quality: 8,000 Hz.
- There is a direct relationship between sampling rate, sound quality (fidelity), and storage space.
- Question: How often do you need to sample a signal to avoid losing information?
- Answer: It depends on how fast the signal is changing; in practice, twice per cycle of the highest frequency present (this follows from the Nyquist sampling theorem). Note the distinction between the playback rate and the capturing (sampling) rate.
Slide 30: Sampling
[diagram: waveform with sample points and their sample heights]
Slide 31: Nyquist Sampling Theorem
- If a signal f(t) is sampled at regular intervals of time and at a rate higher than twice the highest significant signal frequency, then the samples contain all the information of the original signal.
- Example: CD-quality audio must reproduce frequencies up to about 22,050 Hz. By the Nyquist theorem, the signal must be sampled at twice that frequency, so the sampling frequency is 44,100 Hz.
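Undersampling can be demonstrated numerically. In this sketch (toy frequencies chosen for illustration), a 3 Hz sine sampled at only 4 Hz, below its Nyquist rate of 6 Hz, yields exactly the same samples as a negated 1 Hz sine: the information is lost to aliasing.

```python
import math

def sample(freq_hz: float, rate_hz: float, n: int) -> list:
    """Sample a unit-amplitude sine of frequency freq_hz at rate_hz, n samples."""
    return [math.sin(2 * math.pi * freq_hz * k / rate_hz) for k in range(n)]

rate = 4  # Hz: below the Nyquist rate (6 Hz) for a 3 Hz tone
alias = [round(a, 9) for a in sample(3, rate, 8)]
true1 = [round(-a, 9) for a in sample(1, rate, 8)]
# The undersampled 3 Hz tone is indistinguishable from a (negated) 1 Hz tone.
print(alias == true1)  # True
```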
Slide 32: Data Rate of a Channel
- Noiseless channel: Nyquist proved that if an arbitrary signal is run through a low-pass filter of bandwidth H, the filtered signal can be completely reconstructed from only 2H (exact) samples per second. If the signal consists of V discrete levels, Nyquist's theorem states:
  - max data rate = 2H log2(V) bits/sec
  - E.g., a noiseless 3 kHz channel carrying a binary signal (V = 2) cannot transmit at a rate exceeding 6000 bits per second.
- Noisy channel: the thermal noise present is measured by the ratio of the signal power S to the noise power N (signal-to-noise ratio S/N).
  - max data rate = H log2(1 + S/N) bits/sec
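Both capacity formulas are easy to check numerically; a small sketch (the 30 dB phone-line figure is an illustrative assumption, not from the slide):

```python
import math

def nyquist_capacity(bandwidth_hz: float, levels: int) -> float:
    """Max data rate of a noiseless channel: 2 * H * log2(V) bits/sec."""
    return 2 * bandwidth_hz * math.log2(levels)

def shannon_capacity(bandwidth_hz: float, snr: float) -> float:
    """Max data rate of a noisy channel: H * log2(1 + S/N) bits/sec."""
    return bandwidth_hz * math.log2(1 + snr)

print(nyquist_capacity(3000, 2))            # 6000.0 bits/sec, the slide's example
print(round(shannon_capacity(3000, 1000)))  # ~29902 bits/sec for a 3 kHz line at 30 dB SNR
```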
Slide 33: Quantization
- Sample precision: the resolution of a sample value.
- Quantization depends on the number of bits used to measure the height of the waveform.
- 16-bit CD-quality quantization yields 64K (65,536) values.
- Audio formats are described by sample rate and quantization:
  - Voice quality: 8-bit quantization, 8000 Hz mono (8 KBytes/sec)
  - 22 kHz 8-bit mono (22 KBytes/sec) and stereo (44 KBytes/sec)
  - CD quality: 16-bit quantization, 44,100 Hz linear stereo (about 176 KBytes/sec)
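The byte rates above all come from the same product of sample rate, sample size, and channel count; a minimal sketch:

```python
def audio_data_rate_bytes(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Uncompressed audio data rate in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels

print(audio_data_rate_bytes(8000, 8, 1))    # 8000 B/s: telephone-quality voice
print(audio_data_rate_bytes(22050, 8, 2))   # 44100 B/s: 22 kHz 8-bit stereo
print(audio_data_rate_bytes(44100, 16, 2))  # 176400 B/s: CD-quality linear stereo
```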
Slide 34: Quantization and Sampling
[diagram: sampled waveform with sample heights quantized to levels 0.25, 0.5, 0.75]
Slide 35: Audio Formats
- Audio formats are characterized by four parameters:
  - Sample rate: the sampling frequency.
  - Encoding: the audio data representation.
    - μ-law encoding corresponds to CCITT G.711, the standard for voice data in telephone companies in the USA, Canada, and Japan.
    - A-law encoding is used for telephony elsewhere.
    - A-law and μ-law are sampled at 8000 samples/second with a precision of 12 bits, compressed to 8-bit samples.
    - Linear Pulse Code Modulation (PCM): uncompressed audio where samples are proportional to the audio signal voltage.
  - Precision: the number of bits used to store an audio sample. μ-law and A-law use 8-bit precision; PCM can be stored at various precisions, 16-bit PCM being common.
  - Channels: multiple channels of audio may be interleaved at sample boundaries.
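The μ-law companding idea behind G.711 can be sketched with its continuous approximation (μ = 255). This is illustrative only: the actual G.711 codec uses a segmented 8-bit table, not these floating-point formulas.

```python
import math

MU = 255  # companding parameter used in North American/Japanese telephony

def mu_law_compress(x: float) -> float:
    """Map a linear sample in [-1, 1] to a companded value in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# Small amplitudes get proportionally more of the output range, which is
# how 8 companded bits can cover roughly 12 bits of linear dynamic range.
print(round(mu_law_compress(0.01), 3))  # 0.228: a 1% input uses ~23% of the output range
```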
Slide 36: Audio Formats (cont.)
- Available on UNIX: au (Sun file format), wav (Microsoft RIFF/waveform format), al (raw A-law), u (raw μ-law), ...
- Available on Windows-based systems (RIFF formats): wav, midi (file format for standard MIDI files), avi.
- RIFF (Resource Interchange File Format): a tagged file format (similar to TIFF) that allows multiple applications to read files in RIFF format.
- Also: RealAudio, MP3 (MPEG Audio Layer 3).
Slide 37: Computer Representation of Voice
- The best-known technique for voice digitization is pulse code modulation (PCM).
  - Consists of the two-step process of sampling and quantization.
  - Based on the sampling theorem: if voice data are limited to 4000 Hz, then PCM's 8000 samples per second are sufficient for the input voice signal.
  - Sampling produces analog samples which must be converted to a digital representation: each analog sample is assigned a binary code, and each sample is approximated by being quantized.
Slide 38: Computer Representation of Music
- MIDI (Musical Instrument Digital Interface): the standard that manufacturers of musical instruments use so that instruments can communicate musical information via computers.
- The MIDI interface consists of:
  - Hardware: the physical connection between instruments; specifies a MIDI port (which plugs into the computer's serial port) and a MIDI cable.
  - Data format: includes the instrument specification, the notion of the beginning and end of a note, frequency, and sound volume. Data are grouped into MIDI messages that each specify a musical event.
- An instrument that satisfies both parts is a MIDI device (e.g., a synthesizer).
- MIDI software applications include music recording and performance applications, musical notation and printing applications, music education, etc.
Slide 39: Computer Representation of Speech
- The human ear is most sensitive in the range 600 Hz to 6000 Hz.
- Speech generation:
  - Real-time signal generation allows transformation of text into speech without lengthy processing.
  - Limited vs. large vocabulary (depends on the application).
  - Must be understandable and must sound natural.
- Speech analysis:
  - Identification and verification: recognize speakers using an acoustic fingerprint.
  - Recognition and understanding: analyze what has been said.
  - How something was said: used in lie detectors.
- Speech transmission: coding, recognition, and synthesis methods that achieve a minimal data rate for a given quality.
Slide 40: Basic Concepts (Digital Image Representation)
- An image is a spatial representation of an object, a 2D or 3D scene, etc.
- Abstractly, an image is a continuous function defined over a rectangular region of a plane:
  - intensity image: values proportional to the radiant energy received by a sensor/detector.
  - range image: values give the line-of-sight distance from the sensor position.
- An image can be thought of as a function whose resulting values are the light intensity at each point over a planar region.
Slide 41: Digital Image Representation
- For computer representation, the function (e.g., intensity) must be sampled at discrete intervals, and the sampled intensity values must be quantized into discrete levels.
  - The points at which an image is sampled are called picture elements, or pixels.
  - Resolution specifies the distance between sample points, i.e., the accuracy.
- A digital image is represented by a matrix of numeric values, each representing a quantized intensity value.
  - I(r, c) is the intensity value at the position corresponding to row r and column c of the matrix.
  - An intensity value can be represented by 1 bit for black-and-white (binary-valued) images, 8 bits for monochrome imagery encoding grayscale levels, or 24 bits for color (RGB).
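A minimal sketch of the I(r, c) matrix view of a digital image, using a hypothetical 4x4 8-bit grayscale image (the pixel values are made up for illustration):

```python
# A 4x4 grayscale image as a matrix of 8-bit quantized intensities (0-255).
image = [
    [  0,  64, 128, 255],
    [ 32,  96, 160, 224],
    [ 16,  80, 144, 208],
    [  8,  72, 136, 200],
]

def intensity(img, r, c):
    """I(r, c): the quantized intensity at row r, column c."""
    return img[r][c]

rows, cols = len(image), len(image[0])
bits_per_pixel = 8
storage_bits = rows * cols * bits_per_pixel  # raw storage for the matrix

print(intensity(image, 0, 3))  # 255: the brightest pixel in row 0
print(storage_bits)            # 128 bits for this tiny image
```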
Slide 42: Image Formats
- Captured image format: the format obtained from an image frame grabber.
- Important parameters:
  - Spatial resolution (pixels x pixels).
  - Color encoding (quantization level of a pixel: 8-bit, 24-bit).
- E.g., the "SunVideo" video digitizer board captures pictures of 320 x 240 pixels with 8-bit grayscale or color resolution; Parallax X Video offers a resolution of 640 x 480 pixels and a 24-bit frame buffer.
Slide 43: Image Formats (cont.)
- Stored image format: the format in which images are stored.
- Images are stored as a 2D array of values, where each value represents the data associated with one pixel of the image.
  - Bitmap: the value is a binary digit.
  - For a color image, the value may be a triple of RGB component intensities, three indices into a table of RGB intensities, an index into some color data structure, etc.
- Image file formats include GIF (Graphics Interchange Format), X11 bitmap, PostScript, JPEG, and TIFF.
Slide 44: Basic Concepts (Video Representation)
- The human eye views video; inherent properties of the eye determine essential conditions for video systems.
- Video signal representation consists of three aspects:
  - Visual representation: the objective is to offer the viewer a sense of presence in the scene and of participation in the events portrayed.
  - Transmission: video signals are transmitted to the receiver through a single television channel.
  - Digitization: analog-to-digital conversion, sampling of gray/color levels, quantization.
Slide 45: Visual Representation
- The televised image should convey the spatial and temporal content of the scene.
  - Vertical detail and viewing distance:
    - Aspect ratio: the ratio of picture width to height (4/3 = 1.33 is the conventional aspect ratio).
    - Viewing ratio: viewing distance / picture height.
  - Horizontal detail and picture width:
    - Picture width (conventional TV service) = 4/3 x picture height.
  - Total detail content of the image:
    - Number of pixels presented separately in the picture height = vertical resolution.
    - Number of pixels in the picture width = vertical resolution x aspect ratio.
    - Their product equals the total number of picture elements in the image.
Slide 46: Visual Representation (cont.)
- Perception of depth:
  - In natural vision, depth is determined by the angular separation of the images received by the viewer's two eyes.
  - In the flat image of TV, the focal length of the lenses and changes in the depth of focus in a camera influence depth perception.
- Luminance and chrominance:
  - Color vision is achieved through three signals, proportional to the relative intensities of red, green, and blue.
  - Color encoding during transmission uses one luminance and two chrominance signals.
- Temporal aspects of resolution:
  - Motion is resolved as a rapid succession of slightly different frames.
  - For visual realism, the repetition rate must be high enough (a) to guarantee smooth motion and (b) so that persistence of vision extends over the interval between flashes (the light cutoff between frames).
Slide 47: Visual Representation (cont.)
- Continuity of motion:
  - Motion continuity is achieved at a minimum of 15 frames per second, is good at 30 frames/sec; some technologies allow 60 frames/sec.
  - The NTSC standard provides 30 frames/sec (29.97 Hz repetition rate).
  - The PAL standard provides 25 frames/sec (25 Hz repetition rate).
- Flicker effect:
  - Flicker is a periodic fluctuation of brightness perception. To avoid it, about 50 refresh cycles/sec are needed; display devices have a display refresh buffer for this.
- Temporal aspects of video bandwidth depend on the rate at which the visual system can scan pixels and on human eye scanning capabilities.
Slide 48: Transmission (NTSC)
- Video bandwidth is computed as: 700/2 pixels per line x 525 lines per picture x 30 pictures per second. The visible number of lines is 480.
- The intermediate delay between frames is 1000 ms / 30 fps = 33.3 ms.
- The display time per line is 33.3 ms / 525 lines = 63.5 microseconds.
- The transmitted signal is a composite signal: 4.2 MHz for the basic signal, and about 5 MHz in total once the color, intensity, and synchronization information are included.
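The NTSC timing figures above can be reproduced directly from the frame rate and line count:

```python
# NTSC timing figures from the slide, computed explicitly.
LINES_PER_FRAME = 525
FRAMES_PER_SEC = 30

frame_delay_ms = 1000 / FRAMES_PER_SEC                   # delay between frames
line_time_us = frame_delay_ms * 1000 / LINES_PER_FRAME   # display time per line

print(round(frame_delay_ms, 1))  # 33.3 ms per frame
print(round(line_time_us, 1))    # 63.5 microseconds per line
```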
Slide 49: Color Encoding
- A camera creates three signals: RGB (red, green, and blue).
- For transmission of the visual signal, three signals are used: one luminance (brightness, the basic signal) and two chrominance (color) signals.
- In NTSC, luminance and chrominance are interleaved.
- Goal at the receiver: separate the luminance from the chrominance components and avoid interference between them prior to recovery of the primary color signals for display.
Slide 50: Color Encoding (cont.)
- RGB signal (separate signal coding): consists of three separate signals for the red, green, and blue colors. Other colors are coded as a combination of the primary colors; (R + G + B = 1) gives neutral white.
- YUV signal: a separate brightness (luminance) component Y plus color information (two chrominance signals U and V):
  - Y = 0.3R + 0.59G + 0.11B
  - U = (B - Y) * 0.493
  - V = (R - Y) * 0.877
- The resolution of the luminance component is more important than that of U and V; the coding ratio of Y, U, V is 4:2:2.
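The YUV equations above translate directly into code; a sketch using the slide's (rounded) coefficients:

```python
def rgb_to_yuv(r: float, g: float, b: float) -> tuple:
    """Convert normalized RGB (each in [0, 1]) to YUV with the slide's coefficients."""
    y = 0.3 * r + 0.59 * g + 0.11 * b
    u = (b - y) * 0.493
    v = (r - y) * 0.877
    return y, u, v

# White (1, 1, 1) has full luminance and (near-)zero chrominance,
# since the Y coefficients sum to 1.
y, u, v = rgb_to_yuv(1.0, 1.0, 1.0)
print(round(y, 3))  # 1.0
```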
Slide 51: Color Encoding (cont.)
- YIQ signal: similar to YUV; used by the NTSC format:
  - Y = 0.3R + 0.59G + 0.11B
  - I = 0.60R - 0.28G - 0.32B
  - Q = 0.21R - 0.52G + 0.31B
- Composite signal: all information is composed into one signal. To decode it, modulation methods are needed to eliminate interference between the luminance and chrominance components.
Slide 52: Digitization
- Refers to sampling the gray/color level in the picture at an M x N array of points.
- Once sampled, the points are quantized into pixels: each sampled value is mapped to an integer. The quantization level depends on the number of bits used to represent the resulting integer, e.g., 8 bits per pixel or 24 bits per pixel.
- Digitizing video also requires creating motion: pictures are digitized over time to obtain a sequence of digital images per second that approximates analog motion video.
Slide 53: Computer Video Format
- Video digitizer: an A/D converter.
- Important parameters resulting from a digitizer: digital image resolution, quantization, and frame rate.
- E.g., Parallax X Video: the camera produces an NTSC signal and the video board digitizes it. The resulting video has 640 x 480 pixels spatial resolution and 24 bits per pixel, at 20 fps (lower image resolution allows more fps).
- The output of digital video goes to raster displays with large video RAM memories; a color lookup table is used for the presentation of color.
Slide 54: Digital Transmission Bandwidth
- Bandwidth requirements for images:
  - Raw image transmission bandwidth = size of image = spatial resolution x pixel resolution.
  - Compressed image: depends on the compression scheme.
  - Symbolic image transmission bandwidth = size of the instructions and primitives carrying the graphics variables.
- Bandwidth requirements for video:
  - Uncompressed video = image size x frame rate.
  - Compressed video: depends on the compression scheme.
  - E.g., HDTV-quality video: 345.6 Mbps uncompressed; about 34 Mbps compressed using MPEG (with some loss of quality).
Slide 55: Broadband Multimedia Communications: Multimedia Compression Techniques
Slide 57: Coding Requirements
- Storage requirements:
  - Uncompressed audio: 8 kHz, 8-bit quantization implies 64 Kbits of storage per second.
  - CD-quality audio: 44.1 kHz, 16-bit quantization implies storing 705.6 Kbits/sec.
  - PAL video format: 640 x 480 pixels, 24-bit quantization, 25 fps implies storing 184,320,000 bits/sec = 23,040,000 bytes/sec.
- Bandwidth requirements:
  - Uncompressed audio: 64 Kbps.
  - CD-quality audio: 705.6 Kbps.
  - PAL video format: 184,320,000 bits/sec.
- COMPRESSION IS REQUIRED!
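The slide's storage figures can be verified with the same rate arithmetic used for audio and video throughout this deck:

```python
def audio_bits_per_sec(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Uncompressed single-channel audio rate in bits per second."""
    return sample_rate_hz * bits_per_sample

def video_bits_per_sec(width: int, height: int, bits_per_pixel: int, fps: int) -> int:
    """Uncompressed video rate: image size x frame rate, in bits per second."""
    return width * height * bits_per_pixel * fps

print(audio_bits_per_sec(8000, 8))           # 64000 bits/sec: voice quality
print(audio_bits_per_sec(44_100, 16))        # 705600 bits/sec: CD quality (per channel)
print(video_bits_per_sec(640, 480, 24, 25))  # 184320000 bits/sec = 23,040,000 bytes/sec
```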
Slide 58: Coding Format Examples
- JPEG: for still images.
- H.261/H.263: for video conferencing, music, and speech (dialogue-mode applications).
- MPEG-1, MPEG-2, MPEG-4: for audio/video playback, video-on-demand (retrieval-mode applications).
- DVI: for still and continuous video applications, with two modes of compression:
  - Presentation Level Video (PLV): high-quality compression, but very slow; suitable for applications distributed on CD-ROMs.
  - Real-Time Video (RTV): lower-quality compression, but fast; used in video conferencing applications.
Slide 59: Coding Requirements (cont.)
- Dialogue-mode applications:
  - End-to-end delay (EED) should not exceed 150-200 ms.
  - Face-to-face applications need an EED of 50 ms (including compression and decompression).
- Retrieval-mode applications:
  - Fast-forward and rewind data retrieval with simultaneous display (e.g., fast search for information in a multimedia database).
  - Random access to single images and audio frames; the access time should be less than 0.5 sec.
  - Decompression of images, video, and audio should not be linked to other data units, which allows random access and editing.
Slide 60: Coding Requirements (cont.)
- Requirements for both dialogue- and retrieval-mode applications:
  - Support for scalable video in different systems.
  - Support for various audio and video rates.
  - Synchronization of audio and video streams (lip synchronization).
  - Economy of solutions: compression in software implies a cheaper, slower, lower-quality solution; compression in hardware implies a more expensive, faster, higher-quality solution.
  - Compatibility: e.g., tutoring systems available on CD should run on different platforms.
Slide 61: Classification of Compression Techniques
- Entropy coding:
  - Lossless encoding, used regardless of the media's specific characteristics.
  - The data are taken as a simple digital sequence; the decompression process regenerates the data completely.
  - E.g., run-length coding, Huffman coding, arithmetic coding.
- Source coding:
  - Lossy encoding that takes into account the semantics of the data.
  - The degree of compression depends on the data content, e.g., content-prediction techniques such as DPCM and delta modulation.
- Hybrid coding (used by most multimedia systems):
  - Combines entropy with source encoding.
  - E.g., JPEG, H.263, DVI (RTV and PLV), MPEG-1, MPEG-2, MPEG-4.
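Run-length coding, the simplest of the entropy-coding examples above, can be sketched in a few lines. This is a toy character-oriented version to show the lossless round trip, not the format of any particular standard:

```python
def rle_encode(data: str) -> list:
    """Encode a string as (symbol, run_length) pairs."""
    runs = []
    for ch in data:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((ch, 1))              # start a new run
    return runs

def rle_decode(runs: list) -> str:
    """Regenerate the original data completely (entropy coding is lossless)."""
    return "".join(ch * n for ch, n in runs)

encoded = rle_encode("aaaabbbcca")
print(encoded)                              # [('a', 4), ('b', 3), ('c', 2), ('a', 1)]
print(rle_decode(encoded) == "aaaabbbcca")  # True: decompression is exact
```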
Slide 62: Steps in Compression
- Picture preparation:
  - Analog-to-digital conversion; generation of an appropriate digital representation.
  - Division of the image into 8 x 8 blocks; fixing the number of bits per pixel.
- Picture processing (the compression algorithm):
  - Transformation from the time to the frequency domain, e.g., DCT.
  - Motion vector computation for digital video.
- Quantization:
  - Mapping real numbers to integers (reduction in precision), e.g., μ-law encoding: 12 bits for real values, 8 bits for integer values.
- Entropy coding:
  - Compress the resulting sequential digital stream without loss.
Slide 64: Types of Compression
- Symmetric compression:
  - The same time is needed for the encoding and decoding phases.
  - Used for dialogue-mode applications.
- Asymmetric compression:
  - The compression process is performed once, when enough time is available, so compression can take longer; decompression is performed frequently and must be done fast.
  - Used for retrieval-mode applications.
Slide 67: Additional Requirements - JPEG
- The JPEG implementation is independent of image size and applicable to any image and pixel aspect ratio.
- The image content may be of any complexity (with any statistical characteristics).
- JPEG should achieve a very good compression ratio and good image quality.
- From the point of view of the processing complexity of a software solution: JPEG should run on as many available platforms as possible.
- Sequential decoding (line by line) and progressive decoding (refinement of the whole image) should both be possible.
Slide 68: Variants of Image Compression
- Four different modes:
  - Lossy sequential DCT-based mode: the baseline process that must be supported by every JPEG implementation.
  - Expanded lossy DCT-based mode: enhancements to the baseline process.
  - Lossless mode: low compression ratio, but allows perfect reconstruction of the original image.
  - Hierarchical mode: accommodates images of different resolutions.
Slide 70: Broadband Multimedia Communications: MPEG Compression
Slide 71: Introduction
- General information about MPEG
- The MPEG video standard
- The MPEG audio standard
- MPEG systems: multiplexing of video/audio data streams
Slide 72: General Information
- MPEG-1 achieves data compression to 1.5 Mbps, the data rate of audio CDs and DATs (Digital Audio Tapes).
- MPEG explicitly considers functionalities of other standards, e.g., it uses JPEG.
- MPEG defines standard video coding, audio coding, and system data streams with synchronization.
- MPEG core technology includes many different patents; the MPEG committee sets the technical standards.
Slide 73: General Information (cont.)
- An MPEG stream provides more information than a data stream compressed according to the JPEG standard:
  - Aspect ratio: 14 aspect ratios can be encoded. 1:1 corresponds to computer graphics, 4:3 corresponds to 702 x 575 pixels (TV format), 16:9 corresponds to the 625/525-line HDTV formats.
  - Refresh frequency: 8 frequencies are encoded: 23.976, 24, 25, 29.97, 30, 50, 59.94, and 60 Hz.
- Other issues with frame rate:
  - Each frame must be built within a maximum of 41.7 ms (33 ms) to keep a display rate of 24 fps (30 fps).
  - There is no need for (or possibility of) defining MCUs in MPEG, which implies a sequential, non-interleaved order.
  - For MPEG, progressive display has no advantage over sequential display.
Slide 74: MPEG Overview
- MPEG exploits the temporal (i.e., frame-to-frame) redundancy present in all video sequences.
- Two categories: intra-frame and inter-frame encoding.
  - DCT-based compression for the reduction of spatial redundancy (similar to JPEG).
  - Block-based motion compensation for exploiting temporal redundancy:
    - causal (predictive coding): the current picture is modeled as a transformation of the picture at some previous time.
    - non-causal (interpolative coding): uses both past and future references.
Slide 75: MPEG Image Preparation: Motion Representation
- Predictive and interpolative coding give good compression but require storage and extra information; they often make sense for parts of an image rather than the whole image.
- Each image is divided into areas called macro-blocks (motion compensation units).
  - Each macro-block is partitioned into 16 x 16 pixels for luminance and 8 x 8 for each of the chrominance components.
  - The choice of macro-block size is a tradeoff between the gain from motion compensation and the cost of motion estimation.
  - Macro-blocks are useful for compression based on motion estimation.
Slide 76: MPEG Video Processing
- An MPEG stream includes four types of image coding for video processing:
  - I-frames (intra-coded frames): access points for random access; yield moderate compression.
  - P-frames (predictive-coded frames): encoded with reference to a previous I- or P-frame.
  - B-frames (bidirectionally predictive-coded frames): encoded using the previous and next I- and P-frames; maximum compression.
  - D-frames (DC-coded frames).
- Motivation for the frame types:
  - Demand for an efficient coding scheme with fast random access.
  - To achieve a high compression rate, the temporal redundancies of subsequent pictures (i.e., inter-frame redundancy) must be exploited.
Slide 77: MPEG Audio Encoding Steps
[diagram: filter bank -> quantization (bit/noise allocation, driven by the psychoacoustic model) -> entropy coder -> multiplexer -> compressed data]
- Filter bank: transformation from the time to the frequency domain, into 32 subbands.
- Psychoacoustic model: determines the permissible noise level per subband and drives bit/noise allocation. If the permissible noise level is low, finer quantization is applied; if it is high, rougher quantization is applied.
- Entropy coder: Huffman coding of the quantized values.
- Multiplexer: assembles the compressed data stream.
Slide 78: MPEG System Data Stream
- The video stream is interleaved with audio.
- The video stream consists of 6 layers:
  - Sequence layer: video parameters (width, height, aspect ratio, picture rate), bitstream parameters (bit rate, buffer size), quantization tables (for intra-coded and inter-coded blocks).
  - Group-of-pictures layer: time code (hours, minutes, seconds).
  - Picture layer: type (I, P, B), buffer parameter (decoder's buffer size).
  - Slice layer: vertical position (which line does this slice start on?), Qscale (how is the quantization table scaled in this slice?).
  - Macro-block layer: encoding parameters, including information about motion vectors.
  - Block layer.