Multimedia Retrieval Architecture Anandi Giridharan Electrical Communication Engineering, Indian Institute of Science, Bangalore – 560012, India Multimedia.

Multimedia Storage Techniques

Media and Storage Requirements
Characteristics of multimedia data and their storage requirements:
– Multimedia data tends to be voluminous. For example, 100 minutes of video compressed with the JPEG compression algorithm requires about 9 GB of storage space.
– Most storage systems do not provide such large contiguous regions.
– Continuous media data, such as video and audio, have timing characteristics associated with them.
– Real-time data must be captured without losing any portion of it, which imposes timing constraints on multimedia data.
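The 9 GB figure can be turned into a sustained data rate that the storage system has to deliver. The sketch below is only illustrative: it assumes decimal gigabytes (10^9 bytes), and the function name is made up for this example.

```python
def average_bit_rate(size_bytes: float, duration_s: float) -> float:
    """Return the average bit rate, in bits per second, of a stored clip."""
    return size_bytes * 8 / duration_s

size_bytes = 9 * 10**9   # 9 GB, assuming decimal gigabytes
duration_s = 100 * 60    # 100 minutes of video
rate = average_bit_rate(size_bytes, duration_s)
print(f"{rate / 1e6:.1f} Mbit/s sustained")  # 12.0 Mbit/s sustained
```

Under these assumptions the disk must sustain roughly 12 Mbit/s for the whole playback, which is where the timing constraints come from.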

Figure: Media requirements of multimedia (MM) applications and their storage space.

Multimedia Standards
A standard implies consistency and conformity, which facilitate interoperability and compatibility. Standards in computing are developed to solve problems:
– Interoperability: allowing systems to communicate with each other (e.g., TCP/IP)
– Portability: allowing software to work on different systems (e.g., Java)
– Data exchange: allowing data to be transferred between different systems (e.g., JPEG)
Factors to consider: lifetime, portability and costs.

Storage Structures of Video Data
In digital video, four types of control information have to be considered for multimedia content to play smoothly: frame rate, color resolution, spatial resolution and image quality.
Control Information
– Frame rate: Video is made up of 30 (or 24) pictures, or frames, for every second of video. Each frame can be split in half (odd lines and even lines) to form what are called fields.
– Interlaced video: When a television set displays its analogue video signal, it displays the odd lines (the odd field) first and then the even lines (the even field).
– Non-interlaced video: A computer monitor uses "progressive scan" to update the screen, displaying each line in sequence from top to bottom.
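The split of a frame into odd and even fields can be sketched in a few lines. This is a minimal illustration, assuming a frame is simply a list of scan lines numbered from 1; the function names are hypothetical.

```python
def split_fields(frame):
    """Split a frame (list of scan lines) into its odd and even fields.

    Lines are numbered from 1, so the odd field holds lines 1, 3, 5, ...
    """
    odd_field = frame[0::2]
    even_field = frame[1::2]
    return odd_field, even_field

def weave(odd_field, even_field):
    """Interleave two fields back into one progressive frame."""
    frame = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame.extend([odd_line, even_line])
    # Keep the trailing odd line when the frame has an odd number of lines.
    if len(odd_field) > len(even_field):
        frame.append(odd_field[-1])
    return frame

frame = [f"line {n}" for n in range(1, 7)]
odd, even = split_fields(frame)
print(odd)   # ['line 1', 'line 3', 'line 5']
print(even)  # ['line 2', 'line 4', 'line 6']
```

Weaving the two fields back together is the simplest form of deinterlacing; it only gives a clean picture when nothing moved between the two fields.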

Figure: Interlaced video on the left, deinterlaced video on the right. Source: http://www.streaminglearningcenter.com/articles/shooting-for-streaming---progressive-or-interlaced.html

Storage Structures of Video Data
Control Information
– Color resolution: the number of colors that can be displayed on the screen at one time. Common color models are RGB (red-green-blue) and YUV (a luminance component Y for brightness, plus U and V chrominance components for color).
– Spatial resolution: "How big is the picture?", i.e. the number of pixels in each frame.
– Image quality: the video should look acceptable for the application.

Spatial resolution is a parameter that indicates how many pixels are used to represent a real object in digital form. Fig. 2 shows the same color image represented at different spatial resolutions: the flower on the left has a much better resolution than the one on the right.

Video Data Compression
Factors associated with compression:
– Real-time versus non-real-time: Some systems compress to disk, decompress and play back video (30 fps) all in real time, with no delays. Other systems can capture only some of the 30 frames per second, or can play back only some of the frames.
– Symmetrical versus asymmetrical: A system is symmetrical if, when a 640x480 sequence can be played back at 30 fps, it can also be captured, compressed and stored at the same rate. An asymmetrical system is the opposite: compression takes much longer and is more elaborate than decompression.

Compression Ratios
The compression ratio is a numerical comparison of the size of the original video with the size of the compressed video. For example, a 200:1 compression ratio means that for every 200 units of data in the original video, the compressed video needs only 1 unit.
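As a small worked example, the ratio is just the original size divided by the compressed size. The file sizes below are made up for illustration.

```python
def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Return how many units of original data map to one compressed unit."""
    if compressed_size <= 0:
        raise ValueError("compressed size must be positive")
    return original_size / compressed_size

# Hypothetical sizes: a 200 MB clip stored in 1 MB.
ratio = compression_ratio(200_000_000, 1_000_000)
saving = 1 - 1 / ratio
print(f"{ratio:.0f}:1, {saving:.1%} of the space saved")  # 200:1, 99.5% of the space saved
```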

Lossless Versus Lossy
The loss factor determines whether quality is lost between the original image and the image after it has been compressed and decompressed.
– With lossless compression, every single bit of data that was originally in the file remains after the file is uncompressed. All of the information is completely restored.
– Lossy compression reduces a file by permanently eliminating certain information, especially redundant information. When the file is uncompressed, only a part of the original information is still there (although the user may not notice it).
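The difference can be demonstrated on a few bytes. The sketch below uses Python's standard zlib module for the lossless case; the lossy case is a deliberately crude stand-in (dropping low-order bits of each sample), not what any real image or video codec does, and the sample data is made up.

```python
import zlib

samples = bytes(range(256)) * 4  # made-up 8-bit sample data

# Lossless: decompression restores every bit of the input.
packed = zlib.compress(samples)
restored = zlib.decompress(packed)
print(restored == samples)  # True

# Lossy (illustrative only): discard the 4 least significant bits per sample.
def drop_low_bits(data: bytes, dropped_bits: int = 4) -> bytes:
    mask = 0xFF & ~((1 << dropped_bits) - 1)
    return bytes(b & mask for b in data)

lossy = drop_low_bits(samples)
max_error = max(abs(a - b) for a, b in zip(samples, lossy))
print(lossy == samples, max_error)  # False 15
```

The discarded bits cannot be recovered from the lossy data, which is what "permanently eliminating" means in practice.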

Figure: Examples of lossless and lossy (200:1) images decoded from the same file.

Video Data Compression
– Interframe versus intraframe: The intraframe method compresses and stores each video frame as a discrete picture. The interframe method records a reference frame and then only the differences between subsequent frames and that reference.
– Bit rate control: Parameters such as the frame rate and the image quality should be adjustable to suit the application's requirements.
– Selecting a compression technique: Motion JPEG, Motion JPEG 2000, MPEG-1, MPEG-2 and MPEG-4 are internationally recognized formats for the compression of moving pictures. MPEG-7, although part of the same family, standardizes the description of multimedia content rather than its compression.
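A toy version of the interframe idea stores a reference frame in full and, for the next frame, only the pixels that changed. This is a simplified sketch with made-up one-dimensional "frames"; real codecs work on blocks with motion compensation.

```python
def encode_interframe(reference, frame):
    """Record only the (index, new value) pairs that differ from the reference."""
    return [(i, new) for i, (old, new) in enumerate(zip(reference, frame)) if old != new]

def decode_interframe(reference, diffs):
    """Rebuild a frame by applying the recorded differences to the reference."""
    frame = list(reference)
    for i, value in diffs:
        frame[i] = value
    return frame

reference = [10, 10, 10, 50, 50, 10, 10, 10]
next_frame = [10, 10, 10, 10, 50, 50, 10, 10]  # the bright patch moved right by one pixel

diffs = encode_interframe(reference, next_frame)
print(diffs)  # [(3, 10), (5, 50)]
print(decode_interframe(reference, diffs) == next_frame)  # True
```

Only two of the eight pixels need to be stored for the second frame, whereas the intraframe method would store all eight again.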

Data Compression
Data compression converts an input data stream into another stream of smaller size. It is the process of reducing the amount of data needed for storage, typically by use of encoding techniques. Compression helps to:
– reduce storage space
– reduce bandwidth
– lower cost
– enable new applications

Audio Compression
– Predictive encoding: The differences between successive samples are encoded instead of the absolute sample values, resulting in lower bit rates. The compression achieved is not very high.
– Perceptual encoding: This exploits the limitations of the human auditory system, based on studies of how people perceive sound. The ear's sensitivity to sound is not uniform: it is most sensitive between 2 and 4 kHz and less sensitive at higher and lower frequencies. Audio components that fall below the threshold of hearing can be discarded, and some sounds can mask other sounds.
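Predictive encoding in its simplest form is delta encoding: each sample is predicted to equal the previous one, and only the prediction error is stored. The sketch below assumes that simplest predictor and uses made-up sample values.

```python
def delta_encode(samples):
    """Encode each sample as its difference from the previous sample."""
    previous = 0
    deltas = []
    for sample in samples:
        deltas.append(sample - previous)
        previous = sample
    return deltas

def delta_decode(deltas):
    """Rebuild the samples by accumulating the differences."""
    total = 0
    samples = []
    for delta in deltas:
        total += delta
        samples.append(total)
    return samples

samples = [1000, 1003, 1007, 1008, 1006, 1001]
deltas = delta_encode(samples)
print(deltas)  # [1000, 3, 4, 1, -2, -5]
print(delta_decode(deltas) == samples)  # True
```

After the first value, the differences are small numbers that fit in far fewer bits than the absolute samples, which is where the lower bit rate comes from.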

Lossy Audio Compression
Sounds can be masked by other sounds.
– Frequency masking: A loud sound in one frequency range can partially or fully mask another sound in a nearby frequency range.
– Temporal masking: A loud sound can numb our ears for a short duration even after the sound has stopped.

MPEG-1 Audio Compression
– Sampling is done at 32 kHz, 44.1 kHz or 48 kHz; 44.1 kHz gives CD-quality audio.
– The signal is converted from the time domain to the frequency domain using a Fast Fourier Transform. The resulting spectrum is divided into at most 32 frequency bands, each of which is processed separately.
– Frequency ranges that are completely masked are allocated zero bits, those that are partially masked are allocated a small number of bits, and those that are not masked are allocated a large number of bits.
– In the case of stereo, the redundancy between the two similar audio channels is exploited.
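The bit-allocation rule can be sketched as a function of each band's signal-to-mask ratio (how far the signal rises above the masking threshold, in dB). This is a rough illustration, not the standard's actual procedure: it assumes the common rule of thumb that each bit buys about 6.02 dB of signal-to-noise ratio, and the cap of 15 bits and the input values are made up.

```python
import math

def allocate_bits(signal_to_mask_db, max_bits=15):
    """Assign a bit count to each band from its signal-to-mask ratio in dB.

    Assumption: about 6.02 dB of signal-to-noise ratio per bit. Real encoders
    allocate iteratively against a fixed bit budget.
    """
    bits = []
    for smr in signal_to_mask_db:
        if smr <= 0:
            bits.append(0)  # completely masked: nothing audible to encode
        else:
            bits.append(min(max_bits, math.ceil(smr / 6.02)))
    return bits

# Hypothetical ratios for four bands: masked, barely audible, moderate, unmasked.
print(allocate_bits([-5.0, 3.0, 20.0, 60.0]))  # [0, 1, 4, 10]
```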

Video Compression
Video is a temporal combination of frames. Each frame can be considered a still image comprising a spatial combination of pixels. Two principles:
– Joint Photographic Experts Group (JPEG): used to compress images by removing the spatial redundancy that exists in each frame.
– Moving Picture Experts Group (MPEG): used to compress video by removing the temporal redundancy across a set of frames.

JPEG
JPEG involves four steps:
– Block preparation
– Discrete cosine transformation
– Quantization
– Compression

Figure: Phases of JPEG.

Block Preparation
After the video signal is digitized, it is converted to an array of pixels, e.g. 640 x 480 pixels. Each pixel has R, G and B components of 8 bits each, for a total of 24 bits/pixel. Before compression, RGB is converted to:
– Luminance (brightness), to which our eyes are more sensitive
– Chrominance (color), to which our eyes are less sensitive, so it can be sent at lower resolution
The resulting YUV representation is compatible with black-and-white pictures and allows more compression:
Y = 0.30R + 0.59G + 0.11B
U = -0.15R - 0.29G + 0.44B
V = 0.62R - 0.52G - 0.10B
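The conversion can be written directly from the three equations. The sketch below assumes the commonly cited rounded coefficients (U uses -0.15 for R and V uses 0.62 for R), chosen so that each chrominance row sums to zero and a gray pixel carries no color information; treat the exact values as an assumption.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to luminance (Y) and chrominance (U, V)."""
    y = 0.30 * r + 0.59 * g + 0.11 * b
    u = -0.15 * r - 0.29 * g + 0.44 * b
    v = 0.62 * r - 0.52 * g - 0.10 * b
    return y, u, v

# A mid-gray pixel: all the information ends up in Y, and U and V are ~0.
y, u, v = rgb_to_yuv(128, 128, 128)
print(round(y, 6), round(abs(u), 6), round(abs(v), 6))  # 128.0 0.0 0.0

# A pure red pixel has both luminance and chrominance.
print(tuple(round(c, 2) for c in rgb_to_yuv(255, 0, 0)))  # (76.5, -38.25, 158.1)
```

A black-and-white receiver can display the Y component alone, which is the compatibility the slide refers to.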

Discrete Cosine Transform
Each block of 64 pixels (8 x 8) goes through a transformation called the DCT.
– Example 1: A block with uniform intensity has only one DC component; all the other (AC) components are zero.
– Example 2: A block with two different intensities has one DC component and a few AC components; the number of zeros is still large.
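Both examples can be checked with a direct (slow but simple) implementation of the 8 x 8 two-dimensional DCT. This is a minimal sketch using the usual orthonormal DCT-II scaling, which is an assumption here; the pixel values are made up.

```python
import math

N = 8

def dct_2d(block):
    """Naive 2-D DCT-II of an N x N block (orthonormal scaling)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out

def count_nonzero(coeffs, eps=1e-9):
    """Count coefficients that are not zero up to floating-point noise."""
    return sum(1 for row in coeffs for value in row if abs(value) > eps)

uniform = [[20] * N for _ in range(N)]                 # one intensity
two_tone = [[10] * 4 + [30] * 4 for _ in range(N)]     # left half dark, right half bright

print(count_nonzero(dct_2d(uniform)))    # 1  (the DC component only)
print(count_nonzero(dct_2d(two_tone)))   # 5  (DC plus four AC components)
```

In both cases at least 59 of the 64 coefficients are zero, which is what the later stages exploit.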

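Quantization
Each DCT coefficient is divided by the corresponding entry of a quantization table and rounded. Small coefficients round to zero, which further increases the number of zeros.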
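A small numeric sketch shows the effect. The 4 x 4 coefficient block and the quantization table below are made up for illustration (JPEG itself works on 8 x 8 blocks with much larger tables); only the divide-and-round step is the point.

```python
def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(c / table[i][j]) for j, c in enumerate(row)]
            for i, row in enumerate(coeffs)]

def count_zeros(block):
    return sum(1 for row in block for value in row if value == 0)

# Hypothetical DCT output: a large DC term and small high-frequency terms.
coeffs = [
    [160.0, 12.0, 3.0, 1.0],
    [9.0, 4.0, 1.5, 0.5],
    [2.0, 1.0, 0.4, 0.2],
    [1.0, 0.5, 0.2, 0.1],
]
# Hypothetical table: coarser steps toward higher frequencies.
table = [[8 + 4 * (i + j) for j in range(4)] for i in range(4)]

quantized = quantize(coeffs, table)
print(quantized[0], quantized[1])                   # [20, 1, 0, 0] [1, 0, 0, 0]
print(count_zeros(coeffs), count_zeros(quantized))  # 0 13
```

This is the lossy step of JPEG: the rounded-away detail cannot be recovered when the decoder multiplies by the table again.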

Zig-Zag Scanning
The quantized coefficients are read out in zig-zag order so that the zeros are grouped together and can be sent compactly as a count rather than one by one.
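The scan order and the grouping of zeros can be sketched as follows. The run-length step here is a simplified stand-in for JPEG's actual entropy coding, and the 4 x 4 block is made up for illustration.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zig-zag order."""
    positions = [(i, j) for i in range(n) for j in range(n)]
    # Walk the anti-diagonals; the direction alternates from one diagonal to the next.
    return sorted(positions,
                  key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]))

def zigzag_scan(block):
    return [block[i][j] for i, j in zigzag_order(len(block))]

def run_length(values):
    """Collapse consecutive equal values into [value, count] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs

block = [
    [20, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(zigzag_order(8)[:6])             # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
print(run_length(zigzag_scan(block)))  # [[20, 1], [1, 2], [0, 13]]
```

Because the non-zero values cluster in the top-left corner, the zig-zag path leaves all 13 zeros in one run at the end.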

MPEG-1
MPEG-1 was the first standard finalized for video compression, intended for interactive video on CD and for digital audio broadcasting. It targets VCR quality. Uncompressed video at 640 x 480 pixels, 24 bits/pixel and 25 frames/s requires 184.32 Mbps; after MPEG-1 compression the stream is about 1.5 Mbps. It is likely to dominate the encoding of CD-ROM based movies, as it gives good-quality video. It can also be transmitted over twisted pair for modest distances (5 km).
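Multiplying out the stated parameters gives the uncompressed rate and the compression ratio that the 1.5 Mbps target implies. This is plain arithmetic; the function name is made up for the example.

```python
def raw_bit_rate(width, height, bits_per_pixel, frames_per_s):
    """Return the uncompressed video bit rate in bits per second."""
    return width * height * bits_per_pixel * frames_per_s

raw = raw_bit_rate(640, 480, 24, 25)
target = 1_500_000  # 1.5 Mbps after MPEG-1 compression

print(f"{raw / 1e6:.2f} Mbps uncompressed")         # 184.32 Mbps uncompressed
print(f"about {raw / target:.0f}:1 compression")    # about 123:1 compression
```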

MPEG-1
MPEG-1 has three components: audio, video and system. A 90 kHz clock outputs the current time value (time stamps) to both encoders, and the time stamps are propagated all the way to the receiver.
Figure: The audio signal and the video signal feed the audio encoder and the video encoder; both encoders are driven by the 90 kHz clock, and their outputs enter the system multiplexer, which produces the MPEG-1 output.
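A time stamp is simply a count of 90 kHz clock ticks. The sketch below shows how a frame number maps to such a tick count, assuming a constant, integer frame rate; the function name and the default of 25 frames/s are choices made for this example.

```python
CLOCK_HZ = 90_000  # system clock from the slide

def timestamp_ticks(frame_index: int, frames_per_s: int = 25) -> int:
    """Return the time stamp, in 90 kHz ticks, at which a frame is presented."""
    return frame_index * CLOCK_HZ // frames_per_s

print(timestamp_ticks(1))        # 3600   (one frame period at 25 frames/s)
print(timestamp_ticks(25))       # 90000  (exactly one second)
print(timestamp_ticks(1, 30))    # 3000   (one frame period at 30 frames/s)
```

Because audio and video carry time stamps from the same clock, the receiver can line the two streams up again after demultiplexing.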

MPEG-1 Video Compression
Encoding each frame separately with JPEG removes spatial redundancy. Additional compression can be achieved by taking advantage of the fact that consecutive frames are often almost identical.

MPEG-1 Frame Types
MPEG-1 has four kinds of frames. Motion compensation is based on computing the difference between two frames.
– I (intracoded) frames: self-contained, JPEG-encoded pictures that appear periodically and can be decoded independently.
– P (predictive) frames: use the block-by-block difference with the preceding I or P frame.
– B (bidirectional) frames: use the differences with both the preceding and the following I or P frames as references.
– D (DC-coded) frames: block averages, used for fast forward.
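The dependencies between frame types can be made explicit with a small helper that, for a frame sequence in display order, lists which frames each one refers to. This is an illustrative sketch: the sequence "IBBPBBP" is a made-up example, and D frames are not modeled.

```python
def frame_references(sequence):
    """Map each frame index (display order) to the indices it is predicted from."""
    anchors = [i for i, kind in enumerate(sequence) if kind in "IP"]
    references = {}
    for i, kind in enumerate(sequence):
        earlier = [a for a in anchors if a < i]
        later = [a for a in anchors if a > i]
        if kind == "I":
            references[i] = []                        # decodable on its own
        elif kind == "P":
            references[i] = earlier[-1:]              # preceding I or P frame
        elif kind == "B":
            references[i] = earlier[-1:] + later[:1]  # preceding and following I or P
        else:
            raise ValueError(f"unsupported frame type: {kind}")
    return references

refs = frame_references("IBBPBBP")
print(refs)
# {0: [], 1: [0, 3], 2: [0, 3], 3: [0], 4: [3, 6], 5: [3, 6], 6: [3]}
```

Since a B frame needs a later frame as a reference, the decoder must receive that later I or P frame before it can reconstruct the B frame.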

Figure: MPEG frames.

Figure: Frame construction.

MPEG-2
– Similar to MPEG-1, but developed for digital TV
– No fast forward: D frames are not supported
– The DCT still operates on 8 x 8 blocks, but coefficients can be coded with higher precision (e.g. a 10-bit DC coefficient) for better quality
– Supports 4 resolutions and 5 profiles
– Has a more general way of multiplexing
– Each stream is packetized with time stamps

MPEG-4
– Started as a standard for low bit rates
– Intended for use in portable devices such as video phones
– The standard includes much more than just data compression
Functionality:
– Content-based multimedia access tools
– Manipulation and bit stream editing
– Improved temporal random access
– Robustness in error-prone environments
– Content-based scalability

H.261
H.261 is an ITU-T video coding standard. It was originally designed for transmission over ISDN lines, on which data rates are multiples of 64 kbit/s. The coding algorithm was designed to operate at video bit rates between 40 kbit/s and 2 Mbit/s. The standard supports two video frame sizes, CIF (Common Intermediate Format) and QCIF (Quarter CIF), using a 4:2:0 sampling scheme. Both the encoder and the decoder must be very fast, because the standard is used for interactive, real-time video conferencing.
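The raw size of one frame under 4:2:0 sampling follows from the frame dimensions. The sketch below assumes the standard luminance sizes of 352 x 288 for CIF and 176 x 144 for QCIF (not stated in the slide) and 8 bits per sample; the function names are made up.

```python
FRAME_SIZES = {"CIF": (352, 288), "QCIF": (176, 144)}  # luminance width x height

def bits_per_frame_420(width, height, bits_per_sample=8):
    """Raw bits in one 4:2:0 frame: full-size Y plus two quarter-size chroma planes."""
    luma_samples = width * height
    chroma_samples = 2 * (width // 2) * (height // 2)
    return (luma_samples + chroma_samples) * bits_per_sample

def isdn_rate(p):
    """Channel rate in bit/s for p ISDN channels of 64 kbit/s each."""
    return p * 64_000

for name, (w, h) in FRAME_SIZES.items():
    print(name, bits_per_frame_420(w, h))
# CIF 1216512
# QCIF 304128

print(isdn_rate(2))  # 128000
```

A single raw QCIF frame is already larger than what two ISDN channels carry in two seconds, which is why both heavy compression and fast codecs are needed for interactive use.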
