Multimedia Processing Lab NH 140


1 Multimedia Processing Lab NH 140
Advisor : Dr. K.R. Rao Phone : (817) Website:

2 Multimedia Network A typical multimedia network now looks like this. There are three main domains: 1. authoring, 2. distribution and 3. playback. As end users, we are no longer restricted to the playback domain of such a network. Each of these domains has promising research potential, which is explored in our Multimedia Processing Lab under the guidance of Dr. K. R. Rao.

3 Home Media Ecosystem A case for importance of research in multimedia
The Home Media Ecosystem shown here is an example of how multimedia and its related technologies have pervaded all aspects of our lives. The MPL provides a good platform to start your career in such a promising field.

4 Video Redundancy – An Example
A video (file types like AVI and MPEG) is a collection of images shown in sequence. Each image is called a frame, and the number of images shown per second is called frames per second, or simply FPS. The more frames per second a video has, the smoother and more realistic the motion appears. Videos are usually saved using at least TV quality settings, i.e. 30 frames per second.

5 The need for video compression
Video signal: a sequence of frames (images) related along the temporal dimension.
TV video quality: 704x576 pixels per frame, 12 bpp, 25 frames per second -> about 121 Mbps.
This is too much data for video transmission or storage, while multimedia communication keeps growing in importance.
To reduce the video file size, a video compression technique is used, which works by removing from each frame the parts of the image that were already shown. For example, imagine a video in which one person is talking and not moving. The first frame is shown complete, but in the second frame the parts of the image that are identical to the first frame are removed. If only the mouth of the person is moving, only the area around the mouth is drawn in the second frame.
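The arithmetic behind the bit rate figure, and the idea of redrawing only the parts of a frame that changed, can be sketched as follows. This is a minimal illustration, not any standard's algorithm: the function names, the 8x8 block size and the zero threshold are assumptions made for the example, and frames are plain lists of rows of pixel values.

```python
def raw_bitrate_bps(width, height, bits_per_pixel, fps):
    """Uncompressed bit rate in bits per second."""
    return width * height * bits_per_pixel * fps


def changed_blocks(prev, curr, block=8, threshold=0):
    """Return the (row, col) origin of every block that differs from prev.

    Only these blocks would need to be sent for the current frame; the
    rest can be copied from the previous frame by the receiver.
    """
    height, width = len(curr), len(curr[0])
    changed = []
    for by in range(0, height, block):
        for bx in range(0, width, block):
            # Sum of absolute differences over the block.
            diff = sum(
                abs(curr[y][x] - prev[y][x])
                for y in range(by, min(by + block, height))
                for x in range(bx, min(bx + block, width))
            )
            if diff > threshold:
                changed.append((by, bx))
    return changed


rate = raw_bitrate_bps(704, 576, 12, 25)
print(f"{rate / 1e6:.2f} Mbps")  # 121.65 Mbps
```

With a static background and a moving mouth, `changed_blocks` would return only the few blocks around the mouth, which is the redundancy that inter-frame compression exploits.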

6 Research Focus Areas
[Figure: coding efficiency and network awareness plotted against year (1992 to 2010), placing MPEG-1, MPEG-2, H.263, MPEG-4, H.264, SVC, MVC and H.265/HVC(?) alongside applications such as video conferencing, DVD, PC, HDTV, mobile phone, hand PC, mobile TV and Blu-ray.]
With each new compression technique that is standardized, we at the lab obtain the codec software through the professional network of the MPL advisor, Dr. K. R. Rao, and work on optimizing the codec: reducing its complexity or encoding time, improving its quality, or improving the robustness of the standard using algorithms for error concealment and error correction. Research by a former student (Rahul Panchal) was accepted by the Joint Video Team for incorporation into the H.264 standard, and he is also working on H.265, which may become a standard in the near future.

7 Research : Image, Video, Audio
Image: JPEG, JPEG-LS, LOCO, CALIC; JPEG 2000; JPEG XR–AIC; JBIG1,2; PNG; GIF
Video: MPEG 1, 2, 4, 7, 21; H.264, H.265(?), HVC; VP6, VP7, VP8; VC–1 (WMV–9); Wyner Ziv; AVS China part 2; Dirac, Dirac Pro (BBC); Real Networks-RV10
Audio: Dolby True HD; HD-AAC; MP3, MP3 Pro; AAC–SBR; HE–AC3; AVS China part 3; ATSC (E-AC3); WMA; DTS-HD Audio
Most of these software codecs are proprietary. Dr. Rao's professional network has enabled him to obtain most of them strictly for academic research in his lab, so students gain hands-on experience with the latest technology by the time they leave the university. For example, H.265 is being developed in industry but has yet to be released as a standard; Dr. Rao has former students working on it, and they collaborate with MPL on related research.

8 Video Compression Standards
Standard | Main Applications | Year
JPEG, JPEG2000 | Image | , 2000
JBIG, JBIG2 | Fax |
H.261 | Video Conferencing | 1990
H.262, H.262+ | DTV, SDTV, HDTV | 1995, 2000
H.263, H.263++ | Videophone | 1998, 2000
MPEG-1 | Video CD | 1992
MPEG-2 | DTV, SDTV, HDTV, DVD | 1995
MPEG-4 Part 2 | Interactive video | 2000
MPEG-7 | Multimedia Content description | 2001
MPEG-21 | Multimedia Framework | 2002
H.264/MPEG-4 part 10 | Advanced Video Coding | 2003

9 Latest Video Codecs
Standard | Main Applications | Year
Dirac (B.B.C.) | Internet streaming to Ultra-high definition TV | 2008
Dirac pro/VC-2 | Studio and professional use | 2009
VC-1 (SMPTE/Microsoft) | Internet streaming to High definition TV | 2006
VC-3 | Compositing, mastering, and multi-generational use |
VP6 (On2 technologies) | Broadcasting | 2003
VP7 | | 2005
VP8 | |
RV10 (Real Networks) | Internet streaming |
AVS China | IP TV, Terrestrial digital TV, Satellite broadcast, Video surveillance |
H.264 Fidelity Range Extensions | Studio editing, Post processing, Digital cinema | 2004
H.264 SVC, MVC | Scalable video coding, panoramic video |
HVC | High Efficiency Video Coding | 2010 ?
These latest codecs are proprietary software acquired by Dr. Rao for academic use in MPL.

10 Advanced Television Systems Committee (ATSC)
A/53B ATSC Standard: Digital television standard, Revision B with amendment (Video: MPEG-2, Audio: AC-3), 2007
A/153 Digital TV mobile and handheld specifications, 2009 (Video: H.264; Audio: HE AAC v2, ISO/IEC )
Digital TV in North America

11 Advanced Television Systems Committee (ATSC) (continued)
ATSC Mobile DTV includes a highly robust transmission system based on vestigial sideband (VSB) modulation coupled with a flexible and extensible IP based transport, efficient MPEG AVC (H.264) video and HE AAC v2 audio (ISO/IEC ) coding. The Candidate Standard consists of eight parts:
• Part 1 – Mobile/Handheld Digital Television System
• Part 2 – RF/Transmission System Characteristics
• Part 3 – Service Multiplex and Transport Subsystem Characteristics
• Part 4 – Announcement
• Part 5 – Presentation Framework
• Part 6 – Service Protection
• Part 7 – Video System Characteristics
• Part 8 – Audio System Characteristics

12 Fig 1. ATSC Broadcast system with TS main and M/H Services [1]

13 Comparison of various video compression standards
Standards compared, in column order: MPEG-2 Video (H.262); MPEG-4 AVC (H.264); SMPTE VC-1 (Windows Media Video 9); Dirac (BBC); DiracPRO; AVS China Part 2; AVS China Part 7
Intra prediction: None (MB encoded, DC predictors); 4x4 spatial, 16x16 spatial, I-PCM; Frequency domain coefficient; 4x4 spatial (forward, backward); 8×8 block based intra prediction; Intra_4x4 (4x4 spatial), Direct intra prediction
Picture coding type: Frame; Field; Picture AFF; MB AFF; Intra – Frame, Field (Interlace, Progressive)
Motion compensation block size: 16×16, 16×8, 8×16; 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, 4×4; 16×16, 8×8; 4×4; N/A; 16×16, 16×8, 8×16, 8×8; 16×16, 16×8, 8×16, 8×8, 8×4, 4×8
Motion vector precision: Full pel; Half pel; Quarter pel; 1/8 pel; 1/4 pel

14 Comparison of various video compression standards
Standards compared, in column order: MPEG-2 Video (H.262); MPEG-4 AVC (H.264); SMPTE VC-1 (Windows Media Video 9); Dirac; DiracPRO; AVS Part 2; AVS Part 7
P frame type: Single reference; Multiple reference; Single reference, Intensity compensation; No P frames; Single and multiple reference (maximum of 2 reference frames)
B frame type: One reference each way; One reference each way, Multiple reference, Direct & spatial direct weighted prediction; No B frames; One reference each way, Multiple reference, Direct and symmetrical mode; No B frames
In-loop filters: None; De-blocking; Overlap transform; De-blocking filter

15 Comparison of various video compression standards
Standards compared, in column order: MPEG-2 Video (H.262); MPEG-4 AVC (H.264); SMPTE VC-1 (Windows Media Video 9); Dirac; DiracPRO; AVS Part 2; AVS Part 7
Entropy coding: VLC; CAVLC, CABAC; Adaptive VLC; Arithmetic coding; Context based adaptive binary arithmetic coding, Exp-Golomb coding; 2D variable length coding; Context based adaptive 2D variable length coding
Transform: 8×8 DCT; 4×4 integer DCT; 8×8 integer DCT; 8×4 & 4×8 integer DCT; 4×4 wavelet transform; 4×4 wavelet transform; 4×4 DCT
Other: Quantization scaling matrices; Range reduction; Instream-post processing control

16 Standards Comparison
Standard: H.264/MPEG-4 Part 10
Standardization body: JVT (ISO/IEC & ITU-T)
Main target bitrate: 8 kb/s up to about 150 Mb/s
Main compression technologies:
– Integer DCT
– Adaptive quantization
– Zigzag reordering
– Alternate scan ordering
– Predictive motion compensation
– Bi-directional motion compensation
– Variable block size motion compensation with small block sizes
– Quarter pixel motion compensation
– Motion vector over picture boundaries
– Multiple reference picture motion compensation
– Adaptive intra directional prediction
– In-loop deblocking filter
– Arithmetic coding
– Variable length coding
– Error resilient coding
Main target applications:
– Broadcast over cable, terrestrial and satellite
– Interactive or serial storage on optical and magnetic devices, DVD, etc.
– Conversational services
– Video on demand
– MMS over ISDN, DSL, Ethernet, LAN, wireless and mobile networks
– HDTV
– Digital camera

17 Standards Comparison
Standard: AVS Part 2
Standardization body: AVS workgroup
Main target bitrate: 1 Mb/s up to about 20 Mb/s
Main compression technologies:
– Interlace handling: Picture-level adaptive frame/field coding (PAFF)
– Macroblock-level adaptive frame/field coding (MBAFF)
– Intra prediction: 5 modes for luma and 4 modes for chroma
– Motion compensation: 16×16, 16×8, 8×16, 8×8 block size
– Resolution of MV: 1/4-pel, 4-tap interpolation filter
– Transform: 16 bit-implemented 8×8 integer cosine transform
– Quantization and scaling: scaling only in encoder
– Entropy coding: 2D-VLC and Arithmetic Coding
– In-loop deblocking filter
– Motion vector prediction
– Adaptive scan
Main target applications:
– HD broadcasting
– High density storage media
– Video surveillance
– Video on demand

18 Standards Comparison
Standard: AVS Part 7
Standardization body: AVS workgroup
Main target bitrate: 1 Mb/s up to about 20 Mb/s
Main compression technologies:
– Intra prediction: 9 modes for luma and 3 modes for chroma
– Motion compensation: 16×16, 16×8, 8×16, 8×8, 8×4, 4×8 block size
– Resolution of MV: 1/4-pel
– Transform: 16 bit-implemented 4×4 integer cosine transform
– Quantization and scaling: scaling only in encoder
– Entropy coding: Context based adaptive 2D variable length coding
– In-loop deblocking filter
Main target applications:
– Record and local playback on mobile devices
– Multimedia Message Service (MMS)
– Streaming and broadcasting
– Real-time video conversation
Standard: Dirac
Standardization body: BBC R&D, Mozilla Public License (MPL)
Main target bitrate: Few hundred kbps up to about 15 Mbps
Main compression technologies:
– 4×4 wavelet transform
– Dead-zone quantization and scaling
– Entropy coding: Arithmetic coding
– Hierarchical motion estimation
– Intra, Inter prediction
– Single and multiple reference P, B frames
– 1/8 pel motion vector precision
– 4×4 overlapped block based motion compensation (OBMC)
– Daubechies wavelet filters
Main target applications:
– Broadcasting
– Live streaming video
– Podcasting
– Peer to peer transfers
– HDTV with SD (standard definition) simulcast capability
– Desktop production
– News links
– Archive storage
– PVRs (personal video recorders)
– Multilevel Mezzanine coding

19 Standards Comparison
Standard: DiracPRO (SMPTE VC-2)
Standardization body: BBC R&D, SMPTE
Main target bitrate: Lossless HD to less than 50 Mb/s; compression ratio 20:1
Main compression technologies:
– 4×4 wavelet transform
– Dead-zone quantization and scaling
– Entropy coding: Context based adaptive binary arithmetic coding (CABAC), exponential Golomb coding
– Intra-frame only (forward, backward prediction modes also available)
– Frame, Field coding (Interlaced and progressive)
– Daubechies wavelet filters
Main target applications:
– Professional (high quality, low latency) applications (not for end user distribution)
– Lossless or visually lossless compression for archives
– Mezzanine compression for re-use of existing equipment
– Low delay compression for live video links
Standard: SMPTE VC-1 (WMV-9)
Standardization body: SMPTE 421M
Main target bitrate: 10 kbps – 8 Mbps
Main compression technologies:
– Integer DCT
– Adaptive block size transform: (8×8), (8×4), (4×8) and (4×4)
– Motion estimation for (16×16) and (8×8) blocks
– ½ pixel and ¼ pixel motion vector resolution
– Dead zone and uniform quantization
– Multiple VLCs
– In-loop deblock filtering, fading compensation
Main target applications:
– Media delivery over the Internet
– Broadcast TV
– HD DVD
– Digital projection in theaters, mobile phones
– DVB-T, DVB-S

20 Performance comparison of various video coding standards

21 Audio Compression Standards
Standard | Main Applications | Year
Dolby True HD | Lossless audio, Blu-ray Disc players, A/V receivers, and home-theater | 2006
HD-AAC | Soundtrack applications | 1997
MP3 | Handheld devices | 1991
MP3 Pro | | 2001
AAC–SBR | DAB – High quality audio | 2003
HE–AC3 | Satellite or terrestrial audio broadcasting | 2005
AVS China part 3 | Handheld and broadcasting | 2004
AC3 Pro | |
E-AC3 | Enhanced AC-3 or Dolby Digital Plus (Multiple program streams, multi channel signals beyond 5.1) | 2007
DTS – Digital Theater Systems | DTS – High Definition Audio | 2008

22 Current Research Activities of MPL
Mobile Applications
– Development of virtual lab platform for mobile software application
– Developing a low complexity video codec for mobile application
Complexity Reduction
– Complexity reduction in existing video codecs
– Complexity reduction in existing audio codecs
Quality Improvement
– Optimizing existing video codecs using perceptual coding techniques
Improve Robustness
– Error resilience of video streams in a lossy wireless environment
– Error concealment techniques for wireless video transmission
Transcoders
– Video transcoders: VP6 to H.264, H.264 to VC-1, Wyner Ziv to H.264, H.264 to AVS China, H.264 to DIRAC
Video/Audio Integration
– AVS China – Audio/Video codec – Multiplex/demultiplex and lip sync
– DIRAC video codec and AAC – Multiplex/demultiplex and lip sync

23 Virtual lab. Platforms for Mobile SW Applications

24 Low complexity Codec Applications
SensorCamPillCamWearableCamDisposable cam.ScanCam Wearable Cam Pill Cam Disposable Cam

25 Transcoding Applications
[Figure labels: Low complexity Decoder; Low complexity Encoder]
Some codecs have a complex encoder but a simple decoder, whereas others have a simple encoder and a complex decoder. Combining the two enables us to employ these codecs in low complexity applications such as mobile platforms. Decoding the bitstream of one codec and re-encoding it as the bitstream of another codec is called transcoding. Transcoding also serves cross-platform applications: it enables us to use products from any company on any device. For example, if an Ericsson device only supports MPEG-2 and we have H.264 video, a transcoder can convert it to an MPEG-2 bitstream that plays on the device. The transcoding platform handles the high complexity decoding on one side and the high complexity encoding on the other (right) side.

26 An application scenario for transcoding
A possible application scenario, where this type of transcoding is potentially useful, is depicted above. Both the service provider and the network operator benefit from using H.264/AVC for content storage and delivery because it allows significant savings in storage capacity and network bandwidth, but at the user premises the MPEG-2 format is necessary because of legacy equipment. Therefore, a transcoding function must be included in the home gateway to perform the necessary format adaptation.

27 Error Concealment in Lossy Wireless Environment
[Figure: Source -> lossy wireless network -> Destination, showing the original information, the information lost due to the lossy wireless network, and the reconstruction of the lost information. Typical situation of 3G/4G cellular telephony.]
When video is transmitted over a wireless network, it may lose some information due to the lossy nature of the wireless environment. For many video applications, especially real-time ones, re-transmission is not possible. So we embed an error concealment algorithm at the decoder side to conceal the errors, so that transmission and viewing continue uninterrupted.
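One of the simplest decoder-side concealment strategies is temporal replacement: a block that never arrived is filled with the co-located block from the previous decoded frame. The sketch below only illustrates that idea; the function name, the 4x4 block size and the list-of-rows frame layout are assumptions for the example, and it is not the method of any particular MPL project.

```python
def conceal_lost_blocks(prev_frame, curr_frame, lost_blocks, block=4):
    """Fill lost blocks of curr_frame with co-located data from prev_frame.

    lost_blocks holds the (row, col) origin of each block that the
    decoder could not reconstruct. The input frames are not modified.
    """
    height, width = len(curr_frame), len(curr_frame[0])
    concealed = [row[:] for row in curr_frame]
    for by, bx in lost_blocks:
        for y in range(by, min(by + block, height)):
            for x in range(bx, min(bx + block, width)):
                concealed[y][x] = prev_frame[y][x]
    return concealed
```

This works well for static regions and degrades with motion, which is why more elaborate schemes also use neighbouring motion vectors or spatial interpolation.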

28 Multiplexing of Audio/Video And Lip Sync
Most compression work is done on video, and many video codecs are available. But video is useless without audio. Hence this is a viable research area, where we can multiplex the best available audio codec with the latest available video codecs. AVS – Audio Video Standard of China

29 A quick view on H.264 - Encoder

30 Profiles in H.264: 4:2:2, 4:4:4, up to 12-bit depth
Profiles comparison in H.264

31 Intra Adaptive Directional Prediction 4x4 in H.264
A new technique of extrapolating the edges of the previously-decoded parts of the current picture is applied in regions of pictures that are coded as intra (i.e., coded without reference to the content of some other picture). This improves the quality of the prediction signal, and also allows prediction from neighboring areas that were not coded using intra coding. For mode 0 (vertical prediction) the samples above the 4x4 block are copied into the block as indicated by the arrows. Mode 1 (horizontal prediction) operates in a manner similar to vertical prediction except that the samples to the left of the 4x4 block are copied. For mode 2 (DC prediction) the adjacent samples are averaged as indicated. The remaining 6 modes are diagonal prediction modes which are called diagonal-down-left, diagonal-down-right, vertical-right, horizontal-down, vertical-left, and horizontal up prediction.
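The three non-diagonal modes described above can be sketched in a few lines. This is a simplified illustration that assumes all eight neighbouring samples (four above, four to the left) are available; the function name and argument layout are chosen for the example, and the six diagonal modes are left out.

```python
def intra4x4_predict(mode, above, left):
    """Return a 4x4 prediction block (list of rows) for modes 0, 1 and 2.

    above: the 4 reconstructed samples directly above the block
    left:  the 4 reconstructed samples directly to the left of the block
    """
    if mode == 0:  # vertical: copy the row above into every row
        return [list(above) for _ in range(4)]
    if mode == 1:  # horizontal: copy each left sample across its row
        return [[left[r]] * 4 for r in range(4)]
    if mode == 2:  # DC: rounded mean of the 8 neighbouring samples
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("only modes 0 (vertical), 1 (horizontal), 2 (DC) are sketched")


pred = intra4x4_predict(2, above=[10, 20, 30, 40], left=[1, 2, 3, 4])
print(pred[0])  # [14, 14, 14, 14]
```

An encoder would compute the prediction for every mode, subtract each from the actual block, and keep the mode whose residual is cheapest to code.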

32 Intra Adaptive Directional Prediction 8x8 in H.264
Similar to 4x4

33 Intra Adaptive Directional Prediction 16x16 in H.264
Similar to 4x4 modes

34 Motion Estimation/Compensation Sizes (H.264)
Partitions with luma block sizes of 16x16, 16x8, 8x16, and 8x8 samples are supported by the syntax. In case partitions with 8x8 samples are chosen, one additional syntax element for each 8x8 partition is transmitted. This syntax element specifies whether the corresponding 8x8 partition is further partitioned into partitions of 8x4, 4x8, or 4x4 luma samples and corresponding chroma samples. The prediction signal for each predictive-coded MxN luma block is obtained by displacing an area of the corresponding reference picture, which is specified by a translational motion vector and a picture reference index. Thus, if the macroblock is coded using four 8x8 partitions and each 8x8 partition is further split into four 4x4 partitions, a maximum of sixteen motion vectors may be transmitted for a single P macroblock.
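As a rough sketch of the displacement step, the function below forms the prediction for one MxN partition by copying the area of the reference picture that the motion vector points to. It handles integer-sample vectors only, and clamping coordinates at the picture edge is an assumed simplification for vectors that point outside the picture; all names are illustrative.

```python
def motion_compensate(ref, top, left, height, width, mv):
    """Predict a height x width block at (top, left) from reference picture ref.

    mv is (dy, dx) in whole luma samples. Coordinates that fall outside
    the picture are clamped to the nearest edge sample.
    """
    rows, cols = len(ref), len(ref[0])
    dy, dx = mv
    return [
        [
            ref[min(max(top + y + dy, 0), rows - 1)][min(max(left + x + dx, 0), cols - 1)]
            for x in range(width)
        ]
        for y in range(height)
    ]


# Four 8x8 partitions, each split into four 4x4 sub-partitions.
MAX_MOTION_VECTORS_PER_P_MACROBLOCK = 4 * 4

ref = [[y * 8 + x for x in range(8)] for y in range(8)]
print(motion_compensate(ref, top=2, left=2, height=2, width=2, mv=(1, -1)))
# [[25, 26], [33, 34]]
```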

35 Sub pixel accuracy for ME/MC (H.264)
The accuracy of motion compensation is in units of one quarter of the distance between luma samples. In case the motion vector points to an integer-sample position, the prediction signal consists of the corresponding samples of the reference picture; otherwise the corresponding sample is obtained using interpolation to generate non-integer positions. The prediction values at half-sample positions are obtained by applying a one-dimensional 6-tap FIR filter horizontally and vertically. Prediction values at quarter sample positions are generated by averaging samples at integer- and half-sample positions.
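The interpolation can be illustrated for the one-dimensional case. H.264 uses the 6-tap kernel (1, -5, 20, 20, -5, 1) with division by 32 for half-sample positions, and a rounded average for quarter-sample positions. The sketch below assumes 8-bit samples and does not cover the half-sample position that sits between four integer samples, which the standard derives from unrounded intermediate values; the function names are chosen for the example.

```python
def clip_8bit(value):
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, value))


def half_sample(e, f, g, h, i, j):
    """Half-sample value between g and h from six neighbouring integer samples."""
    return clip_8bit((e - 5 * f + 20 * g + 20 * h - 5 * i + j + 16) >> 5)


def quarter_sample(a, b):
    """Quarter-sample value: rounded average of two neighbouring samples."""
    return (a + b + 1) >> 1


print(half_sample(0, 0, 0, 255, 255, 255))  # 128
print(quarter_sample(100, 103))             # 102
```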

36 Scanning of transform coefficients (H.264)
Zig-zag and alternate scanning: after quantisation, the DCT coefficients for a block are reordered to group together nonzero coefficients, enabling efficient representation of the remaining zero-valued quantised coefficients. The optimum reordering path (scan order) depends on the distribution of nonzero DCT coefficients. [Figures: zig-zag scan; alternate scan]
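A zig-zag order for an n x n block can be generated by walking the anti-diagonals in alternating directions. The sketch below covers the zig-zag scan only (the alternate scan is not shown), and the function names are chosen for the example.

```python
def zigzag_order(n=4):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)  # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def zigzag_scan(block):
    """Flatten a square block of quantised coefficients in zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


quantised = [
    [9, 3, 0, 0],
    [2, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(zigzag_scan(quantised))
# [9, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

The nonzero low-frequency coefficients end up at the front and the zeros form one long run at the end, which is what makes the subsequent entropy coding efficient.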

37 SVC Extensions (H.264)
The objective of the SVC standardization has been to enable the encoding of a high-quality video bitstream that contains one or more subset bitstreams that can themselves be decoded with a complexity and reconstruction quality similar to that achieved using the existing H.264/MPEG-4 AVC design with the same quantity of data as in the subset bitstream. The subset bitstream is derived by dropping packets from the larger bitstream. A subset bitstream can represent a lower spatial or temporal resolution or a lower quality video signal (each separately or in combination) compared to the bitstream it is derived from. The following modalities are possible:
1. Temporal (frame rate) scalability: the motion compensation dependencies are structured so that complete pictures (i.e. their associated packets) can be dropped from the bitstream. (Temporal scalability is already enabled by H.264/MPEG-4 AVC. SVC has only provided supplemental enhancement information to improve its usage.)
2. Spatial (picture size) scalability: video is coded at multiple spatial resolutions. The data and decoded samples of lower resolutions can be used to predict data or samples of higher resolutions in order to reduce the bit rate to code the higher resolutions.
3. SNR/Quality/Fidelity scalability: video is coded at a single spatial resolution but at different qualities. The data and decoded samples of lower qualities can be used to predict data or samples of higher qualities in order to reduce the bit rate to code the higher qualities.
4. Combined scalability: a combination of the 3 scalability modalities described above.
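Deriving a subset bitstream by dropping packets can be sketched for the temporal case. The example assumes a dyadic hierarchical structure in which each picture carries a temporal layer number and only references pictures in the same or lower layers; the `Packet` type, the layer assignment and the GOP size of 8 are illustrative assumptions, not the SVC syntax.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    frame_index: int
    temporal_id: int


def dyadic_temporal_id(frame_index, gop_size=8):
    """Temporal layer of a frame in a dyadic hierarchy (gop_size a power of two)."""
    pos = frame_index % gop_size
    if pos == 0:
        return 0  # key pictures form the base layer
    layers = gop_size.bit_length() - 1               # log2(gop_size)
    trailing_zeros = (pos & -pos).bit_length() - 1   # how "even" the position is
    return layers - trailing_zeros


def extract_substream(packets, max_temporal_id):
    """Keep only packets at or below the requested temporal layer."""
    return [p for p in packets if p.temporal_id <= max_temporal_id]


packets = [Packet(i, dyadic_temporal_id(i)) for i in range(9)]
half_rate = extract_substream(packets, max_temporal_id=2)
print([p.frame_index for p in half_rate])  # [0, 2, 4, 6, 8]
```

Dropping the highest layer halves the frame rate; dropping the two highest layers quarters it, and each remaining picture is still decodable because it never depended on a dropped one.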

38 Future Standards Activities – Bit depth Scalability
LCD dynamic range – 500:1. HDR displays: Sharp “Mega-contrast”, LG.Philips – 1,000,000:1, Dolby – 250,000:1
[Figure: Bit Depth Scalable Coder. HDR video input (10, 12, 14 bits/pixel); HDR video output (HDR storage/display); LDR video output (conventional display); tone mapping from HDR range to 8-bit.]
HDR = High Dynamic Range, LDR = Low Dynamic Range
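The layering idea behind bit depth scalability can be sketched with the crudest possible tone map, a right shift that keeps the 8 most significant bits. This is an assumption made purely for illustration: practical tone mapping operators are nonlinear, and the function names below are hypothetical. The 8-bit base layer serves conventional displays, and the residual left after inverse tone mapping is what an enhancement layer would carry for HDR displays.

```python
def tone_map(hdr_sample, hdr_bits):
    """Map an HDR sample to 8 bits by dropping the low-order bits."""
    return hdr_sample >> (hdr_bits - 8)


def inverse_tone_map(ldr_sample, hdr_bits):
    """Predict the HDR sample from the 8-bit base-layer sample."""
    return ldr_sample << (hdr_bits - 8)


def split_layers(hdr_sample, hdr_bits):
    """Return (base-layer sample, enhancement-layer residual)."""
    base = tone_map(hdr_sample, hdr_bits)
    residual = hdr_sample - inverse_tone_map(base, hdr_bits)
    return base, residual


def reconstruct_hdr(base, residual, hdr_bits):
    """Base layer plus enhancement residual gives back the HDR sample."""
    return inverse_tone_map(base, hdr_bits) + residual


base, residual = split_layers(1023, hdr_bits=10)
print(base, residual, reconstruct_hdr(base, residual, 10))  # 255 3 1023
```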

39 Future Standards Activities – 3D Video
Consumer Electronics: auto-stereoscopic display, 10+ views required. Digital Cinema: polarized glasses, 2 views sufficient. 3D Video (3DV)/Free View-Point Video (FVV) effort initiated in MPEG. Similar concept to MPEG-C. Any number of views can be recreated using the depth map in the decoder (2D video data + depth).

40 Future Standards Activities – 3D Video
“Paramount Pictures' Beowulf is benefiting from theaters utilizing next-generation 3D technology (grossed approximately $23.4 million of a total domestic gross over 79.4 million.” “U2 3D, the first live-action movie to be shot, produced, and screened exclusively with digital 3-D technology.” “DreamWorks Animation is joining the digital 3-D wave. Studio plans to release all its pics in 3-D starting in 2009.”

41 Original and compressed Lena image with different methods
Original Lena (51251224) (b) AIC: 0.22bpp, PSNR=28.84dB (c) JPEG2000: 0.22bpp, PSNR=29.57dB

42 Compressed Lena image with different methods(contd.)
(d) M-AIC: 0.22bpp, PSNR=29.02dB (e) JPEG: 0.22bpp, PSNR=24.29dB
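For reference, the two figures quoted with each image can be computed as follows. This is a generic sketch of PSNR and bits per pixel for a single 8-bit channel stored as a list of rows; the function names are chosen for the example, and how the presentation's authors combined colour channels is not stated in the slides.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_error = 0
    count = 0
    for row_o, row_d in zip(original, decoded):
        for o, d in zip(row_o, row_d):
            squared_error += (o - d) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


def bits_per_pixel(compressed_bytes, width, height):
    """Average number of coded bits spent on each pixel."""
    return compressed_bytes * 8 / (width * height)


original = [[0, 0], [0, 0]]
decoded = [[0, 0], [0, 16]]
print(round(psnr(original, decoded), 2))  # 30.07
```

At 0.22 bpp a 512×512 image occupies about 57,672 bits, roughly 7.2 kB, so the comparison above is between codecs working at the same file size.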

43 AVS AVS is an integrated set of standards covering system, video, audio and media copyright management. AVS-M is the seventh part of the video coding standard developed by the AVS work group of China and is aimed at mobile systems and devices. In AVS-M, a Jiben Profile has been defined which has 9 different levels. AVS follows a layered structure for the data, and this representation is seen in the coded bitstream. The sequence layer provides an entry point into the coded video. It consists of a set of mandatory and optional downloadable parameters.

44 AVS-M ENCODER Block Diagram of AVS-M encoder [34]

45 AVS-M DECODER Block Diagram of AVS-M Decoder [34]

46 AVS-M Analysis

47 AVS-M Analysis

48 AVS-M Analysis Original Decoded sequence

49 AVS-M Analysis

50 Dirac features
– Direct support of multiple picture formats: 4K e-cinema through to quarter common intermediate format (QCIF)
– Supports I-frame only up to long group of picture (GOP) structures
– Direct support of multiple chroma formats, e.g. 4:4:4/4:2:2/4:2:0
– Direct support of multiple bit depths, e.g. 8 bit to 16 bit
– Direct support of interlace via metadata
– Direct support of multiple frame rates from fps to 60fps
– Definable pixel aspect ratios
– Multiple color spaces with metadata
– Definable wavelet depth

51 Dirac Encoder Dirac encoder architecture [1]

52 Dirac Decoder Dirac decoder architecture [8]

53 Dirac Results for Miss America

54 Dirac Results for Miss America

55 Dirac Results for Miss America

56 Dirac Results for Miss America

57 Dirac Results for Miss America

58 Dirac Results for Miss America

59 Current Interns & Alumni Network
Current & Recent Grads: Jay R Padia (M.S) (May 2010) - Intel Att Kruafak (Ph.D) – Engineer CAT, Thailand Sangseok Park (Dec 2008) (Ph.D) – DiaLogic Aruna Ravi Subramanya Sahana Devaraju Tejaswini Purushottam Krishnan FastVDO Swaroop Suchethan - Jennie Abraham - Nikshep Patil – Datamatics Radhika Veerla (Aug 08) – Theju Jacob (Aug 08) – Ph.D. student Pooja Agawane (Aug 08) – Leena Agarwal (Dec 07) – Rahul Panchal (May 07) – Harishankar Murugan (May 07)- Sreejana Sharma (May 07)- Hitesh Yadav (August 06)- Basavaraj S. M. (May 06)- VDO Rochelle Pereira (Dec 05)- Sandya Sheshadri (Dec 05) – Tarun Bhatia (Dec 05)- Vidhya Vijaykumar
MPL has an impressively strong network of employed alumni, giving us the opportunity to explore the latest developments in the video coding field. MPL has had a record of 100% placement for all thesis and Ph.D. students so far. Companies like Intel and FastVDO have in the past specifically asked for graduating students of MPL.

60 Current Interns & Alumni Network
Pragnesh Ramolia- Tactel US Nikshep Patil Marvell Semiconductors Sreya – RIM Shreyanka – Intel Amruta RIM Tejas RIM Sadaf Ericsson Anuradha (Dec 04) Qualcomm Shubha Kumbadkone (Dec 04) Intel Nandakishore (Aug 04) Qualcomm Phani (May 04) – Qualcomm Ravi Kumar (May 04) Qualcomm

61 References DIRAC T. Borer, and T. Davies, “Dirac video compression using open technology”, BBC EBU Technical Review, July 2005 BBC Research on Dirac: The Dirac web page: T. Davies, “The Dirac Algorithm”: Dirac developer support: Overlapped block-based motion compensation: “Dirac Pro to bolster BBC HD links”: Dirac software and source code: Dirac video codec - A programmer's guide: Daubechies wavelet: Daubechies wavelet filter design: Dirac developer support: Wavelet transform: Dirac developer support: RDO motion estimation metric: A. Ravi and K.R. Rao, “Performance analysis and comparison of the DIRAC video codec with H.264/MPEG-4 part 10 AVC”, IJWMIP , vol.4, pp , 2011.

62 References H.264 T.Wiegand, et al “Overview of the H.264/AVC video coding standard”, IEEE Trans. on Circuit and Systems for Video Technology, vol.13, pp , July 2003. T. Wiegand and G. J. Sullivan, “The H.264 video coding standard”, IEEE Signal Processing Magazine, vol. 24, pp , March 2007. D. Marpe, T. Wiegand and G. J. Sullivan, “The H.264/MPEG-4 AVC standard and its applications”, IEEE Communications Magazine, vol. 44, pp , Aug S.K.Kwon, A.Tamhankar and K.R.Rao, “Overview of H.264 / MPEG-4 Part 10” J. Visual Communication and Image Representation, vol. 17, pp , April 2006. A. Puri, X. Chen and A. Luthra, “Video coding using the H.264/MPEG-4 AVC compression standard”, Signal Processing: Image Communication, vol. 19, pp , Oct H.264 AVC JM software: [19] H.264/MPEG-4 AVC: M.Fieldler, “Implementation of basic H.264/AVC decoder”, seminar paper at Chemnitz University of Technology, June 2004. H.264 encoder and decoder: R. Schäfer, T. Wiegand and H. Schwarz, “The emerging H.264/AVC standard”, EBU Technical Review, Jan H.264 reference software download : D. Marpe, T. Wiegand, and S. Gordon, "H.264/MPEG4-avc fidelity range extensions: tools, profiles, performance, and application areas," in, IEEE International Conference on Image Processing, vol. 1, pp. I-593-6, 2005. S. Saponara et al, "The JVT advanced video coding standard: complexity and performance analysis on a tool-by-tool basis," in Packet Video Workshop, Nantes, France, April 2003.

63 References VC-1 VC-1 technical overview - Microsoft Windows Media: S. Srinivasan, et al, “Windows Media Video 9: overview and applications”, Signal Processing: Image Communication, vol .19, Issue 9, pp , Oct. 2004 S. Srinivasan and S. L. Regunathan, “An overview of VC-1”, SPIE / VCIP, vol. 5960, pp , July 2005. AVS AVS Video Expert Group, “Information technology – Advanced coding of audio and video – Part 2: Video (AVS1-P2 JQP FCD 1.0),” Audio Video Coding Standard Group of China (AVS), Doc. AVS-N1538, Sept AVS Video Expert Group, “Information technology – Advanced coding of audio and video – Part 3: Audio,” Audio Video Coding Standard Group of China (AVS), Doc. AVS-N1551, Sept L Yu et al., “Overview of AVS-Video: Tools, performance and complexity,” SPIE VCIP, vol. 5960, pp ~ , Beijing, China, July 2005. L. Fan, S. Ma and F. Wu, “Overview of AVS video standard,” IEEE Int’l Conf. on Multimedia and Expo, ICME '04, vol. 1, pp. 423–426, Taipei, Taiwan, June 2004. W. Gao et al., “AVS – The Chinese next-generation video coding standard,” National Association of Broadcasters, Las Vegas, 2004. Special issue on 'AVS and its Applications' Signal Processing: Image Communication, vol. 24,pp , April AVS China software : ftp:// /public/avs_doc/avs_software (need password)

64 References AVS working group official website, http://www.avs.org.cn
W.Gao et al., “AVS–the Chinese next-generation video coding standard,” National Association of Broadcasters, Las Vegas, 2004. L.Fan, “Mobile Multimedia Broadcasting Standards”, ISBN: , Springer US, 2009 F.Yi et al., “Low-Complexity Tools in AVS Part 7”, J. Comput. Sci. Technol, vol.21, pp , May. 2006 L.YU, S.Chen and J.Wang, “Overview of AVS-video coding standards”, Signal Process: Image Commun, vol. 24, Issue 4, pp , April 2009 W.Gao, “AVS–A project towards to an open and cost efficient Chinese national standard”, ITU-T VICA workshop, ITU Headquarters, Geneva, July 2005. Z. Zhang et al., “Improved intra prediction mode-decision method”, Proc. of SPIE ,Vol. 5960, pp W-1~ 59601W-9, Beijing, China, July 2005. Z.. Ma et al., “Intra coding of AVS Part 7 video coding standard”, J. Comput. Sci. Technol,vol.21, Feb.2006

65 References W.Gao and T.Huang “AVS Standard -Status and Future Plan”, Workshop on Multimedia New Technologies and Application, Shenzhen, China, Oct Y.Cheng et al., “Analysis and application of error concealment tools in AVS-M decoder”, Journal of Zhejiang University –Science A, vol. 7, pp , Jan 2006. M.Liu and Z.Wei, “A fast mode decision algorithm for intra prediction in AVS-M video coding” Vol 1, ICWAPR 07, Issue, 2-4, pp.326 –331, Nov Q.Wang et al., “Context-Based 2D-VLC for Video Coding”, IEEE Int’l Conf. on Multimedia and Expo (ICME), vol.1, pp , June W.Gao, K.N. Ngan and L.Yu, “Special issue on AVS and its applications: Guest editorial”, Signal Process: Image Commun, vol. 24, Issue 4, pp , April 2009. S.W.Ma and W.Gao, “Low Complexity Integer Transform and Adaptive Quantization Optimization”, J. Comput. Sci. Technol, vol.21, pp , May 2006. S.Hu, X.Zhang and Z.Yang, “Efficient Implementation of Interpolation for AVS”, Image and Signal Processing, Congress, vol. 3, Issue, 27-30, pp.133 –138, May 2008. R. Schafer and T. Sikora, “Digital video coding standards and their role in video communications”, Proc. of the IEEE, vol. 83, pp , June 1995. A. K. Jain, “Image data compression: A review”, Proc. IEEE, vol. 69, pp , March 1981.

66 References JPEG, JPEG-2000, JPEG-XR (XR Extended range)
AIC website: T. Wiegand et.al, “Overview of the H.264/AVC Video Coding Standard,” IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, pp , July 2003. G. Sullivan, P. Topiwala and A. Luthra, “The H.264/AVC Advanced Video Coding Standard: Overview and Introduction to the Fidelity Range Extensions,” SPIE Conference on Applications of Digital Image Processing XXVII, vol. 5558, pp , Aug I. Richardson, H.264 and MPEG-4 Video Compression: Video Coding for Next-Generation Multimedia, Hoboken, NJ: Wiley, 2003. P. Topiwala, “Comparative study of JPEG2000 and H.264/AVC FRExt I-frame coding on high definition video sequences,” Proc. SPIE Int’l Symposium, Digital Image Processing, vol. , pp San Diego, Aug P. Topiwala, T. Tran and W.Dai, “Performance comparison of JPEG2000 and H.264/AVC high profile intra-frame coding on HD video sequences,” Proc. SPIE Int’l Symposium, Digital Image Processing, applications of digital image processing XXIX, vol. 6321, pp , San Diego, Aug

67 References T. Tran, L. Liu and P. Topiwala, “Performance comparison of leading image codecs: H.264/AVC intra, JPEG 2000, and Microsoft HD photo,” Proc. SPIE Int’l Symposium, Digital Image Processing, vol. , pp , San Diego, Sept G. J. Sullivan, “ISO/IEC (JpegDI part 2 JPEG XR image coding – Specification),” ISO/IEC JTC 1/SC 29/WG1 N 4492, Dec. 2007. D. Marpe, T. Wiegand and G. Sullivan, “The H.264/MPEG4 advanced video coding standard and its applications”, IEEE Communications Magazine, vol. 44, pp , Aug A. Skodras, C. Christopoulos and T. Ebrahimi, “The JPEG2000 still image compression standard,” IEEE Signal Processing Magazine, vol. 18, pp , Sept D.S. Taubman and M.W. Marcellin, JPEG 2000: Image compression fundamentals, standards and practice, Kluwer academic publishers, 2001. W.B. Pennebaker and J.L. Mitchell, JPEG: Still image data compression standard, Kluwer academic publishers, 2003. D. Marpe, V. George, and T. Wiegand, “Performance comparison of intra-only H.264/AVC HP and JPEG 2000 for a set of monochrome ISO/IEC test images”, JVT-M014, pp.18-22, Oct. 2004. D. Marpe et al., “Performance evaluation of motion JPEG2000 in comparison with H.264 / operated in intra-coding mode”, Proc. SPIE, vol. 5266, pp , Feb Z. Xiong et al., “A comparative study of DCT- and wavelet-based image coding,” IEEE Trans. on Circuits and Systems for Video Tech., vol.9, pp , Aug

68 References H.264/AVC reference software (JM 13.2) Website: JPEG reference software website: ftp://ftp.simtel.net/pub/simtelnet/msdos/graphics/jpegsr6.zip Microsoft HD photo specification: JPEG2000 latest reference software (Jasper Version ) Website: JPEG-LS reference software website M.D. Adams, “JasPer software reference manual (Version ),” ISO/IEC JTC 1/SC 29/WG 1 N 2415, Dec M.D. Adams and F. Kossentini, “Jasper: A software-based JPEG-2000 codec implementation,” in Proc. of IEEE Int. Conf. Image Processing, vol.2, pp 53-56, Vancouver, BC, Canada, Oct M. J. Weinberger, G. Seroussi, and G. Sapiro, “LOCO-I: A low complexity, context-based, lossless image compression algorithm”, Hewlett-Packard Laboratories, Palo Alto, CA. M.J. Weinberger, G. Seroussi and G. Sapiro, “The LOCO-I lossless image compression algorithm: principles and standardization into JPEG-LS”, IEEE Trans. Image Processing, vol. 9, pp , Aug.2000. Ibid, “LOCO-I A low complexity context-based, lossless image compression algorithm”, Proc DCC, pp , Snowbird, Utah, Mar K. Sayood, “Introduction to Data Compression”, Third Edition, Morgan Kaufmann Publishers, 2006. M.Ghanbari, “Standard Codecs: Image Compression to Advanced Video Coding”. IEE, London, UK, 2003. Z. Wang and A. C. Bovik, “Modern image quality assessment”, Morgan and Claypool Publishers, 2006.

69 References Special Issue on JPEG-2000, Signal Processing: Image Communication, vol. 17, pp , Jan 2002. A. Stoica, C. Vertan, and C. Fernandez-Maloigne, “Objective and subjective color image quality evaluation for JPEG compressed images,” IEEE Int’l Symposium on Signals, Circuits and Systems, vol. 1, pp. 137 – 140, July 2003. J. J. Hwang and S. G. Cho, “Proposal for objective distortion metrics for AIC standardization”, ISO/IEC JTC 1/SC 29/WG 1 N4548, Mar 2008. H. R. Wu and K. R. Rao, “Digital video image quality and perceptual coding,” Boca Raton, FL: Taylor and Francis, 2006. I. H. Witten, R. M. Neal, and J. G. Cleary, “Arithmetic coding for data compression,” Communications of the ACM, vol. 30, pp , June 1987. Z. Zhang, R. Veerla and K. R. Rao, “A modified advanced image coding”, Proceedings of CANS’ 2008, Romania, Nov. 8-10, 2008. X. Shang, “Structural similarity based image quality assessment: pooling strategies and applications to image compression and digit recognition,” M.S. Thesis, EE Department, The University of Texas at Arlington, Aug A. M. Eskicioglu and P. S. Fisher, “Image quality measures and their performance,” IEEE Signal Processing Letters, vol. 43, pp , Dec Test images found in: Information collected for various topics included in the material: www-ee.uta.edu/dip Y-L. Lee and K-H. Han, “Complexity of the proposed lossless intra for 4:4:4”, (ISO/IEC JTC1/SC29/WG11 and ITU-T SG 16 Q.6) document JVT-Q035, Oct M. Ouaret F. Dufaux and T. Ebrahimi, “ On comparing JPEG 2000 and intraframe AVC”’,SPIE, Applications of digital image processing XXIX, vol.6312, pp ,Aug S-T. Hsiang, “ A new subband/wavelet framework for AVC/H.264 intraframe coding and performance comparison with motion-JPEG 2000”, VCIP, Proc of SPIE-IS& T Electronic Imaging, SPIE vol. 6822, pp P-1 thru 68220P-12, Jan S. Srinivasan et al, “An introduction to the HD photo technical design” , JPEG document wg1n4183, April 2007.

70 References Books I. Richardson “The H.264 advanced video compression standard” Hoboken, NJ: Wiley, 2010.

71 4x4 INTDCT in H.264 Vcodex white paper on 4x4 transform and quantization in H.264. The description of the normative inverse quantization and transform process is found in the latest standard specification: Last, the following papers and standardization contributions contain valuable information and insight on the transform and quantization design of H.264/MPEG-4 Part 10 AVC: 1) H. S. Malvar, A. Hallapuro, M. Karczewicz, and L. Kerofsky, “Low-Complexity Transform and Quantization in H.264/AVC”, IEEE Trans. on Circ. Sys. on Video Tech., vol. 13, pp , July 2003. 2) A. Hallapuro, M. Karczewicz, and H. Malvar, “Low Complexity Transform and Quantization – Part I: Basic Implementation”, JVT of ISO/IEC MPEG and ITU-T VCEG, JVT-B038, Feb 3) A. Hallapuro, M. Karczewicz, and H. Malvar, “Low Complexity Transform and Quantization – Part II: Extensions”, Joint Video Team of ISO/IEC MPEG and ITU-T VCEG, JVT-B039, Feb
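As a rough illustration of the 4x4 integer transform this slide refers to, the sketch below applies the forward core matrix of H.264 to a 4x4 residual block as W = Cf X Cf^T. It covers only the integer core stage; the scaling that H.264 folds into quantization is left out. The helper names (matmul, transpose, forward_core_transform) and the sample block are assumptions made for this example and do not come from the slides or from any reference software.

```python
# Forward 4x4 integer core transform matrix of H.264/AVC.
# Post-scaling and quantization are intentionally omitted in this sketch.
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def matmul(a, b):
    """Multiply two matrices given as lists of rows (hypothetical helper)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def forward_core_transform(block):
    """Return W = CF * X * CF^T for a 4x4 block of integer residuals."""
    return matmul(matmul(CF, block), transpose(CF))


# Sample input (an assumption for illustration): a flat block of residuals.
flat = [[10] * 4 for _ in range(4)]
coeffs = forward_core_transform(flat)

# A flat block has only a DC coefficient; every AC coefficient is zero.
print(coeffs[0][0])
```

The rows of CF are mutually orthogonal but not of equal norm (squared norms 4, 10, 4, 10), which is why the standard combines the missing normalization with the quantization step instead of performing it inside the transform. Only integer additions, subtractions and multiplications by 2 are needed.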

72 LARGE SIZE TRANSFORMS W.K. Cham, “Simple order-16 integer transform for video coding”, IEEE ICIP 2010, Hong Kong, Sept. 2010. R. Joshi, Y.A. Reznik and M. Karczewicz, “Efficient large size transforms for high-performance video coding”, SPIE Optics + Photonics, vol. 7798, paper , San Diego, CA, Aug A.T. Hinds, “Design of high-performance fixed-point transforms using the common factor method”, SPIE Optics + Photonics, vol. 7798, paper , San Diego, CA, Aug G.J. Sullivan, “Standardization of IDCT approximation behavior for video compression: the history and the new MPEG-C parts 1 and 2 standards”, SPIE vol. 6696, paper 35, Aug. 2007. I. E. Richardson, “The H.264 Advanced Video Compression Standard”, 2nd Edition, Wiley publications, 2010.

73 High efficiency video coding (HEVC)
has info on developments in HEVC. NGVC – Next generation video coding. Some of the tools contributing to the gain are: (1) RD Picture Decision (2) RDO_Q (from Qualcomm) (3) MDDT (from Qualcomm) (4) New Offset (from Qualcomm) (5) Adaptive Interpolation Filter (from Qualcomm & Nokia) (6) Block Adaptive Loop Filter (BALF) (from Toshiba) (7) Bigger Blocks and Bigger Transform (32x32 and 64x64) (from Qualcomm) (8) Motion Vector Competition (from France Telecom) (9) Template matching. JVT KTA reference software (KTA: key technical areas). G.J. Sullivan and J.-R. Ohm, “Recent developments in standardization of high efficiency video coding”, Proc. SPIE, vol. 7798, pp V-1 thru V-7, San Diego, CA, Aug

74 NEW GENERATION VIDEO CODING (NGVC)
VCEG (ITU-T) and MPEG (ISO/IEC) formed the Joint Collaborative Team on Video Coding (JCT-VC); its first meeting was held 15-23 April 2010. Table 1 is taken from [1]. [1] G.J. Sullivan and J.-R. Ohm, “Recent developments in standardization of high efficiency video coding”, Proc. SPIE, vol. 7798, pp V-1 thru V-7, San Diego, CA, Aug

75 Technical assessment: first JCT-VC meeting, Dresden, Germany, 15-23 April 2010
All proposed algorithms are based on the traditional motion-compensated (MC) hybrid (transform-DPCM) coding approach. Configurations: Random Access and Low Delay. TMuC (Test Model under Consideration) building blocks: Coding Units (CU), Prediction Units (PU), Transform Units (TU).

76 Coding Units Intra prediction: up to 28 angular directions. Motion estimation/motion compensation (ME/MC).
Inter prediction (multiple reference pictures, bi-prediction, weighted prediction). New MV competition. Transform unit block size 4x4 to 64x64 (mode-dependent directional transform, MDDT, and rotational transforms).
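The coding units named on these slides are organized as a quadtree: a large square block is either coded as is or split into four equal quadrants, and each quadrant can be split again. The sketch below only illustrates that recursive partitioning. The function name split_cu, the 64x64 starting size, the 8x8 minimum size and the toy split rule are all assumptions for illustration; a real encoder would decide each split by rate-distortion cost, which is not modeled here.

```python
def split_cu(x, y, size, should_split, min_size=8):
    """Recursively partition a square coding unit into four equal quadrants.

    Returns a list of (x, y, size) leaves. `should_split` is a caller-supplied
    decision function; `min_size` is an assumed lower bound for this sketch.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(
                    split_cu(x + dx, y + dy, half, should_split, min_size)
                )
        return leaves
    return [(x, y, size)]


def toy_rule(x, y, size):
    """Hypothetical rule: keep splitting only the block at the top-left corner."""
    return x == 0 and y == 0


# Partition one assumed 64x64 block with the toy rule.
leaves = split_cu(0, 0, 64, toy_rule)
for leaf in leaves:
    print(leaf)
```

Whatever rule is plugged in, the leaves tile the original block exactly, so their areas always sum to the area of the starting block.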

77 ADAPTIVE LOOP FILTER JCT-VC: developing a well-validated design, called the Test Model (TM), leading to HEVC standardization by 2011. The first version of HEVC is expected by the end of 2012 or early 2013.

78 Explore the field of multimedia processing in MPL @
- Dr. K.R. Rao (817) NH 140

