Concepts of Multimedia Processing and Transmission IT 481, Lecture #9 Hung Nguyen, Ph.D. 11 April, 2005 IT 481, Lecture #10 Dennis McCaughey, Ph.D. 13 November, 2006
04/10/2006 Hung Nguyen, IT 481, Spring Outline MPEG-4 –Overview –System Description –Coding of Audio Objects –Coding of Visual Objects –Scene Description –IPR and MPEG-4 –DMIF (Delivery Multimedia Integration Framework) –MPEG-4 Applications
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: Overview Flexible Multimedia Communications 5kbps - 50Mbps Audio object compression Video object compression Synthetic Audio/Speech and Video Systems: multiplexing and flexible composition
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4 Overview (cont’d) Functionalities beyond MPEG-1/2 –Interaction with individual objects: The displayed scene can be composed by the receiver from coded objects –Scalability of contents –Error resilience –Coding of both natural and synthetic audio and video
System Description
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4 : System No need to know –What the terminal device –How to receive the stream Sync Layer –Object Descriptors: Relate to elementary streams to a media object (MO) –Synchronization Layer has two types of timing information Convey the speed of the encoder to the decoder The time stamp to portions of the encoded Audio Visual data
04/10/2006 Hung Nguyen, IT 481, Spring Overview of MPEG-4 System Scene segmentation and depth layering O2O2 O3O3 O1O1 Layered encoding contour motion texture contour motion texture contour motion texture bitstream layer 1 bitstream layer 2 bitstream layer 3 multiplexer demultiplexer Separate decoding compositor AV-objects
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: System Delivery of streaming data
04/10/2006 Hung Nguyen, IT 481, Spring The Parts of MPEG-4 Standard Part 1: Systems Part 2: Audio Part 3: Visual Part 4: Conformance Part 5: Reference Software Part 6: DMIF: Delivery Multimedia Integration Framework. A session protocol. –set-up of connection channels (broadcast & interactive) –network becomes transparent to application
04/10/2006 Hung Nguyen, IT 481, Spring Object-Based Coding Entire scene is decomposed into multiple objects –Object segmentation is the most difficult task! –But this does not need to be standardized ☺ Each object is specified by its shape, motion, and texture (color) –Shape and texture both changes in time (specified by motion) MPEG-4 assumes the encoder has a segmentation map available, specifies how to decode shape, motion and texture
04/10/2006 Hung Nguyen, IT 481, Spring The displayed scene is composed by the receiver based on desired view angle and objects of interests
04/10/2006 Hung Nguyen, IT 481, Spring Example of Scene Composition The decoder can compose a scene by including different VOPs (Video Object Planes) in a VOL (Video Object Layer)
04/10/2006 Hung Nguyen, IT 481, Spring Example of Editing, Reusability and Compositing of Video Objects
04/10/2006 Hung Nguyen, IT 481, Spring Example of Editing, Reusability and Compositing of Video Objects
04/10/2006 Hung Nguyen, IT 481, Spring Example of Scalability of Video Objects and Optimizing Regions of Interest
04/10/2006 Hung Nguyen, IT 481, Spring Example of Creativity and Interactivity with Video Objects
04/10/2006 Hung Nguyen, IT 481, Spring Example of 2-D Interactivity
04/10/2006 Hung Nguyen, IT 481, Spring Example of 3-D Multimedia Interactivity
04/10/2006 Hung Nguyen, IT 481, Spring Example of Creating a 3-D illusion from a 2-D Presentation
04/10/2006 Hung Nguyen, IT 481, Spring Example of Interactive Virtual Reality
04/10/2006 Hung Nguyen, IT 481, Spring Example of Multicast and Multi-channel Interoperability
Coding of Audio Objects
04/10/2006 Hung Nguyen, IT 481, Spring Progression of MPEG-Audio MPEG-1 Two-Channel coding standard (Nov. 1992) MPEG-2 Extension towards Lower- Sampling-Frequency (LSF) (1994) MPEG-2 Backwards compatible multi- channel coding (1994) MPEG-2 Higher Quality multi-channel standard (MPEG-2 AAC) (1997) MPEG-4 Audio Coding and Added Functionalities (1999, 2000)
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4 Audio: The Audio Toolbox MPEG-4 Audio integrates the different audio worlds High Quality Audio Coding Speech Coding Representation of Natural Audio MPEG-4 Audio - AAC - AAC Scal - AAC LC - Twin VQ - HILN - HVXC - CELP Sound Synthesis - SAOL (extension toMidi) - Text To Speech (TTS) - Effects Processing - 3-D Localization - HVXC - CELP with different Modes
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4 Audio Objects
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4 Audio: Natural and Unnatural (Structured) Audio High Quality Audio Tools: –AAC (Advanced Audio Coding) –Twin VQ (Transform domain weighted interleaved Vector Quantisation) Low Bit-rate Audio Tools: –CELP (Code Excited Linear Predictive) coding –Parametric, i.e. coding based on parametric representation of audio signal HVXC (Harmonic Vector eXcitation Coding) HILN (Harmonic and Individual Line and Noise) SNHC (Synthethic/Natural Hybrid Coding) Audio Tools –SAOL (Structured Audio Orchestra Language) –TTS (Text-To-Speech)
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: Audio Objects Coding of Audio Objects (Natural Sound) –HVXC (Harmonic Vector eXcitation Coding) : kbps –CELP (Code Excitation Linear Predictive) Sampling rates 8kHz : kbps Sampling rates 16kHz : 18 kbps –MPEG-2 AAC –Examples A base layer : CELP An enhancement layer : AAC
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: Audio Objects (cont’d) Synthesized Sound –Text To Speech –Score Driven Synthesis Not a standardized method of synthesis “Scores” - described in Structured Audio Orchestra Language (SOAL), an MPEG-4 audio tool
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4 Audio Version 2: New Tools (Part 2) Environmental Spatialization –Physical approach, based on description of the acoustical properties of the environment –Perceptual approach, based on high level perceptual description of audio scenes –Version 2 Advanced Audio BIFS (Binary Format for Scene) description CELP Silence Compression MPEG-4 File Format: MP4 –Independent of any particular delivery mechanism –Streamable format, rather than a streaming format –Based on the QuickTime format from Apple Computer Inc. Backchannel Specification –Allows for user-controlled streaming
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4 Audio Version 2: Profiles and Levels Low Delay Audio Profile –For speech and generic audio coding Scalable Internet Audio Profile –Extends scalable profile of version 1 by small step scalability and Parametric Audio Coding MainPlus Audio Profile –Provides superset of all audio profiles
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4 Audio: D ifferent Codecs and their typical bit-rate Class A: based on Time/Frequency mapping (T/F), i.e. transform coding Class B: Linear Predicting Coding (LPC) based analysis/synthesis codec Class C: Codecs, based on parametric description of audio signal bit-rate (kbit/s) Audio bandwidth (kHz)20 Parametric Core Enhance Modules LPC Core Enhance Modules T/F-based Core + Large Step Enhancement Modules + + Satellite Secure Com Cellular Phone InternetISDN ITU-T G.729
04/10/2006 Hung Nguyen, IT 481, Spring Basics of Speech Coding: Principles of Linear Prediction Coding (LPC) Source:
04/10/2006 Hung Nguyen, IT 481, Spring Conceptual Diagram LPC-based Audio Coding Preprocessing Excitation Generator LPC Synthesis Filter Spectral Weighting Filter Multiplex Error Minimization Audio Signal Coded Bit-Stream LPC Analysis Filter - Based on synthesis by analysis Performs best for speech between 6 kbit/s and 16 kbit/s per channel Multiple bit-rates Bit-rate and bandwidth scaleability
04/10/2006 Hung Nguyen, IT 481, Spring Conceptual Diagram of MPEG-4 CELP Decoder Source: B. Edler, Speech Coding in MPEG-4 LSP VQ Decoder LSP Interpolation Conversion LSP ---> LPC Adaptive Codebook RPE/MPE Codebooks Amplitude Decoder LP Synthesis Filter Filter x x + Formants Fund. Pitch Excitation Amplitude LSP: Line Spectral Pair RPE: Regular Puls Excitation MPE: Multi-Pulse Excitation
04/10/2006 Hung Nguyen, IT 481, Spring Conceptual Diagram of MPEG-4 HVXC Decoder LSP VQ Decoder LSP Interpolation Conversion LSP ---> LPC Formants LP Synthesis Filter Filter Harmonic Synthesis Stochastic Synthesis Excitation Parameter Decoder v/uv Excitation + Source: B. Edler, Speech Coding in MPEG-4
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4 Natural Speech Coding Tools used in CELP and HVXC CELP MPE Harmonic VQ LSP VQ HVXC RPE WB kbit/s NB/WB kbit/s HVXC: „ H armonic V ector e X citation C oding“ Source:
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4 Natural Speech Coding: Specification of Tools HVXC CELP
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4 Natural Speech Coding Tools Scalable Bit-rate Coding Input Speech 6 kbit/s 2 kbit/s 10 kbit/s CELP Encode r Decoder A 6 kbit/s Decoder B 8 kbit/s Decoder C 10 kbit/s Decoder D 12 kbit/s Decoder E 22 kbit/s 10 kbit/s high frequency component Core Bit Stream 8 kHz 16 kHz 8 kHz Base Quality High Quality
Coding of Visual Objects
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: Visual Objects High bitrate tools (interlace) VLBV core Content-based functionalities (shape, scalability) functionality Bit rate 5-64kbit/s 1. Low spatial resolution 2. Low frame rate 3. Conventional rectangular size 4. High coding efficiency 1. High spatial resolution 2. High frame rate 3. Broadcast TV 1. Separate coding of content 2. Content-based interactive 3. Object manipulation 4. Hybrid coding of natural as well as synthetic objects …………………. Classification of Video Coding Tools functionality
04/10/2006 Hung Nguyen, IT 481, Spring VLBC Core and the Generic MPEG-4 Coder Motion (MV) Texture (DCT) MPEG-4 VLBV Core Coder (similar to H.263/MPEG-1) Motion (MV) Texture (DCT) Generic MPEG-4 Coder Shape MPEG-4: Visual Objects
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: Visual Objects Natural Video –Conventional and Content-Based Functionality
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: Synthetic Visual Objects Parametric descriptions of –A synthetic description of human face and body (body in Version 2) –Animation streams of the face and body (body in Version 2) Static and Dynamic Mesh Coding with texture mapping Texture Coding for View Dependent application
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: Synthetic Visual Objects – Facial Animation Facial Definition Parameter Facial Animation Parameter Facial animation needs –The Facial Description Parameter (FDP) in Binary Format Scene –The Face Animation Table within FDPs Example : FAP could say ‘open_jaw’ –The Face Interpolation Technique in BIFS Example : Missing FAP
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: Synthetic Visual Objects – Animated Meshes 2D animated meshes –Only triangular meshes View dependent scalability –Taking into account the viewing position in the 3D virtual world –This object is sent and computed both at the encoder and the decoder side
04/10/2006 Hung Nguyen, IT 481, Spring Example of MPEG-4 Visual Objects +
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: Scene Description How objects are grouped together and positioned in space & time Attribute Value Selection Other transforms on media objects scene person 2D background furniture audiovisual presentation spritevoiceglobedesk Sprite: region of a VO present in scene throughout a video segment
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: Intellectual Property Rights (IPR) IPR generally bestows on its owners the right to exclude others (with certain limited exceptions) from the use or re-use of their intellectual property without a license from the IPR owner. Any intellectual property which is delivered, either freely or for commercial gain, through an MPEG-4 application is afforded such protection. More information on MPEG-4 and IPR on sys/sys-faq-ipmp.htm#IPMP-What-is-IP
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: Intellectual Property Rights (IPR) IPMP: Intellectual Property Management and Protection
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: Delivery Multimedia Integration Framework (DMIF)
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: Delivery Multimedia Integration Framework (DMIF) – cont’d Original DMIFTarget DMIF App App 2 App 1 DMIF
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: Applications Scalable & Interactive Application on the WWW “Configurable TV”: choose the elements –subtitles ( +language) –stock values –inset of soccer match Animated talking head with speech synthesis Interactive DVD applications
04/10/2006 Hung Nguyen, IT 481, Spring Interactive content can include anything that enhances the viewer's appreciation of TV— for example, local news headlines, sports scores, program lineups, links to Web sites, advertiser logos, actor/actress profiles, and plot summaries. Interactive TV content is limited only by the imagination. An interactive "i" icon tells viewers that interactive content is available © Microsoft Cooperation MPEG-4 Applications: Configurable TV
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4 Applications: Internet Viewcam Developed by SHARP (1999.3) Record moving picture (160×120) during 1hour with 32MB Memory Recording Format : ASF of Microsoft
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: Other Applications Virtual Work Space –Collaborative applications by 2D or 3D worlds (Server - Clients) Internet e-commerce –Low bit-rate connection –Reduce the original data from several Mbytes to a few Kbytes
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: Other Applications (cont’d) MPEG-4 Audio encoding and decoding –Real time : 6~96 kbps, 44.1kHz –PC & DSP card –Digital audio broadcasting via AM Bandwidth Mobile phone High speed computer links Interactive Multimedia Authoring Tool –Compose, modified, combined of images and video clips
04/10/2006 Hung Nguyen, IT 481, Spring MPEG-4: Summary MPEG-4 will be a standard that defines tools for the representation of Multimedia Objects. Both these tools and the AV Objects can be used and re-used, in different services and applications, and on many media
04/10/2006 Hung Nguyen, IT 481, Spring References [1] Mohammad Khansari,Overview of MPEG-4, Sharif University of Technology, Tehran, Iran. [2] Gerhard Stoll, MPEG-4 Audio Coding, Institut für Rundfunktechnik, München, Germany [1] Multimedia Communications System II, Video Coding Standards, Yao Wang, Polytechnic University, Brooklyn, NY11201