2 Digital 3D Video: Chapter 10, MPEG Multimedia Standards
Jar-Ferr Yang (楊家輝)
Institute of Computer and Communication Engineering, Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan

3 MPEG History
MPEG-1
- Started in 1988 by Leonardo Chiariglione
- Compression standard for progressive frame-based video in SIF (352x240) format
- Applications: VCD
MPEG-2
- Compression standard for interlaced frame-based video in CCIR-601 (720x480) and high-definition (1920x1088i) formats
- Applications: DVD, SVCD, DIRECTV, GA, DVB, HDTV studio
MPEG-4
- Multimedia standard for object-based video from natural or synthetic sources
- Applications: Internet, cable TV, virtual studio, home LAN, etc.
MPEG-7
- Multimedia content description interface
- Applications: Internet, video search engines, digital libraries
MPEG-21
- Intellectual property rights protection

4 MPEG-1 Video Coding
MPEG-1:
* The design focuses on non-interlaced SIF (352x240 pel) video at 1.5 Mbps.
* Trick modes (fast forward, reverse, etc.) are supported.
* The syntax is flexible and can accommodate other picture resolutions.
* Good picture quality was demonstrated, competitive with all public-domain video algorithms.
* Committee Draft ISO 11172 was issued in November 1991, DIS 11172 in November 1992, and the IS in January 1993.
MPEG-1+:
* is MPEG-1 at MPEG-2 resolution (CCIR).
* is used in TV set-top boxes for broadcasting or video on demand.
* has typical data rates of 3.5 Mbps to 5.5 Mbps.
* is about 5%-10% less efficient than MPEG-2.

5 MPEG-1 & 2 Standards
Requirements for MPEG-2 (extension to MPEG-1) include:
* Interlace support
* Scalability/compatibility
* Interoperability with broadcast TV/HDTV
* Cell loss resilience
* Low-delay mode
* The goal is a single syntax standard across applications.
MPEG-1 related standards:
11172-1 system
11172-2 video
11172-3 audio
11172-4 conformance

6 MPEG-2 Related Standards
13818-1 system
13818-2 video
13818-3 audio
13818-4 conformance
13818-5 technical report
13818-6 extension for DSM-CC
13818-7 non-backward-compatible audio coding
13818-8 extension for 10-bit video coding
13818-9 extension for real-time interface for system decoders

7 Digital Consumer Electronics (MPEG Relevant)
[Figure: timeline from 1994 to 2000 of MPEG-related consumer products: Karaoke CD, CD-ROM (VCD), Direct Broadcast Satellite, Computer Games, Digital Cable, Digital Video Disc (DVD), Digital Audio Broadcast, Personal Video Recorder, Digital Camcorder, Video Conferencing, HDTV, Digital TV, Video Streaming]

8 MPEG-2 System (A/V)
[Figure: Video Encoder and Audio Encoder -> System Encoder -> Channel Carrier -> System Decoder -> Video Decoder and Audio Decoder, with one region labelled "Not Covered by the Standard" and the other "Covered by the Standard"]
* The standard ensures that MPEG-encoded video stream data can be decoded correctly.
* A content provider can use an exhaustive method to encode the video data (off-line process).
* A designer can use simple, fast algorithms to encode the video data efficiently in real time.

9 Digital 3D Video. Multiplexing Subsystem: MPEG-2 System (13818-1)

10 MPEG-2 System Information
MPEG-2 Transport Stream (TS) in the MHP context.
MPEG-2 packets can contain:
* Video, audio, teletext, data streaming (13818-1)
* DSM-CC sections (data carousel, object carousel, SI tables, etc.) (13818-6)
* DVB data piping
Content: video, audio, teletext (DVB), SI, conditional access, IP packets, private data, applications, application info.
Multiplexing: TDM.
One TS packet (188 bytes):
* Header with PID (4 bytes)
* Adaptation field (n bytes)
* Payload: PES / section / piped data ((184 - n) bytes)

11 Multiplexing - MPEG-2 System
[Figure: system block diagram. The Video Subsystem (video compression) and Audio Subsystem (audio compression) produce elementary streams (ES); the Transport Multiplexing Subsystem (multiplexer) combines them with ancillary data and control data into a transport stream (TS); the Transmission Subsystem contains the error correction encoder, digital modulation and mixer.]

12 MPEG-2 Layers and Data
* Channel (ATM, SONET, etc.)
* Transport Stream (multiplexed A/V PES and user data)
* PES (Packetized Elementary Stream, audio or video)
* ES (Elementary Streams, compressed data)
[Figure: video data and audio data -> Video Encoder and Audio Encoder -> elementary streams (ES) -> Packetizer -> video PES and audio PES -> Program Stream MUX (Program Stream) or Transport Stream MUX (Transport Stream)]

13 MPEG-2 Packetized ES (PES)
[Figure: MPEG-2 Video codes video frames in display order (I0 B1 B2 P3 B4 B5 P6 B7 B8 ...) into a video elementary stream in coded order (I0 P3 B1 B2 P6 B4 B5 I9 B7 B8 P12 ...). MPEG-2 Audio codes audio tracks into frames of the audio elementary stream, each holding sync, system info and CRC, side information, subband samples and an ancillary data field. MPEG-2 System turns these into the video Packetized Elementary Stream and the audio Packetized Elementary Stream.]

14 Packetized Elementary Stream (PES)
* PES is the basic stream format for video, audio, data, ...
* PES offers a mechanism to carry conditional access information.
* PES can be scrambled and assigned a priority.
* PES can carry time references: PTS and DTS.
* The largest data size within a PES packet is 64 KBytes.
PES indicators. A Packetized Elementary Stream contains:
* PES_Priority: indicates the priority of the current PES packet.
* PES_Scrambling_Control: defines whether scrambling is used, and the chosen scrambling method.
* Data_alignment_indicator: indicates whether the payload starts with a video or audio start code.
* Copyright information: indicates whether the payload is copyright protected.
* Original_or_copy: indicates whether this is the original ES.

15 PES Optional Fields
* Presentation Time Stamp (PTS) and possibly a Decode Time Stamp (DTS): for audio/video streams, these time stamps may be used to synchronize a set of elementary streams and to control the rate at which the receiver replays them.
* Elementary Stream Clock Reference (ESCR)
* Elementary Stream Rate: the rate at which the ES was encoded.
* Trick Mode: indicates that the video/audio is not the normal ES, e.g. after DSM-CC has signaled a replay.
* Copyright Information: set to 1 to indicate a copyrighted ES.
* CRC: may be used to monitor errors in the previous PES packet.
* PES Extension Information: may be used to support MPEG-1 streams.

16 Packetized Elementary Stream (PES)
Elementary Stream (ES) inputs to the MPEG-2 System Processor:
- Digital control stream
- Digital audio (compressed)
- Digital video (compressed)
- Digital data
Output from the MPEG-2 System Encoder: Packetized Elementary Stream (PES) packets (up to 65536 bytes including the 6-byte protocol header).
6-byte protocol header:
- 3 bytes: start code
- 1 byte: stream ID
  - 110x xxxx: audio stream number x xxxx
  - 1110 yyyy: video stream number yyyy
  - 1111 0010: DSM-CC (Digital Storage Media) control packet
- 2 bytes: length field
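The 6-byte protocol header described above can be read with a few lines of Python. This is a minimal sketch, not a full PES parser: the function names are hypothetical, and the start code value 0x000001 is taken from the start-code prefix quoted later in the deck.

```python
def classify_stream_id(stream_id):
    """Map a PES stream ID byte to the categories listed on the slide."""
    if not 0 <= stream_id <= 0xFF:
        raise ValueError("stream ID must fit in one byte")
    if stream_id & 0xE0 == 0xC0:        # 110x xxxx
        return f"audio stream {stream_id & 0x1F}"
    if stream_id & 0xF0 == 0xE0:        # 1110 yyyy
        return f"video stream {stream_id & 0x0F}"
    if stream_id == 0xF2:               # 1111 0010
        return "DSM-CC control packet"
    return "other"


def parse_pes_protocol_header(data):
    """Return (stream_id, length_field) from the first 6 bytes of a PES packet."""
    if len(data) < 6:
        raise ValueError("need at least 6 bytes")
    if data[0:3] != b"\x00\x00\x01":    # 3-byte start code (assumed prefix value)
        raise ValueError("missing start code prefix")
    stream_id = data[3]
    length_field = int.from_bytes(data[4:6], "big")  # 2-byte length field
    return stream_id, length_field


header = bytes([0x00, 0x00, 0x01, 0xE0, 0x00, 0x10])
stream_id, length_field = parse_pes_protocol_header(header)
print(classify_stream_id(stream_id), length_field)
```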

17 PES Packet Syntax Diagram
PES packet (bits): Packet Start Code Prefix (24), Stream ID (8), PES Packet Length (16), PES Header (optional), PES Packet Data Bytes.
Optional PES header (bits): '10' (2), PES Scrambling Control (2), PES Priority (1), Data Alignment Indicator (1), Copyright (1), Original or Copy (1), 7 Flags (8), PES Header Length (8), Optional Fields, Stuffing Bytes (0xFF, m * 8).
Optional fields (bits): PTS/DTS (33), ESCR (42), ES Rate (22), DSM Trick Mode (8), Additional Copy Info (7), Previous PES CRC (16), PES Extension.
PES extension: 5 Flags, then optional fields: PES Private Data (128), Pack Header Field (8), Program Packet Seq Counter, P-STD Buffer (16), PES Extension Length (7), PES Extension Data.

18 Transport Stream Formation
[Figure: Video 1 PES (Program 1), Video 2 PES (Program 2) and Audio 1 PES are carried in the 188-byte packets of one transport stream]

19 PID numbers for Program Specific Information (PSI) used for Service Information (SI)
PID              Table          Description
0x0000           PAT            Program Association Table
0x0001           CAT            Conditional Access Table
0x0002           TSDT           Transport Stream Description Table (DVB)
0x0003-0x000F    reserved
0x0010           NIT, ST        Network Information Table and Stuffing Table
0x0011           SDT, BAT, ST   Service Description Table, Bouquet Association Table
0x0012           EIT, ST        Event Information Table
0x0013           RST, ST        Running Status Table
0x0014           TDT, TOT, ST   Time and Date Table and Time Offset Table
0x0015                          Network synchronization
0x0016-0x001D    reserved for future use
0x001E           DIT            Discontinuity Information Table
0x001F           SIT            Selection Information Table
Transport stream packet (188 bytes): sync (1 byte), header (3 bytes, including the 13-bit Packet Identifier, PID), optional adaptation field (X bits), payload (184 bytes, less any adaptation field).

20 MPEG-2 Signaling Tables
* PAT - Program Association Table: lists the PIDs of the tables describing each program. The PAT is sent with the well-known PID value of 0x0000.
* PMT - Program Map Table: defines the set of PIDs associated with a program, e.g. audio, video, ...
* CAT - Conditional Access Table: defines the type of scrambling used and the PID values of the transport streams that contain the conditional access management and entitlement information (EMM).
* NIT - Network Information Table: PID = 0x0010; contains details of the bearer network used to transmit the MPEG multiplex, including the carrier frequency.
* DSM-CC - Digital Storage Media Command and Control: messages to the receiver.

21 MPEG-2/DVB PID Allocation
* The Program Association Table (PAT) always has PID = 0 (zero).
* The Conditional Access Table (CAT) always has PID = 1.
* The Event Information Table (EIT) always has PID = 18 (0x0012).
* Program Map Tables (PMTs) have the PIDs specified in the PAT.
* The audio, video, PCR, subtitle, teletext, etc. PIDs for all programs are specified in their respective PMTs.
Table                      PID value
PAT                        0x0000
CAT                        0x0001
TSDT                       0x0002
Reserved                   0x0003 - 0x000F
NIT, ST                    0x0010
SDT, BAT, ST               0x0011
EIT, ST                    0x0012
RST, ST                    0x0013
TDT, TOT, ST               0x0014
Network synchronization    0x0015
Reserved                   0x0016 - 0x001B
Inband signaling           0x001C
Measurement                0x001D
DIT                        0x001E
SIT                        0x001F

22 Transport Stream Packet
[Figure: a transport stream of 188-byte packets carrying PES 1, PES 2, ..., PES N]
Header fields (bits): Sync Byte (8), Transport Error Indicator (1), Payload Unit Start Indicator (1), Transport Priority (1), PID (13), Scrambling Control (2), Adaptation Field Control (2), Continuity Counter (4), followed by the adaptation field and payload.
Adaptation field (bits): Adaptation Field Length (8), Discontinuity Indicator (1), Random Access Indicator (1), PES Priority (1), 5 Flags (5), Optional Fields, Stuffing Bytes.
Optional fields (bits): PCR (42), OPCR (42), Splice Countdown (8), Private Data Length (8), Private Data, Adaptation Field Extension Length (8), 3 Flags (3), Optional Fields.
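The header field widths above (8, 1, 1, 1, 13, 2, 2 and 4 bits) can be unpacked from the first four bytes of a packet. The following is a sketch only: the function name and dictionary keys are hypothetical, and the sync byte value 0x47 comes from the MPEG-2 Systems standard rather than from the slide.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47  # value from ISO/IEC 13818-1, not shown on the slide


def parse_ts_header(packet):
    """Unpack the 4-byte transport stream packet header into named fields."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a transport stream packet is 188 bytes")
    if packet[0] != SYNC_BYTE:
        raise ValueError("sync byte not found")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return {
        "transport_error_indicator": (b1 >> 7) & 0x1,
        "payload_unit_start_indicator": (b1 >> 6) & 0x1,
        "transport_priority": (b1 >> 5) & 0x1,
        "pid": ((b1 & 0x1F) << 8) | b2,           # 13 bits
        "scrambling_control": (b3 >> 6) & 0x3,
        "adaptation_field_control": (b3 >> 4) & 0x3,
        "continuity_counter": b3 & 0xF,
    }


# Example: a packet on PID 0x0011 that starts a new payload unit.
example = bytes([0x47, 0x40, 0x11, 0x1A]) + bytes(184)
print(parse_ts_header(example))
```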

23 Program Clock Reference
PCR:
* Program Clock Reference (PCR) base: the intended time, in 90 kHz clock ticks, of the arrival at the decoder input of the fourth byte of this structure.
* Reserved bits.
* PCR extension: additional resolution, in 27 MHz clock ticks.
* PCR = 300 * base + ext
OPCR:
* Original PCR (OPCR) base: should not be modified by any multiplexer or decoder. It is used to recover a single-program PCR from a multi-program transport stream.
* Reserved bits.
* Original PCR extension.
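The relation PCR = 300 * base + ext can be worked through in code. This sketch assumes the 48-bit field layout of ISO/IEC 13818-1 (33-bit base, 6 reserved bits, 9-bit extension), which the slide does not spell out; the function names are hypothetical.

```python
def decode_pcr(field):
    """Combine base and extension of a 6-byte PCR field into 27 MHz ticks."""
    if len(field) != 6:
        raise ValueError("the PCR field is 6 bytes long")
    value = int.from_bytes(field, "big")
    base = value >> 15          # top 33 bits, counted at 90 kHz (assumed layout)
    ext = value & 0x1FF         # bottom 9 bits, counted at 27 MHz
    return 300 * base + ext


def pcr_to_seconds(pcr):
    """Convert a PCR value in 27 MHz ticks to seconds."""
    return pcr / 27_000_000


# One second of 90 kHz ticks in the base, nothing in the extension.
one_second = ((90_000 << 15) | (0x3F << 9)).to_bytes(6, "big")
print(decode_pcr(one_second), pcr_to_seconds(decode_pcr(one_second)))
```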

24 MPEG-2 Transport Stream
[Figure: three programs, each with a video encoder, an audio encoder and data (Video_1/Audio_1/Data_1, Video_2/Audio_2/Data_2, Video_3/Audio_3/Data_3), feed packetizers. Program 1 yields transport packets TP1_1, TP1_2, TP1_3; Program 2 yields TP2_1, TP2_2, TP2_3; Program 3 yields TP3_1, TP3_2, TP3_3. The transport mux interleaves them into one transport stream: TP1_1 TP2_1 TP1_2 TP2_2 TP3_1 TP1_3 TP2_3 TP3_2 TP3_3.]

25 MPEG-2 Signaling Tables
* The packet header includes a unique Packet ID (PID) for each stream.
* The Program Association Table (PAT) lists the PIDs of the program map tables: Network Info = 10, Prog 1 = 150, Prog 2 = 301, Prog 3 = 511, etc.
* The Program Map Table (PMT) lists the PIDs associated with a particular program: Video = 51, Audio (English) = 64, Audio (French) = 66, Subtitle = 101, etc.
* Other packets: program guides, subtitles, multimedia data, Internet packets, etc.
[Figure: transport stream packets labelled with PIDs 51, 66, 64, 0, 150 and 101 (video packet, audio packets, PAT, PMT and other packets)]
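The two-step lookup (PAT first, then the PMT it points to) can be sketched with plain dictionaries using the example PIDs above. The data structures and function name are hypothetical, and it is an assumption that the listed PMT belongs to Program 1 (PID 150); a real demultiplexer would parse these tables from sections in the stream.

```python
# Program number -> PID; program 0 points at network information, not a PMT.
PAT = {0: 10, 1: 150, 2: 301, 3: 511}

# PMT PID -> elementary stream PIDs (only the slide's example is filled in;
# assigning it to PID 150 is an assumption).
PMT_SECTIONS = {
    150: {"video": 51, "audio (English)": 64, "audio (French)": 66, "subtitle": 101},
}


def elementary_pids(program_number):
    """Resolve a program number to its elementary stream PIDs via PAT and PMT."""
    if program_number == 0:
        raise ValueError("program 0 carries network information, not a PMT")
    pmt_pid = PAT[program_number]       # step 1: PAT gives the PMT PID
    return PMT_SECTIONS[pmt_pid]        # step 2: PMT gives the stream PIDs


wanted = elementary_pids(1)
print(sorted(wanted.values()))  # PIDs the demultiplexer should keep for Program 1
```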

26 MPEG-2 / DVB PSI Structure
Program Association Table (PID = 0), table section ID always set to 0x00:
Program 0    PID = 16
Program 1    PID = 22
Program 2    PID = 33
...
Program M    PID = 55
Program Map Table for Program 1, table section ID always set to 0x02:
Stream 1    PCR        31
Stream 2    Video 1    54
Stream 3    Audio 1    48
Stream 4    Audio 2    49
...
Stream k    Data K     66
Program Map Table for Program 2:
Stream 1    PCR        41
Stream 2    Video 1    19
Stream 3    Audio 1    81
Stream 4    Audio 2    82
...
Stream k    Data K     88
Conditional Access Table (PID = 1), table section ID always set to 0x01:
CA Section 1 (Program 1)    EMM PID (99)
CA Section 2 (Program 2)    EMM PID (109)
CA Section 3 (Program 3)    EMM PID (119)
...
CA Section k (Program k)    EMM PID (x)
Network Information Table (always Program 0, PID = 16), table section ID assigned by the system. The NIT is considered private data by ISO:
Private Section 1    NIT Info.
Private Section 2    NIT Info.
Private Section 3    NIT Info.
...
Private Section k    NIT Info.
Multiple-program MPEG-2 transport stream (PID and content of successive packets): 0 PAT, 22 Prog 1 PMT, 33 Prog 2 PMT, 99 Prog 1 EMM, 31 Prog 1 PCR, 48 Prog 1 Audio 1, 54 Prog 1 Video 1, 109 Prog 2 EMM.

27 Transport Multiplexing & Decoding
[Figure: (a) channel -> channel-specific decoder -> transport stream containing one or multiple programs -> transport stream demultiplex and decoder (with clock control) -> video decoder and audio decoder -> decoded video and decoded audio. (b) channel-specific decoder -> transport stream demultiplex and decoder -> transport stream with a single program.]
Note: Program Stream ≠ Transport Stream.

28 Transport Stream Decoder
[Figure: each of three paths passes through a transport buffer, a multiplex buffer and an ES stream buffer. The video path feeds the video decoder and a re-order buffer (decoded video), the audio path feeds the audio decoder (decoded audio), and the system information path feeds the system info decoder and system control.]

29 Digital 3D Video. Video Subsystem: MPEG-2 Video (13818-2)

30 Multiplexing - MPEG-2 System
[Figure: the system block diagram of slide 11 repeated: Video Subsystem and Audio Subsystem (ES), Transport Multiplexing Subsystem (TS), Transmission Subsystem]

31 Digital Video: Color Space Transform
Color representation:
* Colors can be represented by a mixture of three primaries: R, G, B.
* Various equivalent color spaces are possible.
* Many important color spaces comprise a luminance component and two chrominance components, for example (R, G, B) <-> (Y, U, V) or (Y, Cr, Cb).

32 Layer Structure in Video Sequences
* Sequence: the highest syntactic structure of the coded video bitstream (sequence header, sequence extension).
* Group of pictures: group of pictures header, group of pictures.
* Picture: picture header, picture extension, picture data.
* Slice.
* Macroblock.
* Block.

33 Hierarchical Video Bitstream
[Figure: video sequence -> group of pictures -> picture -> slice -> macroblock -> block (8 pixels x 8 pixels)]

34 MPEG-1/2 Video Coding
Technical summary of MPEG-1/2:
* Compression is based on motion-compensated interframe DCT.
* The group of pictures (GOP) is the basic video unit for compression.
* Motion compensation uses an I, P, B frame structure:
  - Intra (I) frames: intraframe DCT
  - Predicted (P) frames: DCT with forward MC
  - Bidirectional (B) frames: DCT with forward/backward MC
[Figure: a GOP with N = 9 and M = 3 (I B B P B B P B B, followed by the next I); P frames use forward motion compensation and B frames use bidirectional motion compensation]

35 MPEG-1/2 Video Coding
I and P pictures:
- are called "anchor" pictures
- are stored in memory
- form the basis for prediction of P and B pictures
GOP rules:
- A GOP must contain at least one I picture.
- This I picture may be followed by any number of I and P pictures.
- Any number of B pictures may occur between anchor pictures, and B pictures may precede the first I picture.
- A GOP, in coding order, must start with an I picture.
- A GOP, in display order, must start with an I or B picture and must end with an I or P picture.

36 Sequence of Frames
Display order (temporal reference):  0 1 2 3 4 5 6 7 8 9 10
Picture type:                        I B B P B B P B B I B
Encoding order of frames:            0 3 1 2 6 4 5 9 7 8 12
Picture type:                        I P B B P B B I B B P
[Figure: a group of pictures (GOP) showing intra frame coding for I pictures, forward prediction for P pictures and bidirectional prediction for B pictures]
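The reordering from display order to encoding order follows one rule: each anchor (I or P) picture is sent before the B pictures that precede it in display order. A minimal sketch of that rule, with a hypothetical function name, reproduces the encoding order on this slide:

```python
def coding_order(display_types):
    """Return display indices in the order an encoder would emit them.

    display_types is a string such as "IBBPBBPBBI" in display order.
    """
    order = []
    pending_b = []                  # B pictures waiting for their next anchor
    for index, picture_type in enumerate(display_types):
        if picture_type == "B":
            pending_b.append(index)
        else:                       # I or P: emit the anchor, then the held B pictures
            order.append(index)
            order.extend(pending_b)
            pending_b.clear()
    order.extend(pending_b)         # trailing B pictures with no later anchor in the input
    return order


print(coding_order("IBBPBBPBBI"))
```

For the ten pictures "IBBPBBPBBI" this yields 0 3 1 2 6 4 5 9 7 8, the start of the encoding order shown on the slide.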

37 Slice
* A slice is a series of an arbitrary number of macroblocks.
* Every slice shall contain at least one macroblock.
* The first and last macroblocks of a slice shall not be skipped macroblocks.
* Slices shall not overlap.
* The first and last macroblocks of a slice shall be in the same horizontal row of macroblocks.
* Slices shall occur in the bitstream in the order in which they are encountered.

38 General Slice Structure
[Figure: example picture partitioned into slices labelled A to I]

39 Restricted Slice Structure
[Figure: example pictures partitioned into slices labelled A to I and A to O]

40 Macroblock
* 4:2:0 MB structure: luminance blocks 0, 1, 2, 3; chrominance blocks 4, 5
* 4:2:2 MB structure: luminance blocks 0, 1, 2, 3; chrominance blocks 4, 6 and 5, 7
* 4:4:4 MB structure: luminance blocks 0, 1, 2, 3; chrominance blocks 4, 6, 8, 10 and 5, 7, 9, 11

41 Chrominance Subsampling
* Most image information is contained in the Y channel.
* Chrominance data can be subsampled without significant degradation in image quality.
* Full resolution is YUV 4:4:4.
* Downsampling 2:1 horizontally gives YUV 4:2:2.
* Downsampling 2:1 horizontally and vertically gives YUV 4:2:0.
* Downsampling 4:1 horizontally gives YUV 4:1:1 (VHS quality).
[Figure: Y and Cr/Cb sample grids for YUV 4:4:4, 4:2:2, 4:2:0 and 4:1:1]
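The saving from each subsampling format is simple arithmetic on the chroma plane sizes. The sketch below uses hypothetical names and assumes 8 bits per sample and dimensions divisible by the subsampling factors:

```python
# Format -> (horizontal, vertical) chroma downsampling factors from the list above.
SUBSAMPLING = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
    "4:1:1": (4, 1),
}


def plane_sizes(width, height, fmt):
    """Return ((Y width, Y height), (chroma width, chroma height))."""
    horizontal, vertical = SUBSAMPLING[fmt]
    return (width, height), (width // horizontal, height // vertical)


def bits_per_pixel(fmt, bit_depth=8):
    """Average bits per pixel: one Y sample plus two subsampled chroma samples."""
    horizontal, vertical = SUBSAMPLING[fmt]
    return bit_depth * (1 + 2 / (horizontal * vertical))


for fmt in SUBSAMPLING:
    print(fmt, plane_sizes(720, 480, fmt), bits_per_pixel(fmt))
```

Under these assumptions 4:2:0 and 4:1:1 both average 12 bits per pixel, half of the 24 bits of 4:4:4.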

42 Picture Sampling Formats
[Figure: positions of luminance samples and chrominance samples in the 4:2:0, 4:2:2 and 4:4:4 formats]

43 Coding and Decoding Processes
Source -> Preprocessing -> Encoding -> Storage and/or Transmission -> Decoding -> Postprocessing -> Display

44 Preprocessing - Decimation
720 x 480/576 -> vertical sampling -> 720 x 240/288 -> horizontal sampling -> 360 x 240/288
Decimation filters:
* [1 3 3 1] / 8, with rounding
* [-29 0 88 138 88 0 -29] / 256, with rounding
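A decimation filter of this kind can be applied to one row of samples as sketched below. The function name is hypothetical, edge samples are replicated (an assumption, since the slide does not say how borders are handled), and the seven-tap filter is taken with both outer taps equal to -29 so that the taps sum to 256 and a flat signal keeps its level.

```python
SEVEN_TAP = [-29, 0, 88, 138, 88, 0, -29]   # divisor 256; taps assumed to sum to 256
FOUR_TAP = [1, 3, 3, 1]                     # divisor 8


def decimate_by_two(samples, taps, divisor):
    """Low-pass filter a row of samples and keep every second output."""
    half = len(taps) // 2
    last = len(samples) - 1
    output = []
    for center in range(0, len(samples), 2):
        accumulator = 0
        for k, tap in enumerate(taps):
            index = min(max(center + k - half, 0), last)   # replicate edge samples
            accumulator += tap * samples[index]
        output.append((accumulator + divisor // 2) // divisor)  # divide with rounding
    return output


row = [100] * 8
print(decimate_by_two(row, SEVEN_TAP, 256))
print(decimate_by_two(row, FOUR_TAP, 8))
```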

45 Postprocessing - Interpolation
360 x 240/288 -> horizontal upsampling -> 720 x 240/288 -> vertical upsampling -> 720 x 480/576
Interpolation filters:
* [-12 0 140 256 140 0 -12] / 256, with rounding
* [1 3 3 1] / 8, with rounding

46 Macroblock to Blocks
[Figure: partition of a luminance macroblock (MB) into blocks for frame DCT coding and for field DCT coding]

47 MPEG Video Encoding
[Figure: encoder block diagram. Ordered source pictures feed the motion estimator and the MC mode decision; the prediction from the picture predictor and store is subtracted from the source to form the residual, which passes through DCT and Q to the lossless coder and buffer, producing the coded video bit stream under rate control. Q^-1 and IDCT, added to the prediction, rebuild the decoded picture for the predictor. MVs and MC modes also go to the lossless coder.]

48 MPEG-1/2 Video Coding: Motion Compensation
* Tracks motion between the current frame and the anchor frame
* MPEG supports up to a +/- 512 pixel range
* Subpixel accuracy (e.g. half pixel)
* Macroblock basis
[Figure: motion vectors from the anchor picture to the current picture, and the predictive difference]

49 MPEG-1/2 Video Coding: Discrete Cosine Transform (DCT)
* Transforms an 8x8 pel block into a frequency coefficient matrix
* Organizes video information in a way that is easy to compress and manipulate
* The DCT is applied to intra blocks as well as motion-compensated blocks
Frequency coefficients after the DCT (low frequencies at the top left, high frequencies at the bottom right):
 276   59   89   39    7  -13  -12   -7
 137  -94  -35    4   17   16    7    2
  51   25  -42  -20  -14   -4   -5   -5
 -12   40   -8  -16   -4    0    2   -1
  -8    3   17  -13   -4    0   -1    0
   2   14   14    5   -7    0   -1    0
  -1   -3   -2   12    0   -4   -2    1
  -6    2   -6    6    8   -5   -1    0
[Figure: the 8x8 pixel block before the DCT, only partly legible: 255 196 0 0 0 0 17 94 / 255 255 255 255 255 122 0 0 / 35 136 213 255 255 247 43 0 / 0 0 0 0 255 255 82 0 / 0 0 0 0 255 255 128 0 / 0 0 0 0 ...]
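The 8x8 transform itself can be written directly from the standard two-dimensional DCT-II definition. The sketch below is a slow reference form for illustration (real codecs use fast factorizations); the function name is hypothetical and the 1/4 scaling with 1/sqrt(2) weights on the zero-frequency terms is the usual orthonormal convention, which the slide does not state.

```python
import math


def dct_8x8(block):
    """Forward 8x8 DCT-II of a block given as 8 rows of 8 pixel values."""
    n = 8

    def weight(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coefficients = [[0.0] * n for _ in range(n)]
    for v in range(n):              # vertical frequency
        for u in range(n):          # horizontal frequency
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coefficients[v][u] = 0.25 * weight(u) * weight(v) * total
    return coefficients


flat = [[128] * 8 for _ in range(8)]
result = dct_8x8(flat)
print(round(result[0][0], 6))   # a flat block puts all its energy in the DC term
```

A flat block of value 128 gives a DC coefficient of 8 x 128 = 1024 and AC coefficients that are zero up to rounding error, which is why smooth image areas compress well.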

50 MPEG-1 Bit Stream Organization
Seq. Header | GOP Header | Picture Header | Slice Header | MB Header | Block Data

51 MPEG-2 Bit Stream Organization
[Figure: syntax flow through Sequence Header, Sequence Extension, Extension & User, GOP Header, Extension & User, Picture Header, Picture Coding Extension, Extension & User, Picture Data, Sequence Header (repeated) and Sequence End, with a branch labelled ISO/IEC 11172-2]

52 MPEG-2 Start Codes
Start code: a code in the bitstream with the prefix "0000 0000 0000 0000 0000 0001".
Name                    Start code value (hex)
picture_start_code      00
slice_start_code        01 through AF
reserved                B0
reserved                B1
user_data_start_code    B2
sequence_header_code    B3
sequence_error_code     B4
extension_start_code    B5
reserved                B6
sequence_end_code       B7
group_start_code        B8
system start codes      B9 through FF
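A scanner for these start codes looks for the 24-bit prefix and then classifies the byte that follows. This is a sketch with hypothetical function names; the value B7 is treated as sequence_end_code, as in ISO/IEC 13818-2.

```python
PREFIX = b"\x00\x00\x01"   # 0000 0000 0000 0000 0000 0001

NAMED_CODES = {
    0x00: "picture_start_code",
    0xB2: "user_data_start_code",
    0xB3: "sequence_header_code",
    0xB4: "sequence_error_code",
    0xB5: "extension_start_code",
    0xB7: "sequence_end_code",     # name per ISO/IEC 13818-2
    0xB8: "group_start_code",
}


def start_code_name(value):
    """Classify the byte that follows the start code prefix."""
    if value in NAMED_CODES:
        return NAMED_CODES[value]
    if 0x01 <= value <= 0xAF:
        return "slice_start_code"
    if value >= 0xB9:
        return "system start code"
    return "reserved"


def find_start_codes(data):
    """Yield (byte offset, start code value) for every start code in data."""
    position = data.find(PREFIX)
    while position != -1 and position + 3 < len(data):
        yield position, data[position + 3]
        position = data.find(PREFIX, position + 3)


stream = (b"\x00\x00\x01\xb3\xaa" b"\x00\x00\x01\xb8\xbb"
          b"\x00\x00\x01\x00\xcc" b"\x00\x00\x01\x01\xdd")
for offset, value in find_start_codes(stream):
    print(offset, start_code_name(value))
```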

53 Video Sequence Structure
[Figure: Video Sequence 1, Sequence 2, ... -> GOP 1, GOP 2, ..., GOP 12, GOP 13 -> Picture (I), Picture (B), ..., Picture (I) -> Slice 1, Slice 2, ..., Slice N -> MB 1, MB 2, ... -> Block 1, Block 2, ..., Block 6]

54 Simplified MPEG Video Decoding
[Figure: coded video bit stream -> lossless decoder (VLC/RLC) -> Q^-1 -> IDCT -> adder -> ordered pictures; the MVs and MV mode drive motion compensation from the picture frame buffer]

55 Simplified MPEG Video Decoder
[Figure: coded data -> variable length decoding -> QFS[n] -> inverse scan -> QF(u, v) -> inverse quantization -> F(u, v) -> inverse DCT -> f(x, y) -> motion compensation (with frame-store memory) -> d(x, y)]

56 MPEG Video Decoding (with SNR Scalability)
[Figure: two decoding chains. Lower layer: coded data -> variable length decoder -> QFS[n] -> inverse scanning -> QF(v, u) -> inverse quantisation -> F_L''(v, u). Enhancement layer: coded data -> variable length decoder -> inverse scanning -> inverse quantisation -> F_E''(v, u). The two outputs are summed to F''(v, u), then saturation -> F'(v, u) -> mismatch control -> F(v, u) -> inverse DCT -> f(x, y) -> motion compensation (with frame memory) -> d(x, y), the decoded pixels.]

57 Variable Length Code Decoding
Intra VLC decoding:
* DC coefficients in intra blocks are encoded as a VLC, dct_dc_size, followed by an FLC, dct_dc_differential, of dct_dc_size bits.
* A differential value is first recovered from the coded data and is then added to a prediction to recover the final decoded coefficient.
* Three DC predictors are maintained, one for each of the color components.
* The predictor shall be set to the value of the coefficient just decoded.
* In the following cases the predictors shall be reset to 2^(7 + intra_dc_precision):
  - at the start of a slice
  - whenever a non-intra macroblock is decoded
  - whenever a macroblock is skipped
VLC decoding of other coefficients:
* When an "End of Block" is decoded, the remaining coefficients in the block shall be set to zero.
* A code for a run-level pair (r, l) followed by a sign bit s is decoded into r zeros followed by a signed level, (s == 0) ? l : -l.
* When an "Escape" code is encountered, it is followed by a 6-bit run and a 12-bit level.
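The DC prediction rules can be sketched as a small state holder. The class and method names below are hypothetical, and the variable-length decoding of dct_dc_size itself is left out; only the predictor update, the reset value 2^(7 + intra_dc_precision) and the sign rule for run-level pairs are shown.

```python
class DcPredictors:
    """Three intra DC predictors, one per color component (0 = Y, 1 and 2 = chroma)."""

    def __init__(self, intra_dc_precision):
        self.reset_value = 1 << (7 + intra_dc_precision)
        self.values = [self.reset_value] * 3

    def reset(self):
        """Call at a slice start, after a non-intra macroblock, or after a skip."""
        self.values = [self.reset_value] * 3

    def decode(self, component, differential):
        """Add the decoded differential to the prediction and update the predictor."""
        coefficient = self.values[component] + differential
        self.values[component] = coefficient
        return coefficient


def signed_level(level, sign_bit):
    """Apply the sign bit of a run-level pair: (s == 0) ? l : -l."""
    return level if sign_bit == 0 else -level


predictors = DcPredictors(intra_dc_precision=0)
print(predictors.decode(0, 5), predictors.decode(0, -3), signed_level(7, 1))
```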

58 Inverse and Alternative Scans
Inverse scan (zigzag); rows are v = 0..7, columns are u = 0..7:
      0   1   2   3   4   5   6   7
0     0   1   5   6  14  15  27  28
1     2   4   7  13  16  26  29  42
2     3   8  12  17  25  30  41  43
3     9  11  18  24  31  40  44  53
4    10  19  23  32  39  45  52  54
5    20  22  33  38  46  51  55  60
6    21  34  37  47  50  56  59  61
7    35  36  48  49  57  58  62  63
Alternative scan; rows are v = 0..7, columns are u = 0..7:
      0   1   2   3   4   5   6   7
0     0   4   6  20  22  36  38  52
1     1   5   7  21  23  37  39  53
2     2   8  19  24  34  40  50  54
3     3   9  18  25  35  41  51  55
4    10  17  26  30  42  46  56  60
5    11  16  27  31  43  47  57  61
6    12  15  28  32  44  48  58  62
7    13  14  29  33  45  49  59  63
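The zigzag table does not have to be stored: it can be generated by walking the anti-diagonals of the block in alternating directions. The sketch below uses hypothetical function names and then applies the table the way the decoder does, placing the one-dimensional coefficient list QFS[n] into the two-dimensional block QF[v][u].

```python
def zigzag_table(n=8):
    """Build the n x n table scan[v][u] giving the position of (v, u) in zigzag order."""
    table = [[0] * n for _ in range(n)]
    index = 0
    for diagonal in range(2 * n - 1):                 # diagonal = v + u
        rows = range(max(0, diagonal - n + 1), min(diagonal, n - 1) + 1)
        if diagonal % 2 == 0:
            rows = reversed(rows)                     # even diagonals run bottom-left to top-right
        for v in rows:
            table[v][diagonal - v] = index
            index += 1
    return table


def inverse_scan(qfs, table):
    """Place the 64 coefficients QFS[n] into the block QF[v][u]."""
    return [[qfs[table[v][u]] for u in range(8)] for v in range(8)]


scan = zigzag_table()
for row in scan:
    print(" ".join(f"{value:2d}" for value in row))
```

The alternative scan has no such simple generating rule and is normally kept as a stored table.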

59 Inverse Quantization Process
[Figure: QF(v, u) -> inverse quantization arithmetic (using W(w, v, u) and quantizer_scale_code) -> F''(v, u) -> saturation -> F'(v, u) -> mismatch control -> F(v, u)]
Inverse quantization arithmetic, intra DC: in intra blocks, F''(0, 0) shall be obtained by multiplying QF(0, 0) by a constant multiplier, intra_dc_mult:
F''(0, 0) = intra_dc_mult x QF(0, 0)
Relation between intra_dc_precision and intra_dc_mult:
intra_dc_precision    Bits of precision    intra_dc_mult
0                     8                    8
1                     9                    4
2                     10                   2
3                     11                   1

60 Inverse Quantization Arithmetic
quantizer_scale is computed from qc = quantizer_scale_code:
\[
\text{quantizer\_scale} =
\begin{cases}
2\,qc, & \text{q\_scale\_type} = 0 \\
8\,\bigl(2^{\lfloor qc/8 \rfloor} - 1\bigr) + (qc \bmod 8)\,2^{\lfloor qc/8 \rfloor}, & \text{q\_scale\_type} = 1
\end{cases}
\]
Inverse quantization of AC coefficients: F''(v, u) shall be reconstructed from QF(v, u) by
\[
F''(v,u) = \frac{\bigl(2\,QF(v,u) + k\bigr)\, W(w,v,u)\, \text{quantizer\_scale}}{32}
\]
where k = 0 for intra blocks and k = sign(QF(v, u)) for non-intra blocks.
There are two weighting matrices W(w, v, u) for 4:2:0 data and four for other cases. The index w = 0, 1, 2, 3 indicates which of the matrices is being used. The following table shows the rules governing the selection of w:
                     4:2:0                 4:2:2 and 4:4:4
                     lumin.    chrom.      lumin.    chrom.
intra blocks         0         0           0         2
non-intra blocks     1         1           1         3
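Both formulas on this slide translate directly into integer arithmetic. In the sketch below the function names are hypothetical, and the division by 32 truncates toward zero, which is how ISO/IEC 13818-2 defines "/" (an assumption as far as this slide is concerned).

```python
def quantizer_scale(quantizer_scale_code, q_scale_type):
    """Map quantizer_scale_code (1..31) to quantizer_scale."""
    if not 1 <= quantizer_scale_code <= 31:
        raise ValueError("quantizer_scale_code must be in 1..31")
    if q_scale_type == 0:
        return 2 * quantizer_scale_code
    power = 1 << (quantizer_scale_code // 8)          # 2 ** floor(qc / 8)
    return 8 * (power - 1) + (quantizer_scale_code % 8) * power


def dequantize_ac(qf, weight, scale, intra):
    """Reconstruct F''(v, u) from QF(v, u), a weighting matrix entry and quantizer_scale."""
    if intra or qf == 0:
        k = 0
    else:
        k = 1 if qf > 0 else -1                       # sign(QF)
    product = (2 * qf + k) * weight * scale
    magnitude = abs(product) // 32                    # truncate toward zero (assumed)
    return magnitude if product >= 0 else -magnitude


print([quantizer_scale(code, 1) for code in (1, 8, 9, 16, 17, 24, 25, 31)])
print(dequantize_ac(3, 16, 10, intra=True), dequantize_ac(-3, 16, 10, intra=False))
```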

61 Saturation and Mismatch Control
Saturation: the coefficients resulting from the inverse quantization arithmetic are saturated to lie in the range [-2048, +2047]:
\[
F'(v,u) =
\begin{cases}
2047, & F''(v,u) > 2047 \\
F''(v,u), & -2048 \le F''(v,u) \le 2047 \\
-2048, & F''(v,u) < -2048
\end{cases}
\]
Mismatch control: mismatch control shall be performed by any process equivalent to the following:
\[
\text{sum} = \sum_{v=0}^{7} \sum_{u=0}^{7} F'(v,u)
\]
\[
F(v,u) = F'(v,u) \quad \text{for all } u, v \text{ except } u = v = 7
\]
\[
F(7,7) =
\begin{cases}
F'(7,7), & \text{sum is odd} \\
F'(7,7) - 1, & \text{sum is even and } F'(7,7) \text{ is odd} \\
F'(7,7) + 1, & \text{sum is even and } F'(7,7) \text{ is even}
\end{cases}
\]
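Saturation followed by mismatch control can be sketched in a few lines. The function names are hypothetical; the block is taken as 8 rows of 8 integers indexed as block[v][u].

```python
def saturate(value):
    """Clamp a dequantized coefficient to [-2048, +2047]."""
    return max(-2048, min(2047, value))


def saturate_and_control_mismatch(block):
    """Return F(v, u) from F''(v, u): saturate, then force the coefficient sum to be odd."""
    result = [[saturate(value) for value in row] for row in block]
    total = sum(sum(row) for row in result)
    if total % 2 == 0:
        # Toggle the parity of F(7, 7): subtract 1 if it is odd, add 1 if it is even.
        result[7][7] += -1 if result[7][7] % 2 else 1
    return result


zeros = [[0] * 8 for _ in range(8)]
print(saturate_and_control_mismatch(zeros)[7][7])
```

After this step the sum of the 64 coefficients is always odd, so an all-zero block ends up with F(7, 7) = 1.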

62 Inverse DCT
* Once the DCT coefficients F(v, u) are reconstructed, the inverse DCT shall be applied to obtain the inverse-transformed values f(x, y).
* If the block or the macroblock is skipped, f(x, y) for that block shall take the value zero.
* Saturate f(x, y) to the range [-256, +255] for all x, y.

63 Simplified Motion Compensation
[Figure: block diagram of the simplified motion compensation process. Blocks: Vector Decoding, Vector Predictors, Additional Dual-Prime Arithmetic, Scaling for Colour Components, Framestore Addressing, Framestores, Half-pel Prediction Filtering, Combine Predictions, Saturation. Signals: From Bitstream, Prediction Field/Frame Selection, Half-Pixel Information, vector[r][s][t], p(y, x), f(x, y), d(x, y), Decoded Pixels.]

64 Motion Vectors
* Motion vectors are coded differentially with respect to previously decoded motion vectors.
* The decoded motion vector shall be scaled depending on the sampling structure to give a motion vector for each color component.
* Four motion vector predictors shall be maintained. All MV predictors shall be reset to zero in the following cases:
  - In a P-picture when a macroblock is skipped.
  - In a P-picture when a non-intra macroblock is decoded in which macroblock-motion-forward is zero.
  - Whenever an intra macroblock is decoded which has no concealment motion vectors.
  - At the start of each slice.
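The predictor bookkeeping described above might be sketched as follows. This is a simplified illustration: the class and function names are hypothetical, the predictors are stored as PMV[r][s] with two components each, and the f-code dependent range wrap-around of the real decoding process is deliberately omitted.

```python
class MotionVectorPredictors:
    """Holds the four motion vector predictors PMV[r][s], each (horizontal, vertical)."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.pmv = [[[0, 0] for _ in range(2)] for _ in range(2)]

    def decode(self, r, s, delta):
        """Add a decoded differential to the predictor and update the predictor."""
        vector = [self.pmv[r][s][t] + delta[t] for t in range(2)]
        self.pmv[r][s] = list(vector)
        return vector


def predictors_must_reset(picture_type, *, slice_start=False, skipped=False,
                          intra=False, has_concealment_mv=False,
                          motion_forward=True):
    """Return True when the slide's rules require all predictors to be zeroed."""
    if slice_start:
        return True
    if intra and not has_concealment_mv:
        return True
    if picture_type == "P":
        if skipped:
            return True
        if not intra and not motion_forward:
            return True
    return False


pred = MotionVectorPredictors()
print(pred.decode(0, 0, (3, -2)))   # first vector: predictor is zero
print(pred.decode(0, 0, (1, 1)))    # second vector: relative to the first
```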

65 Skipped Macroblock
* I-picture
  - no skipped macroblock
* P-picture
  - reconstructed motion vector equal to zero
  - no reconstructed DCT coefficient
* B-picture
  - same macroblock type as the prior macroblock
  - differential motion vector equal to zero
  - no DCT coefficient
* Macroblocks are skipped when macroblock-address-increment is greater than one.
* A skipped macroblock shall not follow an intra-coded macroblock.

66 Profiles and Levels
* Profiles and levels provide a means of defining subsets of the syntax and semantics, and thereby the decoder capabilities required to decode a particular bitstream.
* A profile is a defined subset of the entire bitstream syntax.
* A level is a defined set of constraints imposed on parameters in the bitstream.
* The purpose of defining conformance points in the form of profiles and levels is to facilitate bitstream interchange among different applications.
* The profile-and-level-indication in the sequence-extension indicates the profile and level with which the bitstream complies.

Parameter Constraints according to levels

Level        Max. Pixels   Max. Lines   Frames/s
Low              352           288         30
Main             720           576         30
High-1440       1440          1152         60
High            1920          1152         60
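To make the level table concrete, the sketch below looks up the lowest level whose three limits accommodate a given picture format. The dictionary and function name are hypothetical, and only the three parameters listed on the slide are checked; real conformance also involves bit rate, buffer size, and sample-rate limits.

```python
# (max pixels per line, max lines per frame, max frames per second), lowest level first
LEVEL_LIMITS = {
    "Low": (352, 288, 30),
    "Main": (720, 576, 30),
    "High-1440": (1440, 1152, 60),
    "High": (1920, 1152, 60),
}


def lowest_level(width, height, frame_rate):
    """Return the first level in the table that fits, or None if none does."""
    for name, (max_w, max_h, max_fps) in LEVEL_LIMITS.items():
        if width <= max_w and height <= max_h and frame_rate <= max_fps:
            return name
    return None


print(lowest_level(720, 480, 30))
print(lowest_level(1920, 1080, 60))
```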

67 Constrained Parameters of MPEG-1 Sequence
horizontal picture size    ≤ 768 pels
vertical picture size      ≤ 576 lines
picture area               ≤ 396 macroblocks
pel rate                   ≤ 396x25 macroblocks/sec
picture rate               ≤ 30 Hz
motion vector range        -64 to +63.5 pels
input buffer size          ≤ 327,680 bits
bitrate                    ≤ 1,856,000 bits/sec
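The limits above lend themselves to a simple checker. The following sketch is illustrative only: the function name and argument names are assumptions, the macroblock count is taken as the number of 16x16 blocks needed to cover the picture, and the motion vector range is not checked.

```python
def constrained_parameter_violations(width, height, picture_rate, bit_rate, buffer_bits):
    """Return the names of the MPEG-1 constrained-parameter limits that are exceeded."""
    macroblocks = -(-width // 16) * -(-height // 16)   # ceiling division
    checks = [
        ("horizontal picture size", width <= 768),
        ("vertical picture size", height <= 576),
        ("picture area", macroblocks <= 396),
        ("macroblock rate", macroblocks * picture_rate <= 396 * 25),
        ("picture rate", picture_rate <= 30),
        ("input buffer size", buffer_bits <= 327_680),
        ("bitrate", bit_rate <= 1_856_000),
    ]
    return [name for name, ok in checks if not ok]


# SIF at 30 Hz: 22 x 15 = 330 macroblocks, 330 x 30 = 9900 macroblocks/sec
print(constrained_parameter_violations(352, 240, 30, 1_150_000, 327_680))
# 352x288 at 30 Hz exceeds the macroblock rate
print(constrained_parameter_violations(352, 288, 30, 1_150_000, 327_680))
```

Note how the picture-area and rate limits interact: 352x240 at 30 Hz and 352x288 at 25 Hz both land exactly on 9900 macroblocks per second.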

68 Main Profile@Main Level: Parameters of MPEG-2 Sequences
- VBV buffer size not more than 1,835,008 bits (1.75 Mb)
- not more than 720 pels/line
- not more than 576 lines/frame
- not more than 30 frames/sec
- not more than 10.4 million pels/sec
- bit rate not more than 15 Mbps
- motion vector range
  - vertical motion vector range from -128 to 127.5 in frame pictures, -64 to 63.5 in field pictures
  - horizontal motion vector range from -1024 to 1023.5
- 4:2:0 chroma sampling structure

69 Differences Between MPEG-2 and MPEG-1
* IDCT mismatch
* Macroblock stuffing
* Run-level escape syntax
* Chrominance samples horizontal position
* Slices
* D-Picture
* Full-pel motion vectors
* Aspect ratio information
* Forward-f-code and backward-f-code
* Constrained-parameter-flag and maximum horizontal size
* MPEG-2 syntax vs. MPEG-1 syntax

70 H.261 vs. MPEG-1

H.261                                     MPEG-1
Sequential access                         Random access
One basic frame rate                      Flexible frame rate
QCIF and CIF images only                  Flexible image size
I and P frames only                       I, P, and B frames
MC over 1 frame                           MC over 1 or more frames
1 pixel MV accuracy                       1/2 pixel MV accuracy
1-2-1 filter in the loop                  No filter
Variable threshold + uniform quantiz.     Quantization matrix
No GOP structure                          GOP structure
GOB structure                             Slice structure

71 MPEG-2 vs. MPEG-1

Specifications      MPEG-2 MP@ML               MPEG-1
Video Format        720x480x30 (NTSC)          320x240x30 (NTSC)
                    720x576x25 (PAL)           320x288x25 (PAL)
Coded Data Speed    4-6 Mbps for CCIR601       1.8 Mbps Max
                    15 Mbps Max
Coded Picture       Frame, Picture             Frame
Prediction          Inter Frame, Field         Interframe
DCT                 Frame, Field               Frame
Resolution          12 bits                    9 bits
VLC Resol.          8, 9, 10 bits              8 bits
Quantization        Non-linear Mapping         Linear Mapping
Pan, Scan           Yes                        No

72 MPEG-2 Audio Coding
[Figure: MPEG-2 system chain. Video Encoder and Audio Encoder feed the System Encoder; the Channel Carrier links it to the System Decoder, which feeds the Video Decoder and Audio Decoder. Regions are marked "Covered by the Standard" and "Not Covered by the Standard".]
An overview of the MPEG-2 audio coding standards will be covered in the last part if time is available.

73 Digital 3D Video: MPEG-4 Video Coding Standard

74 MPEG History
MPEG-1
- Started in 1988 by Leonardo Chiariglione
- Compression standard for progressive frame-based video in SIF (360x240) formats
- Applications: VCD
MPEG-2
- Compression standard for interlaced frame-based video in CCIR-601 (720x480) and high definition (1920x1088i) formats
- Applications: DVD, SVCD, DIRECTV, GA, DVB, HDTV Studio
MPEG-4
- Multimedia standard for object-based video from natural or synthetic sources
- Applications: Internet, cable TV, virtual studio, home LAN, etc.
MPEG-7
- Multimedia content description interface
- Applications: Internet, video search engine, digital library
MPEG-21
- Intellectual property rights protection purposes

75 Why New Standards?
What existing standards can do:
- MPEG-1: Frame-based non-interlaced video (1.5 Mbps)
- MPEG-2: Frame-based interlaced video (4 Mbps - 270 Mbps)
- H.261: Low bit rate video conferencing (p x 64 kbps)
- H.263: Very low bit rate video conferencing (10 kbps)
What the existing standards cannot do:
- Coding of video objects with content information (metadata)
- Coding of both synthetic and natural sources
- Coding of images for progressive transmission
- Coding of multimedia information for various bandwidths and media
- Interactivity

MPEG-4 AV Objects
- An audiovisual scene is composed of "objects" (A&V)
- A "compositor" puts objects in the scene (A&V, 2D & 3D)
- Objects can be of different nature: natural or synthetic A&V, text & graphics, animated faces, arbitrary shapes or rectangular
- Objects are encoded independently: the coding scheme can differ for individual objects
- From low bitrates to (virtually) lossless quality

76 MPEG-4 AV Scene
[Figure: an MPEG-4 audiovisual scene. Labels: audiovisual objects (3D objects, 2D background, voice, sprite), scene coordinate system (x, y, z), hypothetical viewer, projection, video compositor plane, audio compositor, audiovisual presentation, display, speaker, user input, user events, hierarchically multiplexed downstream control / data, hierarchically multiplexed upstream control / data.]

77 MPEG-4 System
[Figure: MPEG-4 system layers. Labels: NETWORK, TransMux (ex: MPEG-2 Transport), FlexMux, DAI, Elementary Streams, Decoding, Object Descriptor, Scene Description Information, Primitive AV Objects, Composition and Rendering, Audiovisual Interactive Scene, Display and Local User Interaction.]

78 MPEG-4 Tools and Algorithms
It is envisioned that the final MPEG-4 Video standard will define "tools" and "algorithms", resulting in a toolbox of relevant video tools and algorithms available to both encoder and decoder. These tools and algorithms will be defined based on the MPEG-4 Video VM algorithm. It is likely that, similar to the approach taken by the MPEG-2 standard, profiles will be defined for tools and algorithms which include subsets of the MPEG-4 Video tools and algorithms. The MPEG-4 MSDL will provide sufficient means to flexibly glue video tools and algorithms at the encoder and decoder to suit the particular needs of diverse and specific applications.

79 MPEG-4 Tools and Algorithms
[Figure: layered toolbox. Labels: Tool 1, Tool 2, Tool 3, Tool 4, Tool 5, ..., Tool i; Algorithm 1, Algorithm 2, ..., Algorithm j; MPEG-4 Syntactic Descriptive Language; Visual Profile, Graphic/Mess Profile, Audio Profile, MSDL Profile.]

80 MPEG-4 Visual Presentation
[Figure: MPEG-4 profile overview. Labels: Visual Profiles (Natural: Simple, Core, Main, Simple Scalable; Synthetic: 2D Texture, Scalable Texture, Animation, SFA; Hybrid), Graphics/Mess Profiles (Simple, 2D, 3D), Audio Profiles (Low Rate Speech, Scalable), Scene Description Profiles (VRML, Audio, Visual).]

81 MPEG-4 Audio Coding Standards
[Figure: MPEG-4 audio coders versus bit rate. Bit-rate axis: 2k, 4k, 8k, 16k, 24k, 64k. Coders: Parametric Coder, CELP Coder, T/F Coder, Variable Bit Rate Coder.]

82 Major Natural Video Tools
[Figure: table of natural video tools. Tools: Binary shape, Padding, Motion compensation (Basic, Advanced motion compensation, Overlapped motion compensation), Quantization, AC/DC prediction, Scanning, I, P, B modes, Temporal scalability, Spatial scalability, Error resilience, Static sprites, Interlaced coding, 12-bit video, Static texture. Options shown: Method 1, Method 2, Nonlinear, Type 1, Type 2, Slice synchronization, Extended header code, Data partitioning, Reversible VLC, Basic, Low delay, Scalable.]

83 MPEG-4 Content-based VOP Coding
The coding of image sequences using MPEG-4 VOPs enables basic content-based functionalities at the decoder. Each VOP specifies particular image sequence content and is coded into a separate VOL-layer (by coding contour, motion, and texture information). Decoding of all VOL-layers reconstructs the original image sequence. Content can be reconstructed by separately decoding a single VOL-layer or a set of VOL-layers (content-based scalability/access in the compressed domain). This allows content-based manipulation at the decoder without the need for transcoding.
[Figure: scene segmentation & depth layering produces VOP 1, VOP 2, and VOP 3; each VOP is coded as contour, motion, and texture into its own bitstream layer (bitstream layer VOP 1, VOP 2, VOP 3), enabling content-based scalability and content-based bitstream access & manipulation.]

84 MPEG-4 Baseline and Extensions
[Figure: two coder structures. MPEG-4 Core Coder: Motion (MV) and Texture (DCT) into the VOP bitstream, similar to H.263/MPEG-1. Extended MPEG-4 Core Coder: Shape, Motion (MV), and Texture (DCT) into the VOP bitstream; the motion and texture parts are similar to H.263/MPEG-1.]

85 MPEG-4 Decoder
[Figure: MPEG-4 decoder block diagram. Blocks: Bitstream, Demultiplex, Shape Decoding (shape information), Motion Decoding, Motion Compensation, Texture Decoding, VOP Memory, Reconstructed VOP, Compositor (driven by a Composition Script), Video out.]

MPEG-4 Content-Based Functionalities
- Error Resilience/Robustness
- Scalable Texture Coding (Wavelets)
- Shape Coding
- Sprites
- Scalability

86 Object Based Video Coding
[Figure: base layer and enhancement layer over time for a selected object. Labels: Direct Copy, Interpolation, MC + SADCT.]

87 Texture Coding Tools (1/2)
- Macroblock entirely inside the VOP: coded by the conventional DCT scheme
- Macroblock partially outside the VOP: blocks partially outside the VOP are coded by DCT after padding
- Macroblock entirely outside the VOP: not coded
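The three cases can be told apart from the binary shape mask alone. The sketch below assumes a hypothetical mask representation (a list of rows of 0/1 values, 1 meaning "inside the VOP") and a hypothetical function name; the padding step itself is not shown.

```python
def classify_macroblock(mask, mb_row, mb_col, size=16):
    """Classify one macroblock against a binary VOP shape mask.

    Returns "inside" (conventional DCT), "boundary" (pad, then DCT),
    or "outside" (not coded).
    """
    rows = mask[mb_row * size:(mb_row + 1) * size]
    samples = [px for row in rows
               for px in row[mb_col * size:(mb_col + 1) * size]]
    if not samples:
        raise ValueError("macroblock lies outside the mask")
    if all(samples):
        return "inside"
    if any(samples):
        return "boundary"
    return "outside"


# Tiny demonstration with 2x2 "macroblocks" on a 4x4 mask.
demo_mask = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
for r in range(2):
    for c in range(2):
        print((r, c), classify_macroblock(demo_mask, r, c, size=2))
```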

88 Texture Coding Tools (2/2)
[Figure: texture decoding pipeline. Blocks: Variable Length Decoding, Inverse Scan, Inverse AC/DC Prediction, Inverse Quantization, Inverse DCT, Motion Compensation, VOP Memory.]

89 National Cheng Kung University, Tainan, Taiwan Department of Electrical Engineering, Institute of Computer and Communication Engineering 11-88 BCD YXA or Macroblock Adaptive DC Prediction

90 National Cheng Kung University, Tainan, Taiwan Department of Electrical Engineering, Institute of Computer and Communication Engineering 11-89 B A CD XY Macroblock Adaptive AC Prediction

91 MPEG-4 DPCM/Transform Structure
[Figure: hybrid DPCM/transform coder applied per VOP (VOP 1, VOP 2, VOP 3). Blocks: Segmentation tool, Contour Approximation, DCT, Q, VLC Coding, Q^-1, IDCT, MC-PRED, Frame Buffer.]

92 More Features of MPEG-4
Error Resiliency:
- Access on mobile networks (efficient coding and low bitrates still important)
Scalable Coding:
- Scalability based on (audio/visual) objects
- Different quality, priority, and error protection possible for different objects
Intellectual Property Rights (IPR):
- Identification (V.1) and protection (V.2)

93 National Cheng Kung University, Tainan, Taiwan Department of Electrical Engineering, Institute of Computer and Communication Engineering 11-92 Profiles Version 1 Simple Compression Error Resilience Core Main Baseline Functionalities Extended Functionalities

94 National Cheng Kung University, Tainan, Taiwan Department of Electrical Engineering, Institute of Computer and Communication Engineering 11-93 Profile Version 2 Advanced Real Time Simple Compression Error Resilience Core Main Advanced Coding Efficiency Baseline Functionalities Extended Functionalities Skip all the slides after here

95 Shape Coding Tool (1/3)
Coding modes
- Opaque
- Transparent
- No-update
- Intra Context-based Arithmetic Encoding
- Inter Context-based Arithmetic Encoding
Lossless
Lossy
- Motion compensation without update
- Sub-sampling by a factor of 2 or 4

96 National Cheng Kung University, Tainan, Taiwan Department of Electrical Engineering, Institute of Computer and Communication Engineering 11-95 x Binary arbitrary Shape Coding Tool (2/3)

97 Shape Coding Tools - CAE (3/3)
Context-based Arithmetic Encoding
[Figure: context templates around the pixel being coded ("?"). INTRA: ten neighbouring pixels c0 to c9 in the current BAB. INTER: nine pixels c0 to c8, taken from the current BAB and the motion compensated BAB.]

98 Error Resilience Tools (1/3)
Error resilience tools:
- Resynchronization markers
- Extended header code
- Data partitioning
- Reversible VLCs

Wireless networks that could be used to transmit MPEG-4 encoded material:

Network/Protocol                                     Bit rate
Existing mobile: GSM, PDC, IS-56, IS-95              around 24-32 kbit/s
Existing personal/cordless: DECT, PCS, PHS           around 32-64 kbit/s
Future mobile/personal: IMT-2000 (UMTS)              up to 2 Mbit/s

99 Error Resilience Tools (2/3)
[Figure: video packet layout: Resync Marker, MB Address, Quant. Param., Header Extension, Temporal Reference, Shape Data, Motion Data, Motion Marker, Texture Data. A second panel shows texture data coded with a reversible VLC: when an error burst occurs, forward decoding proceeds up to the burst and backward decoding recovers data after it.]

100 Error Resilience Tools
[Figure: placement of resynchronization markers after the Picture Start Code in an H.263 bitstream (H.263 Resync Marker) compared with an MPEG-4 bitstream (MPEG-4 Resync Marker).]

101 Scalable Coding
[Figure: two-layer scalable codec. Encoder side: Scalability Preprocessor, MPEG-4 Base Layer Encoder, Midprocessor, MPEG-4 Enhancement Layer Encoder, mux. Decoder side: demux, MPEG-4 Base Layer Decoder, Midprocessor, MPEG-4 Enhancement Layer Decoder, Scalability Postprocessor.]

102 Fine Granularity Scalability Tool
Addresses the growing need for a video coding method that provides fine granularity scalability (FGS) in video quality while the temporal and spatial resolutions may stay unchanged.
[Figure: FGS decoder. Base layer: Base Layer Bitstream, VLD, Q^-1, IDCT, adder with Motion Compensation and Frame Memory, Clipping, Base Layer Video. FGS enhancement decoding: Enhancement Bitstream, Bit-plane VLD, Bit-plane Shift, IDCT, adder with the base layer, Clipping, Enhanced Video.]
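The idea behind the bit-plane stages in the FGS enhancement decoder can be illustrated with a toy example: residual magnitudes are split into bit-planes, most significant first, and the decoder rebuilds an approximation from however many planes it received. This is a conceptual sketch with hypothetical function names; it leaves out the entropy coding (bit-plane VLC) and operates on plain integers rather than DCT blocks.

```python
def to_bit_planes(residuals):
    """Split residuals into sign flags and bit-planes (most significant plane first)."""
    magnitudes = [abs(r) for r in residuals]
    n_planes = max(magnitudes, default=0).bit_length()
    planes = [[(m >> b) & 1 for m in magnitudes]
              for b in range(n_planes - 1, -1, -1)]
    signs = [-1 if r < 0 else 1 for r in residuals]
    return planes, signs


def from_bit_planes(planes, signs, n_received):
    """Rebuild residuals from the first n_received bit-planes only."""
    n_planes = len(planes)
    magnitudes = [0] * len(signs)
    for i, plane in enumerate(planes[:n_received]):
        weight = 1 << (n_planes - 1 - i)
        for j, bit in enumerate(plane):
            magnitudes[j] += bit * weight
    return [s * m for s, m in zip(signs, magnitudes)]


planes, signs = to_bit_planes([5, -3, 0, 6])
for n in range(len(planes) + 1):
    print(n, from_bit_planes(planes, signs, n))
```

Each additional plane halves the worst-case error, which is why the enhancement bitstream can be truncated at almost any point and still improve on the base layer.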

103 FGS: Possible Applications
IVS   Internet Video Streaming
VA    Video Archive
VCD   Video Content Distribution
IMM   Internet Multimedia
IVG   Interactive Video Games
IPC   Interpersonal Communications (videoconferencing, videophone, etc.)
ISM   Interactive Storage Media (optical disks, etc.)
MMM   Multimedia Mailing
NDB   Networked Database Services (via ATM, etc.)
WMM   Wireless Multimedia

104 National Cheng Kung University, Tainan, Taiwan Department of Electrical Engineering, Institute of Computer and Communication Engineering 11-103 Static Sprite Coding Tools + Sprite Foreground Object Decoded Frame

105 Warping of Reference Sprite
Only 2-8 global motion parameters are transmitted per frame.
[Figure: the sprite image with its sprite points is warped onto the actual frame (VOP) with its reference points.]

106 National Cheng Kung University, Tainan, Taiwan Department of Electrical Engineering, Institute of Computer and Communication Engineering 11-105 MPEG-4 Tools - Summary I-VOP P and B Predictive Interface Predictive Binary Shape Coding Gray Scale Shape Scalability - Temporal (rectang.) - Spatial (rectang.) - Temporal (object) Error Resilience - rectangular/object 12 bit Video Texture Coding (Wavelets) - Rectang./object - Coding Efficiency - SNR Scalability - Spatial Scalability Static Sprites - Basic Sprites - Low Latency Sprites Computational Graceful Degradation

107 Digital 3D Video: Advanced Media Standard: MPEG-7

108 Image and Video Standards
[Figure: overview chart of image and video standards. Labels: Still Image Standards (G.3, G.4, JPEG, JPEG2000, JPEG-XR); ISO/IEC (MPEG): MPEG-1, MPEG-2, MPEG-4, MPEG-7, MPEG-21; ITU-T (VCEG): H.261, H.262, H.263*, H.264, H.265; JVT: H.264/MPEG-4 (H.264/AVC, H.264/SVC, H.264/MVC, High Profile); MPEG HVC, HEVC/JCT, 3DV/FTV; Others: VC-2; Microsoft: VC-1; China: AVS, 3D-AVS.]

109 MPEG-7 Overview
Standardized descriptions of multimedia information, formally called "Multimedia Content Description Interface"
- Competitive tests: 02/1999
- Final Committee Draft: 03/2001
The MPEG-7 standard will produce the following documents:
- ISO/IEC 15938-1: MPEG-7 Systems
- ISO/IEC 15938-2: MPEG-7 Description Definition Language
- ISO/IEC 15938-3: MPEG-7 Visual
- ISO/IEC 15938-4: MPEG-7 Audio
- ISO/IEC 15938-5: MPEG-7 Multimedia Description Schemes
- ISO/IEC 15938-6: MPEG-7 Reference Software
- ISO/IEC 15938-7: MPEG-7 Conformance

110 MPEG-7 Scope
[Figure: processing chain Feature Extraction -> MPEG-7 Description (standardization) -> Search Engine; only the middle stage is in the MPEG-7 scope.]
Feature Extraction: content analysis, feature extraction, annotation tools, authoring
MPEG-7 Scope: Description Schemes (DSs), Descriptors (Ds), Description Definition Language (DDL), XML
Search Engine: search and filter, classification, manipulation, summarization, indexing

111 MPEG-7: Context and Objectives
Content, content, and more content ...
- Increasing availability of multimedia content
- More and more situations where it is necessary to have information about the content
- Difficult to manage information
- Difficult to find, select, and filter what is needed
Objective
- Standardize content-based descriptions for various types of audio-visual information, allowing quick and efficient content identification and addressing a large range of applications
- MPEG-1, -2 and -4 represent the content itself ("the bits")
- MPEG-7 should represent information about the content ("the bits about the bits")

112 Data Targets & Types of Descriptions
Most types of audio-visual information are considered (targets):
- Audio and speech
- Moving video, still pictures, graphics, 3D models
- Object relations in a scene, etc.
- Descriptions are independent of the data format
Descriptions can be classified into two broad categories:
- Information that is present in the content
  - low-level features that are automatically extracted, e.g., (video) color, texture, motion; (audio) pitch, tempo, volume
  - high-level (semantic) features related to the human interpretation of the content, e.g., "car driving fast" or "tiger attacks deer"
- Information that cannot be deduced from the content, e.g., date and time, author, copyright data, genre, parental rating, links to other related material

113 Why do we need a standard?
Interoperable Search & Retrieval:
- Search: color, texture, shape, motion, spatial
- Browse: video parsing, shot detection, key-frames
- Filter: object detection, classification
[Figure: Search Engine 1, 2, ..., N and Content Mgr 1, 2, ..., M interconnected through the WWW.]

114 MPEG-7 Elements (1)
[Figure: MPEG-7 main elements. Descriptors D1 to D10 (syntax & semantics of feature representation); Description Schemes DS1 to DS4 structuring the descriptors; the Description Definition Language, which supports definition and extension; instantiation of a description as tags (scene id, time, camera, annotation); encoding & delivery of the description as a binary stream.]

115 MPEG-7 Elements (2)
Description Tools
- Descriptors (Ds): representations of features
  - To describe various types of features of multimedia information
  - To define the syntax/semantics of each feature representation
  - Color example: specify a histogram in RGB color space
- Description Schemes (DSs)
  - Specify the structure and semantics of the relationships between their components, which may be either Ds or other DSs
[Figure: example hierarchy with Scene DS, Shot DS, Annotation DS, Time D, ColorHist D, AudioPitch D.]
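The "histogram in RGB color space" example can be made concrete with a toy descriptor. The sketch below is not the normative MPEG-7 color descriptor: the function name, the uniform quantization, and the bin ordering are all assumptions chosen for illustration.

```python
def rgb_histogram(pixels, bins_per_channel=2):
    """Count 8-bit RGB pixels in a uniformly quantized RGB color space.

    Returns a flat list of bins_per_channel**3 counts, indexed as
    (r_bin * bins + g_bin) * bins + b_bin.
    """
    bins = bins_per_channel
    histogram = [0] * (bins ** 3)
    for r, g, b in pixels:
        r_bin = r * bins // 256
        g_bin = g * bins // 256
        b_bin = b * bins // 256
        histogram[(r_bin * bins + g_bin) * bins + b_bin] += 1
    return histogram


image = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (250, 10, 10)]
print(rgb_histogram(image))
```

Two images can then be compared by a distance between their histograms, which is the kind of similarity search a standardized descriptor is meant to make interoperable.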

116 MPEG-7 Elements (3)
Description Definition Language (DDL)
- The language to specify DSs and possibly Ds
- Will allow the creation of new DSs (and possibly Ds) and the extension and modification of existing DSs
System tools
- To support multiplexing of descriptions, synchronization issues, transmission mechanisms, file format, etc.
[Figure: the DDL defines DSs and Ds; some DSs and Ds are in the standard, others are not in the standard but are defined by the DDL.]

117 MPEG-7 Application Types
* Pull Applications (Search and Browsing)
  - Internet search engines and databases
  - Advantages: queries based on standardized descriptions
* Push Applications (Filtering)
  - Broadcast video and interactive television
  - Advantages: intelligent software agents filter content/channels based on standardized descriptions
* Universal Multimedia Access (Perceptual QoS)
* Specialized Professional and Control Applications

118 MPEG-7 Visual Descriptors
* Color
  - color space, color quantization, dominant colors, scalable color
  - color layout, color-structure, GoF/GoP color
* Texture
  - homogeneous texture, edge histogram, texture browsing
* Shape
  - region-based, contour-based, shape 3D
* Motion
  - camera motion, motion activity, motion trajectory, parametric motion
* Localization
  - region locator, spatio-temporal locator
* Face
  - face recognition

119 MPEG-7 Descriptors (Shape Example)
* Region-based shape
  - An object may consist of either a single region or a set of regions with some holes; region-based shape is used to describe these kinds of shapes
  - Robust to minor deformation along the boundary (g and h are the same, but i is different)
  - Small size, fast extraction and matching

120 Example: MPEG-7 Queries
* Images:
  - Draw color and texture regions and search for similar regions
* Graphics:
  - Sketch graphics and search for images with similar graphics
* Object Motion:
  - Describe the motion of objects and search for scenes with similarly moving objects
* Video:
  - Describe actions and retrieve videos with similar actions
* Voice:
  - Use an excerpt of a song to retrieve similar songs/video clips
* Music:
  - Play a few notes and search for musical pieces containing them

121 MPEG-7 Content Description

Data          Structure            Features          Models           Semantics
Images        Regions              Color, Texture    Clusters         Objects
Video         Segments             Shape             Classes          Events
Audio         Grids                Motion            Collections      Actions
Multimedia    Mosaics              Speech            Probabilities    People
Formats       Relationships        Timbre            Confidences      Labels
Layout        (Spatio-temporal)    Melody                             Relationships

[Figure labels: Data, Signal Structure, Feature, Model, Semantic.]

122 Examples
Descriptor (D): association of a standard representative value with a feature.
  Color { Histogram=Value1 MeanR=Value2 MeanG=Value3 MeanB=Value4 }
Description Scheme (DS):
  Actor { Name=John Brown Face={Eigenface Parameters} Voice={Audio Parameters} Location={Bounding Box} }

123 Content Management System
[Figure: MPEG-7 content management system. Real-time multimedia data and a Multimedia Content Database feed Scene Analysis & Feature Extraction, then Description Generation & DDL Representation, producing the MPEG-7 Content Description (Encode / Decode). A Search Engine with a Front-End User Interface takes the User Query / Preferences, translates it to the desired description, and runs a Similarity-based Search/Filtering Engine (in the compressed/uncompressed domain); Query Results and Matching Content are returned with hyperlinks to content and to metadata.]
Normative parts:
- DS: Description Scheme (D relationships & organization)
- D: Descriptor (e.g. color, shape, texture feature vectors)
- DDL: Description Definition Language

124 Digital 3D Video: MPEG-21 Digital Rights Management

125 MPEG-21 Objectives
* Vision
  - To define a multimedia framework to enable transparent use of multimedia resources across a wide range of networks and devices used by different communities
* Purpose
  - Enable electronic creation, delivery, and trade of digital multimedia content
* Goals
  - Provide access to information and services from almost anywhere at any time with ubiquitous terminals and networks
  - Identify, describe, manage, and protect the content in order to support the multimedia delivery chain, which covers content creation, production, delivery, and consumption

126 MPEG-21
* What is MPEG-21?
  - ISO/IEC 21000: MPEG-21 Multimedia Framework
  - MPEG-21 will create an open framework for multimedia delivery and consumption, with both the content creator and the content consumer as focal points
* Why is MPEG-21 needed?
  - Many elements (standards) exist for delivery and consumption of multimedia content, but there is no "big picture" to describe how these elements relate to each other
  - MPEG-21 will fill the gaps and allow existing components to be used together, thereby increasing interoperability
* Why now?
  - Hardware building blocks and infrastructure are in place
  - Multimedia compression, transmission, and description standards are ready

127 MPEG-21 Parts
Part 1 - Vision, Technologies and Strategy
Part 2 - Digital Item Declaration (DID) (02/12)
Part 3 - Digital Item Identification (DII) (02/12)
Part 4 - Intellectual Property Management and Protection (IPMP) (04/09)
Part 5 - Rights Expression Language (REL) (03/09)
Part 6 - Rights Data Dictionary (RDD) (03/09)
Part 7 - Digital Item Adaptation (DIA) (03/09)
Part 8 - Reference Software (04/07)
Part 9 - File Format (03/09)

128 Multimedia Framework
1. Digital Item Declaration
2. Content Representation
3. Digital Item Identification and Description
4. Content Management and Usage
5. Intellectual Property Management and Protection
6. Terminals and Networks
7. Event Reporting

129 Fundamental Concept: Digital Item
- A structured digital object with a standard representation, identification, and metadata
- The fundamental unit of distribution and transaction in the MPEG-21 framework
- Digital Item = (resource + metadata + structure)
  - Resource: individual asset, e.g., MPEG-2 video
  - Metadata: descriptive information, e.g., MPEG-7
  - Structure: relationships among parts of the item
Technologies: to create, manage, transport, distribute, and consume Digital Items (DI).
Concepts: the fundamental unit of distribution and transaction (the DI), and Users' interaction with DIs.
(MPEG-21 Overview, MPEG doc. N5231)
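The decomposition Digital Item = (resource + metadata + structure) can be sketched as a small data structure. This is only an illustration of the idea, not the MPEG-21 data model: the class names, the field names, and the `hasPoster` relation are hypothetical and do not come from the standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Resource:
    """An individual asset, e.g. an MPEG-2 video (hypothetical fields)."""
    uri: str
    mime_type: str


@dataclass
class DigitalItem:
    """Illustrative Digital Item: resources + metadata + structure."""
    resources: List[Resource] = field(default_factory=list)
    # Descriptive information; in practice this could be an MPEG-7 description.
    metadata: Dict[str, str] = field(default_factory=dict)
    # Relationships among parts, stored as (source, relation, target) triples.
    structure: List[Tuple[str, str, str]] = field(default_factory=list)

    def related(self, source_uri: str, relation: str) -> List[str]:
        """Return the targets linked to source_uri by the given relation."""
        return [dst for src, rel, dst in self.structure
                if src == source_uri and rel == relation]


item = DigitalItem(
    resources=[Resource("movie.mpg", "video/mpeg"),
               Resource("poster.jpg", "image/jpeg")],
    metadata={"title": "Sample movie"},
    structure=[("movie.mpg", "hasPoster", "poster.jpg")],
)
print(item.related("movie.mpg", "hasPoster"))  # ['poster.jpg']
```

Keeping the structure as explicit triples makes the relationships between parts queryable without touching the resources themselves.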

130 Digital Item Model
[Figure labels: Protection Layer; Metadata (MPEG-7); Essence (MPEG-1/MPEG-2); DOI (Digital Object Identifier)]

131 Fundamental Concept: Digital Item
[Figure labels: Resources (MPEG-1, MPEG-2, MPEG-4); Metadata (MPEG-7); New Metadata & Resource Forms; Structure; MPEG-21 Digital Item]

132 Example Use Case Scenarios
Managing Complex Items
- Scenario: create a media collection and send it to a friend
- Current solution: email a large zip file
- A Digital Item can aggregate multiple resources into a logical unit or package with a rich set of descriptive information
Authoring
- Scenario: create a digital magazine with locale-specific content that will play on various devices
- Current solution: author and manage multiple versions for each device, locale, and user preference
- A Digital Item allows content to be self-describing and configurable

133 Digital Item
- A structured digital object with a standard representation, identification, and description (metadata)
- Unit of distribution and transaction in the MPEG-21 multimedia framework
- DI = (resource + metadata + structure)
  - resource – individual asset (MPEG-2 video)
  - metadata – description (MPEG-7)
  - structure – relations among parts of the item

134 Digital Item Identification
- Uniquely identifies Digital Items and parts or collections thereof
- Identification: a tag (number, …), e.g., ISBN
- Syntax: Uniform Resource Identifier (URI)
  - urn:mpeg:mpeg21:diid:sss:nnn
  - sss = name of the identification system
  - nnn = unique identifier within that identification system
  - Ex: ISAN:3943… = Int’l Standard Audiovisual Number:3943…
Digital Object Identifier (DOI)
- Standardized identifier for intellectual property
- Persistent identification (unlike a URL)
- Handle System for resolution of DOIs
- Registration Authority
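The identifier syntax urn:mpeg:mpeg21:diid:sss:nnn can be illustrated with a small parser. This is a minimal sketch based only on the pattern shown on the slide; the function name `parse_diid` and the example value `example:0001` are hypothetical, and no check is made that `sss` names a registered identification system.

```python
from typing import Tuple

DIID_PREFIX = "urn:mpeg:mpeg21:diid:"


def parse_diid(urn: str) -> Tuple[str, str]:
    """Split a DII URN into (identification system, identifier).

    Only the first colon after the prefix is treated as a separator,
    so the identifier part may itself contain colons.
    """
    if not urn.lower().startswith(DIID_PREFIX):
        raise ValueError("not a urn:mpeg:mpeg21:diid URN: %r" % urn)
    rest = urn[len(DIID_PREFIX):]
    system, sep, identifier = rest.partition(":")
    if not sep or not system or not identifier:
        raise ValueError("expected sss:nnn after the prefix: %r" % urn)
    return system, identifier


# Hypothetical identifier, used purely for illustration.
system, identifier = parse_diid("urn:mpeg:mpeg21:diid:example:0001")
print(system, identifier)
```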

135 Digital Item Declaration
[Figure labels: Container; Item; Component; Descriptor; Resource]
DID model (XML-based):
- Container – logical package
- Item – grouping of sub-items and/or components bound to relevant descriptors
- Component – binding of a resource to descriptors
Example: CD package with music + video + graphics
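The Container / Item / Component nesting of the DID model can be sketched by building the slide's CD-package example as XML with Python's standard library. This is an illustration of the hierarchy only: it omits the DIDL namespace and root element, places descriptor text directly inside `Descriptor`, and is not validated against the MPEG-21 schema. The `ref` and `mimeType` attributes, the file names, and the helper `build_did` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def build_did(title, tracks):
    """Build a simplified Container > Item > Component tree.

    tracks is a list of (description, resource reference, MIME type).
    """
    container = ET.Element("Container")          # logical package
    item = ET.SubElement(container, "Item")      # groups the components
    ET.SubElement(item, "Descriptor").text = title
    for description, ref, mime in tracks:
        # A Component binds one Resource to its Descriptor.
        component = ET.SubElement(item, "Component")
        ET.SubElement(component, "Descriptor").text = description
        ET.SubElement(component, "Resource", {"ref": ref, "mimeType": mime})
    return container


did = build_did("CD package", [
    ("music", "track01.mp3", "audio/mpeg"),
    ("video", "clip.mpg", "video/mpeg"),
    ("graphics", "cover.jpg", "image/jpeg"),
])
print(ET.tostring(did, encoding="unicode"))
```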

136 User Model
One user interacts with another user.
[Figure labels: User A; User B; Transaction/Use/Relationship; Digital Item; Authorization/Value Exchange; Event Reporting]
Framework elements and examples:
- Digital Item Declaration: Container, Item, Resource
- Digital Item Identification & Description: Unique Identifiers, Content Descriptors
- Content Management and Usage: Storage Management, Content Personalization
- Intellectual Property Management & Protection: Encryption, Authentication, Watermarking
- Terminals & Networks: Resource Abstraction, Resource Mgt. (QoS)
- Content Representation: Natural and Synthetic, Scalable

