A Brief Introduction to Image/Video Coding Standards and FGS for MPEG-4. 卓傳育


Video Compression Standards
- ITU-T: International Telecommunication Union, Telecommunication Standardization Sector
- MPEG: Moving Picture Experts Group

ITU-T: International Telecommunication Union, Telecommunication Standardization Sector
- CCITT H.261: ITU-T Study Group 15; videophone and video conferencing; 1988-1990; p x 64 kbps (p = 1 … 30)
- ITU-T H.263: PSTN and mobile networks, 10 to 24 kbps; 1994: H.263, H.263+ …
- ITU-T H.26L: merging into the JVT effort (MPEG-4 Part 10)

MPEG: Moving Picture Experts Group (coding of moving video and audio)
- MPEG-1: CD-I, for digital storage, … (1992)
- MPEG-2: … + TV, HDTV, for broadcast (1994)
- MPEG-3: HDTV, merged into MPEG-2
- MPEG-4: coding of audiovisual objects; V.1: 1998; V.2: 1999; extensions ongoing
- MPEG-7: Multimedia Description Interface (fall 2001), 'describing' audiovisual material
- MPEG-21: Digital Multimedia Framework (first parts early 2002), 'The Big Picture and The Glue'

Block-Based Coding: why divide the image into blocks? (Image -> Blocks)

H.261 Video Formats
Video format   Luminance (Y)               Chrominance (Cb, Cr)
               pixels/line   lines/frame   pixels/line   lines/frame
CIF            352           288           176           144
QCIF           176           144           88            72
Figure legend: Y pixel; Cb, Cr pixel; block boundary

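Arrangement of GOBs in H.261
- CIF (352 x 288): 12 GOBs, numbered 1 to 12 and laid out in 6 rows of 2 (1 2 / 3 4 / 5 6 / 7 8 / 9 10 / 11 12); each GOB covers 176 x 48 pixels
- QCIF (176 x 144): 3 GOBs, numbered 1, 3, 5; each GOB covers 176 x 48 pixels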

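Arrangement of the data structure in H.261
- QCIF picture (176 x 144): 3 GOBs, numbered 1, 3, 5
- GOB (Group of Blocks, 176 x 48): 33 macroblocks, numbered 1 to 33 in 3 rows of 11
- MB (Macroblock, 16 x 16 luminance pixels): four 8 x 8 luminance blocks Y1, Y2, Y3, Y4 plus one 8 x 8 U block and one 8 x 8 V block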

Transform coding
- Encoder: Image block -> T (transform) -> Transform coefficients -> Q (quantization) -> Zigzag scan (2D->1D) -> Entropy coding -> Bitstream
- Decoder: Bitstream -> Entropy decoding -> Inverse zigzag scan (1D->2D) -> Q^-1 -> Reconstructed transform coefficients -> T^-1 -> Reconstructed image block

Transform: the same vector expressed in two bases
- Basis 1: (1,0), (0,1). Basis 2: (1,1), (-1,1)
- (0.2,1.8) = 0.2(1,0) + 1.8(0,1) = 1(1,1) + 0.8(-1,1)

Basis of Transform
- Basis vectors: {v_1, v_2, …, v_n}
- Orthogonal: v_i · v_j = 0 if i != j
- Normalized: v_i · v_i = 1
- Orthonormal: orthogonal and normalized
- E.g. orthonormal: {(0,1), (1,0)}; orthogonal (but not normalized): {(1,1), (-1,1)}
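The change of basis on the preceding Transform slide can be checked numerically. The sketch below is only an illustration, with helper names (`dot`, `coefficients`) chosen here rather than taken from the slides. For an orthogonal basis, each coefficient is the projection (x · v_i) / (v_i · v_i); for an orthonormal basis the denominator is 1.

```python
from fractions import Fraction


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def coefficients(x, basis):
    """Coordinates of x in an orthogonal (not necessarily normalized) basis."""
    return [dot(x, b) / Fraction(dot(b, b)) for b in basis]


# The vector (0.2, 1.8) from the slide, kept exact with fractions.
x = (Fraction(1, 5), Fraction(9, 5))

standard = coefficients(x, [(1, 0), (0, 1)])   # orthonormal basis
rotated = coefficients(x, [(1, 1), (-1, 1)])   # orthogonal basis

print(standard)  # coefficients 1/5 and 9/5
print(rotated)   # coefficients 1 and 4/5
```

Exact fractions are used only so that the result can be compared with the slide without floating-point noise.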

Why the DCT is used for image compression
- KLT (Karhunen-Loeve transform):
  - Statistically optimal transform: minimal MSE for any specific bandwidth reduction
  - Depends on the signal statistics
  - No fast algorithm
- The DCT approaches the KLT for highly correlated signals:
  - Sample values typically vary slowly from point to point across an image, so image signals are highly correlated
  - Has a fast algorithm (but is not optimal)

DCT-basis

DCT: Discrete Cosine Transform (spatial domain -> frequency domain)
[8,8,8,8,8,8,8,8] -> [44,0,0,0,0,0,0,0]
[8,8,8,8,8,8,8,9] -> [44,-2,0,-2,0,-2,0,-2]
[8,8,10,9,7,8,8,9] -> [46,-2,-2,-4,-2,2,0,-2]
[8,90,-100,3,4,-10,2,80] -> [48,-56,146,6,74,-148,-158,-136]
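A minimal sketch of a one-dimensional DCT, assuming the orthonormal DCT-II definition. The slide does not state which scaling or integer approximation produced its numbers, so absolute values are not expected to match. What the sketch does reproduce is the qualitative behaviour: a flat input puts all of its energy in the DC term. The function names are illustrative.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of the sequence x."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        out.append(scale * total)
    return out


def idct_1d(c):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(total)
    return out


flat = dct_1d([8] * 8)                      # only the DC term is non-zero
busy = dct_1d([8, 90, -100, 3, 4, -10, 2, 80])  # energy spread over many terms
restored = idct_1d(busy)                    # recovers the input up to rounding
```

Because the transform is orthonormal, the energy (sum of squares) of `busy` equals that of its input, which is why quantizing small coefficients costs little.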

Example of DCT
Image block (8 x 8):
52 55 61  66  70  61 64 73
63 59 66  90 109  85 69 72
62 59 68 113 144 104 66 73
63 58 71 122 154 106 70 69
67 61 68 104 126  88 68 70
79 65 60  70  77  68 58 75
85 71 64  59  55  61 65 83
87 79 69  68  65  76 78 94
DCT coefficients:
-415 -29 -62  25  55 -20 -1  3
   7 -21 -62   9  11  -7 -6  6
 -46   8  77 -25 -30  10  7 -5
 -50  13  35 -15  -9   6  0  3
  11  -8 -13  -2  -1   1 -4  1
 -10   1   3  -3  -1   0  2 -1
  -4  -1   2  -1   2  -3  1 -2
  -1  -1  -1  -2  -1  -1  0 -1

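Quantization
- Purpose: raise the compression ratio
- Drawback: the reconstructed values contain errors
- Principle: keep the reconstructed values as close to the original values as possible
Original value:                       0 1 2 3 4 5 6 7 8 9 10 11
After Q (integer division by 3):      0 0 0 1 1 1 2 2 2 3 3 3
Multiplied back by 3 (ordinary IQ):   0 0 0 3 3 3 6 6 6 9 9 9
With a better IQ:                     1 1 1 4 4 4 7 7 7 10 10 10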
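The table above can be reproduced in a few lines. This is a sketch for non-negative integers only. The slide does not define the "better IQ"; reconstructing at the midpoint of each quantization bin is the interpretation assumed here, and it matches the slide's numbers. The function names are illustrative.

```python
STEP = 3  # quantization step used on the slide


def quantize(x, step=STEP):
    """Q: integer division by the step size."""
    return x // step


def dequantize_plain(q, step=STEP):
    """Ordinary IQ: multiply back by the step size."""
    return q * step


def dequantize_midpoint(q, step=STEP):
    """Better IQ (assumed): reconstruct at the middle of the bin."""
    return q * step + step // 2


originals = list(range(12))
levels = [quantize(x) for x in originals]
plain = [dequantize_plain(q) for q in levels]
better = [dequantize_midpoint(q) for q in levels]

# Worst-case reconstruction error for each inverse quantizer.
max_err_plain = max(abs(a - b) for a, b in zip(originals, plain))
max_err_better = max(abs(a - b) for a, b in zip(originals, better))
```

With a step of 3, the ordinary IQ can be off by up to 2, while the midpoint reconstruction is never off by more than 1.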

Example of JPEG Coding (Encoder): quantization
DCT coefficients:
-415 -29 -62  25  55 -20 -1  3
   7 -21 -62   9  11  -7 -6  6
 -46   8  77 -25 -30  10  7 -5
 -50  13  35 -15  -9   6  0  3
  11  -8 -13  -2  -1   1 -4  1
 -10   1   3  -3  -1   0  2 -1
  -4  -1   2  -1   2  -3  1 -2
  -1  -1  -1  -2  -1  -1  0 -1
Quantization table:
16 11 10 16  24  40  51  61
12 12 14 19  26  58  60  55
14 13 16 24  40  57  69  56
14 17 22 29  51  87  80  62
18 22 37 56  68 109 103  77
24 35 55 64  81 104 113  92
49 64 78 87 103 121 120 101
72 92 95 98 112 100 103  99
Each coefficient is divided by the table entry at the same position and rounded, e.g. -415/16 ≈ -26.

Example of JPEG Coding (Encoder): quantized coefficients
Dividing each DCT coefficient by the corresponding quantization-table entry and rounding gives:
-26 -3 -6  2  2  0  0  0
  1 -2 -4  0  0  0  0  0
 -3  1  5 -1 -1  0  0  0
 -4  1  2 -1  0  0  0  0
  1  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0
(The slide shows six rows; the last two rows of the 8 x 8 block also quantize to zero.)
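The quantization step of the JPEG example can be checked directly from the DCT coefficients and the quantization table given in the example. The sketch below assumes rounding to the nearest integer. Python's built-in `round` resolves exact halves to the even neighbour, so the one tie in this block (-20/40 = -0.5) becomes 0, which agrees with the slide. The names are illustrative.

```python
COEFFS = [
    [-415, -29, -62, 25, 55, -20, -1, 3],
    [7, -21, -62, 9, 11, -7, -6, 6],
    [-46, 8, 77, -25, -30, 10, 7, -5],
    [-50, 13, 35, -15, -9, 6, 0, 3],
    [11, -8, -13, -2, -1, 1, -4, 1],
    [-10, 1, 3, -3, -1, 0, 2, -1],
    [-4, -1, 2, -1, 2, -3, 1, -2],
    [-1, -1, -1, -2, -1, -1, 0, -1],
]

QTABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize_block(coeffs, table):
    """Divide element-wise by the table and round to the nearest integer."""
    return [
        [round(c / q) for c, q in zip(coeff_row, table_row)]
        for coeff_row, table_row in zip(coeffs, table)
    ]


quantized = quantize_block(COEFFS, QTABLE)
for row in quantized:
    print(row)
```

Most high-frequency entries become zero because the table's step sizes grow toward the bottom-right corner.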

Zigzag Scan (2D->1D). Figure labels: DC term, AC terms

Example of JPEG Coding (Encoder): zigzag scan, 2D->1D
Quantized coefficients (2D):
-26 -3 -6  2  2  0  0  0
  1 -2 -4  0  0  0  0  0
 -3  1  5 -1 -1  0  0  0
 -4  1  2 -1  0  0  0  0
  1  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0
(remaining rows are zero)
After the zigzag scan (1D):
-26 -3 1 -3 -2 -6 2 -4 1 -4 1 1 5 0 2 0 0 -1 2 0 0 0 0 0 -1 -1 EOB
Encoder pipeline: Transform coding (DCT) -> Quantization -> Zigzag scan -> Entropy coding (bitstream)
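The zigzag scan in this example can be sketched as a sort over anti-diagonals: positions are ordered by row + column, and the traversal direction alternates from one diagonal to the next. The code is an illustration with names chosen here. Trailing zeros are dropped and replaced by an "EOB" marker, as on the slide.

```python
QUANTIZED = [
    [-26, -3, -6, 2, 2, 0, 0, 0],
    [1, -2, -4, 0, 0, 0, 0, 0],
    [-3, 1, 5, -1, -1, 0, 0, 0],
    [-4, 1, 2, -1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]


def zigzag_order(n):
    """(row, col) positions of an n x n block in zigzag order."""
    positions = [(r, c) for r in range(n) for c in range(n)]
    # Odd diagonals run top-right to bottom-left (row increasing),
    # even diagonals run bottom-left to top-right (row decreasing).
    return sorted(
        positions,
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )


def zigzag_scan(block):
    """Flatten a square block in zigzag order, ending with 'EOB'."""
    values = [block[r][c] for r, c in zigzag_order(len(block))]
    while values and values[-1] == 0:
        values.pop()  # trailing zeros are implied by the EOB marker
    return values + ["EOB"]


scanned = zigzag_scan(QUANTIZED)
print(scanned)
```

The scan groups the low-frequency coefficients at the front, so the long run of zeros at the end collapses into a single EOB symbol.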

Entropy Coding (Variable-Length Coding)
- Huffman coding
- Run-length coding
- Arithmetic coding

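Huffman Coding
Idea: give the most frequently occurring word the shortest code.
Example:
Word                    'A'   'B'   'C'    'D'
Probability             3/4   1/6   1/24   1/24
Fixed-length code       00    01    10     11
Variable-length code    0     10    110    111
Average length:
- Fixed-length code: 2*(3/4) + 2*(1/6) + 2*(1/24) + 2*(1/24) = 2
- Variable-length code: 1*(3/4) + 2*(1/6) + 3*(1/24) + 3*(1/24) = 1.333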
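The example above can be rebuilt with a small Huffman construction. This is a sketch using the slide's four probabilities; the function name and the tie-breaking counter are choices made here. The 0/1 labels it assigns may differ from the slide's codewords, but the code lengths (1, 2, 3, 3) and the average length of 4/3 come out the same.

```python
import heapq
from fractions import Fraction
from itertools import count


def huffman_code(probabilities):
    """Build a prefix code by repeatedly merging the two least likely groups."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


PROBS = {
    "A": Fraction(3, 4),
    "B": Fraction(1, 6),
    "C": Fraction(1, 24),
    "D": Fraction(1, 24),
}

code = huffman_code(PROBS)
lengths = {sym: len(bits) for sym, bits in code.items()}
average_length = sum(PROBS[sym] * lengths[sym] for sym in PROBS)

print(code)
print(average_length)  # 4/3, i.e. about 1.333 bits per word
```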

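DPCM: Differential PCM
If repeated or similar words are very likely to occur consecutively, coding the differences works better than coding each word individually.
Example: 'AAFFFFFCCC'
- PCM => '65,65,70,70,70,70,70,67,67,67' (ASCII codes) or '0,0,5,5,5,5,5,2,2,2' (offsets from 'A')
- DPCM => '0,0,5,0,0,0,0,-3,0,0'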
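A minimal sketch of the DPCM example, assuming the simplest predictor: each sample is predicted by the previous one, and the first sample is predicted by 0. The function names are illustrative.

```python
def dpcm_encode(samples, initial=0):
    """Replace each sample by its difference from the previous sample."""
    diffs = []
    previous = initial
    for s in samples:
        diffs.append(s - previous)
        previous = s
    return diffs


def dpcm_decode(diffs, initial=0):
    """Rebuild the samples by accumulating the differences."""
    samples = []
    previous = initial
    for d in diffs:
        previous += d
        samples.append(previous)
    return samples


text = "AAFFFFFCCC"
pcm = [ord(ch) - ord("A") for ch in text]   # offsets from 'A'
dpcm = dpcm_encode(pcm)
restored = dpcm_decode(dpcm)

print(pcm)
print(dpcm)
```

The difference sequence is mostly zeros, which is exactly what a following variable-length or run-length coder handles well.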

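Run-Length Coding
Input:               0 0 0 -1 6 0 3 EOB
(run, level) pairs:  (3,1) (0,6) (1,3), then EOB
Codewords:           001111 001000010 001001010 10
VLC table excerpt (s = sign bit; in the example above s = 1 for the negative level and s = 0 for the positive levels):
Run  Level  Code
3    1      0011 1s
3    2      0010 0100 s
0    5      0010 0110 s
0    6      0010 0001 s
1    2      0001 10s
1    3      0010 0101 s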
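The run-length example can be traced in code. The sketch below uses only the six table entries shown on the slide plus the EOB codeword "10" taken from the example, so any other (run, level) pair raises a `KeyError`. The names are illustrative.

```python
# (run, |level|) -> code prefix without the sign bit, as listed on the slide.
VLC_TABLE = {
    (3, 1): "00111",
    (3, 2): "00100100",
    (0, 5): "00100110",
    (0, 6): "00100001",
    (1, 2): "000110",
    (1, 3): "00100101",
}
EOB = "10"


def run_level_pairs(coeffs):
    """Turn a coefficient list into (zero run, non-zero level) pairs."""
    pairs = []
    run = 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    return pairs  # trailing zeros are covered by the EOB codeword


def encode(coeffs):
    """Codewords for the coefficients, followed by EOB."""
    words = []
    for run, level in run_level_pairs(coeffs):
        sign = "1" if level < 0 else "0"
        words.append(VLC_TABLE[(run, abs(level))] + sign)
    words.append(EOB)
    return words


pairs = run_level_pairs([0, 0, 0, -1, 6, 0, 3])
codewords = encode([0, 0, 0, -1, 6, 0, 3])
print(pairs)
print(codewords)
```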

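Video Compression
- Encoder for a still image: Image block -> T -> Transform coefficients -> Q -> Zigzag scan (2D->1D) -> Entropy coding -> Bitstream
- Encoder for a video sequence: the same chain plus a reconstruction loop, Q^-1 -> Reconstructed transform coefficients -> T^-1 -> Reconstructed image block -> MC, whose output is subtracted (-) from the incoming image block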

H.261
- Intra frame: transmits the information of the whole frame
- Inter frame: references the previous frame; transmits motion vectors and the difference values

H.261 Coder (block diagram labels): Video in, DCT, Q, Inverse Q, Inverse DCT, Motion Compensation, Loop Filter

Motion Estimation
- Figure: referenced frame and current frame
- Block position (32,16) plus motion vector (-10,4) gives the matching position (22,20)
- Macroblock: 16*16; search window: 31*31

Full-search algorithm. Figure: current original frame and current referenced frame. Maximum checks: 31*31 = 961
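A sketch of full-search block matching over a 31*31 window (displacements from -15 to +15). It assumes the sum of absolute differences (SAD) as the matching cost, which the slides do not specify. The frames are synthetic: a pseudo-random reference frame and a current frame whose macroblock at (32,16) is copied from (22,20), mirroring the numbers on the Motion Estimation slide. All names are illustrative.

```python
import random


def sad(cur, ref, bx, by, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
    return total


def full_search(cur, ref, bx, by, size=16, search_range=15):
    """Test every displacement in the window and keep the cheapest one."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost, checks = None, None, 0
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate block would fall outside the frame
            cost = sad(cur, ref, bx, by, rx, ry, size)
            checks += 1
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost, checks


# Synthetic frames (x runs along a row, y selects the row).
rng = random.Random(0)
WIDTH, HEIGHT = 64, 48
ref = [[rng.randrange(256) for _ in range(WIDTH)] for _ in range(HEIGHT)]
cur = [[0] * WIDTH for _ in range(HEIGHT)]
for j in range(16):
    for i in range(16):
        cur[16 + j][32 + i] = ref[20 + j][22 + i]

mv, cost, checks = full_search(cur, ref, bx=32, by=16)
print(mv, cost, checks)
```

Every one of the 961 candidates costs a full 16*16 comparison, which is the motivation for the faster search patterns on the following slides.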

3-step search algorithm. Figure: current original frame and current referenced frame. Step size: 8->4->2->1. Maximum checks: 1+8+8+8+8 = 33
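The step pattern 8->4->2->1 can be sketched independently of the pixel data by passing the matching cost in as a function. In the sketch below, `cost(dx, dy)` is a hypothetical stand-in for a block-matching cost such as SAD. The demo uses a smooth bowl-shaped cost whose minimum sits at (-10,4), the motion vector from the Motion Estimation slide. On real images the cost surface may have local minima, so this kind of search is not guaranteed to find the global best match.

```python
def step_search(cost, steps=(8, 4, 2, 1)):
    """Check the centre, then 8 neighbours per step with a shrinking step size."""
    best = (0, 0)
    best_cost = cost(0, 0)
    checks = 1
    for step in steps:
        cx, cy = best  # centre of this round: best point found so far
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dx == 0 and dy == 0:
                    continue  # centre was already evaluated
                candidate = (cx + dx, cy + dy)
                c = cost(*candidate)
                checks += 1
                if c < best_cost:
                    best, best_cost = candidate, c
    return best, checks


def bowl(dx, dy):
    """Illustrative cost with a single minimum at (-10, 4)."""
    return (dx + 10) ** 2 + (dy - 4) ** 2


mv, checks = step_search(bowl)
print(mv, checks)
```

The total of 1 + 8 + 8 + 8 + 8 = 33 checks compares with 961 for the full search, and the largest reachable displacement is 8 + 4 + 2 + 1 = 15 in each direction.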

NTSS (new 3-step search) algorithm. Figure: search grid from -7 to +7 on both axes

FSS (4-step search) algorithm. Figure: search grid from -7 to +7 on both axes

Overview of Fine Granularity Scalability in MPEG-4 Video Standard Weiping Li, Fellow, IEEE

Illustration of video coding performance

Multi-layer Coding

SNR scalability decoder defined in MPEG-2

Layered scalable coding techniques: temporal scalability

Layered scalable coding techniques: spatial scalability

BIT-PLANE CODING OF THE DCT COEFFICIENTS
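This slide gives only the title, so the following is a sketch of the general idea rather than the slide's own example. The absolute values of the DCT coefficients are written in binary, each bit position forms one bit-plane (most significant first), and every 1 in a plane is described by a (RUN, EOP) symbol: the number of zeros before it, and whether it is the last 1 in the plane. The input values are illustrative, sign bits are left out, and a plane without any 1 simply yields an empty list here.

```python
def bit_planes(abs_values):
    """Split non-negative integers into bit-planes, most significant first."""
    n_planes = max(abs_values).bit_length()
    return [
        [(v >> bit) & 1 for v in abs_values]
        for bit in range(n_planes - 1, -1, -1)
    ]


def run_eop_symbols(plane):
    """(RUN, EOP) symbols for one bit-plane."""
    ones = [i for i, bit in enumerate(plane) if bit]
    symbols = []
    previous = -1
    for n, pos in enumerate(ones):
        run = pos - previous - 1              # zeros since the previous 1
        eop = 1 if n == len(ones) - 1 else 0  # 1 marks the last 1 of the plane
        symbols.append((run, eop))
        previous = pos
    return symbols


# Illustrative absolute values of zigzag-ordered coefficients (not from the slides).
values = [10, 0, 6, 0, 0, 3, 0, 2, 2, 0, 0, 2, 0, 0, 1] + [0] * 49

planes = bit_planes(values)
symbols = [run_eop_symbols(p) for p in planes]
for s in symbols:
    print(s)
```

Because the planes are sent from most to least significant, cutting the stream after any plane still leaves a coarser approximation of every coefficient.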

FGS USING BIT-PLANE CODING OF DCT COEFFICIENTS
- Overall Coding Structure of FGS
- Some Details of FGS Coding
- Profile Definitions in the Amendment of MPEG-4

Overall Coding Structure of FGS FGS encoder structure

Overall Coding Structure of FGS FGS decoder structure

Some Details of FGS Coding
1) Different Numbers of Bit-Planes for Individual Color Components
2) Variable-Length Codes
3) Decoding Truncated Bitstreams

Different Numbers of Bit-Planes for Individual Color Components

Variable-Length Codes Statistics of the (RUN, EOP) symbols in the four VLC tables

Coding patterns for syntax element fgs_cbp

Decoding Truncated Bitstreams
Decoding of a truncated bitstream is not standardized in MPEG-4. One possible method:
- Look ahead 32 bits at every byte-aligned position in the bitstream.
- If the 32 bits are not the fgs vop start code, the first 8 of the 32 bits are information bits of the FGS frame being decoded. The decoder slides the bitstream pointer by one byte and looks ahead another 32 bits to check for the fgs vop start code.
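The look-ahead method above can be sketched as a byte-wise scan. The start-code value below is a placeholder invented for this example, not the value defined by MPEG-4. The sketch also makes two assumptions the slide does not address: bytes before the first start code are ignored, and fewer than 32 remaining bits at the end of the stream are treated as information bits.

```python
# Placeholder 32-bit start code, for illustration only (not the MPEG-4 value).
ASSUMED_START_CODE = bytes([0x00, 0x00, 0x01, 0xFF])


def split_fgs_frames(stream, start_code=ASSUMED_START_CODE):
    """Split a byte stream into per-frame payloads at each start code."""
    frames = []
    current = None
    pos = 0
    while pos < len(stream):
        # Look ahead 32 bits at the current byte-aligned position.
        if stream[pos:pos + 4] == start_code:
            current = bytearray()   # a new FGS frame begins here
            frames.append(current)
            pos += 4
        else:
            if current is not None:
                current.append(stream[pos])  # first 8 bits are information bits
            pos += 1                # slide the pointer by one byte
    return [bytes(f) for f in frames]


# Two frames, each truncated to a different length.
stream = (
    ASSUMED_START_CODE + bytes([0x12, 0x34, 0x56])
    + ASSUMED_START_CODE + bytes([0x9A])
)
frames = split_fgs_frames(stream)
print(frames)
```

The decoder never needs to know where a frame was truncated: it keeps consuming bytes until the next start code appears.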
