Download presentation
Presentation is loading. Please wait.
Published byCandice Charles Modified over 8 years ago
1
Signal Prediction, Transformation & Decomposition Trần Duy Trác ECE Department The Johns Hopkins University Baltimore MD 21218
2
Outline Prediction Open-loop differential pulse-code modulation (DPCM) Closed-loop DPCM Optimal linear prediction Transformation & Signal Decomposition Transform fundamentals Basis functions, transform coefficients Invertibility, unitary 1D, 2D Karhunen-Loeve Transform (KLT) Optimal linear transform Discrete Cosine Transform (DCT) Filter Banks and the Discrete Wavelet Transform
3
Reminder reconstructed signal original signal compressed bit-stream Information theory VLC Huffman Arithmetic Run-length Quantization Prediction Transform
4
Predictive Coding We have only dealt with memory-less model so far Each symbol/sample is quantized and/or coded without much knowledge on previous ones There is usually a strong correlation between neighboring symbols/samples Simplest prediction scheme: take the difference! The range of the differences should be a lot smaller – a reduction in the signal entropy!
5
Open-Loop DPCM input signal communication channel encoder decoder reconstructed signal
6
Open-Loop DPCM: Analysis Q Q + Quantization Error Accumulation Encoder Decoder
7
Open-Loop DPCM: Analysis There seems to be a model mismatch Encoder: Decoder: How about using the reconstructed sample for the difference?
8
Closed-Loop DPCM: Analysis Modified prediction scheme Decoder remains the same No error accumulation!
9
Closed-Loop DPCM input signal communication channel encoder decoder reconstructed signal
10
Closed-Loop DPCM: Observations Quantization error does not accumulate Minor modification in prediction scheme leads to major encoder modification Encoder now has decoder embedded inside Closed-loop & open-loop DPCM has the same decoder DPCM predicts current sample from last reconstructed one Generalization? Replace the simple delay operator by more complicated & more sophisticated predictor P(z)
11
Linear Prediction communication channel encoder decoder
12
Optimal Linear Prediction Problem Assumptions Signal is WSS High bit rates, i.e., fine quantization Approach
13
Optimal Linear Prediction
15
Correlation matrix of WSS RP x[n] Cross correlation vector between x[n] and past observed samples approximated from time averaging
16
Linear Signal Representation input signal transform coefficient basis function Synthesis Analysis Approximation using as few coefficients as possible
17
Transform Fundamentals Q AnalysisSynthesis 1D Analysis Transform 1D Synthesis Transform
18
Invertibility & Unitary Invertibility perfect reconstruction, bi-orthogonal, reversible Unitary orthogonal, orthonormal same analysis & synthesis basis functions
19
Norm Preservation Norm preservation property of orthonormal transform From a coding perspective Q error in the transform domain equals Q error in the spatial domain! Concentrate on the quantization of the transform coefficients
20
2D Separable Transformation 2D Analysis 2D Synthesis transforming columns transforming rows N x N 2D Orthogonal Synthesis
21
Example: Fourier Transform Discrete Fourier Transform (DFT) k n Jean Baptiste Joseph Fourier
22
KLT: Optimal Linear Transform Karhunen-Loeve Transform (KLT) Hotelling transform, principle component analysis (PCA) Question: Amongst the linear transforms, which one is the best decorrelator, offering the best “compression”? Problem: Assumptions: x is zero-mean WSS RP.
23
KLT Reminder: Orthonormal constraint: Autocorrelation matrix
24
KLT basis = Eigenvectors Reminder: eigenvector eigenvalue Lagrange Multiplier Optimal coding scheme: send the larger eigenvalues first!
25
KLT Problems KLT problems Signal dependent Computationally expensive statistics need to be computed no structure, no symmetry, no guarantee of stability Real signals are really stationary Encoder/Decoder communication Practical solutions Assume a reasonable signal model Blocking the signals to ensure stationary assumption holds Making the transform matrix sparse & symmetric Good KLT approximation for smooth signals: DCT! Assumptions: x is zero-mean WSS RP.
26
KLT: Summary eigenvectors Signal dependent Require stationary signals How do we communicate bases to the decoder? How do we design “good” signal-independent transform?
27
Best Basis Revisited Fundamental question: what is the best basis? energy compaction: minimize a pre-defined error measure, say MSE, given L coefficients maximize perceptual reconstruction quality low complexity: fast-computable decomposition and reconstruction intuitive interpretation How to construct such a basis? Different viewpoints! Applications compression, coding signal analysis de-noising, enhancement communications
28
Discrete Cosine Transforms Type I Type II Type III Type IV
29
DCT Type-II DC 8 x 8 block low frequency high frequency middle frequency horizontal edges vertical edges DCT basis orthogonal orthogonal real coefficients real coefficients symmetry symmetry near-optimal near-optimal fast algorithms fast algorithms
30
DCT Symmetry DCT basis functions are either symmetric or anti-symmetric Prof. K. R. Rao
31
DCT: Recursive Property An M-point DCT–II can be implemented via an M/2-point DCT–II and an M/2-point DCT–IV Butterfly input samples DCT coefficients
32
Fast DCT Implementation 13 multiplications and 29 additions per 8 input samples
33
Block DCT input blocks, each of size M output blocks of DCT coefficients, each of size M
34
Filter Bank First FB designed for speech coding, [Croisier-Esteban-Galand 1976] Orthogonal FIR filter bank, [Smith-Barnwell 1984], [Mintzer 1985] Q 2 2 2 2 Analysis Bank Synthesis Bank low-pass filter high-pass filter
35
FB Analysis Q 2 2 2 2 Alternating-Sign Construction
36
History: Wavelets Early wavelets: for geophysics, seismic, oil-prospecting applications, [Morlet-Grossman-Meyer 1980-1984] Compact-support wavelets with smoothness and regularity, [Daubechies 1988] Linkage to filter banks and multi-resolution representation, fast discrete wavelet transform (DWT), [Mallat 1989] Even faster and more efficient implementations: lattice structure for filter banks, [Vaidyanathan-Hoang 1988]; lifting scheme, [Sweldens 1995] Prof. Y. MeyerProf. I. DaubechiesProf. S. Mallat
37
From Filter Bank to Wavelet [Daubechies 1988], [Mallat 1989] Constructed as iterated filter bank Discrete Wavelet Transform (DWT): iterate FB on the lowpass subband 0 frequency spectrum
38
1-Level 2D DWT input image LP HP LP HP decompose columns decompose rows LP HP LL LH HLHH LL: smooth approximation LH: horizontal edges HL: vertical edges HH: diagonal edges LH
39
LP HP LP HP LP HP decompose rowsdecompose columns 2-Level 2D DWT LP HP LP HP decompose rowsdecompose columns LP HP LH HLHH
40
2D DWT
41
Time-Frequency Localization best time localization best frequency localization STFT uniform tiling wavelet dyadic tiling Heisenberg’s Uncertainty Principle: bound on T-F product Wavelets provide flexibility and good time-frequency trade-off
42
Wavelet Packet Iterate adaptively according to the signal Arbitrary tiling of the time-frequency plain 0 frequency spectrum Question: can we iterate any FB to construct wavelets and wavelet packets?
43
Scaling and Wavelet Function Discrete Basis Continuous-time Basis product filters scaling function wavelet function
44
Convergence & Smoothness Not all FB yields nice product filters Two fundamental questions Will the infinite product converge? Will the infinite product converge to a smooth function? Necessary condition for convergence: at least a zero at Sufficient condition for smoothness: many zeros at with 2 zeros at with 4 zeros at
45
Sparsity in Video Signals: Motion Estimation & Compensation Trac D. Tran ECE Department The Johns Hopkins University Baltimore MD 21218
46
Outline Video coding algorithms and standards: an overview Main observations. Several simple video codecs Motion-compensated DCT coding (MC-DCT) Motion-compensated wavelet coding (MC-DWT) 3D wavelet and subband coding Block motion estimation (BME) and motion compensation Principle. Error measures. Optimality. Search strategies. Examples of fast sub-optimal algorithms Fast optimal algorithms: spiral search, successive elimination, projection matching... Combination searches: hierarchical, spatial correlation, rate- constrained... Advanced topics in motion estimation / motion compensation Global motion estimation Advanced motion models: sub-pixel motion, affine motion, meshes, variable-block-size MEMC...
47
Main Observations Video signal is a sequence of still images or frames Correlation in video sequence Temporal correlation: similar background with a few moving objects in the foreground Spatial correlation: similar pixels seem to group together – just like spatial correlation in images There is usually much more temporal correlation then spatial Motion model in video sequence Natural motion moving people & moving objects translational, rotational, scaling Camera motion camera panning, camera zooming, fading
48
Examples of Video Sequences Frame 1 51 71 91 111 Observations of Visual Data There is a lot of redundancy, correlation, strong structure within natural image/video Images Spatial correlation: a lot of smooth areas with occasional edges Video Temporal correlation: neighboring frames seem to be very similar
49
Simple Video Coders JPEG/JPEG2000 encodes every frame independently Quite popular: Motion-JPEG, Motion-JPEG2000 Does not take into account any temporal correlation Advantage: very simple and fast, no latency (frame delay), easy frame access for video editing Disadvantage: low coding performance JPEG/JPEG2000 encodes frame difference Requires very stationary background to be effective Advantage: improves compression, low latency Disadvantage: still low coding performance, open-loop design, quantization error accumulation from coder/decoder mismatch JPEG/JPEG2000 encodes frame difference – closed-loop Advantage: no error accumulation, low latency, simple Disadvantage: too simple motion model, still low compression
50
Motion-Compensated Framework Transform-based coding on motion-compensated prediction error (residue) Popular transformations: block DCT or wavelet Closed-loop DPCM to prevent error propagation (drifting) Usually employed block-translation motion model All international video coding standards are based on this coding framework Video teleconferencing: H.261, H.263, H.263++, H.26L/H.264 Video archive & play-back: MPEG-1, MPEG-2 (in DVDs) MPEG-4
51
Typical MC Codec Entropy Decoding, Inverse Q, Inverse Transform Motion Comp. Predictor Transform, Quantization, Entropy Coding Motion Estimation Frame Buffer (Delay) Motion Compensated Prediction Input Frame Encoded Residual (To Channel) Approximated Input Frame (To Display) Motion Vector and Block Mode Data (Side-Info, To Channel) Decoder
52
Motion Estimation Goal: extract correlation between adjacent video frames to improve compression efficiency General problem statement: Practical motion model: small objects or regions moving in translational fashion Block-based motion estimation (BME) and compensation (BMC) time axis
53
Intra- and Inter-Coding Inter-coding: blocks of predicted motion error labeled P-block is encoded Intra-coding any block that motion estimation fails to find a good match is labeled I-block and encoded as is (without any motion compensation) Conditional replenishment when prediction error is low, no coding or decoding is necessary we simply record the motion vector and replenish the block from the reference frame.
54
Motion Models Translation Affine Bilinear Perspective
55
Principle of BME Partition current frame into small non-overlapped blocks called macro-blocks (MB) For each block, within a search window, find the motion vector (displacement) that minimizes a pre-defined mismatch error For each block, motion vector and prediction error (residue) are encoded Search Window Reference Frame Current Frame MV
56
BME: Error Measure Sum of absolute differences Sum of squared errors (mean-squared error) Discussions: Other norms, correlation measure have been tested Approximately same coding performance SAD is less complex for some hardware architectures search range current blockreference block
57
Common Setting Macro-block Luminance: 16x16, four 8x8 blocks Chrominance: two 8x8 blocks Motion estimation only performed for luminance Motion vector range [ -15, 15] YY YY CrCb 8 8 16 15 Search Area
58
Search Strategies I All possible MV candidates within the search range are investigated Very computationally expensive Optimal, highly regular, parallel computable Exhaustive Search dx dy
59
Search Strategies II Sampling the search space At every stage, investigate a few candidates Move in the direction of the best match Can adaptively reduce the displacement size for each stage Gradient Search Sampling the search space Divide search space into regions Pick a center point for each region Perform more elaborate search on the region with best center Divide and Conquer
60
Search Strategies III Multi-resolution or Hierarchical Search current frame reference frame DD BME MV Field Full Search
61
Fast BME Algorithm: Example I Use + pattern at each stage (5 candidates) Move center to best match Reduce step size at each stage Fast, reasonable quality Similar algorithms: three-step-search, four- step-search, cross search... 2D Log Search dx dy
62
Fast BME Algorithm: Example II Divide-and-conquer Move search to region with best match Finer search at later stages Binary Search
63
Optimal BME: Spiral Search Early termination Start with a predicted search center, default=(0,0) Spiral search around center in diamond or square pattern Keep track of best match so far; update whenever a better candidate is found MPEG-4, H.263+ Spiral Search dx dy
64
Optimal BME: Partial Matching Triangle inequality Strategy Compute partial sums for macro-blocks in both current and reference frame Eliminate candidates based on partial sums Nice partial sums: row projections, column projections! Avoid full 2D matching, translate the problem into 1D matching Can be combined with spiral search and early termination partial sums
65
Spatial Correlation Based BME Exploit intra-frame spatial correlation to narrow down search space for BME Blocks covering the same object should move together MVs from causal neighboring blocks can be used to predict MV for the current block Reduce bit-rate for MV encoding: regularize the MV field Reduce BME complexity C1 234
66
BME/BMC: Example I Previous FrameCurrent FrameMotion Vector Field Frame DifferenceMotion-Compensated Difference
67
BME/BMC: Example II Previous FrameCurrent FrameMotion Vector Field Frame DifferenceMotion-Compensated Difference
68
Sub-Pixel Motion Estimation Sub-pixel motion vector resolution Use linear/bilinear interpolation to fill in sub-pixels Trade-offs: motion accuracy versus MV bit-rate and complexity increase H.26L uses down to 1/4-MEMC, maybe even 1/8 integer pixel half pixel AB CD b cd b = round[(A+B)/2] c = round[(A+C)/2] d = round[(A+B+C+D)/4]
69
B frames Possible temporal prediction for B pictures: b C1 C2 Frame k-1 Frame k Frame k+1
70
B Frames Can have better coding efficiency Average of two predictions reduces the variance New objects can be better predicted using future frames
71
Multiple Reference Frames More than one previously decoded pictures can be used as reference R-D optimization is also needed to choose the best reference frame Employed in H.264
72
Variable-Block-Size BME Motion model for H.264 Generalization of the traditional translation BME framework Sub-divide macro-block: 16x16, 16x8, 8x16, 8x8, 8x4, 4x8, 4x4 Mode 1Mode 2Mode 3Mode 4Mode 5Mode 6Mode 7
73
Mesh-Based Motion Estimation MPEG-4 object motion Affine warping motion model Deformable polygon meshes Similar MAD, SSE error measures Trade-offs: more accurate ME vs. tremendous complexity Bilinear and perspective motion models are rarely used in video coding
74
Global Motion Estimation Rarely used in practice: BME/BMC mostly suffices Reference frame resampling: an option in H.263+/H.263++ Global affine motion model: special-effect warping 3D subband & wavelet coding: align frames before temporal filtering [Taubman]
75
Conclusion Temporal correlation dominates spatial correlation in video sequences Motion estimation and motion compensation improve coding performance the most in video coding State-of-the-art video coders, such as the current H.26L verification model, still employ BME-block DCT coding framework Despite its simplicity, BME is still the bottleneck of real- time video communications
Similar presentations
© 2024 SlidePlayer.com Inc.
All rights reserved.