EE591f Digital Video Processing

Roadmap
- Introduction to Block Matching Algorithm (BMA)
- Fast BMA
  - Three classes of speed-up strategies
- Generalized BMA
  - From integer-pel to fractional-pel
  - From fixed block size to variable block size
  - Deformable BMA (DBMA) or mesh-based BMA
- Experimental results
  - How do block size and motion accuracy affect the MCP efficiency?
An Intuitive Way of Understanding the Block Matching Algorithm (BMA)
[Figure: a template is matched against the entries of a database]
Block Matching in Motion Estimation
[Figure: inquiry block in the current frame and candidate blocks in the reference frame]
Candidate motion vectors: a: (-3,-2)  b: (-3,-1)  c: (0,0)  d: (1,2)
Motion Estimation and Compensation
- Motion compensation: with the estimated motion vector (d1,d2), the block in the reference frame is displaced to generate a prediction of the inquiry block in the current frame. This procedure is called "motion compensation".
- Motion-compensated prediction (MCP) residues = current frame - displaced reference frame
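
The displacement step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: frames are lists of rows, (d1,d2) is taken to mean d1 pixels horizontally and d2 pixels vertically, and the function names are hypothetical rather than taken from the course material.

```python
def motion_compensate(reference, top, left, size, mv):
    """Return the size-by-size block of `reference` displaced by mv = (d1, d2)."""
    d1, d2 = mv  # assumed convention: d1 is horizontal, d2 is vertical
    return [[reference[top + d2 + r][left + d1 + c] for c in range(size)]
            for r in range(size)]


def mcp_residue(current, prediction, top, left):
    """Residue = inquiry block in the current frame minus its prediction."""
    size = len(prediction)
    return [[current[top + r][left + c] - prediction[r][c] for c in range(size)]
            for r in range(size)]
```

When the motion vector is exact and the scene content is unchanged, the residue block is all zeros; in practice the encoder codes whatever residue remains.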
Two Key Elements in BMA
- Matching criterion: how do I measure the similarity between two blocks?
  - Mean Square Error (MSE): L2 norm
  - Mean Absolute Difference (MAD): L1 norm
- Search strategy: how do I find the best match of the given block?
  - Exhaustive search: global minimum
  - Non-exhaustive search: close to global minimum
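
The two matching criteria can be written out as follows. This is a minimal sketch assuming blocks are equally sized lists of rows; the function names `mse` and `mad` are illustrative.

```python
def mse(block_a, block_b):
    """Mean square error between two equally sized blocks (L2-type criterion)."""
    n = len(block_a) * len(block_a[0])
    return sum((x - y) ** 2
               for row_a, row_b in zip(block_a, block_b)
               for x, y in zip(row_a, row_b)) / n


def mad(block_a, block_b):
    """Mean absolute difference between two equally sized blocks (L1-type criterion)."""
    n = len(block_a) * len(block_a[0])
    return sum(abs(x - y)
               for row_a, row_b in zip(block_a, block_b)
               for x, y in zip(row_a, row_b)) / n
```

MAD avoids the multiplication per pixel, which is one reason absolute-difference criteria are popular in fast implementations.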
Goal: Find the Best Tradeoff
[Figure: variance of MCP residues versus computational cost]
Benchmark: Exhaustive Search
An example with window size T=7: it searches (2T+1)^2 = 225 points in total.
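
A sketch of the exhaustive benchmark, assuming frames are lists of rows, the sum of absolute differences (SAD, an unnormalized MAD) is the matching criterion, and candidates falling outside the frame are skipped. The function names and the (dx, dy) ordering are choices made for this example.

```python
def sad(cur, ref, top, left, dx, dy, B):
    """Sum of absolute differences between the B-by-B inquiry block and a candidate."""
    return sum(abs(cur[top + r][left + c] - ref[top + dy + r][left + dx + c])
               for r in range(B) for c in range(B))


def exhaustive_search(cur, ref, top, left, B, T):
    """Test every displacement in [-T, T] x [-T, T]; return (mv, cost, points checked)."""
    H, W = len(ref), len(ref[0])
    best = None
    checked = 0
    for dy in range(-T, T + 1):
        for dx in range(-T, T + 1):
            if top + dy < 0 or left + dx < 0 or top + dy + B > H or left + dx + B > W:
                continue  # candidate block falls outside the reference frame
            checked += 1
            cost = sad(cur, ref, top, left, dx, dy, B)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best[1], best[0], checked
```

For a block far enough from the frame border, `checked` equals (2T+1)^2, which is 225 for T=7.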
Fast Block Matching Algorithms
- Class-A (I-IV): ad-hoc speed-up strategies
- Class-B (V-VII): advanced speed-up strategies (wise use of computational resources to account for probabilities)
- Class-C (VIII): hierarchical strategy
General principle: trade performance for reduced complexity
Fast BMA (I): 3-Step Search
Searches 25 points.
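
The 25-point count follows from 9 points in the first step plus 8 new points in each of the two refinement steps. Below is a sketch of the commonly described three-step search with step sizes 4, 2 and 1; it assumes SAD as the criterion and list-of-rows frames, and the names are illustrative rather than taken from the slides.

```python
def sad(cur, ref, top, left, dx, dy, B):
    """Sum of absolute differences between the inquiry block and a candidate."""
    return sum(abs(cur[top + r][left + c] - ref[top + dy + r][left + dx + c])
               for r in range(B) for c in range(B))


def three_step_search(cur, ref, top, left, B, step=4):
    """Return (mv, cost, number of distinct points evaluated)."""
    H, W = len(ref), len(ref[0])
    cache = {}  # distortion of every displacement evaluated so far

    def cost(dx, dy):
        if top + dy < 0 or left + dx < 0 or top + dy + B > H or left + dx + B > W:
            return float("inf")  # candidate block falls outside the frame
        if (dx, dy) not in cache:
            cache[(dx, dy)] = sad(cur, ref, top, left, dx, dy, B)
        return cache[(dx, dy)]

    cx = cy = 0
    while step >= 1:
        # Centre first, so ties keep the current centre.
        points = [(cx, cy)] + [(cx + i * step, cy + j * step)
                               for j in (-1, 0, 1) for i in (-1, 0, 1)
                               if (i, j) != (0, 0)]
        cx, cy = min(points, key=lambda p: cost(p[0], p[1]))
        step //= 2
    return (cx, cy), cost(cx, cy), len(cache)
```

Unlike the exhaustive search, this can settle on a local minimum when the distortion surface is not unimodal.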
Fast BMA (II): Logarithmic Search
Searches at most 5+4+2+3+2+8 = 24 points.
Fast BMA (III): Orthogonal Search
Searches at most 2(3+2+2+2+2+2) = 26 points.
Fast BMA (IV): Cross Search
Searches at most 5+4+4+4 = 17 points.
Why does probabilistic modeling of MV help?
[Figure: empirical pdf of motion vectors]
Fast BMA (V): New 3-Step Search
New 3-Step Search: Examples
Fast BMA (VI): 4-Step Search
1. Search the 9 checking points located on a 5-by-5 window. Is the minimum-distortion point found at the center?
   - Yes: go to the final 3-by-3 search.
   - No: is it at a corner of the window?
     - Yes: search 5 additional checking points.
     - No: search 3 additional checking points.
2. Repeat the procedure in the dashed box of the flowchart.
3. Final 3-by-3 search.
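
The flowchart can be sketched as below. The sketch relies on one assumption worth stating: instead of hard-coding the "5 additional" and "3 additional" point patterns, it caches every evaluated displacement, so re-centering the 5-by-5 window on a corner or an edge point automatically costs 5 or 3 new evaluations. SAD is assumed as the criterion and all names are illustrative.

```python
def sad(cur, ref, top, left, dx, dy, B):
    """Sum of absolute differences between the inquiry block and a candidate."""
    return sum(abs(cur[top + r][left + c] - ref[top + dy + r][left + dx + c])
               for r in range(B) for c in range(B))


def four_step_search(cur, ref, top, left, B):
    """Return (mv, cost, number of distinct points evaluated)."""
    H, W = len(ref), len(ref[0])
    cache = {}  # distortion of every displacement evaluated so far

    def cost(dx, dy):
        if top + dy < 0 or left + dx < 0 or top + dy + B > H or left + dx + B > W:
            return float("inf")  # candidate block falls outside the frame
        if (dx, dy) not in cache:
            cache[(dx, dy)] = sad(cur, ref, top, left, dx, dy, B)
        return cache[(dx, dy)]

    def best_around(cx, cy, step):
        # Centre first, so ties keep the current centre.
        points = [(cx, cy)] + [(cx + i * step, cy + j * step)
                               for j in (-1, 0, 1) for i in (-1, 0, 1)
                               if (i, j) != (0, 0)]
        return min(points, key=lambda p: cost(p[0], p[1]))

    cx = cy = 0
    for _ in range(3):                    # at most three 5-by-5 steps
        nx, ny = best_around(cx, cy, 2)   # 9 points spaced 2 apart = 5-by-5 window
        if (nx, ny) == (cx, cy):
            break                         # minimum at the centre: go to the final step
        cx, cy = nx, ny
    cx, cy = best_around(cx, cy, 1)       # final 3-by-3 search
    return (cx, cy), cost(cx, cy), len(cache)
```

With this structure the best case costs 9+8 = 17 points and the worst case 9+5+5+8 = 27 points.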
4-Step Search: Examples
The Idea of Successive Refinement
- All previous approaches to fast BMA only reduce the number of search points.
- For each search point, we still need to calculate the matching criterion over a B-by-B block.
- To further reduce the complexity, we can also reduce the cost of each match.
Multi-resolution Representation of Images
Multi-resolution representation by pyramid
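
One simple way to build such a pyramid is to average 2-by-2 neighborhoods repeatedly. The slides do not say which low-pass filter is used, so the mean filter here is an assumption, as are the function names and the convention that level 0 is the full-resolution image.

```python
def downsample(img):
    """Halve each dimension by averaging 2-by-2 neighbourhoods (integer mean)."""
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1]
              + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) // 4
             for c in range(len(img[0]) // 2)]
            for r in range(len(img) // 2)]


def build_pyramid(img, levels):
    """Return [level 0 (full resolution), level 1, ...], each half the previous size."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid
```

Each coarser level has a quarter of the pixels, so a block match at that level costs roughly a quarter as much.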
Why does Hierarchical Strategy Help?
[Figure: Level-2 ME result, Level-1 ME result, Level-0]
Hierarchical Block Matching Algorithm (HBMA)
Example: Three-level HBMA
Fast BMA (VIII): Hierarchical Search
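
A three-level hierarchical search can be sketched as follows: estimate the motion at the coarsest level with a small window, double the vector, and refine it at each finer level. The assumptions are mine, not the slides': level 0 is the full-resolution frame, each level is a 2-by-2 mean of the previous one, SAD is the criterion, and every level uses the same small search radius.

```python
def downsample(img):
    """Halve each dimension by averaging 2-by-2 neighbourhoods (integer mean)."""
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1]
              + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) // 4
             for c in range(len(img[0]) // 2)]
            for r in range(len(img) // 2)]


def sad(cur, ref, top, left, dx, dy, B):
    return sum(abs(cur[top + r][left + c] - ref[top + dy + r][left + dx + c])
               for r in range(B) for c in range(B))


def local_search(cur, ref, top, left, B, center, radius):
    """Full search in a (2*radius+1)^2 window around the displacement `center`."""
    H, W = len(ref), len(ref[0])
    best_cost, best_mv = None, center
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            if top + dy < 0 or left + dx < 0 or top + dy + B > H or left + dx + B > W:
                continue  # candidate block falls outside the frame
            cost = sad(cur, ref, top, left, dx, dy, B)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv


def hierarchical_search(cur, ref, top, left, B, levels=3, radius=2):
    """Coarse-to-fine search; returns the full-resolution motion vector (dx, dy)."""
    curs, refs = [cur], [ref]
    for _ in range(levels - 1):
        curs.append(downsample(curs[-1]))
        refs.append(downsample(refs[-1]))
    mv = (0, 0)
    for level in range(levels - 1, -1, -1):      # coarsest level first
        scale = 2 ** level
        mv = local_search(curs[level], refs[level], top // scale, left // scale,
                          max(B // scale, 1), mv, radius)
        if level > 0:
            mv = (2 * mv[0], 2 * mv[1])          # propagate to the next finer level
    return mv
```

With `radius=2` and three levels the search covers displacements up to 14 pixels while testing only 25 candidates per level, and the coarse-level candidates are matched on smaller blocks.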
Summary
- Why do we care about fast BMA?
  - Driven by the application demands of video coding
- Can we go beyond BMA?
  - The block-based constraint is simple but cannot account for the arbitrary shapes of moving objects
  - Integer-pel accuracy is not sufficient to account for the continuous nature of motion
Why Do We Need Fractional-pel?
Fractional-pel BMA
[Figure: the original N-by-M reference frame is linearly interpolated into a 2N-by-2M interpolated reference frame]
Half-pel BMA
[Figure: current frame and reference frame; digits indicate physical distances]
Bilinear Interpolation
Input pixels (x,y), (x+1,y), (x,y+1), (x+1,y+1) map to output pixels (2x,2y), (2x+1,2y), (2x,2y+1), (2x+1,2y+1):
  O[2x,2y] = I[x,y]
  O[2x+1,2y] = (I[x,y] + I[x+1,y])/2
  O[2x,2y+1] = (I[x,y] + I[x,y+1])/2
  O[2x+1,2y+1] = (I[x,y] + I[x+1,y] + I[x,y+1] + I[x+1,y+1])/4
Generalize to 1/K pixel where K > 2.
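
The four interpolation rules collapse into a single expression once the two nearest input rows and columns are identified. The sketch below assumes a row-major image (`I[y][x]`, whereas the slide writes `I[x,y]`) and produces a (2H-1)-by-(2W-1) output so that no samples beyond the border need to be invented; the function name is illustrative.

```python
def upsample2(I):
    """Bilinear interpolation of a list-of-rows image onto the half-pel grid."""
    H, W = len(I), len(I[0])
    O = [[0.0] * (2 * W - 1) for _ in range(2 * H - 1)]
    for y in range(2 * H - 1):
        for x in range(2 * W - 1):
            y0, x0 = y // 2, x // 2            # integer-pel neighbour above/left
            y1, x1 = y0 + y % 2, x0 + x % 2    # equals (y0, x0) on the integer grid
            O[y][x] = (I[y0][x0] + I[y0][x1] + I[y1][x0] + I[y1][x1]) / 4
    return O
```

On even coordinates the four terms coincide and the input pixel is reproduced; on mixed coordinates the expression reduces to the two-pixel averages, and on odd coordinates to the four-pixel average.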
Hierarchical Strategy for Half-pel BMA
[Figure: integer-pel search followed by half-pel refinement]
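
The two-stage idea can be sketched as an integer-pel full search followed by a test of the eight half-pel neighbors of the integer winner. Assumptions for this example: the reference frame is bilinearly interpolated onto the half-pel grid beforehand, SAD is the criterion, displacements are handled internally in half-pel units, and all names are illustrative.

```python
def upsample2(I):
    """Bilinear interpolation of a list-of-rows image onto the half-pel grid."""
    H, W = len(I), len(I[0])
    return [[(I[y // 2][x // 2] + I[y // 2][x // 2 + x % 2]
              + I[y // 2 + y % 2][x // 2] + I[y // 2 + y % 2][x // 2 + x % 2]) / 4
             for x in range(2 * W - 1)]
            for y in range(2 * H - 1)]


def sad_half(cur, up, top, left, hx, hy, B):
    """SAD against the interpolated frame; (hx, hy) is in half-pel units."""
    return sum(abs(cur[top + r][left + c] - up[2 * (top + r) + hy][2 * (left + c) + hx])
               for r in range(B) for c in range(B))


def search(cur, up, top, left, B, center, radius, step):
    """Test displacements center + step * {-radius..radius} (half-pel units)."""
    best = None
    for j in range(-radius, radius + 1):
        for i in range(-radius, radius + 1):
            hx, hy = center[0] + i * step, center[1] + j * step
            y_lo, x_lo = 2 * top + hy, 2 * left + hx
            if (y_lo < 0 or x_lo < 0 or y_lo + 2 * (B - 1) >= len(up)
                    or x_lo + 2 * (B - 1) >= len(up[0])):
                continue  # candidate block falls outside the interpolated frame
            cost = sad_half(cur, up, top, left, hx, hy, B)
            if best is None or cost < best[0]:
                best = (cost, (hx, hy))
    return best[1], best[0]


def half_pel_search(cur, ref, top, left, B, T):
    """Integer-pel full search, then half-pel refinement; returns (mv in pels, cost)."""
    up = upsample2(ref)
    coarse, _ = search(cur, up, top, left, B, (0, 0), T, 2)   # step 2 = one full pel
    fine, cost = search(cur, up, top, left, B, coarse, 1, 1)  # 3-by-3 half-pel window
    return (fine[0] / 2, fine[1] / 2), cost
```

The refinement stage adds only 8 extra candidates on top of the integer-pel search, instead of quadrupling the number of candidates as a direct half-pel full search would.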
Beyond Half-pel Accuracy
- Results exist that support a further gain in prediction efficiency from half-pel to quarter-pel; sometimes it is even worthwhile to reach 1/8-pel accuracy.
- The improved prediction efficiency comes at the cost of modestly increased computational complexity and overhead.
- Question: for what kind of video does finer accuracy improve the MCP efficiency most?
Generalizations of BMA
- Variable block-size matching algorithms
  - Widely used by various video coding standards
  - H.264 includes variable block sizes such as 4-by-4, 8-by-8 and 16-by-16
- Fractional-pel accuracy BMA
  - Half-pel: MPEG-1/2/4, H.263/H.263+
  - Quarter-pel: H.264 (even 1/8-pel)
- Tradeoff between overhead on motion and MCP efficiency
Variable Block-size BMA
[Figure: 16-by-16, 8-by-8 and 4-by-4 blocks]
BMA Strategy Adopted by H.263
[Figure: macroblock level and block level]
BMA Strategy Adopted by H.264
Note: overhead is required to signal which partition the encoder has adopted.
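
The overhead tradeoff can be illustrated with a toy decision between one motion vector for a 16-by-16 block and four vectors for its 8-by-8 sub-blocks. Everything specific here is an assumption for illustration: the cost "SAD + lam * number of vectors", the weight `lam`, and the function names. Real encoders use their own rate-distortion measures and more partition shapes.

```python
def sad(cur, ref, top, left, dx, dy, B):
    return sum(abs(cur[top + r][left + c] - ref[top + dy + r][left + dx + c])
               for r in range(B) for c in range(B))


def full_search(cur, ref, top, left, B, T):
    """Exhaustive integer-pel search; returns (mv, cost)."""
    H, W = len(ref), len(ref[0])
    best = None
    for dy in range(-T, T + 1):
        for dx in range(-T, T + 1):
            if top + dy < 0 or left + dx < 0 or top + dy + B > H or left + dx + B > W:
                continue  # candidate block falls outside the frame
            cost = sad(cur, ref, top, left, dx, dy, B)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best[1], best[0]


def choose_partition(cur, ref, top, left, T, lam):
    """Pick one 16x16 vector or four 8x8 vectors, charging `lam` per vector sent."""
    mv16, cost16 = full_search(cur, ref, top, left, 16, T)
    subs = [full_search(cur, ref, top + oy, left + ox, 8, T)
            for oy in (0, 8) for ox in (0, 8)]
    cost8 = sum(cost for _, cost in subs)
    if cost16 + lam * 1 <= cost8 + lam * 4:   # hypothetical overhead model
        return "16x16", [mv16]
    return "8x8", [mv for mv, _ in subs]
```

When the whole macroblock moves uniformly, the single vector wins because the extra vectors buy no distortion reduction; when two halves move differently, the split pays for its overhead.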
Deformable Block Matching Algorithm
Overview of DBMA
Three steps:
1. Partition the anchor frame into regular blocks.
2. Model the motion in each block by a more complex motion model.
   - The 2-D motion caused by a flat surface patch undergoing rigid 3-D motion can be approximated well by a projective mapping.
   - The projective mapping can be approximated by an affine mapping or a bilinear mapping.
3. Estimate the motion parameters block by block, independently.
   - The discontinuity problem across block boundaries still remains.
Affine and Bilinear Model
- Affine (6 parameters): good for mapping triangles to triangles
- Bilinear (8 parameters): good for mapping blocks to quadrangles
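
The slide's formulas did not survive in the text, so the sketch below uses the standard textbook parameterization, which is an assumption about what was shown: the affine model has three coefficients per displacement component, and the bilinear model adds an xy cross term to each. Function and parameter names are illustrative.

```python
def affine_motion(x, y, a, b):
    """Displacement (dx, dy) at (x, y) under a 6-parameter affine model.

    dx = a0 + a1*x + a2*y,  dy = b0 + b1*x + b2*y
    """
    return (a[0] + a[1] * x + a[2] * y,
            b[0] + b[1] * x + b[2] * y)


def bilinear_motion(x, y, a, b):
    """Displacement (dx, dy) at (x, y) under an 8-parameter bilinear model.

    dx = a0 + a1*x + a2*y + a3*x*y,  dy = b0 + b1*x + b2*y + b3*x*y
    """
    return (a[0] + a[1] * x + a[2] * y + a[3] * x * y,
            b[0] + b[1] * x + b[2] * y + b[3] * x * y)
```

Setting all coefficients except a0 and b0 to zero recovers the pure translation used by ordinary BMA. The parameter counts also explain the pairing on the slide: three node displacements determine the 6 affine parameters, and four node displacements determine the 8 bilinear parameters.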
Mesh-Based Motion Estimation
- A control grid is used to partition a frame into non-overlapping polygon elements.
- The nodal motion is constrained so that a feasible mesh is still formed after the motion.
[Figure: (a) using a triangular mesh; (b) using a quadrilateral mesh]
Mesh-based vs Block-based
[Figure: (a) block-based ME; (b) mesh-based ME; (c) mesh-based motion tracking]
Example: BMA vs Mesh-based
[Figure: target frame, anchor frame, and predicted frames for EBMA (half-pel, 29.86 dB) and the mesh-based method (29.72 dB)]
Experiment Results
[Figure: Frame #1 and Frame #2]
Motion-Compensated Prediction Residues
- 16-by-16 block, integer-pel: var(e)=271.8
- 8-by-8 block, integer-pel: var(e)=220.8
- 16-by-16 block, half-pel: var(e)=164.2
- 8-by-8 block, half-pel: var(e)=123.8
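
To compare the four settings on a common scale, the residue variances can be expressed as a gain in dB relative to the 16-by-16 integer-pel baseline. The variances below are the ones quoted above; the dB conversion and the names are simply one convenient way to read them.

```python
import math

# var(e) values quoted above, keyed by (block size, motion accuracy)
residue_variance = {
    ("16x16", "integer-pel"): 271.8,
    ("8x8", "integer-pel"): 220.8,
    ("16x16", "half-pel"): 164.2,
    ("8x8", "half-pel"): 123.8,
}


def gain_db(baseline, value):
    """Reduction of residue variance relative to `baseline`, in dB."""
    return 10 * math.log10(baseline / value)


baseline = residue_variance[("16x16", "integer-pel")]
gains = {key: gain_db(baseline, var) for key, var in residue_variance.items()}
for (block, accuracy), gain in gains.items():
    print(f"{block}, {accuracy}: {gain:.2f} dB")
```

On this frame pair, moving from integer-pel to half-pel at a fixed block size lowers the variance more than halving the block size at integer-pel accuracy (164.2 versus 220.8), and combining both gives the lowest variance.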