Philipp Merkle, Aljoscha Smolic, Karsten Müller, Thomas Wiegand. CSVT 2007.


Outline: Multi-view video coding (MVC) introduction; Requirements and test conditions for MVC; Prediction structures; Experimental results; Conclusion

MVC Introduction: MVC stands for multi-view video coding. A system that uses multiple camera views of the same scene is called multi-view video (MVV). Usage: 3DTV, free viewpoint video (FVV), etc.

Requirements for MVC: temporal random access; view random access; scalability; backward compatibility; quality consistency; parallel processing

Temporal and inter-view correlation [Figure: temporal, inter-view, and temporal/inter-view mixed prediction modes]

Temporal and inter-view correlation analysis: An H.264/AVC encoder was used with the following settings: motion compensation block size of 16×16; search range of ±32 pixels; Lagrange parameter (λ) of 29.5. [...] denotes the decrease of the average [...] in comparison to temporal prediction only.

Temporal and inter-view correlation analysis (cont’d): Simply including temporal and inter-view prediction modes

Lagrangian cost function: D denotes the distortion. R denotes the number of bits needed to transmit all components of the motion vector. For each block in a picture, the algorithm chooses the motion vector (MV) within a search range that minimizes the Lagrangian cost. The distortion is calculated over the subject macroblock B. [Equations (1) to (3)]
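The motion search described on this slide can be sketched as follows. The slide's equations are not reproduced in this transcript, so the sketch assumes the common form J = D + λ·R, takes D as the sum of absolute differences over the block (an assumption; a squared-error measure is equally plausible), and uses a hypothetical rate proxy `mv_bits` based on signed Exp-Golomb code lengths. The function names and the demo data are illustrative only.

```python
import numpy as np

def mv_bits(mv):
    """Hypothetical rate proxy R: signed Exp-Golomb code length per MV component."""
    def se_len(v):
        k = 2 * abs(v) - (1 if v > 0 else 0)   # signed -> unsigned mapping
        return 2 * (k + 1).bit_length() - 1    # Exp-Golomb code length in bits
    return se_len(mv[0]) + se_len(mv[1])

def motion_search(cur, ref, top, left, block=16, search=32, lam=29.5):
    """Full search minimizing J = D + lam * R for one block.

    D is assumed to be the SAD between the current block and the
    displaced reference block. Returns (J, (dy, dx), D) of the best candidate.
    """
    h, w = ref.shape
    target = cur[top:top + block, left:left + block].astype(np.int64)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + block > h or x + block > w:
                continue  # candidate block falls outside the reference picture
            cand = ref[y:y + block, x:x + block].astype(np.int64)
            d = int(np.abs(target - cand).sum())
            j = d + lam * mv_bits((dy, dx))
            if best is None or j < best[0]:
                best = (j, (dy, dx), d)
    return best

# Demo on synthetic data: the current picture is the reference shifted by a known offset.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
cur = np.roll(ref, shift=(-3, 2), axis=(0, 1))   # cur[i, j] == ref[i + 3, j - 2]
cost, mv, dist = motion_search(cur, ref, top=24, left=24, search=8)
print(mv, dist, cost)
```

The same routine serves temporal and inter-view prediction alike: only the choice of `ref` changes (a picture of the same view at another time instance, or a picture of a neighboring view at the same time instance).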

Test data and test conditions: 1D camera arrangements: Ballroom, Exit, Rena, Race1, Uli (linear), Breakdancers (arched). 2D camera arrangements: Flamenco2 (cross), AkkoKayo (array). Between 5 and 16 camera views are used. The target is high-quality TV-type video (640×480 or 1024×768) rather than limited-channel communication-type video.

Knowledge – hierarchical B picture, QP cascading. Hierarchical B picture: key picture, non-key picture [Figure]. QP cascading: [1]. [1] “Analysis of hierarchical B pictures and MCTF”, ICME 2006, IEEE International Conference on Multimedia and Expo, Toronto, Ontario, Canada, July 2006
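A minimal sketch of the two ideas on this slide, under stated assumptions. The GOP length is taken to be a power of two, with the key picture at the end of each GOP on level 0 and each further dyadic split one level higher. The cascading rule itself is not recoverable from the transcript; the increments used here (+4 for the first level, +1 for every later level) are an assumption borrowed from common hierarchical-B configurations, and both function names are hypothetical.

```python
def temporal_levels(gop_length):
    """Map picture index (1..gop_length) within one GOP to its hierarchy level.

    Level 0 is the key picture; assumes gop_length is a power of two.
    """
    levels = {gop_length: 0}
    step, level = gop_length, 1
    while step > 1:
        half = step // 2
        # Pictures halfway between already-coded pictures form the next level.
        for idx in range(half, gop_length, step):
            levels[idx] = level
        step, level = half, level + 1
    return levels

def cascaded_qp(key_qp, level, first_step=4, later_step=1):
    """Assumed cascading rule: key pictures keep key_qp, deeper levels get a larger QP."""
    if level == 0:
        return key_qp
    return key_qp + first_step + later_step * (level - 1)

levels = temporal_levels(8)
for idx in sorted(levels):
    print(idx, levels[idx], cascaded_qp(26, levels[idx]))
```

Coarser quantization on deeper levels reflects that those pictures are referenced by fewer other pictures, so their quality matters less for the rest of the GOP.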

Knowledge – DPB size: The Decoded Picture Buffer (DPB) size is increased [2]. Memory-efficient reordering of multi-view input for compression. [2] “Efficient Compression of Multi-view Video Exploiting Inter-view Dependencies Based on H.264/AVC”, ICME 2006, IEEE International Conference on Multimedia and Expo, Toronto, Ontario, Canada, July 2006

Two tasks: 1. Adapt the multi-view prediction schemes to the specific camera arrangements of the test data sets. 2. Adapt the prediction structures to the random access specification.

Prediction structure: Simulcast coding structure. To allow synchronization and random access, all key pictures are coded in intra mode.

Prediction structure (cont’d): The first view is called the base view (its key pictures remain I frames).

Prediction structure (cont’d): Alternative structures of inter-view prediction for key pictures: KS_IPP, KS_PIP, KS_IBP. [Figure: the three structures for a linear camera arrangement and for a 2D camera array]
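The three names can be read as the sequence of key-picture types across the views of a linear camera arrangement. The sketch below is one such reading, not a reproduction of the slide's figure: it assumes KS_IPP starts with an intra view followed by P views, KS_PIP places the intra view in the middle, and KS_IBP alternates B and P views after the intra view, with the last view coded as P when it has only one neighbor to predict from. The function `key_picture_types` is hypothetical.

```python
def key_picture_types(structure, num_views):
    """Return assumed key-picture types per view for a linear camera arrangement."""
    if structure == "KS_IPP":
        # First view intra, every other view predicted from its predecessor.
        return ["I"] + ["P"] * (num_views - 1)
    if structure == "KS_PIP":
        # Intra view in the middle, P views spreading outward on both sides.
        types = ["P"] * num_views
        types[num_views // 2] = "I"
        return types
    if structure == "KS_IBP":
        types = []
        for view in range(num_views):
            if view == 0:
                types.append("I")
            elif view % 2 == 0 or view == num_views - 1:
                types.append("P")   # predicted from one already-coded view
            else:
                types.append("B")   # predicted from both neighboring views
        return types
    raise ValueError(f"unknown structure: {structure}")

for name in ("KS_IPP", "KS_PIP", "KS_IBP"):
    print(name, "".join(key_picture_types(name, 8)))
```

In every variant exactly one view is intra coded at a key time instance, in contrast to simulcast coding, where each view has its own intra key picture.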

Prediction structure (cont’d): Inter-view prediction for key and non-key pictures: AS_IPP mode

Experimental results – objective evaluation: Ballroom test result. Average coding gains compared with anchor coding.

Experimental results – subjective evaluation: Different bit-rates were selected for the different data sets. [Figures: Ballroom test result, Race1 test result]

Experimental results – subjective evaluation: Average results over all test sequences. AS_IBP outperforms the anchors significantly. The gain decreases slightly with higher bit-rates.

Influence of camera density: The experiment uses the Rena sequence, which consists of 16 linearly arranged cameras with a distance of 5 cm between adjacent cameras. It is repeated for each shifted set of 9 adjacent cameras. The structures are applied to every time instance of the MVV sequence without temporal prediction.
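The camera sets used in this experiment can be enumerated with a short sketch. It assumes that "each shifted set of 9 adjacent cameras" means a window of 9 consecutive cameras moved one camera at a time along the 16-camera line; the function name and the zero-based camera indices are illustrative.

```python
def camera_windows(num_cameras=16, window=9):
    """List every set of `window` adjacent cameras on a line of `num_cameras`."""
    return [list(range(start, start + window))
            for start in range(num_cameras - window + 1)]

SPACING_CM = 5  # distance between adjacent cameras, from the slide

windows = camera_windows()
span_cm = (len(windows[0]) - 1) * SPACING_CM  # baseline covered by one window
print(len(windows), span_cm)
for cams in windows:
    print(cams)
```

Averaging results over all windows reduces the dependence on which particular cameras happen to fall into a single set.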

Results of experiments on camera density: Coding gain increases with decreasing camera distance and decreasing reconstruction quality.

Results of experiments on camera density (cont’d): [Figure: average per-camera rate relative to the one-camera case] A larger QP value leads to a larger coding gain.

Conclusion: The resulting multi-view prediction achieves significant coding gains and is highly flexible. Parallel processing is supported by the presented sequential processing approach. Problems: large disparities between the different views of multi-view video sequences; illumination and color inconsistencies across views.
