
Pré-analyse de la vidéo pour un codage adapté Application au codage de la TVHD en flux H.264 Olivier Brouard École Doctorale Sciences et Technologie de.

Presentation transcript:

1 Pré-analyse de la vidéo pour un codage adapté : application au codage de la TVHD en flux H.264. Olivier Brouard. École Doctorale Sciences et Technologie de l'Information et Mathématiques (EDSTIM). Speciality: Automatic Control, Robotics, Signal Processing and Applied Computer Science. 20 July 2010. Supervisors: Dominique Barba and Vincent Ricordel

2 Pre-analysis of video for its advanced coding: application to HDTV coding in H.264 streams. Olivier Brouard. École Doctorale Sciences et Technologie de l'Information et Mathématiques (EDSTIM). Speciality: Automatic Control, Robotics, Signal Processing and Applied Computer Science. July 20th, 2010. Supervisors: Dominique Barba and Vincent Ricordel

3 Motivations. Emergence of HDTV and new displays: SDTV is 720x576 pixels, HDTV is 1920x1080 pixels, i.e. more pixels (5x). From SDTV to HDTV, the display goes from 4% to 20% of the visual field, which gives better immersion for the users. Hence the need for a new video coding standard: H.264 (or MPEG-4 AVC). 26 April, 2014, Olivier Brouard. Introduction. Slide 3/47

4 Introduction: H.264. Advanced video coder (asymmetrical coding): + richness of the prediction modes, + advanced entropy coding, giving a higher bit-rate reduction (up to 50% compared with MPEG-2). But its decisions are short-term and based on the « low level » signal, so there is no coding consistency. [Figure: reference frames]

5 Introduction: the human as the final observer. Needs: control the perceptual quality, i.e. avoid perceptible distortions (blocking effects, flickering effects); ensure the temporal coherence of the coding of objects, i.e. the rendering of an object has to be temporally consistent.

6 Introduction: objectives & proposals. Current encoders have no such tools. How to proceed? Take medium/long-term decisions based on « high level » considerations. Solution: perform a video pre-analysis before the encoding step and use it to guide the encoder in its decisions.

7 Outline. 1. Video pre-analysis: 1.1 Advanced motion estimation; 1.2 Spatio-temporal segmentation; 1.3 Visual attention modeling. 2. Applications: H.264 video coding: 2.1 GOP structure adaptation; 2.2 Adaptive quantization. Current part: 1. Video pre-analysis

8 1- Video pre-analysis. The pre-analysis is based on HVS properties and provides « high level » information to the encoder. The Human Visual System (HVS): luminance perception, color perception, contrast sensitivity, masking effects. Visual attention: bottom-up (guided by the saliency) and top-down (guided by the tasks).

9 1- Video pre-analysis: visual attention. Attributes guiding the deployment of visual attention [Wolfe 04]: contrast, motion, color, orientation, … Moving objects attract our visual attention. Visual attention modeling [Itti 01; Le Meur 07; Marat 10] is based on the Koch and Ullman model [Koch 85]. Perceptually important regions => most salient objects (physically and semantically). Shapes of regions (saliency maps) => shapes of objects [Milanese 1993].

10 1- Video pre-analysis

11 1- Video pre-analysis: advanced motion estimation. Spatio-temporal tube (1). The visual fixation time in the HVS is ~200 ms. Next generation of HDTV: 1920x1080 in progressive mode at 50 Hz, so a temporal segment of 9 frames lasts 9 x 20 ms = 180 ms [Péchard 2007]. Assumption: uniform motion, i.e. the motion is coherent along a perceptually significant duration. Spatio-temporal tube => more homogeneous motion vector field.

12 1- Video pre-analysis: advanced motion estimation. Spatio-temporal tube (2). Implementation: spatial down-sampling; temporal down-sampling (central frame = current frame, plus 4 reference frames). The spatio-temporal tube minimizes a global error MSE_G built from the per-frame errors MSE_k, with k = -4, -2, +2, +4, based on the 3 YUV components.
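
A minimal sketch of the tube search described on slides 11 and 12, under several assumptions that are not in the slides: luma only instead of the 3 YUV components, an exhaustive integer search, motion expressed per temporal step of two frames, and MSE_G taken as the plain sum of the four MSE_k (the slide's formula did not survive extraction). The names `block_mse` and `tube_motion` are hypothetical.

```python
def block_mse(cur, ref, x, y, dx, dy, size):
    """MSE between the block at (x, y) in cur and the block displaced by (dx, dy) in ref."""
    total = 0.0
    for j in range(size):
        for i in range(size):
            diff = cur[y + j][x + i] - ref[y + dy + j][x + dx + i]
            total += diff * diff
    return total / (size * size)


def tube_motion(frames, x, y, size, search):
    """Find one motion vector shared by the whole tube (uniform-motion assumption).

    frames maps k in (-4, -2, 0, 2, 4) to a 2-D list of luma samples;
    frames[0] is the current (central) frame. The vector (vx, vy) is the
    displacement per step of two frames, so frame k is displaced by (k/2)*v.
    The caller must keep the displaced blocks inside the frame.
    Returns ((vx, vy), mse_g).
    """
    best = None
    for vy in range(-search, search + 1):
        for vx in range(-search, search + 1):
            mse_g = 0.0  # assumed: MSE_G = sum of MSE_k over the 4 references
            for k in (-4, -2, 2, 4):
                step = k // 2
                mse_g += block_mse(frames[0], frames[k], x, y,
                                   step * vx, step * vy, size)
            if best is None or mse_g < best[1]:
                best = ((vx, vy), mse_g)
    return best
```

Constraining all four reference frames to one vector is what makes the resulting field smoother than four independent block matches: a candidate must explain the whole 9-frame segment, not a single frame pair.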

13 1- Video pre-analysis: spatio-temporal segmentation. Global motion. Apparent motions are due to moving objects and to camera motion. Motion segmentation is based on the residual motion. Affine model: a1, a2, a3, a4 are the deformation parameters; tx, ty are the translation parameters; Vx, Vy are the horizontal and vertical components of each MV (spatio-temporal tube).
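
The affine equation itself is missing from the transcript. The sketch below assumes the usual six-parameter form, Vx = tx + a1·x + a2·y and Vy = ty + a3·x + a4·y; the pairing of a1..a4 with x and y is an assumption, as are the function names. It shows how a residual motion vector is obtained by removing the global (camera) motion from a tube's MV.

```python
def global_motion(params, x, y):
    """Global motion at position (x, y) under the assumed affine model.

    params = (a1, a2, a3, a4, tx, ty); the parameter ordering is an assumption.
    """
    a1, a2, a3, a4, tx, ty = params
    return (tx + a1 * x + a2 * y, ty + a3 * x + a4 * y)


def residual_motion(mv, params, x, y):
    """MV of a tube minus the global motion predicted at its position."""
    gx, gy = global_motion(params, x, y)
    return (mv[0] - gx, mv[1] - gy)
```

For a pure camera translation (all deformation parameters zero), background tubes get a residual close to (0, 0), while tubes on moving objects keep a non-zero residual.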

14 1- Video pre-analysis: spatio-temporal segmentation. Global motion parameters estimation. Parameters are estimated from the motion vector fields [Coudray 2005]. Global motion estimation in 2 steps: 1. For each MV (tube): calculation of the derivatives, accumulation of the parameter assumptions, localization of the main peak. 2. Accumulation of the residual MVs (tubes) in a 2-D histogram (tx, ty).

15 1- Video pre-analysis: spatio-temporal segmentation. Motion segmentation. 2-D histogram of the translation parameters of the residual MVs (tx, ty). Each histogram peak => a moving object, hence an analysis of all the peaks. Iterative approach: 1. Initialisation: detection of the main peak, greedy approach (local gradient). 2. Detection of the other peaks, greedy approach. [Figure labels: accumulation histogram, main peak, secondary peak, segmented space]
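
A minimal sketch of the histogram step, assuming residual MVs are given as (tx, ty) pairs. It swaps in a plain local-maximum scan over the 8-neighbourhood instead of the greedy local-gradient search named on the slide; `bin_size` and `min_count` are hypothetical parameters.

```python
from collections import Counter


def translation_histogram(residual_mvs, bin_size=1.0):
    """Accumulate residual MVs (tx, ty) into a sparse 2-D histogram."""
    hist = Counter()
    for tx, ty in residual_mvs:
        hist[(round(tx / bin_size), round(ty / bin_size))] += 1
    return hist


def find_peaks(hist, min_count):
    """Return bins that are local maxima with at least min_count votes.

    Peaks are sorted strongest first, so the first entry is the main peak
    and the following ones are candidate secondary peaks (moving objects).
    """
    peaks = []
    for (bx, by), count in hist.items():
        if count < min_count:
            continue
        neighbours = [hist.get((bx + dx, by + dy), 0)
                      for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                      if (dx, dy) != (0, 0)]
        if all(count >= n for n in neighbours):
            peaks.append(((bx, by), count))
    return sorted(peaks, key=lambda p: -p[1])
```

Each returned bin can then serve as the seed of one motion class, with every tube assigned to the peak nearest to its residual MV.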

16 1- Video pre-analysis: spatio-temporal segmentation. Motion segmentation: results. Need for a spatial and temporal regularization.

17 1- Video pre-analysis

18 1- Video pre-analysis: spatio-temporal segmentation. Spatio-temporal regularization. With motion-based segmentation alone, some blocks are misclassified, so more criteria are used to improve the segmentation: connectivity, color, texture, motion. Markovian approach.

19 1- Video pre-analysis: spatio-temporal segmentation. Markovian approach. By the Hammersley-Clifford theorem [Besag 1974], a Markov Random Field (Markovian property) follows a Gibbs distribution. The optimal label configuration minimizes a global energy function. E: label field; O: observation field; U(o, e): sum of potential functions defined on cliques. Site = spatio-temporal tube.
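
The formulas on this slide were lost in extraction. In the standard formulation, written with the slide's symbols E, O and U(o, e), they read as below; the normalizing constant Z, the clique set C and the potentials V_c are notation introduced here, and the temperature is taken as 1.

```latex
P(E = e \mid O = o) = \frac{1}{Z} \exp\bigl(-U(o, e)\bigr), \qquad
U(o, e) = \sum_{c \in C} V_c(o, e), \qquad
\hat{e} = \arg\min_{e} U(o, e)
```

Maximizing the posterior probability is therefore the same as minimizing the energy U, which is why the segmentation is posed as an energy minimization.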

20 1- Video pre-analysis: spatio-temporal segmentation. Spatial regularization. Spatial connectivity: a segmented region is locally homogeneous. Color features: color distributions (discrete densities), Bhattacharyya coefficient. Texture features: texture distributions from 2 spatial gradients (Sobel filters), Bhattacharyya coefficient.
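
For reference, the Bhattacharyya coefficient between two discrete densities p and q is the sum over bins of sqrt(p_i · q_i); it equals 1 for identical distributions and 0 for distributions with no common bin. A small sketch, with hypothetical function names and histograms passed as plain lists of counts:

```python
import math


def normalize(hist):
    """Turn a list of non-negative bin counts into a discrete density."""
    total = float(sum(hist))
    return [h / total for h in hist]


def bhattacharyya(p, q):
    """Bhattacharyya coefficient of two discrete densities with the same bins."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))
```

Comparing the color (or gradient) histogram of a tube with that of a candidate region in this way gives a similarity score in [0, 1] that can feed a potential function.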

21 1- Video pre-analysis: spatio-temporal segmentation. Temporal regularization. Motion features: distance between the MVs. Temporal connectivity: a segmented region => temporally homogeneous, using the segmentation map of the previous temporal segment. Region tracking (video object tracking) criteria: color, texture, overlap.

22 1- Video pre-analysis: spatio-temporal segmentation. Energy minimization. The global energy function combines potential functions and weighting factors. Sequential site processing with a stack of instability.
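
The slide's energy function and its instability-stack ordering are not recoverable from the transcript. As an illustration only, the sketch below uses a different, standard technique, iterated conditional modes (ICM), with a simple Potts smoothness term standing in for the real potentials. `data_cost`, `neighbours` and `beta` are hypothetical inputs.

```python
def icm(labels, data_cost, neighbours, beta, n_labels, max_sweeps=10):
    """Greedy sequential energy minimization (ICM), not the author's scheme.

    labels:     initial label per site (e.g. from motion segmentation)
    data_cost:  data_cost[s][l] = cost of giving label l to site s
    neighbours: neighbours[s] = indices of the sites adjacent to s
    beta:       weighting factor of the smoothness (Potts) potential
    """
    labels = list(labels)
    for _ in range(max_sweeps):
        changed = False
        for s in range(len(labels)):
            def local_energy(label):
                # Potts term: one penalty per neighbour with a different label
                smooth = sum(1 for n in neighbours[s] if labels[n] != label)
                return data_cost[s][label] + beta * smooth

            best = min(range(n_labels), key=local_energy)
            if local_energy(best) < local_energy(labels[s]):
                labels[s] = best
                changed = True
        if not changed:
            break
    return labels
```

A site whose motion weakly suggests one label but whose neighbours all carry another gets relabelled, which is the kind of misclassified-block correction the regularization is meant to achieve.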

23 1- Video pre-analysis: spatio-temporal segmentation. Results: motion segmentation only vs. regularized spatio-temporal segmentation.

24 1- Video pre-analysis

25 1- Video pre-analysis: visual attention modeling. Spatial saliency. Spatial saliency is based on the color contrast [Aziz 2008], after a color transformation from YUV to HSV. Color features influencing the visual attention: 1- Saturation contrast; 2- Intensity contrast; 3- Hue contrast; 4- Opponents contrast; 5- Warm and cold colors contrast; 6- Dominance of the warm colors; 7- Dominance of the luminance and saturation. Spatial saliency S_SP => combination of these 7 features.

26 1- Video pre-analysis: visual attention modeling. Temporal saliency. Temporal saliency S_T is based on the relative motion. Maximum velocity of smooth pursuit of the eye [Daly 1998]: 80°/s. Quantities involved: the MV of the site s, the dominant motion, and the relative motion of s.
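
The temporal saliency formula is missing from the transcript, so the following is only a hypothetical illustration of the idea: the relative motion of a site is its MV minus the dominant motion, and its magnitude is normalized by a maximum velocity and clipped to 1. The clipping rule and the parameter `v_max` are assumptions; `v_max` must be in the same units as the MVs, and converting 80°/s into pixels per frame needs a viewing distance that the slides do not give.

```python
import math


def relative_motion(mv, dominant):
    """Motion of a site relative to the dominant (global) motion."""
    return (mv[0] - dominant[0], mv[1] - dominant[1])


def temporal_saliency(mv, dominant, v_max):
    """Hypothetical S_T: relative speed normalized by v_max, clipped to [0, 1]."""
    rx, ry = relative_motion(mv, dominant)
    return min(math.hypot(rx, ry) / v_max, 1.0)
```

With this rule a site that moves exactly with the camera gets zero temporal saliency, however fast the camera pans.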

27 1- Video pre-analysis: visual attention modeling. Spatio-temporal saliency. Fusion of the spatial saliency and temporal saliency maps. Observers => focus on the center of the screen [Le Meur 2005], hence a weighting by a 2-D Gaussian function.
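
A minimal sketch of the center weighting, assuming a saliency map stored as a 2-D list and a Gaussian centered on the middle of the map. The standard deviations `sigma_x` and `sigma_y` are hypothetical parameters; the slides do not give their values, nor the fusion rule applied before this step.

```python
import math


def center_weight(x, y, width, height, sigma_x, sigma_y):
    """2-D Gaussian weight, equal to 1 at the center of the map."""
    cx = (width - 1) / 2.0
    cy = (height - 1) / 2.0
    return math.exp(-((x - cx) ** 2 / (2.0 * sigma_x ** 2)
                      + (y - cy) ** 2 / (2.0 * sigma_y ** 2)))


def apply_center_bias(saliency, sigma_x, sigma_y):
    """Multiply every saliency value by the Gaussian center weight."""
    height = len(saliency)
    width = len(saliency[0])
    return [[saliency[y][x] * center_weight(x, y, width, height, sigma_x, sigma_y)
             for x in range(width)]
            for y in range(height)]
```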

28 1- Video pre-analysis: visual attention modeling. Results

29 1- Video pre-analysis: possible applications. Information from the video pre-analysis: moving objects segmentation and objects tracking; color, texture; salient regions. Applications: advanced video coding; video transmission with priority (saliency maps); video summarization, indexing; … ArchiPEG (ANR project): HD MPEG-4 AVC real-time compression; pre-analysis video resource.

30 Outline. 1. Video pre-analysis: 1.1 Advanced motion estimation; 1.2 Spatio-temporal segmentation; 1.3 Visual attention modeling. 2. Applications: H.264 video coding: 2.1 GOP structure adaptation; 2.2 Adaptive quantization. Current part: 2. Applications: H.264 video coding

31 2- Applications: H.264 video coding: GOP structure adaptation. GOP structure. Three kinds of frames: I (intra coded; a GOP begins with an I frame), P (predicted; at regular intervals), B (bi-predicted; between P frames). A fixed interval between I frames is not adapted to changing scenes and temporal variations of the video => more bits; hence a dynamic GOP size with irregular I-frame insertion. Typically the number of B frames is 1 or 2, a good trade-off between bitrate and quality; with low motion or panning of the camera, increase the number of B frames.

32 2- Applications: H.264 video coding: GOP structure adaptation. B frames adaptation (1). Analysis of the video sequences with the x264 encoder, using different fixed numbers of B frames: 0, 1, 2, 3. The optimal number of B frames is content dependent => classify videos according to their content.

33 2- Applications: H.264 video coding: GOP structure adaptation. B frames adaptation (2). Spatio-temporal characterization, for each temporal segment and for the entire sequence: 2 indices to evaluate the spatio-temporal activity. IT: temporal activity => MVs. IS: spatial activity => MSE_G.

34 2- Applications: H.264 video coding: GOP structure adaptation. B frames adaptation (3). Classification space as a function of IT and IS. Class C_i => i B frames between P-P or I-P frames. IT is constant between P-P or I-P frames; same rule for IS.

35 2- Applications: H.264 video coding: GOP structure adaptation. GOP size adaptation (1). Detection of changes within a video shot: high motion (significant changes) => reduce the interval; low motion (little variation) => increase the interval; mid-range motion => classical approach, fixed GOP size. 2 thresholds to detect critical changes: s_h => high motion; s_b => low motion.

36 2- Applications: H.264 video coding: GOP structure adaptation. GOP size adaptation (2). Analysis of the evolution of IT, 3 cases: mid-range motion, high motion, low motion.
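
The three cases of slides 35 and 36 can be sketched as a two-threshold rule on the temporal activity index IT. The base size of 25 matches the x264 reference setting quoted on slide 37; the step and the bounds are hypothetical values chosen only for illustration, as is the function name.

```python
def adapt_gop_size(it_value, s_b, s_h, base_size=25, step=8,
                   min_size=8, max_size=50):
    """Choose the I-frame interval from the temporal activity index IT.

    s_b: low-motion threshold, s_h: high-motion threshold (s_b < s_h).
    step, min_size and max_size are illustrative, not taken from the slides.
    """
    if not s_b < s_h:
        raise ValueError("s_b must be smaller than s_h")
    if it_value > s_h:   # high motion: significant changes, shorter interval
        return max(min_size, base_size - step)
    if it_value < s_b:   # low motion: little variation, longer interval
        return min(max_size, base_size + step)
    return base_size     # mid-range motion: classical fixed GOP size
```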

37 2- Applications: H.264 video coding: GOP structure adaptation. Performances. 8 video sequences at 4 different bitrates defined by an experts group. Comparison between the x264 encoder (GOP size = 25, 2 B frames) and a modified version with GOP structure adaptation.

38 2- Applications: H.264 video coding: GOP structure adaptation. Results: rate-distortion (PSNR) [Bjontegaard 2001].
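
The slide's curves are not in the transcript. As a reminder of how a Bjontegaard-style PSNR comparison is commonly computed, the sketch below fits a cubic through four rate-distortion points per coder in the log10(bitrate) domain and averages the gap between the two cubics over their common bitrate range. It assumes exactly four points with distinct bitrates per curve; the function names are hypothetical, and Simpson's rule is used because it integrates a cubic exactly.

```python
import math


def lagrange(xs, ys, x):
    """Value at x of the polynomial interpolating the points (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def bd_psnr(rates_ref, psnr_ref, rates_test, psnr_test):
    """Average PSNR gain (dB) of the test coder over the reference coder."""
    lr_ref = [math.log10(r) for r in rates_ref]
    lr_test = [math.log10(r) for r in rates_test]
    lo = max(min(lr_ref), min(lr_test))   # common log-rate interval
    hi = min(max(lr_ref), max(lr_test))
    mid = (lo + hi) / 2.0

    def mean_value(xs, ys):
        # Simpson's rule, exact for the cubic through four points
        integral = (hi - lo) / 6.0 * (lagrange(xs, ys, lo)
                                      + 4.0 * lagrange(xs, ys, mid)
                                      + lagrange(xs, ys, hi))
        return integral / (hi - lo)

    return mean_value(lr_test, psnr_test) - mean_value(lr_ref, psnr_ref)
```

A positive value means the test coder delivers a higher PSNR on average at equal bitrate.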

39 2- Applications: H.264 video coding: GOP structure adaptation. Subjective tests. Setup: display resolution 1920x1080; normalized room [BT.500-11]; ~30 naïve observers; 72 (= 8x4x2 + 8) video sequences. Methodology: ACR; for each sequence, observers have to assess the quality.

40 2- Applications: H.264 video coding: GOP structure adaptation. Results. Q_GOP: MOS of the modified coder; Q_x264: MOS of the x264 coder. Sequences with a high IT value (high motion) => GOP structure adaptation.

41 2- Applications: H.264 video coding: adaptive quantization. Objective: control the distribution of the binary resources (bits) using saliency maps, to increase the perceived visual quality. Modification of the saliency maps: quantization and morphological filtering. Modification of the coder.
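
The slides do not give the adaptation law, so the following is only a hypothetical illustration of saliency-driven quantization: the saliency of a macroblock, assumed to lie in [0, 1], is quantized into a few levels, and a linear law maps the level to a QP offset, with salient blocks quantized more finely. `max_offset` and `levels` are made-up parameters; the clamp to 0..51 is the H.264 QP range for 8-bit video.

```python
def macroblock_qp(base_qp, saliency, max_offset=4, levels=3):
    """Hypothetical linear law from a macroblock's saliency to its QP.

    saliency is assumed to be in [0, 1]; levels must be at least 2.
    The least salient level gets +max_offset (coarser quantization),
    the most salient level gets -max_offset (finer quantization).
    """
    level = min(int(saliency * levels), levels - 1)  # quantize the saliency
    offset = max_offset - 2.0 * max_offset * level / (levels - 1)
    qp = round(base_qp + offset)
    return max(0, min(51, qp))                       # valid H.264 QP range
```

Because the offsets are symmetric around zero, such a law moves bits from non-salient to salient regions rather than simply raising the overall bitrate.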

42 2- Applications: H.264 video coding: adaptive quantization. Results (1): rate-distortion (PSNR) [Bjontegaard 2001].

43 2- Applications: H.264 video coding: adaptive quantization. Subjective assessments: results. Q_QA: MOS of the modified coder (adaptive quantization); Q_x264: MOS of the x264 coder. No specific content appears suitable; unsuitable for coding and broadcasting of HDTV at high bitrate. Open questions: overhead, linear law?

44 Conclusion (1). Video pre-analysis: visual attention modeling (saliency maps); spatio-temporal segmentation (detection of moving objects, objects tracking). Applications: advanced video coding; video transmission with priority based on the saliency maps [Boulos 2010]; video summarization, indexing; …

45 Conclusion (2). Applications of the video pre-analysis. GOP structure adaptation: dynamic variation of the B frames (temporal segment classification with IT and IS); GOP size adaptation (I frame insertion, change detection with IT). Adaptive quantization based on the saliency maps.

46 Conclusion (3). Subjective quality assessment tests. GOP structure adaptation: no significant differences, +0.18 (on a scale of 1 to 5); well suited for sequences with high motion. Adaptive quantization: no clear content suitability; seems unsuitable for coding and broadcasting of HDTV at high bitrate … the adaptation law could be modified …

47 Conclusion: perspectives. Better performance evaluation of our visual attention model: eye-tracking experiments. Psychophysical experiments to optimize the model parameters and improve the fusion process [Marat 2010]. Add high-level visual information: face, flesh hue, …

48 Thank you. Questions?

