Towards Efficient Wavefront Parallel Encoding of HEVC: Parallelism Analysis and Improvement Keji Chen, Yizhou Duan, Jun Sun, Zongming Guo 2014 IEEE 16th.

Presentation transcript:

Towards Efficient Wavefront Parallel Encoding of HEVC: Parallelism Analysis and Improvement Keji Chen, Yizhou Duan, Jun Sun, Zongming Guo 2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP)

Outline
• Introduction
• Parallelism Evaluation of HEVC Encoding
• Proposed Method
• Experimental Results
• Conclusion

Introduction
• The enhanced coding tools of HEVC greatly increase computational complexity, which makes the standard difficult to deploy in applications.
• By exploiting the parallelism among the encoding tasks, the encoding speed can be significantly improved.

Introduction
• Compared with slices, Wavefront Parallel Processing (WPP) can achieve similar parallelism with less loss of coding efficiency.
• In [11], Chi et al. proposed an Overlapped WaveFront (OWF) method based on WPP.

[11] C. C. Chi, M. Alvarez-Mesa, B. Juurlink, G. Clare, F. Henry, S. Pateux, and T. Schierl, "Parallel Scalability and Efficiency of HEVC Parallelization Approaches," IEEE Trans. Circuits Syst. Video Technol., vol. 22, pp. ..., Dec.

Parallelism Evaluation of HEVC Encoding (1/3)
• T_{i,j,k}: Self Encoding Complexity (SEC) of C_{i,j,k}.
  • SEC can be measured by the encoding time.
  • It is determined by the frame content and the RDO design and does not change with the parallel method.
• ET_F(C_{i,j,k}): Required Encoding Complexity (REC) to encode C_{i,j,k} using parallel method F.
  • REC can be regarded as the earliest ending time.
  • It is affected by the data dependence.

Parallelism Evaluation of HEVC Encoding (2/3)
• i, j, k: order of frame, line, and CTU.
• DEP_{F,inter}(C_{i,j,k}): CTBs that C_{i,j,k} depends on when using parallel encoding method F.
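
The definitions above (SEC as a per-CTB cost, REC as an earliest ending time shaped by data dependence) can be illustrated with a small critical-path computation. This is only a sketch: the recurrence "REC = SEC + largest REC among the CTBs depended on" is an assumption made for illustration, since the slide's own equations are not preserved in this transcript, and the names `earliest_end_times`, `sec` and `deps` are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def earliest_end_times(sec, deps):
    """Return a dict mapping each CTB to its earliest ending time.

    sec  : dict mapping a CTB key, e.g. (i, j, k), to its self encoding
           complexity (SEC), measured as encoding time.
    deps : dict mapping a CTB key to the CTBs it depends on under some
           parallel method F (every key used here must also be in sec).

    ASSUMPTION (not taken from the slides): a CTB can start as soon as
    all CTBs it depends on have finished, with unlimited threads.
    """
    graph = {ctb: tuple(deps.get(ctb, ())) for ctb in sec}
    end = {}
    # static_order() yields every CTB after all of its dependencies.
    for ctb in TopologicalSorter(graph).static_order():
        ready = max((end[d] for d in graph[ctb]), default=0.0)
        end[ctb] = ready + sec[ctb]
    return end


if __name__ == "__main__":
    # Three CTBs of frame 0 forming a simple dependency chain.
    sec = {(0, 0, 0): 2.0, (0, 0, 1): 3.0, (0, 1, 0): 1.0}
    deps = {(0, 0, 1): [(0, 0, 0)], (0, 1, 0): [(0, 0, 1)]}
    print(earliest_end_times(sec, deps))
```

With the same SEC values and an empty `deps`, every CTB could start at time zero, so the frame would finish at the largest single SEC rather than at the sum along the chain.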

Parallelism Evaluation of HEVC Encoding (3/3)
• From (1) and (2), it is clear that the parallelism of different parallel methods can be evaluated: [equation not preserved in this transcript]
• This criterion is easily proved from (1) and (2). Put simply: the less data dependence there is in HEVC encoding, the higher the parallelism that can be obtained.
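
One way to make the "less dependence, higher parallelism" statement precise is sketched below. It is a reading offered under assumptions, not the authors' criterion: the recurrence for ET_F, the unlabelled dependence set DEP_F, and the method names F_1 and F_2 are all introduced here for illustration.

```latex
% ASSUMED recurrence (the slide's equations (1) and (2) are not preserved):
% the earliest ending time is the CTB's own cost plus the latest ending
% time among the CTBs it depends on.
\[
  ET_F(C_{i,j,k}) \;=\; T_{i,j,k} \;+\; \max_{C' \in DEP_F(C_{i,j,k})} ET_F(C'),
  \qquad \max \emptyset := 0 .
\]

% Monotonicity in the dependence sets, for two parallel methods F_1, F_2:
\[
  DEP_{F_1}(C) \subseteq DEP_{F_2}(C) \ \text{for all } C
  \;\Longrightarrow\;
  ET_{F_1}(C) \le ET_{F_2}(C) \ \text{for all } C .
\]

% Proof sketch: induct over an order in which every CTB follows the CTBs
% in DEP_{F_2}; the same order is valid for F_1 because its sets are smaller.
\[
  \max_{C' \in DEP_{F_1}(C)} ET_{F_1}(C')
  \;\le\; \max_{C' \in DEP_{F_1}(C)} ET_{F_2}(C')
  \;\le\; \max_{C' \in DEP_{F_2}(C)} ET_{F_2}(C') .
\]
% Adding T_{i,j,k} to both ends gives ET_{F_1}(C) \le ET_{F_2}(C).
```

The first inequality uses the induction hypothesis, and the second holds because a maximum over a larger set cannot be smaller.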

Data Dependence Analysis of WPP and OWF Method (1/4)
• For intra: [equation not preserved in this transcript]

Data Dependence Analysis of WPP and OWF Method (2/4)
• The SEC differs significantly from one CTB to another.
• The variance of the SEC in inter frames is much greater than in intra frames.
• For a given encoding algorithm the unbalanced SEC is fixed, which makes it the bottleneck of intra-frame parallelism.
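
The effect of unbalanced SEC on a wavefront can be seen in a toy schedule. The sketch below assumes the usual WPP rule, in which a CTB waits for its left neighbour and for the top-right CTB of the line above (clamped at the frame edge); the slide's own dependence formula is not preserved here, and the function names and the unlimited-thread assumption are illustrative only.

```python
def wpp_intra_deps(j, k, width):
    """CTBs that CTB (line j, column k) waits for within one frame.

    ASSUMPTION: standard WPP dependence on the left neighbour and on the
    top-right CTB of the previous line, clamped to the last column.
    """
    deps = []
    if k > 0:
        deps.append((j, k - 1))
    if j > 0:
        deps.append((j - 1, min(k + 1, width - 1)))
    return deps


def wpp_end_times(sec):
    """Earliest ending time of every CTB, given sec[j][k] for one frame.

    Assumes one thread per CTB line is always available. Raster order is
    a valid processing order because all dependencies lie to the left or
    in the line above.
    """
    height, width = len(sec), len(sec[0])
    end = [[0.0] * width for _ in range(height)]
    for j in range(height):
        for k in range(width):
            ready = max((end[dj][dk] for dj, dk in wpp_intra_deps(j, k, width)),
                        default=0.0)
            end[j][k] = ready + sec[j][k]
    return end


if __name__ == "__main__":
    uniform = [[1.0] * 4 for _ in range(3)]
    skewed = [row[:] for row in uniform]
    skewed[0][1] = 5.0  # one expensive CTB early in the first line
    print(wpp_end_times(uniform)[-1][-1])  # 8.0
    print(wpp_end_times(skewed)[-1][-1])   # 12.0
```

In this 3×4 example a single CTB that costs four extra time units delays the end of the frame by the same four units, because every later CTB sits behind it on the wavefront.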

Data Dependence Analysis of WPP and OWF Method (3/4)

Data Dependence Analysis of WPP and OWF Method (4/4)
• For inter: [equation not preserved in this transcript]
• i, j, k: order of frame, line, and CTU.
• W: the width of a frame measured in CTBs.
• L_OWF: a positive integer parameter denoting the safe range. In [11], L_OWF is set to roughly one quarter of the frame height measured in CTBs, rounded up.
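
A worked example of the safe-range arithmetic may help. The sketch below assumes 64×64 CTBs and a 1920×1080 frame, and it assumes that a CTB in line j of frame i waits for the last CTB of line j + L_OWF in the previous frame (clamped to the bottom line). That dependence rule is a guess at the lost formula, not the authors' equation, and all function names are hypothetical.

```python
import math


def ctb_count(length_px, ctb_size=64):
    """Number of CTBs needed to cover length_px pixels (ctb_size is assumed)."""
    return math.ceil(length_px / ctb_size)


def l_owf(height_ctb):
    """Safe range as described for [11]: a quarter of the CTB height, rounded up."""
    return math.ceil(height_ctb / 4)


def owf_inter_dep(i, j, height_ctb, width_ctb, safe_range):
    """CTB of the previous frame that line j of frame i waits for.

    ASSUMPTION: the reference area of line j is complete once the last CTB
    of line j + safe_range in frame i - 1 has been encoded.
    Returns None for the first frame, which has no reference frame.
    """
    if i == 0:
        return None
    ref_line = min(j + safe_range, height_ctb - 1)
    return (i - 1, ref_line, width_ctb - 1)


if __name__ == "__main__":
    height = ctb_count(1080)   # 17 CTB lines
    width = ctb_count(1920)    # 30 CTBs per line
    safe = l_owf(height)       # 5
    print(height, width, safe)
    print(owf_inter_dep(1, 0, height, width, safe))   # (0, 5, 29)
    print(owf_inter_dep(1, 15, height, width, safe))  # (0, 16, 29)
```

Under these assumptions the first line of a frame can start once six of the seventeen CTB lines of its reference frame are finished, which is where the overlap between consecutive frames comes from.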

Proposed Method (1/5)
• To best exploit the inter-frame parallelism, we designed a new Inter-frame Wavefront (IFW) coding order.

Proposed Method (2/5)
• For intra: [equation not preserved in this transcript]
• For inter: [equation not preserved in this transcript]

Proposed Method (3/5)
• A Frame Thread (FT) is assigned to each frame to exploit inter-frame parallelism.
• A Wavefront Thread (WT) is assigned to each frame to exploit intra-frame parallelism.

Proposed Method (4/5)
• If L_IFW is no greater than L_OWF, for any i, j, k we can deduce that: [equation not preserved in this transcript]

Proposed Method (5/5)
• It is also confirmed that the unbalanced SEC is a bottleneck for intra-frame parallelism.
• The parallelism of IFW increases significantly as the number of B-frames increases, because the reduced inter-frame dependence contributes much more to the overall parallelism.

Experimental Results
• The experiments follow the common test conditions and software reference configurations [12].
• The hardware platform is a shared-memory system with two AMD Opteron 6272 processors.

Experimental Results (2/)

Experimental Results
• Frame Thread = 9, Wavefront Thread = 8

x265

Conclusion
• A parallelism evaluation criterion and an IFW method are proposed to improve the encoding speed of HEVC.
• The IFW method achieves significant speedup on various sequences, which makes it a promising technology for large-scale HEVC video applications.