GPU Virtualization Support in Cloud System Ching-Chi Lin Institute of Information Science, Academia Sinica Department of Computer Science and Information Engineering, National Taiwan University Chih-Yuan Yeh, Chung-Yao Kao, Wei-Shu Hung Department of Computer Science and Information Engineering, National Taiwan University Pangfeng Liu Department of Computer Science and Information Engineering, National Taiwan University Graduate Institute of Networking and Multimedia, National Taiwan University Jan-Jan Wu Institute of Information Science, Academia Sinica Research Center for Information Technology Innovation, Academia Sinica Kuang-Chih Liu Cloud Computing Center for Mobile Applications, Industrial Technology Research Institute, Hsinchu, Taiwan
Introduction Cloud computing is very popular. Virtualization ◦ Shares hardware resources: CPU, memory. What about the GPU?
GPU Graphics Processing Unit (GPU) ◦ A specialized microprocessor that accelerates image rendering and display. ◦ Hundreds of cores. Advantages ◦ Better performance/cost ratio. ◦ Powerful parallel computing capability.
GPGPU “General-purpose computing on graphics processing units”. ◦ Supercomputing, molecular dynamics, protein folding, planetary system simulation, etc. Various programming environments support GPGPU. ◦ CUDA ◦ OpenCL
Our Goal Virtualize the GPU ◦ Cloud users can rent VMs to execute CUDA programs. Difficulties ◦ No built-in time-sharing mechanism. ◦ Internal information about the GPU is not available (e.g., the driver source code is closed).
Main Idea Gather and pack GPU kernels into batches. Concurrent kernel execution ◦ A technique that executes multiple CUDA kernels concurrently.
Concurrent Kernel Execution Uses the NVIDIA Fermi architecture with CUDA v4.0.
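On Fermi, kernels launched into different (non-default) CUDA streams may execute concurrently on the GPU. A minimal sketch of this mechanism is shown below; the kernel `scale`, the buffer sizes, and the stream count are illustrative assumptions, not code from the paper.

```
// Hypothetical sketch: independent kernels issued to separate CUDA
// streams, so a Fermi-class GPU may overlap their execution.
#include <cuda_runtime.h>

__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20, nStreams = 4;
    cudaStream_t streams[nStreams];
    float *buf[nStreams];

    for (int s = 0; s < nStreams; ++s) {
        cudaStreamCreate(&streams[s]);
        cudaMalloc(&buf[s], n * sizeof(float));
        // Kernels in different streams have no ordering constraint,
        // so the hardware is free to run them concurrently.
        scale<<<(n + 255) / 256, 256, 0, streams[s]>>>(buf[s], 2.0f, n);
    }
    for (int s = 0; s < nStreams; ++s) {
        cudaStreamSynchronize(streams[s]);
        cudaFree(buf[s]);
        cudaStreamDestroy(streams[s]);
    }
    return 0;
}
```

Kernels launched into the same stream still serialize; only cross-stream launches are candidates for overlap.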
Combiner Runs in domain-0. Packing kernels ◦ Parses the kernel source code. ◦ Creates a separate CUDA stream for each kernel. ◦ Prepares the combined kernel for concurrent execution.
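The slides do not spell out what the combined kernel looks like. One common kernel-fusion pattern partitions the blocks of a single launch among the original kernels; the sketch below is a hypothetical illustration of that pattern only (the kernel bodies and the name `combined` are stand-ins, not the paper's code).

```
// Hypothetical block-partitioned fusion of two kernels into one launch.
__global__ void combined(float *a, int na, float *b, int nb, int blocksA) {
    if (blockIdx.x < blocksA) {
        // Blocks [0, blocksA) perform the first kernel's work.
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < na) a[i] += 1.0f;            // stand-in for kernel A's body
    } else {
        // Remaining blocks perform the second kernel's work,
        // using a block index shifted back to start at zero.
        int i = (blockIdx.x - blocksA) * blockDim.x + threadIdx.x;
        if (i < nb) b[i] *= 2.0f;            // stand-in for kernel B's body
    }
}
```

The launch would use the sum of both kernels' grid sizes, with `blocksA` marking the split point between the two workloads.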
Combining Policies Chooses kernels in FIFO order. Sends a batch to the Executor if ◦ The combined kernel would use at least 90% of the GPU resources, or ◦ There are 16 kernels in the batch, or ◦ No new kernel has arrived within the last 10 seconds.
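The three flush conditions above can be sketched as plain host-side C++. The class and type names (`Batcher`, `KernelReq`) and the per-kernel resource-fraction model are assumptions for illustration, not the paper's implementation.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical sketch of the FIFO batching policy: kernels are queued in
// arrival order and a batch is flushed when any flush condition holds.
struct KernelReq {
    double gpu_fraction;   // estimated share of GPU resources, in [0, 1]
};

class Batcher {
    std::deque<KernelReq> queue_;
    double used_ = 0.0;
    static constexpr double kResourceLimit = 0.90;  // >= 90% of the GPU
    static constexpr std::size_t kMaxKernels = 16;  // 16 kernels per batch
public:
    // Returns a non-empty batch when a flush condition is met.
    std::vector<KernelReq> enqueue(const KernelReq &req) {
        queue_.push_back(req);
        used_ += req.gpu_fraction;
        if (used_ >= kResourceLimit || queue_.size() >= kMaxKernels)
            return flush();
        return {};
    }
    // Called by a timer when no kernel has arrived for 10 seconds.
    std::vector<KernelReq> on_idle_timeout() { return flush(); }
private:
    std::vector<KernelReq> flush() {
        std::vector<KernelReq> batch(queue_.begin(), queue_.end());
        queue_.clear();
        used_ = 0.0;
        return batch;
    }
};
```

In this sketch the idle-timeout condition is delegated to an external timer, since the policy object itself has no thread of control.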
Experiment Setting Xen hypervisor Physical machine ◦ Intel Core i processor with 4 cores running at 3.40GHz ◦ 8GB memory ◦ NVIDIA GTX 560-Ti GPU ◦ CUDA 4.0 Virtual machine ◦ Dual-core CPU, 1GB memory, 20GB disk ◦ Ubuntu with Linux 3.5 kernel
Performance Evaluation The execution-time ratio decreases as concurrency increases. ◦ However, the improvement is not linear, due to overhead.
Different Programs Mixture
Conclusion This paper proposes a GPU virtualization architecture using the NVIDIA Fermi GPU. ◦ Reduces execution time. ◦ Increases system throughput. Future work ◦ Apply the approach to the NVIDIA Kepler GPU. ◦ Develop better kernel packing policies.