OpenSSL acceleration using Graphics Processing Units

1 OpenSSL acceleration using Graphics Processing Units
Pedro Miguel Costa Saraiva

2 Introduction
Cryptography: the study of security techniques
SSL: a set of rules governing authentication and encrypted client/server communication
De facto standard for secure electronic communications
Computationally intensive
Large volumes of SSL traffic impact performance

3 Introduction
GPU: a specialised processing unit designed to manipulate graphics
Originally used solely for graphics calculations
Recent developments enable its use for general purpose computing
Massive computational power

4 Introduction: OpenSSL
Open-source implementation of the SSL and TLS protocols
Core library implements a variety of cryptographic functions
Intensively used by an extremely large number of both open and proprietary applications

5 Introduction: Objectives
Efficiently offload cryptographic operations onto a GPU
Add GPU functionality to OpenSSL
Lighten the load on the CPU

6 Introduction: Structure
State of the art
  OpenSSL
  GPU
    Programming the GPU
    OpenCL
    CUDA
    OpenCL vs CUDA
    Main challenges
Implementation
Results
Conclusion

7 State of the art: OpenSSL
Commercial-grade, full-featured open source toolkit
Divided into libssl and libcrypto
Core library written in C
Supports accelerator hardware via engines

8 State of the art: GPU
Massive parallel processing power
Roughly ten times the floating point capability of a high end CPU
Faster growth rate than CPUs

9 State of the art: GPU - Programming
At the end of the 90s, graphics cards could not be programmed
Things changed in 2001 with the release of DirectX 8 and OpenGL
Programmers had to express their computations in terms of textures, vertices and shader programs

10 State of the art: GPU - Programming
2006:
  NVIDIA created the CUDA framework
  ATI created the CTM low-level framework
2008:
  NVIDIA and ATI joined the Khronos Group
  Development of an industry standard for hybrid computing
  OpenCL version 1.0 released in December 2008

11 State of the art: GPU - OpenCL
Open, royalty-free standard for general purpose programming
Supports CPUs, GPUs, and other types of processors
Maintained by the non-profit consortium Khronos Group
Adopted by Intel, AMD, NVIDIA, and ARM Holdings

12 State of the art: GPU - OpenCL
API for coordinating parallel computation across different processors
Cross-platform programming language based on a subset of ISO C99
Low performance on NVIDIA GPUs

13 State of the art: GPU - CUDA
Proprietary hardware and software architecture
Designed by NVIDIA
Manages computations on a GPU
API is programmed with "C for CUDA"
Third party wrappers available for other languages

14 State of the art: GPU - Main Challenges
Well suited to extremely parallel problems
Interaction between threads should be minimal
Diverging execution paths are slow
Limited memory
Slow memory swapping
Data-intensive operations are discouraged
No file or standard I/O operations

15 Implementation: Structure
OpenSSL
AES
RSA Key Generation
RSA Cipher

16 Implementation: OpenSSL
ENGINE component supports alternative cryptography implementations
Supports dynamic loading of external engines

17 Implementation: OpenSSL Engine
Binding function defines supported algorithms
Pointers to functions implementing the defined algorithms
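
The binding idea on this slide can be pictured as a table of function pointers that a bind routine hands over. The sketch below is a self-contained analogy, not the OpenSSL ENGINE API: every `toy_` name, the algorithm strings, and the placeholder cipher routines are assumptions made for illustration. In OpenSSL itself the registration goes through the ENGINE interface (for example `ENGINE_set_id` and `ENGINE_set_ciphers`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical signature for a cipher routine; not an OpenSSL type. */
typedef int (*toy_cipher_fn)(const unsigned char *in, unsigned char *out, size_t len);

/* One table entry: an algorithm name and the function implementing it. */
typedef struct {
    const char   *name;
    toy_cipher_fn run;
} toy_algorithm;

/* Hypothetical engine descriptor filled in by the binding function. */
typedef struct {
    const char          *id;
    const toy_algorithm *algorithms;
    size_t               count;
} toy_engine;

/* Placeholder routines: they only copy data and stand in for GPU-backed code. */
static int toy_gpu_aes_ecb(const unsigned char *in, unsigned char *out, size_t len)
{
    memcpy(out, in, len);
    return 1;
}

static int toy_gpu_aes_cbc(const unsigned char *in, unsigned char *out, size_t len)
{
    memcpy(out, in, len);
    return 1;
}

static const toy_algorithm toy_gpu_algorithms[] = {
    { "aes-128-ecb", toy_gpu_aes_ecb },
    { "aes-128-cbc", toy_gpu_aes_cbc },
};

/* Binding function: declares which algorithms the engine supports. */
int toy_bind(toy_engine *e)
{
    if (e == NULL)
        return 0;
    e->id = "toy-gpu";
    e->algorithms = toy_gpu_algorithms;
    e->count = sizeof toy_gpu_algorithms / sizeof toy_gpu_algorithms[0];
    return 1;
}

/* Returns the implementation registered under name, or NULL if unsupported. */
toy_cipher_fn toy_lookup(const toy_engine *e, const char *name)
{
    assert(e != NULL && name != NULL);
    for (size_t i = 0; i < e->count; i++)
        if (strcmp(e->algorithms[i].name, name) == 0)
            return e->algorithms[i].run;
    return NULL;
}
```

A caller would run `toy_bind` once and then use `toy_lookup` to fetch the routine for a given algorithm; a NULL result means the request falls back to the default implementation.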

18 Implementation: AES
CBC mode encryption cannot be parallelised
  Previous ciphertext block is required to begin encryption of the next one
CBC mode decryption can be parallelised
  All blocks are decrypted in parallel
ECB mode can be parallelised
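
The asymmetry between CBC encryption and decryption can be seen in a small sketch. To keep it self-contained it uses a toy, insecure 16-byte block transform in place of AES; the function names and the transform are assumptions for illustration only. Encryption must walk the blocks in order, while each decrypted block depends only on ciphertext that is already available, so the per-block routine could serve as the body of one GPU thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 16

/* Toy invertible block transform standing in for AES. NOT secure. */
static void toy_encrypt_block(const uint8_t key[BLK], const uint8_t in[BLK], uint8_t out[BLK])
{
    for (int i = 0; i < BLK; i++)
        out[i] = (uint8_t)((in[i] ^ key[i]) + 0x5b);
}

static void toy_decrypt_block(const uint8_t key[BLK], const uint8_t in[BLK], uint8_t out[BLK])
{
    for (int i = 0; i < BLK; i++)
        out[i] = (uint8_t)((uint8_t)(in[i] - 0x5b) ^ key[i]);
}

/* CBC encryption: block b needs ciphertext block b-1, so the loop is serial. */
void cbc_encrypt(const uint8_t key[BLK], const uint8_t iv[BLK],
                 const uint8_t *pt, uint8_t *ct, size_t nblocks)
{
    const uint8_t *prev = iv;
    for (size_t b = 0; b < nblocks; b++) {
        uint8_t x[BLK];
        for (int i = 0; i < BLK; i++)
            x[i] = pt[b * BLK + i] ^ prev[i];
        toy_encrypt_block(key, x, ct + b * BLK);
        prev = ct + b * BLK;
    }
}

/* CBC decryption of one block: reads only ciphertext, so blocks are independent. */
void cbc_decrypt_one(const uint8_t key[BLK], const uint8_t iv[BLK],
                     const uint8_t *ct, uint8_t *pt, size_t b)
{
    const uint8_t *prev = (b == 0) ? iv : ct + (b - 1) * BLK;
    uint8_t x[BLK];
    assert(ct != pt); /* in-place use would overwrite ciphertext other blocks need */
    toy_decrypt_block(key, ct + b * BLK, x);
    for (int i = 0; i < BLK; i++)
        pt[b * BLK + i] = x[i] ^ prev[i];
}

/* Decrypts in reverse order to show that the order does not matter. */
void cbc_decrypt_any_order(const uint8_t key[BLK], const uint8_t iv[BLK],
                           const uint8_t *ct, uint8_t *pt, size_t nblocks)
{
    for (size_t b = nblocks; b-- > 0; )
        cbc_decrypt_one(key, iv, ct, pt, b);
}
```

Because `cbc_decrypt_one` never reads plaintext produced by another block, the reverse-order loop recovers the same data as a forward pass would.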

19 Implementation: AES
Initialisation
  Key expansion is performed on the CPU
Cipher
  Initialises the GPU
  Allocates host and GPU memory for input and output data

20 Implementation: AES
Cipher
  Input data transferred to the GPU memory
  All data transferred at once
  GPU kernel is called
  Output data is transferred from the GPU memory

21 Implementation: AES
GPU Kernel
  For CBC encryption, a single thread is called
    Encrypts every block serially
  For CBC decryption and ECB operations, a thread is called for every block
    All blocks are processed in parallel

22 Implementation: RSA Key Generation
Generation function (CPU side)
  Calls the GPU to generate a large number of prime candidates
  No more numbers are generated until the initial pool is exhausted
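
One way to picture the CPU-side pool is a small buffer that is refilled by a batch call only when it runs empty. The sketch below is an assumption about how such a pool might look, not the presented implementation: the struct, the capacity, and `demo_batch` (a stand-in for the GPU call that simply returns a fixed list of small primes) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_CAPACITY 8 /* hypothetical batch size; a GPU batch would be far larger */

/* Stand-in for the GPU call: fills buf with up to cap values, returns how many. */
typedef size_t (*batch_fn)(uint64_t *buf, size_t cap);

typedef struct {
    uint64_t items[POOL_CAPACITY];
    size_t   count;    /* values currently stored */
    size_t   next;     /* index of the next unused value */
    unsigned refills;  /* number of batch calls made so far */
    batch_fn generate;
} prime_pool;

void pool_init(prime_pool *p, batch_fn generate)
{
    p->count = 0;
    p->next = 0;
    p->refills = 0;
    p->generate = generate;
}

/* Returns 1 and stores a value in *out, or 0 if the generator produced nothing. */
int pool_take(prime_pool *p, uint64_t *out)
{
    assert(p != NULL && out != NULL && p->generate != NULL);
    if (p->next == p->count) { /* pool exhausted: only now ask for a new batch */
        p->count = p->generate(p->items, POOL_CAPACITY);
        p->next = 0;
        p->refills++;
        if (p->count == 0)
            return 0;
    }
    *out = p->items[p->next++];
    return 1;
}

/* Demonstration generator: a fixed list of small primes instead of GPU output. */
size_t demo_batch(uint64_t *buf, size_t cap)
{
    static const uint64_t primes[] = {101, 103, 107, 109, 113, 127, 131, 137};
    size_t n = sizeof primes / sizeof primes[0];
    if (n > cap)
        n = cap;
    for (size_t i = 0; i < n; i++)
        buf[i] = primes[i];
    return n;
}
```

With this layout the cost of a batch call is paid once per `POOL_CAPACITY` requests, which is the effect the slide describes.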

23 Implementation: RSA Key Generation
Generation function (GPU call)
  GPU RNG is initialised
  Device memory is allocated
  A large number of threads is launched to generate prime BIGNUMs

24 Implementation: RSA Key Generation
Generation function (GPU kernel)
  Random BIGNUM is generated
  BIGNUM p is tested for primality
    Miller-Rabin probabilistic primality test
  BIGNUMs determined to be prime are written into global memory
  Each thread tests one BIGNUM
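
The Miller-Rabin test named on this slide can be sketched on 32-bit integers, which keeps the arithmetic inside 64-bit words and avoids any BIGNUM code. This is an illustration under stated assumptions rather than the kernel from the presentation: it uses the fixed witnesses 2, 7 and 61, which are known to give a deterministic answer for every 32-bit input, whereas a BIGNUM version would draw random witnesses and remain probabilistic. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Computes base^exp mod mod for mod < 2^32, so products fit in 64 bits. */
static uint32_t powmod32(uint32_t base, uint32_t exp, uint32_t mod)
{
    uint64_t result = 1, b;
    assert(mod > 1);
    b = base % mod;
    while (exp) {
        if (exp & 1)
            result = result * b % mod;
        b = b * b % mod;
        exp >>= 1;
    }
    return (uint32_t)result;
}

/* One round: returns 1 if odd n = d * 2^r + 1 passes for witness a. */
static int mr_round(uint32_t n, uint32_t a, uint32_t d, unsigned r)
{
    uint64_t x = powmod32(a, d, n);
    if (x == 1 || x == n - 1)
        return 1;
    for (unsigned i = 1; i < r; i++) {
        x = x * x % n;
        if (x == n - 1)
            return 1;
    }
    return 0; /* a is a witness that n is composite */
}

/* Returns 1 if n is prime, 0 otherwise. */
int is_prime_u32(uint32_t n)
{
    static const uint32_t bases[] = {2, 7, 61};
    uint32_t d;
    unsigned r = 0;

    if (n < 2)
        return 0;
    if (n % 2 == 0)
        return n == 2;

    d = n - 1; /* write n - 1 as d * 2^r with d odd */
    while (d % 2 == 0) {
        d /= 2;
        r++;
    }
    for (unsigned i = 0; i < 3; i++) {
        uint32_t a = bases[i] % n;
        if (a == 0)
            continue; /* witness is a multiple of n, which happens only for n = 7 or 61 */
        if (!mr_round(n, a, d, r))
            return 0;
    }
    return 1;
}
```

In a one-candidate-per-thread layout, each thread would run the equivalent of `is_prime_u32` on its own number and write it out only when the result is 1.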

25 Implementation: RSA Key Generation
Generation function (GPU call)
  Output data copied back to the host
Required implementing the entire OpenSSL BIGNUM library on the GPU

26 Implementation: RSA Cipher
BIGNUMs used in RSA must be broken down into small words
Multiple threads can each process a word
Chinese Remainder Theorem can split private key operations in half
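
The Chinese Remainder Theorem split can be illustrated with a textbook-sized key. The sketch below assumes the toy primes p = 61 and q = 53 (so n = 3233, e = 17, d = 2753) and hypothetical function names; it shows the private-key operation done as two exponentiations modulo p and q, which are then recombined with Garner's formula. Real keys need multi-precision arithmetic, which this sketch deliberately leaves out.

```c
#include <assert.h>
#include <stdint.h>

/* Computes base^exp mod mod for mod < 2^32, so products fit in 64 bits. */
static uint64_t modpow(uint64_t base, uint64_t exp, uint64_t mod)
{
    uint64_t result = 1;
    assert(mod > 1 && mod <= UINT32_MAX);
    base %= mod;
    while (exp) {
        if (exp & 1)
            result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

/* Toy CRT key: dp = d mod (p-1), dq = d mod (q-1), qinv = q^-1 mod p. */
typedef struct {
    uint64_t p, q, dp, dq, qinv;
} toy_rsa_crt_key;

/* Private-key operation m = c^d mod (p*q), computed via two half-size steps. */
uint64_t rsa_crt_private(const toy_rsa_crt_key *k, uint64_t c)
{
    uint64_t m1 = modpow(c, k->dp, k->p); /* half-size exponentiation mod p */
    uint64_t m2 = modpow(c, k->dq, k->q); /* half-size exponentiation mod q */
    /* Garner recombination: h = qinv * (m1 - m2) mod p, m = m2 + h * q */
    uint64_t diff = (m1 + k->p - m2 % k->p) % k->p;
    uint64_t h = k->qinv * diff % k->p;
    return m2 + h * k->q;
}
```

For the toy key the CRT parameters are dp = 53, dq = 49 and qinv = 38, so a key initialised as `{61, 53, 53, 49, 38}` decrypts the same values as a direct `modpow(c, 2753, 3233)`. The two `modpow` calls inside `rsa_crt_private` are independent of each other, which is what makes the split attractive for parallel hardware.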

27 Implementation: RSA Cipher
Multi-Precision Algorithm
  k-bit integer A is broken into s = k/64 words
  O(s) parallel implementation
  Runs s threads in two phases

28 Implementation: RSA Cipher
First phase accumulates s partial products in 2s steps
  Carries accumulated in a separate array
Second phase adds the carries to the intermediate result
  Worst case scenario is s-1 iterations
  Usually only one or two
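
The two-phase scheme can be sketched serially for a plain multi-word multiplication. This is a simplified illustration under several assumptions, not the presented kernel: it uses 32-bit words so that products fit in 64 bits, it runs the per-thread work as ordinary loops, and the name `mp_mul_two_phase` and the `MAX_WORDS` limit are hypothetical. Phase one accumulates partial products per output word and stores the overflow in a separate carry array; phase two repeatedly adds the carries back until none remain and reports how many passes that took.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_WORDS 64 /* hypothetical upper bound on operand length in words */

/* Multiplies two s-word little-endian numbers into 2*s words of out.
 * Returns the number of carry passes that phase two needed. */
int mp_mul_two_phase(const uint32_t *a, const uint32_t *b, size_t s, uint32_t *out)
{
    uint64_t col[2 * MAX_WORDS] = {0};   /* per-word accumulators */
    uint32_t carry[2 * MAX_WORDS] = {0}; /* carries kept in a separate array */
    size_t n = 2 * s;
    int passes = 0;

    assert(s >= 1 && s <= MAX_WORDS);

    /* Phase 1: accumulate the low and high halves of every partial product. */
    for (size_t i = 0; i < s; i++) {
        for (size_t j = 0; j < s; j++) {
            uint64_t p = (uint64_t)a[i] * b[j];
            col[i + j]     += (uint32_t)p;
            col[i + j + 1] += p >> 32;
        }
    }
    for (size_t k = 0; k < n; k++) {
        out[k] = (uint32_t)col[k];
        if (k + 1 < n)
            carry[k + 1] = (uint32_t)(col[k] >> 32);
    }

    /* Phase 2: every word adds its incoming carry; repeat while carries remain. */
    for (;;) {
        uint32_t next[2 * MAX_WORDS] = {0};
        int pending = 0;
        for (size_t k = 0; k < n; k++)
            pending |= (carry[k] != 0);
        if (!pending)
            break;
        for (size_t k = 1; k < n; k++) {
            uint64_t t = (uint64_t)out[k] + carry[k];
            out[k] = (uint32_t)t;
            if (k + 1 < n)
                next[k + 1] = (uint32_t)(t >> 32);
        }
        memcpy(carry, next, sizeof carry);
        passes++;
    }
    return passes;
}
```

Squaring the two-word value 0xFFFFFFFFFFFFFFFF, for instance, settles after a single carry pass, in line with the slide's remark that one or two iterations are usually enough.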

29 Results: Testing Framework
Intel Core i7 950 CPU, 3.07 GHz
NVIDIA GeForce GTX 580
Stress tool used on heavy CPU load tests
  300 threads looping on sqrt, malloc/free and sync

30 Results: AES – CBC Decryption
Slower until the amount of data reaches 3KB
Up to 43 times faster

31 Results: AES – CBC Encryption
Slower than the CPU
Only 2.7% impact on CPU load

32 Results: AES – ECB Encryption
Slower until the amount of data reaches 3KB
Up to 43 times faster

33 Results: AES – ECB Decryption
Slower until the amount of data reaches 3KB
Up to 43 times faster

34 Results: RSA Key Generation
Slower until the amount of data reaches 3KB
Up to 43 times faster

35 Results: RSA Key Generation – Heavy CPU load
Slower until the amount of data reaches 3KB
Up to 43 times faster

36 Results: RSA Cipher
Single message
Slower until the amount of data reaches 3KB
Up to 43 times faster
Single message, heavy CPU load
Multiple messages (4096-bit)

37 Results: RSA Key Generation – Heavy CPU load
Slower until the amount of data reaches 3KB
Up to 43 times faster

38 Results: RSA Key Generation – Heavy CPU load
Slower until the amount of data reaches 3KB
Up to 43 times faster

39 Conclusion
Significant performance boost for AES ECB and CBC Decryption
AES CBC Encryption is slower, but significantly lighter on the CPU
RSA Key Generation is significantly faster for multiple keys
RSA Cipher is significantly slower

40 Future Work
AES CTR Cipher Mode
  OpenSSL implementation still unstable
Manager to cache RSA requests for more effective use of the GPU

