Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography.


1 Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

2 Presentation Outline Introduction & Motivation Related Work Design Methodology Design Description Algorithm Implementations Comparison with other Work Programming Paradigm Conclusion/Work in Progress

3 Motivating Factors Need for high speed cryptography Need for algorithm independence Need for more secure implementations Need for implementing both Symmetric and Asymmetric key encryption

4 Need for High Speed Implementations Software implementations cannot provide real time rates Hardware implementations essential for  IPSec end points  SSL servers  VPN at rates exceeding ATM Algorithm implementation must be able to sustain the network bandwidth

5 Need for Algorithm Independence IPSec  Cipher Algorithm Specified in Security Association (SA) SSL Transactions  Algorithm Negotiable for both Key Exchange & Encryption Need for Both Secret Key and Public Key Encryption  Session establishment - Large Number of transactions  Dedicated hardware not cheap!

6 Hardware Implementation Benefits More secure implementations Implementing both algorithms in hardware removes the bottleneck caused by slow key-establishment computations A single hardware implementation supporting both algorithms avoids the cost of separate hardware

7 Advantages of Reconfigurable Hardware Implementations Algorithm Agility Algorithm Upload/Modification Architecture Efficiency/Throughput Cost Efficiency

8 Comparison of Different Approaches

9 FPGAs? Post Fabrication Customization Low Cost Design Cycle Fast turnaround time Potential for Parallelism  Instruction-level – Multiple operations  Data-level – Multiple blocks of data  Task-level – Parallel tasks (e.g. secret key)

10 FPGA: The basics General purpose logic elements (LUTs) Very flexible interconnect Basically fine grained to support both data paths and random logic

11 FPGA: Disadvantages Too flexible – inefficiencies Too fine grained – again inefficiencies Block ciphers are primarily data flow oriented – implemented using a large number of small elements Ciphers have a well defined data flow – general purpose interconnect ends up being slow and overkill in terms of area

12 FPGA vs. Specialized Reconfigurable Logic Coarse grained vs. Fine grained Specialized interconnect vs. generic interconnect Reduced reconfiguration times End result  Faster performance with reduced area while maintaining enough flexibility to support the application domain

13 Issues in Reconfigurable Hardware Designs How much of what to support?  How many functional units?  What kinds of functional units?  How much support for random logic?  How much interconnect flexibility to allow? Programming/CAD tools  What kind of programming model to target  How to design efficient automated tools

14 Custom Reconfigurable Hardware Design- What’s involved? Looking for commonalities/overlaps as well as disjoint elements  Identify crucial components  Utilize potential overlap or partial reuse  Generic enough but fast components  Minimizing the differences in component types Balancing the resources  Upper bounds/Lower bounds  Logic units vs. memory blocks  Determining exact number of each type of unit Make the common case fast- IMPORTANT ALWAYS!

15 Related Work Cavium Networks’ SSL & IPsec Protocol Aware Security Processor USC Mark II’s Advanced Cryptographic Engine for IPsec Worcester Polytechnic Institute’s COBRA Architecture

16 SSL/IPsec Security Processor Support for both public key and secret key encryption Not Reconfigurable Dedicated hardware blocks for each operation

17 Advanced Cryptographic Engine (ACE) Designed to implement flexible cipher needs of IPsec Only supports block ciphers Support for any algorithm through a library of general purpose FPGA implementations

18 COBRA Architecture Custom Reconfigurable Hardware for block ciphers Each RCE is a macro block supporting various component operations Configured using VLIW instructions

19 Design Methodology Literature Survey  Block cipher implementations  Public key cipher implementations  Identifying essential components of efficient implementations Iterative Development of Architecture Validation by mapping several representative algorithms Identification of Programming Methodology

20 Categorizing Implementation Requirements Essential step to handle the design complexity  Logic Requirements  Interconnection Requirements  Memory (RAM/ROM) Requirements Area and Performance directly affected by these

21 Prioritizing Support Ordered by importance and then by relative hardware complexity  AES (Rijndael)  DES  Modular Exponentiation (RSA)  Serpent  Twofish  RC6, MARS, and others

22 Block Ciphers: Key Elements Bitwise XOR, AND, OR. Addition or subtraction modulo 2^n. Shift or rotation by a constant number of bits. Data-dependent rotation by a variable number of bits. Multiplication modulo the table entry value. Multiplication in the Galois field specified by the table entry value. Inversion modulo the table entry value. Look-up-table substitution.
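
A few of the primitive operations listed on this slide can be sketched in software as follows. This is only an illustration: the 32-bit word size, the function names, and the use of the AES reduction polynomial 0x11B for the Galois-field example are assumptions, not details taken from the presentation.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit word size


def add_mod(a, b, n=32):
    """Addition modulo 2^n."""
    return (a + b) & ((1 << n) - 1)


def sub_mod(a, b, n=32):
    """Subtraction modulo 2^n."""
    return (a - b) & ((1 << n) - 1)


def rotl32(x, r):
    """Rotate a 32-bit word left by a constant number of bits."""
    r %= 32
    return ((x << r) | (x >> (32 - r))) & MASK32


def data_dependent_rotl32(x, y):
    """Rotate x left by an amount taken from the low 5 bits of y."""
    return rotl32(x, y & 31)


def gf256_mul(a, b, poly=0x11B):
    """Multiply two bytes in GF(2^8); 0x11B is the AES polynomial (assumed)."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # add (XOR) the current multiple of a
        a <<= 1                  # multiply a by x
        if a & 0x100:
            a ^= poly            # reduce modulo the field polynomial
        b >>= 1
    return result
```

For example, `gf256_mul(0x57, 0x83)` reproduces the multiplication example given in the AES specification.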

23 Block Cipher: Core Operations

24 Modular Multiplication and Exponentiation Modular Exponentiation implemented with the square-and-multiply algorithm Montgomery Multiplication algorithm the most popular for modular multiplication Various Approaches for Implementation  Systolic Array  Word Based
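
As a rough software sketch of the two algorithms named on this slide, the following combines left-to-right square-and-multiply with Montgomery multiplication. The function names, the choice R = 2^k with k the bit length of the modulus, and the use of Python big integers (Python 3.8+ for `pow(n, -1, r)`) are assumptions made for illustration; a hardware design would use a systolic or word-based formulation instead.

```python
def montgomery_setup(n, k):
    """For odd n and R = 2^k > n, return n' = -n^(-1) mod R."""
    r = 1 << k
    return (-pow(n, -1, r)) % r


def mont_mul(a, b, n, k, n_prime):
    """Montgomery product a * b * R^(-1) mod n, for a, b < n."""
    mask = (1 << k) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask   # makes t + m*n divisible by R
    u = (t + m * n) >> k                # exact division by R
    return u - n if u >= n else u       # single conditional subtraction


def mod_exp(base, exp, n):
    """base^exp mod n for odd n > 1, via square-and-multiply."""
    k = n.bit_length()
    r = 1 << k
    n_prime = montgomery_setup(n, k)
    x = (base * r) % n                  # base in Montgomery form
    acc = r % n                         # 1 in Montgomery form
    for bit in bin(exp)[2:]:            # scan exponent bits, MSB first
        acc = mont_mul(acc, acc, n, k, n_prime)      # square
        if bit == "1":
            acc = mont_mul(acc, x, n, k, n_prime)    # multiply
    return mont_mul(acc, 1, n, k, n_prime)           # leave Montgomery form
```

The point of the Montgomery form is that each reduction becomes a shift and a mask rather than a division by the modulus, which is why it maps well onto adder-based hardware.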

25 ME & MM ME primarily requires fast adders CSA based implementation most common The highest throughput implementation used redundant representation with carry save adders for computation of partial results The same implementation style thus selected for ME
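
The redundant (carry-save) representation mentioned here can be illustrated with a small sketch. Partial results are kept as a sum word and a carry word, so no carry has to propagate until the very end. The function names are hypothetical and Python integers stand in for hardware words.

```python
def csa(x, y, z):
    """3:2 carry-save step: returns (s, c) with s + c == x + y + z."""
    s = x ^ y ^ z                              # per-bit sum, no carry chain
    c = ((x & y) | (x & z) | (y & z)) << 1     # per-bit carry, shifted up
    return s, c


def sum_many(operands):
    """Accumulate non-negative integers in redundant (sum, carry) form."""
    s, c = 0, 0
    for op in operands:
        s, c = csa(s, c, op)   # constant-depth step for each operand
    return s + c               # one carry-propagate addition at the end
```

Each `csa` call has a delay independent of the word length, which is the property that makes this style attractive for the long operands used in modular exponentiation.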

26 Our Design: Key Insight CSA made up of 2 half adders with 1 OR gate Each half adder itself 1 XOR & 1 AND Add some configurability to the basic CSA Result: A fast basic element with support for most of primitive operations
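
The gate-level structure described on this slide can be written out directly. The sketch below builds a full-adder cell from two half adders and one OR gate, applied bitwise to whole words. How the actual design adds configurability is not stated in the transcript; tying the third input to a constant, as shown in the comments, is only one assumed way such a cell could yield XOR, AND, and OR.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit word size


def half_adder(a, b):
    """One XOR and one AND: returns (sum, carry)."""
    return a ^ b, a & b


def full_adder(a, b, cin):
    """Two half adders plus one OR gate, applied bitwise to words."""
    s1, c1 = half_adder(a, b)
    s, c2 = half_adder(s1, cin)
    return s, c1 | c2


# Assumed configuration trick (not from the presentation):
#   cin = 0       -> sum output is a XOR b, carry output is a AND b
#   cin = MASK32  -> carry output is a OR b
```

With all three inputs driven by data, the same cell is one column of a carry-save adder; with the third input held constant it behaves as a plain logic gate.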

27 So What Else is needed? Shifts between rounds of addition (for modular exponentiation) Support for fixed length shifts, rotates & arbitrary permutes of 32-bit operands (for symmetric key) Solution: A Permutation Unit!
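
To make the role of a permutation unit concrete, here is a minimal software model, assuming the unit is configured by a table that names the source bit for each of the 32 output bits. The table format, the use of `None` for a zero-filled position, and the helper names are hypothetical; the transcript does not describe the real configuration format.

```python
def permute32(x, src):
    """Output bit i takes input bit src[i]; None means a constant 0."""
    out = 0
    for i, j in enumerate(src):
        if j is not None:
            out |= ((x >> j) & 1) << i
    return out


def rotl_table(r):
    """Configuration for a left rotate by r bits."""
    return [(i - r) % 32 for i in range(32)]


def shl_table(r):
    """Configuration for a left shift by r bits with zero fill."""
    return [i - r if i >= r else None for i in range(32)]


def bit_reverse_table():
    """Configuration for an arbitrary permute: reverse all 32 bits."""
    return [31 - i for i in range(32)]
```

Fixed shifts, rotates, and arbitrary bit permutations all reduce to a different table for the same routing structure, which is the appeal of a single dedicated unit.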

28 Structure of Proposed Design Final Design arrived upon by iterative refinement Hierarchical Design  Cell  Block/Cluster  Groups  Top of Hierarchy

29 The Cell

30 The Block/Cluster

31 Group

32 Interconnects In a Group

33 Overall Structure

34 Random Logic Support

