A Handel-C Implementation of a Computationally Intensive Problem in GF(3)
Joey C. Libby, Jonathan P. Lutes, and Kenneth B. Kent


The Handel-C Language

Handel-C is a behavioral HDL from Agility Design Solutions. It consists of:
* A subset of ANSI-C language elements
* Extensions for concurrency
* A set of variable-width primitive types
It does not support runtime recursion.

The Hardware solution

Most of the ANSI C code was reused. Elements that required changes:
* Storage elements were redefined to a static size.
* Nested function calls were un-nested.
* Recursive functions were made loop based.

Verification used boundary cases and a large number of random inputs, in 2 steps:
1. Test cases were run comparing the recursive and non-recursive versions of the functions, which were found to be functionally equivalent.
2. Test cases were run comparing the non-recursive software and hardware versions of the functions, which were again found to be equivalent.
Together, these two steps allow us to deem the hardware definition equivalent to the software algorithm.

Benchmarking was done in a simulation environment to gather statistics. Optimization followed the initial benchmarking: parallelism was added by wrapping compatible statements in par blocks. In total, 17 par blocks were identified.

Results:
* Moving to a parallel design increased the clock speed by 0.290 MHz.
* The parallel design used fewer slices and flip-flops.
* The parallel HW design consistently outperformed the SW design in tests.

[Figures: Time (sec) vs. Order of Polynomial; Execution Time % Difference Between HW and SW]

Conclusions:
The work was a success: the hardware design proved faster in all tests, and the data lead us to conclude that the parallel HW solution will continue to outperform the SW design.

Galois Fields

A Galois field is a finite field, denoted GF(p), whose order p is a prime or a power of a prime. A Galois field of order 3 has only 3 elements: {0, 1, 2}.
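As a minimal sketch of what arithmetic in GF(3) looks like, the C functions below implement addition, negation, and multiplication modulo 3. The type name `gf3` and the function names are illustrative assumptions, not taken from the authors' code.

```c
#include <assert.h>
#include <stdint.h>

/* Elements of GF(3) are stored as 0, 1 or 2 (an assumed encoding). */
typedef uint8_t gf3;

/* Addition in GF(3): ordinary addition reduced modulo 3. */
gf3 gf3_add(gf3 a, gf3 b) {
    assert(a < 3 && b < 3);
    return (gf3)((a + b) % 3);
}

/* Additive inverse in GF(3): 0 -> 0, 1 -> 2, 2 -> 1. */
gf3 gf3_neg(gf3 a) {
    assert(a < 3);
    return (gf3)((3 - a) % 3);
}

/* Multiplication in GF(3): ordinary multiplication reduced modulo 3. */
gf3 gf3_mul(gf3 a, gf3 b) {
    assert(a < 3 && b < 3);
    return (gf3)((a * b) % 3);
}
```

For instance, 2 + 2 = 1 and 2 * 2 = 1 in GF(3), so 2 is its own multiplicative inverse.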
Irreducible polynomials: a polynomial p(x) in F[x] is called irreducible over F if it is non-constant and cannot be represented as the product of two or more non-constant polynomials from F[x] [3].

Primitive polynomials: a polynomial F(X) with coefficients in GF(p) = Z/pZ is primitive if it has a root α in GF(p^m) such that {0, 1, α, α^2, ..., α^(p^m - 2)} is the entire field GF(p^m) and, moreover, F(X) is the smallest-degree polynomial having α as a root [3].

Future Work

Two paths for optimization:
* Parallelism: Only simple parallelism has been identified in the system. Next, loop-based parallelism or parallelism between different functional units must be exploited.
* Pipelined Data Path: A pipelined data path can be identified and implemented. It may increase the throughput of the algorithm, that is, the amount of work completed per clock cycle, by breaking the algorithm down into functional units that can operate in parallel.

Example par and seq statements:

    int 8 a, b, c, d, e, f, g, h;   /* 8-bit signed integers */
    a = 1; b = 2; c = 3; d = 4;
    par {                           /* both assignments execute in the same clock cycle */
        d = a + b;
        e = c + d;                  /* reads the value d held before the par block */
    }
    seq {                           /* assignments execute one after another */
        f = d + e;
        g = d * e;
    }

Introduction

What: A hardware implementation of an algorithm to count primitive and irreducible polynomials over a Galois field of order 3.
How: Using Handel-C, a behavioural HDL that is a subset of C with hardware-specific additions.
Why: To demonstrate a more efficient implementation of the algorithm. These fields are also used in pairing-based cryptographic systems.
History: This work takes an existing algorithm written in C, later implemented as a SW/HW co-design, and implements it in hardware using Handel-C.

The Co-Design solution

The C algorithm was profiled, and the multmod function was found to be the most computationally intensive. Multmod was implemented in VHDL for an Amirix AP1000 board.
[Figure: Overview]
Two limiting factors: a slow embedded processor and communications.

References
[3] G. Birkhoff, S. Mac Lane. A Survey of Modern Algebra, 5th ed. New York: Macmillan, 1996.
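The transcript does not define multmod. Assuming it denotes multiplication of two polynomials over GF(3) followed by reduction modulo a monic polynomial, a loop-based C sketch with fixed-size storage might look as follows. The name `multmod_gf3`, the coefficient layout (index i holds the coefficient of x^i), and the bound `MAX_DEG` are all hypothetical choices made for illustration, not the authors' implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_DEG 16  /* hypothetical bound so that storage has a static size */

/* Computes out = (a * b) mod m over GF(3).
 * a, b : polynomials of degree < n, coefficients in {0,1,2}, a[i] is the x^i term
 * m    : monic modulus of degree n, given as m[0..n] with m[n] == 1
 * out  : receives the n coefficients of the remainder */
void multmod_gf3(const uint8_t *a, const uint8_t *b, const uint8_t *m,
                 int n, uint8_t *out) {
    uint8_t prod[2 * MAX_DEG];
    int i, j;

    assert(n >= 1 && n <= MAX_DEG && m[n] == 1);
    memset(prod, 0, sizeof prod);

    /* Schoolbook multiplication with coefficients reduced modulo 3. */
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            prod[i + j] = (uint8_t)((prod[i + j] + a[i] * b[j]) % 3);

    /* Reduce from the highest term down: subtract c * x^(i-n) * m(x). */
    for (i = 2 * n - 2; i >= n; i--) {
        uint8_t c = prod[i];
        if (c == 0)
            continue;
        for (j = 0; j <= n; j++)
            prod[i - n + j] = (uint8_t)((prod[i - n + j] + 9 - c * m[j]) % 3);
    }

    for (i = 0; i < n; i++)
        out[i] = prod[i];
}
```

With the modulus x^2 + 1, which is irreducible over GF(3), multiplying x by x gives x^2, which reduces to -1, that is, 2. The nested loops use no recursion and only statically sized arrays, in line with the changes the poster lists for the hardware version.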

