Kris Gaj Office hours: Monday, 6:00-7:00 PM Tuesday, Thursday, 7:30-8:30 PM, and by appointment Research and teaching interests: cryptography computer.


1 Kris Gaj. Office hours: Monday, 6:00-7:00 PM; Tuesday and Thursday, 7:30-8:30 PM; and by appointment. Research and teaching interests: cryptography, computer arithmetic, VLSI design and testing. Contact: Science & Technology II, room 223; kgaj@gmu.edu; (703) 993-1575

2 ECE 645. Part of: MS in EE; MS in CpE (Digital Systems Design: pre-approved course; other concentration areas: elective course); Certificate in VLSI Design/Manufacturing; PhD in IT; PhD in ECE

3 Spring 2007 enrollment as of January 22, 2006: MS in CpE 7; MS in EE 4; PhD in CS 1; Non-Degree 4

4 My general area of interest is…: VLSI; Digital Systems Design; ASICs & FPGAs; VHDL/Verilog; CAD Tools; Reconfigurable Computing; Microelectronics; VLSI Fabrication; Nanoelectronics. I want to specialize primarily in…: CAD tools & Design Automation; Hardware description languages; FPGAs & Reconfigurable computing; Computer arithmetic; Front-end ASIC design (algorithmic down to gate level); Back-end ASIC design (transistor down to device level); Analog & Mixed Circuit Design; VLSI Fabrication; Micro- and Nanoelectronics; Semiconductor Devices. Recommended degree & concentration: MS CpE, Digital Systems Design; MS EE, Microelectronics

5 [Figure: course map by design level (algorithmic, register-transfer, gate, transistor, layout, devices), marking CpE core and EE core courses. Course titles shown: Computer Arithmetic; Introduction to VHDL; Digital Integrated Circuits; Mixed Signals VLSI; VLSI Test Concepts; Semiconductor Device Fundamentals; VLSI Design Automation; MOS Device Electronics; ULSI Microelectronics; Nanoelectronics; Analog Integrated Circuits. Course numbers shown: ECE 545, ECE 645, ECE 586, ECE 699, ECE 682, ECE 684, ECE 584, ECE 681, ECE 745, ECE 699, ECE 587.]

6 MS CpE: DIGITAL SYSTEMS DESIGN. Concentration advisors: Kris Gaj, Ken Hintz, David Hwang. 1. ECE 545 Introduction to VHDL: K. Gaj, D. Hwang; project, VHDL, Aldec/Synplicity/Xilinx and Synopsys Design Analyzer/PrimeTime. 2. ECE 645 Computer Arithmetic: HW and SW Implementation: K. Gaj; project, VHDL, Aldec/Synplicity/Xilinx and Synopsys Design Analyzer/PrimeTime. 3. ECE 681 VLSI Design Automation: T. Storey; project/lab, back-end design with Synopsys tools. 4. ECE 586 Digital Integrated Circuits: D. Ioannou

7 Prerequisites: ECE 545 Introduction to VHDL, or permission of the instructor, granted assuming that you know VHDL or Verilog and a high-level programming language (preferably C)

8 Course web page: ECE web page → Courses → Course web pages → ECE 645. http://teal.gmu.edu/courses/ECE645/index.htm

9 Computer Arithmetic. Lecture: Homework 10%; Midterm exam 1 (in class) 20%; Midterm exam 2 (take-home) 20%. Project: Project 1 20%; Project 2 30%

10 Advanced digital circuit design course covering efficient addition and subtraction, multiplication, division and modular reduction, and exponentiation of: integers (unsigned and signed); real numbers (fixed point; single and double precision floating point); elements of the Galois field GF(2^n) (polynomial base)

11 Lecture topics (1) INTRODUCTION 1. Applications of computer arithmetic algorithms 2. Number representation: unsigned integers; signed integers; fixed-point real numbers; floating-point real numbers; elements of the Galois Field GF(2^n)

12 ADDITION AND SUBTRACTION 1. Basic addition, subtraction, and counting 2. Carry-lookahead, carry-select, and hybrid adders 3. Adders based on Parallel Prefix Networks

13 MULTIOPERAND ADDITION 1. Carry-save adders 2. Wallace and Dadda Trees 3. Adding multiple signed numbers

14 MULTIPLICATION 1. Tree and array multipliers 2. Sequential multipliers 3. Multiplication of signed numbers and squaring

15 DIVISION 1. Basic restoring and non-restoring sequential dividers 2. SRT and high-radix dividers 3. Array dividers

16 LONG INTEGER ARITHMETIC 1. Modular Exponentiation 2. Multi-Precision Arithmetic in Software

17 FLOATING POINT AND GALOIS FIELD ARITHMETIC 1. Floating-point units 2. Galois Field GF(2^n) units

18 Similar courses at other universities: University of California, Santa Barbara, Behrooz Parhami, ECE252B: Computer Arithmetic. University of Massachusetts, Amherst, Israel Koren, ECE666: Digital Computer Arithmetic. Lehigh University, Michael Schulte, ECE496: High-Speed Computer Arithmetic. Worcester Polytechnic Institute, Berk Sunar, EE-579 V Computer Arithmetic Circuits. Stanford University, Michael Flynn, EE486: Advanced Computer Arithmetic. University of California, Davis, Vojin Oklobdzija, ECE278: Computer Arithmetic for Digital Implementation.

19 New in this course: real-life project based on VHDL or Verilog HDL; operations in the Galois Field (with application in cryptography and communications)

20 Possible topics for a Scholarly Paper or Research Project for the CpE & EE students: Advanced Computer Arithmetic (square root; exponential and logarithmic functions; trigonometric functions; hyperbolic functions); Fault-Tolerant Arithmetic; Low-Power Arithmetic; High-Throughput Arithmetic

21 Literature (1) Required textbook: Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design, Oxford University Press, 2000. Recommended textbooks: Milos D. Ercegovac and Tomas Lang, Digital Arithmetic, Morgan Kaufmann Publishers, 2004. Israel Koren, Computer Arithmetic Algorithms, 2nd edition, A. K. Peters, Natick, MA, 2002.

22 Literature (2) VHDL books (used in ECE 545 in Fall 2005): 1. Sundar Rajan, Essential VHDL: RTL Synthesis Done Right, S & G Publishing, 1998. 2. Volnei A. Pedroni, Circuit Design with VHDL, The MIT Press, 2004.

23 Literature (3) Supplementary books: 1. E. E. Swartzlander, Jr., Computer Arithmetic, vols. I and II, IEEE Computer Society Press, 1990. 2. Alfred J. Menezes, Paul C. van Oorschot, and Scott A. Vanstone, Handbook of Applied Cryptography, Chapter 14, Efficient Implementation, CRC Press, Inc., 1998. 3. Christof Paar, Efficient VLSI Architectures for Bit Parallel Computation in Galois Fields, VDI Verlag, 1994.

24 Literature (3) Proceedings of conferences: ARITH - International Symposium on Computer Arithmetic; ASIL - Asilomar Conference on Signals, Systems, and Computers; ICCD - International Conference on Computer Design; CHES - Workshop on Cryptographic Hardware and Embedded Systems. Journals and periodicals: IEEE Transactions on Computers, in particular special issues on computer arithmetic: 8/70, 6/73, 7/77, 4/83, 8/90, 8/92, 8/94, 7/00, 3/05; IEEE Transactions on Circuits and Systems; IEEE Transactions on Very Large Scale Integration; IEE Proceedings: Computer and Digital Techniques; Journal of VLSI Signal Processing

25 Homework: reading assignments (main textbook + articles); analysis of hardware and software algorithms and implementations; design of small hardware units using VHDL or Verilog. Optional assignments. Possibility of trading analysis vs. design vs. coding

26 Midterm exams. Exam 1: 2 hrs 30 minutes, in class; multiple choice + short problems. Exam 2: 48 hrs, take-home; conceptual questions, analysis and design of arithmetic units using VHDL or Verilog HDL. Practice exams on the web. Tentative days of exams: Exam 1 - Monday, March 26; Exam 2 - Saturday-Sunday, May 5-6

27 Project (1) Project I (20% of grade): design and comparative analysis of fast adders (several hundred bits long). Final report due Monday, March 19. Optimization criteria: minimum latency; maximum throughput; minimum area; minimum product latency · area; maximum ratio throughput/area; scalability. Similar for all students. Done individually.

28 Project (2) Project II (30% of grade): fast multiplication, squaring, division, modular reduction, or modular exponentiation of long unsigned or signed integers; or fast addition or multiplication of floating-point numbers

29 Project II (rules): Written report & oral presentation, Monday, May 14. Real-life application. Requirements derived from the analysis of the application. Typically both hardware and software design. Several project topics proposed on the web. You can choose the project topic yourself. Can be done in a group of 1-3 students.

30 Project II (rules): Cooperation (but not exchange of code) between teams is encouraged. Every team works on a slightly different problem. Project topics should be more complex for larger teams.

31 Project. Hardware: VHDL (or Verilog) code; latency and/or throughput; area. Software: high-level language (C preferred); execution time; memory requirements. Scalability.

32 Degrees of freedom and possible trade-offs: speed, area, power, testability. Courses: ECE 645; ECE 682; ECE 586, 681

33 Degrees of freedom and possible trade-offs: speed (latency, throughput) vs. area

34 Timing parameters (definition; units; effect of pipelining). Latency: time input → output; ns; bad. Throughput: #output bits/time unit; Mbits/s; good. Delay: time point → point; ns. Clock period: rising edge → rising edge of clock; ns. Clock frequency: 1/clock period; MHz.

35 Project technologies: semi-custom Application Specific Integrated Circuits and Field Programmable Gate Arrays

36 Levels of design description: algorithmic level; register transfer level (the level of description most suitable for synthesis); logic (gate) level; circuit (transistor) level; physical (layout) level

37 Register Transfer Logic (RTL) Design Description. [Figure: registers driven by a common clock, separated by blocks of combinational logic.]

38 CAD software available at GMU (1) VHDL simulators. Aldec Active-HDL (under Windows): available in the FPGA Lab, S&T II, room 203; student edition can be purchased on an individual basis ($59.95 + S&H). ModelSim Xilinx Edition III (under Windows): available in the FPGA Lab, S&T II, room 203; limited version available for free for individual use at home as a part of Xilinx WebPACK.

39 CAD software available at GMU (2) Tools used for logic synthesis. FPGA synthesis: Synplicity Synplify Pro (under Windows), available in the FPGA Lab, S&T II, room 203; Xilinx XST (under Windows), available for free as a part of WebPACK. ASIC synthesis: Synopsys Design Compiler and PrimeTime (under Unix), available from all PCs in the ECE educational labs using an X-terminal emulator, and remotely from home using a fast Internet connection.

40 CAD software available at GMU (3) Tools used for implementation (mapping, placing & routing) in the FPGA technology. Xilinx ISE (under Windows): available in the FPGA Lab, S&T II, room 203. Xilinx WebPACK (under Windows): limited version available for free for individual use at home.

41 How to learn VHDL for synthesis by yourself? Lecture slides for ECE 545 from Fall 2005. Sundar Rajan, Essential VHDL: RTL Synthesis Done Right, S & G Publishing, 1998. Volnei A. Pedroni, Circuit Design with VHDL, The MIT Press, 2004. Individual or small-group hands-on sessions with the TA. Practice, Practice, Practice!!!

42 Testbench. [Figure: a non-synthesizable testbench drives a synthesizable design entity, which may have Architecture 1, Architecture 2, ..., Architecture N.]

43 Hardware Design Verification. [Figure: representative inputs are fed to both the HDL design (VHDL or Verilog) and a reference model (C or MAGMA); the testbench checks whether the actual results equal the expected results.]
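The verification flow on slide 43 can be sketched in software. The example below is only an illustration under stated assumptions: a bit-level ripple-carry adder written in Python stands in for the HDL design, ordinary integer addition plays the role of the reference model, and the names `ripple_carry_add`, `reference_add`, `run_testbench`, and the 8-bit `WIDTH` are hypothetical, not taken from the course materials.

```python
import random

WIDTH = 8  # operand width in bits (an assumption for this sketch)


def ripple_carry_add(a, b, width=WIDTH):
    """Stand-in for the design under test: a bit-level ripple-carry adder."""
    carry = 0
    result = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        s = x ^ y ^ carry                    # full-adder sum bit
        carry = (x & y) | (carry & (x ^ y))  # full-adder carry-out
        result |= s << i
    return result, carry


def reference_add(a, b, width=WIDTH):
    """Reference model: integer addition split into sum and carry-out."""
    total = a + b
    return total & ((1 << width) - 1), total >> width


def run_testbench(num_random=1000, seed=0):
    """Feed the same inputs to both models and return the mismatching inputs."""
    rng = random.Random(seed)
    top = (1 << WIDTH) - 1
    vectors = [(0, 0), (top, 1), (top, top)]  # hand-picked corner cases
    vectors += [(rng.randint(0, top), rng.randint(0, top))
                for _ in range(num_random)]
    return [(a, b) for a, b in vectors
            if ripple_carry_add(a, b) != reference_add(a, b)]


if __name__ == "__main__":
    print("mismatches:", len(run_testbench()))
```

In a real flow the expected results would typically be written to a file by the reference model and read back by the HDL testbench; the comparison step is the same.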

44 Primary applications (1): execution units of general purpose microprocessors. Integer units: integers (8, 16, 32, 64 bits). Floating point units: real numbers (32, 64 bits).

45 Primary applications (2): digital signal and digital image processing. Real or complex numbers (fixed-point or floating point), e.g., digital filters, Discrete Fourier Transform, Discrete Hilbert Transform. General purpose DSP processors; specialized circuits.

46 Primary applications (3): coding. Elements of the Galois fields GF(2^n) (4-64 bits). Error detection codes; error correcting codes.

47 Secret-key (Symmetric) Cryptosystems. [Figure: Alice encrypts and Bob decrypts messages sent over the network, both using the key of Alice and Bob, K_AB.]

48 Primary applications (4): cryptography. Secret key cryptography. IDEA, RC6, Mars: integers (16, 32 bits). Twofish, Rijndael: elements of the Galois field GF(2^n) (4, 8 bits).

49 Main operations and auxiliary operations. RC6: main 2 x SQR32, 2 x ROL32; auxiliary XOR, ADD/SUB32. MARS: main MUL32, 2 x ROL32, S-box 9x32; auxiliary XOR, ADD/SUB32. Twofish: main 96 S-box 4x4, 24 MUL GF(2^8); auxiliary XOR, ADD32. Rijndael: main 16 S-box 8x8, 24 MUL GF(2^8); auxiliary XOR. Serpent: main 8 x 32 S-box 4x4; auxiliary XOR.
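The MUL GF(2^8) operation listed for Rijndael can be illustrated with a short shift-and-add sketch in polynomial basis. This is a software illustration only, not a description of any hardware unit from the course; the function name `gf256_mul` is hypothetical, and the reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B) is the one Rijndael uses. Twofish uses different reduction polynomials, which could be passed as `poly`.

```python
AES_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1, the Rijndael reduction polynomial


def gf256_mul(a, b, poly=AES_POLY):
    """Multiply two elements of GF(2^8) in polynomial basis (shift-and-add)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a   # addition in GF(2^n) is bitwise XOR
        b >>= 1
        a <<= 1           # multiply a by x
        if a & 0x100:
            a ^= poly     # reduce modulo the field polynomial
    return result


if __name__ == "__main__":
    # 0x57 * 0x83 is the worked example given in the AES specification.
    print(hex(gf256_mul(0x57, 0x83)))
```

Because addition is a carry-free XOR, a hardware multiplier for this field needs no carry chains, which is one reason Galois field units are treated separately from integer units.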

50 Public Key (Asymmetric) Cryptosystems. [Figure: Alice encrypts with the public key of Bob, K_B; the message crosses the network; Bob decrypts with the private key of Bob, k_B.]

51 RSA as a trap-door one-way function. Public key: $C = f(M) = M^e \bmod N$. Private key: $M = f^{-1}(C) = C^d \bmod N$. $N = P \cdot Q$, where $P$, $Q$ are large prime numbers; $e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$

52 RSA keys. Public key: $\{e, N\}$. Private key: $\{d, P, Q\}$. $N = P \cdot Q$, where $P$, $Q$ are large prime numbers; $e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$
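The RSA relations above can be checked numerically with toy parameters. The values P = 61, Q = 53, e = 17 and M = 65 below are a common textbook choice and an assumption of this sketch, not values from the slides; real keys use primes hundreds of digits long. The sketch relies on Python's built-in three-argument `pow`, which also computes modular inverses for a negative exponent (Python 3.8 and later).

```python
# Toy RSA parameters (illustrative only; far too small to be secure).
P, Q = 61, 53
N = P * Q                   # public modulus
phi = (P - 1) * (Q - 1)     # (P-1)(Q-1)
e = 17                      # public exponent, coprime to phi
d = pow(e, -1, phi)         # private exponent: e*d = 1 mod (P-1)(Q-1)


def encrypt(message):
    """C = M^e mod N, computed with the public key {e, N}."""
    return pow(message, e, N)


def decrypt(ciphertext):
    """M = C^d mod N, computed with the private exponent d."""
    return pow(ciphertext, d, N)


if __name__ == "__main__":
    M = 65
    C = encrypt(M)
    print("d =", d, "C =", C, "recovered M =", decrypt(C))
```

The modular exponentiation inside `pow` is the operation that dominates RSA running time and is the natural target for a hardware accelerator.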

53 Primary applications (5): cryptography. Public key cryptography. RSA, DSA, Diffie-Hellman: long integers (1000-16,000 bits). Elliptic Curve Cryptosystems: elements of the Galois field GF(2^n) (150-500 bits).

54 Primary applications (5): cipher breaking. Public key cryptography: going from the RSA public key $\{e, N\}$ to the RSA private key $\{d, P, Q\}$ requires factoring $N = P \cdot Q$ to obtain $P$, $Q$, and then solving $e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$

55 Estimate by RSA Security Inc. of the number and memory of PCs necessary to break RSA-1024. Attack time: 1 year. Single machine: PC, 500 MHz, 170 GB RAM. Number of machines: 342,000,000

56 Factoring 1024-bit RSA keys using Number Field Sieve (NFS). Steps: Polynomial Selection; Relation Collection (Sieving; Norm Factoring/Cofactoring of 200 bit & 350 bit numbers: ECM, p-1 method, Pollard rho); Linear Algebra; Square Root. Sashisu Bajracharya, Ramakrishna Bachimanchi

57 Comparison among technologies: Microprocessors, ASICs, FPGAs. SRC, COPACOBANA

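58 FPGAs vs. Microprocessors (ratios in parentheses are relative to the Pentium 4). rho: Spartan3s5000 637 (7.8x); Virtex2v6000 869 (10.8x); Pentium4 2.8GHz 80. p-1: Spartan3s5000 635 (8.4x); Virtex2v6000 857 (11.3x); Pentium4 2.8GHz 76. ECM: Spartan3s5000 315 (7.9x); Virtex2v6000 435 (10.8x); Pentium4 2.8GHz 40.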

59 Rho in an ASIC 130 nm. [Figure: chip layout showing local memory and global memory.]

60 ASIC 130 nm vs. Virtex II 6000, rho (24 units). Area of Virtex II 6000 (estimation by R.J. Lim Fong, MS Thesis, VPI, 2004): 19.68 mm x 19.80 mm. Area of an ASIC with equivalent functionality: 2.7 mm x 2.82 mm. Area ratio: 51x.

61 Number of computations per second using the same chip area. rho: Virtex2v6000 FPGA 869; 130 nm ASIC library 88,405 (101x). ECM: Virtex2v6000 FPGA 435; 130 nm ASIC library 21,739 (50x).

62 Cofactorization Unit: an interesting Computer Arithmetic project

63 Famous computer arithmetic bugs and flaws

64 Learn to deal with approximations. In digital arithmetic one has to come to grips with approximation and questions like: When is approximation good enough? What margin of error is acceptable? Be aware of the applications you are designing the arithmetic circuit or program for. Analyze the implications of your approximation.

65 Calculators. $u = \sqrt{\sqrt{\cdots\sqrt{2}}}$ (10 times) $= 1.000\,677\,131$; $v = 2^{1/1024} = 1.000\,677\,131$. $x = (((u^2)^2)\cdots)^2$ (10 times) $= 1.999\,999\,963$; $x' = u^{1024} = 1.999\,999\,973$. $y = (((v^2)^2)\cdots)^2$ (10 times) $= 1.999\,999\,983$; $y' = v^{1024} = 1.999\,999\,994$. Hidden digits in the internal representation of numbers. Different algorithms give slightly different results. Very good accuracy.
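The calculator experiment can be repeated in IEEE double precision. This is a sketch, not a reproduction of the slide's figures: a pocket calculator uses decimal arithmetic with a few hidden digits, so its digits will differ from those of binary floating point, and the helper names `repeated_sqrt` and `repeated_square` are hypothetical.

```python
import math


def repeated_sqrt(value, times):
    """Apply the square root `times` times in a row."""
    for _ in range(times):
        value = math.sqrt(value)
    return value


def repeated_square(value, times):
    """Square the value `times` times in a row."""
    for _ in range(times):
        value = value * value
    return value


u = repeated_sqrt(2.0, 10)    # ten nested square roots of 2
v = 2.0 ** (1.0 / 1024)       # the same quantity, computed by exponentiation
x = repeated_square(u, 10)    # ideally exactly 2
x_prime = u ** 1024
y = repeated_square(v, 10)
y_prime = v ** 1024

if __name__ == "__main__":
    for name, value in [("u", u), ("v", v), ("x", x),
                        ("x'", x_prime), ("y", y), ("y'", y_prime)]:
        print(f"{name:>2} = {value:.17g}")
```

Each squaring roughly doubles the relative error, so ten squarings amplify the rounding error in u or v by about a factor of 1024, which is why the last digits of the results differ from 2 and from each other.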

66 Consequences of bad approximations. Example: Failure of Patriot Missile (1991 Feb. 25). Source: http://www.math.psu.edu/dna/455.f96/disasters.html An American Patriot Missile battery in Dhahran, Saudi Arabia, failed to intercept an incoming Iraqi Scud missile. The Scud struck an American Army barracks, killing 28. Cause, per GAO/IMTEC-92-26 report: “software problem” (inaccurate calculation of the time since boot). Specifics of the problem: time in tenths of a second, as measured by the system’s internal clock, was multiplied by 1/10 to get the time in seconds. Internal registers were 24 bits wide: $1/10 = (0.0001\,1001\,1001\,1001\,1001\,100)_2$ (chopped to 24 b). Error $\approx (0.1100\,1100)_2 \times 2^{-23} \approx 9.5 \times 10^{-8}$. Error in 100-hr operation period $\approx 9.5 \times 10^{-8} \times 100 \times 60 \times 60 \times 10 = 0.34$ s. Distance traveled by Scud $= (0.34\ \text{s}) \times (1676\ \text{m/s}) \approx 570$ m. This put the Scud outside the Patriot’s “range gate”. Ironically, the fact that the bad time calculation had been improved in some (but not all) code parts contributed to the problem, since it meant that inaccuracies did not cancel out.
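The Patriot error estimate can be reproduced with exact rational arithmetic. The sketch below assumes, as the binary expansion on the slide suggests, that 1/10 is chopped to 23 fractional bits; the constant and variable names are hypothetical, and the Scud speed of 1676 m/s is the figure quoted on the slide.

```python
from fractions import Fraction

FRACTION_BITS = 23        # fractional bits kept after chopping (assumption)
SCUD_SPEED_M_PER_S = 1676 # speed quoted on the slide

exact_tenth = Fraction(1, 10)
# Chopping = truncating the binary expansion of 1/10 after FRACTION_BITS bits.
chopped_tenth = Fraction((1 << FRACTION_BITS) // 10, 1 << FRACTION_BITS)
error_per_tick = exact_tenth - chopped_tenth   # error per 0.1 s clock tick

ticks_in_100_hours = 100 * 60 * 60 * 10        # clock counts tenths of a second
drift_seconds = float(error_per_tick * ticks_in_100_hours)
miss_distance_m = drift_seconds * SCUD_SPEED_M_PER_S

if __name__ == "__main__":
    print(f"error per tick : {float(error_per_tick):.3e} s")
    print(f"drift in 100 h : {drift_seconds:.4f} s")
    print(f"miss distance  : {miss_distance_m:.0f} m")
```

The per-tick error is tiny, but it is multiplied by 3.6 million ticks, which turns a rounding detail into a timing error of roughly a third of a second.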

67 Consequences of bad approximations. Example: Explosion of Ariane Rocket (1996 June 4). Source: http://www.math.psu.edu/dna/455.f96/disasters.html The unmanned Ariane 5 rocket launched by the European Space Agency veered off its flight path, broke up, and exploded only 30 seconds after lift-off (altitude of 3700 m). The $500 million rocket (with cargo) was on its 1st voyage after a decade of development costing $7 billion. Cause: “software error in the inertial reference system”. Specifics of the problem: a 64-bit floating point number relating to the horizontal velocity of the rocket was being converted to a 16-bit signed integer. An SRI (inertial reference system) software exception arose during conversion because the 64-bit floating point number had a value greater than what could be represented by a 16-bit signed integer (max 32 767).
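The conversion problem can be illustrated with a range check before narrowing. This is not the flight software, and the function names are hypothetical; the sketch only contrasts two common ways of handling a 64-bit floating point value that does not fit in a 16-bit signed integer: raising an error, or saturating at the representable limits.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(value):
    """Truncate toward zero, raising OverflowError if the result does not fit."""
    truncated = int(value)
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a 16-bit signed integer")
    return truncated


def to_int16_saturating(value):
    """Truncate toward zero, clamping to the 16-bit signed range."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


if __name__ == "__main__":
    for sample in (1234.9, 32767.0, 40000.0):
        try:
            print(sample, "->", to_int16_checked(sample))
        except OverflowError as exc:
            print(sample, "-> error:", exc,
                  "| saturated:", to_int16_saturating(sample))
```

Which policy is appropriate depends on the application; the point is that the out-of-range case has to be decided by the designer rather than left to an unhandled exception.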

68 Pentium bug (1). October 1994: Thomas Nicely, Lynchburg College, Virginia, finds an error in his computer calculations, and traces it back to the Pentium processor. November 7, 1994: First press announcement, Electronic Engineering Times. Late 1994: Tim Coe, Vitesse Semiconductor, presents an example with the worst-case error: c = 4 195 835/3 145 727; Pentium = 1.333 739 06...; correct result = 1.333 820 44...
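The size of the error in Tim Coe's example can be quantified directly. The sketch below computes the correct quotient on the machine running it and compares it with the flawed value as quoted on the slide (truncated to nine significant digits, so the relative error is approximate); the variable names are hypothetical.

```python
NUMERATOR = 4_195_835
DENOMINATOR = 3_145_727

# Flawed quotient as quoted on the slide (truncated digits).
PENTIUM_RESULT = 1.33373906

correct = NUMERATOR / DENOMINATOR
absolute_error = correct - PENTIUM_RESULT
relative_error = absolute_error / correct

if __name__ == "__main__":
    print(f"correct quotient : {correct:.8f}")
    print(f"flawed quotient  : {PENTIUM_RESULT:.8f}")
    print(f"relative error   : {relative_error:.2e}")
```

A relative error of this size means the flawed result is correct to only about four significant decimal digits, far short of what single or double precision should deliver.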

69 Pentium bug (2). Intel admits “subtle flaw”. November 30, 1994: Intel’s white paper about the bug and its possible consequences. Intel: average spreadsheet user affected once in 27,000 years. IBM: average spreadsheet user affected once every 24 days. Replacements based on customer needs. December 20, 1994: announcement of no-questions-asked replacements.

70 Pentium bug (3). Error traced back to the look-up table used by the radix-4 SRT division algorithm: 2048 cells, 1066 non-zero values {-2, -1, 1, 2}. 5 non-zero values were not downloaded correctly to the lookup table due to an error in the C script.
