1 Lucas-Lehmer Primality Tester Team: W-4 Nathan Stohs W4-1 Brian Johnson W4-2 Joe Hurley W4-3 Marques Johnson W4-4 Design Manager: Prateek Goenka.

Presentation transcript:

1 Lucas-Lehmer Primality Tester Team: W-4 Nathan Stohs W4-1 Brian Johnson W4-2 Joe Hurley W4-3 Marques Johnson W4-4 Design Manager: Prateek Goenka

2 Ancient Greeks 300 BC Euclid’s Proof Proved that there are an infinite number of prime numbers, and that they are irregularly spaced

3 How to find Prime Numbers The method used for smaller numbers is called the Sieve of Eratosthenes, from 240 BC. Trial Division is another method for smaller numbers.
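As an illustration of the sieve named above, here is a minimal C sketch. The function name count_primes and the upper bound on limit are assumptions made for this example, not part of the project.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Counts the primes <= limit with the Sieve of Eratosthenes.
 * Hypothetical helper for illustration; returns -1 if allocation fails. */
int count_primes(uint32_t limit)
{
    assert(limit >= 2 && limit <= 10000000u);
    unsigned char *composite = calloc((size_t)limit + 1, 1);
    if (composite == NULL)
        return -1;

    int count = 0;
    for (uint32_t i = 2; i <= limit; i++) {
        if (composite[i])
            continue;               /* already crossed out by a smaller prime */
        count++;
        /* Cross out multiples of i, starting at i*i (64-bit to avoid overflow). */
        for (uint64_t j = (uint64_t)i * i; j <= limit; j += i)
            composite[j] = 1;
    }
    free(composite);
    return count;
}
```

The sieve needs memory proportional to the limit, which is why it only suits small numbers; Mersenne candidates with millions of digits need a dedicated test such as Lucas-Lehmer.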

4 43rd Known Mersenne Prime Found!! December 2005 Dr. Curtis Cooper and Dr. Steven Boone, Professors at Central Missouri State University 2^30,402,457 - 1

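5 [Table with columns: rank, prime, digits, who, when, reference. Surviving cells (who, when, reference): G9 2005 Mersenne; G8 2005 Mersenne; G7 2004 Mersenne; G6 2003 Mersenne; G5 2001 Mersenne; SB; SB; G4 1999 Mersenne; SB; SB9 2005. The rank, prime, and digits values are missing.]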

6 Prime Number Competitions Electronic Frontier Foundation $50,000 to the first individual or group who discovers a prime number with at least 1,000,000 decimal digits (awarded Apr. 6, 2000) $100,000 to the first individual or group who discovers a prime number with at least 10,000,000 decimal digits $150,000 to the first individual or group who discovers a prime number with at least 100,000,000 decimal digits $250,000 to the first individual or group who discovers a prime number with at least 1,000,000,000 decimal digits

7 Mersenne Prime Algorithm For P > 2, 2^P - 1 is prime if and only if S_(P-2) is zero in this sequence: S_0 = 4, S_N = (S_(N-1) * S_(N-1) - 2) mod (2^P - 1)

8 Example to Show 2^7 - 1 is Prime 2^7 - 1 = 127 S_0 = 4 S_1 = (4 * 4 - 2) mod 127 = 14 S_2 = (14 * 14 - 2) mod 127 = 67 S_3 = (67 * 67 - 2) mod 127 = 42 S_4 = (42 * 42 - 2) mod 127 = 111 S_5 = (111 * 111 - 2) mod 127 = 0
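The recurrence in this example can be sketched directly in C. This is a software illustration only: it uses the built-in % operator rather than the hardware-friendly modulo described later, and the name lucas_lehmer and the 3..31 range for p are assumptions chosen so the squaring fits in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Lucas-Lehmer test for 2^p - 1.
 * Returns 1 if S_(p-2) == 0 (prime), 0 otherwise.
 * Sketch only: p is restricted to 3..31 so that s * s fits in 64 bits. */
int lucas_lehmer(unsigned p)
{
    assert(p > 2 && p < 32);
    uint64_t mod = ((uint64_t)1 << p) - 1;
    uint64_t s = 4;                       /* S_0 */
    for (unsigned n = 1; n <= p - 2; n++) {
        /* Add mod before subtracting 2 so the value never goes negative. */
        s = (s * s + mod - 2) % mod;      /* S_n */
    }
    return s == 0;
}
```

Calling lucas_lehmer(7) walks through 4, 14, 67, 42, 111, 0 and reports that 127 is prime.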

9 Computations needed: -Squaring (not a problem…) -Add/Subtract (not a problem…) -Modulo (2^n – 1) multiplication (?) Algorithmic description We knew the computations needed, but how to translate that to gates?

10 Mechanisms behind the math If done with brute force, modulo 2^n-1 could have been ugly. –Would need to square and find the remainder via division. Luckily, for that specific computation, math is on our side: the 2^n-1 constraint saves us from division, as will be seen. A quick search produced inspiration. Taken from “Efficient VLSI Implementation of Modulo (2^n ± 1) Addition and Multiplication” Reto Zimmermann Swiss Federal Institute of Technology (ETH)

11 Useful Math: Multiplication Just like any other multiplication, a modulo multiplication can be computed by (modulo) summing the partial products. So modulo multiplication is multiplication using a modulo adder.
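A minimal C sketch of that idea: the product is built by modulo-summing one partial product per set bit of x. The names add_mod and mul_mod are hypothetical, and the sketch works for any modulus below 2^31, not only 2^n - 1.

```c
#include <assert.h>
#include <stdint.h>

/* (a + b) mod m, assuming a < m and b < m. */
static uint32_t add_mod(uint32_t a, uint32_t b, uint32_t m)
{
    uint32_t s = a + b;
    return (s >= m) ? s - m : s;
}

/* x * y mod m, computed by modulo-summing the partial products y * 2^i. */
uint32_t mul_mod(uint32_t x, uint32_t y, uint32_t m)
{
    assert(m > 0 && m < (1u << 31) && x < m && y < m);
    uint32_t acc = 0;
    uint32_t pp = y;                     /* partial product y * 2^i mod m */
    while (x != 0) {
        if (x & 1u)
            acc = add_mod(acc, pp, m);   /* bit i of x is set: add it in */
        pp = add_mod(pp, pp, m);         /* next partial product: double */
        x >>= 1;
    }
    return acc;
}
```

For example, mul_mod(14, 14, 127) accumulates 28 + 56 + 112 with a modulo adder and ends at 69, which is 196 mod 127.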

12 Useful Math: The Modulo Adder The more logic driven math that is the basis of our modulo adder.
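The formulas from this slide did not survive in the transcript. As a hedged sketch, a common way to add modulo 2^n - 1 is end-around carry: the carry out of bit n-1 is fed back in at bit 0, and the all-ones pattern is treated as zero. The function name add_mod_mersenne and the width limit are assumptions for this example.

```c
#include <assert.h>
#include <stdint.h>

/* (a + b) mod (2^n - 1) using end-around carry.
 * Inputs are n-bit values; all-ones is accepted as a second encoding of 0. */
uint32_t add_mod_mersenne(uint32_t a, uint32_t b, unsigned n)
{
    assert(n >= 2 && n <= 16);
    uint32_t mask = (1u << n) - 1;       /* also the modulus 2^n - 1 */
    assert(a <= mask && b <= mask);

    uint32_t sum = a + b;                /* at most n + 1 bits wide */
    sum = (sum & mask) + (sum >> n);     /* fold the carry-out back into bit 0 */
    return (sum == mask) ? 0 : sum;      /* normalize all-ones to zero */
}
```

The fold works because 2^n leaves remainder 1 when divided by 2^n - 1, so a carry out of the top bit is worth exactly 1. No divider or comparator against the modulus is needed, only an adder and an all-ones check.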

13 Last Bits: Modulo Reduction At various points, such as when finding the partial product, the result has to be reduced. There is a nifty way to do that as well.
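The slide's own derivation is not in the transcript, so the following is an assumption about the kind of trick meant: since 2^n is congruent to 1 modulo 2^n - 1, a wide value can be reduced by repeatedly adding its high bits onto its low n bits. The name reduce_mersenne is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* x mod (2^n - 1) without division: fold the high bits onto the low n bits
 * until the value fits in n bits, then map all-ones to zero. */
uint32_t reduce_mersenne(uint64_t x, unsigned n)
{
    assert(n >= 2 && n <= 31);
    uint64_t mask = ((uint64_t)1 << n) - 1;
    while (x > mask) {
        /* x = high * 2^n + low, and 2^n = 1 (mod 2^n - 1), so x = high + low. */
        x = (x & mask) + (x >> n);
    }
    return (x == mask) ? 0 : (uint32_t)x;
}
```

For instance, 196 splits into high = 1 and low = 68 for n = 7, giving 69 after a single fold.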

14 Block Diagram [Figure: top-level block diagram with blocks Mod Calc, Mod Multiply, Count, Subtract 2, Register (r4), Compare, and FSM; signals P, Out, start, done, and clk.]

15 Mod Multiply Block Diagram [Figure: blocks Mod add, Register, Counter, Next Partial Product, and FSM; signals clk, P, "to sub 2", and "from register".]

16 Block Diagram [Figure: top-level block diagram with blocks Mod Calc, Mod Multiply, Count, Subtract 2, Register (r2), Compare, and FSM; signals P, Out, start, done, and clk.]

17 Design Process The Process So far: - Found Mathematical Means (core algorithm) - Found Computational Means (modulo multiplier, adder) From the above, a high-level C program was written in a manner that would easily translate to Verilog and gates, or at least to more standard operations:
int mod_square_minus(int value, int p, int offset) {
    int acc, i;
    int mod = (1 << p) - 1;
    /* Walk the bits of value; each set bit i contributes value * 2^i. */
    for (acc = offset, i = 0; i < (int)(sizeof(int) * 8 - 1); i++) {
        int a = (value >> i) & 1;
        int temp;
        if (a) {
            /* Bits shifted past position p wrap around to the bottom. */
            if (i - p > 0) temp = value << (i - p);
            else temp = value >> (p - i);
            acc = acc + temp + ((value << i) & ((1 << p) - 1));
        }
        /* Keep the running sum below the modulus. */
        if (acc >= mod) acc = acc - mod;
    }
    return acc;
}
This easily translated into behavioral Verilog, and was readily turned into a gate-level implementation. Essentially it was written in a more low-level manner.
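To show how a routine of this kind fits into the full test, here is a hedged sketch that wraps a partial-product squaring step in the Lucas-Lehmer loop. It restates the idea rather than reproducing the team's code: the names square_plus_offset and lucas_lehmer_trace, the unsigned types, and the p <= 15 limit are assumptions. Passing an offset of mod - 2 makes each step compute S * S - 2.

```c
#include <assert.h>
#include <stdint.h>

/* value * value + offset mod (2^p - 1), summing one rotated partial
 * product per set bit of value. Assumes value < 2^p - 1 and offset < 2^p - 1. */
static uint32_t square_plus_offset(uint32_t value, unsigned p, uint32_t offset)
{
    uint32_t mod = (1u << p) - 1;
    uint32_t acc = offset;
    for (unsigned i = 0; i < p; i++) {
        if ((value >> i) & 1u) {
            uint32_t high = value >> (p - i);    /* bits that wrap around */
            uint32_t low = (value << i) & mod;   /* bits that stay in range */
            acc += high + low;
        }
        if (acc >= mod)
            acc -= mod;                          /* keep acc below the modulus */
    }
    return acc;
}

/* Stores S_0 .. S_(p-2) in seq (which needs room for p - 1 entries)
 * and returns 1 if 2^p - 1 is prime, 0 otherwise. */
int lucas_lehmer_trace(unsigned p, uint32_t *seq)
{
    assert(p > 2 && p <= 15);
    uint32_t mod = (1u << p) - 1;
    uint32_t s = 4 % mod;
    seq[0] = s;
    for (unsigned n = 1; n <= p - 2; n++) {
        s = square_plus_offset(s, p, mod - 2);   /* S_n = S_(n-1)^2 - 2 */
        seq[n] = s;
    }
    return s == 0;
}
```

With p = 7 the trace is 4, 14, 67, 42, 111, 0. The whole test needs only shifts, masks, additions, and one conditional subtraction per bit.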

18 Design Process The rest of the design can simply be thought of as a wrapper for the modulo multiplier. The following slides contain Verilog code that was directly derived from the C code above. Top level of multiplier:
module mod_mult(out, itrCount, x, y, mod, p, reset, en, clk);
  input [15:0] x, y, mod, p;
  output [15:0] out;
  input reset, en, clk;
  wire [15:0] pp, ma0, temp;
  output [3:0] itrCount;
  counter mycount(itrCount, reset, en, clk);
  partial_product ppg(pp, x, y, itrCount, mod, p);
  mod_add modAdder(out, pp, temp, mod);
  dff_16_lp partial(clk, out, temp, reset, en);
endmodule

19 Partial Product Unit w/ modulo reduction:
module partial_product(out, x, y, i, mod, p);
  output [15:0] out;
  input [15:0] x, y, mod, p;
  input [3:0] i;
  wire [15:0] diff1, diff2, added, result, corrected, final;
  wire [15:0] high, low, shifted, toadd;
  wire cout1, cout2, ithbit, toobig;
  sub_16 difference1(diff1, cout1, {12'b0, i}, p);
  sub_16 difference2(diff2, cout2, p, {12'b0, i});
  shift_left shiftL(high, y, diff1[3:0]);
  shift_right shiftR(low, y, diff2[3:0]);
  mux16 choose(high, low, shifted, cout1);
  shift_left shiftL2(toadd, y, i);
  and16 bigand(added, toadd, mod);
  fulladder_16 addhighlow(.out(result), .xin(added), .yin(shifted), .cin({1'b0}), .cout(nowhere));
  sub_16 correct(.out(corrected), .cout(toobig), .xin(mod), .yin(result));
  mux16 correctionMux(.out(final), .high(corrected), .low(result), .sel(toobig));
  shift_right ibit({15'b0, ithbit}, x, i);
  select16 checkfor0(.out(out), .x(result), .sel(ithbit));
endmodule

20 Modulo Adder:
module mod_add(out, x, y, mod);
  input [15:0] x, y, mod;
  output [15:0] out;
  wire cout, isDouble, cin;
  wire [15:0] plus, lowbits, done, mod_bar, check;
  fulladder_16 add(.out(plus), .xin(x), .yin(y), .cin(cin), .cout());
  invert_16 inverter(mod_bar, mod);
  and16 hihnbits(check, plus, mod_bar);
  and16 lownbits(done, plus, mod);
  or8 (cin, check[0], check[1], check[2], check[3], check[4], check[5], check[6], check[7], check[8], check[9], check[10], check[11], check[12], check[13], check[14], check[15]);
  compare_16 checkfordouble(isDouble, done, 16'b1111_1111_1111_1111);
  mux16 fixdouble(.out(out), .high(16'b0), .low(done), .sel(isDouble));
endmodule

21 Final Design Process Notes Lessons learned: Never tweak the schematics without retesting the Verilog first. Considering total time spent during this phase, roughly half was on the “core” and the FSM, the rest on the “wrapper”.

22 Road to verification: C Two examples of the high-level C implementation:
Tyrion:~/Desktop/15525 nstohs$ ./prime4 7
round 1: (4 * 4 - 2) mod 127 = 14
round 2: (14 * 14 - 2) mod 127 = 67
round 3: (67 * 67 - 2) mod 127 = 42
round 4: (42 * 42 - 2) mod 127 = 111
round 5: (111 * 111 - 2) mod 127 = 0
2^7-1 is prime
Tyrion:~/Desktop/15525 nstohs$ ./prime4 11
round 1: (4 * 4 - 2) mod 2047 = 14
round 2: (14 * 14 - 2) mod 2047 = 194
round 3: (194 * 194 - 2) mod 2047 = 788
round 4: (788 * 788 - 2) mod 2047 = 701
round 5: (701 * 701 - 2) mod 2047 = 119
round 6: (119 * 119 - 2) mod 2047 = 1877
round 7: (1877 * 1877 - 2) mod 2047 = 240
round 8: (240 * 240 - 2) mod 2047 = 282
round 9: (282 * 282 - 2) mod 2047 = 1736
2^11-1 is not prime

23 Road to verification: Verilog Samples of Verilog verification output:
Partial Product Unit, p = 7
ppOut= 56, x= 14, y= 14, i= 2, mod= 127, p= 7
ppOut= 112, x= 14, y= 14, i= 3, mod= 127, p= 7
ppOut= 0, x= 14, y= 14, i= 4, mod= 127, p= 7
ppOut= 0, x= 14, y= 14, i= 5, mod= 127, p= 7
Top Level, p = 7
itrOut= x
itrOut= 4
itrOut= 14
itrOut= 67
itrOut= 42
itrOut= 111
itrOut= 0
Top Level, p = 11
itrOut= x
itrOut= 4
itrOut= 14
itrOut= 194
itrOut= 788
itrOut= 701
itrOut= 119
itrOut= 1877
…
Tests were either specific tests on important units such as Partial_Product … our top level tests. Note that these are the same results generated from the C code.

24 Road to verification: Schematic I Schematic Test of our modular adder Mod 127 = 69

25 Road to verification: Schematic II Plot of the top level output after a single iteration, p=7 Output after a single iteration is 14, the expected value.

26 Road to verification: Schematic III The simulation outputs after a full run, showing the results of all iterations. Simulations start taking a long time. More on that later.

27 Road to verification: Intermission Disk Space required for a full-length schematic test of p=7 : 6 GB Time required for a full-length schematic test of p=7 : 4 hours Disk Space required for a full-length extracted test of p=7 : more Time required for a full-length extracted test of p=7 : longer Disk Space required for a full-length extractedRC test of p=7 : 1 iPod Time required for a full-length extractedRC test of p=7 : T_T Simulations become very demanding and lengthy due to tests needing to be “deep” to be useful. To meet such demands, be sure to use Genuine AMD© CPUs.

28 Road to verification: Layout I 3 words: “the net-lists match” Of course, there is far more to be concerned about. Due to simulator issues, layout simulations were delayed on some major modules. Partial Product Sims In Progress (I Hope)

29 Road to verification: Layout II Top Level layout Sims in Progress

30 Road to verification: Timing Layout Timing Sims in progress Pathmill was useful to help us gauge our critical path, which is one cycle through our modulo multiplier. When run on the top level, a critical path of ns was found. This was in the ballpark relative to our research.

31 Issues extractedRC of partial_product module Registers switch Switching from parallel calculations to series –Transistor count vs. clock cycles Syncing up design between people –Transferring files –Different design styles LONG simulation times Floorplanning –Too much emphasis on aspect ratios and not enough on wiring –Couldn’t decide on one set floorplan

32 Floorplan v1.0 Prime Logic Mod Multiplier Mod Adder FSM Memory

33 Floorplan v2.0

34 Floorplan v3.0

35 Floorplan v4.0

36 Floorplan v5.0

37 Final Floorplan

38 Pin Specs
Pin     Type     # of Pins
Vdd!    In/Out   1
Gnd!    In/Out   1
p       In       16
clk     In       1
start   In       1
Done    Out      1
out     Out      1
Total   -        22

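39 Initial Part Specs (area values are truncated in the transcript; the FSM and compare entries and the per-module densities are missing)
Module            Transistor Count   Area (µm²)   Transistor Density
FSM
mod_p             2,440              7,…
mod_add           1,282              9,…
partial_product   8,676              65,…
count             1,656              6,…
sub_16            704                3,…
Registers         1,848              6,…
compare
Total             16,942             97,700       0.17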

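40 Final Part Specs (area values are truncated in the transcript; the compare entry, the per-module densities, and the Aspect Ratio column are missing)
Module            Transistor Count   Area (µm²)   Transistor Density   Aspect Ratio
FSM               152                1,…
mod_p             1,280              8,…
mod_add           1,168              5,…
partial_product   7,520              54,…
count             1,424              8,…
sub_16            576                2,…
Registers         896                6,…
compare
Total             13,702             86,…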

41 Chip Specs Transistor Count: 13,702 Size: µm x µm Area: 86,621µm² Aspect Ratio: 1.01:1 Density: 0.16 transistors/µm²

42 Final Floorplan

43 Final Floorplan

44 Poly Layer Density: 7.14%

45 Active Layer Density: 8.76%

46 Metal1 Layer Density: 23.86%

47 Metal2 Layer Density: 19.97%

48 Metal3 Layer Density: 11.30%

49 Metal4 Layer Density: 10.34%

50 Conclusions Plan for buffers –Can’t put them in after the fact Your design will change dramatically from start to finish so be flexible Communication is key Do layout in parallel