Reconfigurable Computing - FPGA structures John Morris Chung-Ang University The University of Auckland ‘Iolanthe’ at 13 knots on Cockburn Sound, Western.

Presentation transcript:

1 Reconfigurable Computing - FPGA structures John Morris Chung-Ang University The University of Auckland ‘Iolanthe’ at 13 knots on Cockburn Sound, Western Australia

2 FPGA Architectures
- Programmable logic takes many forms
- Originally, devices contained tens of gates and flip-flops
- These early devices were generally called PALs (Programmable Array Logic)
- A typical structure is shown in the figure
- With 10-20 inputs and outputs and ~20 flip-flops, they could implement small state machines and replace large amounts of discrete ‘glue’ logic
[Figure: programmable AND-OR array feeding flip-flops (FF); inputs (about 20) and outputs (about 20)]
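
The AND-OR array on this slide can be pictured with a small behavioural model. The sketch below is only an illustration in Python, not a description of any real PAL: the names `and_or_array` and `terms`, and the example programming, are assumptions made for this example. Each output is the OR of several product (AND) terms, and 'programming' the device amounts to choosing which literals appear in each term.

```python
from typing import Dict, List, Tuple

# A product term lists the literals that are ANDed together:
# (input_name, required_value). This encoding is an assumption for the sketch.
ProductTerm = List[Tuple[str, bool]]


def and_or_array(terms: List[ProductTerm], inputs: Dict[str, bool]) -> bool:
    """Evaluate one output of an AND-OR array: the OR of several AND terms."""
    return any(
        all(inputs[name] == wanted for name, wanted in term)
        for term in terms
    )


# Hypothetical programming: out = (a AND NOT b) OR (b AND c)
terms = [
    [("a", True), ("b", False)],
    [("b", True), ("c", True)],
]

print(and_or_array(terms, {"a": True, "b": False, "c": False}))
```

Feeding an output like this into a flip-flop, and the flip-flop back into the array as an input, is what lets such a device hold a small state machine.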

3 Programmable Logic
- Memory should also be included in the class of programmable logic!
- It finds application in LUTs, state machines, ...
- From early UV EPROMs with ~kbytes, we now have many styles of memory that retain values when power is removed, with capacities in Mbytes
Memory is an important consideration when designing reconfigurable systems. FPGA technology does not provide large amounts of memory, and this can be a constraint, especially if you are trying to produce a compact, single-chip solution to your problem!

4 Modern Programmable Logic
- As technology has evolved, so have programmable devices
- Today’s FPGAs contain
  - Millions of ‘gates’
  - Memory
  - Support for several I/O protocols - TTL, LVDS, GTL, …
  - Arithmetic units - adders, multipliers
  - Processor cores

5 FPGA Architecture
- The ‘core’ architecture of most modern FPGAs consists of
  - Logic blocks
  - Interconnection resources
  - I/O blocks

6 Typical FPGA Architecture
- Logic blocks embedded in a ‘sea’ of connection resources
[Figure legend: CLB = logic block, IOB = I/O buffer, PSM = programmable switch matrix]
This particular arrangement is similar to that in Xilinx 4000 (and onwards) chips. Devices from other manufacturers are similar in overall structure.

7 Logic Blocks
- Combination of
  - AND-OR array or Look-Up Table (LUT)
  - Flip-flops
  - Multiplexors
- General aim
  - Arbitrary Boolean function of several variables
  - Storage
- Designers try to estimate what combination of resources will produce the most efficient application -> circuit mappings
Xilinx 4000 (and on) CLB:
- 3 LUT blocks
- 2 flip-flops (asynchronous reset)
- Multiplexors
- Clock / reset lines
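
To see why a LUT can realise an arbitrary Boolean function of several variables, it may help to model one as a tiny memory addressed by its inputs. The Python sketch below is illustrative only; `LUT` and `program_lut` are hypothetical names, and the bit ordering (first input is the most significant address bit) is an arbitrary choice for this example.

```python
from typing import Callable, List


class LUT:
    """k-input look-up table: the stored truth table is the 'configuration'."""

    def __init__(self, k: int, truth_table: List[int]):
        if len(truth_table) != 2 ** k:
            raise ValueError("a k-input LUT needs exactly 2**k entries")
        self.k = k
        self.table = list(truth_table)

    def __call__(self, *bits: int) -> int:
        if len(bits) != self.k:
            raise ValueError("wrong number of inputs")
        index = 0
        for bit in bits:  # first input becomes the most significant address bit
            index = (index << 1) | (bit & 1)
        return self.table[index]


def program_lut(k: int, func: Callable[..., int]) -> LUT:
    """Fill the table by evaluating func on every input combination."""
    table = []
    for i in range(2 ** k):
        args = [(i >> (k - 1 - j)) & 1 for j in range(k)]
        table.append(func(*args) & 1)
    return LUT(k, table)


# Two different functions, same hardware: only the table contents change.
xor3 = program_lut(3, lambda a, b, c: a ^ b ^ c)
majority = program_lut(3, lambda a, b, c: (a & b) | (b & c) | (a & c))
print(xor3(1, 0, 1), majority(1, 1, 0))
```

The two example functions happen to be the sum and carry outputs of a full adder, which is one reason a pair of LUTs in a logic block can hold one adder stage.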

8 Adders
- Adders appear in most designs
  - Arithmetic adders (including subtracters)
  - Other arithmetic operators, eg multipliers, dividers
  - Counters (including program counters in processors)
  - Incrementors, decrementors, etc
- They also often appear on the critical path
  - Adder performance can be crucial for system performance
- Because of their importance, researchers are still searching for better ways to add!
- Adder structures proposed already
  - Ripple carry
  - Carry select
  - Carry skip
  - Carry look-ahead
  - Manchester
  - … and several dozen more variants

9 Ripple Carry Adder
- The simplest and best-known adder
- How long does it take an n-bit adder to produce a result?
  - n × propagation delay (FA: (a or b) -> carry)
- We can do better than this, using one of many known better structures
- But what are the advantages of a ripple carry adder?
  - Small
  - Regular
  - Fits easily into a 2-D layout! Very important in packing circuitry into the fixed 2-D layout of an FPGA!
[Figure: chain of n full adders (FA). Stage i takes a_i, b_i and c_in and produces s_i and c_out; each c_out feeds the c_in of the next stage, from FA_0 up to FA_n-1, which produces the carry out]
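
The ripple carry structure can be written out bit by bit. The following Python sketch is a behavioural model only (function names such as `full_adder` and `ripple_carry_add` are chosen for this example); it shows that each stage has to wait for the carry of the stage before it, which is where the n × propagation delay comes from.

```python
from typing import List, Tuple


def full_adder(a: int, b: int, c_in: int) -> Tuple[int, int]:
    """One FA stage: returns (sum bit, carry out)."""
    s = a ^ b ^ c_in
    c_out = (a & b) | (c_in & (a ^ b))
    return s, c_out


def ripple_carry_add(a_bits: List[int], b_bits: List[int],
                     c_in: int = 0) -> Tuple[List[int], int]:
    """Add two equal-length bit lists (least significant bit first)."""
    sum_bits = []
    carry = c_in
    for a, b in zip(a_bits, b_bits):
        # Stage i cannot finish until the carry from stage i-1 has arrived.
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(value: int, n: int) -> List[int]:
    """n-bit representation of value, least significant bit first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits: List[int]) -> int:
    return sum(bit << i for i, bit in enumerate(bits))


s, c_out = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
print(from_bits(s), c_out)  # 11 + 6 = 17, ie 4-bit sum 1 with carry out 1
```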

10 Ripple Carry Adders
- Ripple carry adder performance is limited by propagation of carries
- On an FPGA, the carry link between logic blocks is often the major source of time delay, because only one or two FA blocks will often fit in a logic block (LB)!
- Connections within a logic block are fast!
- Connections between logic blocks are slower
[Figure: ripple carry chain of full adders (FA_0 to FA_n-1) grouped into logic blocks (LB), with the carry link between two logic blocks highlighted]

11 ‘Fast Carry’ Logic
- Critical delay
  - Transmission of carry out from one logic block to the next
- Solution (most modern FPGAs)
  - ‘Fast carry’ logic
  - Special paths between logic blocks used specifically for carry out
  - Very fast ripple carry adders!
- More sophisticated adders?
  - Carry select
    - Uses ripple carry blocks, so can use fast carry logic
    - Should be faster for wide datapaths?
  - Carry lookahead
    - Uses large amounts of logic and multiple logic blocks
    - Hard to make it faster for small adders!

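12 Carry Select Adder
- Here we build an 8-bit adder from 4-bit blocks
- ‘Standard’ n-bit ripple carry adders, n = any suitable value
[Figure: carry select adder. One n-bit ripple carry adder takes a_0-3, b_0-3 and cin and produces sum_0-3 and cout_3. Two more n-bit ripple carry adders take a_4-7 and b_4-7, one with carry in 0 (producing sum0_4-7 and cout0_7) and one with carry in 1 (producing sum1_4-7 and cout1_7). Multiplexors then select sum_4-7 and the carry]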

13 Carry Select Adder
- The first block adds the 4 low order bits
  - After 4*t_pd it will produce a carry out
- The other two blocks ‘speculate’ on the value of cout_3
  - One assumes it will be 0, the other assumes 1
[Figure: carry select adder as on slide 12, with the low order block and the two speculative blocks annotated]

14 Carry Select Adder
- The first block adds the 4 low order bits
  - After 4*t_pd it will produce a carry out
- After 4*t_pd we will have:
  - sum_0-3 (final sum bits)
  - cout_3 (from low order block)
  - sum0_4-7, cout0_7 (from block assuming 0 c_in)
  - sum1_4-7, cout1_7 (from block assuming 1 c_in)
[Figure: carry select adder as on slide 12]

15 Carry Select Adder
- cout_3 selects the correct sum_4-7 and carry out
- All 8 bits + carry are available after 4*t_pd(FA) + t_pd(multiplexor)
[Figure: carry select adder as on slide 12, with the multiplexors controlled by cout_3 highlighted]
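
The timing argument on this slide can be turned into a rough delay estimate. The Python sketch below is a back-of-envelope model under stated assumptions: `t_fa` and `t_mux` are hypothetical delay values in arbitrary units, all blocks are the same size, and for more than two blocks the select signal is assumed to pass through one multiplexor per additional block (this extension is a common textbook simplification, not something stated on the slides).

```python
def ripple_delay(n_bits: int, t_fa: float) -> float:
    """Worst-case delay of an n-bit ripple carry adder: n FA delays."""
    return n_bits * t_fa


def carry_select_delay(n_bits: int, block: int,
                       t_fa: float, t_mux: float) -> float:
    """Estimated delay of a uniform carry select adder.

    All blocks ripple in parallel (block * t_fa); the carry then selects
    its way through one multiplexor per block after the first.
    """
    if n_bits % block != 0:
        raise ValueError("n_bits must be a multiple of the block size")
    n_blocks = n_bits // block
    return block * t_fa + (n_blocks - 1) * t_mux


# Hypothetical delays: 1.0 unit per full adder, 0.5 units per multiplexor.
print(ripple_delay(8, 1.0))                # 8 FA delays
print(carry_select_delay(8, 4, 1.0, 0.5))  # 4 FA delays + 1 multiplexor delay
```

For the 8-bit, 4-bit-block case this reproduces the slide's 4*t_pd(FA) + t_pd(multiplexor); real figures on an FPGA also depend on routing delays, which this model ignores.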

16 Carry Select Adder
- This scheme can be generalized to any number of bits
- Select a suitable block size (eg 4, 8)
- Replicate all blocks except the first
  - One with c_in = 0
  - One with c_in = 1
- Use final c_out from preceding block to select correct set of outputs for current block
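
The generalized scheme can be sketched as a behavioural model. In the Python below, `carry_select_add` and `ripple_block` are names invented for this example, and each ripple carry block is modelled with ordinary integer addition rather than individual full adders; the point is only to show the replicate-and-select structure.

```python
from typing import Tuple


def ripple_block(a: int, b: int, c_in: int, width: int) -> Tuple[int, int]:
    """Model of one width-bit ripple carry block: returns (sum, carry out)."""
    total = a + b + c_in
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a: int, b: int, n_bits: int,
                     block: int = 4, c_in: int = 0) -> Tuple[int, int]:
    """Add two n_bits-wide values using carry select blocks."""
    if n_bits % block != 0:
        raise ValueError("n_bits must be a multiple of the block size")
    mask = (1 << block) - 1
    result = 0
    carry = c_in
    for shift in range(0, n_bits, block):
        a_blk = (a >> shift) & mask
        b_blk = (b >> shift) & mask
        if shift == 0:
            # First block is not replicated: its carry in is known.
            s, carry = ripple_block(a_blk, b_blk, carry, block)
        else:
            # Replicated blocks: in hardware both run in parallel.
            s0, c0 = ripple_block(a_blk, b_blk, 0, block)
            s1, c1 = ripple_block(a_blk, b_blk, 1, block)
            # The preceding block's carry out acts as the multiplexor select.
            s, carry = (s1, c1) if carry else (s0, c0)
        result |= s << shift
    return result, carry


print(carry_select_add(0b10111001, 0b01100111, 8))
```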

17 Fast Adders
- Many other fast adder schemes have been proposed, eg
  - Carry-skip
  - Manchester
  - Carry-save
  - Carry look-ahead
- If implementing an adder (eg in programmable logic), do a little research first!

18 Fast Adders
- Challenge: What style of adder is fastest / most compact for any FPGA technology?
- Answer is not simple
  - For small adders (n < ?), fast carry logic will certainly make a simple ripple carry adder fastest
    - It will also use the minimum resources, but will need to be laid out as a column or row
  - For larger adders (? < n < ?), carry select styles are likely to be best
    - They use ripple carry blocks efficiently
  - For very large adders (n > ?), a carry look-ahead adder may be faster?
    - But it will use considerably more resources!

19 Exploiting a manufacturer’s fast carry logic
- To use the Altera fast carry logic, write your adder like this:

    LIBRARY ieee;
    USE ieee.std_logic_1164.all;
    LIBRARY lpm;
    USE lpm.lpm_components.all;

    ENTITY adder IS
      PORT ( c_in  : IN  STD_LOGIC;
             a, b  : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);
             sum   : OUT STD_LOGIC_VECTOR(15 DOWNTO 0);
             c_out : OUT STD_LOGIC );
    END adder;

    ARCHITECTURE lpm_structure OF adder IS
    BEGIN
      -- Instantiate the library adder so the tools can map it onto the carry chain
      instance: lpm_add_sub
        GENERIC MAP ( LPM_WIDTH => 16 )
        PORT MAP ( cin    => c_in,
                   dataa  => a,
                   datab  => b,
                   result => sum,
                   cout   => c_out );
    END lpm_structure;

20 What about that carry in?
- In an ALU, we usually need to do more than just add!
- Subtractions are common also
- Observe
  - c = a - b is equivalent to
  - c = a + (-b)
- So we can use an adder for subtractions if we can negate the 2nd operand
- Negation in 2’s complement arithmetic?

21 Adder / Subtractor
- Negation in 2’s complement arithmetic?
- Rule:
  - Complement each bit
  - Add 1
- eg
                  Binary   Decimal
                  0001      1
    Complement    1110
    Add 1         1111     -1

                  0110      6
    Complement    1001
    Add 1         1010     -6
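
The complement-and-add-1 rule can be checked with a few lines of Python. This is only a sketch; `negate` and `to_signed` are helper names chosen here, and the word width is passed explicitly because Python integers are not fixed-width.

```python
def negate(value: int, n_bits: int) -> int:
    """2's complement negation: complement each bit, then add 1."""
    mask = (1 << n_bits) - 1
    complemented = value ^ mask          # flip every bit
    return (complemented + 1) & mask     # add 1, discard any overflow


def to_signed(value: int, n_bits: int) -> int:
    """Interpret an n_bits-wide pattern as a 2's complement number."""
    sign_bit = 1 << (n_bits - 1)
    return value - (1 << n_bits) if value & sign_bit else value


for pattern in (0b0001, 0b0110):
    neg = negate(pattern, 4)
    print(f"{pattern:04b} -> {neg:04b} ({to_signed(neg, 4)})")
```

Run on the slide's two examples, this gives 1111 (-1) for 0001 and 1010 (-6) for 0110.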

22 Adder / Subtractor
- Using an adder
  - Complement each bit using an inverter
  - Use the carry in to add 1!
[Figure: full adder (FA) with inputs a, b and c_in and outputs c and carry; a 0/1 multiplexor is controlled by the add/subtract signal]
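
Putting the two ideas together gives a combined adder / subtractor. The Python sketch below is a behavioural model under assumptions made for this example: the control signal is called `subtract`, a value of 1 is taken to mean subtract, and the inverter-plus-multiplexor on the b input is modelled as an XOR with the control signal, which has the same effect.

```python
from typing import Tuple


def add_sub(a: int, b: int, subtract: bool, n_bits: int) -> Tuple[int, int]:
    """Compute a + b or a - b on one adder; returns (result, carry out)."""
    mask = (1 << n_bits) - 1
    # Select b or its bitwise complement (inverter + multiplexor).
    b_in = (b & mask) ^ (mask if subtract else 0)
    # The same control signal drives the carry in, supplying the '+ 1'.
    c_in = 1 if subtract else 0
    total = (a & mask) + b_in + c_in
    return total & mask, (total >> n_bits) & 1


print(add_sub(7, 5, False, 4))  # 7 + 5
print(add_sub(7, 5, True, 4))   # 7 - 5
print(add_sub(5, 7, True, 4))   # 5 - 7, result is the 4-bit pattern for -2
```

No extra adder is needed for the '+ 1': the otherwise unused carry in of the least significant stage provides it.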

23 Example - Generate

    LIBRARY ieee;
    USE ieee.std_logic_1164.all;

    ENTITY adder IS
      GENERIC ( n : INTEGER := 16 );
      PORT ( c_in  : IN  std_ulogic;
             a, b  : IN  std_ulogic_vector(n-1 DOWNTO 0);
             sum   : OUT std_ulogic_vector(n-1 DOWNTO 0);
             c_out : OUT std_ulogic );
    END adder;

    ARCHITECTURE rc_structure OF adder IS
      -- Internal carries between stages; same type as the fulladd ports
      SIGNAL c : std_ulogic_vector(1 TO n-1);
      COMPONENT fulladd
        PORT ( c_in, x, y : IN  std_ulogic;
               s, c_out   : OUT std_ulogic );
      END COMPONENT;
    BEGIN
      -- First stage takes the external carry in
      FA_0: fulladd PORT MAP ( c_in => c_in, x => a(0), y => b(0),
                               s => sum(0), c_out => c(1) );
      -- Middle stages are replicated by the GENERATE statement
      G_1: FOR i IN 1 TO n-2 GENERATE
        FA_i: fulladd PORT MAP ( c(i), a(i), b(i), sum(i), c(i+1) );
      END GENERATE;
      -- Last stage drives the external carry out
      FA_n: fulladd PORT MAP ( c(n-1), a(n-1), b(n-1), sum(n-1), c_out );
    END rc_structure;

24 IEEE 1164 standard logic package
- Bus pull-up and pull-down resistors can be ‘inserted’
  - Initialise a bus signal to 'H' or 'L':

        SIGNAL not_ready : std_logic := 'H';

  - '0' or '1' from any driver will override the weak 'H' or 'L':

        IF seek_finished = '1' THEN
          not_ready <= '0';
        END IF;

[Figure: /ready bus line pulled up to V_DD through a 10k resistor and shared by DeviceA, DeviceB and DeviceC]
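
The way a forcing '0' or '1' overrides a weak 'H' or 'L' comes from the resolution function that std_logic applies when a signal has several drivers. The Python sketch below models that behaviour; the table is transcribed from memory of the std_logic_1164 package and should be checked against the standard before being relied on, and the names `RESOLUTION` and `resolve` are chosen for this example.

```python
from typing import Sequence

STD_LOGIC = "UX01ZWLH-"  # the nine std_ulogic values, in declaration order

# RESOLUTION[row][col]: result of combining driver value row with value col.
RESOLUTION = [
    "UUUUUUUUU",  # U
    "UXXXXXXXX",  # X
    "UX0X0000X",  # 0
    "UXX11111X",  # 1
    "UX01ZWLHX",  # Z
    "UX01WWWWX",  # W
    "UX01LWLWX",  # L
    "UX01HWWHX",  # H
    "UXXXXXXXX",  # -
]


def resolve(drivers: Sequence[str]) -> str:
    """Combine all driver values on one signal into a single value."""
    if len(drivers) == 1:
        return drivers[0]
    result = "Z"  # weakest value: an undriven line
    for value in drivers:
        row = STD_LOGIC.index(result)
        col = STD_LOGIC.index(value)
        result = RESOLUTION[row][col]
    return result


print(resolve(["H", "Z", "Z"]))  # only the pull-up is active
print(resolve(["H", "0", "Z"]))  # one device drives the line low
print(resolve(["0", "1"]))       # two forcing drivers in conflict
```

With only the pull-up active the line reads 'H'; as soon as any device drives '0' the line resolves to '0', which is the behaviour the slide's not_ready example relies on.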

