CSE241A VLSI Digital Circuits Winter 2003 Recitation 2


1 CSE241A VLSI Digital Circuits Winter 2003 Recitation 2
CSE241A VLSI Digital Circuits Winter Recitation 2.5: Performance Coding

2 Introduction: Performance Coding
Overview Critical Paths Hierarchy RTL Operators Multiplexers Parallelism Order

3 Overview: Why Code?
Motivation
  Increase speed!
  Reduce cycle times
  Build portability

4 Critical Paths
Identify critical signals
  Slow signals, slow paths
Reduce logic path depth
  Fewer gates in path = higher clock rate
Connect critical net closest to output
  Decreases overall delay for the function
[Figure: two gate networks with inputs ordered a, b, c, e, d and a, b, e, d, c]
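The "critical net closest to output" advice can be sketched as follows. This is only an illustration: the module name, the choice of AND gates, and the assumption that c is the late-arriving input are hypothetical, since the gate types in the original figure are not preserved. A synthesis tool is also free to restructure associative logic, so the written order is a hint rather than a guarantee.

```verilog
// Hypothetical example: c is assumed to be the late-arriving (critical) input.
module late_input_near_output (
    input  wire a, b, c, d, e,
    output wire y_slow,
    output wire y_fast
);
    // c enters at the second gate and still has two more gates to traverse.
    assign y_slow = (((a & b) & c) & e) & d;

    // c enters at the last gate, so only one gate delay follows its arrival.
    assign y_fast = (((a & b) & e) & d) & c;
endmodule
```

Both outputs compute the same function; only the position of c in the chain differs.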

5 Hierarchy
Block size
  Too many blocks increase delay (logic depth)
Too much hierarchy
  Signals have to traverse the whole hierarchy
  The synthesizer may be able to reduce this
  Synopsys “ungroup” command

6 Resource Sharing: Bad Example
Adders use a lot of resources
[Figure: two adders (A + B and C + D) feeding a MUX controlled by select, producing sum]
    if (select)
        sum <= a + b;
    else
        sum <= c + d;

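7 Resource Sharing: Good Example
Infer two muxes
[Figure: two muxes controlled by select choose between A/C and B/D and feed a single adder producing sum]
    if (select) begin
        tmp1 <= a;
        tmp2 <= b;
    end else begin
        tmp1 <= c;
        tmp2 <= d;
    end
    sum <= tmp1 + tmp2;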
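The shared-adder structure can also be written as a small combinational module. This is a minimal sketch, not the recitation's own code: the module name, the width parameter W, and the internal names op1 and op2 are assumptions made for illustration.

```verilog
// Hypothetical sketch: one adder shared between (a + b) and (c + d).
module shared_adder #(parameter W = 8) (
    input  wire         select,
    input  wire [W-1:0] a, b, c, d,
    output wire [W:0]   sum      // one extra bit keeps the carry out
);
    // Two muxes pick the operand pair first...
    wire [W-1:0] op1 = select ? a : c;
    wire [W-1:0] op2 = select ? b : d;

    // ...so only a single adder is described.
    assign sum = op1 + op2;
endmodule
```

The trade-off is that the muxes now sit in front of the adder, so the select signal is on the path into the adder rather than after it.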

8 Loops
Move operators outside loops
  i.e., move + (adder) outside the loop
  Will reduce the number of adders instantiated
Make sure:
  Critical signals are addressed
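A minimal sketch of hoisting an adder out of a loop, assuming a hypothetical module in which a loop-invariant sum (base + offset) is added to four packed bytes. All names here are illustrative. Writing data + base + offset inside the loop would describe two adders per iteration; computing the invariant part once describes one adder outside the loop plus one per iteration. Some synthesis tools may perform this sharing on their own.

```verilog
// Hypothetical example: loop-invariant addition moved outside the loop.
module loop_hoist (
    input  wire [7:0]  base,
    input  wire [7:0]  offset,
    input  wire [31:0] data,     // four packed 8-bit values
    output reg  [31:0] result
);
    reg [7:0] bias;
    integer   i;

    always @* begin
        bias = base + offset;                 // computed once, outside the loop
        for (i = 0; i < 4; i = i + 1)
            result[8*i +: 8] = data[8*i +: 8] + bias;  // one adder per byte
    end
endmodule
```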

9 Muxes
Instantiate muxes
  Don’t wait for the tool to infer muxes
  Create a gate-level mux and use it in code
If-then-else
  Might create unwanted logic
Use explicit case statements to infer muxes
  If not instantiating muxes
  Technology independence
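An explicit case statement that a synthesis tool would normally map to a 4:1 mux might look like the sketch below. The module name, port names, and width parameter are assumptions made for illustration.

```verilog
// Hypothetical 4:1 mux described with a fully specified case statement.
module mux4 #(parameter W = 8) (
    input  wire [1:0]   sel,
    input  wire [W-1:0] d0, d1, d2, d3,
    output reg  [W-1:0] y
);
    always @* begin
        case (sel)
            2'b00:   y = d0;
            2'b01:   y = d1;
            2'b10:   y = d2;
            default: y = d3;   // covers 2'b11 and avoids an inferred latch
        endcase
    end
endmodule
```

Because every value of sel assigns y, no priority chain or latch is implied.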

10 Parentheses
Use parentheses to optimize, e.g.:
    out = a + b + c + d;
  vs
    out = (a + b) + (c + d);
The first statement creates 3 adders in series
The second statement creates 2 adders in parallel, followed by a third
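The two groupings can be compared side by side in one module. This is a sketch with assumed names and an assumed width parameter; a synthesis tool may rebalance the chain itself, so the parentheses act as a structural hint.

```verilog
// Hypothetical comparison of a serial adder chain and a balanced adder tree.
module add4 #(parameter W = 8) (
    input  wire [W-1:0] a, b, c, d,
    output wire [W+1:0] out_chain,   // two extra bits hold the full sum
    output wire [W+1:0] out_tree
);
    // ((a + b) + c) + d : three adders in series, depth 3
    assign out_chain = a + b + c + d;

    // (a + b) and (c + d) in parallel, then one final adder, depth 2
    assign out_tree  = (a + b) + (c + d);
endmodule
```

Both outputs are numerically identical; only the described adder depth differs.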

11 Operators
* / %  Multiply, divide, and modulo
High cost of operators
  Will create extra non-optimized logic
Better to create design blocks
  Instantiate fast tree adder
  Instantiate Wallace tree
  Otherwise the synthesis tool will create an RCA (Ripple Carry Adder)

12 Adder example
Adder: carry look-ahead
    module Add_prop_gen (sum, c_out, a, b, c_in);
      // generic 4-bit carry look-ahead adder
      // behavioral model
      output [3:0] sum;
      output       c_out;
      input  [3:0] a, b;
      input        c_in;

      reg  [3:0] carrychain;
      wire [3:0] g = a & b;  // carry generate, continuous assignment, bitwise and
      wire [3:0] p = a ^ b;  // carry propagate, continuous assignment, bitwise xor

      always @(a or b or c_in)   // event "or"
        begin: carry_generation  // usage: block name
          integer i;
          #0 carrychain[0] = g[0] | (p[0] & c_in);  // eliminate race
          for (i = 1; i <= 3; i = i + 1)
            begin
              carrychain[i] = g[i] | (p[i] & carrychain[i-1]);
            end
        end

      wire [4:0] shiftedcarry = {carrychain, c_in};  // concatenation
      wire [3:0] sum   = p ^ shiftedcarry;           // summation
      wire       c_out = shiftedcarry[4];            // carry out bit select
    endmodule

13 Instantiate complex DesignWare parts
Fast DesignWare parts are inferred through simple HDL operators
Some high-performance parts require explicit instantiation

Part                 Number         Function      Delay  Area  Run-time
Multiply-Accumulate  DW02_mac       A*B+C
Sum-of-Products      DW02_prod_sum  A*B+C*D+…
Vector Adder         DW02_sum       A+B+C+D+…

14 Parallelizing
Use hierarchy
Like in hardware, use parallel connections
+ operator will create an RCA
Example: carry look-ahead adder
    C1 = G0 | P0 C0
    C2 = G1 | P1 G0 | P1 P0 C0
    C3 = G2 | P2 G1 | P2 P1 G0 | P2 P1 P0 C0
    C4 = G3 | P3 G2 | P3 P2 G1 | P3 P2 P1 G0 | P3 P2 P1 P0 C0
Only 4 gate delays vs 9 for the RCA
Only trade-off is more logic (more area)
Silicon has lots of real estate (to a point; think wiring)
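The flattened carry equations translate almost directly into continuous assignments. The sketch below assumes a hypothetical module name and port names; it uses G = a & b and P = a ^ b, as in the earlier adder example, so that P can also be reused for the sum bits.

```verilog
// Hypothetical 4-bit carry look-ahead adder built from the C1..C4 equations.
module cla4 (
    input  wire [3:0] a, b,
    input  wire       c0,
    output wire [3:0] sum,
    output wire       c4
);
    wire [3:0] g = a & b;   // generate
    wire [3:0] p = a ^ b;   // propagate

    // Every carry is a two-level function of g, p, and c0 only.
    wire c1 = g[0] | (p[0] & c0);
    wire c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0);
    wire c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
                   | (p[2] & p[1] & p[0] & c0);
    assign c4 = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
                     | (p[3] & p[2] & p[1] & g[0])
                     | (p[3] & p[2] & p[1] & p[0] & c0);

    // Each sum bit is propagate XOR the carry into that bit position.
    assign sum = p ^ {c3, c2, c1, c0};
endmodule
```

No carry depends on another carry, which is what removes the ripple; the cost is the wider AND and OR terms.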

15 If-then-else, Case
Use of if statement
  Introduces priority of nested “ifs”
Case has no order
Can mix case and if

Similar delay for all data signals:
    CASE state IS
      WHEN s1     => Z <= a;
      WHEN s2     => Z <= b;
      WHEN s3     => Z <= c;
      WHEN OTHERS => Z <= d;
    END CASE;

Speeds up signal s2:
    CASE state IS
      WHEN s1     => tmp <= a;
      WHEN s3     => tmp <= c;
      WHEN OTHERS => tmp <= d;
    END CASE;
    IF (state = s2) THEN
      Z <= b;
    ELSE
      Z <= tmp;
    END IF;
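For readers working in Verilog, the mixed case/if form can be sketched roughly as below. This is a translation under assumptions, not the slide's own code: the module name, the 2-bit state encoding, and the 8-bit data width are all hypothetical.

```verilog
// Hypothetical Verilog rendering of the mixed case/if structure.
module late_select (
    input  wire [1:0] state,
    input  wire [7:0] a, b, c, d,
    output reg  [7:0] z
);
    // Assumed state encoding, for illustration only.
    localparam S1 = 2'd0, S2 = 2'd1, S3 = 2'd2;

    reg [7:0] tmp;

    always @* begin
        // The non-critical choices are resolved first.
        case (state)
            S1:      tmp = a;
            S3:      tmp = c;
            default: tmp = d;
        endcase

        // The S2 data (b) enters at the final 2:1 mux, next to the output.
        if (state == S2)
            z = b;
        else
            z = tmp;
    end
endmodule
```

When state equals S2 the case falls into its default branch, but that value is overridden by the final if, so the behavior matches the four-way case.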

16 Order Dependency
Blocking vs. non-blocking assignments
  Use non-blocking assignments for sequential assignments such as pipelining and modeling of several mutually exclusive data transfers
  Blocking assignments within sequential processes may cause race conditions
  Non-blocking assignments are order independent
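Order independence can be illustrated with a small pipeline. In this sketch (module and signal names are assumptions), the three non-blocking assignments are deliberately written "backwards"; because every right-hand side is sampled before any register updates, the result is still a three-stage delay.

```verilog
// Hypothetical three-stage pipeline using non-blocking assignments.
module pipe3 #(parameter W = 8) (
    input  wire         clk,
    input  wire [W-1:0] d,
    output reg  [W-1:0] q
);
    reg [W-1:0] s1, s2;

    always @(posedge clk) begin
        // Any ordering of these three lines gives the same hardware.
        q  <= s2;
        s2 <= s1;
        s1 <= d;
    end
endmodule
```

With blocking assignments (=) written in the order s1, s2, q, the value of d would pass through all three variables in a single clock edge, collapsing the pipeline to one stage.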

