Published by Jeremiah Leblanc. Modified over 2 years ago.

Slide: 1 Synthesis in EDA Flow
by: Saikat Bandyopadhyay
© Interra Systems India Pvt Ltd (Interra confidential)

Slide: 2 Content
- Defining Synthesis
- History
- IC Design Flow
- Synthesis Flow
- Analysis and Elaboration
- Synthesis
- Scheduling and Allocation
- Optimization
- Technology Mapping
- Synthesis Goals and Constraints
- Synthesizing Big Design
- Variations in Synthesis
- Q and A

Slide: 3 Defining Synthesis
- Synthesis is the conversion of a high-level hardware description to a gate-level hardware description.
- Levels of hardware description, from lowest to highest abstraction:
  - Gate level
  - Data Flow level
  - RTL level
  - Behavioural level

Slide: 4 Gate Level
- The hardware is described purely in terms of nets connecting the pins of gate instances and ports.
- Example: a 2-input mux built from gate-level components.

    module select(out, s, a, b);
      output out;
      input  s, a, b;
      wire   s_bar, t1, t2;
      INT_NOT  u1 (s_bar, s);      // s_bar = !s
      INT_AND2 u2 (t1, a, s);      // t1 = a & s
      INT_AND2 u3 (t2, b, s_bar);  // t2 = b & s_bar
      INT_OR2  u4 (out, t1, t2);   // out = t1 | t2
    endmodule

Slide: 5 Data Flow Level
- Gate level plus assign statements.
- Normally used to represent combinational circuits.
- Can represent sequential circuits if used with an instance of a latch or flip-flop.
- Example: computes the absolute value.

    module abs (out, in);
      output [7:0] out;
      input  [7:0] in;
      wire   [7:0] twosCIn;
      assign twosCIn = ~in + 1;              // two's complement of in
      assign out = in[7] ? twosCIn : in;     // negate only when the sign bit is set
    endmodule

Slide: 6 RTL Level
- Explicit clock and state machine
- Technology independent
- Fixed architecture
- Synthesizable
- Example: RTL-level description for recognizing the overlapping pattern 101
[State diagram: states S0, S1, S2; transitions labelled input/output: 1/0, 0/0, 1/0, 0/0, 1/1]

Slide: 7 RTL Level

    module recognize101(match, in, ck);
      input  in, ck;
      output match;
      reg    match;
      reg [1:0] state;

      always @(posedge ck) begin
        case (state)
          2'b00: begin                      // S0: waiting for the first 1
            if (in == 1) state <= 2'b01;
            match <= 1'b0;
          end
          2'b01: begin                      // S1: seen "1"
            if (in == 0) state <= 2'b10;
            match <= 1'b0;
          end
          2'b10: begin                      // S2: seen "10"
            if (in == 1) begin
              state <= 2'b01;
              match <= 1'b1;
            end else begin
              state <= 2'b00;
              match <= 1'b0;
            end
          end
          default: begin
            state <= 2'b00;
            match <= 1'b0;
          end
        endcase
      end
    endmodule
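
As a quick cross-check of the state machine, the following Python sketch steps through the same three states and reports the match output for each input bit. It is an illustrative behavioural model, not part of the original slides: the name `recognize_101` is hypothetical, and the output is modelled Mealy-style, asserted on the same step as the closing 1, as in the input/output labels of the state diagram.

```python
def recognize_101(bits):
    """Model the overlapping-101 recognizer; returns one match flag per input bit."""
    S0, S1, S2 = 0, 1, 2          # S0: idle, S1: seen "1", S2: seen "10"
    state = S0
    matches = []
    for bit in bits:
        match = 0
        if state == S0:
            if bit == 1:
                state = S1
        elif state == S1:
            if bit == 0:
                state = S2
        else:  # S2
            if bit == 1:
                state = S1        # the closing 1 also starts the next pattern
                match = 1
            else:
                state = S0
        matches.append(match)
    return matches


print(recognize_101([1, 0, 1, 0, 1]))  # [0, 0, 1, 0, 1]
```

The second match in the example comes from the overlap: the 1 that ends the first pattern is reused as the start of the next one.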

Slide: 8 Behavioural Level
- Implicit clock and scheduling of events
- Architecture independent
- Mostly used for modelling only (not synthesizable)
- Can be synthesized with special behavioural synthesis tools
- Example: the following module computes the integer square root, using the identity
  sum for i = 0 to n-1 of (2i+1) = n^2

    module sqrt(in, out);
      input  [7:0] in;
      output [3:0] out;
      reg    [3:0] out;
      reg    [7:0] tmp, odd;   // tmp must be as wide as in

      always @(in) begin
        tmp = in;
        out = 0;
        odd = 1;
        while (tmp > 0) begin
          if (tmp >= odd) begin
            out = out + 1;
            tmp = tmp - odd;
            odd = odd + 2;
          end else begin
            tmp = 0;
          end
        end
      end
    endmodule
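
The odd-number identity can be checked with a few lines of Python that mirror the loop of the module step by step. This is only a software model for illustration; the function name `sqrt_by_odd_sums` is hypothetical.

```python
def sqrt_by_odd_sums(value):
    """Integer square root: count how many consecutive odd numbers fit into value."""
    root, odd, remaining = 0, 1, value
    while remaining > 0:
        if remaining >= odd:
            root += 1            # one more odd number subtracted
            remaining -= odd
            odd += 2
        else:
            remaining = 0        # next odd number does not fit: stop
    return root


print([sqrt_by_odd_sums(v) for v in (0, 1, 15, 16, 255)])  # [0, 1, 3, 4, 15]
```

For an 8-bit input the largest result is 15, which is why a 4-bit output is wide enough.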

Slide: 9 History of Synthesis
- Initial IC designs were handmade at the mask level.
  - Polygon-pushing tools (for example Calma®) were used for design.
  - Simulation was done at this level by simulators like HiLo®.
- Next, tools were developed for automatic generation of operators.
  - Generators produced operators from parameters such as input/output width and architecture (e.g. a 16-bit carry-lookahead adder).
  - The operators were connected by hand.
- Later, schematic entry tools came to market.
  - Gates or operators could be drawn and connected schematically.
  - Automatic tools would generate the mask from the schematic.
  - Mentor Graphics Idea Station® had integrated schematic entry and simulation.

Slide: 10 History of Synthesis (cont.)
- Next came high-level hardware description languages.
  - Gateway Design Automation came up with the Verilog language.
  - Verilog was essentially developed to model the behaviour of electronic circuits, not for synthesis.
  - Gateway developed the Verilog simulator now called Verilog-XL.
- From high-level description to gate level
  - Synopsys was earlier called Optimal Solutions Inc. It specialized in gate-level logic optimization.
  - Synthesis happened as an afterthought: since a modelling language (Verilog) was available, Synopsys engineers tried to convert various high-level Verilog constructs into gate level wherever possible.
  - Synthesis as we know it today was born.

Slide: 11 IC Design Flow
- Develop and verify the algorithm (C, MATLAB, etc.).
- Hand-convert it to an RTL-level hardware description.
- Verify the RTL design by simulation. Power and timing estimation tools can also be used at RTL level.
- Synthesis tools convert the description to gate level.
- Simulation or formal verification is done to verify functionality.
[Design flow diagram: Algorithm in C/MATLAB (execute and verify algorithm) -> RTL Description (simulate to verify functionality; estimate timing and power) -> Synthesis, with Tech Library and Constraints as inputs -> Gate Description (verify timing and power; verify functionality with simulation or formal verification)]

Slide: 12 IC Design Flow (cont.)
- A placement tool is now used to assign a place (x, y coordinates) to each gate.
- Timing verification is done with a better estimate of wire delays.
- A routing tool assigns locations to the nets that connect the gate instances.
- Timing verification is done again with still more refined wire delays.
- The mask is used to prepare the IC.
[Design flow diagram: Gate Description -> Placement, with Floor Plan and Physical Library as inputs -> Placed Gates (verify timing; verify and correct placement rules) -> Routing -> Mask (GDSII) (verify timing; verify and correct mask rules) -> to IC foundry]

Slide: 13 Synthesis Flow
- Translates an RTL-level design description in HDL to a gate-level netlist.
- Only a synthesizable subset of the HDL is supported in the description.
- Steps in the synthesis flow:
[Flow diagram: RTL Description -> Analysis -> Elaboration -> CDFG generation -> CDFG Traversal (DFA) -> Allocation -> Macro Generation -> Optimization -> Technology Mapping -> Writing Netlist -> Gate Level Description]

Slide: 14 Synthesis Flow (analysis)
Analysis
- Input: design description in HDL (Verilog/VHDL file)
- Output: analyzed design units in an intermediate form, either in memory or on disk
- Functionality:
  - Performs syntax and semantic checks on the design description.
  - Creates a data structure in a language-dependent form (object model).

    module my_mod(z, a, b, c);
      input  [1:0] a, b, c;
      output [1:0] z;
      reg    [1:0] z;
      always @(a or b or c) begin
        z = a + b - c;
      end
    endmodule

[Object model diagram with nodes: module my_mod, ports, always, expr]

Slide: 15 Synthesis Flow (elaboration)
Elaboration
- Input: analyzed design unit list
- Output: elaborated design unit list
- Functionality:
  - Expands the complete design hierarchy.
  - Generates a design unit list consisting of distinct design units.
  - Resolves all parameter values.
  - Computes all constant expressions.

Before elaboration:

    module top (o, i1, i2);
      input  [7:0] i1, i2;
      output [7:0] o;
      my_mod #(1) u1 (o[1:0], i1[1:0], i2[1:0]);
      my_mod #(3) u2 (o[7:2], i1[7:2], i2[7:2]);
    endmodule

    module my_mod(z, a, b);
      parameter w = 1;
      input  [2*w-1:0] a, b;
      output [2*w-1:0] z;
      assign z = a + b;
    endmodule

After elaboration:

    module top (o, i1, i2);
      input  [7:0] i1, i2;
      output [7:0] o;
      my_mod_1 u1 (o[1:0], i1[1:0], i2[1:0]);
      my_mod_3 u2 (o[7:2], i1[7:2], i2[7:2]);
    endmodule

    module my_mod_1(z, a, b);
      input  [1:0] a, b;
      output [1:0] z;
      assign z = a + b;
    endmodule

    module my_mod_3(z, a, b);
      input  [5:0] a, b;
      output [5:0] z;
      assign z = a + b;
    endmodule
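
The parameter-resolution step can be sketched in Python as follows. This is a simplified illustration, not the tool's implementation: the function `elaborate`, the instance names `u1` and `u2`, and the naming scheme "module name plus parameter value" are assumptions modelled on the my_mod_1 / my_mod_3 example, and only the single parameter w with port width 2*w is handled.

```python
def elaborate(instances):
    """Create one distinct design unit per (module, parameter value) pair.

    instances: list of (instance_name, module_name, w) tuples (hypothetical format).
    Returns (units, binding): unit name -> resolved port width, and
    instance name -> design unit it refers to after elaboration.
    """
    units = {}
    binding = {}
    for inst, module, w in instances:
        unit = f"{module}_{w}"     # e.g. my_mod with w=3 becomes my_mod_3
        units[unit] = 2 * w        # constant expression 2*w-1:0 resolved to a width
        binding[inst] = unit
    return units, binding


units, binding = elaborate([("u1", "my_mod", 1), ("u2", "my_mod", 3)])
print(units)    # {'my_mod_1': 2, 'my_mod_3': 6}
print(binding)  # {'u1': 'my_mod_1', 'u2': 'my_mod_3'}
```

Two instances with the same parameter value would share one design unit, which is what keeps the elaborated unit list distinct.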

Slide: 16 Synthesis Flow (cdfg)
Generation of Control and Data Flow Graphs
- Input: elaborated, language-dependent data structure
- Output: language-independent Control and Data Flow Graphs (CDFG)

    module my_mod(z, a, b, c, m, n);
      input  [1:0] a, b, c;
      input  m, n;
      output [1:0] z;
      reg    [1:0] z;
      reg    [1:0] t;
      always @(a or b or c or m or n) begin
        if (m)
          t = a;
        else if (n)
          t = b;
        z = t + c;
      end
    endmodule

[CDFG diagram: START, two nested IF/ENDIF pairs on m and n, assignments t = a and t = b, a NOP branch, the addition z = t + c, END]

Slide: 17 Synthesis Flow (cdfg)
CDFG generation is a distinct component of the synthesis routine.
- It populates a language-independent representation of the input design as a Control and Data Flow Graph.
- The functional flow depends on the input language.
- Input: in-memory representation of the entire design created by the analyzer
- Output: language-independent representation of the entire design as a directed graph
- A graph is created for each concurrent block and represents the sequential behaviour of the design.
- Each node in the graph is either a control node or a data node.
- Each edge in the graph represents either control flow or data flow.

Slide: 18 Synthesis Flow (dfa)
Data Flow Analysis and Creating Logic with Generic Gates
- Traverse the CDFG created for each concurrent block.
- Calculate the driving logic for each assigned object in each path and store it as a logic equation.
- Both data logic and control logic are evaluated.
- Realize an abstract structure of the input design.
[Diagram: the CDFG (START, IF/ENDIF pairs on m and n, assignments to t, NOP, z = t + c, END) next to the inferred structure: a MUX and a LATCH driving t from a, b, m, n, and an adder producing z from t and c]

Slide: 19 Synthesis Flow (dfa)
- The CDFG is analyzed and the data is stored in intermediate forms called path variable array (PVA) and path variable matrix (PVM).
- Path Variable Array (PVA):
  - one for each path
  - an array of lhs-rhs pairs

    p = a + b;
    q = ~en;

    lhs   rhs
    p     a+b
    q     ~en

Slide: 20 Synthesis Flow (dfa)
Path Variable Matrix (PVM)
- Created each time paths join.
- Rows represent the lhs (signals getting assigned).
- Columns are paths.
- For each column (path) there is an enabling condition.

    lhs\cond   m == 1   m == 2   m == 3
    p          a        b        a+b
    q          b        ?        ?
    r          m        NULL     n

(? = cell not recoverable from the slide text)

Slide: 21 Synthesis Flow (dfa)
Data Flow Analysis
- Each path consists of path segments. For each path segment, the data and control values are evaluated for each assigned object.
- These values are stored in a PVA (Path Variable Array).
- A special construct, the PVM (Path Variable Matrix), is created out of PVAs to hold the values of the objects in different paths.
- Each column in the PVM represents a particular path and each row represents a particular object.
- Each entry in the matrix represents the logic value of a particular object in a particular path.

Slide: 22 Synthesis Flow (dfa)
Data Flow Analysis (Example)
[CDFG diagram (START, IF/ENDIF pairs on m and n, assignments to t, NOP, z = t + c, END) annotated with PVAs P1, P11, P12, P121 and PVMs M1, M2, M3]

Slide: 23 Synthesis Flow (dfa)
Data Flow Analysis (Example)
- For each sequential block, one root PVA and one root PVM are allocated (P1, M1).
- Starting from each branch node, a new PVA is created for each path segment (P11 and P12).
- When a join node is hit, a new PVM (M2) is created out of the PVAs (P11 and P12).
- This PVM is passed to the allocator for allocating the current data and control logic.
- Clock, tristate and hold logic is allocated only from the root PVM (M1).

Slide: 24 Synthesis Flow (dfa)
Inferring Logic from PVM
- Each row of the PVM is analyzed and logic is inferred.
- For a row in which all columns have values, a one-hot mux is inferred.
- For a row in which some columns are empty, a latch is inferred.
- Latch, flip-flop and tristate are allocated from the root PVM (M1).

All columns filled, so a MUX selecting a or b on m drives d:

    lhs\cond   m    ~m
    d          a    b

One column empty, so a LATCH enabled by m holds d:

    lhs\cond   m    ~m
    d          a    NULL
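
The row rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tool's data structures: a PVA is modelled as a dict from lhs to rhs, `None` stands for NULL, and the helper names `build_pvm` and `infer` are hypothetical.

```python
def build_pvm(conditions, pvas):
    """Join per-path PVAs into a PVM: lhs -> {condition: rhs, or None if unassigned}."""
    signals = []
    for pva in pvas:
        for lhs in pva:
            if lhs not in signals:
                signals.append(lhs)
    return {
        lhs: {cond: pva.get(lhs) for cond, pva in zip(conditions, pvas)}
        for lhs in signals
    }


def infer(pvm):
    """Classify each row: every column assigned gives a mux, any empty column a latch."""
    return {
        lhs: "MUX" if all(rhs is not None for rhs in row.values()) else "LATCH"
        for lhs, row in pvm.items()
    }


# Path taken when m is true assigns d and q; the other path assigns only d.
pvm = build_pvm(["m", "~m"], [{"d": "a", "q": "a"}, {"d": "b"}])
print(infer(pvm))  # {'d': 'MUX', 'q': 'LATCH'}
```

Here q is assigned on only one of the two paths, so its row has an empty column and it must hold its previous value on the other path.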

Slide: 25 Synthesis Flow (dfa example)
Let us now infer logic for the CDFG that we created.
- The initial PVM just has initial values (NULL).
- At the first join node, PVM M2 is created.
- Since it infers a latch, we wait until the root PVM (M3).
- Since t_1 is not yet allocated, the PVM is divided into a PVM for data logic and a PVM for hold logic.

    lhs\cond   n    ~n
    t_1        b    NULL

    lhs\cond   m    ~m
    t          a    t_1

Slide: 26 Synthesis Flow (dfa example)
PVM for data logic:

    lhs\cond   m    ~m
    t_2        a    b

PVM for hold logic:

    lhs\cond   m      ~m
    t_2        NULL   ~n

- t_data goes to the data pin, t_hold goes to the hold pin, and the output is t.
- Finally, logic for z is inferred from the root PVM.
[Diagram: a MUX selecting a or b produces t_data; m and n produce t_hold; an adder computes z from t and c]

Slide: 27 Synthesis Flow (dfa example)
Inferred netlist for the CDFG
[Netlist diagram: RTL_MUX, RTL_LD and M_RTL_ADD instances connected to a, b, m, n and c]

Slide: 28 Synthesis Flow (cont.)
Allocation and Scheduling
- Schedule the clock cycle in which each operation is performed.
- Allocate an actual hardware resource for each logic operation.
- Bind the allocated resource to the input and output data.
- Transform the design into netlist form by instantiating cells/macros and connecting them to achieve the functionality.

Slide: 29 Synthesis Flow (cont.)
Allocation and Scheduling
Example of a data flow path for scheduling
Trivial scheduling:
- Assumes infinite resources
- All operations in 1 clock cycle
- Large clock cycle
- Latency is 0
[Data flow diagram: six multiplications, two additions, two subtractions and one comparison, all inside a single clock period]

Slide: 30 Synthesis Flow (cont.)
Allocation and Scheduling
ASAP scheduling:
- One operation per clock cycle
- Independent operations are done in parallel
- Operations are done as soon as possible (ASAP)
- Smaller clock
- Latency is the number of levels
[Data flow diagram: the same operations arranged in time steps T1 to T4]
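
ASAP scheduling amounts to a longest-path computation over the data flow graph, which the Python sketch below illustrates. The graph is an assumption: it has the same operator mix as the slide (six multiplications, two additions, two subtractions, one comparison), but the edges and the names `DEPS` and `asap_schedule` are hypothetical because the slide's drawing is not available.

```python
# Each operation maps to the operations whose results it consumes (assumed edges).
DEPS = {
    "m1": [], "m2": [], "m3": ["m1", "m2"],
    "m4": [], "m5": ["m4"], "m6": [],
    "s1": ["m3"], "s2": ["s1", "m5"],
    "a1": ["m6"], "a2": [], "c1": ["a2"],
}


def asap_schedule(deps):
    """Place every operation in the earliest step after all of its predecessors."""
    step = {}

    def visit(op):
        if op not in step:
            step[op] = 1 + max((visit(p) for p in deps[op]), default=0)
        return step[op]

    for op in deps:
        visit(op)
    return step


schedule = asap_schedule(DEPS)
print(max(schedule.values()))  # 4
```

With unlimited resources the latency equals the depth of the graph, four steps here, at the price of four multipliers working in the first step.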

Slide: 31 Synthesis Flow (cont.)
Allocation and Scheduling
Scheduling under resource constraint:
- Resources available:
  - 1 multiplier
  - 1 add/sub
- Small clock (same as ASAP)
- Small area
- Large latency
[Schedule diagram: the same operations spread over time steps T1 to T7]
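
A simple greedy list scheduler shows how the resource limit stretches the schedule. This is a sketch under assumptions rather than the scheduler of any particular tool: the data flow graph edges are hypothetical (only the operator mix is taken from the slides), declaration order is used as the priority, and the comparison is assumed to share the add/sub unit.

```python
# Assumed data flow graph: operation -> operations it depends on.
DEPS = {
    "m1": [], "m2": [], "m3": ["m1", "m2"],
    "m4": [], "m5": ["m4"], "m6": [],
    "s1": ["m3"], "s2": ["s1", "m5"],
    "a1": ["m6"], "a2": [], "c1": ["a2"],
}
# Multiplications use the multiplier, everything else the add/sub unit.
UNIT_OF = {op: ("mul" if op.startswith("m") else "alu") for op in DEPS}


def list_schedule(deps, unit_of, units):
    """Each step, start as many ready operations as the free units allow."""
    step, t = {}, 0
    while len(step) < len(deps):
        t += 1
        free = dict(units)
        for op in deps:                      # declaration order acts as priority
            if op in step:
                continue
            ready = all(p in step and step[p] < t for p in deps[op])
            if ready and free[unit_of[op]] > 0:
                free[unit_of[op]] -= 1
                step[op] = t
    return step


schedule = list_schedule(DEPS, UNIT_OF, {"mul": 1, "alu": 1})
print(max(schedule.values()))  # 7
```

Six multiplications on one multiplier already need six steps, and the last addition has to wait for the final product, giving seven steps in total.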

Slide: 32 Synthesis Flow (cont.)
Macro Generation
- Operators in data flow paths, such as adders and multipliers, that were allocated as macros are built in terms of primitive cells.
- Input: netlist with macro instances
- Output: netlist in terms of primitive instances only
- Functionality:
  - Based on the macro (operator type), input width and input type (signed, unsigned), the appropriate operator generator is called.
  - The generator replaces the macro with primitive gates like PRIM_AND and PRIM_XOR.

Slide: 33 Synthesis Flow (cont.)
Optimization
- Circuit cost, whether area or speed, is optimized.
- Optimization in Concorde is mainly done by SIS.
- Hanging logic removal, removal of NOT gates connected in series, parallel instance removal, etc. are done by traversing the netlist in Concorde code.

Slide: 34 Synthesis Flow (cont.)
Logic Optimization
Let us discuss the algorithm for one such case (expand).
- Function to optimize: F_ON = abc + abc + abc + abc
- F_dontcare = abc
- F_OFF can be computed as abc + abc + abc
(The complement bars on the literals were lost in the slide text, so the individual product terms cannot be told apart.)
[Tabular representation of F_ON and F_OFF over columns a, b, c, and cube representation of the function on axes a, b, c]

Slide: 35 Synthesis Flow (cont.)
Expand Algo

    foreach row of F_ON
      foreach column of row
        if (F_ON[row][column] != *) {
          F = F_ON
          F[row][column] = *
          if (F ∩ F_OFF == ∅) {
            foreach row2 of F
              if (row != row2 && F[row] ∪ F[row2] == F[row])
                erase F[row2]
            F_ON = F
          }
        }
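
The expand step can be tried out with the Python sketch below. It is an illustrative simplification, not the SIS/Espresso implementation: cubes are strings over '0', '1' and '*', columns are raised in left-to-right order, covered cubes are dropped once per row after all its columns were tried, and the example function is hypothetical because the slide's literals are not recoverable.

```python
def intersects(c1, c2):
    """Two cubes share a minterm unless some column has opposite fixed values."""
    return all(x == y or x == "*" or y == "*" for x, y in zip(c1, c2))


def covers(big, small):
    """big contains small if every fixed literal of big matches small."""
    return all(b == "*" or b == s for b, s in zip(big, small))


def expand(on_set, off_set):
    """Raise literals to '*' while the cube stays clear of the OFF set."""
    pending, done = list(on_set), []
    while pending:
        cube = pending.pop(0)
        for col in range(len(cube)):
            if cube[col] == "*":
                continue
            trial = cube[:col] + "*" + cube[col + 1:]
            if not any(intersects(trial, off) for off in off_set):
                cube = trial                      # expansion is legal, keep it
        # erase every other cube that the expanded cube now covers
        pending = [c for c in pending if not covers(cube, c)]
        done = [c for c in done if not covers(cube, c)]
        done.append(cube)
    return done


# Hypothetical 3-variable function: ON = 000, 001, 011, 111; don't care = 010.
OFF = ["100", "101", "110"]
print(expand(["000", "001", "011", "111"], OFF))  # ['0**', '*11']
```

The four ON cubes collapse to two: the first expands to a' and swallows 001 and 011, the last expands to bc. The don't-care minterm is used simply by not being in the OFF set.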

Slide: 36 Synthesis Flow (cont.)
Expand Algo
[Tabular and cube representations of F_ON and F_OFF during the expand steps, showing literals raised to * and covered rows erased; the table contents are not recoverable from the slide text]

Slide: 37 Synthesis Flow (cont.)
Sequential Optimization
- Several kinds of sequential optimization techniques also exist.
- Let us consider one such optimization: retiming.
- The flip-flop or latch position is moved along the path to optimize area and speed.

Slide: 38 Synthesis Flow (cont.)
Technology Mapping & Optimization
- Map the generic synthesized netlist using customer-specific library cells.
- Approaches:
  - Rule-based mapping
  - Algorithm-based mapping
- Mapping criteria:
  - minimum area
  - minimum delay

Slide: 39 Synthesis Flow (cont.)
Technology Mapping & Optimization
- Let us consider dynamic-programming-based mapping to optimize area.
- Library cells are converted to NAND/INV trees based on their logic.

Library (each cell is shown on the slide with its NAND-INV tree):

    Cell   Cost
    INV    2
    NAND   5
    AND    6
    IOR    5

Slide: 40 Synthesis Flow (cont.)
Technology Mapping & Optimization
- The design is also converted to a NAND-INV tree.
- Algorithm:
  - The cost of a cell is its area.
  - The cost of an input pin is 0.
  - The cost of a vertex is the cost of the cell whose pattern matches the pattern at the vertex, plus the vertex costs at the inputs of that match.
  - If multiple cell patterns match the pattern at the vertex, take the cell that results in the minimum vertex cost.
  - Compute the cost for all vertices from input to output.

Slide: 41 Synthesis Flow (cont.)
Technology Mapping & Optimization
- Cost of V1 = cost(NAND) = 5
- Cost of V2 = min(cost(INV) + cost(V1), cost(AND)) = min(7, 6) = 6
- Cost of V3 = min(cost(IOR) + cost(V1), cost(NAND) + cost(V2)) = min(10, 11) = 10
[Diagram: input design with vertices 1, 2, 3 and the resulting minimum-area implementation]
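
The cost recurrence can be reproduced with a short dynamic-programming sketch in Python. The cell areas and the candidate matches per vertex are taken from the formulas above; the pattern matching itself is assumed to have been done already, and the names `AREA`, `MATCHES` and `min_area_cover` are hypothetical.

```python
AREA = {"INV": 2, "NAND": 5, "AND": 6, "IOR": 5}

# For every vertex: the cells whose pattern matches there, each with the internal
# vertices left at the inputs of the match (primary inputs cost 0 and are omitted).
MATCHES = {
    "V1": [("NAND", [])],
    "V2": [("INV", ["V1"]), ("AND", [])],
    "V3": [("IOR", ["V1"]), ("NAND", ["V2"])],
}


def min_area_cover(matches, area):
    """Visit vertices from inputs to output, keeping the cheapest match at each."""
    cost, choice = {}, {}
    for vertex, candidates in matches.items():   # listed in input-to-output order
        def total(match):
            cell, inputs = match
            return area[cell] + sum(cost[v] for v in inputs)
        best = min(candidates, key=total)
        cost[vertex] = total(best)
        choice[vertex] = best[0]
    return cost, choice


cost, choice = min_area_cover(MATCHES, AREA)
print(cost)    # {'V1': 5, 'V2': 6, 'V3': 10}
print(choice)  # {'V1': 'NAND', 'V2': 'AND', 'V3': 'IOR'}
```

Because every vertex stores only its best cost, each match is evaluated once, which is what makes the tree-covering approach linear in the size of the design for a fixed library.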

Slide: 42 Synthesis Flow (cont.)
Writing Structural Netlist
- Write the synthesized netlist in any desired format to output text files.
- The output netlist is in structural form.

Slide: 43 Synthesis Goals and Constraints
- An RTL-level hardware description can be implemented in many ways, at the macro (architectural) or micro (logic) level.
[Diagram: architectural choices for computing a+b+c, and logic choices for a function of x, y, z]

Slide: 44 Synthesis Goals and Constraints
- Goals and constraints help the synthesis tool to make the choices.
- Goals can be to maximize speed or to minimize area or power.
- Constraints are more detailed goals.
- Constraints at chip level:
  - Minimize area for a given clock speed.
  - Maximize speed as long as the design fits into an FPGA of a specific size.
- Constraints at block level are more complex.

Slide: 45 Constraints at Block Level
- Input delay specifies the data arrival time at each input separately.
- Output delay specifies the extra delay after the output. The current design must make the output data arrive earlier to allow for it.
- The clock waveform needs to be specified.
- Specific paths can be given specific delays to meet.

Slide: 46 Synthesizing Big Design
- Big designs take too much memory and time to be synthesized in one piece.
- They are divided into blocks (modules), and the blocks are synthesized separately.
- Synthesis is done bottom-up: leaf-level blocks are synthesized first.
- Constraints need to be computed from the top, since the constraint on each block comes from the constraint on the whole chip.

Slide: 47 Synthesizing Big Design
- Designers divide the total chip area into an area constraint for each block.
- The block constraints can be the total area or the width and height of each block.
- Pin positions of each block are determined.
- The synthesis tool only takes in the area. The other constraints (width, height, pin positions) are for placement tools.
[Chip layout diagram: blocks B1 to B7]

Slide: 48 Synthesizing Big Design
- Similarly, designers divide the clock period into timing constraints for each block.
- Say the clock period is 20 ns. For B1, flip-flop to output can be 7 ns; for B2, input to output can be 5 ns; for B3, input to flip-flop can be 8 ns.
[Diagram: abstract design with blocks B1, B2 and B3 in series]
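
The arithmetic behind such a timing budget is simply that the block delays along one flip-flop-to-flip-flop path must fit in the clock period. A small Python check with the numbers above makes this explicit; the function name `budget_slack` and the dictionary keys are hypothetical labels.

```python
def budget_slack(clock_period_ns, segment_budgets_ns):
    """Slack left after summing the block budgets along one register-to-register path."""
    return clock_period_ns - sum(segment_budgets_ns.values())


budgets = {
    "B1 flip-flop to output": 7,
    "B2 input to output": 5,
    "B3 input to flip-flop": 8,
}
print(budget_slack(20, budgets))  # 0
```

A slack of 0 means the three budgets use up the 20 ns period exactly; a negative value would mean the budgets cannot all be honoured within one clock cycle.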

Slide: 49 Synthesizing Big Designs
- This process of dividing the chip's resources is called budgeting.
- Budgeting is mostly manual, but there are some tools to help with it.
- The process is mostly iterative. After synthesis, designers often find blocks that could not meet their constraints.
- Designers normally redo the budgeting and synthesize again.

Slide: 50 Variations in Synthesis
- Higher Level Synthesis: input is at a higher level than RTL
- Alternate Target Synthesis: output is not at gate level
- Timing Driven Synthesis

Slide: 51 Higher Level Synthesis
Behavioural Synthesis
- Synthesis is done from the behavioural level.
- The output is normally RTL.
- Unlike RTL synthesis (regular synthesis), architecture selection is done by the tool based on constraints.
- Scheduling is non-trivial.
- The clock is used to divide the data paths into different time slots.
- Resources are shared if they are in different time slots.

Slide: 52 Higher Level Synthesis
Protocol Synthesis
- Input is in a language specific to describing communication protocols between designs.
- Output is an RTL description for synthesis.
- Sometimes a C model is also produced for verification.
- Examples:
  - Synopsys's Protocol Compiler
  - Austin Protocol Compiler (APC) of The University of Texas at Austin
  - ALFred Protocol Compiler

Slide: 53 Higher Level Synthesis
Example of protocol input in Timed Asynchronous Protocol (TAP)
[Guard arrows and some operators were lost in the slide text.]

    process pe
    const Rp: integer = 0; Bq: integer = 0; tr: integer = 10; qe: address
    var sp: integer = 0; sq: array [2] of integer = 0; d, e: integer; initialize: integer = 1
    begin
      act sendrqst in 0; initialize := 0
      timeout sendrqst
        rqst.e := NCR(Bq,2,sq[0],sq[1]); send rqst to qe; act resend in tr;
      rcv rqst from qe
        d := DCR(Bq,0,rqst.e); e := DCR(Bq,1,rqst.e);
        if (sp=d)(sp=e) sp := e; reply.e := NCR(Bq,1,sp); log(detected adversary); fi
      timeout resend
        if sq[0] = sq[1] rqst.e := NCR(Bq,2,1,sq[1]); send rqst to qe; act resend in tr; skip; fi
      rcv reply from qe
        d := DCR(Rp,0,reply.e);
        if sq[1] = d sq[0] := sq[1]; log(detected adversary); fi
    end

Slide: 54 Alternate Target Synthesis
FPGA Synthesis
- Special mapping to programmable gates, e.g. 4-input gates (often called LUTs) that can be programmed to any 4-input logic function.
- Dedicated resources need special care during mapping and cost computation. For example, gates connected through carry-chain wires see a different delay than those connected through regular wires that go through switch boxes.
- Architecture-specific optimization
[Diagram: LUTs connected through a switch box]

Slide: 55 Alternate Target Synthesis
Physical Synthesis
- Generates placed gates directly.
- Design convergence is guaranteed.
- A constraint that is met in synthesis may not be met after placement, so synthesis normally has to be redone.
- Physical synthesis helps to avoid this iteration.

Slide: 56 Timing Driven Synthesis
- Synthesis is done directly to technology gates.
- Synthesis is done from input towards output (light to dark in the figure).
- Architectures are selected while synthesizing, based on the delays.

Slide: 57 Q & A
Thank you
