Presentation is loading. Please wait.

Presentation is loading. Please wait.

DAAD Program Akademischer Neuaufbau Südosteuropa Embedded System Design Course SOC Design: From System to Transistor Zoran Stamenković.

Similar presentations


Presentation on theme: "DAAD Program Akademischer Neuaufbau Südosteuropa Embedded System Design Course SOC Design: From System to Transistor Zoran Stamenković."— Presentation transcript:

1 DAAD Program Akademischer Neuaufbau Südosteuropa Embedded System Design Course SOC Design: From System to Transistor Zoran Stamenković

2 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Outline Modeling SystemsModeling Systems Simulation and Verification Analog Integrated Circuits Digital Integrated Circuits Embedded Memories Logic Synthesis Design for Testability Layout Generation Design for Manufacturability SOC Example

3 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Domains and Levels ESL Design Basics of HDL Gate Modeling Delay Modeling Power Modeling Effects of Parasitics Logic Optimization Modeling Systems

4 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Domains and Levels Open Systems Interconnection (OSI) model of network communication Local area network (LAN) technologies are defined by standards that describe unique functions at both the Physical and the Data Link layers

5 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 802.11 Wireless LAN modem Modulates outgoing digital signals from a computer or other digital device to an analogue (radio) signal Demodulates the incoming analogue (radio) signal and converts it to a digital signal for the digital device Domains and Levels

6 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 MIMO and MIMAX WLAN Modems Signal processing performed in the analogue RF domain Number of the digital basebands reduced to a single one Signal processing performed in the digital baseband Domains and Levels

7 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Behavioral Structural Physical Analysis Synthesis Generation Extraction Refinement Abstraction Domains and Levels

8 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Behavioral Structural Physical Algorithm (behavioral) Register-Transfer Language Boolean Equations Differential Equations Behavioral Domain

9 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Behavioral Structural Physical Processor- Memory System Registers Gates Transistors Structural Domain

10 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Behavioral Structural Physical Polygons Sticks Standard Cells Floor Plan Physical Domain

11 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 The point of a system level model is to capture the intent of the design Design does exactly what it is defined to do, and the model is the definition of what the design does It allows software developers to test their code on a working model The value of system level modeling is in helping us to understand the implications of our intent To explore responses to the stimulus in an useful way ESL Languages UML, SystemC, SystemVerilog ESL Verification No amount of experimentation can ever prove me right; a single experiment can prove me wrong – Albert Einstein The system level testbench languages and methodologies that exist today are woefully inadequate If one tries to capture enough information in ESL to verify RTL, then one might as well write RTL Electronic System Level Design

12 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 The environment that provides models of memories, connectors, and queues that can be interconnected with configured processors into an overall system model Processor and device interfaces are at the transaction level Transaction-level modeling requests for SOC architecture assembly and simulation tools If RTL IP blocks present, HW/SW co-verification tools needed Electronic System Level Design Xtensa Processor Generator Use standard ASIC/COT design techniques and libraries for any IC fabrication process Complete Hardware Design Source pre-verified RTL, EDA scripts, test suite Customized Software Tools C/C++ compiler Debuggers Simulators RTOSes Processor Extensions int main( ) { int i; short c[100]; for (i=0;i { "@context": "http://schema.org", "@type": "ImageObject", "contentUrl": "http://images.slideplayer.com/7/1654279/slides/slide_12.jpg", "name": "DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 The environment that provides models of memories, connectors, and queues that can be interconnected with configured processors into an overall system model Processor and device interfaces are at the transaction level Transaction-level modeling requests for SOC architecture assembly and simulation tools If RTL IP blocks present, HW/SW co-verification tools needed Electronic System Level Design Xtensa Processor Generator Use standard ASIC/COT design techniques and libraries for any IC fabrication process Complete Hardware Design Source pre-verified RTL, EDA scripts, test suite Customized Software Tools C/C++ compiler Debuggers Simulators RTOSes Processor Extensions int main( ) { int i; short c[100]; for (i=0;i

13 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Electronic System Level Design Auto-generated XTMP model based on memory maps Specify chip-level memory maps for shared/private memories Place interrupt and reset vectors Assign code/data to distributed memories

14 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Motivation for HDL Increased hardware complexity Design space exploration Inexpensive alternative to prototyping General features Support for describing circuit connectivity High-level programming language support for describing behavior Support for timing information (constraints, etc.) Support for concurrency VHDL IEEE Standard 1076-1987 IEEE Standard 1076-1993 Extension VHDL-AMS-1999 Verilog IEEE Standard 1364-1995 IEEE Standard 1364-2000 Hardware Description Languages

15 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Entity (VHDL) or Module (Verilog) declaration Describes the input/output ports of a module entity reg3 is port ( d0, d1, d2, en, clk : in bit; q0, q1, q2 : out bit ); end; nameport names port mode (direction) port type (VHDL only) reserved wordspunctuation module reg3 ( d0, d1, d2, en, clk, q0, q1, q2 ); input d0, d1, d2, en, clk; output q0, q1, q2; endmodule VHDL Verilog Modeling Interfaces

16 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Architecture Body (VHDL) Describes an implementation of an entity May be several per entity Module (Verilog) Is unique Behavioral Architecture Describes the algorithm performed by the module Contains Procedural Statements, each containing Sequential Statements, including Assignment Statements and Wait Statements Modeling Behavior

17 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 entity reg3 is port ( d0, d1, d2, en, clk : in bit; q0, q1, q2 : out bit ); end; architecture behav of reg3 is begin process ( d0, d1, d2, en, clk ) begin if en = '1' and clk = '1' then q0 <= d0 after 5 ns; q1 <= d1 after 5 ns; q2 <= d2 after 5 ns; end if; end process; end; `timescale 1ns/10ps module reg3 ( d0, d1, d2, en, clk, q0, q1, q2 ); input d0, d1, d2, en, clk; output q0, q1, q2; reg q0, q1, q2; always @ ( d0 or d1 or d2 or en or clk ) if ( en & clk ) begin q0 <= #5 d0; q1 <= #5 d1; q2 <= #5 d2; end endmodule Verilog VHDL Behavior Example

18 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Structural Architecture Implements the module as a composition of components Contains Signal Declarations (entity ports are also signals) Declare internal connections Component Instances Instantiate previously declared entity/architecture pairs Port Maps in component instances Connect signals to component ports Wait Statements Suspend a process or procedure Modeling Structure

19 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Structure Example d0 d1 d2 q0 q1 q2 bit0 d_latch d clk q bit1 d_latch d clk q bit2 d_latch d clk q int_clken clk gate and2 a b y

20 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 entity d_latch is port ( d, clk : in bit; q : out bit ); end; architecture basic of d_latch is begin process ( d, clk ) begin if clk = 1 then q <= d after 5 ns; end if; end process; end; entity and2 is port ( a, b : in bit; y : out bit ); end; architecture basic of and2 is begin process ( a, b ) begin y <= a and b after 5 ns; end process; end; `timescale 1ns/10ps module d_latch ( d, clk, q ); input d, clk; output q; reg q; always @ ( d or clk ) if ( clk ) begin q <= #5 d; end endmodule `timescale 1ns/10ps module and2 ( a, b, y ); input a, b; output y; reg y; always @ ( a or b ) begin y <= #5 ( a & b ); end endmodule VHDL Verilog Structure Example

21 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 entity reg3 is port ( d0, d1, d2, en, clk : in bit; q0, q1, q2 : out bit ); end; architecture struct of reg3 is component d_latch port ( d, clk : in bit; q : out bit ); end component; component and2 port ( a, b : in bit; y : out bit ); end component; signal int_clk : bit; begin bit0 : d_latch port map ( d0, int_clk, q0 ); bit1 : d_latch port map ( d1, int_clk, q1 ); bit2 : d_latch port map ( d2, int_clk, q2 ); gate : and2 port map ( en, clk, int_clk ); end; module reg3 ( d0, d1, d2, en, clk, q0, q1, q2 ); input d0, d1, d2, en, clk; output q0, q1, q2; wire int_clk; d_latch bit0 ( d0, int_clk, q0 ); d_latch bit1 ( d1, int_clk, q1 ); d_latch bit2 ( d2, int_clk, q2 ); and2 gate ( en, clk, int_clk ); endmodule VHDL Verilog Structure Example

22 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 An architecture can contain both behavioral and structural parts Process Statements and Component Instances Collectively called Concurrent Statements Processes can read and assign to signals Example: Register-transfer-language model Data-path described structurally Control section described behaviorally Mixing Behavior and Structure

23 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Mixed Example

24 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 entity multiplier is port ( clk, reset : in bit; multiplicand, multiplier : in integer; product : out integer ); end; architecture mixed of multiplier is signal partial_product, full_product : integer; signal arith_control, result_en, mult_bit, mult_load : bit; begin arith_unit : entity work.shift_adder(behavior) port map ( addend => multiplicand, augend => full_product, sum => partial_product, add_control => arith_control ); result : entity work.reg(behavior) port map ( d => partial_product, q => full_product, en => result_en, reset => reset );... Mixed Example

25 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 … multiplier_sr : entity work.shift_reg(behavior) port map ( d => multiplier, q => mult_bit, load => mult_load, clk => clk ); product <= full_product; control_section : process is -- variable declarations for control_section -- … begin -- sequential statements to assign values to control signals -- … wait on clk, reset; end process control_section; end; Mixed Example

26 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Function f = ab + ab: a is a variable, a and a are literals, ab is a term Irredundant Function No literal can be removed without changing its value Implementing logic functions is non-trivial No logic gates in the library for all logic expressions A logic expression may map into gates that consume a lot of area, time, or power A set of functions f1, f2,... is complete if every Boolean function can be generated by a combination of the functions from the set NAND is a complete set NOR is a complete set AND and OR are not complete Transmission gates are not complete Incomplete set of logic gates No way to design arbitrary logic Logic Functions

27 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 a out + c a Inverter

28 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Inverter

29 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Complementary switch produces full-supply voltages for both logic 0 and logic 1 n-type transistor conducts logic 0 p-type transistor conducts logic 1 Switches

30 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 b + a out b a VDD GND tub ties out NAND Gate

31 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 b a out a b VDD GND tub ties out NOR Gate

32 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 AOI = and/or/invert OAI = or/and/invert Implement larger functions Pull-up and pull-down networks are compact Smaller area, higher speed than NAND/NOR network equivalents AOI312 And 3 inputs And 1 input (dummy) And 2 inputs Or together these terms Invert AOI/OAI Gates and or invert out = [ab+c]

33 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Solid logic 0/1 defined by V SS /V DD Inner bounds of logic values V L /V H are not directly determined by circuit properties, as in some other logic families logic 1 logic 0 unknown V DD V SS VHVH VLVL Levels at output of one gate must be sufficient to drive next gate Logic Levels

34 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Choose threshold voltages at points where slope of transfer curve is -1 Inverter has High gain between V IL and V IH points Low gain at outer regions of transfer curve Note that logic 0 and 1 regions are not equally sized In this case, high pull-up resistance leads to smaller logic 1 range Noise margins are V DD -V IH and V IL -V SS Noise must exceed noise margin to make second gate produce wrong output Inverter Transfer Curve

35 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Resistor model of transistor Ignores saturation region Mischaracterizes linear region Gives acceptable results Inverter Delay Only one transistor is on at the time Rise time (pull-up on) Fall time (pull-up off)

36 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Delay Time required for gates output to reach 50% of final value Transition time Time required for gates output to reach 10% (logic 0) or 90% (logic 1) of final value Gate delay based on RC time constant V out (t) = V DD exp{-t/[(R n +R L )C L ]} t d = 0.69 R n C L t f = 2.3 R n C L 0.5 m process R n = 3.9 k C L = 0.68 fF t d = 0.69 x 3.9 x.68E-15 = 1.8 ps t f = 2.3 x 3.9 x.68E-15 = 6.1 ps For pull-up time, use pull-up resistance Current source model (in power/delay studies) t f = C L (V DD -V SS )/[0.5 k (W/L) (V DD -V SS -V t ) 2 ] Fitted model Fit curve to measured circuit characteristics RC Model for Delay

37 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Step Input (V GS = V DD ) Approximation

38 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Source voltage of gates in middle of network may not equal substrate voltage Difference between source and substrate voltages causes body effect 0 0 Source above V SS Early arriving signal To minimize body effect Put early arriving signals at transistors closest to power supply Body Effect

39 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Clock frequency f = 1/t Energy E = C L (V DD - V SS ) 2 Power E x f = f C L (V DD - V SS ) 2 Almost all power consumption comes from switching behavior A single cycle requires one charge and one discharge of capacitor Static power dissipation Comes from leakage currents Surprising result Resistance of the pull-up/pull-down transistor drops out of energy calculation Power consumption is independent of the sizes of the pull-up and pull-down transistors Static CMOS power-delay product is independent of frequency Voltage scaling depends on this fact Power Consumption

40 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Capacitance on power supply is not bad Can be good in absence of inductance Resistance slows down static gates May cause pseudo-nMOS circuits to fail Increasing capacitance/resistance Reduces input slope Resistance near source is more damaging It must charge more capacitance Effects of Parasitics

41 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Sometimes, large loads must be driven Off-chip or by long wires on-chip Sizing up the driver transistors only pushes back the problem Driver now presents larger capacitance to earlier stage Use a chain of inverters Each stage has transistors larger than previous stage is the driver size ratio, C big /C d = n, ln(C big /C d ) = n ln Minimize total delay through the driver chain t tot = ln(C big /C d )( /ln )t d Optimal driver size ratio is opt = e Optimal number of stages is n opt = ln(C big /C d ) Optimal Sizing

42 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Driving Large Fan-Out Fan-out adds capacitance Increase sizes of driver transistors Must take into account rules for driving large loads Add intermediate buffers This may require/allow restructuring of the logic

43 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Network delay is measured over paths through network Can trace a causality chain from inputs to worst-case output Critical path creates longest delay Can trace transitions which cause delays that are elements of the critical path delay To reduce circuit delay, speed up the critical path Reducing delay off the path doesnt help There may be more than one path of the same delay Must speed up all equivalent paths to speed up circuit Path Delay

44 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Logic gates are not simple nodes Some input changes dont cause output changes A false path is a path which cannot be exercised due to Boolean gate conditions False paths cause pessimistic delay estimates False Paths

45 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Rewrite by using sub-expressions Logic rewrites may affect gate placement Flattening logic Increases gate fan-in Logic synthesis programs Transform Boolean expressions into logic gate networks in a particular library Deep LogicShallow Logic Logic Transformations

46 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Optimization goals Minimize area, meet delay constraint Technology-independent optimization Works on Boolean expression equivalent Estimates size based on number of literals Uses factorization, resubstitution, minimization, etc. Uses simple delay models Technology-dependent optimization Maps Boolean expressions into a particular cell library May perform some optimizations on addition to simple mapping Allows more accurate delay models Logic Optimization

47 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Simulation Verification Annotation Simulation and Verification

48 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Simulation Tests the functionality of a designs elaborated model Needs a test bench and a simulation tool Advances in discrete time steps Test Bench Includes an instance of the design under test Applies sequences of test values to inputs Monitors signal values on outputs using simulator Simulation Tools NCSIM (Cadence) VSIM (Mentor Graphics) VCS (Synopsys) Simulation

49 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Event-driven simulation is designed for digital circuit characteristics Small number of signal values Relatively sparse activity over time Event-driven simulators try to update only those signals which change in order to reduce CPU time requirements An event is a change in a signal value A time-wheel is a queue of events Simulator traces structure of circuit to determine causality of events Event at input of one gate may cause new event at gates output Event-Driven Simulation

50 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Special type of event-driven simulation optimized for MOS transistors Treats the transistor as a switch Takes capacitance into account to model charge sharing Can also be enhanced to model the transistor as a resistive switch A B C D Switch Simulation

51 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 entity test_bench is end; architecture test_reg3 of test_bench is signal d0, d1, d2, en, clk, q0, q1, q2 : bit; begin dut : entity work.reg3(behav) port map ( d0, d1, d2, en, clk, q0, q1, q2 ); stimulus : process is begin d0 <= 1; d1 <= 1; d2 <= 1; wait for 20 ns; en <= 0; clk <= 0; wait for 20 ns; en <= 1; wait for 20 ns; clk <= 1; wait for 20 ns; d0 <= 0; d1 <= 0; d2 <= 0; wait for 20 ns; … wait; end process stimulus; end; Test Bench Example

52 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 To test a refinement of a design Low-level structural model must be functionally the same as a corresponding behavioral model To include two instances of a design in the test bench To stimulate both with same test values on inputs To compare values of outputs for equality To take account of timing differences Zero delay Unit delay Gate delay RC delay Verification

53 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 architecture regression of test_bench is signal d0, d1, d2, d3, en, clk : bit; signal q0a, q1a, q2a, q3a, q0b, q1b, q2b, q3b : bit; begin dut_a : entity work.reg4(struct) port map ( d0, d1, d2, d3, en, clk, q0a, q1a, q2a, q3a ); dut_b : entity work.reg4(behav) port map ( d0, d1, d2, d3, en, clk, q0b, q1b, q2b, q3b ); stimulus : process is begin d0 <= 1; d1 <= 1; d2 <= 1; d3 <= 1; wait for 20 ns; en <= 0; clk <= 0; wait for 20 ns; en <= 1; wait for 20 ns; clk <= 1; wait for 20 ns; … wait; end process stimulus;... Verification Example

54 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 … verify : process is begin wait for 10 ns; assert q0a = q0b and q1a = q1b and q2a = q2b and q3a = q3b report implementations have different outputs severity error; wait on d0, d1, d2, d3, en, clk; end process verify; end architecture regression; Verification Example

55 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Standard Delay Format (SDF) annotation Design timing is stored in an SDF file Used to iteratively improve design Updates a more-abstract design with information from later design stages Annotation of logic schematic with extracted parasitic resistances and capacitances Back annotation requires tools to know more about each other Simulation tools Synthesis tools Layout tools Annotation

56 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 (DELAYFILE (SDFVERSION "OVI 1.0") (DESIGN "tcp_1_chip") (DATE "Fri Apr 30 09:48:22 2004") (VENDOR "cdr3synPwcslV225T125") (PROGRAM "Synopsys Design Compiler cmos") (VERSION "2003.06") (DIVIDER /) (VOLTAGE 2.25:2.25:2.25) (PROCESS) (TEMPERATURE 125.00:125.00:125.00) (TIMESCALE 1ns) (CELL (CELLTYPE "tcp_1_chip") (INSTANCE) (DELAY (ABSOLUTE (INTERCONNECT U5/x U81/a (0.000:0.000:0.000)) (INTERCONNECT U73/x U74/a (0.000:0.000:0.000))... ) (CELL (CELLTYPE "exnor2_1") (INSTANCE i_aes_wr/U_ALG/U6533) (DELAY (ABSOLUTE (IOPATH a x (0.662:1.045:1.045) (0.682:1.076:1.076)) (IOPATH b x (1.379:1.416:1.416) (1.454:1.492:1.492)) )... (CELL (CELLTYPE "mux2_2") (INSTANCE i_mips/u0/ejt_tap\/pa_addr_reg_next\/bit_00i/U1) (DELAY (ABSOLUTE (IOPATH d0 x (0.395:0.395:0.395) (0.464:0.464:0.464)) (IOPATH d1 x (0.387:0.403:0.403) (0.447:0.477:0.477)) (IOPATH sl x (1.768:1.781:1.781) (1.879:1.892:1.892)) ) Standard Delay Format

57 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Filters Amplifiers Phase Lock Loop Voltage Control Oscillator Modulator/Demodulator Analog Integrated Circuits

58 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Fairchild Semiconductor μA741 Op-Amp In 1963, a 26-year-old engineer named Robert Widlar designed the first monolithic op-amp IC, the μA702 Price at the beginning was $300 Fairchild and competitors have sold it in the hundreds of millions Now, for $300 you can get about a thousand of todays 741 chips

59 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Signetics NE555 Timer A simple IC from 1971 that could function as a timer or an oscillator It would become a best seller in analog semiconductors Kitchen appliances Toys Spacecraft A few thousand other things Many billions have been sold

60 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Intersil ICL8038 Waveform Generator A generator of sine, square, triangular, sawtooth, and pulse waveforms from 1983 Countless applications Music synthesizers Blue boxes Hundreds of millions sold Intersil discontinued the production in 2002

61 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 LNA in BiCMOS Technology

62 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 PLL for 802.11a WLAN

63 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Oscillator

64 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Modulator

65 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Adders Multipliers Shifters Carry Units Arithmetic-Logic Units Digital Integrated Circuits

66 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Computes one-bit sum and carry s i = a i b i c in c out = a i b i + a i c i + b i c in Ripple-carry adder: n-bit adder built from full adders Delay of ripple-carry adder goes through all carry bits Full Adder

67 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 0 1 1 0multiplicand x 1 0 0 1multiplier 0 1 1 0 + 0 0 0 0 0 0 1 1 0 + 0 0 0 0 0 0 0 1 1 0 + 0 1 1 0 0 1 1 0 1 1 0 partial product Combinational Multiplier

68 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Array multiplier is an efficient layout of a combinational multiplier Array multipliers may be pipelined to decrease clock period at the expense of latency xny0 + 0 + P(2n-1) P(2n-2) + x0y0x1y0x2y0 0 x0y1 + x1y1 0 + x0y2 + x1y2 P0 Array Multiplier

69 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Reduces depth of adder chain Built from carry-save adders Three inputs a, b, c Produces two outputs y, z y + z = a + b + c Carry-save equations y i = parity (a i,b i,c i ) z i = majority (a i,b i,c i ) At each stage, i numbers are combined to form 2i/3-sums Final adder completes the summation Wiring is more complex Wallace Tree

70 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Used in serial-arithmetic operations Multiplicand can be held in place by register Multiplier is shifted into array Serial-Parallel Multiplier

71 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Can perform n-bit shifts in a single cycle Accepts 2n data inputs and n control signals, producing n data outputs Selects arbitrary contiguous n bits out of 2n input buts Examples Right shift: data into top, 0 into bottom Left shift: 0 into top, data into bottom Rotate: data into top and bottom data 1 data 2 n bits output n bits Barrel Shifter

72 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Two-dimensional array of 2n vertical X n horizontal cells Input data travels diagonally upward Output wires travel horizontally Control signals run vertically Exactly one control signal is set to 1, turning on all transmission gates in that column Large number of cells, but each one is small Delay is large, considering long wires and transmission gates Barrel Shifter

73 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 First computes carry propagate and generate P i = a i + b i G i = a i b i Computes sum and carry from P and G s i = c i P i G i c i+1 = G i + P i c i Can recursively expand carry formula c i+1 = G i + P i (G i-1 + P i-1 c i-1 ) c i+1 = G i + P i G i-1 + P i P i-1 (G i-2 + P i-1 c i-2 ) Expanded formula does not depend on intermediate carries Allows carry for each bit to be computed independently Carry-Lookahead Unit

74 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Deepest carry expansion requires gates with large fan-in Large and slow Carry-lookahead unit requires complex wiring between adders and lookahead unit Values must be routed back from lookahead unit to adder Depth-4 Carry-Lookahead Unit

75 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Looks for cases in which carry out of a set of bits is identical to carry in Typically organized into m-bit stages If a i = b i for every bit in stage, then bypass gate sends stages carry input directly to carry output Carry-Skip Adder

76 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Computes two results in parallel, each for different carry input assumptions Uses actual carry in to select correct result Reduces delay to multiplexer Carry-Select Adder

77 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Precharged carry chain which uses P and G signals Propagate signal connects adjacent carry bits Generate signal discharges carry bit Worst-case discharge path goes through entire carry chain Manchester Carry Chain

78 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 May be used in signal-processing arithmetic where fast computation is important but latency is unimportant LSB control signal clears the carry shift register Serial Adder

79 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Computes a variety of logical and arithmetic functions based on opcode May offer complete set of functions of two variables or a subset Built around adder, since carry chain determines delay Function block may be used to compute required intermediate signals for a full-function ALU Transmission gates may introduce significant delay Arithmetic-Logic Unit

80 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 P and G compute intermediate values from inputs May not correspond to carry lookahead P and G for non-addition functions Add unit is adder of choice Output unit computes from sum, propagate signal Arithmetic-Logic Unit

81 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 32-bit RISC microprocessor from 1985 The simplicity made all the difference Small, low power, and easy to program ARM architecture has become the dominant embedded processor More than 10 billion ARM cores have been used in all sorts of gadgetry, including the iPhone Acorn Computers ARM1 Processor

82 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Russell Fish and Chuck Moore 1988 found a way to have the processor run its own super fast internal clock while still staying synchronized with the rest of the computer In the years since Sh-Booms invention, the speed of processors had by far surpassed that of motherboards, and so practically every maker of computers and consumer electronics wound up using the same solution Since 2006, Patriot Scientific (and Moore) have reaped over US $125 million in licensing fees from Intel, AMD, Sony, Olympus, and others Computer Cowboys Sh-Boom Processor

83 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Microchip Technology PIC16C84 8-bit microcontroller in 1993 Incorporates EEPROM Does not need UV light to be erased as EPROM needs 8-bit Microprocessors Radiation-hardened RCA CDP 1802 8-bit microprocessor in 1976 One of the first, if not the first, CMOS processors Low power consumption, wide range of operating voltages and military operating temperature range

84 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Read-Only Memory Static Random-Access Memory Dynamic Random-Access Memory Memory Generators Embedded Memories

85 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Address is divided into row and column Row may contain full word or more than one word Selected row drives/senses bit lines in columns Amplifiers/drivers read/write bit lines Memory Architecture

86 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 ROM core is organized as an array of NOR gates Pull-down transistors of NOR determine programming Erasable ROMs require special processing that is not typically available ROMs on digital ICs are generally mask-programmed Placement of pull-downs determines ROM contents Read-Only Memory (ROM)

87 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Core cell uses six-transistor circuit to store value Value is stored symmetrically Both true and complement are stored on cross-coupled transistors SRAM retains value as long as power is applied to circuit Read Precharge bit and bit high Set select line high from row decoder One bit line will be pulled down Write Set bit/bit to desired (complementary) values Set select line high Drive on bit lines will flip state if necessary Static Random-Access Memory (SRAM)

88 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Differential pair Takes advantage of complementarity of bit lines One bit line goes low One arm of diff pair reduces its current, causing compensating increase in current of another arm Sense amp can be cross-coupled to increase speed SRAM Sense Amplifier

89 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Dynamic Random-Access Memory (DRAM) Cell can easily be made with a CMOS digital technology process Dynamic RAM loses value due to charge leakage Must be refreshed Value is stored on gate capacitance of transistor t 1 Read read = 1, write = 0, read_data is precharged t 1 will pull down read_data if 1 is stored Write read = 0, write = 1, write_data = value Guard transistor writes value onto gate capacitance Modern commercial DRAMs use one- transistor cell

90 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 In 1980, Fujio Masuoka recruited four engineers to a project aimed at designing a memory chip that could store lots of data and would be affordable Team came up with a variation of EEPROM that featured a memory cell consisting of a single transistor (at the time, conventional EEPROM needed two transistors per cell) Why is it named flash? Because of the chips ultrafast erasing capability In 1984 Masuoka presented a paper at the IEEE International Electron Devices Meeting In 1988 Intel developed a type of flash based on NOR logic gates (a 256 kilobit chip) Toshibas first NAND flash (greater storage densities but trickier to manufacture) hit the market in 1989 Toshiba NAND Flash Memory

91 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Memory Generators A software tool which can create memories (ROM or RAM blocks) in a range of sizes as needed The customer usually wants a particular number of words (depth) and bits (width) for each memory ordered Each of the final building blocks (physical layout) will be implemented as a stand- alone, densely packed, pitch-matched array Complex layout generators and state-of-the-art logic and circuit design techniques offer Embedded memories of extreme density and performance Each memory generator is a set of various, parameterized generators Layout generator generates an array of custom, pitch-matched leaf cells Schematic generator and Net-lister extracts a net-list used for both layout vs. schematic and functional verification Function and Timing model generators create models for gate level simulation, dynamic/static timing analysis and synthesis Symbol generator generates schematic Critical Path generator is used for both circuit design and timing characterization

92 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Logic Synthesis Logic Synthesis Flow Optimization Technology Mapping Low-Power Techniques

93 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Optimized Net-list SDF File Optimized Net-list SDF File yes Architectural Description RTL Description Computer-Aided Synthesis Design Constraints Constraints Met no Gate-Level Net-list Standard Cell Library Logic Synthesis Flow Goal is to create a logic gate network which performs a given set of functions Input is Boolean formulae Output is gates implementing Boolean functions Several iterations needed for generation of the optimized gate-level description Logic synthesis Maps onto available gates Restructures for delay, area, testability, power, etc. Automated logic synthesis has enabled Enormous reduction of the time needed for conversion of a design from high-level to gate-level description Saving of designer resources for architectural and RTL descriptions, and optimization of the standard cell library

94 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Scheduling determines Number of clock cycles required As-soon-as-possible (ASAP) schedule puts every operation as early in time as possible As-late-as-possible (ALAP) schedule puts every operation as late in schedule as possible Binding determines Area and cycle time Area tradeoffs must consider Shared function units vs. multiplexers and control Delay tradeoffs must consider Cycle time vs. number of cycles High-Level Synthesis

95 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Technology-independent optimizations A Boolean network is the main representation of the logic functions Each node can be represented as sum-of-products (or product-of-sums) Functions in the network need not correspond to logic gates Technology mapping (library binding) Design transformation from technology-independent to technology- dependent Technology-dependent optimizations Work in the available set of logic gates Logic Synthesis Phases

96 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Area is estimated by number of literals Literal is true or complement form of a variable Simplification Rewrites a node to reduce the number of literals in it Network restructuring Introduces new nodes for common factors Collapses several nodes into one new node Delay restructuring Changes factorization to reduce path length out1 = k2 + x2out2 = k3 + x1 k2 = x1 x2 x4 + k1 k3 = k1 x4 k1 = x2 + x3 x1x2x3x4 Technology-Independent Optimization

97 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Function is defined by On-set: set of inputs for which output is 1 Off-set: set of inputs for which output is 0 Dont-care-set: set of inputs for which output is dont-care Each way to write a function as a sum- of-products is a cover It covers the on-set A cover is composed of cubes Cubes are product terms that define a subspace cube in the function space x1x1 x2x2 x3x3 0 1 1 1 Covers and Cubes

98 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Larger cover x1 x2 x3 + x1 x2 x3 + x1 x2 x3 + x1 x2 x3 Requires four cubes (12 literals) Smaller cover x2 x3 + x1 x3 + x1 x2 x3 Requires three cubes (7 literals) x1 x2 x3 is covered by two cubes Dont-cares Can be implemented in either on-set or off-set Provide the greatest opportunities for minimization in many cases Espresso A two-level logic optimizer Expands, makes irredundant and reduces Optimization loop refines cover to reduce its size Covers and Optimizations

99 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Based on division Formulate candidate divisor Test how it divides into the function If g = f/c, we can use c as an intermediate function Algebraic division Doesnt take into account Boolean simplification Less expensive then Boolean division Three steps Generate potential common factors and compute literal savings if used Choose factors to substitute into network Restructure the network to use the new factors Algebraic/Boolean division is used to implement first step Factorization

100 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Rewrites Boolean network In terms of available logic functions Optimizes for Area Delay Can be viewed as a pattern matching problem Find pattern match which minimizes area/delay cost Procedure Write Boolean network in canonical NAND form Write each library gate in canonical NAND form Assign cost to each library gate Use dynamic programming to select minimum-cost cover of network by library gates Technology Mapping

101 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Breaking into Trees not optimal, but reasonable cuts usually work well

102 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 after three levels of matching Mapping Example

103 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 after four levels of matching Mapping Example

104 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Architecture-driven supply voltage scaling Add extra logic to increase parallelism so that system can run at lower frequency Power improvement for n parallel units over Vref P n (n) = [1 + C i (n)/nC ref + C x (n)/C ref ](V/V ref ) Dynamic voltage and frequency scaling Decreased to parts of the circuit where it does not adversely affect the performance Dynamic scaling is regulated by software based on system load Reducing capacitances Parasitic capacitances of the transistors Parasitic capacitances of the wires Low Power Techniques

105 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Reducing switching activity Deactivate the clock to unused registers (clock gating) Deactivate signals if not used (signal gating) Deactivate V DD for unused hardware blocks (power gating) Low Power Techniques Distributed clocks: Globally Asynchronous Locally Synchronous Eliminating centrally synchronous clocks and utilizing local clocks Distinct local clocks, possibly running at different frequencies

106 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Design for Testability DFT Methods Scan Design Test Pattern Generation Built-In Self-Test

107 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Make the system as testable as possible Keep minimum cost in hardware and testing time Use knowledge of architecture to help in selection of testability points Modify architecture to improve testability DFT for digital circuits Ad-hoc methods Avoid asynchronous feedback Make flip-flops initializable Avoid redundant gates, large fan-in gates and gated clocks Provide test control for difficult-to-control signals Consider ATE requirements (tri-states, etc.) Structured methods Scan Design Built-in self-test (BIST) Boundary scan Design for Testability Methods

108 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Circuit is designed using pre-specified design rules Test structure (hardware) is added to the verified design Add a test control (TC) primary input Replace flip-flops by scan flip-flops (SFF) and connect to form one or more shift registers (scan-chains) in the test mode Make input/output of each scan-chain controllable/observable from primary input/primary output Use combinational ATPG to obtain tests for all testable faults in the combinational logic Add shift register tests and convert ATPG tests into scan sequences for use in manufacturing test Full scan is expensive Must roll out and roll in state many times during a set of tests Partial scan selects some registers (not all) for scanability to reduce the chain length Analysis is required to choose which registers are best for scan Scan Design

109 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 D TC SD CK Q Q MUX D flip-flop Master latchSlave latch CK TC Normal mode, D selectedScan mode, SD selected Master open Slave open t t Logic overhead Scanable Flip-Flop

110 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Level-Sensitive Scanable Flip-Flop D SD MCK Q Q D flip-flop Master latchSlave latch SCK TCK SCK MCK TCK Normal mode MCK TCK Scan mode Logic overhead

111 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Scan Structure SCANIN SCANOUT TC or TCK SFF Combinational logic PI PO Not shown: CK or MCK/SCK feed all SFFs

112 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Combinational Test Vectors I2 I1 O1 O2 PI PO SCANIN SCANOUT S1 S2 N1 N2 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 TC Sequence length = (n comb + 1) n sff + n comb clock periods n comb = number of combinational vectors n sff = number of scan flip-flops

113 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Scan-chain must be tested prior to application of scan test sequences A shift sequence 00110011... of length n sff +4 in scan mode (TC=0) Produces 00, 01, 11 and 10 transitions in all flip-flops Observes the result at SCANOUT output Total scan test length (n comb + 2) n sff + n comb + 4 clock periods Example 2,000 scan flip-flops, 500 comb. vectors, total scan test length ~ 10 6 clocks Multiple scan-chains reduce test length Testing Scan Chain

114 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Errors are introduced during manufacturing Testing weeds out infant mortality Varieties of testing Functional testing Performance testing Fault model Possible locations of faults I/O behavior produced by the fault With a fault model, we can test the network for every possible instantiation of that type of fault It is difficult to enumerate all types of manufacturing faults Testing procedure Set inputs Observe output Compare fault-free and observed output Testing and Faults

115 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Logic gate output is always stuck at 0 or 1 independently on input values Correspondence to manufacturing defects depends on logic family Experiments show that 100% stuck-at-0/1 fault coverage corresponds to high overall fault coverage Testing NAND Three ways to test it for stuck-at-0 Only one way to test it for stuck-at-1 Testing NOR Three ways to test it for stuck-at-1 Only one way to test it for stuck-at-0 abOKSA0SA1 00101 01101 10101 11001 abOKSA0SA1 00101 01001 10001 11001 Stuck-At-0/1 Faults

116 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Can test both NANDs for stuck-at-0 simultaneously abc = 000 Cannot test both NANDs for stuck-at-1 simultaneously due to inverter Must use two vectors Must also test inverter Multiple Test Example

117 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Transistors always on/off t 1 is stuck open (switch cannot be closed) No path from V DD to output capacitance Testing requires two cycles Must discharge capacitor Try to operate t 1 to charge capacitor Stuck-At-Open/Closed Model

118 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Two parts of testing Controlling the inputs of (possibly interior) gates Observing the outputs of (possibly interior) gates Delay faults Gate delay model assumes that all delays are lumped into one gate Path delay model takes into account the delay of a path through network Performance problems Functional problems in some types of circuits Combinational Testing Example

119 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Goal Test gate D for stuck-at-0 fault First step Justify 0 values on gate inputs Work backward from gate to primary inputs w1 = 0 (A output = 0) i1 = i2 = 1 Observe the fault at a primary output o1 gives different values if D is true/faulty Work forward and backward Fs other input must be 0 to detect true/fault Justify 0 at Es output In general, may have to propagate fault through multiple levels of logic to primary outputs Testing Procedure

120 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Redundant logic can mask faults Testing NOR for SA0 requires setting both inputs to 0 Network topology ensures that one NOR input (for instance b) will always be 1 Function reduces to 0 f = ((a+b) + b) = (a + b)b = 0 Redundant logic can introduce delay faults and other problems Redundancy and Testing

121 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Much harder than combinational testing Cant set memory element values directly Must apply sequences To put machine in proper state for test To observe value of test Testing of NAND for stuck-at-1 Set both NAND inputs to 1 Primary input i1 can be controlled directly Lower input is 1 if ps0/ps1 = 1 Sequential Testing

122 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 A model for sequential test Unroll machine in time A single-stuck-at fault in sequential machine appears to be the multiple- stuck-at fault Time-Frame Expansion

123 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Automatic test pattern generator (ATPG) generates a set of test vectors Boolean network (combinational ATPG) Sequential machine (sequential ATPG) D (from Discrepancy) allows us to quickly write fault D value on a node means that good and faulty circuits have different values at that point If a test for a particular fault exists, D-algorithm will find it by an exhaustive search of all sensitized paths Start at the faulty gate Suppose initially a stuck-at fault on gate output Primitive D-cube of failure (PDCF) of gate summarizes minimal assignment of input values to highlight fault Propagation D-cube (PDC) has D or D on output and on at least one input Summarizes non-controlling values for other inputs to allow propagation of D signal Test Pattern Generation

124 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 PODEM stands for Path-Oriented DEcision Making Circuit-based, fault-oriented ATPG algorithm Goal Propagate D value to primary outputs Signal values are explicitly assigned at primary inputs only Other values are computed by implication Backtracking means reassigning primary inputs when a contradiction occurs Uses implicit enumeration Uses five values: 0, 1, D, D, and X Start all values at X In worst case, must examine all possible inputs Can be implemented to run quickly PODEM Algorithm

125 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Fault Propagation Example

126 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Includes on-chip machine responsible for Generating tests Evaluating correctness of tests Allows many tests to be applied Cant afford large memory for test results Rely on compression and statistical analysis Uses a linear-feedback shift register (LFSR) to generate a pseudo-random sequence of bit vectors Built-In Self-Test (BIST)

127 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 One LFSR generates test sequence Another LFSR captures/compresses results Can store a small number of signatures which contain expected compressed results for valid system Usually used for testing memory blocks BIST Architecture

128 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Layout Generation Layout Generation Flow Design Rules Layout Tools Standard Cells Floorplanning Placement Routing Clock Tree Pads

129 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Verify Net-list Floorplanning Plan Power Routing Placement Adjust Size Generate Clock Trees Optimize Placement Route Post-layout Net-list SDF, DEF, GDSII Post-layout Net-list SDF, DEF, GDSII Cell Library (LEF, TLF) Cell Library (LEF, TLF) Design Constraints Library Exchange Format (LEF) files To create a library database (standard cells, I/O cells, and macro blocks) Timing Library Format (TLF) file Timing constraints General Constraints Format (GCF) file Design constraints Verilog net-list To create a design database Layout Generation Flow

130 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Floorplanning To create a core area with rows (or columns) and I/O rows around the core area Power planning and routing To plan, modify and rout power paths, power rings and power stripes Placement An I/O constraints file may be used to place the I/O pads Block placement Cell placement Size adjustment To estimate the die size To resize the design to make it routable Layout Generation Flow

131 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Generating clock trees The clock buffer space and clock net must be defined Generating clock trees is iterative process At this point, the physical net-list differ from the logical (original) net-list Placement optimization To resize gates and insert buffers to correct timing and electrical violations Routing To perform both global and final route on a placed design Verification To check for shorts and design rule violations Layout Generation Flow

132 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Masks are tools for manufacturing Manufacturing processes have inherent limitations in accuracy Design rules specify geometry of masks which will provide reasonable yields Design rules are determined by experience MOSIS SCMOS Designed to scale across a wide range of technologies Designed to support multiple vendors Designed for educational use Fairly conservative Lambda ( ) design rules Size of a minimum feature defines Specifying particularizes the scalable rules Parasitics are generally not specified in units Design Rules

133 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 metal 3 6 metal 2 3 metal 1 3 pdiff/ndiff 3 poly 2 Wires

134 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 2 3 1 3 2 5 Transistors

135 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Types of via Metal1/diff Metal1/poly Metal2/metal1 Metal3/metal2... 4 1 4 2 Highest via Cut: 3 x 3 Overlap by metal2: 1 Minimum spacing: 3 Minimum spacing to via1: 2 Vias

136 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Diffusion/diffusion 3 Poly/poly 2 Poly/diffusion 1 Via/via 2 Metal1/metal1 3 Metal2/metal2 4 Metal3/metal3 4 Spacings

137 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Cut in passivation layer Connection for bonding wire Minimum bonding pad 100 Pad overlap of glass opening 6 Minimum pad spacing to unrelated metal2/3 30 Minimum pad spacing to unrelated metal1, poly, active 15 Overglass

138 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Layout editors are interactive tools Design rule checkers identify errors on the layout Circuit extractors extract the net-list from the layout Connectivity verification systems (CVS) compare extracted and original net-lists CADENCE Virtuosos Layout-versus-Schematic (LVS) tool Standard cell layouts are created from pre-designed cells using the custom routing Silicon Ensemble (CADENCE) Encounter (CADENCE) Physical Compiler (SYNOPSYS) Layout Tools

139 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 routing area Layout made of small cells Gates, flip-flops, etc. Cells are hand-designed Assembly of cells is automatic Cells arranged in rows Wires routed between and through cells Pitch is the height of a cell All cells have same pitch, may have different widths VDD/VSS connections are designed to run through cells A feedthrough area allows wires to be routed over the cell Standard Cell Layout VSS n tub p tub Intra-cell wiring pullups pulldowns pin Feedthrough area VDD

140 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Floorplanning must take into account Blocks of varying function, size, and shape Space allocation Signal routing Power supply routing Clock distribution Floorplanning Strategy

141 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Develop a wiring plan Think about how layers will be used to distribute important wires Draw separate wiring plans for power and clocking These are important design tasks which should be tackled early Sweep small components into larger blocks A floorplan with a single NAND gate in the middle will be hard to work with Design wiring that looks simple If it looks complicated, it is complicated Design planar wiring Planarity is the essence of simplicity Do it where feasible (and where it doesnt introduce unacceptable delay) Floorplanning Tips

142 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Placement of components interacts with routing of wires Quality metrics for layout Area and delay Area and delay determined in part by Wiring How do we judge a placement without wiring? Estimate wire length without actually performing routing bad placementgood placement Placement Metrics

143 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 To construct an initial solution To improve an existing solution Pairwise interchange is a simple improvement metric Interchange a pair, keep the swap if it helps wire length Heuristic determines which two components to swap Placement by partitioning Works well for components of fairly uniform size Partition net-list to minimize total wire length using min-cut criterion Kernighan-Lin Algorithm Computes min-cut criterion, count total net-cut change Exchanges sets of nodes to perform hill-climbing finding improvements where no single swap will improve the cut Recursively subdivide to determine placement detail Placement Techniques

144 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Major phases in routing Global routing assigns nets to routing areas Detailed routing designs the routing areas Net ordering determines quality of result Net ordering is a heuristic Blocks and wiring Blocks divide wiring area into routing channels Large wiring areas may force rearrangement of block placement Channel routing Channel grows in one dimension to accommodate wires Pins generally on only two sides Switchbox routing Box cannot grow in any dimension Pins are on all four sides Routing channel switchbox channel

145 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Tracks form a grid for routing Spacing between tracks is center-to-center distance between wires Track spacing depends on wire layer used Density (vertical and horizontal) Gives the number of wire segments crossing a vertical/horizontal grid segment Different layers are used for horizontal and vertical wires Horizontal and vertical wires can be routed relatively independently Placement of cells determines placement of pins Pin placement determines difficulty of routing problem Routing Channels

146 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Assumes one horizontal segment per net Sweep pins from left to right Assign horizontal segment to lowest available track Limitations Some combinations of nets require more than one horizontal segment per net (a dog-leg wire) Aligned pins form vertical constraints Wire to lower pin must be on lower track Wire to upper pin must be above lower pins wire A B C ABBC BA AB aligned ? Left-Edge Algorithm

147 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Global routing Assign wires to paths through channels Dont worry about exact routing of wires within channel Can estimate channel height using congestion Detailed routing Dog-leg router breaks net into multiple segments as needed Minimize number of dog-leg segments per net to minimize congestion for future nets Use left-edge criterion on each dog-leg segment to fill up the channel Global and Detailed Routing

148 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Multipoint nets are harder to design than two-point nets Rectilinear Steiner tree problem: Find the minimum-length tree interconnecting all nets points Find additional points (so-called Steiner points) in the plane, if they contribute to a shorter tree length Steiner tree algorithm Compute the spanning tree Optimize the tree length by flipping L-shaped branches Steiner points always have degree three Multipoint Nets

149 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Place and route to minimize Capacitance of nodes with high glitching activity Feed back wiring capacitance values To better estimate power consumption Size wires to be able to handle current Requires designing topology of V DD /V SS networks Keep power network in metal Requires designing planar wiring Layout for Low Power

150 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Clock delay varies with position Deliver clock to memory elements with acceptable skew Deliver clock edges with acceptable sharpness Clocking network design The greatest challenge in the design of a large chip Clock Delay

151 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Clocks are generally distributed via wiring trees Use low-resistance interconnect to minimize delay Use multiple drivers to distribute driver requirements Use optimal sizing principles to design buffers Clock lines can create significant crosstalk Clock Distribution Tree

152 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Pads are placed on top-layer metal Provide a place to bond to the package Pads are typically placed around periphery of chip Some advanced packaging systems bond directly to package without bonding wire Some allow pads across entire chip surface Supply power/ground to Each pad Chip core Positions of pads May be determined by pin requirements Distribute power/ground pins as evenly as possible To minimize power distribution problems Pad Architecture

153 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Main purpose is to provide electrostatic discharge (ESD) protection Gate of transistor is very sensitive Can be permanently damaged by high voltage Static electricity in room is sufficient to cause damage Resistor is used in series with pad to limit current caused by voltage spike May use parasitic bipolar transistors to drain away high voltages One for positive pulses Another for negative pulses Must design layout to avoid latch-up Input Pad

154 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Dont need ESD protection Transistor gates not connected to pad Must be able to drive capacitive pad load and outside load May need voltage level shifting To be compatible with other logic families Output Pad

155 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Combination of input and output Controlled by a mode-input on chip Pad includes logic to disconnect output driver when pad is used as an input Must be protected against ESD Three-State Pad

156 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Design for Manufacturability Defects and Faults Critical Area Modeling Critical Area Extraction Yield Modeling Yield Control

157 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Design faults Random faults Shorts between lines Breaks of lines Leakage of insulation layers Permanent faults Dynamic faults Transient faults Parametric test Power consumption Input resistance Input/output current Functional test Test signals applied Output observed Defects, Faults, and Tests

158 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Defect density distribution g(D) Defect size distribution h(X) Defects Statistics

159 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 D g(D) 2DD 1 1 D Defect Density Distribution

160 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Defect Size Distribution

161 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Sample size Number of measurements Structure size High and low defect densities Critical dimensions As defect sizes Self-isolation Sensitive on one defect type Measurability Electrical Simple Reliable Test Structures

162 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Long parallel conductorsReal Patterns Critical Area Models

163 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Critical Area Models

164 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Point defects Lithographic defects Critical Area Extraction

165 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 ShortsBreaks Vertical shorts Critical Area Extraction

166 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Yield Models

167 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Yield Control

168 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Design Methodology SOC Design Flow An Example SOC Example

169 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Every company has its own design methodology Methodology depends on Size of chip Design time constraints Cost/performance Available tools Driven by contradictory impulses Customer concerns about cost and performance Forecasts of feasibility of cost and performance Design features, performance, power, etc. May be negotiated at early stages Negotiation at later stages creates problems Design Methodology

170 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Design styles Full custom logic design (very tedious) Semi custom or standard cell design Field Programmable Gate Array (FPGA) design Full custom design Most likely for data-paths Least likely for random logic off critical path Standard cell design Application Specific Integrated Circuits (ASIC) FPGA design Prototyping oriented Design Styles

171 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Functionality, speed, power consumption, area, etc. Estimation techniques of circuit performance vary with module Memories may be generated once size is known Data-paths may be estimated from previous design Controllers are hard to estimate without details Clock distribution Layout design rule check Testing Generation of simulation test vectors Generation of scan test vectors (ATPG) Manufacturing test vectors comprise of both Design Validation

172 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 MPL Library New RTL Designs HDL Top Module Definition Simulation OK? Logic Synthesis Simulation OK? Layout Synthesis Simulation OK? Final Chip Layout Test Benches yes Configurable Modules (synthesizable RTL code) Pre-defined Modules (synthesized net-lists, layouts, standard cells) New logic run sufficient? yes New layout run sufficient? yes no yes no Applications System Specification A A A SOC Design Flow

173 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Wireless Broadband Networks A highly integrated broadband wireless modem Wireless Internet A new terminal oriented TCP/IP for wireless systems RF Baseband DLC Application Engine Mobile Bus. Engine Protocol Engine Wireless Internet Power Management Wireless Internet Test Engine Test Project Wireless Broadband Network Wireless Sensor Networks A flexible sensor network node architecture for medical applications Mobile Business Engine A specific application processor for highly efficient encryption operations Applications

174 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 SOC designs are a mix of Intellectual Property (IP) blocks Standard functions (in-house developed) Application specific blocks (in-house developed) RTL descriptions, net-lists and layouts Soft-core MIPS32 4KEp Soft-core LEON-2 Hard-core LEON-3 Soft-core IPMS430 UART, GPIO, PCMCIA AMBA, I 2 C bus Controllers Hardware accelerators SRAM memory generator Hard-core flash Reusable Modules

175 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Configurable Processor

176 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Customisation addresses three architectural levels Instruction extension: designer specifies its functionality Inclusion/exclusion of predefined blocks: special registers, BIST, etc. Parameterisation: cache size, number of registers, etc. To customise the extensible processor to a specific application It starts with profiling the application using instruction-set simulator To evaluate customisations using retargetable tool generation Retargetable techniques automatically generate compilers and simulators aware of the new or extended instructions Major players in the field Tensilica, Improv Systems, ARC, Coware, and Target Compiler Techn. NECs TCP/IP offload engine integrates 10 extensible processors Extensible Processor

177 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 The crisis of complexity in SOC design SOC designs do not exploit the possible 100 M transistors per chip 250 M gates per SOC are feasible by 2005, most SOCs would use only 50 M gates More design starts for Application-Specific Standard Products (ASSP) than for ASIC Increase of the number of programmable processing units Decrease of the gate count for custom logic blocks The trend is expected to continue, yielding a sea of processors Many heterogeneous processors connected by a network-on-chip System-level design tools combined with use of off-the-shelf components are needed Application-Specific Instruction-set Processors (ASIP) from ASSP Extensible processor platform is state-of-the-art in ASIP technology SOC Design Trends

178 DAAD Program Akademischer Neuaufbau Südosteuropa, Embedded System Design Course at University of Nis, Faculty of Electronic Engineering, 29.06.-03.07.2009 Optimisation and search through a large design space Right set of extensible instructions and its constraints Communication of many extensible SOC processors on chip A customised Network-on-Chip (NOC) Shift in SoC design distinction To the process of customising extensible processors To the software design expertise needed to program them Structured ASICs introduce pre-built blocks (logic, configurable memories, and test structures) Most of the metal layers predefined Customisation of the upper two or three metal layers All customers share the same prefabricated die Open Issues and Alternatives


Download ppt "DAAD Program Akademischer Neuaufbau Südosteuropa Embedded System Design Course SOC Design: From System to Transistor Zoran Stamenković."

Similar presentations


Ads by Google