Sequential Definitions  Use two level sensitive latches of opposite type to build one master-slave flipflop that changes state on a clock edge (when the.

Slides:



Advertisements
Similar presentations
Digital Integrated Circuits A Design Perspective
Advertisements

Introduction to CMOS VLSI Design Sequential Circuits
Designing Sequential Logic Circuits
Modern VLSI Design 4e: Chapter 5 Copyright  2008 Wayne Wolf Topics n Memory elements. n Basics of sequential machines.
ELEC 256 / Saif Zahir UBC / 2000 Timing Methodology Overview Set of rules for interconnecting components and clocks When followed, guarantee proper operation.
L06 – Clocks Spring /18/05 Clocking.
(Neil west - p: ). Finite-state machine (FSM) which is composed of a set of logic input feeding a block of combinational logic resulting in a set.
Sequential Circuits. Outline  Floorplanning  Sequencing  Sequencing Element Design  Max and Min-Delay  Clock Skew  Time Borrowing  Two-Phase Clocking.
Z. Feng MTU EE4800 CMOS Digital IC Design & Analysis EE4800 CMOS Digital IC Design & Analysis Lecture 11 Sequential Circuit Design Zhuo Feng.
Introduction to CMOS VLSI Design Lecture 19: Design for Skew David Harris Harvey Mudd College Spring 2004.
Sequential Circuits IEP on Synthesis of Digital Design Sequential Circuits S. Sundar Kumar Iyer.
Introduction to CMOS VLSI Design Clock Skew-tolerant circuits.
EE141 © Digital Integrated Circuits 2nd Timing Issues 1 Digital Integrated Circuits A Design Perspective Timing Issues Jan M. Rabaey Anantha Chandrakasan.
CMPEN 411 VLSI Digital Circuits Spring 2009 Lecture 17: Dynamic Sequential Circuits And Timing Issues [Adapted from Rabaey’s Digital Integrated Circuits,
Synchronous Digital Design Methodology and Guidelines
CSE477 L19 Timing Issues; Datapaths.1Irwin&Vijay, PSU, 2002 CSE477 VLSI Digital Circuits Fall 2002 Lecture 19: Timing Issues; Introduction to Datapath.
Clock Design Adopted from David Harris of Harvey Mudd College.
Technical Seminar on Timing Issues in Digital Circuits
Chapter 11 Timing Issues in Digital Systems Boonchuay Supmonchai Integrated Design Application Research (IDAR) Laboratory August 20, 2004; Revised - July.
CSE477 L19 Timing Issues; Datapaths.1Irwin&Vijay, PSU, 2002 Complex Digital Circuits Design Lecture 2: Timing Issues; [Adapted from Rabaey’s Digital Integrated.
ECE 424 – Introduction to VLSI Design Emre Yengel Department of Electrical and Communication Engineering Fall 2014.
Digital Integrated Circuits A Design Perspective
Digital Integrated Circuits© Prentice Hall 1995 Timing ISSUES IN TIMING.
Introduction to CMOS VLSI Design SRAM/DRAM
Lecture 8: Clock Distribution, PLL & DLL
© Digital Integrated Circuits 2nd Sequential Circuits Digital Integrated Circuits A Design Perspective Designing Sequential Logic Circuits Jan M. Rabaey.
Modern VLSI Design 2e: Chapter 5 Copyright  1998 Prentice Hall PTR Topics n Memory elements. n Basics of sequential machines.
1 EECS Components and Design Techniques for Digital Systems Lec 21 – RTL Design Optimization 11/16/2004 David Culler Electrical Engineering and Computer.
Chapter #6: Sequential Logic Design 6.2 Timing Methodologies
Temporizzazioni e sincronismo1 Progettazione di circuiti e sistemi VLSI Anno Accademico Lezione Temporizzazioni e sincronizzazione.
1 EE365 Read-only memories Static read/write memories Dynamic read/write memories.
Chapter 6 Memory and Programmable Logic Devices
Digital Integrated Circuits for Communication
CSE477 L17 Static Sequential Logic.1Irwin&Vijay, PSU, 2003 CSE477 VLSI Digital Circuits Fall 2003 Lecture 17: Static Sequential Circuits Mary Jane Irwin.
© Digital Integrated Circuits 2nd Sequential Circuits Digital Integrated Circuits A Design Perspective Designing Sequential Logic Circuits Jan M. Rabaey.
Digital Integrated Circuits A Design Perspective
ENGG 6090 Topic Review1 How to reduce the power dissipation? Switching Activity Switched Capacitance Voltage Scaling.
Evolution in Complexity Evolution in Transistor Count.
Lecture 2 1 Computer Elements Transistors (computing) –How can they be connected to do something useful? –How do we evaluate how fast a logic block is?
Review: Basic Building Blocks  Datapath l Execution units -Adder, multiplier, divider, shifter, etc. l Register file and pipeline registers l Multiplexers,
CSE477 L17 Static Sequential Logic.1Irwin&Vijay, PSU, 2002 CSE477 VLSI Digital Circuits Fall 2002 Lecture 17: Static Sequential Circuits Mary Jane Irwin.
FPGA-Based System Design: Chapter 3 Copyright  2004 Prentice Hall PTR Topics n Latches and flip-flops. n RAMs and ROMs.
CSE477 L24 RAM Cores.1Irwin&Vijay, PSU, 2002 CSE477 VLSI Digital Circuits Fall 2002 Lecture 24: RAM Cores Mary Jane Irwin ( )
Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.
CSE477 L23 Memories.1Irwin&Vijay, PSU, 2002 CSE477 VLSI Digital Circuits Fall 2002 Lecture 23: Semiconductor Memories Mary Jane Irwin (
Sp09 CMPEN 411 L18 S.1 CMPEN 411 VLSI Digital Circuits Spring 2009 Lecture 16: Static Sequential Circuits [Adapted from Rabaey’s Digital Integrated Circuits,
UltraSPARC III Hari P. Ananthanarayanan Anand S. Rajan.
Computer Organization and Design Memories and State Machines Montek Singh Wed, Nov 6-11, 2013 Lecture 13.
Introduction to Clock Tree Synthesis
CMPEN 411 VLSI Digital Circuits Spring 2009 Lecture 19: Adder Design
CDA 4253 FPGA System Design RTL Design Methodology 1 Hao Zheng Comp Sci & Eng USF.
07/11/2005 Register File Design and Memory Design Presentation E CSE : Introduction to Computer Architecture Slides by Gojko Babić.
Review: Sequential Definitions
VADA Lab.SungKyunKwan Univ. 1 L5:Lower Power Architecture Design 성균관대학교 조 준 동 교수
Sp09 CMPEN 411 L21 S.1 CMPEN 411 VLSI Digital Circuits Spring 2009 Lecture 21: Shifters, Decoders, Muxes [Adapted from Rabaey’s Digital Integrated Circuits,
CSE477 L21 Multiplier Design.1Irwin&Vijay, PSU, 2002 CSE477 VLSI Digital Circuits Fall 2002 Lecture 21: Multiplier Design Mary Jane Irwin (
1 Recap: Lecture 4 Logic Implementation Styles:  Static CMOS logic  Dynamic logic, or “domino” logic  Transmission gates, or “pass-transistor” logic.
EE141 Timing Issues 1 Chapter 10 Timing Issues Rev /11/2003 Rev /28/2003 Rev /05/2003.
EE141 Timing Issues 1 Chapter 10 Timing Issues Rev /11/2003.
CSE477 L19 Timing Issues; Datapaths.1Irwin&Vijay, PSU, 2003 CSE477 VLSI Digital Circuits Fall 2003 Lecture 19: Timing Issues; Introduction to Datapath.
1 Clockless Logic Montek Singh Thu, Mar 2, Review: Logic Gate Families  Static CMOS logic  Dynamic logic, or “domino” logic  Transmission gates,
Digital Integrated Circuits A Design Perspective
Mary Jane Irwin ( ) CSE477 VLSI Digital Circuits Fall 2002 Lecture 19: Timing Issues; Introduction to Datapath.
Architecture & Organization 1
Mary Jane Irwin ( ) CSE477 VLSI Digital Circuits Fall 2002 Lecture 22: Shifters, Decoders, Muxes Mary Jane.
Architecture & Organization 1
Mary Jane Irwin ( ) CSE477 VLSI Digital Circuits Fall 2003 Lecture 22: Shifters, Decoders, Muxes Mary Jane.
Chapter 10 Timing Issues Rev /11/2003 Rev /28/2003
332:578 Deep Submicron VLSI Design Lecture 14 Design for Clock Skew
Presentation transcript:

Sequential Definitions  Use two level sensitive latches of opposite type to build one master-slave flipflop that changes state on a clock edge (when the slave is transparent)  Static storage l static uses a bistable element with feedback to store its state and thus preserves state as long as the power is on -Loading new data into the element: 1) cutting the feedback path (mux based); 2) overpowering the feedback path (SRAM based)  Dynamic storage l dynamic stores state on parasitic capacitors so the state held for only a period of time (milliseconds); requires periodic refresh l dynamic is usually simpler (fewer transistors), higher speed, lower power but due to noise immunity issues always modify the circuit so that it is pseudostatic

Timing Classifications  Synchronous systems l All memory elements in the system are simultaneously updated using a globally distributed periodic synchronization signal (i.e., a global clock signal) l Functionality is ensure by strict constraints on the clock signal generation and distribution to minimize -Clock skew (spatial variations in clock edges) -Clock jitter (temporal variations in clock edges)  Asynchronous systems l Self-timed (controlled) systems l No need for a globally distributed clock, but have asynchronous circuit overheads (handshaking logic, etc.)  Hybrid systems l Synchronization between different clock domains l Interfacing between asynchronous and synchronous domains

Synchronous Timing Basics  Under ideal conditions (i.e., when t clk1 = t clk2 ) T  t c-q + t plogic + t su t hold ≤ t cdlogic + t cdreg  Under real conditions, the clock signal can have both spatial (clock skew) and temporal (clock jitter) variations l skew is constant from cycle to cycle (by definition); skew can be positive (clock and data flowing in the same direction) or negative (clock and data flowing in opposite directions) l jitter causes T to change on a cycle-by-cycle basis DQ R1 Combinational logic DQ R2 clk In t clk1 t clk2 t c-q, t su, t hold, t cdreg t plogic, t cdlogic

Sources of Clock Skew and Jitter in Clock Network PLL clock generation clock drivers power supply interconnect capacitive load capacitive coupling temperature  Skew l manufacturing device variations in clock drivers l interconnect variations l environmental variations (power supply and temperature)  Jitter l clock generation l capacitive loading and coupling l environmental variations (power supply and temperature)

Positive Clock Skew DQ R1 Combinational logic DQ R2 clk In t clk1 t clk2 delay   > 0: Improves performance, but makes t hold harder to meet. If t hold is not met (race conditions), the circuit malfunctions independent of the clock period! T T +   > 0  + t hold T +   t c-q + t plogic + t su so T  t c-q + t plogic + t su -  t hold +  ≤ t cdlogic + t cdreg so t hold ≤ t cdlogic + t cdreg -  1234  Clock and data flow in the same direction T : t hold :

Negative Clock Skew DQ R1 Combinational logic DQ R2 clk In t clk1 t clk2 delay  Clock and data flow in opposite directions T T +   < 0 T +   t c-q + t plogic + t su so T  t c-q + t plogic + t su -  t hold +  ≤ t cdlogic + t cdreg so t hold ≤ t cdlogic + t cdreg -  1234   < 0: Degrades performance, but t hold is easier to meet (eliminating race conditions) T : t hold :

Clock Jitter  Jitter causes T to vary on a cycle-by- cycle basis R1 Combinational logic clk In t clk T -t jitter +t jitter T - 2t jitter  t c-q + t plogic + t su so T  t c-q + t plogic + t su + 2t jitter  Jitter directly reduces the performance of a sequential circuit T :

Combined Impact of Skew and Jitter DQ R1 Combinational logic DQ R2 In t clk1 t clk2  Constraints on the minimum clock period (  > 0)   > 0 with jitter: Degrades performance, and makes t hold even harder to meet. (The acceptable skew is reduced by jitter.) T T +   > t jitter T  t c-q + t plogic + t su -  + 2t jitter t hold ≤ t cdlogic + t cdreg –  – 2t jitter

Clock Distribution Networks  Clock skew and jitter can ultimately limit the performance of a digital system, so designing a clock network that minimizes both is important l In many high-speed processors, a majority of the dynamic power is dissipated in the clock network. l To reduce dynamic power, the clock network must support clock gating (shutting down (disabling the clock) units)  Clock distribution techniques l Balanced paths (H-tree network, matched RC trees) -In the ideal case, can eliminate skew -Could take multiple cycles for the clock signal to propagate to the leaves of the tree l Clock grids -Typically used in the final stage of the clock distribution network -Minimizes absolute delay (not relative delay)

H-Tree Clock Network Clock Idle condition Gated clock Can insert clock gating at multiple levels in clock tree Can shut off entire subtree if all gating conditions are satisfied  If the paths are perfectly balanced, clock skew is zero

DEC Alpha (EV5)  300 MHz clock (9.3 million transistors on a 16.5x18.1 mm die in 0.5 micron CMOS technology) l single phase clock  3.75 nF total clock load l Extensive use of dynamic logic  20 W (out of 50) in clock distribution network  Two level clock distribution l Single 6 stage driver at the center of the chip l Secondary buffers drive the left and right sides of the clock grid in m3 and m4  Total equivalent driver size of 58 cm !!

Clock Skew in Alpha Processor  Absolute skew smaller than 90 ps  The critical instruction and execution units all see the clock within 65 ps

Dealing with Clock Skew and Jitter  To minimize skew, balance clock paths using H-tree or matched-tree clock distribution structures.  If possible, route data and clock in opposite directions; eliminates races at the cost of performance.  The use of gated clocks to help with dynamic power consumption make jitter worse.  Shield clock wires (route power lines – V DD or GND – next to clock lines) to minimize/eliminate coupling with neighboring signal nets.  Use dummy fills to reduce skew by reducing variations in interconnect capacitances due to interlayer dielectric thickness variations.  Beware of temperature and supply rail variations and their effects on skew and jitter. Power supply noise fundamentally limits the performance of clock networks.

Major Components of a Computer Processor Control Datapath Memory Devices Input Output  Modern processor architecture styles l Pipelined, single issue (e.g., ARM) l Pipelined, hardware controlled multiple issue – superscalar l Pipelined, software controlled multiple issue – VLIW l Pipelined, multiple issue from multiple process threads - multithreaded

Basic Building Blocks  Datapath l Execution units -Adder, multiplier, divider, shifter, etc. l Register file and pipeline registers l Multiplexers, decoders  Control l Finite state machines (PLA, ROM, random logic)  Interconnect l Switches, arbiters, buses  Memory l Caches, TLBs, DRAM, buffers

MIPS 5-Stage Pipelined (Single Issue) Datapath pipeline stage isolation register FetchDecodeExecuteMemoryWriteBack clk Icache precharge Dcache precharge RegWrite

Datapath Bit-Sliced Organization Control Flow Bit 0 Bit 1 Bit 2 Bit 3 Tile identical bit-slice elements Register File Pipeline RegisterAdderShifterPipeline RegisterMultiplexer Data Flow Pipeline Register From I$ Pipeline Register To/From D$