9 Timing Issues Contents 1. Clocking Schemes and Storage Elements

Slides:



Advertisements
Similar presentations
1 Lecture 16 Timing  Terminology  Timing issues  Asynchronous inputs.
Advertisements

Digital Integrated Circuits A Design Perspective
Introduction to CMOS VLSI Design Sequential Circuits.
VLSI Design EE 447/547 Sequential circuits 1 EE 447/547 VLSI Design Lecture 9: Sequential Circuits.
11/12/2004EE 42 fall 2004 lecture 311 Lecture #31 Flip-Flops, Clocks, Timing Last lecture: –Finite State Machines This lecture: –Digital circuits with.
Introduction to CMOS VLSI Design Sequential Circuits
ECE C03 Lecture 81 Lecture 8 Memory Elements and Clocking Hai Zhou ECE 303 Advanced Digital Design Spring 2002.
MICROELETTRONICA Sequential circuits Lection 7.
ELEC 256 / Saif Zahir UBC / 2000 Timing Methodology Overview Set of rules for interconnecting components and clocks When followed, guarantee proper operation.
Flip-Flops Computer Organization I 1 June 2010 © McQuain, Feng & Ribbens A clock is a free-running signal with a cycle time. A clock may be.
Lecture 11: Sequential Circuit Design. CMOS VLSI DesignCMOS VLSI Design 4th Ed. 11: Sequential Circuits2 Outline  Sequencing  Sequencing Element Design.
CHAPTER 3 Sequential Logic/ Circuits.  Concept of Sequential Logic  Latch and Flip-flops (FFs)  Shift Registers and Application  Counters (Types,
L06 – Clocks Spring /18/05 Clocking.
1 KU College of Engineering Elec 204: Digital Systems Design Lecture 12 Basic (NAND) S – R Latch “Cross-Coupling” two NAND gates gives the S -R Latch:
경종민 1 Clock skew and signal reflection.
Chapter #6: Sequential Logic Design
(Neil west - p: ). Finite-state machine (FSM) which is composed of a set of logic input feeding a block of combinational logic resulting in a set.
EKT 124 / 3 DIGITAL ELEKTRONIC 1
Sequential Circuits. Outline  Floorplanning  Sequencing  Sequencing Element Design  Max and Min-Delay  Clock Skew  Time Borrowing  Two-Phase Clocking.
Z. Feng MTU EE4800 CMOS Digital IC Design & Analysis EE4800 CMOS Digital IC Design & Analysis Lecture 11 Sequential Circuit Design Zhuo Feng.
Introduction to CMOS VLSI Design Lecture 19: Design for Skew David Harris Harvey Mudd College Spring 2004.
Introduction to CMOS VLSI Design Clock Skew-tolerant circuits.
Sequential Definitions  Use two level sensitive latches of opposite type to build one master-slave flipflop that changes state on a clock edge (when the.
EE141 © Digital Integrated Circuits 2nd Timing Issues 1 Digital Integrated Circuits A Design Perspective Timing Issues Jan M. Rabaey Anantha Chandrakasan.
CMPEN 411 VLSI Digital Circuits Spring 2009 Lecture 17: Dynamic Sequential Circuits And Timing Issues [Adapted from Rabaey’s Digital Integrated Circuits,
Synchronous Digital Design Methodology and Guidelines
CSE477 L19 Timing Issues; Datapaths.1Irwin&Vijay, PSU, 2002 CSE477 VLSI Digital Circuits Fall 2002 Lecture 19: Timing Issues; Introduction to Datapath.
Clock Design Adopted from David Harris of Harvey Mudd College.
Chapter 11 Timing Issues in Digital Systems Boonchuay Supmonchai Integrated Design Application Research (IDAR) Laboratory August 20, 2004; Revised - July.
CSE477 L19 Timing Issues; Datapaths.1Irwin&Vijay, PSU, 2002 Complex Digital Circuits Design Lecture 2: Timing Issues; [Adapted from Rabaey’s Digital Integrated.
Digital Integrated Circuits A Design Perspective
Digital Integrated Circuits© Prentice Hall 1995 Timing ISSUES IN TIMING.
Lecture 8: Clock Distribution, PLL & DLL
11/19/2004EE 42 fall 2004 lecture 341 Lecture #34: data transfer Last lecture: –Example circuits –shift registers –Adders –Counters This lecture –Communications.
Sequential Logic 1  Combinational logic:  Compute a function all at one time  Fast/expensive  e.g. combinational multiplier  Sequential logic:  Compute.
Embedded Systems Hardware:
CS 151 Digital Systems Design Lecture 20 Sequential Circuits: Flip flops.
11/15/2004EE 42 fall 2004 lecture 321 Lecture #32 Registers, counters etc. Last lecture: –Digital circuits with feedback –Clocks –Flip-Flops This Lecture:
Sequential Circuits. 2 Sequential vs. Combinational Combinational Logic:  Output depends only on current input −TV channel selector (0-9) Sequential.
Embedded Systems Hardware: Storage Elements; Finite State Machines; Sequential Logic.
Chapter #6: Sequential Logic Design 6.2 Timing Methodologies
Introduction to CMOS VLSI Design Lecture 10: Sequential Circuits Credits: David Harris Harvey Mudd College (Material taken/adapted from Harris’ lecture.
Contemporary Logic Design Sequential Logic © R.H. Katz Transparency No Chapter #6: Sequential Logic Design Sequential Switching Networks.
Lecture 11 MOUSETRAP: Ultra-High-Speed Transition-Signaling Asynchronous Pipelines.
Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.
Chapter 3: Sequential Logic Circuit EKT 121 / 4 ELEKTRONIK DIGIT 1.
1 CSE370, Lecture 16 Lecture 19 u Logistics n HW5 is due today (full credit today, 20% off Monday 10:29am, Solutions up Monday 10:30am) n HW6 is due Wednesday.
© Digital Integrated Circuits 2nd Sequential Circuits Digital Integrated Circuits A Design Perspective Designing Sequential Logic Circuits Jan M. Rabaey.
COE 202: Digital Logic Design Sequential Circuits Part 1
© H. Heck 2008Section 4.41 Module 4:Metrics & Methodology Topic 4: Recovered Clock Timing OGI EE564 Howard Heck.
1 Sequential Logic Lecture #7. 모바일컴퓨팅특강 2 강의순서 Latch FlipFlop Shift Register Counter.
Topic: Sequential Circuit Course: Logic Design Slide no. 1 Chapter #6: Sequential Logic Design.
Spring 2006EE VLSI Design II - © Kia Bazargan 332 EE 5324 – VLSI Design II Kia Bazargan University of Minnesota Part VIII: Timing Issues.
Reading Assignment: Rabaey: Chapter 9
1 Signal and Timing Parameters II Source Synchronous Timing – Class 3 a.k.a. Co-transmitted Clock Timing a.k.a. Clock Forwarding. Assignment for next class:
Introduction to Clock Tree Synthesis
ECE C03 Lecture 81 Lecture 8 Memory Elements and Clocking Hai Zhou ECE 303 Advanced Digital Design Spring 2002.
Interconnect/Via.
EKT 121 / 4 ELEKTRONIK DIGIT I
1 COMP541 Sequential Logic Timing Montek Singh Sep 30, 2015.
Clocking System Design
ACCESS IC LAB Graduate Institute of Electronics Engineering, NTU 99-1 Under-Graduate Project Design of Datapath Controllers Speaker: Shao-Wei Feng Adviser:
FAMU-FSU College of Engineering EEL 3705 / 3705L Digital Logic Design Spring 2007 Instructor: Dr. Michael Frank Module #10: Sequential Logic Timing & Pipelining.
EE141 Timing Issues 1 Chapter 10 Timing Issues Rev /11/2003 Rev /28/2003 Rev /05/2003.
EE141 Timing Issues 1 Chapter 10 Timing Issues Rev /11/2003.
CSE477 L19 Timing Issues; Datapaths.1Irwin&Vijay, PSU, 2003 CSE477 VLSI Digital Circuits Fall 2003 Lecture 19: Timing Issues; Introduction to Datapath.
Lecture 11: Sequential Circuit Design
Sequential circuit design with metastability
Chapter 10 Timing Issues Rev /11/2003 Rev /28/2003
Presentation transcript:

9 Timing Issues Contents 1. Clocking Schemes and Storage Elements 2. Clock Distribution Network 3. Self-Timed Logic Circuits 4. Synchronizers 5. Clock Generation & Synchronization

1. Clocking Schemes based on each storage element Waveforms for D-latch, +ve edge-triggered D-f/f, and 2-phase double latch(latter two are equivalent to each other)

Clk1 and clk2 are non-overlapping each other Finite State Machines based on each storage element Clk1 and clk2 are non-overlapping each other

CL Clock jitter(skew) positive skew negative skew signal direction clk 1 2 CL +ve clk clk’ -ve Positive skew tdelay,min must be obeyed. Otherwise, 2nd f/f, at the current sample point, samples the next value, not the current one (which is the correct one).  called double clocking negative skew Tp-(tdelay,max + tsetup) must be obeyed. Otherwise, 2nd f/f, at the next sample point, samples the old value, not the updated value(which is the correct one).

Single-phase system with edge-triggered flip-flops negative skew의 경우임

i) maximum allowable clock skew = tskew,max (Race equation) to prevent race condition, i.e., to prevent f/f from deciding Q with next input rather than current input. when hold time is ignored, tskew,max < tf/f,min+tcl,min when hold time is considered, tskew,max < tf/f,min+tcl,min- thold,max (thold:min. time a signal needs to stay stable after clock edge) ii) min. clock cycle time for correct operation with stable f/f inputs considering clock skew(Delay equation) Tp,min>tf/f,max+tcl.max+tsetup,max- tskew,max iii) tclk-width>thold, to guarantee correct data capture. 이만큼의 추가적인 여유가 있어야 next sample이 아닌 current 신호가 제대로 (hold time requirement를 만족하며) sample 된다. tskew가 +ve이면 실제로 TP가 tskew 만큼 늘어나는 효과가 있다. 즉, TP값이 그만큼의 여유가 생긴다. 아무리 edge-triggered f/f이지만(tve f/f의 경우) sampling edge 후에 thold 만큼은 clock이 high로 stay 해야 함.

Single-phase system with latches - ve skew의 경우임

i) Race condition: double-sided constraint on clock width, tclk-width clock width must be greater than tsetup.( tsetup for latch is the min. time a signal should remain stable before the fall of clock edge) tclk-width  tsetup,max clock width must be shorter than the sum of 1-stage delay(consisting of tlatch and tcl) minus hold time and skew, to prevent any signal from passing through more than one stage. Tclk-width  tlatch,min + tcl,min - thold,max - tskew,max ii) min. cycle time(in the critical stage) tcycle,min > tlatch,max + tcl,max + tsetup,max + tskew,max- tclk-width,min some delay as much as this can be transferred to the preceding or succeeding non-critical stages. -ve skew의 경우 이만큼 clk width가 effectively 줄어듬. clk이 후에도 current 신호가 이 시간만큼 유지되어야 함. 즉, clk이 하기 thold.max 만큼 전에는 next 신호가 오면 안됨. 합친 term 시간 이내에 delay와 setup까지 이루어져야 함. -ve skew의 경우에는 skew만큼 그 시간(tP+tW)이 줄어드는 것과 같다. tclk.width

2-phase non-overlapping clock using double latchl

Single-phase, edge.trig f/f에 비해 이만큼 여유가 더 있음. i) Race condition: thold-max < tnon-ol,min + tlatch,min + tlogic,min - tskew,max ii) min. cycle time(delay equation) tcycle,min > tnon-ol,max + tlatch,max + tlogic,max + tsetup,max - tskew,max iii) data capture condition tsetup < tclk-width 바꾸어 생각하면 편리

Intentional clock skew

tcl2,min > tcl1,min , tcl3,min 일 때 CLK’ 은 앞으로 CLK” 은 뒤로 shift 시킴으로써 CL1, CL3에서 남은 시간을 CL2에서 활용 이때 CLK’ 을 너무 advance(혹은 CLK”를 너무 delay)시키면 CL2의 min. delay path가 ta(edge-triggered f/f 인 경우) 혹은 tb(latch인 경우) 보다 짧게 되어 race가 발생할 수 있다.

Relation between race condition( on max, clock skew) and delay condition( on min. clock period) i) when data & clk are running in the same direction(positive skew) Clock skew should be tightly controlled to prevent race condition. With +ve skew, clock frequency can be increased for higher performance. ii) when data and clock are running in opposite direction(negative skew) No need to worry about race condition, But, -ve skew degrades the performance by increasing the min. clock period according to the delay equation.

How to suppress race condition 1) routing clock in the opposite direction of data(easy to implement in datapath) at the cost of performance degradation 2) controlling the non-overlap periods of clock( in 2-phase clocking) 3) Try to obtain good clock distribution network to obtain as uniform clock skew as possible at the local clock point.( Absolute skew between clock source and local clock point is irrelevant) 4) Clock dist. Network interconnect material shape of the dist. Network clock driver/buffering schemes load, i.e., fan-out on the clock lines rise/fall time of the clock 5) Avoid global clock/ Use self-timed approach

2. Clock Distribution Network H-tree as clock dist. Network clock receiver(photo-diode) at the center receiving sharp laser pulse through a glass window in the package

Two-level buffering(Hierarchy)

Composition of a PLL(Phase-Locked Loop) i) Loop filter : loop filter is introduced to remove clock jitter. 1st to 3rd-order LPF is generally needed, as excessive phase shift due to high-order filtering can cause instability in this feedback structure. ii) Lock range : range of input frequency over which output follows input frequency over which output follows input with given relationship. iii) Lock time : time for PLL to lock into the input iv) Jitter : Loop filter(LPF) helps remove jitter.

Simply generating local clocks from global clock generates clock skew causing inter-chip communication impossible.

How to minimize clock skew in multi-chip system, i. e How to minimize clock skew in multi-chip system, i.e., board or multiple-board system. i) Global dist. Network. ii) On-chip clock generator/buffer PLL can help here only. iii) Local dist. Network. Global Clock Source

Each Component of Skew : i) Chip-to-chip clock skew due to global dist. Network ; can be suppressed by ; placing clock pins/pads at the identical positions on the chip carrier/chip. Keeping the lead length and capacitive loading of clock pins and wires from the global clock source to each clock pin as identical as possible. ii) Skew due to on-chip clock generator/buffer can be suppressed by PLL;

Each Component of PLL(Phase detector, LPF, voltage-controlled delay line)

Methodology for dealing with timing problems in LARGE systems ; 1) Divide the whole system into a number of regions, with each region operating in synchronous manner. 2) Communication among each region is either i) through a global clock slower(N) than local clock or ii) asynchronously using self-timed discipline.

Using PLL for local synchronization between global & local clocks. Delay of local clock is adjusted via. PLL to make the local clock edge occurring simultaneously with global clock edge.

Minimal skew system 1) equal-length chip-to-chip interconnection 2) PLL-based clock generator/buffer, and 3) equal-length on-chip distribution(H-tree)

Symmetric clock trees(H- vs X- tree) - H-tree is better than X-tree in that i) in H-tree, no corners sharper than 90, thus with smaller inductive discontinuity, reflection is small. ii) in H-tree, fan-out is only 2, simplifying impedance matching

Reduction of inductive discontinuities at the corners of H-tree.

Zk+1 Zk Zk+1 Matching condition at the branch point : - impedance matching occurs when Zk=Zk+1//Zk+1= Zk+1 Zk Zk+1 Zk+1 2

Driving the clock lines :

Sharpening clock signal at the receiver front before distribution in the subblock using schmitt trigger or source-end-terminated buffer.(Look at sharp rise of Vb in previous slide.)

RC network representation of H-clock tree(simplified as a distributed RC line): When tailored H-clock tree is used, I.e., if the line width is halved at each branching point, above distributed RC tree network is equivalent to a uniformly distributed RC line.(R1= R2= R3=…, C1=2C2=4C3=...)

1) From distributed RC model ; Requirement of the cross-sectional geometry(height, width) of interconnection line : 1) From distributed RC model ; Total distance from clock source to end point(ltot) in H-tree : Time required for the last node to reach 90% of its final value : Rint : resistance of interconnection per unit length Cint : capacitance per unit length

2) From lossy transmission line RC model : Eq. (1), (2) need to be considered in determining H&W. For high frequency, skin effect prevents thickening the interconnection by more than 2-4 times the skin depth ineffective. For 1GHz, skin depth of aluminum is 2.8m.

Simulation of H-clock tree with the last stage unmatched.

Reflections in the final unmatched branch :

3. Self-Timed Logic Circuits Synchronous vs. Asynchronous(Self-Timed) i) In pipelined systems, performance(throughput) depends on worst case stage delay in synchronous systems, while it depends on average delay in self-timed systems. ii) (cons of synchronous system) : Distribution of high-speed clock over all region is a very difficult problem. iii) (cons of asynchronous systems) : Hand-shaking logic overhead.

Pipelined, synchronous datapath : Self-timed pipelined datapath : R1 F1 R2 F2 R3 F3 R4 In Out tpreg tpF1 tpF2 tpF3 1 5 Req Req Req Req HS HS HS 2 6 Ack Ack Ack Ack 3 Start Done 4 7 Start Done 8 Start Done R1 F1 R2 F2 R3 F3 In Out tpF1 tpF2 tpF3

Sequence of Events : i) As input word arrives at the input of R1, I.e., at the input buffer(IN buffer) Req(to F1) is raised. If F1 is then available(inactive), data is transferred to R1 and Acknowledges it to IN buffer. ii) F1 is then enabled by ‘start’ signal ‘Done’ signal goes high indicating the completion of the computation. iii) Request to F2-module is issued. If F2 is available, Ack is raised, and the output from F1 is transferred to R2. F1 continues with its next sample for computing.

in transition(or reset) Completion(Done) signal generation : - redundant signal generation. Ex. 2bit per signal : 00 : out 0 11 : out 1 01 or 10 : out retains its current value. B B0 B1 in transition(or reset) 1 1 1 illegal 1 1

Completion signal generation

Self-Timed signaling(2-cycle Hand shaking protocol)

Muller C-element

Implementation of Muller C(-element) element :

An example showing Muller C-element. 1) 2) 2-, self-timed FIFO

4-cycle(4-phase) handshaking protocol : ex) Implementation of 4-phase handshaking protocol

Muller C element : Its output is 1 when all inputs are 1, and 0 when all inputs are 0 ; otherwise remains at its earlier state. Called as rendezvous, join or last-of circuit.

Implementation :

Implementation :

Conversion between single rail plus request and double rail data :

Conversion form chip to package pins :

ex) Self-timed CMOS PLA Dummy line의 extra parasitic C에 의해 AND-plane의 모든 product line이 충분히 stable해진 후에 DONE이 “1”로 올라간다.

DONE 신호는 여러 개의 PLA를 cascade 시킬때 next PLA의 PC로 사용된다 DONE 신호는 여러 개의 PLA를 cascade 시킬때 next PLA의 PC로 사용된다. 이때 2nd PlA는 inverting input을 갖지 못한다.(정 필요하면 따로 1st PLA에서 만들어 주어야 함( unintended discharge) 어느 PLA의 입력이 여러 PLA에서 올때 : 가장 늦은 DONE을 PC로 받아야 함. 