1
Digital System Clocking:
High-Performance and Low-Power Aspects
Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic
Chapter 9: Microprocessor Examples
Wiley-Interscience and IEEE Press, January 2003
2
Microprocessor Examples
Clocking for Intel® Microprocessors
  IA-32 Pentium® Pro
  First IA-64 Microprocessor
  Pentium 4
Sun Microsystems UltraSPARC-III®
  Clocking
  Clocking and CSEs
Alpha® Clocking: A Historical Overview
IBM® Microprocessors
  Level-Sensitive Scan Design
  Examples of CSEs
Nov. 14, 2003 Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic
4
Intel® Microprocessor Features
                   Pentium II     Pentium III     Pentium 4
MPR Issue          June 1997      April 2000      Dec 2001
Clock Speed        266 MHz        1 GHz           2 GHz
Pipeline Stages    12/14          12/14           22/24
Transistors        7.5M           24M             42M
Cache (I/D/L2)     16K/16K/-      16K/16K/256K    12K/8K/256K
Die Size           203 mm²        106 mm²         217 mm²
IC Process         0.28 µm, 4M    0.18 µm, 6M     0.18 µm, 6M
Max Power          27 W           23 W            67 W
Source: Microprocessor Report Journal
5
IA-32 Pentium® Pro
Figure: Clock distribution network with deskewing circuit (Geannopoulos and Dai 1998), Copyright © 1998 IEEE
Labels: Ext Clk, CLK Gen, Delay Line, Delay SR, Deskew Control, PD, Left Spine, Right Spine, FB Clk, Core
6
Adaptive Deskewing Technique
Equalization of two clock distribution spines by compensating for delay mismatch
  Delay lines
  Phase detector
  Controller
Result: global clock skew of only 15 ps
  0.25 µm technology
  7.5M transistors
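The loop named on this slide (phase detector, controller, delay lines) can be illustrated with a minimal behavioral sketch. All numbers and names here are assumptions for illustration only: `STEP_PS`, `DEAD_ZONE_PS`, `phase_detect`, and `deskew` are hypothetical and are not taken from Geannopoulos and Dai. The sketch only shows the idea of adding delay to whichever spine leads until the detector can no longer tell the two apart.

```python
STEP_PS = 12.0       # assumed delay added per shift-register step (hypothetical)
DEAD_ZONE_PS = 8.0   # assumed phase-detector resolution (hypothetical)


def phase_detect(left_ps, right_ps, dead_zone_ps=DEAD_ZONE_PS):
    """Return +1 if the left clock leads, -1 if the right leads, 0 if aligned."""
    diff = right_ps - left_ps  # positive when the left edge arrives first
    if diff > dead_zone_ps:
        return 1
    if diff < -dead_zone_ps:
        return -1
    return 0


def deskew(left_ps, right_ps, max_steps=64):
    """Add delay steps to the leading spine until the detector sees no lead."""
    left_steps = right_steps = 0
    for _ in range(max_steps):
        left = left_ps + left_steps * STEP_PS
        right = right_ps + right_steps * STEP_PS
        verdict = phase_detect(left, right)
        if verdict == 0:
            break
        if verdict > 0:
            left_steps += 1   # left spine is early: slow it down
        else:
            right_steps += 1  # right spine is early: slow it down
    residual = abs((left_ps + left_steps * STEP_PS)
                   - (right_ps + right_steps * STEP_PS))
    return left_steps, right_steps, residual
```

With these assumed values, `deskew(100.0, 155.0)` returns `(4, 0, 7.0)`: four steps are added to the left spine and the residual mismatch falls inside the dead zone. Keeping the step smaller than twice the dead zone prevents the loop from overshooting and oscillating.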
7
IA-32 Pentium® Pro
Figure: Delay shift register (Geannopoulos and Dai 1998), Copyright © 1998 IEEE
Labels: In, Out, Delay Line, Delay Shift Register, Load<1:15,2>, Load<0:14,2>, <1:15,2>, <0:14,2>
8
IA-32 Pentium® Pro
Figure: Phase detector (Geannopoulos and Dai 1998), Copyright © 1998 IEEE
Labels: Left Clk, Right Clk, Phase Detector 1, Phase Detector 2, Delay = n, Bandwidth Control, Left Leads, Right Leads
9
First IA-64 Microprocessor
Figure: Clock distribution topology (Rusu and Tam 2000), Copyright © 2000 IEEE
Labels: Reference Clock, PLL, Core Clock, Deskew Cluster, RCDs
10
Programmable Deskew Units
Strategy similar to that in IA-32
External differential clock
  System bus frequency
PLL generates internal clock
  2x frequency
Clock distribution architecture
  Balanced global clock tree
  Multiple deskew buffers
  Multiple local clock buffers
11
First IA-64 Microprocessor
Figure: Deskew buffer architecture (Rusu and Tam 2000), Copyright © 2000 IEEE
Labels: Global Clock, Deskew Buffer, RCD, Regional Clock Grid, Regional Feedback Clock, Reference Clock, Phase Detector, Digital Filter, Control FSM, Deskew Settings, TAP Interface
12
First IA-64 Microprocessor
Figure: Digitally controlled delay line (Rusu and Tam 2000), Copyright © 2000 IEEE
Labels: Input, Output, Delay Control Register, Enable
13
First IA-64 Microprocessor
Figure: Simulated regional clock-grid skew (Rusu and Tam 2000), Copyright © 2000 IEEE
14
First IA-64 Microprocessor
Figure: Measured regional clock skew (Rusu and Tam 2000), Copyright © 2000 IEEE
15
Pentium® 4
Figure: Core and I/O clock generation (Kurd et al. 2001), Copyright © 2001 IEEE
Blocks named in the figure include: core PLL, clock enable generator, divide by 4, deskew state machine, MSFF, input buffer, inbound buffers, strobe glitch protection and detection, inbound latching
16
Multi-GHz Clock Network in Pentium 4
Three core and three I/O frequencies (total 6 frequencies running concurrently)
Differential off-chip reference clock
PLL synthesizes core and I/O clocks
Global core clock distribution
  47 independent clock domains
  Each domain has 5-bit deskew control register
  Clock skew < 20 ps
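A 5-bit deskew control register gives each of the 47 domains 32 possible settings. The following is a minimal sketch of how per-domain codes could be chosen from measured arrival times. The step size `STEP_PS`, the function names, and the "align everything to the latest domain" policy are assumptions for illustration and do not come from Kurd et al.

```python
NUM_DOMAINS = 47   # from the slide
REG_BITS = 5       # from the slide: 5-bit deskew control register
STEP_PS = 4.0      # assumed delay per register step (hypothetical)


def choose_codes(arrival_ps, step_ps=STEP_PS, bits=REG_BITS):
    """Pick a delay code per domain so every domain lands near the latest one."""
    max_code = (1 << bits) - 1          # 31 for a 5-bit register
    target = max(arrival_ps)            # delay can only be added, never removed
    codes = []
    for t in arrival_ps:
        code = round((target - t) / step_ps)
        codes.append(min(max_code, code))  # clamp to the register range
    return codes


def residual_skew(arrival_ps, codes, step_ps=STEP_PS):
    """Spread between earliest and latest domain after applying the codes."""
    adjusted = [t + c * step_ps for t, c in zip(arrival_ps, codes)]
    return max(adjusted) - min(adjusted)


# Hypothetical arrival times: 47 domains spread over 0..27 ps.
arrivals = [float(i % 10) * 3.0 for i in range(NUM_DOMAINS)]
codes = choose_codes(arrivals)
```

As long as no code is clamped, each domain ends up within half a step of the target, so the residual skew is bounded by one step of the delay line.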
17
Pentium® 4
Figure: Logical diagram of core clock distribution (Kurd et al. 2001), Copyright © 2001 IEEE
Labels: PLL, 3-stage binary tree of clock repeaters, Domain Buffer 1, 2, 3, ... 46, 47, Phase Detector, To Test Access Port, Local Clock Macro, Sequential Elements
18
Pentium® 4
Figure: Example of local clock buffers generating various frequency, phase and types of clocks (Kurd et al. 2001), Copyright © 2001 IEEE
Blocks named in the figure include: Gclk, Adjustable Delay Buffer, Enable 1, Enable 2, Stretch 0, Stretch 1, ClkBuf Type 1, ClkBuf Type 2, ClkBuf Type 3, SlowClkSync
19
Intel Clocking: Summary
Increasing clock speeds and die size
Balancing the clock skew in large designs using simple RC trees is becoming less effective
  Insertion delay of 7-8 FO4 due to increased die size
  Comparable to the clock period
Clock skew control has been getting harder due to increased PVT variations
Inductive effects at multi-GHz rates
Use of active deskewing circuits
21
UltraSPARC® Family Characteristics
                    UltraSPARC-I®      UltraSPARC-II®     UltraSPARC-III®
Year                1995               1997               2000
Architecture        SPARC V9, 4-issue  SPARC V9, 4-issue  SPARC V9, 4-issue
Die size            17.7 x 17.8 mm²    12.5 x 12.5 mm²    15 x 15.5 mm²
# of transistors    5.2M               5.4M               23M
Clock frequency     167 MHz            330 MHz            1 GHz
Supply voltage      3.3 V              2.5 V              1.6 V
Process             0.5 µm CMOS        0.35 µm CMOS       0.15 µm CMOS
Metal layers        4 (Al)             5 (Al)             7 (Al)
Power consumption   <30 W              <30 W              <80 W
22
UltraSPARC-III: Clocking
Performance-driven high-power clock distribution
Eight logic gates per cycle
High-speed semi-dynamic flip-flops with logic embedding
Large hold time mandates use of advanced tools for fixing fast-path violations
23
UltraSPARC-III®: Clocking
Figure: Clock distribution delay in UltraSPARC-III (Heald et al. 2000), Copyright © 2000 IEEE
24
UltraSPARC-III: Clock Storage Elements
Figure: Semidynamic flip-flop (Klass 1998), Copyright © 1998 IEEE
Single-ended dynamic structure with use of keepers for static operation and use of clock pulsing
Positive feedback (NAND) improves low-to-high setup time
Fast, at the price of high internal and clock power
25
UltraSPARC-III: Clock Storage Elements
Logic embedding in a semi-dynamic flip-flop
Figure: Two-input XOR function (Klass 1998), Copyright © 1998 IEEE
A non-inverting logic function can be embedded by replacing the input D transistor with an n-MOS logic network
Necessary for fitting 8 logic stages in cycle time, also used for scan
Complexity of embedded logic limited by the n-MOS stack depth
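Logic embedding can be pictured with a small behavioral model. This is a sketch under stated assumptions, not the circuit from Klass: the class `SemiDynamicFF`, its method names, and the Boolean abstraction of the pull-down network are all hypothetical. The model only captures that the internal node is precharged while the clock is low and conditionally discharged through the n-MOS network at the rising edge, so the output equals the embedded (non-inverting) function.

```python
class SemiDynamicFF:
    """Behavioral model: precharged internal node plus an embedded nMOS network."""

    def __init__(self, pulldown):
        self.pulldown = pulldown  # returns True when the nMOS network conducts
        self.x = True             # internal dynamic node, precharged high
        self.q = False            # output, held by a keeper between edges

    def clock_low(self):
        # Precharge phase: the internal node returns high, Q keeps its value.
        self.x = True

    def rising_edge(self, *inputs):
        # Evaluation window: the node discharges only if the network conducts.
        if self.pulldown(*inputs):
            self.x = False
        self.q = not self.x       # second stage inverts the internal node
        return self.q


def xor_network(a, b):
    """Two parallel branches of two series devices; needs both input polarities."""
    return (a and not b) or (not a and b)


ff = SemiDynamicFF(xor_network)
ff.clock_low()
result = ff.rising_edge(True, False)  # XOR of 1 and 0
```

Each branch of `xor_network` corresponds to two series transistors, which is the stack-depth cost the slide refers to: a wider function adds more devices in series with the clocked transistors.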
26
UltraSPARC-III: Clock Storage Elements
Figures: Single-ended dynamic SDFF; Differential dynamic SDFF (Klass 1998), Copyright © 1998 IEEE
Dynamic version of SDFF used in dynamic logic paths
Outputs exercise precharge-evaluate sequence to ensure monotonicity
27
UltraSPARC-III: Clock Storage Elements
Figure: UltraSPARC-III flip-flop (Heald et al. 2000), Copyright © 2000 IEEE
Final UltraSPARC-III flip-flop modified by decoupling keepers to increase immunity to α-particles
Somewhat degraded speed and logic embedding property
29
Alpha® Microprocessor Features
                    21064         21164         21264         21364
# transistors [M]   1.68          9.3           15.2          152
Die Size [mm²]      16.8 x 13.9   18.1 x 16.5   16.7 x 18.8   21.1 x 18.8
Process             0.75 µm       0.5 µm        0.35 µm       0.18 µm
Supply [V]          3.3           3.3           2.2           1.5
Power [W]           30            50            72            125
Clk Freq. [MHz]     200           300           600           1200
Gates/Cycle         16            14            12            12
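The Clk Freq. and Gates/Cycle rows together imply a time budget per gate. The short calculation below is only an illustration of that arithmetic. It pairs the Gates/Cycle values 16, 14 and 12 with the 21064, 21164 and 21264 in table order, which is an assumption about how the flattened row reads, and the helper name `ps_per_gate` is hypothetical.

```python
# (clock frequency in MHz, gates per cycle), read from the table in order
ALPHA = {
    "21064": (200, 16),
    "21164": (300, 14),
    "21264": (600, 12),
}


def ps_per_gate(freq_mhz, gates_per_cycle):
    """Cycle time in picoseconds divided by the number of gates per cycle."""
    cycle_ps = 1e6 / freq_mhz
    return cycle_ps / gates_per_cycle


budget = {name: ps_per_gate(f, g) for name, (f, g) in ALPHA.items()}
```

For the 21064 this gives 5000 ps / 16 = 312.5 ps per gate, and for the 21264 about 138.9 ps per gate, so the shrinking budget comes from both faster gates and fewer of them per cycle.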
30
Alpha® Microprocessors: Clocking
Figure: Alpha microprocessor final clock driver location: (a) 21064, (b) 21164, (c) 21264
Labels: grid
31
Alpha® Microprocessors: Clocking
Figure: 21064 clock skew (Gronowski et al. 1998), Copyright © 1998 IEEE
32
Alpha® Microprocessors: Clocking
Figure: 21164 clock skew (Gronowski et al. 1998), Copyright © 1998 IEEE
33
Alpha® Microprocessors: Clocking
Figure: 21264 clock hierarchy (Gronowski et al. 1998), Copyright © 1998 IEEE
Labels: ext. clk, PLL, GCLK Grid, local clk, cond.
34
Alpha® Microprocessors: Clocking
Figure: 21264 clock skew (Gronowski et al. 1998), Copyright © 1998 IEEE
35
Alpha® Microprocessors: Clocking
Figure: 21364 major clock domains (Xanthopoulos et al. 2001), Copyright © 2001
Labels: GCLK grid, DLL, L2LClk, L2RClk, NCLK
36
Alpha® Microprocessors: Clocking
Figure: 21364, NCLK clock skew (Xanthopoulos et al. 2001), Copyright © 2001 IEEE
37
Alpha® µP: Clock Storage Elements
Figure: 21064 modified TSPC latches (Gronowski et al. 1998), Copyright © 1998
38
Alpha® µP: Clock Storage Elements
Figure: 21164: (a) phase-A latch, (b) phase-B latch (Gronowski et al. 1998), Copyright © 1998 IEEE
Labels: D, Q, Clk, X
39
Alpha® µP: Clock Storage Elements
Figure: Embedding of logic into a latch: (a) TSPC latch, one level of logic; (b) latch, two levels of logic (Gronowski et al. 1998), Copyright © 1998 IEEE
Labels: D, Q, Clk, X, X1, X2
40
Alpha® µP: Clock Storage Elements
Figure: 21264 flip-flop (Gronowski et al. 1998), Copyright © 1998 IEEE
Labels: D, Q, Clk
41
Alpha® Microprocessors: Timing
Figure: Critical-path and race analysis for clock buffering and conditioning (Gronowski et al. 1998), Copyright © 1998 IEEE
Labels: GCLK, cond, D, Q, Logic, R
Critical Path Definition and Criteria: identify common clock, and maximize D, minimize R: D + U - R ≤ Tcycle
Race Definition and Criteria: D ≥ R + H
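The two criteria on this slide can be turned into simple checks. The operators are damaged in the transcript, so this sketch assumes the reading D + U - R ≤ Tcycle for the critical path and D ≥ R + H for the race. It also assumes that D is the delay from the common clock point through the driving path, R is the clock delay to the receiving element, U is the setup time and H is the hold time. The function names are hypothetical.

```python
def critical_path_ok(d_max, r_min, setup, t_cycle):
    """Setup-style check: slowest data path against the earliest receiving clock."""
    return d_max + setup - r_min <= t_cycle


def race_ok(d_min, r_max, hold):
    """Hold-style check: fastest data path against the latest receiving clock."""
    return d_min >= r_max + hold


# Hypothetical numbers in picoseconds, chosen only to exercise both checks.
setup_met = critical_path_ok(d_max=1500.0, r_min=100.0, setup=80.0, t_cycle=1667.0)
race_free = race_ok(d_min=120.0, r_max=150.0, hold=30.0)
```

With these made-up values the critical path passes (1480 ps against a 1667 ps cycle) while the race check fails, because the fastest data path (120 ps) arrives before the late receiving clock plus hold time (180 ps). This is why the worst case uses opposite extremes of D and R in the two analyses.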
43
Hazard-Free Level-Sensitive Polarity-Hold Latch
Figure labels: Data, +Clock, -Clock, Out
Eichelberger 1983
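The gate-level schematic is not recoverable from the transcript, so the sketch below uses the standard textbook form of a polarity-hold latch as an assumption: Out = Data·(+Clock) + Out·(-Clock), with the consensus term Data·Out added to make it hazard-free. The function name and the `hazard_free` switch are hypothetical and serve only to show what the extra term does.

```python
def latch_next(data, plus_clk, minus_clk, out, hazard_free=True):
    """Next output of a polarity-hold latch, with or without the consensus term."""
    nxt = (data and plus_clk) or (out and minus_clk)
    if hazard_free:
        nxt = nxt or (data and out)  # covers the clock transition
    return bool(nxt)


# Clock falling while Data = Out = 1, with -Clock arriving late so that
# +Clock and -Clock are both momentarily 0.
with_term = latch_next(True, False, False, True, hazard_free=True)
without_term = latch_next(True, False, False, True, hazard_free=False)
```

During that transition the version without the consensus term drops to 0 and the stored 1 is lost, while the hazard-free version holds its value regardless of the skew between the two clock polarities.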
44
General LSSD Configuration
Figure labels: Inputs (X), Outputs (Y), Combinational Logic, Clocked Storage Elements, Clock, Present State, Next State, Scan-In, Scan-Out
Next State = f{Sn, X}
Y = Y(X, Sn)
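The configuration on this slide is a state machine whose storage elements can also be loaded and read serially. A minimal sketch follows, assuming a next-state function f(S, X), an output function Y(X, S), and a single scan chain through all state bits. The class `ScannableMachine` and the 2-bit counter used as an example are hypothetical.

```python
class ScannableMachine:
    """State machine whose state bits double as a serial scan chain."""

    def __init__(self, next_state, output, n_bits):
        self.next_state = next_state  # f(S, X) -> new list of state bits
        self.output = output          # Y(X, S) -> output value
        self.state = [0] * n_bits

    def step(self, x):
        """One system-clock cycle: compute Y, then load the next state."""
        y = self.output(x, self.state)
        self.state = self.next_state(self.state, x)
        return y

    def scan(self, scan_in_bits):
        """Shift bits in at the head of the chain; return bits shifted out."""
        scan_out = []
        for bit in scan_in_bits:
            scan_out.append(self.state[-1])
            self.state = [bit] + self.state[:-1]
        return scan_out


def counter_next(s, x):
    """2-bit counter with enable x; state is [msb, lsb]."""
    value = (s[0] * 2 + s[1] + x) % 4
    return [value >> 1, value & 1]


def counter_out(x, s):
    """Carry out: asserted when counting from state 3."""
    return int(x and s == [1, 1])


m = ScannableMachine(counter_next, counter_out, n_bits=2)
m.scan([1, 1])          # load state 3 through the scan path
carry = m.step(1)       # one functional cycle
observed = m.scan([0, 0])  # unload the resulting state
```

Loading state 3 by scan and applying one enabled cycle produces a carry and wraps the state to 0. This is the testability benefit: any state can be set and observed without stepping through the functional sequence.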
45
LSSD Shift Register Latch
Figure labels: -Scan_In, +A Clk, -Data, -C Clk, +B Clk; latches L1 and L2
46
LSSD Double Latch Design
Figure labels: Primary Inputs X, Combinational Logic, Primary Outputs Z, State S, L1, L2, C, A Shift, B Shift, Scan In, Scan Out
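The double-latch arrangement can be sketched behaviorally. The model below assumes the usual LSSD roles of the clocks named on the slide: C loads system data into L1, A loads scan data into L1, and B copies L1 into L2, with A and B pulsed alternately and never overlapping. The class `SRL` and the helper `shift` are hypothetical names.

```python
class SRL:
    """Shift register latch: L1 with two ports (C, A) followed by L2 (B)."""

    def __init__(self):
        self.l1 = 0
        self.l2 = 0

    def pulse_c(self, data):
        self.l1 = data      # system clock: capture functional data

    def pulse_a(self, scan_in):
        self.l1 = scan_in   # shift clock A: capture scan data

    def pulse_b(self):
        self.l2 = self.l1   # shift clock B: move L1 into L2


def shift(chain, scan_in):
    """One A pulse followed by one B pulse; returns the bit leaving the chain."""
    scan_out = chain[-1].l2
    # During A, every L1 samples the L2 of its predecessor; L2 is stable.
    sources = [scan_in] + [srl.l2 for srl in chain[:-1]]
    for srl, value in zip(chain, sources):
        srl.pulse_a(value)
    for srl in chain:
        srl.pulse_b()
    return scan_out


chain = [SRL() for _ in range(3)]
for bit in (1, 0, 1):
    shift(chain, bit)
loaded = [srl.l2 for srl in chain]
```

Because L2 only changes on B and L1 only changes on A or C, a bit can never race through more than one stage per A/B pair, which is what makes the scheme level-sensitive.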
47
IBM® S/390 Parallel Server Processor
Figure: LSSD SRL with multiplexer used in the IBM S/390 G4 processor (Sigal et al. 1997), reproduced by permission
Labels: IN_A, IN_B, SELECT_A, SELECT_N, TEST_DISABLE, CLK_ENABLE, CLKG, CLKL, A_CLK, B_CLK, SCAN_IN, L1, L2, Q (SCAN_OUT)
48
IBM® S/390 Parallel Server Processor
Figure: Static multiplexer version of the SRL used in the IBM S/390 G4 (Sigal et al. 1997), reproduced by permission
Labels: IN_A, IN_B, IN_C, IN_M, IN_N, SELECT_A, SELECT_N, TEST_DISABLE, mux_A, mux_M_N, CLKL, A_CLK, B_CLK, SCAN_IN, Q (SCAN_OUT)
49
IBM® S/390 Parallel Server Processor
Figure: A clocked storage element used in the non-timing-critical macros of the IBM S/390 G4 processor (Sigal et al. 1997), reproduced by permission
Labels: IN, SCAN_IN, A_CLK, B_CLK, CLKG, C1, C2, C1_DISABLE, C2_ENABLE, L1, L2, Q (SCAN_OUT)
50
IBM® S/390 Parallel Server Processor
Figure: The clock-generation element used to detect problems created with fast paths: IBM S/390 G4 processor (Sigal et al. 1997), reproduced by permission
Labels: CLKG, B_CLK, UNOVERLAP, C1_DISABLE, C2_ENABLE, C1, C2
51
IBM® PowerPC Processor
Figure, panels (a) and (b): The experimental IBM PowerPC processor (Silberman et al. 1998), reproduced by permission
Labels: D, SEL, SEL_EXT, CLK, NCLK, SCAN_GATE, SG, SO, OT, OC, Master Latch, Slave Latch, True Mux, Complement Mux
52
IBM® PowerPC 603: Master-Slave Latch
Figure: The PowerPC 603 MSL (Gerosa et al. 1994), Copyright © 1994 IEEE
Labels: D, SCAN, ACLK, C1, C2, VDD, out
53
IBM® PowerPC 603: Local Clk Generator
Figure: The PowerPC 603 local clock regenerator (Gerosa et al. 1994), Copyright © 1994 IEEE
Labels: GCLK, WAITCLK, OVERRIDE, C1_TEST, C2_TEST, SCAN_C1, C1_FREEZE, C2_FREEZE, ACLK, C1, C2
54
Summary
Intel® Microprocessors
  Active clock deskewing in Pentium® processors
Sun Microsystems® Processors
  Semidynamic flip-flop (one of the fastest single-ended flip-flops today, "soft-edge")
Alpha® Processors
  Performance leader in the '90s
  Incorporating logic into CSEs
IBM® Processors
  Design for testability techniques
  Low-power champion PowerPC 603