Yulei Zhang (1), James F. Buckwalter (1), and Chung-Kuan Cheng (2)

Presentation transcript:

High-Speed and Low-Power On-Chip Global Link Using Continuous-Time Linear Equalizer
Yulei Zhang (1), James F. Buckwalter (1), and Chung-Kuan Cheng (2)
(1) Dept. of ECE, (2) Dept. of CSE, UC San Diego, La Jolla, CA
19th Conference on Electrical Performance of Electronic Packaging and Systems, Oct 25, 2010, Austin, USA

Outline
- Introduction
- Equalized On-Chip Global Link
  - Overall structure
  - Basic working principle
- Driver Design for On-Chip Transmission-Line
  - Guideline for tapered CML driver
  - Driver design example
- Continuous-Time Linear Equalizer (CTLE) Design
  - CTLE modeling
  - CTLE design example
- Driver-Receiver Co-Design for Low Energy per Bit
  - Methodology
  - Overall link design example
- Conclusion

Research Motivation
- Global interconnect planning becomes a challenge in ultra-deep sub-micron (UDSM) processes
  - Performance gap between global wires and logic gates
  - Conventional buffer insertion brings a large extra power overhead
- Uninterrupted wire configurations are used to tackle on-chip global communication issues
  - On-chip T-lines to reduce interconnect power
  - Equalization to improve the bandwidth
- State of the art [Kim2009]: 2Gb/s/um, < 1pJ/b signaling over a 10mm global wire in 90nm

Our Contributions
- Contributions
  - Build a novel equalized on-chip T-line structure for global communication: tapered CML driver + CTLE receiver
  - Accurate small-signal modeling of the CTLE receiver to improve the optimization quality
  - A design methodology for driver-wire-receiver co-optimization to reduce the total energy per bit
- Results of our design
  - 20Gbps signaling over a 10mm, 2.2um-pitch on-chip T-line
  - 11ps/mm latency and 0.2pJ/b energy per bit in 45nm

Equalized On-Chip Global Link: Overall Structure
- Tapered current-mode logic (CML) drivers
- Terminated differential on-chip T-line
- Continuous-time linear equalizer (CTLE) receiver
- Sense-amplifier based latch

Basic Working Principle
- Tapered CML driver
  - Provides low-swing differential signals to drive the T-line
  - Taper factor u, number of stages N, fan-out X, final-stage current ISS, driver resistance RS
- T-line
  - Differential wire with P/G shielding
  - Geometries (width, pitch) and termination resistance RT
- CTLE receiver
  - Recovers the signal and improves eye quality
  - Load resistance RL, source degeneration resistance RD and capacitance CD, over-drive voltage Vod
- Sense-amplifier based latch
  - Synchronizes the signal and converts it back to digital levels

Tapered CML Driver Design
- Design guideline [Tsuchiya2006, Heydari2004]
  - Begin from the final stage
  - For a given VSW, the output resistance RS is optimized with RT to increase eye-opening
  - Taper factor u = 2.7 for delay reduction
  - Each previous stage is designed backward by scaling with the factor u
- Need to design:
  - Output resistance RS
  - Tail current ISS
  - Size of transistors W
- Considerations: output swing constraint, transistor size, number of stages
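The backward scaling described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `tapered_cml_stages` is hypothetical, tail current is used as a stand-in for stage size, and the stage count N = ceil(ln X / ln u) is the textbook tapered-buffer rule, which the slides do not state explicitly.

```python
import math

def tapered_cml_stages(i_ss_final, fan_out, taper=2.7):
    """Return the tail current of each stage, first stage first.

    Assumption (not from the slides): the number of stages is chosen so
    that taper**N covers the total fan-out X, i.e. N = ceil(ln X / ln u).
    """
    n_stages = max(1, math.ceil(math.log(fan_out) / math.log(taper)))
    # Design backward: the final stage carries i_ss_final and every
    # previous stage is scaled down by the taper factor u.
    currents = [i_ss_final / taper**k for k in range(n_stages)]
    return list(reversed(currents))

# Hypothetical example values: 2 mA final-stage current, fan-out of 10.
stages = tapered_cml_stages(i_ss_final=2e-3, fan_out=10)
```

With these example inputs the sketch yields three stages, each 2.7 times larger than the one before it.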

CML Driver Study w/ Loaded T-line
- Assume 45nm 1P11M CMOS
- T-line built on M9 with M1 as reference
  - T = 1.2um, H = 3.5um (fixed)
  - Optimize W and S for eye-opening
- Figure: change of the eye-opening with width for fixed 2um pitch
- Figure: change of the eye-opening with pitch for equal width/spacing

CML Driver Design Example
- Experimental observations
  - The optimal eye occurs when width = spacing
  - Eye-opening improves with larger pitch
- Design methodology
  - Choose the minimum pitch that satisfies the wire-end eye-opening requirement
- Design example

Accurate CTLE Modeling
- Design variables: RL, RD, CD, Vod (size) [Hanumolu2005]
- Small-signal circuit to derive H(s)
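For orientation, the textbook first-order model of a source-degenerated CTLE can be evaluated as below. This is not the accurate model proposed in the talk: it neglects rds and device parasitics, so it corresponds to the kind of simple model the talk improves on. The half-circuit convention (RD in parallel with CD at the source), the transconductance `gm`, and the optional load capacitance `c_load` are assumptions introduced here for illustration.

```python
import math

def ctle_gain(freq_hz, gm, r_l, r_d, c_d, c_load=0.0):
    """Magnitude of a simple source-degenerated CTLE half-circuit response.

    H(s) = gm*RL / (1 + gm*Z_D(s)) * 1 / (1 + s*RL*C_load),
    with Z_D(s) = RD || 1/(s*CD). rds and parasitics are ignored.
    """
    s = 2j * math.pi * freq_hz
    z_d = r_d / (1 + s * r_d * c_d)      # degeneration impedance
    h = gm * r_l / (1 + gm * z_d)        # degenerated gain stage
    h /= (1 + s * r_l * c_load)          # output load pole
    return abs(h)

# RL, RD, CD taken from the Methodology A column of the summary table;
# gm = 5 mS is a hypothetical value, not from the slides.
dc_gain = ctle_gain(0.0, gm=5e-3, r_l=440, r_d=110, c_d=680e-15)
hf_gain = ctle_gain(100e9, gm=5e-3, r_l=440, r_d=110, c_d=680e-15)
```

In this simple model the degeneration network lowers the low-frequency gain to gm*RL / (1 + gm*RD) while the high-frequency gain approaches gm*RL, which is the peaking that compensates wire loss.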

CTLE Modeling Validation
- <10% correlation error
- >20% eye-opening increase
- Test case: 10mm, 16mV eye at the wire-end
- Blue lines: simple model, neglecting rds and parasitics
- Red line: considers rds only
- Black line: the proposed accurate model

CTLE Design Example
- Observations of the CTLE study
  - Eye-opening improves with relaxed power constraints but tends to saturate
- Design example
  - Based on the pre-optimized CML driver + T-line design
  - Eye-opening improved by 4X after CTLE

Driver-Receiver Co-Design Methodology
- Optimize driver-wire-receiver together by setting Veye/Power as the cost function
- Choose the pre-designed CML/T-line/CTLE as the initial solution
- Optimization flow
  - Driver-to-receiver step-response generation based on SPICE simulation and CTLE modeling
  - Eye-opening estimation based on the step response
  - SQP-based non-linear optimization
  - Variables: [ISS, RT, RL, RD, CD, Vod]
- Performance comparison
  - Option A: driver/receiver independent design
  - Option B: low-power driver/receiver co-design
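One common way to estimate eye-opening from a step response is peak-distortion analysis: form the single-bit pulse response, sample it once per bit, and subtract the worst-case sum of inter-symbol interference from the main cursor. The slides do not say which estimator the authors used, so the sketch below is an assumption, and `worst_case_eye` and `rc_step` are hypothetical names. The sampling phase (end of each bit) is also assumed.

```python
import math

def worst_case_eye(step, bit_time, n_cursors=20):
    """Peak-distortion estimate of eye height from a unit-step response.

    step: callable returning the step response at time t (t >= 0).
    The result is relative to the step amplitude.
    """
    def pulse(t):
        # Single-bit pulse response: step(t) - step(t - T), zero before t = 0.
        late = step(t - bit_time) if t >= bit_time else 0.0
        return (step(t) if t >= 0 else 0.0) - late

    # Sample once per bit, starting at the end of the first bit.
    cursors = [pulse((k + 1) * bit_time) for k in range(n_cursors)]
    main = max(cursors, key=abs)
    isi = sum(abs(c) for c in cursors) - abs(main)
    return abs(main) - isi

def rc_step(t, tau=1.0):
    # Stand-in channel: first-order RC step response, not a T-line model.
    return 1.0 - math.exp(-t / tau)

eye = worst_case_eye(rc_step, bit_time=3.0)
```

In the flow on the slide, `step` would instead be the driver-to-receiver step response obtained from SPICE simulation and CTLE modeling, and the resulting eye height would feed the Veye/Power cost function.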

Low Energy-per-Bit Optimization Flow
- Initial solution: pre-designed CML driver and pre-designed CTLE receiver
- Driver-receiver co-design
  - Change variables [ISS, RT, RL, RD, CD, Vod]
  - Cost function: Veye/Power
- Co-design cost function estimation
  - SPICE-generated T-line step response
  - Receiver step response using CTLE modeling
  - Step-response based eye estimation
- Internal SQP (Sequential Quadratic Programming) routine to generate the best solution
- Output: best set of design variables in terms of overall energy per bit

Simulated Eye Diagrams
- Methodology A: driver/receiver separate design
- Methodology B: driver/receiver co-design for low power

Summary of Performance Comparison
Methodology A: driver/receiver separate design
Methodology B: driver/receiver co-design for low power

Parameter                  Methodology A   Methodology B
RS (ohm)                   47              148
RT (ohm)                   94              1100
RL (ohm)                   440             890
RD (ohm)                   110             1430
CD (fF)                    680             150
Vod (mV)                   60              58
Eye-opening @ CTLE (mV)    91              113
Power consumption (mW)     8.1             3.8

Note: the driver/receiver co-design methodology uses a much larger driver/termination resistance to reduce power, but this closes the eye-opening at the driver output and wire-end. The final eye is recovered by fully utilizing the CTLE.
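The energy-per-bit figure quoted in the talk follows from the power numbers in this comparison and the 20Gbps data rate. A minimal check, with the helper name `energy_per_bit_pj` chosen here for illustration:

```python
def energy_per_bit_pj(power_mw, data_rate_gbps):
    # mW / Gbps = 1e-3 W / 1e9 bit/s = 1e-12 J/bit, i.e. pJ per bit.
    return power_mw / data_rate_gbps

energy_a = energy_per_bit_pj(8.1, 20)   # Methodology A, about 0.405 pJ/b
energy_b = energy_per_bit_pj(3.8, 20)   # Methodology B, about 0.19 pJ/b

# End-to-end latency over the 10mm link at 11 ps/mm.
latency_ps = 11 * 10                    # 110 ps
```

Methodology B's 0.19 pJ/b rounds to the 0.2pJ/b reported in the talk, roughly half the energy of the separately designed link.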

Conclusion
- We propose a novel equalized on-chip global link using a CML driver and a CTLE receiver
- Accurate CTLE modeling achieves <10% correlation error and improves the quality of eye-opening optimization
- Our design achieves
  - 20Gbps signaling over a 10mm, 2.2um-pitch on-chip T-line
  - 11ps/mm latency and 0.2pJ/b energy per bit

Thank You! Q & A