ECE 679: Digital Systems Engineering

Presentation transcript:

ECE 679: Digital Systems Engineering Patrick Chiang Office Hours: 1-2PM Mon-Thurs GLSN 100

Class Introductions Who am I Who are you

Class Basics: 4 Homeworks (20%) (groups of 2). Midterm (40%). Final Project (40%): 4-page IEEE report, 10 minute presentation (groups of 2). Guest lecture (Dr. Frank O’Mahony), Intel Research Labs (May 4th). Intel Field Trip (June 7th) TBD. Presentations of 1-2 best project reports.

Class Homework: Skim Dally/Poulton “Digital Systems Engineering” Chapter 3. Skim Overview Paper: http://mos.stanford.edu/papers/mh_micro_98.pdf ; Includes running Stat Eye: Oregon State Matlab (eecs.oregonstate.edu/it), www.stateye.org ; Problem Set #1: rlc files -- ~pchiang/hspice (rlc_spice_deck; rlc.rlc); Spice models -- ~pchiang/hspice/process_files/ (130nm to 22nm), Simulator lang = spice; Spectre models – DEFINE gpdk090 /nfs/guille/analog/c/cdsmgr/process/gpdk090_v3.8/libs.cdb/gpdk090

What does this mean for analog designers? Ever build an ADC? Ever wonder what to do with the digital bits? 8-16 bits @ 100MHz, 200MHz, 400MHz. Goes to vector analyzer. Why does this clock rate not increase? What really is this output doing? Where is it going? (Figure labels: Analog; Fs = 600MHz)

Brief Summary Introduction to the area Why serial links are important What are the current technology trends/limitations

4Gb/s Low Power, Area Efficient Serial Links. High-speed I/Os: interconnection between different chips (IBM Processor: CPU, Memory, from/to other subsystems, e.g. backplane). Transmitter Equalization. Receiver Offset Cancellation. Link path: Transmitter Output, Router Backplane (1m, FR4), Receiver Input. Plots: 4Gb/s Transmitter Output; 4Gb/s Transmitter Output, 1m; 4Gb/s Transmitter Output, Equalized. Test chips: 2000 0.25um Testchip; 2001 0.25um Testchip. Notes: Organization of the channel, arrows from channel, plots… change image layout. Recall what you want to say on the slides. References: Ming-Ju E. Lee, William J. Dally, John W. Poulton, Patrick Chiang, Stephen F. Greenwood. An 84-mW 4Gb/s Clock and Data Recovery Circuit for Serial Link Applications. VLSI Circuits Symposium, Kyoto, Japan, June 2001, pp. 149-152. Ming-Ju E. Lee, William Dally, Patrick Chiang. Low-Power Area-Efficient High-Speed I/O Circuit Techniques. IEEE Journal of Solid-State Circuits, November 2000, Vol. 35, No. 11, pp. 1591-1599.

Scaling Serial Links: From 4Gb/s -> 20Gb/s. Thesis: Develop 20Gb/s Serial Link. Area: 500um x 500um. Power: 200mW/link. 1 bit time = 1FO4. Focus on timing uncertainty, not channel… independent vector. Timing uncertainty becomes KEY issue. Figures (v vs. t): 4Gb/s Eye Diagram (250ps); 20Gb/s Eye Diagram (50ps).
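
The 250ps and 50ps figures on the eye diagrams follow from the bit time being the reciprocal of the data rate. A minimal sketch of that arithmetic (the function name is illustrative, not from the slides):

```python
def bit_time_ps(data_rate_gbps: float) -> float:
    """Return one bit time (unit interval) in picoseconds."""
    # 1 Gb/s corresponds to a 1 ns = 1000 ps bit time.
    return 1000.0 / data_rate_gbps


for rate in (4.0, 20.0):
    print(f"{rate:g} Gb/s -> {bit_time_ps(rate):g} ps per bit")
# 4 Gb/s -> 250 ps per bit
# 20 Gb/s -> 50 ps per bit
```

Going from 4Gb/s to 20Gb/s shrinks the timing budget by 5x, which is why a few picoseconds of clock uncertainty start to matter.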

Transmitter Block Diagram: No post-PLL clock buffers. Dotted lines around different circuit components (PLL, muxing, etc.). Clocks are differential clocks. Get rid of everything else, use red. Or change images… lose people on the insight, carry through. Simpler is better.

Test Chip: UMC 1.2V, 0.13um CMOS (single Vt). Die size 700um x 1.15mm. 50 Ohm pad termination using wafer probes. Die photo labels (700um x 1.1mm): Test Interface, 10GHz PLL, PRBS Check, Test Structures, Phase Interpolators, RX DLL TX Clock Recovery, Transmitter Muxing, PRBS Gen. Notes: Our test chip was fabricated in National Semiconductor’s quarter micron CMOS technology. The die is 2.6 by 1.4 millimeters and uses a 52-pin impedance controlled package donated by Vitesse Corporation. The active area of the transceiver circuits is 0.31mm^2.

PLL Measurements: Jitter limited by 1.25GHz input reference clock (HP 8133A input clock: 1.2ps RMS, 8.9ps pk-pk). Open Loop VCO Phase Noise @ 1MHz: -97dBc/Hz; 10GHz Jitter (RMS): 0.97ps; 10GHz Jitter (pk-pk): 8.0ps; PLL Power: 38.6mW; VCO Power: 6mW; Tuning Range: 1.14-1.31. Figure panels: Power Spectrum; Q=10 Jitter; Q=5 Jitter. Notes: Change the cadence of talking… these are the important points. Too much stuff in slides, too heavy… line width, is 2-3 points.

Eye Diagram: Data Rate = 19.2Gb/s. Jitter 2.2ps RMS, 15.6ps pk-pk. Voltage ripple caused by lack of current source at differential pair tail node. Don’t spend too much time on 19.2. Seen here are the phase step values across the entire range. The average phase resolution should be 15.6ps, so the interpolation steps shown are very accurate. Note that every 9th phase has phase interpolation values lower than the average of 15.6ps, which is what is expected, since these are the redundant steps. You can also see that not every 9th phase value is consistently small. For example, phases 18 and 36 don’t show as small a phase step as phases 9 and 27. The reason is a layout error: asymmetric clock loading causes different capacitive coupling for different transitions. (Different phase differences due to different delay amounts in the DLL itself.)
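
To compare the measured jitter against the bit time, it helps to express it in unit intervals (UI). A minimal sketch using the 19.2Gb/s figures from this slide; the function name is illustrative:

```python
def jitter_in_ui(jitter_ps: float, data_rate_gbps: float) -> float:
    """Express a jitter value (in ps) as a fraction of one bit time."""
    ui_ps = 1000.0 / data_rate_gbps  # one unit interval in picoseconds
    return jitter_ps / ui_ps


rate = 19.2  # Gb/s, from the eye diagram slide
print(f"RMS jitter:   {jitter_in_ui(2.2, rate):.3f} UI")   # 0.042 UI
print(f"pk-pk jitter: {jitter_in_ui(15.6, rate):.3f} UI")  # 0.300 UI
```

At this data rate the peak-to-peak jitter already takes up roughly 30% of the bit time.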

High Speed Transmitter Comparisons: “A 250mW Full-Rate 10Gb/s Transceiver Core in 90nm CMOS using a Tri-State Binary PD with 100ps Gated Digital Output,” T. Masuda et al., ISSCC 2007. A full-rate 10Gb/s transceiver core employing a tri-state binary PD with 100ps gated digital output is implemented in a 90nm CMOS process. Direct drive from the VCO is utilized to eliminate the 10GHz clock buffer current. The RX exhibits a recovered jitter of 906fs (rms) and an input sensitivity of 5.9mV. The TX generates a jitter of 5mUI (rms). The chip consumes 250mW.

Conventional Serial Link Receivers: Conventional architectures also use a multi-phase PLL. Static phase offset. Power supply sensitivity. Well, guess what… we have the same problem at the receiver. Block diagram: In, Pre-Amp, Data 20Gb/s, Multiphase PLL, clocks ck[0]-ck[3], outputs D[0]-D[3].
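
In a multiphase receiver of this kind, each clock phase samples every Nth bit, so the clock frequency drops by N while adjacent phases must stay one bit time apart. A minimal sketch, assuming the four phases ck[0..3] in the diagram form a 1:4 demultiplexer (the slides do not spell this out; the function name is illustrative):

```python
def multiphase_plan(data_rate_gbps: float, n_phases: int):
    """Return (per-phase clock frequency in GHz, nominal sampling edges in ps)."""
    ui_ps = 1000.0 / data_rate_gbps          # one bit time
    clock_ghz = data_rate_gbps / n_phases    # each phase samples every Nth bit
    edges_ps = [k * ui_ps for k in range(n_phases)]  # ideal edge of ck[k]
    return clock_ghz, edges_ps


clock_ghz, edges_ps = multiphase_plan(20.0, 4)
print(clock_ghz)  # 5.0
print(edges_ps)   # [0.0, 50.0, 100.0, 150.0]
```

Any static offset between adjacent phases comes directly out of the 50ps sampling window, which is why static phase offset is listed as a problem.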

2nd Generation Transmitter: Equalizing Path. Analog delay, but replica bias… 2-Tap Equalizer implemented to compensate for channel losses. Achieve 50ps analog delay with CML buffers.
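
A 2-tap transmit equalizer is commonly modeled as a short FIR filter in which the previous bit, delayed by one bit time (50ps at 20Gb/s), is subtracted with a weight alpha. The sketch below is a generic discrete-time model under that assumption; the slides do not give the tap form, the normalization, or how alpha maps onto the circuit, so all names here are hypothetical.

```python
def two_tap_fir(bits, alpha):
    """Generic 2-tap pre-emphasis model: y[n] = x[n] - alpha * x[n-1].

    bits  : iterable of 0/1 values
    alpha : post-cursor tap weight (assumed form, not taken from the slides)
    """
    out = []
    prev = 0.0  # no previous symbol before the first bit
    for b in bits:
        sym = 1.0 if b else -1.0      # map bits to +/-1 symbols
        out.append(sym - alpha * prev)
        prev = sym
    return out


print(two_tap_fir([1, 1, 0, 0], 0.5))  # [1.0, 0.5, -1.5, -0.5]
```

Transitions are boosted and repeated bits are attenuated, which counteracts the low-pass behavior of a lossy channel. The output is not normalized to a fixed peak swing, as a real driver would be.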

Fabrication: Test Chip. ST Microelectronics 0.13um test chip; 307mW / transceiver; 0.46mm^2; 20mV input sensitivity. 2006 0.13um Test Chip: Transmitter 450um x 350um; Receiver 500um x 600um. First 0.13um

Results (all results single-ended): 20Gb/s Ideal Channel: 80mV, 43ps. 20Gb/s, -6.5dB @ 10GHz: 33mV, 37ps.

Results (cont’d): 20Gb/s Ideal Channel with α=0.37: 72mV, 36.4ps. 20Gb/s, -6.5dB @ 10GHz with α=0.37: 62mV, 35ps.

Rationale for Multi-cores: Next generation computing is multi-core processing, i.e. multiple, parallel DSPs (i.e. MACs). Why can’t we achieve faster frequencies? Wire delays don’t scale like transistors. Power increases exponentially (when pushing process technology). Timing margins are degraded by: variability, power supply noise, digital crosstalk. NOTE: More independent threads require more memory bandwidth. (Intel, 80 Cores, ISSCC 2007)

Research: Explore Parallel Serial Links. Serial links also exhibit the same characteristics: channel losses get worse; power consumption increases significantly with bandwidth; timing precision is limited by static phase offset (process variation), power-supply induced jitter, and interchannel crosstalk. Serial links also need to push for high amounts of parallelism. How is this different from conventional link design? Channel equalization becomes more difficult: adjacent channel crosstalk; difficult channel estimation problem (power, flexibility, data-rate, equalizer design, channel, distance). Amortize clock power for multiple links: distributed resonant clocking of analog/mixed-signal front-ends.

Problem of IO: 2500 pins / 2 ≈ 1200 differential pairs. Assume 10Gb/s per link: 12 Tb/s bandwidth. At 100mW per Gb/s that would be 1200W; 120W corresponds to 10mW per Gb/s.
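
The I/O budget is simple arithmetic, but the result is very sensitive to the assumed energy per bit. A minimal sketch (parameter names are illustrative) that evaluates the slide's round figure of 1200 pairs at both 10mW and 100mW per Gb/s, then the exact 2500-pin count:

```python
def io_budget(total_pins: int, gbps_per_link: float, mw_per_gbps: float):
    """Return (differential links, aggregate bandwidth in Gb/s, I/O power in W)."""
    links = total_pins // 2                        # two pins per differential pair
    bandwidth_gbps = links * gbps_per_link
    power_w = bandwidth_gbps * mw_per_gbps / 1000.0  # mW -> W
    return links, bandwidth_gbps, power_w


# The slide's round figure of 1200 pairs (2400 pins) at 10 Gb/s per link:
print(io_budget(2400, 10, 10))    # (1200, 12000, 120.0)
print(io_budget(2400, 10, 100))   # (1200, 12000, 1200.0)
# Using all 2500 pins exactly:
print(io_budget(2500, 10, 10))    # (1250, 12500, 125.0)
```

Either way, I/O power on the order of the whole chip's power budget is the problem being pointed out.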

Stateye: Playing Fun with Stat-Eye. Homework examples: 5Gb/s -> 10Gb/s; worse channels; worse timing jitter.

Next Time: Telegrapher’s Equation. Channel Models: reflection coefficients, skin effect, dielectric constant, vias.