ECE 260B – CSE 241A, Winter 2005 (http://vlsicad.ucsd.edu)
Design Styles; Multi-Vdd/Vth Designs

Presentation transcript:

Slide 2: The Design Problem
A growing gap between design complexity and design productivity.
Source: sematech97

Slide 3: Design Methodology
- Design process traverses iteratively between three abstractions: behavior, structure, and geometry
- More and more automation for each of these steps

Slide 4: Behavioral Description of Accumulator
- Design described as a set of input-output relations, regardless of chosen implementation
- Data described at a higher abstraction level ("integer")

Slide 5: Structural Description of Accumulator
- Design defined as a composition of register and full-adder cells ("netlist")
- Data represented as {0,1,Z}
- Time discretized and progresses with unit steps
- Description language: VHDL
- Other options: schematics, Verilog
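The contrast between the behavioral and structural views of the accumulator can be sketched as follows. The slides use VHDL; this Python version is only an illustration, and the function names, the 4-bit width, and the LSB-first bit lists are assumptions made for the sketch. It also uses plain 0/1 values and omits the Z state mentioned on the slide.

```python
from typing import List, Tuple


def accumulate_behavioral(acc: int, x: int, width: int = 4) -> int:
    """Behavioral view: an input-output relation on integers."""
    return (acc + x) % (1 << width)


def full_adder(a: int, b: int, cin: int) -> Tuple[int, int]:
    """One full-adder cell: returns (sum bit, carry out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def accumulate_structural(reg: List[int], x: List[int]) -> List[int]:
    """Structural view: register bits (LSB first) feed a ripple chain of
    full-adder cells; the result is the next register value."""
    carry = 0
    nxt = []
    for a, b in zip(reg, x):
        s, carry = full_adder(a, b, carry)
        nxt.append(s)
    return nxt  # the final carry is dropped, matching the modulo above


def to_bits(value: int, width: int = 4) -> List[int]:
    """Integer to LSB-first bit list."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits: List[int]) -> int:
    """LSB-first bit list to integer."""
    return sum(b << i for i, b in enumerate(bits))
```

Both functions compute the same next state; only the level of abstraction differs, which is the point the two slides make.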

Slide 6: Implementation Methodologies

Slide 7: Full Custom
- Hand drawn geometry
- All layers customized
- Digital and analog
- Simulation at transistor level
- High density
- High performance
- Long design time
[Figure: Magic Layout Editor (UC Berkeley)]

Slide 8: Symbolic Layout
- Dimensionless layout entities
- Only topology is important
- Final layout generated by "compaction" program
[Figure: Stick diagram of inverter]

Slide 9: Standard Cells
- Organized in rows
- Cells made as full custom by vendor (not user)
- All layers customized
- Digital with possible special analog cells
- Simulation at gate level (digital)
- Medium-high density
- Medium-high performance
- Reasonable design time
Note: Routing channel requirements are reduced by the presence of more interconnect layers.

Slide 10: Standard Cell - Example [Brodersen92]

Slide 11: Standard Cell - Example
3-input NAND cell (from Mississippi State Library), characterized for fanout of 4 and for three different technologies

Slide 12: Automatic Cell Generation
Random-logic layout generated by CLEO cell compiler (Digital)

Slide 13: Module Generators - Compiled Datapath

Slide 14: Macrocell-Based Design
- Predefined macro blocks (uP, RAM, etc.)
- Macro blocks made as full custom by vendor (IP blocks)
- All layers customized
- Digital and some analog
- Simulation at behavior or gate level
- High density
- High performance
- Short design time
- Use standard on-chip busses
- "System on a chip" (SOC)
[Figure labels: Macrocell, Interconnect Bus, Routing Channel]

Slide 15: Macrocell Design Methodology
- Floorplan: defines the overall topology of the design, the relative placement of modules, and the global routes of busses, supplies, and clocks
[Figure: Video-encoder chip [Brodersen92]; labels: SRAM, Routing Channel, Data paths, Standard cells]

Slide 16: Gate Array
- Predefined transistors connected via metal
- Two types: channel based, sea of gates
- Only metal layers customized
- Fixed array sizes
- Digital cells in library
- Simulation at gate level (digital)
- Medium density
- Medium performance
- Reasonable design time

Slide 17: Gate Array - Primitive Cells
[Figure panels: Uncommitted Cell; Committed Cell (4-input NOR)]

Slide 18: Sea-of-gate Primitive Cells
[Figure panels: Using oxide-isolation; Using gate-isolation]

Slide 19: Sea-of-gates
[Figure: LSI Logic LEA300K (0.6 μm CMOS); labels: Random Logic, Memory Subsystem]

Slide 20: Prewired Arrays
- Programmable logic blocks
- Programmable connections between logic blocks
- No layers customized (standard devices)
- Digital only
- Low-medium performance
- Low-medium density
- Programmable: SRAM, EPROM, Flash, Anti-fuse, etc.
- Easy and quick design changes
- Cheap design tools
- Low development cost
- High device cost
- NOT a real ASIC
[Figure courtesy Altera Corp.]

Slide 21: Programmable Logic Devices
[Figure panels: PLA, PROM, PAL]

Slide 22: EPLD Block Diagram
[Figure courtesy Altera Corp.; labels: Macrocell, Primary inputs]

Slide 23: Field-Programmable Gate Arrays - Fuse-based
Standard-cell like floorplan

Slide 24: Interconnect
Programming interconnect using anti-fuses

Slide 25: Field-Programmable Gate Arrays - RAM-based

Slide 26: RAM-based FPGA - Basic Cell (CLB)
[Figure courtesy of Xilinx]

Slide 27: RAM-based FPGA
[Figure: Xilinx XC4025]

Slide 28: High Performance Devices
- Mixture of full custom, standard cells, and macros
- Full custom for special blocks: adder (data path), etc.
- Macros for standard blocks: RAM, ROM, etc.
- Standard cells for non-critical digital blocks

Slide 29: Global Signaling and Layout
- Global signaling and layout optimization
- Multi-Vdd
- Static power analysis
- Multi-Vth + Vdd + sizing
Source: D. Sylvester, DAC-2001

Slide 30: Global Signaling
- Current global signaling paradigm: insert large static CMOS repeaters to reduce wire RC delay
- Impending problems:
  - Too many repeaters
    - 180nm processors: 22K repeaters (Itanium), 70K (Power4)
    - Project 1-1.5M repeaters at 45-65nm technologies
  - Too much power
    - Many large repeaters = significant static and dynamic power
  - Too much noise
    - Repeater clustering complicates power distribution
    - Inductive coupling across wide bus structures
Source: D. Sylvester, DAC-2001
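Why repeaters reduce wire RC delay can be illustrated with a deliberately simplified model: the delay of an unrepeated distributed RC wire grows with the square of its length, while splitting it into segments trades that quadratic term for a per-repeater delay. The 0.5·r·c·L² approximation, the lumped repeater delay, and all numeric values below are assumptions for illustration, not figures from the slides.

```python
def wire_delay(length_mm, r_per_mm, c_per_mm):
    """Approximate delay of an unrepeated distributed RC wire (seconds)."""
    return 0.5 * r_per_mm * c_per_mm * length_mm ** 2


def repeated_delay(length_mm, n_segments, r_per_mm, c_per_mm, t_repeater):
    """Delay of the same wire split into equal segments, each driven by a
    repeater with a fixed intrinsic delay (a simplifying assumption that
    ignores repeater output resistance and input capacitance)."""
    segment = length_mm / n_segments
    return n_segments * (t_repeater + wire_delay(segment, r_per_mm, c_per_mm))


def best_segment_count(length_mm, r_per_mm, c_per_mm, t_repeater, max_segments=100):
    """Brute-force search for the segment count with the lowest delay."""
    return min(
        range(1, max_segments + 1),
        key=lambda n: repeated_delay(length_mm, n, r_per_mm, c_per_mm, t_repeater),
    )


# Hypothetical wire: 10 mm long, 100 ohm/mm, 0.2 pF/mm, 20 ps per repeater.
R, C, T_REP = 100.0, 0.2e-12, 20e-12
n_best = best_segment_count(10.0, R, C, T_REP)
```

The same model also hints at the slide's concern: the optimal repeater count grows with wire length and resistance, so repeater counts (and their power) climb as wires scale.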

Slide 31: Cell Layout Optimization
- Advanced layout techniques must allow
  - Continuous individual device sizing
  - Variable p/n ratios
  - Tapered FET stacking sizes
  - Arbitrary Vth assignments within gates
- First cut: Cadabra
- 15-22% power reduction using the first two approaches under a fixed footprint constraint
[Figure: flow labels: GDSII Import; Compact fixed width; Optimize specific instances of standard gates. Ref: Hurat, Cadabra]
Source: D. Sylvester, DAC-2001

Slide 32: Multi-Vdd
- Global signaling and layout optimization
- Multi-Vdd
- Static power analysis
- Multi-Vth + Vdd + sizing
Source: D. Sylvester, DAC-2001

Slide 33: Multi-Vdd Status
- Idea: Incorporate two Vdd's to reduce dynamic power
- Limited to a few recent Japanese multimedia processors
  - Example: 0.3 μm, 75MHz, 3.3V media processor (Toshiba)
    - Total power savings of 47% in logic, 69% in clock
  - Dynamic voltage scaling of mobile processors
    - Transmeta Crusoe, Intel Speedstep, etc.
    - Not considered in this talk
- Very powerful technique currently applied only in low-performance designs
  - Mentality: today's high performance parts aren't "limited" by power
Source: D. Sylvester, DAC-2001

Slide 34: Lower Power Via Rich Replacement
- Media processors and other low speed designs have many non-critical paths
  - 60-70% of paths have delay ≤ half the clock period
  - After replacement, most paths become near critical
- What about high-speed microprocessors?
[Figure: % of total paths vs. path delay (normalized to clock period)]
Source: D. Sylvester, DAC-2001

Slide 35: Similar Story For High-Performance
- IBM 480 MHz PowerPC shows over 50% of paths have delay less than half the clock period
  - Implies that high-performance designs can benefit from multi-Vdd
Ref: Akrout, JSSC98
Source: D. Sylvester, DAC-2001

Slide 36: Resizing Is Not The Right Answer
- Post-synthesis optimizations resize gates to recover power on non-critical paths
  - Looks similar to pre- and post-replacement figures in media processor…
- This is the wrong approach for nanometer design!
[Figure panels: Before post-synthesis resizing; After post-synthesis resizing. Ref: Sirichotiyakul, DAC99]
Source: D. Sylvester, DAC-2001

Slide 37: Multi-Vdd Instead of Sizing
- Power ~ C·Vdd²·f, where f is fixed
- Key: Reducing gate width impacts power sub-linearly
  - Interconnect capacitance is not affected
- Reducing supply voltage cuts power quadratically
  - All capacitive loads have lower voltage swing
- How can we minimize delay penalty at low Vdd?
Source: D. Sylvester, DAC-2001
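The sub-linear versus quadratic argument can be checked with a small calculation based on the slide's Power ~ C·Vdd²·f relation. Splitting C into a gate part and a wire part, and the specific 50/50 split and 30% reductions, are assumptions chosen only to make the comparison concrete.

```python
def dynamic_power(c_gate, c_wire, vdd, freq):
    """Dynamic power ~ C * Vdd^2 * f, with C = gate + interconnect capacitance."""
    return (c_gate + c_wire) * vdd ** 2 * freq


# Normalized, hypothetical baseline: equal gate and wire capacitance, Vdd = 1.2 V.
baseline = dynamic_power(c_gate=1.0, c_wire=1.0, vdd=1.2, freq=1.0)

# Option A: shrink gate widths by 30%. Wire capacitance is untouched.
downsized = dynamic_power(c_gate=0.7, c_wire=1.0, vdd=1.2, freq=1.0)

# Option B: lower the supply by 30%. Every capacitive load sees the lower swing.
low_vdd = dynamic_power(c_gate=1.0, c_wire=1.0, vdd=0.84, freq=1.0)

downsizing_ratio = downsized / baseline  # fraction of baseline power remaining
low_vdd_ratio = low_vdd / baseline
```

Under these assumptions a 30% width reduction leaves 85% of the baseline power, while a 30% supply reduction leaves 49%.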

Slide 38: Challenges For Multi-Vdd
- Area overhead
  - Toshiba reported 7% rise in area due to placement restrictions, level converters, additional power grid routing
- EDA tool support for the above issues (placement, dual power routing)
- Noise analysis
  - Additional shielding required between Vdd,low and Vdd,high signals?
  - Including clock network
Source: D. Sylvester, DAC-2001

Slide 39: Static Power
- Global signaling and layout optimization
- Multi-Vdd
- Static power
- Multi-Vth + Vdd + sizing
Source: D. Sylvester, DAC-2001

Slide 40: Static Power
- Why do we care about static power in non-portable devices?
  - Standby power is "wasted" -- leaves fewer Watts for computation
  - Worsens reliability by raising die temperatures
- Leakage current is a function of Vth and subthreshold swing (Ss) (x10 at operating vs. room temp!)
- Ss expected to remain at mV/dec (room temp)
  - Device technology may cut this by ~20%
- Vth reductions are mandated by scaling Vdd
  - Vth has been around Vdd/5
Source: D. Sylvester, DAC-2001
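The dependence of leakage on Vth and subthreshold swing can be made concrete with the standard subthreshold relation, Ioff ∝ 10^(−Vth/Ss). The sketch below only computes ratios; the function name is hypothetical, and the 100 mV/dec swing in the example is an arbitrary round number chosen for easy arithmetic, not the value from the slide.

```python
def leakage_increase(delta_vth_mv, swing_mv_per_dec):
    """Factor by which off-current grows when Vth is lowered by delta_vth_mv,
    assuming Ioff is proportional to 10 ** (-Vth / swing)."""
    return 10 ** (delta_vth_mv / swing_mv_per_dec)


# Illustrative only: lowering Vth by one full swing costs one decade of leakage.
one_decade = leakage_increase(delta_vth_mv=100, swing_mv_per_dec=100)

# A steeper (smaller) swing makes the same Vth reduction more expensive.
steeper = leakage_increase(delta_vth_mv=100, swing_mv_per_dec=80)
```

This is why Vth reductions forced by Vdd scaling translate directly into exponential leakage growth.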

Slide 41: Current Status
- No sub-1V technologies demonstrate good on/off current performance (yet; expect improvements in production)
- Oxide scaling is running out of steam; overall ~3x Ioff per node
[Table: working numbers. Columns: Reference (ITRS; NEC, Intel, TI, Samsung), ITRS node, Tox (Å; physical or electrical), Vdd, Ion (μA/μm), Ioff (nA/μm)]
Source: D. Sylvester, DAC-2001

Slide 42: Leakage Suppression Approaches
- Dual-Vth (most common)
  - Low-Vth on critical paths, high-Vth off them
  - Only cost is additional masks
- MTCMOS
  - Series-inserted high-Vth device cuts leakage current when off (sleep mode)
  - Delay and area penalties; control device sizing is critical
- Other techniques
  - Substrate biasing to control Vth
  - Dual-Vth domino
    - Use low-Vth devices only in evaluate paths
[Figure: circuit with labels Pull Up, Pull Down, Parasitic Node, Vcontrol, Vout, Vdd, High Vth Device]
Source: D. Sylvester, DAC-2001
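The dual-Vth idea (low Vth on critical paths, high Vth elsewhere) can be sketched as a greedy assignment: try moving each gate to high Vth and keep the change only if every path still meets the clock period. This is a toy illustration, not the method from the slides; the path list, gate delays, fixed delay penalty, and alphabetical visiting order are all assumptions, and a greedy pass like this is not guaranteed to be optimal.

```python
def assign_high_vth(paths, gate_delay, penalty, clock):
    """Return the set of gates that can move to high Vth without any path
    exceeding the clock period.

    paths      -- list of paths, each a list of gate names
    gate_delay -- dict of gate name -> delay with low Vth
    penalty    -- extra delay a gate incurs at high Vth (assumed uniform)
    clock      -- clock period, in the same units as the delays
    """
    delay = dict(gate_delay)
    high = set()
    for gate in sorted(delay):  # deterministic visiting order
        delay[gate] += penalty  # tentatively swap to high Vth
        if all(sum(delay[g] for g in path) <= clock for path in paths):
            high.add(gate)
        else:
            delay[gate] -= penalty  # timing violated: keep low Vth
    return high


# Hypothetical netlist: one long path (a -> b) and one short path (c).
result = assign_high_vth(
    paths=[["a", "b"], ["c"]],
    gate_delay={"a": 40, "b": 40, "c": 10},
    penalty=15,
    clock=100,
)
```

Here gate `c`, on the short path, moves to high Vth, and only one of the two gates on the long path can follow before its slack is used up.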

Slide 43: Can Gate-length biasing help leakage reduction?
- Reduce leakage?
- Reduce leakage variability?
[Figure: variation of leakage and delay (each normalized to 1) with biasing, for an NMOS device in an industrial 130nm technology]

Slide 44: Gate-length Biasing
- First proposed by Sirisantana et al.
  - Comparative study of effect of doping, tox and gate-length
  - Large bias used, significant slow down
- Small bias
  - Little reduction in leakage beyond 10% bias while delay degrades linearly
  - Preserves pin compatibility
- Technique applicable as post-RET step
- Salient features
  - Design cycle is not interfered with
  - Zero cost (no additional masks)

Slide 45: Granularity
- Technology-level: all devices in all cells have one biased gate-length
- Cell-level: all devices in a cell have one biased gate-length
- Device-level: all devices have independent biased gate-length
  - Simplification: in each cell, NMOS devices have one gate-length and PMOS devices have another

Slide 46: Device-Level Leakage Reduction

Slide 47: Circuit level
- Bias gate-length for non-critical cells
- Library extended with each cell having a biased version
- Benefits analyzed in conjunction with Multi-VT assignment and in isolation
  - SVT-SGL
  - DVT-SGL
  - SVT-DGL
  - DVT-DGL

Slide 48: Results: Leakage Reduction
- With less than 2.5% delay penalty
- Design Compiler used for VT assignment and gate-length biasing
- Better results expected with Duet (academic sizer from Michigan)

Slide 49: Results: Leakage Variability
[Figure: leakage distribution for the testcase alu128; traces shown: unbiased circuit, technology-level biasing, uniform biasing]
[Table: Percentage Reduction in Leakage Spread]

Slide 50: Futures
- Construction of effective biasing-based leakage optimization heuristics
- Gate-length selection at true device-level granularity
- Evaluation of gate-length biasing at future technology nodes

Slide 51: Multi-Vth + Vdd + Sizing
- Global signaling and layout optimization
- Multi-Vdd
- Static power analysis
- Multi-Vth + Vdd + sizing
Source: D. Sylvester, DAC-2001

Slide 52: Multi-Everything
- Need an approach that selects between speed, static power, and dynamic power
- Should be scalable to nanometer design
  - Rules out dual-Vth domino or other dynamic logic families (low supplies kill performance advantages)
- Techniques mentioned so far
  - Flexible, optimized cell layouts
  - Multi-Vdd
  - Dual-Vth
- Put them all together
Source: D. Sylvester, DAC-2001

Slide 53: Multi-Vdd Can Leverage Vth's
- Existing designs using multi-Vdd do not alter Vth in low-Vdd cells
  - Highly sub-optimal, delay is fully penalized
  - Limits cell replacement, which limits power savings
- Much better solution: reduce Vth in low-Vdd cells to carefully balance delay, static power, and dynamic power
  - Enforce technology scaling within a chip: whenever we reduce Vdd, we also reduce Vth to maintain speed
Source: D. Sylvester, DAC-2001

Slide 54: Multi-Vdd + Vth Negates Delay Penalty
- Delay ~ C·Vdd/Ion
- Scenarios
  - Constant Vth (current paradigm)
  - Scale Vth to maintain constant static power
  - Scale Vth to reduce static power linearly with Vdd
- Delay penalty is substantially offset
- Ion is very sensitive to Vth at Vdd < 1V
- Pstatic reduces with Vdd due to linear term and smaller Ioff (Ion and DIBL)
Source: D. Sylvester, DAC-2001
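How lowering Vth offsets the delay penalty of a lower Vdd can be sketched from the slide's Delay ~ C·Vdd/Ion relation. The slide does not give an Ion model, so the sketch assumes the alpha-power law, Ion ∝ (Vdd − Vth)^α with α = 1.3; that model, the exponent, and all voltages below are assumptions for illustration only.

```python
def relative_delay(vdd, vth, alpha=1.3):
    """Delay ~ C * Vdd / Ion with C held constant and Ion ~ (Vdd - Vth)**alpha.
    The alpha-power Ion model is an assumption, not taken from the slides."""
    if vdd <= vth:
        raise ValueError("model is only meaningful for vdd > vth")
    return vdd / (vdd - vth) ** alpha


def delay_penalty(vdd_high, vdd_low, vth_at_high, vth_at_low, alpha=1.3):
    """Delay at the low supply relative to delay at the high supply."""
    return relative_delay(vdd_low, vth_at_low, alpha) / relative_delay(
        vdd_high, vth_at_high, alpha
    )


# Hypothetical supplies of 1.0 V and 0.7 V, with Vth = 0.3 V at the high supply.
constant_vth = delay_penalty(1.0, 0.7, vth_at_high=0.3, vth_at_low=0.3)
scaled_vth = delay_penalty(1.0, 0.7, vth_at_high=0.3, vth_at_low=0.2)
```

In this toy model the low-Vdd cells are slower in both cases, but the penalty is noticeably smaller when Vth is reduced along with Vdd, which matches the direction of the slide's claim.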

Slide 55: Now Add Sizing
- Multi-Vdd + multi-Vth + sizing/cell layout optimization attacks power from many angles (multi-dimensional)
- Depending on criticality and switching activities, non-critical gates can be:
  - Assigned Vdd,low
  - Assigned Vdd,low + lower Vth
  - Assigned Vth,high
  - Downsized (at the individual transistor level if advantageous)
  - Assigned Vdd,low and upsized
    - For gates that cannot tolerate Vdd,low delay, this can be power efficient
  - And others
Source: D. Sylvester, DAC-2001

Slide 56: Summary
- Power density must saturate to maintain affordable packaging options
  - 50 W/cm² means W for future large MPUs
  - Dynamic thermal management saves 25% on packaging power budget
- Multi-Vdd will leverage multiple Vth's to offset delay penalty at low Vdd
  - More widespread re-assignment to Vdd,low
  - Use Vdd first instead of re-sizing to take advantage of large path slacks
  - Anticipated power savings of 50-80%
- Static power also addressed through multi-Vth + Vdd + sizing
  - Vth difficult to control in ultra-short channels
  - Intra-cell Vth assignment + MTCMOS/variants + sleep modes
Source: D. Sylvester, DAC-2001

Slide 57: Next Week: Project Meetings
Source: D. Sylvester, DAC-2001