Exponential Challenges, Exponential Rewards: The Future of Moore's Law. Based on a lecture by Shekhar Borkar, Intel Fellow, Circuit Research, Intel Labs.

Presentation transcript:

1 Exponential Challenges, Exponential Rewards: The Future of Moore's Law. Based on a lecture by Shekhar Borkar, Intel Fellow, Circuit Research, Intel Labs

2 ISSCC 2003: Gordon Moore said… “No exponential is forever… but we can delay ‘forever’”

3 Goal: 1 TIPS by 2010 [Chart labels: Pentium® Architecture, Pentium® Pro Architecture, Pentium® 4 Architecture] How do you get there?

4 Transistor Scaling. Will high K happen? Would you count on it?

5 Technology Scaling [Figure: transistor cross-sections with gate, source, drain, and body; labels Xj, Tox, D, Leff] Dimensions scale down by 30%: doubles transistor density. Oxide thickness scales down: faster transistor, higher performance. Vdd & Vt scaling: lower active power. Technology has scaled well; will it in the future?
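The link between a 30% linear shrink and doubled density can be checked with a little arithmetic. The sketch below is only an idealized illustration: it assumes every linear dimension shrinks by exactly 30% each generation, and the function names are hypothetical, not taken from the lecture.

```python
# Idealized geometric scaling: every linear dimension shrinks by 30% per generation.
LINEAR_SHRINK = 0.7  # each dimension becomes 70% of its previous value (assumption)


def area_factor(generations: int) -> float:
    """Relative area of one transistor after the given number of generations."""
    return (LINEAR_SHRINK ** 2) ** generations


def density_factor(generations: int) -> float:
    """Relative transistor density (transistors per unit area)."""
    return 1.0 / area_factor(generations)


for g in range(1, 4):
    print(f"after {g} generation(s): area x{area_factor(g):.3f}, "
          f"density x{density_factor(g):.2f}")
```

Under this assumption the area per transistor falls to about 0.49 of its previous value, which is why density roughly doubles each generation.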

6 Gate Oxide is Near Limit [Figure: 130 nm transistor cross-section; labels: 70 nm, Si3N4, CoSi2, Tox, gate, source, drain, body] Will high K happen? Would you count on it?

7 3D-Gate Transistor

8 Transistor Integration Capacity. On track for 1 billion transistor integration capacity.

9 35 Years of Microprocessor Trend C Moore, Data Processing in ExaScale-Class Computer Systems, Salishan, April 2011

10 Transistor Integration Capacity

11 Transistor Integration Capacity

12 Transistor Integration Capacity

13 Transistor Integration Capacity

14 Exponential Challenge #1

15 Is Transistor a Good Switch? Ideal switch: On, I = ∞; Off, I = 0. Actual transistor: On, I = 1 mA/μm; Off, I ≠ 0 (sub-threshold leakage).

16 Sub-threshold Leakage. Sub-threshold leakage increases exponentially. Assume: 0.25 μm, Ioff = 1 nA/μm, 5X increase each generation at 30ºC.
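To get a feel for how fast a 5X-per-generation increase compounds, the slide's starting point can be projected forward. This is a rough sketch rather than a device model: the 0.7 node shrink per generation is an assumption borrowed from the scaling slide, and the names are hypothetical.

```python
# Project sub-threshold leakage forward from the slide's assumed starting point.
START_NODE_UM = 0.25          # starting technology node, in micrometers (from the slide)
START_IOFF_NA_PER_UM = 1.0    # Ioff at the starting node, in nA/um (from the slide)
GROWTH_PER_GENERATION = 5.0   # 5X increase each generation (from the slide)
NODE_SHRINK = 0.7             # assumed linear shrink per generation


def projected_ioff(generations: int) -> float:
    """Ioff in nA/um after the given number of generations."""
    return START_IOFF_NA_PER_UM * GROWTH_PER_GENERATION ** generations


for g in range(5):
    node_um = START_NODE_UM * NODE_SHRINK ** g
    print(f"generation {g}: node ~{node_um:.3f} um, Ioff ~{projected_ioff(g):.0f} nA/um")
```

Four generations on, the projection reaches 625 nA/μm, which is the compounding that makes leakage an exponential challenge.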

17 Leakage Power Leakage power limits Vt scaling A. Grove, IEDM 2002

18 The Power Crisis

19 How Power Should Have Scaled A. Danowitz et al. CPU DB: Recording Microprocessor History. ACMQueue Processors, vol. 10, issue 4, pp

20 Exponential Challenge #4

21 Impact on Path Delays [Figure: probability distribution of path delay; axes: delay, probability] Path delay variability due to technological variations (in Vdd, Vt, and temperature) impacts individual circuit performance and power. Optimize each circuit for performance and power.

22 Impact on Path Delays [Figure: probability distribution of path delay; axes: delay, probability] Path delay variability due to technological variations (in Vdd, Vt, and temperature) impacts individual circuit performance and power. Optimize each circuit for performance and power. How many silicon atoms (111 pm) fit across a transistor channel (20 nm)? Is the 3D transistor a solution?
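The atom-count question on this slide can be estimated with integer arithmetic. The sketch below packs atoms side by side along the channel, which ignores the real silicon crystal lattice. The slide does not say whether 111 pm is a radius or a diameter, so both readings are shown, and all names are hypothetical.

```python
# Back-of-the-envelope count of silicon atoms spanning a 20 nm channel.
CHANNEL_LENGTH_PM = 20_000  # 20 nm expressed in picometers
SILICON_SIZE_PM = 111       # figure quoted on the slide


def atoms_across(length_pm: int, atom_diameter_pm: int) -> int:
    """Whole atoms that fit in a straight line of the given length."""
    return length_pm // atom_diameter_pm


# Reading 1 (assumption): 111 pm is the atomic radius, so the diameter is 222 pm.
atoms_if_radius = atoms_across(CHANNEL_LENGTH_PM, 2 * SILICON_SIZE_PM)
# Reading 2 (assumption): 111 pm is the atomic diameter.
atoms_if_diameter = atoms_across(CHANNEL_LENGTH_PM, SILICON_SIZE_PM)

print(f"111 pm as radius:   about {atoms_if_radius} atoms")
print(f"111 pm as diameter: about {atoms_if_diameter} atoms")
```

Either way the channel is only on the order of a hundred atoms long, so variations of a few atoms become a visible fraction of the device.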

23 Shift in Design Paradigm. From deterministic design to probabilistic and statistical design: a path delay estimate is probabilistic (not deterministic). Multi-variable design optimization for parameter variations, active and leakage power, and performance.
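One way to see what a probabilistic path delay estimate means is a small Monte Carlo experiment. This is a toy model, not the lecture's method. It assumes a path of independent gates with normally distributed delays, and the gate count, sigma, and clock period are hypothetical values chosen for illustration.

```python
import random
import statistics

# Toy statistical timing model (all values are illustrative assumptions).
NUM_GATES = 10             # gates on the path
NOMINAL_GATE_DELAY = 1.0   # normalized nominal delay per gate
SIGMA_FRACTION = 0.10      # per-gate delay standard deviation, as a fraction of nominal
CLOCK_PERIOD = 10.5        # normalized timing target for the whole path
NUM_SAMPLES = 5000         # simulated dies


def sample_path_delay(rng: random.Random) -> float:
    """Draw one path delay as the sum of independently varying gate delays."""
    sigma = SIGMA_FRACTION * NOMINAL_GATE_DELAY
    return sum(rng.gauss(NOMINAL_GATE_DELAY, sigma) for _ in range(NUM_GATES))


def simulate(seed: int = 0) -> list:
    """Return NUM_SAMPLES path delays from a seeded, reproducible generator."""
    rng = random.Random(seed)
    return [sample_path_delay(rng) for _ in range(NUM_SAMPLES)]


delays = simulate()
mean_delay = statistics.mean(delays)
std_delay = statistics.stdev(delays)
# Fraction of simulated dies whose path meets the clock period.
timing_yield = sum(d <= CLOCK_PERIOD for d in delays) / NUM_SAMPLES

print(f"mean delay {mean_delay:.2f}, std dev {std_delay:.2f}, "
      f"yield at period {CLOCK_PERIOD}: {timing_yield:.1%}")
```

A deterministic flow would report a single delay of 10.0 for this path. The statistical view instead reports a distribution and the fraction of dies that meet the timing target.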

24 Exponential Challenge #6

25 Exponential Costs [Charts: litho cost, fab cost, $ per transistor, $ per MIPS] Source: G. Moore, ISSCC 03

26 Some Implications. Tox scaling will slow down (may stop?). Vdd scaling will slow down (may stop?). Vt scaling will slow down (may stop?). Approaching constant Vdd scaling. Energy/logic op will not scale.

27 The Terascale Dilemma. Many-billion transistor integration capacity will be available, but could be unusable due to power. Logic transistor growth will slow down. Transistor performance will be limited. Solutions: low power design techniques; improve design efficiency.

28 Exponential Challenge #5

29 Platform Requirements. Shrinking volume, quieter, yet high performance. [Charts: power (W), thermal budget (°C/W), heat-sink volume (in³), and air flow rate (CFM) against system volume (cubic inches) for PC tower, mini tower, μ tower, slim line, and small PC form factors; labels include projected heat dissipation volume, projected air flow rate, Pentium® III, and Pentium® 4] Thermal budget decreasing; higher heat-sink volume; higher air flow rate.

30 Active Power Reduction. Multiple Vdd, throughput-oriented design. [Figure labels: Slow, Fast, Slow; Low Supply Voltage, High Supply Voltage] One logic block at Vdd: Freq = 1, Vdd = 1, Throughput = 1, Power = 1, Area = 1, Pwr Den = 1. Two logic blocks at Vdd/2: Freq = 0.5, Vdd = 0.5, Throughput = 1, Power = 0.25, Area = 2, Pwr Den = 0.125 (Power / Area).
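The numbers on this slide follow from the usual first-order model of dynamic power, P ∝ C·Vdd²·f. The sketch below reproduces them under that assumption, treating switched capacitance and area as proportional to the number of logic blocks. The function names are hypothetical.

```python
# First-order model: dynamic power ~ C * Vdd^2 * f, with C proportional to block count.
def relative_power(blocks: int, vdd: float, freq: float) -> float:
    """Active power relative to one block running at Vdd = 1 and Freq = 1."""
    return blocks * vdd ** 2 * freq


def relative_throughput(blocks: int, freq: float) -> float:
    """Work per unit time, assuming the blocks operate in parallel."""
    return blocks * freq


def power_density(blocks: int, vdd: float, freq: float) -> float:
    """Power divided by area, with area proportional to the number of blocks."""
    return relative_power(blocks, vdd, freq) / blocks


baseline = (1, 1.0, 1.0)   # one block at full voltage and frequency
parallel = (2, 0.5, 0.5)   # two blocks at half voltage and half frequency

for name, (blocks, vdd, freq) in (("baseline", baseline), ("parallel", parallel)):
    print(f"{name}: throughput {relative_throughput(blocks, freq)}, "
          f"power {relative_power(blocks, vdd, freq)}, "
          f"power density {power_density(blocks, vdd, freq)}")
```

Throughput stays at 1 while power drops to a quarter, at the cost of twice the area. That trade is what makes throughput-oriented design attractive when transistors are plentiful but power is not.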

31 Design & μArch Efficiency. Employ efficient design & μArchitectures.

32 Improve μArch Efficiency [Figure: a single thread (ST) alternates between execution and waiting for memory; with multi-threading, threads MT1, MT2, and MT3 run during the waits] Thermals & power delivery are designed for full HW utilization. Multi-threading improves performance without impacting thermals & power delivery. Source: Computer Architecture: A Quantitative Approach (Hennessy; Patterson, 2011)
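The benefit of filling memory waits with other threads can be approximated with a simple utilization formula. This sketch is an idealization: it assumes every thread repeats the same compute-then-wait pattern and that switching threads costs nothing. The cycle counts are hypothetical.

```python
# Idealized core utilization when threads alternate compute bursts with memory waits.
def core_utilization(threads: int, compute_cycles: int, memory_wait_cycles: int) -> float:
    """Fraction of cycles the core spends executing, capped at full utilization."""
    per_thread = compute_cycles / (compute_cycles + memory_wait_cycles)
    return min(1.0, threads * per_thread)


# Hypothetical workload: 20 cycles of compute followed by a 60-cycle memory wait.
for n in (1, 2, 3, 4, 8):
    print(f"{n} thread(s): utilization {core_utilization(n, 20, 60):.0%}")
```

With this workload a single thread keeps the core busy 25% of the time, and four threads are enough to hide the memory wait completely.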

33 Increase on-die Memory. Large on-die memory provides: 1. Increased data bandwidth & reduced latency. 2. Hence, higher performance for much lower power.

34 Chip Multi-Processing Keynote presentation (L. Benini, RSP 2010).

35 Chip Multi-Processing [Figure: cores C1, C2, C3, C4 with a shared cache] Multi-core, each core multi-threaded. Shared cache and front side bus. Each core has different Vdd & Freq. Spreading hot spots. Lower junction temperature.

36 Example (Itanium Tukwila)

37 Example (Itanium Tukwila) 30 MBytes cache 130 Watts

38 Example (Itanium Tukwila)

39 What Will the Cores Look Like?

40 What Will the Cores Look Like?

41 What Will the Cores Look Like?

42 What Will the Cores Look Like? Clocks run at the same frequency but with unknown phases.

43 What Will the Cores Look Like?

44 What Will the Cores Look Like? Intelligent workload redistribution. Improved energy efficiency. Multiple functionalities.

45 What Will the Cores Look Like? Several interconnection possibilities: mesh, ring.

46 Tera-Scale RMS - Recognition, Mining and Synthesis

47 Tera-Scale

48 Tera-Scale

49 Tera-Scale

50 The Exponential Reward [Chart labels: Speculative, OOO; Era of Instruction Level Parallelism; Super Scalar; Era of Pipelined Architecture; Multi-Threaded; Era of Thread & Processor Level Parallelism; Special Purpose HW; Multi-Threaded, Multi-Core]

51 Summary: Delaying Forever. Terascale transistor integration capacity will be available; power and energy are the barriers. Variations will be even more prominent; shift from deterministic to probabilistic design. Improve design efficiency. Exploit integration capacity to deliver performance in power/cost envelope.

52 Exercises
1. Discuss a problem associated with device integration.
2. Comment on the statement: “Shrinking transistor sizes changes the paradigm for evaluating energy consumption and execution time from deterministic to probabilistic.”
3. Why is static energy consumption so problematic for future technologies?
4. Why is voltage reduction one of the main elements to address in order to reduce energy consumption?
5. How can a system with multiple supply voltages contribute to reducing energy consumption? What is the effect on execution time?

53 Exercises
6. Draw an illustration showing how a multi-threaded program can make better use of a system's resources, reducing the memory communication bottleneck.
7. Why does the share of on-chip memory in an integrated circuit exceed 50% in current processors?
8. Given the limits of scaling, what can be done to keep increasing machine performance?
9. What are the trends in computation (cores), communication infrastructure, and storage for upcoming processors?