Trends in the Infrastructure of Computing: Processing, Storage, Bandwidth CSCE 190: Computing in the Modern World Dr. Jason D. Bakos.

Slides:



Advertisements
Similar presentations
Multiprocessors— Large vs. Small Scale Multiprocessors— Large vs. Small Scale.
Advertisements

Lecture 6: Multicore Systems
Lecture 0: Introduction
ECE 6466 “IC Engineering” Dr. Wanda Wosik
GPGPU Introduction Alan Gray EPCC The University of Edinburgh.
Types of Parallel Computers
Heterogeneous Computing: New Directions for Efficient and Scalable High-Performance Computing Dr. Jason D. Bakos.
Some Thoughts on Technology and Strategies for Petaflops.
CSCE 612: VLSI System Design Instructor: Jason D. Bakos.
Introduction to CMOS VLSI Design Lecture 0: Introduction
FPGA Acceleration of Phylogeny Reconstruction for Whole Genome Data Jason D. Bakos Panormitis E. Elenis Jijun Tang Dept. of Computer Science and Engineering.
CSCE 190: Computing in the Modern World Dr. Jason D. Bakos
ECE 526 – Network Processing Systems Design
Parallel Computer Architectures
CSCE 212 Introduction to Computer Architecture
CSCE 613: Fundamentals of VLSI Chip Design Instructor: Jason D. Bakos.
Seven Minute Madness: Reconfigurable Computing Dr. Jason D. Bakos.
Heterogeneous Computing Dr. Jason D. Bakos. Heterogeneous Computing 2 “Traditional” Parallel/Multi-Processing Large-scale parallel platforms: –Individual.
Integrated Circuit Design and Fabrication Dr. Jason D. Bakos.
Lecture 0: Introduction. CMOS VLSI Design 4th Ed. 0: Introduction2 Introduction  Integrated circuits: many transistors on one chip.  Very Large Scale.
3.1Introduction to CPU Central processing unit etched on silicon chip called microprocessor Contain tens of millions of tiny transistors Key components:
Introduction Integrated circuits: many transistors on one chip.
1b.1 Types of Parallel Computers Two principal approaches: Shared memory multiprocessor Distributed memory multicomputer ITCS 4/5145 Parallel Programming,
Single-Chip Multi-Processors (CMP) PRADEEP DANDAMUDI 1 ELEC , Fall 08.
The Devices: Diode.
Z. Feng VLSI Design 1.1 VLSI Design MOSFET Zhuo Feng.
INTRODUCTION TO MICROPROCESSORS
Lecture#14. Last Lecture Summary Memory Address, size What memory stores OS, Application programs, Data, Instructions Types of Memory Non Volatile and.
Trends in the Infrastructure of Computing: Processing, Storage, Bandwidth CSCE 190: Computing in the Modern World Dr. Jason D. Bakos.
1 Lecture 1: CS/ECE 3810 Introduction Today’s topics:  Why computer organization is important  Logistics  Modern trends.
1 Integrated Circuits Basics Titov Alexander 25 October 2014.
Lecture 2. Logic Gates Prof. Taeweon Suh Computer Science Education Korea University 2010 R&E Computer System Education & Research.
Chapter 2 The CPU and the Main Board  2.1 Components of the CPU 2.1 Components of the CPU 2.1 Components of the CPU  2.2Performance and Instruction Sets.
Multi-core Programming Introduction Topics. Topics General Ideas Moore’s Law Amdahl's Law Processes and Threads Concurrency vs. Parallelism.
Lecture 0: Introduction. CMOS VLSI Design 4th Ed. 0: Introduction2 Introduction  Integrated circuits: many transistors on one chip.  Very Large Scale.
High Performance Computing Processors Felix Noble Mirayma V. Rodriguez Agnes Velez Electric and Computer Engineer Department August 25, 2004.
SJSU SPRING 2011 PARALLEL COMPUTING Parallel Computing CS 147: Computer Architecture Instructor: Professor Sin-Min Lee Spring 2011 By: Alice Cotti.
VTU – IISc Workshop Compiler, Architecture and HPC Research in Heterogeneous Multi-Core Era R. Govindarajan CSA & SERC, IISc
Text Book: Silicon VLSI Technology Fundamentals, Practice and Modeling Authors: J. D. Plummer, M. D. Deal, and P. B. Griffin Class: ECE 6466 “IC Engineering”
Trends in the Infrastructure of Computing
 Historical view:  1940’s-Vacuum tubes  1947-Transistors invented by willliam shockely & team  1959-Integrated chips invented by Texas Instrument.
CMOS VLSI Design Introduction
Silicon Design Page 1 The Creation of a New Computer Chip.
Introduction to CMOS Transistor and Transistor Fundamental
Trends in the Infrastructure of Computing: Processing, Storage, Bandwidth CSCE 190: Computing in the Modern World Dr. Jason D. Bakos.
Systems Architecture, Fourth Edition 1 Processor Technology and Architecture Chapter 4.
Microprocessor Design Process
2007/11/20 Paul C.-P. Chao Optoelectronic System and Control Lab., EE, NCTU P1 Copyright 2015 by Paul Chao, NCTU VLSI Lecture 0: Introduction Paul C.–P.
Heterogeneous Processing KYLE ADAMSKI. Overview What is heterogeneous processing? Why it is necessary Issues with heterogeneity CPU’s vs. GPU’s Heterogeneous.
COURSE NAME: SEMICONDUCTORS Course Code: PHYS 473.
Introduction to CMOS VLSI Design Lecture 0: Introduction.
1. Introduction. Diseño de Circuitos Digitales para Comunicaciones Introduction Integrated circuits: many transistors on one chip. Very Large Scale Integration.
Fall 2012 Parallel Computer Architecture Lecture 4: Multi-Core Processors Prof. Onur Mutlu Carnegie Mellon University 9/14/2012.
William Stallings Computer Organization and Architecture 6th Edition
Lynn Choi School of Electrical Engineering
Lecture 1: CS/ECE 3810 Introduction
Chapter 4 Processor Technology and Architecture
RAM, CPUs, & BUSES Egle Cebelyte.
INTRODUCTION TO MICROPROCESSORS
CSCE 212 Introduction to Computer Architecture
CSCE 190: Computing in the Modern World Dr. Jason D. Bakos
Lecture 1: CS/ECE 3810 Introduction
BIC 10503: COMPUTER ARCHITECTURE
3.1 Introduction to CPU Central processing unit etched on silicon chip called microprocessor Contain tens of millions of tiny transistors Key components:
Trends in the Infrastructure of Computing
Chapter 1 Introduction.
Computer Evolution and Performance
Presentation transcript:

Trends in the Infrastructure of Computing: Processing, Storage, Bandwidth CSCE 190: Computing in the Modern World Dr. Jason D. Bakos

CSCE 190: Computing in the Modern World 2 Elements

CSCE 190: Computing in the Modern World 3 Semiconductors Silicon is a group IV element (4 valence electrons, shells: 2, 8, 18, 32…) –Forms covalent bonds with four neighbor atoms (3D cubic crystal lattice) –Si is a poor conductor, but conduction characteristics may be altered –Add impurities/dopants (replaces silicon atom in lattice): Makes a better conductor Group V element (phosphorus/arsenic) => 5 valence electrons –Leaves an electron free => n-type semiconductor (electrons, negative carriers) Group III element (boron) => 3 valence electrons –Borrows an electron from neighbor => p-type semiconductor (holes, positive carriers) forward bias reverse bias P-N junction

CSCE 190: Computing in the Modern World 4 MOSFETs body/bulk GROUND NMOS/NFETPMOS/PFET channel shorter length, faster transistor (dist. for electrons) body/bulk HIGH positive voltage (Vdd) negative voltage (rel. to body) (GND) (S/D to body is reverse-biased) current Metal-poly-Oxide-Semiconductor structures built onto substrate –Diffusion: Inject dopants into substrate –Oxidation: Form layer of SiO2 (glass) –Deposition and etching: Add aluminum/copper wires

CSCE 190: Computing in the Modern World 5 Layout 3-input NAND

CSCE 190: Computing in the Modern World 6 Logic Gates invNAND2 NAND3 NOR2

CSCE 190: Computing in the Modern World 7 Logic Synthesis Behavior: –S = A + B –Assume A is 2 bits, B is 2 bits, C is 3 bits ABC 00 (0) 000 (0) 00 (0)01 (1)001 (1) 00 (0)10 (2)010 (2) 00 (0)11 (3)011 (3) 01 (1)00 (0)001 (1) 01 (1) 010 (2) 01 (1)10 (2)011 (3) 01 (1)11 (3)100 (4) 10 (2)00 (0)010 (2) 10 (2)01 (1)011 (3) 10 (2) 100 (4) 10 (2)11 (3)101 (5) 11 (3)00 (0)011 (3) 11 (3)01 (1)100 (4) 11 (3)10 (2)101 (5) 11 (3) 110 (6)

CSCE 190: Computing in the Modern World 8 MIPS Microarchitecture

CSCE 190: Computing in the Modern World 9 Synthesized and P&R’ed MIPS Architecture

CSCE 190: Computing in the Modern World 10 Feature Size Shrink minimum feature size… –Smaller L decreases carrier time and increases current –Therefore, W may also be reduced for fixed current –C g, C s, and C d are reduced –Transistor switches faster (~linear relationship)

CSCE 190: Computing in the Modern World 11 Minimum Feature Size YearProcessorSpeedProcess 1982i MHz 1.5 m 1986i38616 – 40 MHz m 1989i MHz.8 m 1993Pentium MHz m 1995Pentium Pro MHz m 1997Pentium II MHz m 1999Pentium III450 – 1400 MHz m 2000Pentium 41.3 – 3.8 GHz m 2005Pentium D2.66 – 3.6 GHz m 2006Core – 3 GHz.065 m Upcoming milestones: 45 nm (Xeon 5400 Nov. 2007), 32 nm ( ), 22 nm ( ), 16 nm (2013)

CSCE 190: Computing in the Modern World 12 Integration Density Trends (Moore’s Law) Pentium Core 2 Duo (2007) has ~300M transistors

CSCE 190: Computing in the Modern World 13 Microprocessor Technology Advances in fabrication (lithography, photoresist, metal layers) …faster transistor switching (faster processor) …smaller transistors/wires …higher integration density …more “real estate” …architectural improvements!

CSCE 190: Computing in the Modern World 14 Parallelism Decompose a problem into a set of smaller sub-problems Process multi sub-problems simultaneously Different techniques: –Platform: Spread problem across multiple CPUs (parallel processing) Spread problem across multiple components on a CPU (microarchitectural parallelism) –Programming: Automatically DYNAMIC Automatically STATIC Programmer using programming model

CSCE 190: Computing in the Modern World 15 Parallel Processing Parallel processing: –Shared memory Symmetric multiprocessing Multiple CPUs share a single memory space (usually NUMA) Communicate through memory reference Each CPU may have local but globally accessible memory Requires expensive crossbar switch (16-processor => $500K) –Message-passing No shared memory CPUs communicate via explicit messages MPI and OpenMP APIs COTS processors and high-speed LAN switch Scalable: –NASA Space Exploration Simulator has 10,240 CPUs (Intel Itanium 2) and requires 1 MW (Lake Murray generates 200 MW) –Laurence Livermore BlueGene/L has 65,536 dual-processor (700 MHz PowerPC) nodes and requires 1.5 MW –Hybrid systems

CSCE 190: Computing in the Modern World 16 Microarchitectural Parallelism Parallelism => perform multiple operations simultaneously –Instruction-level parallelism Execute multiple instructions at the same time Multiple issue Out-of-order execution Speculation –Thread-level parallelism (hyper-threading) Execute multiple threads at the same time on one CPU Threads share memory space and pool of functional units –Chip multiprocessing Execute multiple processes/threads at the same time on multiple CPUs Cores are symmetrical and completely independent but share a common level-2 cache

CSCE 190: Computing in the Modern World 17 Heterogeneous Computing General-purpose CPUs strike a balance when allocating real estate: –on-chip memory (cache) optimized for both sequential and random access –floating-point units for arithmetic –multimedia units (SIMD) –operating system support Special-purpose processors can accelerate programs –GPUs (stream processors) –FPGAs

CSCE 190: Computing in the Modern World 18 High-Performance Reconfigurable Computing HPRC: –Use FPGA as co-processor Example: –Application requires a week of CPU time –One computation consumes 99% of execution time Kernel speedup Application speedup Execution time hours hours hours hours hours Replaces software Exploits parallelism

CSCE 190: Computing in the Modern World 19 HPRC: Requirements, Pros, Cons Application criteria: 1.computationally expensive 2.has a bottleneck computation 3.bottleneck computation is parallelizable 4.…and has low I/O and storage requirements Advantages of HPRC: –Cost FPGA card => ~ $15K 128-processor cluster => ~ $150K + maintenance + cooling + electricity + recycling Disadvantage for HPRC: –Programming the FPGA

CSCE 190: Computing in the Modern World 20 LANs Peripheral and LAN switched interconnects are merging LAN –Fibre Channel For storage devices / SAN (1 – Gbps) 16 port 1U 2.12 Gbps is $15K –Infiniband (copper or fiber) 2.5 Gbps 16 port is $10K –Myrinet (designed for clusters) 10 Gbps 16 port for $10K –1G/10G Ethernet