Emerging Technologies: A CompSci Perspective UC SANTA BARBARA Tim Sherwood.

Software Beware

The End of an Era
$381B / year

The Beginning of a New Era
80 Cores, Integrated MEMS, 3D Stacks of Dies

The Role of Architecture
Layers, from SW to HW: Applications, Runtime System, Architecture, Circuit, Device, Package
Demands (Battery Life, Performance, Programmability)
Constraints from Emerging Technology (Noise, Thermal, Yield)

A Simple Performance “Ecosystem”
[Figure: feedback among app, OS or runtime feedback, parallelism, freq, V, utilized area, communication, dynamic power, leakage, total power, package, temp, and chip performance]
No multicore, no spatial variance, no temporal variance, no metrics of cost or error or yield
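
The feedback among voltage, frequency, dynamic power, leakage, package, and temperature named on this slide can be illustrated with a toy fixed-point iteration. This is only a sketch: the switching-power formula is the textbook one, while the leakage rule (doubling every 20 °C), the package thermal resistance, and every numeric constant are hypothetical placeholders that do not come from the slides.

```python
# Toy fixed-point model of the power/temperature feedback loop.
# Every constant below is a hypothetical placeholder, not a value from the slides.

def dynamic_power(c_eff, v, freq, utilization):
    """Textbook switching-power estimate: P = activity * C * V^2 * f (watts)."""
    return utilization * c_eff * v * v * freq

def leakage_power(p_leak_ref, temp, temp_ref=45.0, doubling_c=20.0):
    """Assumed rule: leakage doubles every `doubling_c` degrees C above temp_ref."""
    return p_leak_ref * 2.0 ** ((temp - temp_ref) / doubling_c)

def steady_state(c_eff=1e-8, v=1.2, freq=3e9, utilization=0.5,
                 p_leak_ref=10.0, r_package=0.4, t_ambient=45.0,
                 tol=1e-6, max_iter=1000):
    """Iterate temp -> leakage -> total power -> temp until it settles.

    r_package is an assumed package thermal resistance in degrees C per watt.
    Returns (temperature in degrees C, total power in watts).
    """
    temp = t_ambient
    p_dyn = dynamic_power(c_eff, v, freq, utilization)
    for _ in range(max_iter):
        total = p_dyn + leakage_power(p_leak_ref, temp)
        new_temp = t_ambient + r_package * total
        if abs(new_temp - temp) < tol:
            return new_temp, total
        temp = new_temp
    raise RuntimeError("no thermal steady state found (possible thermal runaway)")

if __name__ == "__main__":
    temp, total = steady_state()
    print(f"steady state: {temp:.1f} C at {total:.1f} W")
```

Raising `v` or `freq` in this sketch raises dynamic power, which raises temperature, which in turn raises leakage. As the slide itself notes, a model this simple leaves out multicore effects, spatial and temporal variance, and any metric of cost, error, or yield.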

3D Integration
80 Cores, Integrated MEMS, 3D Stacks of Dies

3D Technology
[Figure: die stack; labels: Through Silicon Vias (TSV), 5x5μm, Standard Si Substrate, CMP Reduced Si Layer, active layer]
Science Fiction?
- Intel, IBM, Ziptronix invest heavily in 3D integration research
- Many demonstrated 3D prototype systems

Work at UCSB
Shashidhar Mysore, Banit Agrawal, Sheng-Chih Lin, Navin Srivastava, Kaustav Banerjee and Timothy Sherwood. Introspective 3D Chips, Proceedings of the Twelfth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October San Jose, CA
Gian Luca Loi, Banit Agrawal, Navin Srivastava, Sheng-Chih Lin, Timothy Sherwood, Kaustav Banerjee. A Thermally-Aware Performance Analysis of Vertically Integrated (3-D) Processor-Memory Hierarchy, Proceedings of the 43rd Design Automation Conference (DAC), June San Francisco, CA

Basic Savings in 3D
1 layer: Area: 4, Dist: √8 ≈ 2.8, BW: √8 ≈ 2.8
2 layers: Area: 2, Dist: √4 = 2 + 1L, BW: 2√4 = 4
4 layers: Area: 1, Dist: √2 ≈ … L, BW: 4√2 ≈ 5.6
On-chip latency improved
Bandwidth could improve even more
UCSB First to Successfully Model Thermal/Performance
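
The area, distance, and bandwidth figures on this slide are consistent with a simple geometric reading: a square die of total area 4 is folded into 1, 2, or 4 layers, the worst-case in-plane distance is the diagonal of one layer's footprint, and relative bandwidth scales with the number of layers times that diagonal. The sketch below encodes that reading. The function name `stacked_layout` is made up for illustration, and counting `layers - 1` vertical hops (the slide's "L" term) is an assumption, not something the slide states for every case.

```python
import math

def stacked_layout(total_area, layers):
    """Geometry of a square die of `total_area` split evenly across `layers`.

    Assumptions (not from the slides): square footprints, worst-case
    in-plane distance equal to the footprint diagonal, and one vertical
    hop per layer boundary crossed.
    """
    footprint = total_area / layers            # area of a single layer
    diagonal = math.sqrt(2.0 * footprint)      # corner-to-corner distance
    return {
        "footprint": footprint,
        "in_plane_dist": diagonal,
        "layer_hops": layers - 1,              # assumed count of "L" terms
        "relative_bw": layers * diagonal,      # layers x cross-section width
    }

if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, stacked_layout(4.0, n))
```

Under these assumptions the in-plane distance shrinks with the square root of the layer count while relative bandwidth grows with it, which matches the slide's point that bandwidth could improve even more than latency.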

Addressing more than Performance
- The hardware/software boundary is uniquely situated
  - Ultimately, everything is an instruction
  - Used by Intel, AMD, Freescale, to guide their development
- Could provide unprecedented visibility
  - Not just data capture: we need the ability to put together a cohesive picture of system interactions and correlate between them in a sound and non-intrusive manner

Cutting Through Abstraction
Complex interactions across levels of abstraction make debugging, optimizing, securing, and analyzing systems difficult in general

What programmers want
[Figure: processor floorplan feeding integrated monitoring hardware; blocks: L1_BPU, Decode, Trace Cache Top, L2_BPU, Bus Control, MOB, ITLB, Trace Cache Bottom, DTLB, L1 Cache Top, L2 Cache, L1 Cache Bottom, FP Exec, UROM, FP Reg, Alloc, Rename, Instr Q1, Sched, Instr Q2, Int Reg, Retire, Int Exec, Mem Ctl]
- 32 bit Memory Address
- 32 bit Memory Value
- 10 bit Opcodes
- 2, 5 bit Register Names
- 2, 32 bit Register Values
- 10 bits of “status”
3x 4x
1892 bits per cycle = 1 … @ 4 GHz
Less buggy systems ($54 Billion / Year)
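
To get a feel for the monitoring bandwidth on this slide, the sketch below adds up the listed per-instruction fields and converts the slide's 1892 bits per cycle at 4 GHz into bytes per second. Reading "2, 5 bit Register Names" as two 5-bit names and "2, 32 bit Register Values" as two 32-bit values is an interpretation. How the slide's 3x and 4x multipliers combine is not recoverable from the transcript, so the 1892 figure is taken as given rather than derived. All names in the code are hypothetical.

```python
# Per-instruction fields listed on the slide, in bits.
# The "2 x" groupings are an interpretation of the slide text.
FIELD_BITS = {
    "memory_address": 32,
    "memory_value": 32,
    "opcode": 10,
    "register_names": 2 * 5,
    "register_values": 2 * 32,
    "status": 10,
}

# Sum of the listed fields for a single traced instruction.
bits_per_event = sum(FIELD_BITS.values())

def trace_bandwidth_bytes_per_s(bits_per_cycle, clock_hz):
    """Raw capture bandwidth if `bits_per_cycle` are emitted every cycle."""
    return bits_per_cycle * clock_hz / 8

if __name__ == "__main__":
    print("bits per traced instruction:", bits_per_event)
    bw = trace_bandwidth_bytes_per_s(1892, 4e9)  # figures quoted on the slide
    print(f"raw trace bandwidth: {bw / 1e9:.0f} GB/s")
```

Whatever the exact multipliers, the point survives: capturing this much state every cycle at gigahertz clock rates produces on the order of hundreds of gigabytes per second, far more than conventional on-chip interconnect can route to a monitor.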

Why programmers can't have it
- Interconnect is not free
  - Huge cross-chip busses
  - OptBuf 285 μm
  - 20,000 buffers
- Analysis is not free
  - Significant processing required
- Extra cost of added heat
  - $15 budget for cooling
[Figure: processor floorplan, used by developers, feeding integrated monitoring hardware]

3D Introspection
[Figure: die stack; labels: Primary Processor, Observer, 5x5μm, Standard Si Substrate, CMP Reduced Si Layer, active layer]

Thermal Impact
[Figure: thermal maps; panels: P4 – Base Case, w/ 4x Processing, w/ 8x Processing; labels: Processing Layer w/ 3D Layer, Analysis Layer]

Conclusions
- Emerging Technologies will play a significant new role
  - Risk is hard to avert right now
- UCSB Computer Science and Engineering
  - Collaborate across disciplines to consider the entire SW/HW
- Research that is driving industry
  - UCSB Technology in use in most Microprocessors and Networks
  - Always looking for more collaboration with industry partners

NSF CNS , NSF CCF , NSF CCF