Technion, Haifa Israel, June 2013


1 Technion, Haifa Israel, June 2013
21st Century Computer Architecture
A community white paper
Technion, Haifa Israel, June 2013
  Information & Commun. Tech’s Impact
  Semiconductor Technology’s Challenges
  Computer Architecture’s Future
  Example: Bypassing Paged Virtual Memory

2 White Paper Participants
Sarita Adve, U Illinois *
David H. Albonesi, Cornell U
David Brooks, Harvard U
Luis Ceze, U Washington *
Sandhya Dwarkadas, U Rochester
Joel Emer, Intel/MIT
Babak Falsafi, EPFL
Antonio Gonzalez, Intel/UPC
Mark D. Hill, U Wisconsin *,**
Mary Jane Irwin, Penn State U *
David Kaeli, Northeastern U *
Stephen W. Keckler, NVIDIA/U Texas
Christos Kozyrakis, Stanford U
Alvin Lebeck, Duke U
Milo Martin, U Pennsylvania
José F. Martínez, Cornell U
Margaret Martonosi, Princeton U *
Kunle Olukotun, Stanford U
Mark Oskin, U Washington
Li-Shiuan Peh, M.I.T.
Milos Prvulovic, Georgia Tech
Steven K. Reinhardt, AMD
Michael Schulte, AMD/U Wisconsin
Simha Sethumadhavan, Columbia U
Guri Sohi, U Wisconsin
Daniel Sorin, Duke U
Josep Torrellas, U Illinois *
Thomas F. Wenisch, U Michigan *
David Wood, U Wisconsin *
Katherine Yelick, UC Berkeley/LBNL *
“*” contributed prose; “**” effort coordinator
Thanks to CCC, Erwin Gianchandani & Ed Lazowska for guidance and Jim Larus & Jeannette Wing for feedback

3 20th Century ICT Set Up
Information & Communication Technology (ICT) Has Changed Our World
  <long list omitted>
Required innovations in algorithms, applications, programming languages, …, & system software
Key (invisible) enablers of (cost-)performance gains
  Semiconductor technology (“Moore’s Law”)
  Computer architecture (~80x per Danowitz et al.)

4 Enablers: Technology + Architecture
Danowitz et al., CACM 04/2012, Figure 1

5 21st Century Promise
ICT Promises Much More
  Data-centric personalized health care
  Computation-driven scientific discovery
  Human network analysis
  Much more: known & unknown
Characterized by
  Big Data
  Always Online
  Secure/Private
Whither enablers of future (cost-)performance gains?

6 Technology’s Challenges 1/2
Late 20th Century | The New Reality
Moore’s Law: 2× transistors/chip | Transistor count still 2× BUT…
Dennard Scaling: ~constant power/chip | Gone. Can’t repeatedly double power/chip

7 Classic CMOS Dennard Scaling: the Science behind Moore’s Law
(Finding 2)
Scaling: Voltage: V/a; Oxide: t_OX/a
Results: Power/ckt: 1/a^2; Power Density: ~Constant
Source: Future of Computing Performance: Game Over or Next Level?, National Academy Press, 2011
National Research Council (NRC) – Computer Science and Telecommunications Board (CSTB.org)

8 Post-classic CMOS Dennard Scaling
Post Dennard CMOS Scaling Rule
Scaling: Voltage: V/a → V; Oxide: t_OX/a
Results: Power/ckt: 1/a^2 → 1; Power Density: ~Constant → a^2
TODO: Chips w/ higher power (no), smaller (), dark silicon (), or other (?)
National Research Council (NRC) – Computer Science and Telecommunications Board (CSTB.org)
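
The contrast between the classic and post-Dennard rules can be checked with a few lines of arithmetic. The sketch below is an illustration, not part of the original slides: it assumes the usual dynamic-power model P = C·V²·f, with capacitance shrinking by 1/a, frequency rising by a, and circuit area shrinking by 1/a² per generation. The function name `scale_power` and the choice a = 1.4 are hypothetical.

```python
def scale_power(a, voltage_scales=True):
    """Return (power_per_circuit, power_density) relative to the prior generation.

    Assumed model (not stated on the slides): dynamic power P = C * V^2 * f,
    with C -> C/a, f -> a*f, and circuit area -> area/a^2 when dimensions
    shrink by a factor of a.
    """
    capacitance = 1.0 / a
    frequency = a
    # Classic Dennard scaling lowers voltage by 1/a; post-Dennard keeps it fixed.
    voltage = 1.0 / a if voltage_scales else 1.0
    area = 1.0 / a**2
    power = capacitance * voltage**2 * frequency
    return power, power / area


a = 1.4  # roughly sqrt(2): one generation that doubles transistors per unit area
classic = scale_power(a, voltage_scales=True)
post = scale_power(a, voltage_scales=False)
print("classic:      power/ckt = %.3f, power density = %.3f" % classic)
print("post-Dennard: power/ckt = %.3f, power density = %.3f" % post)
```

Under these assumptions the classic case gives power per circuit of 1/a² with constant power density, while holding voltage fixed gives constant power per circuit and power density growing as a², matching the two slides.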

9 Technology’s Challenges 2/2
Late 20th Century | The New Reality
Moore’s Law: 2× transistors/chip | Transistor count still 2× BUT…
Dennard Scaling: ~constant power/chip | Gone. Can’t repeatedly double power/chip
Modest (hidden) transistor unreliability | Increasing transistor unreliability can’t be hidden
Focus on computation over communication | Communication (energy) more expensive than computation
1-time costs amortized via mass market | One-time cost much worse & want specialized platforms
How should architects step up as technology falters?

10 21st Century Comp Architecture
20th Century | 21st Century
Single-chip in generic computer | Architecture as Infrastructure: Spanning sensors to clouds; Performance plus security, privacy, availability, programmability, …
Performance via invisible instr.-level parallelism | Energy First: Parallelism, Specialization, Cross-layer design
Predictable technologies: CMOS, DRAM, & disks | New technologies (non-volatile memory, near-threshold, 3D, photonics, …); Rethink: memory & storage, reliability, communication
Cross-Cutting: Break current layers with new interfaces

11 21st Century Comp Architecture
20th Century | 21st Century
Single-chip in stand-alone computer | Architecture as Infrastructure: Spanning sensors to clouds; Performance plus security, privacy, availability, programmability, …
Performance via invisible instr.-level parallelism | Energy First: Parallelism, Specialization, Cross-layer design
Predictable technologies: CMOS, DRAM, & disks | New technologies (non-volatile memory, near-threshold, 3D, photonics, …); Rethink: memory & storage, reliability, communication
Cross-Cutting: Break current layers with new interfaces

12 What Research Exactly?
Research areas in white paper (& backup slides)
  Architecture as Infrastructure: Spanning Sensors to Clouds
  Energy First
  Technology Impacts on Architecture
  Cross-Cutting Issues & Interfaces
Much more research developed by future PIs!
E.g.: Efficient Virtual Memory for Big Memory Servers, Basu, Gandhi, Chang, Hill, & Swift [ISCA 2013]
  Big Memory: graph500, memcached, databases
  Self-manage most memory (e.g., bufferpool)

13 Execution Time Overhead: TLB Misses
[Figure: execution time overhead of TLB misses per workload; lower is better. Annotations: Significant waste; Larger memory? Byte-addr NVM?]
First, let’s see whether the problem actually exists. For a set of workloads, the first column shows the cycles that the hardware page table walker spends, as a percentage of execution time, on a 32 nm Sandy Bridge/Westmere machine. The second column shows the number of L1 + L2 TLB misses per 1K instructions. More interestingly, when we run the same experiment on non-server workloads, the results are very different.
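
The two metrics described in the notes above (page-walk cycles as a share of execution time, and TLB misses per 1K instructions) are simple ratios of hardware counter values. The following is a minimal sketch; the function name `tlb_overhead` and the counter values are made up for illustration and are not measurements from the talk.

```python
def tlb_overhead(walk_cycles, total_cycles, tlb_misses, instructions):
    """Return (page-walk cycles as % of execution time, TLB misses per 1K instructions)."""
    walk_pct = 100.0 * walk_cycles / total_cycles
    mpki = 1000.0 * tlb_misses / instructions
    return walk_pct, mpki


# Hypothetical counter readings, chosen only to make the arithmetic easy to follow.
walk_pct, mpki = tlb_overhead(
    walk_cycles=2_000_000,
    total_cycles=10_000_000,
    tlb_misses=50_000,
    instructions=5_000_000,
)
print(f"page-walk share of execution time: {walk_pct:.1f}%")
print(f"TLB misses per 1K instructions:    {mpki:.1f}")
```

On a real machine the inputs would come from performance counters; which counters to read depends on the processor and is left out here.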

14 Hardware: Direct Segment
1 Conventional Paging
2 Direct Segment
[Figure labels: BASE, LIMIT, VA, OFFSET, PA]
Why Direct Segment?
  Matches Big Memory Workload needs
  NO Paging => NO TLB Miss
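
The BASE, LIMIT, and OFFSET labels on this slide suggest a translation rule along the following lines. This is a simplified sketch rather than the hardware design from the cited work: it assumes an address inside [BASE, LIMIT) is translated by adding OFFSET, that everything else falls back to a page table, and it models no TLB at all. The names `translate`, `PageFault`, and `PAGE_SIZE`, and the example register values, are hypothetical.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages for the conventional path


class PageFault(Exception):
    """Raised when the conventional path has no mapping for an address."""


def translate(va, base, limit, offset, page_table):
    """Translate a virtual address to a physical address.

    Addresses inside [base, limit) use the direct segment: one range check
    and one add, with no page-table lookup. All other addresses use a
    dict-based page table (virtual page number -> physical frame number).
    """
    if base <= va < limit:
        return va + offset
    vpn, page_offset = divmod(va, PAGE_SIZE)
    if vpn not in page_table:
        raise PageFault(hex(va))
    return page_table[vpn] * PAGE_SIZE + page_offset


# Hypothetical register values and a single conventional page mapping.
BASE, LIMIT, OFFSET = 0x1000_0000, 0x2000_0000, 0x3000_0000
page_table = {0x1: 0x7}

in_segment = translate(0x1000_0ABC, BASE, LIMIT, OFFSET, page_table)
paged = translate(0x1234, BASE, LIMIT, OFFSET, page_table)
print(hex(in_segment), hex(paged))
```

The design point the slide makes is visible in the first branch: an address in the segment never touches the page table, so it cannot incur a page walk.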

15 Execution Time Overhead: TLB Misses
92-100% TLB “misses” to direct segment
Requires: Both small SW + small HW changes

16 Technion, Haifa Israel, June 2013
21st Century Computer Architecture
A community white paper
Technion, Haifa Israel, June 2013
  Information & Commun. Tech’s Impact
  Semiconductor Technology’s Challenges
  Computer Architecture’s Future
  Example: Bypassing Paged Virtual Memory

17 Pre-Competitive Research Justified
Retain (cost-)performance enabler to ICT revolution
Successful companies cannot do this by themselves
  Lack needed long-term focus
  Don’t want to pay for what benefits all
  Resist transcending interfaces that define their products

18 White Paper Process
Late March 2012
  CCC contacts coordinator & forms group
April 2012
  Brainstorm (meetings/online doc)
  Read related docs (PCAST, NRC Game Over, ACAR1/2, …)
  Use online doc for intro & outline then parallel sections
  Rotated authors to revise sections
May 2012
  Brainstorm list of researchers in/out of comp. architecture
  Solicit researcher feedback/endorsement
  Do distributed revision & redo of intro
  Release May 25 to CCC & via
Kudos to participants on executing on a tight timetable

19 Back Up Slides
Detailed research areas in white paper
  Architecture as Infrastructure: Spanning Sensors to Clouds
  Energy First
  Technology Impacts on Architecture
  Cross-Cutting Issues & Interfaces
Findings on National Academy “Game Over” Study
Glimpse at DARPA/ISAT Workshop “Advancing Computer Systems without Technology Progress”

20 1. Architecture as Infrastructure: Spanning Sensors to Clouds
Beyond a chip in a generic computer
To pillar of 21st century societal infrastructure.
Computation in context (sensor, mobile, …, data center)
  Systems often large & distributed
  Communication issues can dominate computation
  Goals beyond performance (battery life, form factor)
Opportunities (not exhaustive)
  Reliable sensors harvesting (intermittent) energy
  Smart phones to Star Trek’s medical “tricorder”
  Cloud infrastructure suitable for both “Big Data” streams & low-latency quality-of-service with stragglers
  Analysis & design tools that scale

21 2. Energy First
Beyond single-core performance computer
To (cost-)performance per watt/joule
Energy across the layers
  Circuit/technology (near-threshold CMOS, 3D stacking)
  Architecture (reducing unnecessary data movement)
  Software (communication-reducing algorithms)
Parallelism to save energy
  Vast (fine-grained) homogeneous & heterogeneous
  Improved SW stack
  Applications focus (beyond graphic processing units)
Specialization for performance & energy efficiency
  Abstractions for specialization (reducing 1-time cost)
  Energy-efficient memory hierarchies
  Reconfigurable logic structures

22 3. Technology Impacts on Architecture
Beyond CMOS, DRAM, & Disks of last 3+ decades to
Using replacement circuit technologies
  Sub/near-threshold CMOS, QWFETs, TFETs, and QCAs
Non-volatile storage
  Beyond flash memory to STT-RAM, PCRAM, & memristor
3D die stacking & interposers
  logic, cache, small main memory
Photonic interconnects
  Inter- & even intra-chip
Design automation
  from circuit-design w/ new technologies to pre-RTL functional, performance, power, area modeling of heterogeneous chips & systems

23 4. Cross-Cutting Issues & Interfaces
Beyond performance w/ stable interfaces to
New design goals (for pillar of societal infrastructure)
  Verifiability (bugs kill)
  Reliability (“dependability” computing base?)
  Security/Privacy (w/ non-volatile memory?)
  Programmability (time to correct-performant solution)
Better Interfaces
  High-level information (quality of service, provenance)
  Parallelism ((in)dependence, (lack of) side-effects)
  Orchestrating communication ((recursive) locality)
  Security/Reliability (fine-grain protection)

24 Executive summary (Added to National Academy Slides)
Highlights of National Academy Findings
  (F1) Computer hardware has transitioned to multicore
  (F2) Dennard scaling of CMOS has broken down
  (F3) Parallelism and locality must be exploited by software
  (F4) Chip power will soon limit multicore scaling
Eight recommendations from algorithms to education
We know all of this at some level, BUT:
  Are we all acting on this knowledge or hoping for business as usual?
  Thinking beyond next paper to where future value will be created?
Questions Asked but Not Answered Embedded in NA Talk
Briefly Close with Kübler-Ross Stages of Grief: Denial … Acceptance
Source: Future of Computing Performance: Game Over or Next Level?, National Academy Press, 2011
Mark Hill talk (http://www.cs.wisc.edu/~markhill/NRCgameover_wisconsin_2011_05.pptx)

25 The Graph
[Figure: System Capability (log) by decade, 80s through 50s. Labels: CMOS, Fallow Period, New Technology, Our Focus]
Source: Advancing Computer Systems without Technology Progress, ISAT Outbrief (http://www.cs.wisc.edu/~markhill/papers/isat2012_ACSWTP.pdf), Mark D. Hill and Christos Kozyrakis, DARPA/ISAT Workshop, March 26-27, 2012.
Approved for Public Release, Distribution Unlimited. The views expressed are those of the author and do not reflect the official policy or position of the Department of Defense or the U.S. Government.

26 Surprise 1 of 2
Can Harvest in the “Fallow” Period!
2 decades of Moore’s Law-like perf./energy gains
Wring out inefficiencies used to harvest Moore’s Law
  HW/SW Specialization/Co-design (3-100x)
  Reduce SW Bloat (2-1000x)
  Approximate Computing (2-500x)
~1000x = 2 decades of Moore’s Law!
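
The "~1000x = 2 decades" equivalence is a short piece of arithmetic. The sketch below assumes the common reading of Moore's Law as a doubling every two years; the slide does not state the doubling period, so that parameter and the function name `moores_law_gain` are assumptions for illustration.

```python
def moores_law_gain(years, doubling_period_years=2.0):
    """Cumulative gain after `years`, assuming one doubling per period."""
    return 2 ** (years / doubling_period_years)


# Two decades at one doubling every two years is ten doublings.
gain = moores_law_gain(20)
print(f"gain over 20 years: {gain:.0f}x")
```

Ten doublings give 2^10 = 1024, which is where the rounded ~1000x figure comes from under this assumption.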

27 “Surprise” 2 of 2
Systems must exploit LOCALITY-AWARE parallelism
  Parallelism Necessary, but not Sufficient
  As communication’s energy costs dominate
Shouldn’t be a surprise, but many are in denial
Both surprises hard, requiring “vertical cut” thru SW/HW

