Rethinking Hardware and Software for Disciplined Parallelism Sarita V. Adve University of Illinois

Presentation transcript:

Rethinking Hardware and Software for Disciplined Parallelism Sarita V. Adve University of Illinois

Sequential CS 101 Java

Parallel CS 101 Java Threads

Parallel CS 101 Java Threads Data races

Parallel CS 101 Java Threads Data races Non-determinism

Parallel CS 101
Java Threads, Data races, Non-determinism, Memory Model
General-purpose parallel models are complex, abandon decades of sequential programming advances
– Safety, modularity, composability, maintainability, …
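
To make the "Java Threads, Data races, Non-determinism" progression concrete, here is a minimal sketch of a data race in plain Java. It is not taken from the talk: the class name `CounterRaceDemo`, the helper `runAll`, and the thread and iteration counts are all assumptions chosen for illustration. The racy version increments a plain field from several threads, so updates can be lost and the total can differ between runs. The race-free version uses `AtomicInteger`, so every increment is counted.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterRaceDemo {
    // Plain shared field: concurrent increments on it form a data race.
    static int racyCount;

    // Starts `threads` threads, each running `body` `perThread` times, then waits for all of them.
    static void runAll(int threads, int perThread, Runnable body) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    body.run();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
    }

    // Racy version: racyCount++ is a read, an add, and a write with no synchronization.
    static int racyTotal(int threads, int perThread) {
        racyCount = 0;
        runAll(threads, perThread, () -> racyCount++);
        return racyCount;
    }

    // Race-free version: each atomic increment is a synchronization operation.
    static int safeTotal(int threads, int perThread) {
        AtomicInteger count = new AtomicInteger();
        runAll(threads, perThread, count::incrementAndGet);
        return count.get();
    }

    public static void main(String[] args) {
        // The racy total may vary from run to run; the safe total is always threads * perThread.
        System.out.println("racy total: " + racyTotal(4, 100_000));
        System.out.println("safe total: " + safeTotal(4, 100_000));
    }
}
```

Nothing in the language stops the racy version from compiling or running, which is the point of the slide: the programmer gets non-determinism without asking for it.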

The Problem Popular parallel languages are fundamentally broken

The Problem Theorem: Popular parallel languages are fundamentally broken Proof: See the Java Memory Model (+ unresolved bug)

The Problem
Theorem: Popular parallel languages are fundamentally broken
Proof: See the Java Memory Model (+ unresolved bug)
Memory consistency model = what values can a read return?
– 20+ years of research finally led to convergence
– But extremely complex
  * Dealing with data races is very hard
  * Mismatch between hardware and software evolution
We are building on a foundation where even legal values for reads are complex to specify

The Problem
Theorem: Current parallel languages are fundamentally broken
Proof: See the Java Memory Model (+ unresolved bug)
Memory model = what values can a read return?
20+ years of research finally led to convergence
– Sequential consistency for data-race-free programs is minimal
– Java added MUCH complexity for safety/security
  * Minimal (complex) semantics for data races, but unresolved bug
– C++, C added complexity for experts due to h/w-s/w mismatch
  * Independent h/w-s/w evolution resulted in painful consequences
Should we continue building on a foundation that can’t even specify legal values for reads?
Banish shared-memory?

The Problem
Theorem: Current parallel languages are fundamentally broken
Proof: See the Java Memory Model (+ unresolved bug)
Memory model = what values can a read return?
20+ years of research finally led to convergence
– Sequential consistency for data-race-free programs is minimal
– Java added MUCH complexity for safety/security
  * Minimal (complex) semantics for data races, but unresolved bug
– C++, C added complexity for experts due to h/w-s/w mismatch
  * Independent h/w-s/w evolution resulted in painful consequences
Should we continue building on a foundation that can’t even specify legal values for reads?
Banish wild shared-memory!
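
The question "what values can a read return?" can be illustrated with a small message-passing sketch in Java. This example is an assumption added for illustration and does not come from the slides; the names `MessagePassingDemo`, `data`, `ready`, and `observe` are hypothetical. Because `ready` is declared `volatile`, the program is data-race-free, so it behaves as if sequentially consistent: once the reader sees `ready == true`, its read of `data` must return 42. If `volatile` were removed, the accesses would race, the reader could legally see a stale `data` or spin forever, and specifying exactly which values are allowed is where the complexity described above comes in.

```java
class MessagePassingDemo {
    // Ordinary field, written by the writer before it sets the flag.
    static int data;
    // Synchronization variable. Removing `volatile` would turn this into a data race.
    static volatile boolean ready;

    // Returns the value of `data` that the reader observes after it sees ready == true.
    static int observe() {
        data = 0;
        ready = false;
        int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            while (!ready) {
                Thread.yield(); // spin until the writer publishes
            }
            seen[0] = data; // ordered after the writer's data = 42 by the volatile write/read pair
        });
        Thread writer = new Thread(() -> {
            data = 42;
            ready = true; // volatile write publishes the earlier write to data
        });

        reader.start();
        writer.start();
        try {
            reader.join();
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("reader observed data = " + observe());
    }
}
```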

The Opportunity
Need disciplined shared-memory parallel languages
– Banish data races by design
– Provide determinism by default
– Support only explicit and controlled non-determinism
– Explicit side effects (sharing behavior, granularity, …)
– ???
Discipline is enforced
Much momentum from software community
What does this have to do with hardware?
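
As a rough sketch of what "determinism by default" with explicit side effects can look like, the following uses the standard Java fork-join framework with tasks that write only to disjoint index ranges of a shared array. This is an illustration under stated assumptions, not the language design the talk proposes: the names `DisjointRegionsDemo`, `SquareTask`, and `THRESHOLD` are hypothetical, and plain Java does not enforce the discipline. Here the disjointness of the ranges is maintained only by convention, whereas the slide calls for a language in which it is checked.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class DisjointRegionsDemo {
    // Ranges at or below this size are processed sequentially (arbitrary illustrative value).
    static final int THRESHOLD = 1_000;

    // Squares every element of a[lo, hi) in place. Each task writes only inside its own range.
    static final class SquareTask extends RecursiveAction {
        private final long[] a;
        private final int lo;
        private final int hi;

        SquareTask(long[] a, int lo, int hi) {
            this.a = a;
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected void compute() {
            if (hi - lo <= THRESHOLD) {
                for (int i = lo; i < hi; i++) {
                    a[i] = a[i] * a[i];
                }
                return;
            }
            int mid = lo + (hi - lo) / 2;
            // The two subtasks touch disjoint halves, so they cannot race with each other.
            invokeAll(new SquareTask(a, lo, mid), new SquareTask(a, mid, hi));
        }
    }

    // Fills an array with 0..n-1 and squares it in parallel; the result does not depend on scheduling.
    static long[] squaresInParallel(int n) {
        long[] a = new long[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
        }
        ForkJoinPool.commonPool().invoke(new SquareTask(a, 0, n));
        return a;
    }

    public static void main(String[] args) {
        long[] result = squaresInParallel(10_000);
        System.out.println("last element: " + result[result.length - 1]);
    }
}
```

Because no two tasks ever write the same element, every schedule produces the same array, which is the property a disciplined language would guarantee by construction.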

The Opportunity
Memory model = core of parallel hardware/software interface
Today’s hardware designed for wild shared memory
– Cache coherence, communication architecture, scheduling, …
– Inefficient in performance, power, resilience, complexity, …
Claim: Disciplined interface → h/w simplicity + efficiency
E.g., race-free s/w → race-free (MUCH SIMPLER) coherence protocols
E.g., explicit sharing behavior and granularity → efficient communication, data layout, cache design, …

The Approach
Software enforces disciplined behavior → Software: safe, modular, composable, maintainable, …
Hardware designed for disciplined software → Hardware: simple, scalable, power-efficient, …
Broad hardware/software research agenda
– Interface: semantics, mechanisms at all levels, ISA, …
– Rethink hardware: coherence, communication, layout, caches, …
– Help software to abide by interface
Fundamental shift in software, hardware
– But can be done incrementally
– Memory model convergence came from a similar process
But this time let’s co-evolve h/w, s/w