Programming-Language Approaches to Improving Shared-Memory Multithreading: Work-In-Progress Dan Grossman University of Washington Microsoft Research, RiSE.


Programming-Language Approaches to Improving Shared-Memory Multithreading: Work-In-Progress
Dan Grossman, University of Washington
Microsoft Research, RiSE
July 28, 2009

Today
A little history / organization: how I got here
Informal, broad-not-deep overview of 4 ongoing projects
– Better semantics / languages for transactional memory x 2
    Dynamic separation for Haskell
    Semantics / abstraction for “escape actions”
– Deterministic Multiprocessing
– Code-centric communication graphs
Hopefully time for discussion

Biography / group names
Me:
PLDI, ICFP, POPL “feel like home”
PhD for Cyclone → UW faculty
– Type system, compiler for memory-safe C dialect
30% → 80% focus on multithreading
Co-advising 3-4 students with computer architect Luis Ceze
Two groups for “marketing purposes”:
– WASP, wasp.cs.washington.edu
– SAMPA, sampa.cs.washington.edu

People / other projects
Ask me later about:
– Progress estimation for PigLatin Hadoop queries [Kristi]
– Composable browser extensions [Ben L.]


Atomic blocks
An easier-to-use and harder-to-implement synchronization primitive

    void transferFrom(int amt, Acct other){
      atomic{
        other.withdraw(amt);
        this.deposit(amt);
      }
    }

“Transactions are to shared-memory concurrency as garbage collection is to memory management” [OOPSLA 07]
GC also has key semantic questions most programmers can ignore
– Resurrection, serialization, dead assignments, etc.
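A rough way to see what the atomic block promises is to emulate it with one process-wide lock. The sketch below does that in Python; it is not how a transactional-memory system is implemented, and the names `_atomic`, `Acct.transfer_from`, and `run_transfers` are assumptions made for illustration only.

```python
import threading

# Assumption: a single re-entrant lock stands in for the `atomic` keyword.
_atomic = threading.RLock()


class Acct:
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amt):
        with _atomic:
            self.balance -= amt

    def deposit(self, amt):
        with _atomic:
            self.balance += amt

    def transfer_from(self, amt, other):
        # Both updates run under the same lock, so no other thread that
        # also uses _atomic can observe the money "in flight".
        with _atomic:
            other.withdraw(amt)
            self.deposit(amt)


def run_transfers(n_threads=4, n_iters=1000):
    """Hypothetical driver: half the threads move money each way."""
    a, b = Acct(1000), Acct(1000)

    def worker(i):
        for _ in range(n_iters):
            if i % 2 == 0:
                a.transfer_from(1, b)
            else:
                b.transfer_from(1, a)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.balance, b.balance
```

A single global lock gives the easy-to-use semantics but none of the parallelism, which is one reason the primitive is described as harder to implement.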

“Weak” isolation
initially y==0
Thread 1: atomic { y = 1; x = 3; y = x; }
Thread 2: x = 2; print(y); //1? 2? 666?
Widespread misconception: “Weak” isolation violates the “all-at-once” property only if corresponding lock code has a race
(May still be a bad thing, but smart people disagree.)

It’s worse
Privatization: One of several examples where lock code works and weak-isolation transactions do not
(Example adapted from [Rajwar/Larus] and [Hudson et al])
[Diagram: ptr points to an object with fields f and g]
initially ptr.f == ptr.g
Thread 1: atomic { r = ptr; ptr = new C(); } assert(r.f==r.g);
Thread 2: atomic { ++ptr.f; ++ptr.g; }

It’s worse
[Diagram: ptr points to an object with fields f and g]
initially ptr.f == ptr.g
Thread 1: atomic { r = ptr; ptr = new C(); } assert(r.f==r.g);
Thread 2: atomic { ++ptr.f; ++ptr.g; }
Most weak-isolation systems let the assertion fail!
– Eager-update or lazy-update
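For contrast, here is a minimal sketch of the lock-based version of the privatization idiom, the variant that the slides say works. It is written in Python with an ordinary lock in place of each atomic block; `Shared`, `privatize_and_check`, `bump`, and `run_trial` are hypothetical names, and the sketch does not model an STM or reproduce the weak-isolation failure.

```python
import threading


class C:
    def __init__(self):
        self.f = 0
        self.g = 0


class Shared:
    def __init__(self):
        self.ptr = C()


def privatize_and_check(shared, lock):
    with lock:
        r = shared.ptr        # take the current object...
        shared.ptr = C()      # ...and unlink it from shared state
    # r is now reachable only from this thread, so no lock is needed here.
    return r.f == r.g


def bump(shared, lock):
    with lock:
        # ptr is re-read inside the critical section, so an already
        # privatized object is never touched.
        shared.ptr.f += 1
        shared.ptr.g += 1


def run_trial(n_steps=200):
    shared, lock = Shared(), threading.Lock()
    results = []

    def bumper():
        for _ in range(n_steps):
            bump(shared, lock)

    def privatizer():
        for _ in range(n_steps):
            results.append(privatize_and_check(shared, lock))

    threads = [threading.Thread(target=bumper), threading.Thread(target=privatizer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return all(results)
```

With locks, the unlocked check is safe because the incrementing thread can only reach objects through `ptr` while holding the lock. A weak-isolation STM may break exactly this reasoning, for example by letting a doomed or not-yet-written-back transaction touch the object after it was privatized.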

The need for semantics
Which is wrong: the privatization code or the language implementation?
What other “gotchas” exist?
Can programmers correctly use transactions without understanding their implementation?
Only rigorous programming-language semantics can answer

Separation
Static separation: Each thread-shared, mutable object is accessed-inside-transactions xor accessed-outside-transactions throughout its lifetime
– Natural in STM Haskell (but not other settings)
– Proved sound for eager update [POPL 08 x 2]
Dynamic separation: Each thread-shared, mutable object has dynamic metastate explicitly set by programmers to determine “side of the partition”
– Designed, proven, implemented by Abadi et al for Bartok

Example redux
[Diagram: ptr points to an object with fields f and g]
initially ptr.f == ptr.g
Thread 1: atomic { r = ptr; ptr = new C(); } unprotect(r); assert(r.f==r.g);
Thread 2: atomic { ++ptr.f; ++ptr.g; }

Laura’s work
Design, semantics, implementation, and benchmarks for dynamic separation in Haskell
Primary contributions:
1. Regions: Change protection state of entire data structures in O(1) time
– Cool idioms/benchmarks where this gives 2-6x speedup
2. Lazy-update implementation
– Allows protection-state changes from within transactions
3. Interface allowing composable libraries that can be used inside or outside transactions, without breaking Haskell’s types
4. Formal semantics in the style of STM Haskell


Escape actions

    atomic {
      s1;
      escape { s2; } // perhaps in a callee
      s3;
    }

Escape actions:
– Do not count for memory conflicts
– Are not undone if transaction aborts
– Possible “strange results” if race with transactional accesses
Essentially an unchecked back-door
Note: Open nesting is just escape { atomic { s } }
– So escaping is the essential primitive

Canonical example
If escape actions are hidden behind strong abstractions, we can improve parallelism without affecting program behavior
– Clients cannot observe the escaping
Unique-id generation

    type id;
    id new_id();
    bool compare_ids(id, id);

Transactions generating ids need not conflict with each other
If transaction aborts, no need to undo the id-generation
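The unique-id interface can be illustrated with a toy model in Python. This is only a sketch: `IdGen`, `Abort`, and `run_transaction` are hypothetical names, the "transaction" is a plain retry loop rather than a real STM, and ids are bare integers that clients are assumed to treat as opaque (only `compare_ids` is allowed on them).

```python
import itertools
import threading


class IdGen:
    """Toy unique-id generator whose counter acts like an escape action."""

    def __init__(self):
        self._counter = itertools.count()
        self._lock = threading.Lock()

    def new_id(self):
        # The counter bump is never rolled back, even if the caller's
        # transaction later aborts.
        with self._lock:
            return next(self._counter)

    @staticmethod
    def compare_ids(a, b):
        return a == b


class Abort(Exception):
    """Raised by a transaction body to simulate a conflict."""


def run_transaction(body, max_attempts=10):
    # Stand-in for an STM retry loop: re-run the body until it commits.
    for attempt in range(max_attempts):
        try:
            return body(attempt)
        except Abort:
            continue
    raise RuntimeError("too many aborts")


def demo():
    gen = IdGen()
    before = gen.new_id()

    def body(attempt):
        x = gen.new_id()
        if attempt == 0:
            raise Abort()   # the id taken on this attempt is simply skipped
        return x

    inside = run_transaction(body)
    after = gen.new_id()
    return gen, before, inside, after
```

The id consumed by the aborted attempt is wasted rather than reused. Because clients can only test ids for equality, they cannot tell the difference, which is the sense in which the escaping is unobservable.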

Matt’s work
Formal semantics for escape actions
Use to prove the unique-id example is correct
– Two implementations, one using escape
– Show no client affected by choice of implementation
– Fundamentally similar to proving ADTs actually work
Gotcha: The theorem is false if a client abuses other escape actions to “leak ids”:

    atomic {
      id x = new_id();
      if(compare_ids(x,glbl)) …
      escape { glbl=x; }
      …
    }

– Discovered by attempting the proof!


Deterministic C
Take arbitrary C + POSIX Threads and make behavior dependent only on inputs (not nondeterministic scheduling)
– Helps testing, debugging, reproducibility, replication
It’s easy!
– Run one thread at a time with deterministic context-switch
    Example: run for N instructions or until blocking
It’s hard!
– Need to recover scalability with reasonable overhead
    Amdahl’s Law is one tough cookie!
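The "easy" half of this slide, running one thread at a time with a deterministic context switch, can be sketched with Python generators standing in for threads. Everything here is an illustrative assumption: each `yield` plays the role of one instruction, `quantum` plays the role of N, and `run_deterministic`, `make_worker`, and `demo` are hypothetical names. It says nothing about how the actual C system works.

```python
def run_deterministic(thread_bodies, quantum):
    """Run generator-based 'threads' one at a time, round-robin.

    Each thread runs for at most `quantum` steps before the next one is
    scheduled, so the interleaving depends only on the inputs.
    """
    runnable = [body() for body in thread_bodies]
    while runnable:
        still_running = []
        for t in runnable:
            finished = False
            for _ in range(quantum):
                try:
                    next(t)          # execute one "instruction"
                except StopIteration:
                    finished = True
                    break
            if not finished:
                still_running.append(t)
        runnable = still_running


def make_worker(name, steps, log):
    def body():
        for i in range(steps):
            log.append((name, i))    # access to shared state
            yield                    # end of one "instruction"
    return body


def demo(quantum):
    log = []
    run_deterministic([make_worker("A", 3, log), make_worker("B", 3, log)], quantum)
    return log
```

Repeated calls to `demo` with the same quantum produce the same log, while a different quantum produces a different but equally repeatable interleaving. The scheme is fully serial, which is exactly the scalability problem the "hard" half of the slide points at.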

How to do it
It’s a long and interesting compiler, run-time, and correctness story
– Invite Luis over for an hour
Key techniques:
– Dynamic ownership of memory (run in parallel while threads access what they own)
– Buffering (publish buffers deterministically while not violating language’s memory-consistency model)
– No promise which deterministic execution programmer will get (tiny change to source code can affect behavior)
Performance:
– Depends on application
– Buffering has better scalability but worse per-thread overhead, so hybrid approaches are sometimes needed


Code-centric
In a shared-memory C/C#/Java program, any heap access might be inter-thread communication
– But very few actually are
Most prior work to detect/exploit this sparseness is data-centric
– What objects are thread-local?
– What locks protect what memory?
Answers can find bugs, optimize programs, define code metrics, etc.
We provide a complementary code-centric view…

Graph
Nodes: Code units (e.g., functions)
Directed edges:
– Source did a write in thread T1
– Target read that write in thread T2
– T1 != T2
Current tool:
– Automatically build graph of a (slower) dynamic execution
– Manual easy clean-up by programmer
– Rely heavily on state-of-the-art dynamic instrumentation (PIN) and graph visualization (Prefuse)
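The edge rule above can be sketched as a small recorder that is fed read and write events. This is a hedged Python illustration, not the PIN-based tool: `CommGraph`, `on_write`, `on_read`, and the hand-written trace in `toy_trace` are all assumptions, and a real tool would obtain these events from dynamic instrumentation.

```python
class CommGraph:
    """Builds a code-centric communication graph from memory events."""

    def __init__(self):
        self.last_writer = {}   # address -> (function, thread) of last write
        self.edges = set()      # (writer function, reader function)

    def on_write(self, addr, func, thread):
        self.last_writer[addr] = (func, thread)

    def on_read(self, addr, func, thread):
        writer = self.last_writer.get(addr)
        # Add an edge only when the value read was written by another thread.
        if writer is not None and writer[1] != thread:
            self.edges.add((writer[0], func))


def toy_trace():
    """Hypothetical event trace for a producer/consumer program."""
    g = CommGraph()
    g.on_write("t.f", "producer", thread=1)
    g.on_write("q.slot0", "enqueue", thread=1)
    g.on_read("q.slot0", "dequeue", thread=2)
    g.on_read("t.f", "consumer", thread=2)
    g.on_read("t.f", "producer", thread=1)   # same thread: no edge
    return g.edges
```

Keying `last_writer` by address keeps the bookkeeping data-centric internally while the reported graph stays code-centric: only function names appear in the edges.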

A toy example

    queue q; // global, mutable
    void enqueue(T* obj) { … }
    T* dequeue() { … }
    void consumer(){ … T* t = dequeue(); … }
    void producer(){ … T* t = …; t->f=…; enqueue(t); … }

Program: multiple threads call producer and consumer
[Graph: nodes enqueue, dequeue, producer, consumer]
Tool supports “conceptual inlining” to allow multiple abstraction levels

Not just for toys
Small and large applications
– Example: MySQL (940KLOC); graph clean-up by one grad student in < 1 day without prior source-code knowledge
I truly believe:
– Great “first day of internship” tool
    Interactive graph essential and not our contribution
– Useful way to measure multithreaded behavior
    Example: Graphs are very sparse, thankfully
      MySQL: >11,000 functions, 423 nodes, 802 edges
    Example: Graph diff across runs with same input measures the nondeterminism of the program
But this is hard-to-evaluate tool work. Your thoughts?
Future work: Specification of graphs checked during execution
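One plausible reading of "graph diff across runs" is a comparison of the edge sets produced by two executions on the same input. The Python sketch below takes that reading; `graph_diff` and `nondeterminism_score` are hypothetical helpers, and the score (the fraction of edges not shared by both runs) is an assumed metric rather than one stated in the slides.

```python
def graph_diff(edges_a, edges_b):
    """Return the edges that appear in only one of two runs."""
    a, b = set(edges_a), set(edges_b)
    return {"only_in_first": a - b, "only_in_second": b - a}


def nondeterminism_score(edges_a, edges_b):
    """Assumed metric: share of all observed edges not common to both runs.

    0.0 means the two runs communicated through identical code pairs;
    1.0 means they share no edges at all.
    """
    a, b = set(edges_a), set(edges_b)
    union = a | b
    if not union:
        return 0.0
    return len(a ^ b) / len(union)


# Example with two made-up runs of the same program on the same input.
run1 = {("enqueue", "dequeue"), ("producer", "consumer")}
run2 = {("enqueue", "dequeue"), ("producer", "logger")}
diff = graph_diff(run1, run2)
score = nondeterminism_score(run1, run2)
```

An empty diff does not prove the program is deterministic. It only shows that these two runs communicated through the same pairs of code units.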

Summary
– Better semantics / languages for transactional memory x 2
    Dynamic separation for Haskell
    Semantics / abstraction for “escape actions”
– Deterministic Multiprocessing
– Code-centric communication graphs
Very little published yet, but all Real Soon Now
Microsoft has been essential
– Transactions (Harris, Abadi, Peyton Jones, many more)
– Funding (Scalable Multicore RFP, New Faculty Fellows)
Hopefully opportunities to collaborate
– Particularly on the (unproven) SE applications of this work