Region-Based Dynamic Separation in STM Haskell (And Related Perspective) Dan Grossman University of Washington Transactional Memory Workshop April 30,

Slides:

Advertisements

Similar presentations

Design and Implementation Issues for Atomicity Dan Grossman University of Washington Workshop on Declarative Programming Languages for Multicore Architectures.

Advertisements

Pay-to-use strong atomicity on conventional hardware Martín Abadi, Tim Harris, Mojtaba Mehrara Microsoft Research.

Identity and Equality Based on material by Michael Ernst, University of Washington.

U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Rich Transactions on Reasonable Hardware J. Eliot B. Moss Univ. of Massachusetts,

Programming-Language Approaches to Improving Shared-Memory Multithreading: Work-In-Progress Dan Grossman University of Washington Microsoft Research, RiSE.

Making sense of transactional memory Tim Harris (MSR Cambridge) Based on joint work with colleagues at MSR Cambridge, MSR Mountain View, MSR Redmond, the.

CSE332: Data Abstractions Lecture 23: Programming with Locks and Critical Sections Dan Grossman Spring 2010.

We should define semantics for languages, not for TM Tim Harris (MSR Cambridge)

PARALLEL PROGRAMMING with TRANSACTIONAL MEMORY Pratibha Kona.

DMITRI PERELMAN IDIT KEIDAR TRANSACT 2010 SMV: Selective Multi-Versioning STM 1.

Parallelism and Concurrency Koen Lindström Claessen Chalmers University Gothenburg, Sweden Ulf Norell.

“THREADS CANNOT BE IMPLEMENTED AS A LIBRARY” HANS-J. BOEHM, HP LABS Presented by Seema Saijpaul CS-510.

Assignment – no class Wednesday All: watch the Google Techtalk “Getting C++ Threads Right” by Hans Boehm at the following link in place of Wednesday’s.

Formalisms and Verification for Transactional Memories Vasu Singh EPFL Switzerland.

1 Lecture 7: Transactional Memory Intro Topics: introduction to transactional memory, “lazy” implementation.

1 New Architectures Need New Languages A triumph of optimism over experience! Ian Watson 3 rd July 2009.

The Cost of Privatization Hagit Attiya Eshcar Hillel Technion & EPFLTechnion.

1 Sharing Objects – Ch. 3 Visibility What is the source of the issue? Volatile Dekker’s algorithm Publication and Escape Thread Confinement Immutability.

STM in Managed Runtimes: High-Level Language Semantics (MICRO 07 Tutorial) Dan Grossman University of Washington 2 December 2007.

Department of Computer Science Presenters Dennis Gove Matthew Marzilli The ATOMO ∑ Transactional Programming Language.

CS510 Concurrent Systems Class 5 Threads Cannot Be Implemented As a Library.

Tim Sheard Oregon Graduate Institute Lecture 11: A Reduction Semantics for MetaML CS510 Section FSC Winter 2005 Winter 2005.

CSE341: Programming Languages Lecture 11 Type Inference Dan Grossman Winter 2013.

Programming Languages and Paradigms Object-Oriented Programming.

OOPs Object oriented programming. Based on ADT principles  Representation of type and operations in a single unit  Available for other units to create.

Accelerating Precise Race Detection Using Commercially-Available Hardware Transactional Memory Support Serdar Tasiran Koc University, Istanbul, Turkey.

High-Level Design With Sequence Diagrams COMP314 (based on original slides by Mark Hall)

Threading and Concurrency Issues ● Creating Threads ● In Java ● Subclassing Thread ● Implementing Runnable ● Synchronization ● Immutable ● Synchronized.

A Qualitative Survey of Modern Software Transactional Memory Systems Virendra J. Marathe Michael L. Scott.

Retro: Modular and efficient retrospection in a database Ross Shaull Liuba Shrira Brandeis University.

1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.

Optimistic Design 1. Guarded Methods Do something based on the fact that one or more objects have particular states  Make a set of purchases assuming.

DoubleChecker: Efficient Sound and Precise Atomicity Checking Swarnendu Biswas, Jipeng Huang, Aritra Sengupta, and Michael D. Bond The Ohio State University.

Lowering the Overhead of Software Transactional Memory Virendra J. Marathe, Michael F. Spear, Christopher Heriot, Athul Acharya, David Eisenstat, William.

Objects & Dynamic Dispatch CSE 413 Autumn Plan We’ve learned a great deal about functional and object-oriented programming Now,  Look at semantics.

Transactional Memory Lecturer: Danny Hendler. 2 2 From the New York Times…

Shared Memory Consistency Models. SMP systems support shared memory abstraction: all processors see the whole memory and can perform memory operations.

Data races, informally [More formal definition to follow] “race condition” means two different things Data race: Two threads read/write, write/read, or.

Java Thread and Memory Model

Object-Oriented Programming Chapter Chapter

Design Patterns Software Engineering CS 561. Last Time Introduced design patterns Abstraction-Occurrence General Hierarchy Player-Role.

 In the java programming language, a keyword is one of 50 reserved words which have a predefined meaning in the language; because of this,

How to execute Program structure Variables name, keywords, binding, scope, lifetime Data types – type system – primitives, strings, arrays, hashes – pointers/references.

AtomCaml: First-class Atomicity via Rollback Michael F. Ringenburg and Dan Grossman University of Washington International Conference on Functional Programming.

Week 9, Class 3: Java’s Happens-Before Memory Model (Slides used and skipped in class) SE-2811 Slide design: Dr. Mark L. Hornick Content: Dr. Hornick Errors:

CSCE 240 – Intro to Software Engineering Lecture 3.

FastTrack: Efficient and Precise Dynamic Race Detection [FlFr09] Cormac Flanagan and Stephen N. Freund GNU OS Lab. 23-Jun-16 Ok-kyoon Ha.

Parallelism and Concurrency Koen Lindström Claessen Chalmers University Gothenburg, Sweden Patrik Jansson.

Java Thread Programming

CSE341: Programming Languages Lecture 21 Dynamic Dispatch Precisely, and Manually in Racket Dan Grossman Spring 2016.

CSE341: Programming Languages Lecture 11 Type Inference

Parallelism and Concurrency

CSE341: Programming Languages Lecture 21 Dynamic Dispatch Precisely, and Manually in Racket Dan Grossman Winter 2013.

(Nested) Open Memory Transactions in Haskell

CSE341: Programming Languages Lecture 21 Dynamic Dispatch Precisely, and Manually in Racket Dan Grossman Spring 2017.

CSE341: Programming Languages Lecture 21 Dynamic Dispatch Precisely, and Manually in Racket Dan Grossman Autumn 2017.

Stateful Manifest Contracts

Threads and Memory Models Hal Perkins Autumn 2011

Enforcing Isolation and Ordering in STM Systems

CSE341: Programming Languages Lecture 21 Dynamic Dispatch Precisely, and Manually in Racket Dan Grossman Autumn 2018.

Threads and Memory Models Hal Perkins Autumn 2009

CSE341: Programming Languages Lecture 11 Type Inference

Design and Implementation Issues for Atomicity

CSE341: Programming Languages Lecture 11 Type Inference

CSE341: Programming Languages Lecture 11 Type Inference

CSE 332: Concurrency and Locks

Controlled Interleaving for Transactions

CSE341: Programming Languages Lecture 11 Type Inference

CSE341: Programming Languages Lecture 21 Dynamic Dispatch Precisely, and Manually in Racket Dan Grossman Spring 2019.

Presentation transcript:

Region-Based Dynamic Separation in STM Haskell (And Related Perspective) Dan Grossman University of Washington Transactional Memory Workshop April 30, 2010

Apology From: Hank Levy (Department Chair) Date: April 6, 2010 Subject: Upcoming faculty meetings … Please reserve ** NOON TO 5:30 PM ** on THURSDAY APRIL 29th for a possible (marathon) faculty meeting… April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM2

Apology April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM3 From: Nicholas Kidd Subject: Re: [TMW'10] A few announcements Ugh indeed, this sounds terrible … I hereby promise that coffee will be available throughout TMW'10!

TM at Univ. Washington I come at transactions from the programming-languages side –Formal semantics, language design, and efficient implementation for atomic blocks –Software-development benefits –Interaction with other sophisticated features of modern PLs [ICFP05][MSPC06][PLDI07][OOPSLA07][SCHEME07][POPL08] April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM4 transfer(from,to,amt){ atomic { deposit(to,amt); withdraw(from,amt); } An easier-to-use and harder-to-implement synchronization primitive

The goal I want atomic blocks to: –Be easy to use in most cases –Interact well with rest of language design / implementation Despite subtle semantic issues for PL experts My favorite analogy [OOPSLA07] : garbage collection is a success story, for memory management rather than concurrency –People forget subtle semantic issues exist for GC Finalization / resurrection Space-explosion “optimizations” (like removing x=null ) … April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM5

Today Review and perspective on transaction + non-transaction access –“How we got to where we are” –A healthy reminder, probably without (much) controversy –But not much new for this expert crowd Not-yet-published work on specific issue of dynamic separation –Extension of STM Haskell –Emphasize need for “regions” and libraries reusable inside and outside transactions Time permitting: Brief note on two other current projects April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM6

Are races allowed? For performance and legacy reasons, many experts have decided not to allow code like the following –I can probably grudgingly live with this Why penalize “good code” for questionable benefit –But: For managed PLs, still struggle with “what can happen” Does make it harder to maintain / evolve code April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM7 Thread 1 x = 2; Thread 2 atomic { x = 1; y = 1; assert(x==y); }

Privatization Alas, there are examples where it is awkward to consider the program racy, but “basic” TM approaches can “create” a problem Canonical “privatization” example: April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM8 Thread 1 atomic { r = ptr; ptr = new C(); } assert(r.f==r.g); Thread 2 atomic { ++ptr.f; ++ptr.g; } initially ptr.f = = ptr.g ptr fg

The Problems Eager update, lazy conflict detection: assert may see one update from “doomed” Thread 2 Lazy update: assert may see one update from “partially committed” Thread 2 April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM9 Thread 1 atomic { r = ptr; ptr = new C(); } assert(r.f==r.g); Thread 2 atomic { ++ptr.f; ++ptr.g; } initially ptr.f = = ptr.g ptr fg

Solution areas To support atomic blocks that privatize (and related idioms): 1.Enrich underlying TM implementations to be privatization safe –I’m all for it if trade-offs are acceptable Important but uncommon cases –Not today’s presentation 2.Disallow privatization –Either soundly prohibited by PL or programmer error 3.Allow privatization only if programmers do more explicit work –Our work, making this more convenient and flexible April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM10

Disallowing privatization Prior work on static separation takes this approach –Same memory cannot be used inside a transaction and outside a transaction –Note read-only and thread-local are okay April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM11 Thread local Immutable Never accessed in transaction See: NAIT is provably enough for “weak” TM to implement “strong” atomic block –POPL08 * 2 STM Haskell –functional + monads => immutable or NAIT

Dynamic separation Dynamic separation allows objects to transition among –Only accessed inside transactions –Only accessed outside transactions –Read only –(Added by us: thread-local to thread tid ) Explicit language primitives to enact transitions –Example: protect obj transitions obj to “only inside” Semantics and implementation for C# and AME –[Abadi et al, CC2009, CONCUR2008] April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM12

Uses of dynamic separation Obvious use: Explicit privatization Another: more efficient (re)-initialization of data structures than static separation would allow –Essentially a “publication” –Create a large tree in one thread without transactions and then protect it and make it thread-shared –Resize a hashtable without a long transaction (next slide) But the (re)-initialization argument is much more compelling if we can transition an entire data structure in O(1) time/space –For example: If hash table uses linked lists April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM13

Hash table example April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM14 class HT { T [] table; boolean resizing = false; … void insert(T x){ atomic{ if(resizing) retry; … }} T find(int key) { atomic{ if(resizing) retry; … }} void resize() { atomic{ if(resizing) return; resizing = true; } unprotect(table); … protect(table); atomic{ resizing = false; } }

Today Review and perspective on transaction + non-transaction access –“How we got to where we are” –A healthy reminder, probably without (much) controversy –But not much new for this expert crowd Not-yet-published work on specific issue of dynamic separation –Extension of STM Haskell –Emphasize need for “regions” and libraries reusable inside and outside transactions Time permitting: Brief note on two other current projects April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM15 Laura Effinger-Dean

Why Haskell In some sense, Haskell is a terrible choice for dynamic separation –The one language where static separation is natural –Monads already enforce static separation of many things But this makes it an ideal setting for our research –Use dynamic separation only where static separation is unpalatable –Need a precise, workable semantics from the start, else it will be obvious we are “ruining Haskell” April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM16

Novelties 1.Region-based to support constant-time transition-change for collection of objects 2.Complement static separation (current default in Haskell) –Allow both approaches in same program (different data) –Use dynamic separation for composable libraries that can be used inside or outside transactions, without violating Haskell’s type system 3.Extend elegant formal semantics (including orelse ) 4.Underlying implementation uses lazy update –Significant speed-up for some benchmarks by avoiding transactions that are necessary with static separation April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM17

STM Haskell basics STM Haskell has static separation –Most data is read-only (purely functional language) –Non-transactional mutable locations called IORef s –Transactional mutable locations called TVar s Because the type system enforces static separation, you can’t “transactionalize” code using IORef s, by “slapping an atomic around it” –This is a general feature of Haskell’s monads –The STM monad and IO (top-level) monad are distinct –atomically primitive takes a transaction “object” and creates a top-level-action “object” April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM18 atomically :: STM a -> IO a

Adding DVars From a language-design standpoint, it’s mostly straightforward to add a third kind of mutable location for dynamic separation In “normal languages”, a DVar would be allowed by the type system to be accessed anywhere –A meta-data field would record “current protection state” and dynamically disallow transactions to use it when “unprotected” –This doesn’t work with monads: separation is the rule April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM19

DVars for Haskell So we add a third monad, DSTM monad, for Dvar s –Can turns a DSTM “object” into an STM “object” or a top- level-action “object” April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM20 atomically :: STM a -> IO a protected :: DSTM a -> STM a unprotected :: DSTM a -> IO a -- not atomic! A DSTM “object” could be as little as a single read/write of a DVar –But sequences of actions can be packaged up so that the same library can be used inside or outside transactions –Trade-off between code reuse and protection-state checks –Not possible in previous approaches to sound separation

Regions So far, we could just have the DSTM Monad include operations, including protection-state changes for DVar s April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM21 Instead, we add a level of indirection for the protection state, so one state change can effect a collection of objects (could be 1) –Cost is one implicit word per DVar (avoidable if unneeded) newDRgn :: DSTM DRgn a -> DRgn -> DSTM (DVar a) newDVar :: a -> DSTM (DVar a) readDVar :: DVar a -> DSTM a writeDVar :: DVar a -> a -> DSTM a protectDVar :: DVar a -> IO () unprotectDVar :: DVar a -> IO () protectDRgn :: DRgn -> IO () unprotectDRgn :: DRgn -> IO ()

Novelties 1.Region-based to support constant-time transition-change for collection of objects 2.Complement static separation (current default in Haskell) –Allow both approaches in same program (different data) –Use dynamic separation for composable libraries that can be used inside or outside transactions, without violating Haskell’s type system 3.Extend elegant formal semantics (including orelse ) 4.Underlying implementation uses lazy update –Significant speed-up for some benchmarks by avoiding transactions that are necessary with static separation April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM22

Implementation in one slide DVar read/write also reads associated DRgn –Only txn’s first access of the DVar (easy with lazy update) Protection-state change is a mini-transaction that writes to the DRgn –TM mechanism synchronizes with txns There are, uhm, some other details April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM23 DVar

Non-transactional accesses Suppose DVar accesses outside of transactions do not check the DRgn protection-state –Any correct program w.r.t. dynamic separation runs correctly –Any incorrect program is still type safe, but may violate atomicity Alternately, we can check all accesses –Have a safe caching mechanism to avoid unnecessary DRgn access in common cases April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM24

Preliminary Performance Caveat: Comparing to STM Haskell baseline is not necessarily state-of-the-art Approach 1: Take existing STM benchmarks, use all DVar s instead of TVar s, measure slowdown: 0-20% Approach 2: Code up “killer uses” of dynamic separation, measure speedup: 2-8x for 4 threads (e.g., resizing hash table) Approach 3: Find an STM Haskell program that would benefit from dynamic separation and rewrite it: TBD April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM25

Conclusion Dynamic separation appears to be an elegant and viable alternative for implementing a PL over a TM that is not privatization-safe April 30, 2010Dan Grossman: Region-Based Dynamic Separation for STM26