Minh, Trautmann, Chung, McDonald, Bronson, Casper, Kozyrakis, Olukotun

Presentation transcript:

An Effective Hybrid Transactional Memory System with Strong Isolation Guarantees
Minh, Trautmann, Chung, McDonald, Bronson, Casper, Kozyrakis, Olukotun
Presented by Cynthia Sturton, 5/5/08

Outline: Software Transactional Memory; Hardware Transactional Memory; SigTM.

Software Transactional Memory: lazy versioning; global version clock; write set buffer; lazy conflict detection; lock associated with every word in memory; Bloom filter to maintain the write set.

Software Transactional Memory: the compiler translates the high-level atomic block into low-level STM calls.

High-level:

    ListNode n;
    atomic {
        n = head;
        if (n != null) {
            head = head.next;
        }
    }

Low-level:

    ListNode n;
    STMstart();
    n = STMread(&head);
    if (n != null) {
        ListNode t;
        t = STMread(&head.next);
        STMwrite(&head, t);
    }
    STMcommit();

Software Transactional Memory - Start: checkpoint the current execution environment; read the global version clock value into RV.

Software Transactional Memory - Read: check whether the address is in the write set; check for conflicts with committed or committing transactions (abort on conflict); insert the address into the read set (FIFO); load the word from memory and return the value to the user.

Software Transactional Memory - Write: check for conflicts with committed or committing transactions (abort on conflict); insert the address in the Bloom filter for the write set; insert the address and data in the write set.

Software Transactional Memory - Commit: acquire locks for the write set; atomically increment the global clock; validate the items in the read set (at this point the transaction is validated); copy the write set values to memory; release the locks on the write set.
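The start, read, write, and commit steps above can be sketched in a few lines of Python. This is a minimal single-threaded model, not the presented system: the names `Memory`, `Tx`, and `AbortTx` are hypothetical, a plain dictionary stands in for the Bloom-filtered write set, a Python set stands in for the per-word locks, and the "atomic" clock increment is an ordinary addition because nothing here runs concurrently.

```python
class AbortTx(Exception):
    """Raised when a transaction detects a conflict and must restart."""


class Memory:
    """Word-addressed memory with one versioned lock per word (illustrative)."""

    def __init__(self):
        self.clock = 0        # global version clock
        self.data = {}        # address -> value
        self.version = {}     # address -> clock value of the last committed write
        self.locked = set()   # addresses whose lock is currently held


class Tx:
    def __init__(self, mem):
        self.mem = mem
        self.rv = mem.clock   # start: sample the global clock into RV
        self.read_set = []    # FIFO of addresses read
        self.write_set = {}   # address -> buffered value (lazy versioning)

    def _check(self, addr):
        # Conflict if a committer holds the lock or the word changed after RV.
        if addr in self.mem.locked or self.mem.version.get(addr, 0) > self.rv:
            raise AbortTx(addr)

    def read(self, addr):
        if addr in self.write_set:           # return our own buffered write
            return self.write_set[addr]
        self._check(addr)
        self.read_set.append(addr)
        return self.mem.data.get(addr, 0)

    def write(self, addr, value):
        self._check(addr)
        self.write_set[addr] = value

    def commit(self):
        mem = self.mem
        acquired = []
        try:
            for addr in self.write_set:      # traversal 1: acquire locks
                if addr in mem.locked:
                    raise AbortTx(addr)
                mem.locked.add(addr)
                acquired.append(addr)
            mem.clock += 1                   # stands in for the atomic increment
            wv = mem.clock
            for addr in self.read_set:       # validate the read set
                held_by_other = addr in mem.locked and addr not in self.write_set
                if held_by_other or mem.version.get(addr, 0) > self.rv:
                    raise AbortTx(addr)
            for addr, value in self.write_set.items():  # traversal 2: write back
                mem.data[addr] = value
                mem.version[addr] = wv
        finally:
            for addr in acquired:            # traversal 3: release locks
                mem.locked.discard(addr)
```

The three loops over the write set in `commit` correspond to the acquire, copy, and release steps; a transaction that read a word later overwritten by another committer fails validation and leaves memory untouched.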

Correctness in STM: strong isolation; data races; privatization code; read sets not validated until commit.

Strong Isolation

Thread 1:

    ListNode n;
    atomic {
        n = head;
        if (n != null)
            head = head.next;
    }
    // use n.val many times

Thread 2:

    atomic {
        ListNode n = head;
        while (n != null) {
            n.val++;
            n = n.next;
        }
    }

Thread 1 can read the partially committed transaction state of Thread 2.
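The hazard above can be illustrated with a deterministic interleaving. This is a simplified sketch under stated assumptions, not the slide's linked-list code: memory is a dictionary of two hypothetical words (`a.val`, `b.val`), `write_back` is a hypothetical helper that models the copy phase of a lazy-versioning commit, and the interleaving is forced by hand instead of by real threads.

```python
def write_back(memory, write_set):
    """Copy a validated write set to memory one word at a time.

    Yields after each store so the caller can interleave other accesses.
    """
    for addr, value in write_set.items():
        memory[addr] = value
        yield addr


memory = {"a.val": 0, "b.val": 0}
# Thread 2's transaction incremented every node; its write set is validated.
thread2_write_set = {"a.val": 1, "b.val": 1}

commit = write_back(memory, thread2_write_set)
next(commit)                     # the first word reaches memory
# Thread 1 reads outside any transaction, so no barrier checks the locks.
snapshot = (memory["a.val"], memory["b.val"])
for _ in commit:                 # the remaining words reach memory
    pass
final = (memory["a.val"], memory["b.val"])
```

Here `snapshot` holds one updated word and one stale word, a state that no serial order of the two threads could produce. Strong isolation rules this out by making non-transactional accesses respect the committing transaction.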

Hardware Transactional Memory: lazy versioning; write set buffered in cache; W and R bits added to cache line hardware; eager conflict detection (reads and writes); cache coherency messages.

Hardware Transactional Memory - Start: register checkpoint done by hardware.

Hardware Transactional Memory - Read: on a cache hit, set the R bit if the W bit isn't already set; on a cache miss, request the line in shared state and set the R bit.

Hardware Transactional Memory - Write: on a cache miss, request the line in shared state; on a cache hit, if the data is modified, write it back to underlying memory; then write to the cache and set the W bit.

Hardware Transactional Memory - Commit: acquire the commit lock; acquire exclusive state on all lines in the write set (at this point the transaction is validated); reset the W and R bits; release the commit lock. Modified data in the cache can then be read by others.

Hardware Transactional Memory - Conflict Detection: a conflict occurs when the process receives an exclusive request for data in its read set, or any request for data in its write set; such requests are generated by a committing or non-transactional process. On a conflict, the software abort handler is invoked, all cache lines in the R and W sets are invalidated, and the register checkpoint is restored. Forward progress: a validated transaction cannot abort. No starvation: starving transactions acquire the commit lock at the outset.
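The conflict rule above reduces to a small predicate over the R and W bits. The following is a rough software model, assuming one tracking entry per cache line; `CacheLine`, `TxCache`, and `snoop` are hypothetical names, and the commit lock and the rule that a validated transaction cannot abort are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class CacheLine:
    r: bool = False   # line is in the transaction's read set
    w: bool = False   # line is in the transaction's write set


@dataclass
class TxCache:
    lines: dict = field(default_factory=dict)  # line address -> CacheLine
    aborted: bool = False

    def tx_read(self, addr):
        line = self.lines.setdefault(addr, CacheLine())
        if not line.w:          # set R only if W is not already set
            line.r = True

    def tx_write(self, addr):
        self.lines.setdefault(addr, CacheLine()).w = True

    def snoop(self, addr, exclusive):
        """Handle an incoming coherence request; return True on conflict."""
        line = self.lines.get(addr)
        if line is None:
            return False
        # Any request conflicts with the write set;
        # only exclusive requests conflict with the read set.
        conflict = line.w or (line.r and exclusive)
        if conflict:
            self.abort()
        return conflict

    def abort(self):
        self.lines.clear()      # invalidate every line in the R and W sets
        self.aborted = True     # a register checkpoint restore would follow
```

A shared (read) request for a line that is only in the read set is harmless, which is why `snoop` takes the `exclusive` flag into account only for R-marked lines.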

SigTM: hardware-software transactional memory hybrid; eager conflict detection (on read set); hardware signature (Bloom filter); lazy versioning; write set buffer in SW; strong isolation guarantees.
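A signature of this kind behaves like a fixed-size Bloom filter over addresses. The sketch below is a software stand-in under stated assumptions: the class name `Signature`, the 1024-bit size, and the four SHA-256-derived hash functions are illustrative choices, not the hardware design, which would use far cheaper hashing.

```python
import hashlib


class Signature:
    """Fixed-size Bloom filter over addresses (software stand-in)."""

    def __init__(self, bits=1024, hashes=4):
        self.bits = bits
        self.hashes = hashes
        self.word = 0                 # bit vector packed into one integer

    def _positions(self, addr):
        # Derive `hashes` bit positions from the address.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{addr}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def insert(self, addr):
        for pos in self._positions(addr):
            self.word |= 1 << pos

    def may_contain(self, addr):
        # True for every inserted address; may also be True for others.
        return all((self.word >> pos) & 1 for pos in self._positions(addr))

    def clear(self):
        self.word = 0
```

A lookup never misses an inserted address, so conflicts are never overlooked. It can report an address that was never inserted, and in a TM system such a false positive shows up as an unnecessary abort, not as incorrect execution.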

SigTM - Start: take a checkpoint; enable read set signature lookups for exclusive coherence requests.

SigTM - Read: check whether the address is in the write set; insert the address into the read set signature; read the word from memory.

SigTM - Write: add the address to the write signature; update the address and value in the software write set.

SigTM - Commit: enable coherence lookups in the write set for all requests; acquire exclusive access for every address in the write set; enable NACKs for requests in the write set (at this point the transaction is validated); reset the read set signature; store the values from the write set to memory; reset the write set signature; disable NACKing.
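The SigTM read, write, and commit steps above can be put together in a small single-threaded model. This is a sketch under loud assumptions: `SigTx` and `snoop` are hypothetical names, each signature is a 256-bit vector with a single hash function, coherence requests are plain method calls on the other transactions, and the write-signature lookups that apply before validation are left out for brevity.

```python
class SigTx:
    """One SigTM-style transaction; signatures are small bit vectors."""

    BITS = 256

    def __init__(self, memory):
        self.memory = memory
        self.read_sig = 0          # read set signature
        self.write_sig = 0         # write set signature
        self.write_set = {}        # software write buffer: address -> value
        self.nack_writes = False   # True only while storing the write set
        self.aborted = False

    def _bit(self, addr):
        return 1 << (hash(addr) % self.BITS)

    def read(self, addr):
        if addr in self.write_set:              # check the write set first
            return self.write_set[addr]
        self.read_sig |= self._bit(addr)        # insert into the read signature
        return self.memory.get(addr, 0)

    def write(self, addr, value):
        self.write_sig |= self._bit(addr)
        self.write_set[addr] = value

    def snoop(self, addr, exclusive):
        """Incoming coherence request: returns 'nack', 'abort', or 'ok'."""
        bit = self._bit(addr)
        if self.nack_writes and self.write_sig & bit:
            return "nack"
        if exclusive and self.read_sig & bit:
            self.aborted = True
            return "abort"
        return "ok"

    def commit(self, others=()):
        if self.aborted:
            return False
        for addr in self.write_set:             # traversal 1: exclusive access
            for other in others:
                other.snoop(addr, exclusive=True)
        self.nack_writes = True                 # transaction validated
        self.read_sig = 0
        for addr, value in self.write_set.items():  # traversal 2: store values
            self.memory[addr] = value
        self.write_sig = 0
        self.nack_writes = False
        return True
```

The read set needs no validation pass at commit because a conflicting exclusive request aborts the reader as soon as it arrives, and the write set is traversed only twice.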

SigTM vs. STM: read barriers accelerated with the read set signature (no locking or timestamps); commit accelerated (two traversals of the write set, no read set validation); early conflict detection; false positives with read or write signatures?

SigTM vs. HTM: no hardware cache modification; flexible; nested transactions.

Performance Evaluation

Accuracy of Read and Write Signatures

SigTM

STM vs. HTM

STM: maintenance and validation of the read set; during commit, one read barrier and timestamp validation per word in the read set; three traversals of the write set in validate and commit (acquire locks, write to memory, release locks); lazy conflict detection (at the end of execution, when validating the read set), which wastes work on aborted transactions.

HTM: no additional instructions to maintain the read/write set; read set validation occurs continuously; one traversal of the write set on commit; virtualization on cache overflow or associativity conflict, with STM-like performance in that case; false conflicts due to cache-line-level granularity; strong isolation.

Transactional Memory: “Provide good performance with simple parallel code that frequently uses coarse-grain synchronization”. Version management for transaction data; conflict detection as transactions execute concurrently. SigTM: lazy versioning; eager conflict detection (on reads).