Novel Paradigms of Parallel Programming Prof. Smruti R. Sarangi IIT Delhi.


Outline
  Multicore Processors
  Parallel Programming Paradigms
  Transactional Memory: Basics
  Software Transactional Memory (STM)
  Hardware Transactional Memory

Multicores in the Last Five Years
  [Figure; source: Intel IDF]

Rise and Rise of Multicore Processors
  In line with Moore’s Law, the number of cores per chip is doubling roughly every two years.
  Source: extremetech.com

Future of Multicores
  Cores doubling every two years
    ◦ 16 cores by 2014
    ◦ 32 cores by 2016
    ◦ 64 cores by 2018
  Increasing number of threads per core
    ◦ Intel processors – 2 threads (hyperthreading mode)
    ◦ IBM Power 7 – up to 4 threads per core

Main Challenges: Programming and Scaling
  Scaling
    ◦ How to design a system that scales to hundreds of cores?
    ◦ Computer architects are working on it …
  Programming
    ◦ How to program it effectively?
    ◦ We need to work on it …

Leveraging Multicore Processors
  Each core does a separate job
    ▪ e.g. editor, music player, video player
  Suitable only for desktop applications

What about Enterprise/Scientific Applications?
  Need support for parallel programming
  Traditional methods
    ◦ Lock based
  Non-traditional methods
    ◦ Non-blocking methods (lock free/wait free)
    ◦ Transactional Memory
      ▪ Software
      ▪ Hardware

Outline
  Multicore Processors
  Parallel Programming Paradigms
  Transactional Memory: Basics
  Software Transactional Memory (STM)
  Hardware Transactional Memory

Conventional Lock-Based Programming
    val = account.balance;
    newval = val + 100;
    account.balance = newval;
  Can this code be executed in parallel by multiple threads?

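What is the problem?
  We need to clearly order one computation before the other; otherwise, the result will be incorrect.
  Two threads running the same code at the same time:
    Thread 1: val = account.balance; newval = val + 100; account.balance = newval;
    Thread 2: val = account.balance; newval = val + 100; account.balance = newval;
  If both threads read the balance before either one writes it back, one of the two updates is lost.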
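
The lost update can be replayed step by step. The sketch below is not from the slides: it simulates one bad interleaving sequentially, and the dictionary and the `_t1`/`_t2` names are illustrative assumptions.

```python
# Deterministic replay of one bad interleaving of two "add 100" threads.
# The dict stands in for the shared account (illustrative assumption).
account = {"balance": 0}

# Both threads read the balance before either one writes it back.
val_t1 = account["balance"]       # thread 1: val = account.balance
val_t2 = account["balance"]       # thread 2: val = account.balance

newval_t1 = val_t1 + 100          # thread 1: newval = val + 100
newval_t2 = val_t2 + 100          # thread 2: newval = val + 100

account["balance"] = newval_t1    # thread 1 writes 100
account["balance"] = newval_t2    # thread 2 overwrites it with 100

# Two additions of 100 were performed, but only one is reflected.
lost_update_balance = account["balance"]
```

With this ordering the final balance is 100 rather than 200, which is why the two computations have to be ordered.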

Solution: Use Locks
    lock();
    val = account.balance;
    newval = val + 100;
    account.balance = newval;
    unlock();
  Problems with Locks
    ◦ Does not allow disjoint access parallelism
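
A minimal runnable version of the locked update, assuming Python's `threading.Lock` as a stand-in for the slide's `lock()`/`unlock()`. The `Account` class and `run_deposits` helper are hypothetical names added for the example.

```python
import threading

# One global lock, playing the role of lock()/unlock() on the slide.
balance_lock = threading.Lock()

class Account:
    def __init__(self, balance=0):
        self.balance = balance

def deposit(account, amount=100):
    with balance_lock:                  # lock()
        val = account.balance
        newval = val + amount
        account.balance = newval        # unlock() when the block exits

def run_deposits(num_threads=4, deposits_per_thread=1000):
    """Run concurrent deposits and return the final balance."""
    account = Account()

    def worker():
        for _ in range(deposits_per_thread):
            deposit(account)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance
```

Because every read-modify-write runs under the same lock, no update is lost; the cost is that all deposits are serialized, even those that touch different accounts.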

What is disjoint access parallelism?
  It allows code from different threads to run in parallel if they do not access the same data.
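
One way to see the difference is to compare a single global lock with one lock per account. This is a single-threaded sketch under assumed names (`global_lock`, `account_locks`, `can_update`); it only probes whether a second lock could be taken while the first is held.

```python
import threading

global_lock = threading.Lock()
account_locks = {"A": threading.Lock(), "B": threading.Lock()}

def can_update(lock):
    """Return True if `lock` can be taken right now without waiting."""
    acquired = lock.acquire(blocking=False)
    if acquired:
        lock.release()
    return acquired

# Coarse-grained: an update to account A holds the one global lock,
# so an update to the unrelated account B would have to wait.
with global_lock:
    b_can_proceed_with_global_lock = can_update(global_lock)

# Fine-grained: holding A's lock does not block an update to B.
with account_locks["A"]:
    b_can_proceed_with_per_account_locks = can_update(account_locks["B"])
```

Per-account locks give disjoint access parallelism for this simple case, but operations that span several accounts then need a locking order to avoid deadlock.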

Other problems with locks
  In a typical UNIX futex-based implementation
    ◦ If a thread cannot get a lock for 100 µs, it invokes the kernel and goes to sleep
    ◦ System calls have an additional overhead
    ◦ They lead to OS jitter, which can be as high as tens of milliseconds
    ◦ [Sarangi and Kallurkar]
      ▪ OS jitter slows down parallel applications by more than 10%

How to get rid of locks?
  Use the HW instruction
    ◦ CAS (atomic compare and set)
  Example:
    ◦ CAS a, 10, 5: if a equals 10, atomically set a to 5 (result: a = 5)
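
Python has no hardware CAS primitive, so the sketch below emulates one with an internal lock purely to show the semantics. `AtomicCell` is a hypothetical class, not a library API.

```python
import threading

class AtomicCell:
    """Software stand-in for a memory word that supports CAS."""

    def __init__(self, value):
        self._value = value
        self._guard = threading.Lock()   # emulates hardware atomicity

    def load(self):
        with self._guard:
            return self._value

    def cas(self, expected, new):
        """Set the cell to `new` only if it currently holds `expected`."""
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False

a = AtomicCell(10)
first_attempt = a.cas(10, 5)    # a == 10, so a becomes 5
second_attempt = a.cas(10, 7)   # a is now 5, so nothing changes
```

The first call succeeds and the second fails, leaving `a` at 5, which matches the slide's example.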

Lock free Algorithm
  account.balance holds a <value, timestamp> pair
    while (true) {
        <val, ts> = account.balance;
        newval = val + 100;
        newts = ts + 1;
        if (CAS(account.balance, <val, ts>, <newval, newts>))
            break;
    }
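
The retry loop can be exercised with a software-emulated CAS. In this sketch the `AtomicCell` class, the tuple encoding of the <value, timestamp> pair, and the `run` helper are all assumptions made for illustration. The emulation uses a lock internally, so it shows the control flow rather than real lock freedom.

```python
import threading

class AtomicCell:
    """Software stand-in for a word that supports CAS (emulated)."""

    def __init__(self, value):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        with self._guard:
            return self._value

    def cas(self, expected, new):
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False

def deposit(balance_cell, amount=100):
    """Retry until the CAS on the (value, timestamp) pair succeeds."""
    while True:
        val, ts = balance_cell.load()
        newval = val + amount
        newts = ts + 1
        if balance_cell.cas((val, ts), (newval, newts)):
            break

def run(num_threads=4, deposits_per_thread=200):
    cell = AtomicCell((0, 0))

    def worker():
        for _ in range(deposits_per_thread):
            deposit(cell)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.load()
```

Every successful CAS bumps the timestamp by one, so the final timestamp equals the number of completed deposits.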

Issues with the Lockfree Algorithm
  The loop might never terminate
  Can lead to starvation
  There are two metrics that we need to optimize: fairness and speed

How to increase the balance?
  Wait free algorithms
  Basic Idea
    A request, T, first finds another request, R, that has been waiting for a long time
    T decides to help R
    This strategy ensures that no request is left behind
    Also known as an altruistic algorithm

Support Required
  dcas (double CAS) instruction: dcas(a, v1, v2, b, v3, v4)
  Executed atomically:
    if ((a == v1) and (b == v3)) {
        set a = v2
        set b = v4
    }
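
The semantics of dcas can be emulated in software. The sketch assumes a hypothetical `Ref` box for each location and a single guard lock that stands in for hardware atomicity.

```python
import threading

_dcas_guard = threading.Lock()   # emulates the atomicity of a hardware dcas

class Ref:
    """A mutable box standing in for one memory location."""

    def __init__(self, value):
        self.value = value

def dcas(a, v1, v2, b, v3, v4):
    """Atomically: if a == v1 and b == v3, set a = v2 and b = v4."""
    with _dcas_guard:
        if a.value == v1 and b.value == v3:
            a.value = v2
            b.value = v4
            return True
        return False

x, y = Ref(1), Ref(0)
both_match = dcas(x, 1, 2, y, 0, 1)      # both comparisons hold
one_mismatch = dcas(x, 2, 3, y, 0, 9)    # y is now 1, so nothing changes
```

Both locations change together or not at all; a mismatch on either one leaves both untouched.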

Implementation of a Wait Free Algorithm
  Update loop for a request T:
    while (true) {
        <val, ts> = T.account.balance;
        newval = val + 100;
        newts = ts + 1;
        if (dcas(T.account.balance, <val, ts>, <newval, newts>, T.status, 0, 1))
            break;
        if (T.status == 1)
            break;
    }
  Helping loop:
    repeat until (R == null)
        R = needsHelp();
        if (R != null) help(R);
    help(T)
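
The role of T.status is to make helping safe: several threads may run the update loop for the same request, but the update is applied only once. The sketch below shows only that property, using an emulated dcas. `Ref`, `Request`, and `help_request` are hypothetical names, and the `needsHelp()` scheduling from the slide is not modelled.

```python
import threading

_dcas_guard = threading.Lock()   # emulates the atomicity of a hardware dcas

class Ref:
    def __init__(self, value):
        self.value = value

def dcas(a, v1, v2, b, v3, v4):
    with _dcas_guard:
        if a.value == v1 and b.value == v3:
            a.value, b.value = v2, v4
            return True
        return False

class Request:
    def __init__(self, balance):
        self.balance = balance   # Ref holding a (value, timestamp) pair
        self.status = Ref(0)     # 0 = pending, 1 = completed

def help_request(req, amount=100):
    """Update loop from the slide; safe to run from several helpers."""
    while True:
        val, ts = req.balance.value
        if dcas(req.balance, (val, ts), (val + amount, ts + 1),
                req.status, 0, 1):
            break                # this helper applied the update
        if req.status.value == 1:
            break                # another helper already applied it

balance = Ref((0, 0))
request = Request(balance)
helpers = [threading.Thread(target=help_request, args=(request,))
           for _ in range(4)]
for h in helpers:
    h.start()
for h in helpers:
    h.join()
final_balance = balance.value
```

Four helpers run the loop, yet the balance increases by exactly 100, because only the first successful dcas can flip the status from 0 to 1.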

Issues in implementing a wait free algorithm
  The dcas instruction is not available on most machines
  It is possible to implement it with regular cas instructions
  Wait free algorithms are thus very complicated
  Metrics: fairness, speed

Implementing a wait free algorithm is the same as … A black belt in programming

Outline
  Multicore Processors
  Parallel Programming Paradigms
  Transactional Memory: Basics
  Software Transactional Memory (STM)
  Hardware Transactional Memory

Transactional Memory (TM)
  What is the best way to achieve both speed and fairness?
  Try transactional memory:
    begin(atomic) {
        val = account.balance;
        newval = val + 100;
        account.balance = newval;
    }

Advantages
  Easy to program
  Tries to provide the optimal fairness and speed
  Similar to database transactions
    ◦ ACID
      ▪ (atomic, consistent, isolated, durable)
  Variants: Hardware TM, Software TM

Basics of Transactional Memory
  Notion of a conflict
    Two transactions conflict when there is a possibility of an error if they execute in parallel
  Formally:
    read set: the set of variables that are read by the transaction
    write set: the set of variables that are written by the transaction

When do transactions conflict?
  Let $R_i$ and $W_i$ be the read and write sets of transaction $i$
  Similarly, let $R_j$ and $W_j$ be the read and write sets of transaction $j$
  There is a conflict iff:
    $(W_i \cap W_j \neq \emptyset) \lor (W_i \cap R_j \neq \emptyset) \lor (R_i \cap W_j \neq \emptyset)$
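
The conflict condition translates directly into set operations. A small sketch with a hypothetical `conflicts` helper and made-up variable names:

```python
def conflicts(r_i, w_i, r_j, w_j):
    """True iff W_i∩W_j, W_i∩R_j, or R_i∩W_j is non-empty."""
    return bool(w_i & w_j) or bool(w_i & r_j) or bool(r_i & w_j)

# Transaction i writes y, transaction j reads y: conflict.
write_read = conflicts({"x"}, {"y"}, {"y"}, {"z"})

# Both transactions only read x: no conflict.
read_read = conflicts({"x"}, set(), {"x"}, set())

# The transactions touch disjoint variables: no conflict.
disjoint = conflicts({"a"}, {"b"}, {"c"}, {"d"})
```

Two transactions that only share reads never conflict, since R_i ∩ R_j does not appear in the condition.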

Abort and Commit
  Commit
    ◦ A transaction completed without any conflicts
    ◦ Finished writing its data to main memory
  Abort
    ◦ A transaction could not complete due to conflicts
    ◦ Did not make any of its writes visible

Basics of Concurrency Control
  A conflict occurs when the read-write sets overlap
  A conflict is detected when the TM system becomes aware of it
  A conflict is resolved when the TM system either
    ◦ delays a transaction
    ◦ aborts it
  Three events: occurrence, detection, resolution

Pessimistic vs Optimistic Concurrency Control
  Pessimistic concurrency control
    ◦ Occurrence, detection, and resolution happen at the same point in time
  Optimistic concurrency control
    ◦ Occurrence happens first; detection and resolution follow later

Version Management
  Eager version management
    ◦ Write directly to memory
    ◦ Maintain an undo log
    ◦ On commit: flush (discard) the undo log
    ◦ On abort: write back the undo log
  Lazy version management
    ◦ Write to a buffer (redo log)
    ◦ Transfer the buffer to memory on a commit
    ◦ On commit: write back the redo log
    ◦ On abort: flush (discard) the redo log
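
The two policies can be contrasted with a toy memory modelled as a dictionary. `EagerTx` and `LazyTx` are hypothetical classes for illustration; conflict detection is left out so that only the logging is visible.

```python
class EagerTx:
    """Eager versioning: write in place, keep old values in an undo log."""

    def __init__(self, memory):
        self.memory = memory
        self.undo_log = []                 # (address, old value) pairs

    def write(self, addr, value):
        self.undo_log.append((addr, self.memory[addr]))
        self.memory[addr] = value          # visible in memory immediately

    def commit(self):
        self.undo_log.clear()              # flush the undo log

    def abort(self):
        for addr, old in reversed(self.undo_log):
            self.memory[addr] = old        # write back the undo log
        self.undo_log.clear()

class LazyTx:
    """Lazy versioning: buffer writes in a redo log until commit."""

    def __init__(self, memory):
        self.memory = memory
        self.redo_log = {}                 # address -> new value

    def read(self, addr):
        if addr in self.redo_log:
            return self.redo_log[addr]     # see our own buffered write
        return self.memory[addr]

    def write(self, addr, value):
        self.redo_log[addr] = value        # memory is untouched

    def commit(self):
        self.memory.update(self.redo_log)  # write back the redo log
        self.redo_log.clear()

    def abort(self):
        self.redo_log.clear()              # flush the redo log

memory = {"x": 1}
eager = EagerTx(memory)
eager.write("x", 2)
x_during_eager_tx = memory["x"]
eager.abort()
x_after_eager_abort = memory["x"]

lazy = LazyTx(memory)
lazy.write("x", 5)
x_during_lazy_tx = memory["x"]
x_seen_by_lazy_tx = lazy.read("x")
lazy.commit()
x_after_lazy_commit = memory["x"]
```

Eager versioning makes commits cheap and aborts expensive; lazy versioning does the opposite and adds a redo-log lookup to every read.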

Conflict Detection
  Eager
    ◦ Check for conflicts as soon as a transaction accesses a memory location
  Lazy
    ◦ Check at the time of committing a transaction

Semantics of Transactions
  Serializable
    ◦ Transactions can be ordered sequentially
  Strictly Serializable
    ◦ The sequential ordering is consistent with the real time ordering
  Linearizable
    ◦ A transaction appears to execute instantaneously
  Opacity
    ◦ A transaction is strictly serializable with respect to non-transactional accesses also

What happens after an abort?
  The transaction restarts and re-executes
  Might wait for a random duration of time to minimize future conflicts
    ◦ do { … } while (!Tx.commit())
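
A retry loop with a random wait might look as follows. This is a sketch: `run_transaction`, `attempt_once`, and the backoff bound are assumed names and values, and `attempt_once` stands for "execute the body and try to commit".

```python
import random
import time

def run_transaction(attempt_once, max_backoff_s=0.001):
    """Re-execute until commit succeeds; return the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        if attempt_once():          # body ran and Tx.commit() succeeded
            return attempts
        # Aborted: wait for a random duration before restarting so that
        # competing transactions are less likely to collide again.
        time.sleep(random.uniform(0, max_backoff_s))

# Illustration: a transaction that aborts twice and then commits.
outcomes = iter([False, False, True])
attempts_needed = run_transaction(lambda: next(outcomes), max_backoff_s=0.0)
```

Real systems often grow the backoff bound after each abort; the fixed bound here keeps the sketch short.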

Outline
  Multicore Processors
  Parallel Programming Paradigms
  Transactional Memory: Basics
  Software Transactional Memory (STM)
  Hardware Transactional Memory

Software Transactional Memory
  Design choices
    Concurrency Control
      ◦ Optimistic or Pessimistic
    Version Management
      ◦ Lazy or Eager
    Conflict Detection
      ◦ Lazy or Eager

Support Required
  Augment every transactional object/variable with metadata
  Metadata kept with each object:
    1. Transaction that has locked the object
    2. Read or write

Maintaining Read-Write Sets
  Each transaction maintains a list of locations that it has
    ◦ read, in the read set
    ◦ written, in the write set
  Every memory read or write operation is augmented
    ◦ readTX (read, and enter in the read set)
    ◦ writeTX (write, enter in the write set, and make changes to the undo/redo log)

Bartok STM
  Every transactional variable has the following fields
    ◦ version
    ◦ value
    ◦ lock

Read Operation
  1. Record the version of the variable
  2. Add the variable to the read set
  3. Read the value

Write Operation
  1. Lock the variable; abort if it is already locked
  2. Add the old value to the undo log
  3. Write the new value

Commit Operation
  1. For each entry in the read set: check if the version of the variable is still the same
     ◦ No: abort
     ◦ Yes: continue
  2. For each entry in the write set: increment the version and release the lock
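
The read, write, and commit steps above can be put together in a small single-threaded model, where interleaving is simulated by alternating calls on two transaction objects. This is a sketch in the style of the slides, not the Bartok implementation: `TVar`, `Tx`, and `Abort` are hypothetical names, taking the lock is a plain assignment where a real STM would need an atomic operation, and the check for a variable locked by another transaction at commit is an added assumption.

```python
class Abort(Exception):
    """Raised when a transaction has to abort."""

class TVar:
    def __init__(self, value):
        self.value = value
        self.version = 0
        self.lock = None       # transaction holding the lock, or None

class Tx:
    def __init__(self):
        self.read_set = {}     # TVar -> version recorded at first read
        self.write_set = []    # TVars locked by this transaction
        self.undo_log = []     # (TVar, old value) pairs

    def read(self, var):
        self.read_set.setdefault(var, var.version)   # record the version
        return var.value

    def write(self, var, new_value):
        if var.lock is not self:
            if var.lock is not None:
                self.abort()                  # already locked by another Tx
            var.lock = self
            self.write_set.append(var)
        self.undo_log.append((var, var.value))
        var.value = new_value                 # eager: write in place

    def commit(self):
        for var, seen in self.read_set.items():
            locked_by_other = var.lock is not None and var.lock is not self
            if var.version != seen or locked_by_other:
                self.abort()
        for var in self.write_set:
            var.version += 1
            var.lock = None

    def abort(self):
        for var, old in reversed(self.undo_log):
            var.value = old                   # roll back using the undo log
        for var in self.write_set:
            var.lock = None
        raise Abort()

# t1 reads x, then t2 updates x and commits, so t1 must abort at commit.
x = TVar(10)
t1, t2 = Tx(), Tx()
value_seen_by_t1 = t1.read(x)
t2.write(x, 20)
t2.commit()
try:
    t1.commit()
    t1_committed = True
except Abort:
    t1_committed = False
```

Reads stay cheap because they only record a version, but a transaction learns that its reads are stale only at commit time.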

Pros and Cons
  Pros
    ◦ simple
    ◦ reads are simple
    ◦ provides strong semantics for transactions
  Cons
    ◦ does not provide opacity
    ◦ uses locks

TL2 STM
  Uses lazy version management (redo log)
  Uses a global timestamp
  Provides strong guarantees with respect to other transactions, and even operations that are not within the context of a transaction
  Locks variables only at commit time
  Every transaction tX has a unique version (tX.V) that is assigned to it when it starts

Read Operation: read(tX, obj)
  Is obj in the redo log?
    ◦ Yes: return the value in the redo log
    ◦ No:
        v1 = obj.timestamp
        result = obj.value
        v2 = obj.timestamp
        if ((v1 != v2) || (v1 > tX.V) || obj.lock)
            abort();
        addToReadSet(obj);
        return result;

Write Operation
  1. Add an entry to the redo log if required
  2. Perform the write

Commit Operation
  1. For each entry in the write set: lock the object; on failure, abort
  2. version ← globalClock + 1
  3. For each entry e in the read set: if (e.version > tX.V), abort
  4. Write back the redo log
  5. For each entry in the write set: set the version and release the lock
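
The TL2-style read, write, and commit steps can be modelled in a few lines. As before, this is a single-threaded sketch with simulated interleaving. `Clock`, `TObj`, `Tx`, and `Abort` are hypothetical names, locking is a plain flag where a real implementation would use atomic operations, and step 2 is modelled as incrementing the global clock and using the new value as the version.

```python
class Abort(Exception):
    """Raised when a transaction has to abort."""

class Clock:
    def __init__(self):
        self.value = 0             # global timestamp

class TObj:
    def __init__(self, value):
        self.value = value
        self.timestamp = 0         # version of the last committed write
        self.lock = False

class Tx:
    def __init__(self, clock):
        self.clock = clock
        self.V = clock.value       # version assigned at start
        self.read_set = []
        self.redo_log = {}         # TObj -> buffered new value

    def read(self, obj):
        if obj in self.redo_log:
            return self.redo_log[obj]
        v1 = obj.timestamp
        result = obj.value
        v2 = obj.timestamp
        if v1 != v2 or v1 > self.V or obj.lock:
            raise Abort()
        self.read_set.append(obj)
        return result

    def write(self, obj, value):
        self.redo_log[obj] = value             # lazy: memory is untouched

    def commit(self):
        locked = []
        try:
            for obj in self.redo_log:          # 1. lock the write set
                if obj.lock:
                    raise Abort()
                obj.lock = True
                locked.append(obj)
            self.clock.value += 1              # 2. new version
            version = self.clock.value
            for obj in self.read_set:          # 3. validate the read set
                if obj.timestamp > self.V:
                    raise Abort()
            for obj, value in self.redo_log.items():
                obj.value = value              # 4. write back the redo log
                obj.timestamp = version        # 5. set the version
        finally:
            for obj in locked:                 # 5. release the locks
                obj.lock = False

clock = Clock()
x, y = TObj(1), TObj(2)

t1 = Tx(clock)
value_seen_by_t1 = t1.read(x)      # t1 reads x at version 0
t2 = Tx(clock)
t2.write(x, 10)
t2.commit()                        # x now has timestamp 1

t1.write(y, 5)
try:
    t1.commit()                    # x.timestamp (1) > t1.V (0)
    t1_committed = True
except Abort:
    t1_committed = False
```

The aborted transaction leaves `y` untouched because its write only ever lived in the redo log, and the lock it took during commit is released on the way out.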

Pros and Cons
  Pros
    ◦ simple
    ◦ provides opacity
    ◦ holds locks for a lesser amount of time
  Cons
    ◦ a redo log is slower
    ◦ uses locks

Outline
  Multicore Processors
  Parallel Programming Paradigms
  Transactional Memory: Basics
  Software Transactional Memory (STM)
  Hardware Transactional Memory

Design choices: pessimistic concurrency control, eager conflict detection, lazy version management
  [Figure: Processor, L1 Cache, L2 Cache; cache lines are augmented with a speculative bit]

Basics of Hardware Transactions – Extend the Directory Protocol
  Start Transaction
    ◦ Write back all the modified lines in the L1 cache to the lower level
  Write Operation
    1. If not in the M state, broadcast the write to all the processors
    2. If any processor has speculatively written to the location, then one Tx aborts; else mark the line as speculative
  Read Operation
    1. Broadcast the read to all the processors (to change to the S state). Abort a Tx if another processor is speculatively writing to that line.
  Commit Operation
    1. Convert all speculative data to non-speculative (gang clear mechanism)
  Abort Operation
    1. Convert all speculative data to invalid
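
The protocol can be mimicked with a toy software model of per-processor caches. Everything here is an assumption made for illustration: the `Machine` class and its methods are hypothetical, coherence states are not modelled, the requesting transaction is always the one that aborts, and commit copies the lines to a shared dictionary as a shortcut for making them visible to other processors.

```python
class Conflict(Exception):
    """Raised when the requesting transaction aborts."""

class Machine:
    def __init__(self, num_procs, memory):
        self.memory = dict(memory)                      # lower level
        self.lines = [dict() for _ in range(num_procs)] # L1 contents
        self.spec = [set() for _ in range(num_procs)]   # speculative bits

    def _other_is_writing(self, proc, addr):
        return any(addr in s for p, s in enumerate(self.spec) if p != proc)

    def tx_write(self, proc, addr, value):
        if self._other_is_writing(proc, addr):
            self.abort(proc)
            raise Conflict(addr)
        for p, cache in enumerate(self.lines):
            if p != proc:
                cache.pop(addr, None)      # broadcast invalidates copies
        self.lines[proc][addr] = value
        self.spec[proc].add(addr)          # mark the line as speculative

    def tx_read(self, proc, addr):
        if self._other_is_writing(proc, addr):
            self.abort(proc)
            raise Conflict(addr)
        if addr in self.lines[proc]:
            return self.lines[proc][addr]
        return self.memory[addr]

    def commit(self, proc):
        for addr in self.spec[proc]:
            self.memory[addr] = self.lines[proc][addr]  # model shortcut
        self.spec[proc].clear()            # gang clear of speculative bits

    def abort(self, proc):
        for addr in self.spec[proc]:
            del self.lines[proc][addr]     # speculative data -> invalid
        self.spec[proc].clear()

m = Machine(2, {"x": 0})
m.tx_write(0, "x", 5)                      # processor 0 writes speculatively
try:
    m.tx_read(1, "x")
    reader_aborted = False
except Conflict:
    reader_aborted = True
x_before_commit = m.memory["x"]
m.commit(0)
x_after_commit = m.tx_read(1, "x")
```

Speculative data stays private to the writing processor's cache until commit, which is what makes an abort as cheap as invalidating the marked lines.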