Transactional Memory : Hardware Proposals Overview

1 Transactional Memory : Hardware Proposals Overview
Manu Awasthi Architecture Reading Club Fall 2006

2 Why do we care?
The rise of multicore architectures (CMPs)
  (Support for) lots of cheap threads available
  Synchronization will be an issue
  Concurrent updates on shared memory
Today’s methodologies (locks)
  Are not scalable
  Fail to exploit concurrency to the fullest

3 Why Locks are EVIL?
Locks: objects only one thread can hold at a time
  Organization: one lock for each shared structure
  Usage: (block) → acquire → access → release
Correctness issues
  Under-locking → data races
  Acquires in different orders → deadlock
Performance issues
  Conservative serialization
  Overhead of acquiring
  Difficult to find the right granularity
  Blocking
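
The deadlock risk from acquiring locks in different orders can be made concrete with a small sketch. It is only an illustration and not taken from the slides: `account_t`, `transfer`, and the use of C11 `atomic_flag` spinlocks are assumptions chosen to keep the example self-contained.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared structure, guarded by its own spinlock. */
typedef struct {
    atomic_flag lock;
    int balance;
} account_t;

static void acquire(account_t *a) {
    while (atomic_flag_test_and_set_explicit(&a->lock, memory_order_acquire)) {
        /* spin: this blocking is one of the costs listed on the slide */
    }
}

static void release(account_t *a) {
    atomic_flag_clear_explicit(&a->lock, memory_order_release);
}

/* Always take the lock at the lower address first, so that two concurrent
 * transfers (a to b and b to a) cannot each hold one lock while waiting
 * for the other. */
void transfer(account_t *from, account_t *to, int amount) {
    if (from == to) return;
    account_t *first  = ((uintptr_t)from < (uintptr_t)to) ? from : to;
    account_t *second = (first == from) ? to : from;
    acquire(first);
    acquire(second);
    from->balance -= amount;
    to->balance   += amount;
    release(second);
    release(first);
}
```

Every caller has to follow the same ordering rule for this to work, which is the kind of global discipline that makes fine-grained locking hard to get right.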

4 Example of evil Locks
struct Shared_Structure {
    int shared_var1;
    :
};

8 Coarse-Grained Locking
Easily made correct … But not scalable.

9 Fine-Grained Locking
More scalable
High overhead in acquire and release
Increased complexity

10 Enter Transactions…
Code segments with three features:
  Atomicity
  Serialization only on conflicts
  Rollback support
<begin_transaction> {
    statement_1;
    statement_2;
    statement_3;
    …..
} <end_transaction>
Generally, critical section = transaction
Atomic instructions
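
As a rough software analogy for "serialization only on conflicts", the sketch below runs a transaction body on a private snapshot and commits with a compare-and-swap, retrying only if another thread changed the value in the meantime. This is an assumption-laden stand-in for the `<begin_transaction>` / `<end_transaction>` markers, limited to a single word; `tx_update` and `add_ten` are hypothetical names.

```c
#include <stdatomic.h>

/* Applies f to the shared word as one atomic step.
 * Returns how many times the attempt had to be rolled back. */
int tx_update(_Atomic int *shared, int (*f)(int)) {
    int aborts = 0;
    for (;;) {
        int snapshot = atomic_load(shared);   /* begin: read speculatively   */
        int result = f(snapshot);             /* body runs on a private copy */
        if (atomic_compare_exchange_strong(shared, &snapshot, result))
            return aborts;                    /* commit: no conflict seen    */
        aborts++;                             /* conflict: discard and retry */
    }
}

/* Example transaction body. */
int add_ten(int v) { return v + 10; }
```

With no concurrent writer the loop commits on the first pass, so uncontended updates pay no serialization cost.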

11 Agenda
Transactions: what all the hoopla’s about
Research proposals
  Usages
  Implementations
Disclaimer 1: Covering only hardware support
Disclaimer 2: Purely an overview

12 Hardware Overview
Exploit cache coherence protocols
  Already do almost what we need
  Invalidation
  Consistency checking
Exploit speculative execution
  Branch prediction = optimistic synchro!

13 Execution Strategy
Four main components:
  Logging/buffering (speculative execution)
  Conflict detection
  Abort/rollback
  Commit
All papers present different methods of doing the above.
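
The four components can be sketched with a toy, single-transaction model. It assumes the buffering variant (speculative writes held aside until commit) and a tiny word-addressed memory; the type `stx_t` and the `stx_*` functions are hypothetical and do not correspond to any of the proposals covered here.

```c
#include <stdbool.h>
#include <string.h>

#define MEM_WORDS 8

/* Hypothetical per-transaction state. */
typedef struct {
    int  buffered[MEM_WORDS];   /* speculative values */
    bool written[MEM_WORDS];    /* write set          */
    bool read[MEM_WORDS];       /* read set           */
    bool aborted;
} stx_t;

void stx_begin(stx_t *tx) { memset(tx, 0, sizeof *tx); }

int stx_read(stx_t *tx, const int *mem, int addr) {
    tx->read[addr] = true;
    return tx->written[addr] ? tx->buffered[addr] : mem[addr];
}

/* 1. Logging/buffering: the write stays private until commit. */
void stx_write(stx_t *tx, int addr, int value) {
    tx->buffered[addr] = value;
    tx->written[addr] = true;
}

/* 2. Conflict detection: another agent wrote addr in shared memory.
 * 3. Abort: mark the transaction so its buffer is never published. */
void stx_observe_remote_write(stx_t *tx, int addr) {
    if (tx->read[addr] || tx->written[addr])
        tx->aborted = true;
}

/* 4. Commit: publish the buffer only if no conflict was seen. */
bool stx_commit(stx_t *tx, int *mem) {
    if (tx->aborted) return false;
    for (int i = 0; i < MEM_WORDS; i++)
        if (tx->written[i]) mem[i] = tx->buffered[i];
    return true;
}
```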

14 HW Transactional Memory
[Diagram: caches (tag T), interconnect, memory; labels: read, active]

15 Transactional Memory
[Diagram: caches (tags T, T), memory; labels: read, active, active]

16 Transactional Memory
[Diagram: caches (tags T, T), memory; labels: active, committed, active]

17 Transactional Memory
[Diagram: caches (tags T, D), memory; labels: write, committed, active]

18 Rewind
[Diagram: caches (tags T, T, D), memory; labels: write, aborted, active, active]

19 Transaction Commit
At commit point
  If no cache conflicts, we win.
Mark transactional entries
  Read-only: valid
  Modified: dirty (eventually written back)
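
The marking rule on this slide can be written out as a small state update. The sketch assumes a simplified three-state cache line plus two per-line flags, and it assumes that an abort invalidates lines the transaction modified; the slides do not spell out the abort rule, and all names here are hypothetical.

```c
#include <stdbool.h>

typedef enum { LINE_INVALID, LINE_VALID, LINE_DIRTY } line_state_t;

typedef struct {
    line_state_t state;
    bool transactional;   /* touched by the running transaction */
    bool modified;        /* written by the running transaction */
} cache_line_t;

/* Commit: read-only entries become valid, modified entries become dirty. */
void commit_lines(cache_line_t *lines, int n) {
    for (int i = 0; i < n; i++) {
        if (!lines[i].transactional) continue;
        lines[i].state = lines[i].modified ? LINE_DIRTY : LINE_VALID;
        lines[i].transactional = lines[i].modified = false;
    }
}

/* Abort (assumed rule): speculative data is dropped, read-only lines survive. */
void abort_lines(cache_line_t *lines, int n) {
    for (int i = 0; i < n; i++) {
        if (!lines[i].transactional) continue;
        if (lines[i].modified) lines[i].state = LINE_INVALID;
        lines[i].transactional = lines[i].modified = false;
    }
}
```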

20 But….
Limits to
  Transactional cache size
  Scheduling quantum
Transaction cannot commit if it is
  Too big
  Too slow
Actual limits platform-dependent

21 TLR/SLE
[Rajwar & Goodman, ASPLOS ‘02]
Transactional execution of critical sections.
Locks define the scope of a transaction
  Doesn’t change the programming model
H/W identifies and speculatively executes critical sections.
Timestamps provide serializability.

22 SLE
Mechanism to identify lock acquires and releases
Enabling mechanism for TLR
Concept of silent stores
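
The silent-store idea can be illustrated as follows: a lock acquire followed by a release writes the lock word and then restores its earlier value, so taken together the pair leaves memory unchanged and both stores can be skipped. The sketch below is a software caricature of that check under that assumption; `sle_state_t` and both functions are hypothetical and not the hardware mechanism from the paper.

```c
#include <stdbool.h>

/* Hypothetical record of a lock whose acquire store was elided. */
typedef struct {
    const int *lock_addr;
    int value_before;     /* lock value observed before the acquire store */
    bool eliding;
} sle_state_t;

/* On a store that looks like a lock acquire: remember the old value
 * and do not perform the store. */
void sle_on_acquire(sle_state_t *s, const int *lock_addr) {
    s->lock_addr = lock_addr;
    s->value_before = *lock_addr;
    s->eliding = true;
}

/* A later store to the same address completes a silent pair if it
 * restores the value seen before the acquire. */
bool sle_is_release(const sle_state_t *s, const int *addr, int stored_value) {
    return s->eliding && addr == s->lock_addr && stored_value == s->value_before;
}
```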

23 SLE Algo

24 SLE Algo

25 Livelocks

26 TLR Algo..

27 TCC @ Stanford
[Hammond+, ISCA ‘04 & ASPLOS ‘04]
Again, speculative transaction execution
  Identify transaction start and end
  Read set, write set
  Save architectural state
Check for conflicts on memory references
  Snoop over system bus to check for violations
Fold the commit state in a packet
  Send over system bus, commit
Centralized bus arbiter => scalability limits!
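
A minimal sketch of the read set / write set bookkeeping and the snoop check follows. It assumes one bit per cache line with only 64 lines tracked, and a commit packet that is simply the committer's write set; `tcc_node_t` and the `tcc_*` functions are hypothetical names, not the TCC hardware interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-node state: one bit per tracked cache line. */
typedef struct {
    uint64_t read_set;
    uint64_t write_set;
} tcc_node_t;

void tcc_load(tcc_node_t *n, unsigned line)  { n->read_set  |= UINT64_C(1) << line; }
void tcc_store(tcc_node_t *n, unsigned line) { n->write_set |= UINT64_C(1) << line; }

/* Fold the commit state into a packet and start a fresh transaction. */
uint64_t tcc_commit_packet(tcc_node_t *n) {
    uint64_t packet = n->write_set;
    n->read_set = n->write_set = 0;
    return packet;
}

/* A snooping node is violated if the packet hits a line it has
 * speculatively read; it must then roll back and re-execute. */
bool tcc_snoop_violates(const tcc_node_t *n, uint64_t packet) {
    return (n->read_set & packet) != 0;
}
```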

28 TCC – Programming Model
Divide into transactions
  Here, it’s the programmer’s job
  However, easier to do than locks. Why?
Specify order
  In case relative ordering of transaction commits matters (e.g.?)
  Assign phase numbers to transactions.
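
One way to read the phase-number rule: a transaction may commit only when no transaction with a lower phase number is still outstanding, while transactions in the same phase may commit in any order. The helper below sketches that check under this reading; `may_commit` and its parameters are hypothetical.

```c
#include <stdbool.h>

/* active_phases holds the phase number of every uncommitted transaction,
 * including the caller's own. */
bool may_commit(int my_phase, const int *active_phases, int n) {
    for (int i = 0; i < n; i++)
        if (active_phases[i] < my_phase)
            return false;   /* an older phase has not finished yet */
    return true;
}
```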

29 TCC Node

30 Some Results
Small read state (6-12 kB)
Write state (4-8 kB)
  Both of the above per benchmark, per processor
Significant speedup
Not so modest bandwidth requirements

31 UTM/LTM @ MIT
Most transactions are small
  99.9% touch 54 cache lines or fewer
  BUT, some go up to 8000 lines (!)
Thesis: transaction footprint should be unbounded
  Added ISA support for the same
Book-keeping in memory: transaction log
  Helps survive interrupts, process migration

32 So, What’s New?
ISA support
  XBEGIN pc
  XEND
Rollback support: rename table snapshot.
Xstate data structure for memory state
  Has log records of all active transactions
  Log = commit record + log entry vector
  Log pointer
  RW bit

33 Processor Modifications

34 The Xstate DS

35 Interesting Results

36 LogTM @ UW-Madison
Motivation: make the common case fast
  Commits are more frequent than aborts
Basic strategy: similar to UTM
  Store new values in place, old values in log
Log properties
  Per-thread log
  Cacheable, in virtual memory, i.e. part of the thread address space is reserved for logging.
  Log writes mostly cache hits (small transactions)
  Low TLB translation overhead (small transactions)
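
"New values in place, old values in log" can be sketched as an undo log. The fixed-capacity array and the names `undo_log_t`, `logged_store`, `log_commit`, and `log_abort` are assumptions made for illustration; the real design keeps the log in the thread's virtual memory rather than in a bounded array.

```c
#include <stddef.h>

#define LOG_CAPACITY 16

typedef struct { int *addr; int old_value; } log_entry_t;

/* Hypothetical per-thread undo log. */
typedef struct {
    log_entry_t entries[LOG_CAPACITY];
    size_t count;
} undo_log_t;

/* Transactional store: save the old value, then write in place.
 * Returns 0 on success, -1 if this fixed-size log is full. */
int logged_store(undo_log_t *log, int *addr, int value) {
    if (log->count == LOG_CAPACITY) return -1;
    log->entries[log->count++] = (log_entry_t){ addr, *addr };
    *addr = value;
    return 0;
}

/* Commit is the fast path: memory is already up to date, so only
 * the log needs to be discarded. */
void log_commit(undo_log_t *log) { log->count = 0; }

/* Abort is the slow path: restore old values, newest entry first. */
void log_abort(undo_log_t *log) {
    while (log->count > 0) {
        log_entry_t e = log->entries[--log->count];
        *e.addr = e.old_value;
    }
}
```

Walking the log in reverse matters when the same address is written more than once, because the oldest saved value must be the last one restored.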

37

38 Conflict Detection
Directory-based protocol
  Send request to directory
  Directory forwards requests to processors
  Each processor checks for conflicts
  Ack (no conflict), Nack (conflict)
  Resolve conflict based on responses.
Extended directory states
  For taking care of transactional line overflow
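
The per-processor check behind the Ack/Nack response can be sketched as below. It assumes each cache line carries a read bit and a write bit for the running transaction, which this slide does not state explicitly; the types and the function name are hypothetical.

```c
#include <stdbool.h>

typedef enum { REQ_READ, REQ_WRITE } req_t;
typedef enum { RESP_ACK, RESP_NACK } resp_t;

/* Assumed per-line bits set by the running transaction. */
typedef struct { bool r; bool w; } tx_bits_t;

/* Called when the directory forwards another processor's request. */
resp_t check_forwarded_request(tx_bits_t line, req_t req) {
    if (line.w) return RESP_NACK;                      /* we wrote it: any access conflicts */
    if (line.r && req == REQ_WRITE) return RESP_NACK;  /* we read it: a write conflicts     */
    return RESP_ACK;                                   /* no conflict                       */
}
```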

39 More Work
@ UW-Madison
VTM (Rajwar+)
Thread Level TM (Goodman+)
  Goal: persistent transactions with less overhead
  Approach: group transactions by process
  Implementation: buffer in cache + overflow table in virtual memory + various interesting optimizations

40 Summary
Transactions: promising approach to synchronization
  Simple interface + efficient implementation
  Uses: optimistic lock removal, lock-free data structures, general-purpose synchronization, parallelization, ??
Challenges
  Implementation
  Interface
  OS involvement
  I/O + rollback

