Unbounded Transactional Memory Paper by Ananian et al. of MIT CSAIL Presented by Daniel.

Outline
– Motivation
– UTM vs LTM
– UTM in detail: processor changes, transaction state data structure, operational description
– LTM: changes required, description
– Simulation Results

Motivation Transactional memory is great, but current implementations are saddled with hardware-imposed limitations. To provide ‘ease of programming’, transactional memory must allow arbitrarily sized transactions. Otherwise, TM can be as difficult to program with as locks, because programmers must work out how to break transactions up.

UTM vs LTM Unbounded Transactional Memory (UTM) is the first approach: more flexible, but more complicated and more costly in hardware. Large Transactional Memory (LTM) is a less costly compromise that still allows transactions bigger than the transactional cache, but no larger than physical memory.

UTM processor changes Two new processor instructions: transaction begin and transaction end.
– XBEGIN pc: begins a transaction, incrementing the transaction counter and saving the abort handler located at ‘pc’. Similar to a branch instruction.
– XEND: finishes the current transaction, atomically committing all data.

UTM processor changes (cont.) XBEGIN causes all physical registers currently in use to be marked ‘saved’, and the register rename table is saved. Upon graduation, saved physical registers are moved to a register-reserved list rather than the free list. Because inner transactions are flattened, only one copy of architectural state is saved.
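
The flattening behaviour described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware mechanism: the class name `FlatTransactionContext`, the `registers` dictionary standing in for architectural state, and the choice to keep the outermost abort handler are all hypothetical.

```python
class FlatTransactionContext:
    """Toy software model of XBEGIN/XEND with flattened nesting (illustrative only)."""

    def __init__(self):
        self.depth = 0             # transaction counter incremented by XBEGIN
        self.abort_handler = None  # handler saved by the outermost XBEGIN (assumption)
        self.snapshot = None       # the single saved copy of architectural state
        self.registers = {}        # stand-in for architectural registers
        self.committed = 0         # number of outermost transactions committed

    def xbegin(self, abort_handler):
        if self.depth == 0:
            # Only the outermost XBEGIN snapshots state, since nesting is flattened.
            self.snapshot = dict(self.registers)
            self.abort_handler = abort_handler
        self.depth += 1

    def xend(self):
        if self.depth == 0:
            raise RuntimeError("XEND without matching XBEGIN")
        self.depth -= 1
        if self.depth == 0:
            # Leaving the outermost transaction commits everything at once.
            self.snapshot = None
            self.abort_handler = None
            self.committed += 1

    def abort(self):
        if self.depth == 0:
            raise RuntimeError("abort outside a transaction")
        handler = self.abort_handler
        self.registers = self.snapshot  # restore the one saved copy of state
        self.depth = 0                  # the whole flattened nest is rolled back
        self.snapshot = None
        self.abort_handler = None
        return handler                  # control would transfer to this handler
```

An abort inside an inner transaction rolls back to the outermost XBEGIN, which is why one saved copy of state is enough in this model.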

UTM transaction state data structure A single data structure, the xstate, holds all transactional information and is stored in main memory.
xstate contains:
– All transaction logs
– For each block in memory (and each paged block), a log pointer and a read or write bit
Each active transaction gets a transaction log. A transaction log contains:
– Pointer to commit record (pending, aborted, committed)
– An array of log entries

UTM xstate (cont.) Each block of memory touched by a transaction gets a log entry. A log entry contains:
– Pointer to the block of memory
– The clean value (the block's contents before the transaction modified it)
– Pointer to the commit record
– A linked list of all log entries in all transaction logs referring to this block
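
The xstate layout listed above can be written down as plain data classes. This is a minimal sketch of the fields named on the slides, not the hardware encoding: all class and field names, the `touch` helper, and the use of the strings "R" and "W" for the read or write bit are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class CommitStatus(Enum):
    PENDING = "pending"
    ABORTED = "aborted"
    COMMITTED = "committed"


@dataclass
class CommitRecord:
    status: CommitStatus = CommitStatus.PENDING


@dataclass
class LogEntry:
    block_addr: int                  # pointer to the block of memory
    clean_value: bytes               # the block's value before modification
    commit_record: CommitRecord      # pointer to the owning transaction's commit record
    next_for_block: Optional["LogEntry"] = None  # linked list of entries for this block


@dataclass
class TransactionLog:
    commit_record: CommitRecord = field(default_factory=CommitRecord)
    entries: List[LogEntry] = field(default_factory=list)


@dataclass
class BlockState:
    log_pointer: Optional[LogEntry] = None  # head of this block's log-entry list
    rw_bit: str = "R"                       # "R" or "W" (hypothetical encoding)


@dataclass
class XState:
    logs: List[TransactionLog] = field(default_factory=list)
    blocks: Dict[int, BlockState] = field(default_factory=dict)

    def touch(self, log: TransactionLog, addr: int, clean_value: bytes, write: bool) -> LogEntry:
        """Record that the transaction owning `log` touched block `addr`.

        Conflict detection is deliberately left out of this sketch.
        """
        block = self.blocks.setdefault(addr, BlockState())
        entry = LogEntry(addr, clean_value, log.commit_record,
                         next_for_block=block.log_pointer)
        log.entries.append(entry)
        block.log_pointer = entry
        if write:
            block.rw_bit = "W"
        return entry
```

Following `blocks[addr].log_pointer` and then `commit_record` gives the status of a block, and `next_for_block` chains together every entry that refers to the same block.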

UTM’s xstate

UTM description The status of each block of memory is determined by following its log pointer, then following the commit record pointer. A commit consists of setting the commit record to committed, then deleting all log pointers that refer to the transaction log. Because the speculative value is stored in place in memory and the clean value is kept in the log, data must be restored only on abort, which optimizes for the common case of success.
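
The commit and abort paths can be sketched in a few lines. This is a toy model under stated assumptions, not the paper's implementation: `ToyUTM`, `Txn`, and the dictionaries standing in for memory and log pointers are hypothetical, untouched memory is assumed to read as 0, and conflicts between transactions are deliberately not handled.

```python
PENDING, COMMITTED, ABORTED = "pending", "committed", "aborted"


class Txn:
    def __init__(self):
        self.status = PENDING  # stands in for the commit record
        self.log = []          # (block address, clean value) pairs


class ToyUTM:
    def __init__(self):
        self.memory = {}       # block address -> current (possibly speculative) value
        self.log_pointer = {}  # block address -> transaction that logged the block

    def store(self, txn, addr, value):
        owner = self.log_pointer.get(addr)
        if owner is not None and owner is not txn:
            raise RuntimeError("conflict handling is outside this sketch")
        if owner is None:
            # First touch: save the clean value in the log before overwriting.
            txn.log.append((addr, self.memory.get(addr, 0)))
            self.log_pointer[addr] = txn
        self.memory[addr] = value  # the speculative value lives in place

    def block_status(self, addr):
        # Follow the log pointer, then the commit record.
        owner = self.log_pointer.get(addr)
        return None if owner is None else owner.status

    def commit(self, txn):
        txn.status = COMMITTED      # no data is copied on commit
        self._drop_log_pointers(txn)

    def abort(self, txn):
        txn.status = ABORTED
        for addr, clean in txn.log:
            self.memory[addr] = clean  # restore clean values only on abort
        self._drop_log_pointers(txn)

    def _drop_log_pointers(self, txn):
        for addr, _ in txn.log:
            del self.log_pointer[addr]
        txn.log.clear()
```

Commit touches only the commit record and the log pointers, while abort also walks the log to restore data, which matches the "optimize for success" point above.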

UTM description (cont.) When a transaction loads, it ensures that the block is not part of another transaction, or that the block's Read bit is set. When a transaction stores, it checks that the block belongs only to this transaction. In case of conflict, the newer transaction is aborted.
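
The load and store checks can be expressed as one small function. This is a sketch of one plausible reading of the rules above, not the actual protocol: the `Block` class, the `access` function, and the use of integer timestamps (lower means older) to decide which transaction is "newer" are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Block:
    owners: Set[int] = field(default_factory=set)  # timestamps of transactions using the block
    written: bool = False                          # False plays the role of the Read bit


def access(blocks: Dict[int, Block], addr: int, ts: int, write: bool) -> Set[int]:
    """Apply a load (write=False) or store (write=True) by transaction `ts`.

    Returns the set of transaction timestamps that must abort.
    Cleaning up the other blocks of an aborted transaction is not modeled.
    """
    block = blocks.setdefault(addr, Block())
    others = block.owners - {ts}
    # Loads conflict only with another transaction's write; stores need sole ownership.
    conflict = bool(others) and (write or block.written)
    if not conflict:
        block.owners.add(ts)
        block.written = block.written or write
        return set()
    if any(other < ts for other in others):
        # Some conflicting transaction is older, so the requester is the newer one.
        block.owners.discard(ts)
        return {ts}
    # The requester is the oldest: every other owner is newer and aborts.
    block.owners = {ts}
    block.written = write
    return set(others)
```

For example, two transactions may read the same block concurrently, but a later store by a third, newer transaction returns that transaction's own timestamp as the one to abort.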

UTM description (cont.) UTM supports caching as in earlier HTM systems. Transaction state is moved into the xstate only on overflow or a cache coherence conflict. UTM supports transactions as large as virtual memory by paging the xstate out to disk and using global virtual addresses.

LTM Transactions are limited to the size of physical memory. Transactions are aborted upon interrupts or thread migration. The only system changes required are to the cache and processor.

LTM modifications

LTM modifications (cont.) Each cache line now has a Transaction (T) bit, set when the line is read or written during a transaction. If a transactional cache line is evicted, an Overflow (O) bit is set.

LTM description Once a transaction starts, every cache line that is read or written has its T bit set. If an accessed cache line has the O bit set, main memory is checked for that line. The cache coherency protocol detects conflicts and proceeds as in other HTM systems. To commit, clear all T bits and write overflowed data back.
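
The T and O bits can be illustrated with a toy single-processor cache model. This is a sketch under several simplifying assumptions that are not from the slides: the class `ToyLTMCache` is hypothetical, the cache is fully associative with FIFO eviction, there is a single O bit for the whole cache, non-transactional writes go straight through to memory, and coherence-based conflict detection is omitted.

```python
class ToyLTMCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = {}      # addr -> {"value": v, "T": bool}; dict order = FIFO order
        self.o_bit = False   # set once a transactional line has been evicted
        self.overflow = {}   # stand-in for the overflow area in main memory
        self.memory = {}     # committed main memory; untouched addresses read as 0
        self.in_txn = False

    def begin(self):
        self.in_txn = True

    def _install(self, addr, value):
        if addr not in self.lines and len(self.lines) >= self.capacity:
            victim = next(iter(self.lines))
            line = self.lines.pop(victim)
            if line["T"]:
                # Evicting a transactional line: spill it and set the O bit.
                self.overflow[victim] = line["value"]
                self.o_bit = True
        self.lines[addr] = {"value": value, "T": self.in_txn}

    def read(self, addr):
        if addr in self.lines:
            value = self.lines[addr]["value"]
        elif self.o_bit and addr in self.overflow:
            value = self.overflow.pop(addr)  # O bit set: check main memory's overflow area
        else:
            value = self.memory.get(addr, 0)
        self._install(addr, value)
        return value

    def write(self, addr, value):
        self.overflow.pop(addr, None)  # any spilled copy is now stale
        self._install(addr, value)
        if not self.in_txn:
            self.memory[addr] = value  # write-through outside transactions (simplification)

    def commit(self):
        for addr, line in self.lines.items():
            if line["T"]:
                self.memory[addr] = line["value"]  # modeling shortcut, see note below
                line["T"] = False                  # clear all T bits
        self.memory.update(self.overflow)          # write overflowed data back
        self.overflow.clear()
        self.o_bit = False
        self.in_txn = False

    def abort(self):
        self.lines = {a: l for a, l in self.lines.items() if not l["T"]}
        self.overflow.clear()
        self.o_bit = False
        self.in_txn = False
```

Copying T lines to memory at commit is a shortcut of this write-through model; in hardware, clearing the T bits would leave the committed data valid in the cache.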

Simulation Results LTM offers much better scaling than Conditional Load/Store locks. Overhead is less than 10%. Time dealing with overflow is insignificant. There are some applications that need huge transactional memory footprints. TM does increase concurrency by decreasing serial regions, including in the Linux kernel.