Christopher J. Rossbach, Owen S. Hofmann, Donald E. Porter, Hany E. Ramadan, Aditya Bhandari, and Emmett Witchel - Presentation By Sathish P.


• What is TxLinux?
• What are the contributions of this paper?
• Locks vs. Transactions
• Transactional Memory
• Hardware Transactional Memory
• MetaTM
• Output Commit Problem
• Cooperative Transactional Locking
• Implementing Cxspinlocks
• Priority and Policy Inversion
• Transaction-Aware Scheduling
• Contention Management Performance
• Conclusion

• TxLinux is a variant of Linux.
• It is the first OS to use Hardware Transactional Memory (HTM) as a synchronization primitive.
• It is the first OS to manage HTM in the OS scheduler.

• The paper introduces the concept of cooperation between locks and transactions (cxspinlocks).
• The paper introduces integration of HTM with the OS scheduler.

    spin_lock(&aryList);
    offset = aryList[first];
    if (aryList[first] == aryList[last])
        aryList[last] = 0;
    spin_unlock(&aryList);
    if (!(calculateIfAnyZero()))
        goto failed;
    spin_lock(&aryList);
    list_add_tail(val, &aryList);
    spin_unlock(&aryList);

• Only one thread holds the lock at a time.
• Other threads spin while waiting for the lock.
• Critical regions must be kept as small as possible.
• Less concurrency.
• Works well under high contention and with I/O.
• Pessimistic concurrency control.

    xbegin;
    offset = aryList[first];
    if (aryList[first] == aryList[last])
        aryList[last] = 0;
    xend;
    if (!(calculateIfAnyZero()))
        goto failed;
    xbegin;
    list_add_tail(val, &aryList);
    xend;

• Many threads can run the critical section at the same time.
• On a conflict, only one thread wins and the others roll back.
• More concurrency.
• Critical regions can be large.
• Poorly suited to high contention.
• I/O performed inside a transaction cannot be rolled back.
• Optimistic concurrency control.

• Transactional memory applies the concept of transactions to memory operations.
• Each memory access inside a transaction follows these steps:
  ◦ Step 1: Check whether the same memory location is already part of another transaction.
  ◦ Step 2: If yes, abort the current transaction.
  ◦ Step 3: If no, record the memory location referenced by the current transaction so that other transactions can find it in Step 1.
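The three steps above can be sketched in software as a small ownership table. This is only an illustration: the names `tm_access`, `tm_release`, `tm_entry`, and `TM_SLOTS` are hypothetical, transactions are identified by plain integer ids, and every access is treated alike, whereas real systems usually distinguish reads from writes.

```c
#include <stdbool.h>
#include <stddef.h>

#define TM_SLOTS 64   /* hypothetical capacity of the ownership table */

/* One entry per tracked location: which transaction has touched it. */
typedef struct {
    const void *addr;
    int owner;        /* id of the transaction that recorded addr */
} tm_entry;

static tm_entry tm_table[TM_SLOTS];
static size_t tm_used = 0;

/* Returns true if transaction tx may proceed, false if it must abort. */
bool tm_access(int tx, const void *addr)
{
    /* Step 1: is the location already part of some transaction? */
    for (size_t i = 0; i < tm_used; i++) {
        if (tm_table[i].addr == addr) {
            if (tm_table[i].owner != tx)
                return false;   /* Step 2: owned by another tx, abort */
            return true;        /* our own earlier access, no conflict */
        }
    }
    if (tm_used == TM_SLOTS)
        return false;           /* table full: treated as an abort here */

    /* Step 3: record the location so later transactions can find it. */
    tm_table[tm_used].addr = addr;
    tm_table[tm_used].owner = tx;
    tm_used++;
    return true;
}

/* On commit or abort, forget every location recorded by tx. */
void tm_release(int tx)
{
    size_t kept = 0;
    for (size_t i = 0; i < tm_used; i++) {
        if (tm_table[i].owner != tx)
            tm_table[kept++] = tm_table[i];
    }
    tm_used = kept;
}
```

A caller would invoke `tm_access` before each load or store inside a transaction and `tm_release` when the transaction commits or aborts.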

• In HTM, transactions are implemented with hardware support.
• Transactional data is held in hardware registers and the cache, so all actions are performed atomically in hardware; data is written to main memory only when the transaction commits.
• If two hardware transactions access the same memory, a conflict occurs and the HTM aborts one of them.

• MetaTM is an architectural model on which TxLinux runs.
• MetaTM uses eager conflict detection: the first conflicting read or write to the same address causes a transaction to restart, rather than waiting until commit time to detect and handle conflicts.
• MetaTM provides the instructions:
  ◦ xbegin, xend, xpush, xpop

• MetaTM uses xbegin and xend to start and commit a transaction.
• MetaTM uses xpush to suspend a transaction, saving its state so it can continue later without restarting. Instructions executed after xpush are independent of the suspended transaction. A suspended transaction can still suffer conflicts, just like a running one; if it does, it restarts when it resumes.
• MetaTM uses xpop to resume a transaction that was suspended with xpush.
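As a rough mental model of the suspend and resume behavior described above, the following sketch simulates xbegin, xend, xpush, and xpop with a small stack of transaction ids. Everything here is an assumption made for illustration: the `cpu_tx_state` type, the `sim_*` function names, the `MAX_DEPTH` limit, and the choice to reject an xpop while another transaction is still active. Real MetaTM keeps this state in hardware.

```c
#include <stdbool.h>

#define MAX_DEPTH 8   /* hypothetical limit on suspended transactions */

typedef struct {
    int active_tx;               /* id of the running transaction, 0 if none */
    int suspended[MAX_DEPTH];    /* ids saved by xpush, most recent last */
    int depth;                   /* number of suspended transactions */
} cpu_tx_state;

/* xbegin: start a transaction with the given (nonzero) id. */
void sim_xbegin(cpu_tx_state *cpu, int tx) { cpu->active_tx = tx; }

/* xend: commit the running transaction. */
void sim_xend(cpu_tx_state *cpu) { cpu->active_tx = 0; }

/* xpush: save the running transaction and continue without it. */
bool sim_xpush(cpu_tx_state *cpu)
{
    if (cpu->depth == MAX_DEPTH)
        return false;
    cpu->suspended[cpu->depth++] = cpu->active_tx;
    cpu->active_tx = 0;          /* later code is independent of the saved tx */
    return true;
}

/* xpop: resume the most recently suspended transaction. */
bool sim_xpop(cpu_tx_state *cpu)
{
    if (cpu->depth == 0 || cpu->active_tx != 0)
        return false;            /* nothing saved, or a tx is still running */
    cpu->active_tx = cpu->suspended[--cpu->depth];
    return true;
}
```

The sketch leaves out conflict tracking for suspended transactions, which is what would force a restart on resume.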

• Operations such as I/O cannot be rolled back if the enclosing transaction hits a conflict.
• Transactions perform poorly under high contention.
• Hence the need to mix locks and transactions.

• To allow both transactions and locks in the OS, the paper proposes a synchronization API called cxspinlocks.
• Cxspinlocks allow different executions of a single critical section to be synchronized with either locks or transactions.
• This combines the concurrency of transactions with the safety of locks.
• Cxspinlocks support both transactional and non-transactional code while maintaining fairness and high concurrency.

• Multiple transactional threads can enter a critical region without conflicting on the lock variable.
• Transactional threads poll the cxspinlock using the xtest instruction, which lets the transaction avoid restarting when the lock is released.
• Non-transactional threads acquire cxspinlocks using the hardware instruction xcas. The contention manager arbitrates xcas, so its policy can favor transactional threads, mutually exclusive threads, readers, and so on.
• This enables fairness between transactional and non-transactional threads.

• A cxspinlock is acquired using one of two functions: cx_optimistic and cx_exclusive.
• cx_optimistic optimistically attempts to protect a critical section using a transaction and reverts to locking on a conflict or I/O.
• cx_exclusive is used for critical sections that always perform I/O.

    void cx_optimistic(lock) {
        status = xbegin;
        // Use mutual exclusion if required
        if (status == NEED_EXCLUSIVE) {
            xend;
            // xrestart for closed nesting
            if (xgettxid)
                xrestart(NEED_EXCLUSIVE);
            else
                cx_exclusive(lock);
            return;
        }
        // Spin waiting for lock to be free (==1)
        while (xtest(lock, 1) == 0); // spin
        disable_interrupts();
    }

• The status word is checked to determine whether this transaction has restarted with NEED_EXCLUSIVE; if so, the critical section is entered exclusively, using cx_exclusive.
• If mutual exclusion is not required, the thread waits for the spinlock to be unlocked, indicating that there are no non-transactional threads in the critical section.
• The code that polls the lock uses xtest to avoid adding the lock variable to the transaction's read set while spinning, which keeps the transaction from restarting when the lock is released.

    void cx_exclusive(lock) {
        // Only for non-transactional threads
        if (xgettxid)
            xrestart(NEED_EXCLUSIVE);
        while (1) {
            // Spin waiting for lock to be free
            while (lock != 1); // spin
            disable_interrupts();
            // Acquire lock by setting it to 0
            // Contention manager arbitrates lock
            if (xcas(lock, 1, 0))
                break;
            enable_interrupts();
        }
    }

• cx_exclusive uses xgettxid to detect an active transaction. If there is one, that transaction must become exclusive.
• To do so, the code issues xrestart with the status code NEED_EXCLUSIVE, indicating that exclusion is required.
• If there is no active transaction, the non-transactional thread enters the critical section by locking the cxspinlock with the xcas instruction.

• Locks can invert OS scheduling priority, leaving a higher-priority thread waiting for a lower-priority thread.
• The contention manager of an HTM system offers a solution to priority inversion.
• When transactions conflict, the contention manager can resolve the conflict in favor of the thread with higher priority.
• Another simple hardware contention management policy is timestamp: the oldest transaction wins.

• Operating systems allow real-time threads to synchronize with non-real-time threads.
• Such synchronization can cause policy inversion, where a real-time thread waits for a non-real-time thread.

• MetaTM implements the os_prio contention management policy to solve priority and policy inversion.
• os_prio first favors the transaction with the greatest scheduling value to the OS.
• When the scheduling priority values tie, os_prio employs SizeMatters.
• If the transaction sizes are also equal, os_prio employs timestamp.
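The three-level tie-breaking above can be sketched as a comparison function. The `tx_info` fields and the name `os_prio_winner` are hypothetical. The sketch also assumes that a larger `os_priority` value means a more important thread, that SizeMatters favors the transaction with the larger working set, and that timestamp favors the older transaction. It is not the MetaTM hardware logic.

```c
#include <stddef.h>

typedef struct {
    int os_priority;           /* assumed: larger value = more important */
    size_t size_bytes;         /* assumed SizeMatters metric: working-set size */
    unsigned long start_time;  /* smaller value = older transaction */
} tx_info;

/* Returns the transaction that wins a conflict between a and b. */
const tx_info *os_prio_winner(const tx_info *a, const tx_info *b)
{
    /* 1. OS scheduling priority decides first. */
    if (a->os_priority != b->os_priority)
        return a->os_priority > b->os_priority ? a : b;

    /* 2. On a priority tie, fall back to SizeMatters. */
    if (a->size_bytes != b->size_bytes)
        return a->size_bytes > b->size_bytes ? a : b;

    /* 3. On a size tie, fall back to timestamp: the older one wins. */
    return a->start_time <= b->start_time ? a : b;
}
```

Because priority is compared first, a transaction belonging to a lower-priority thread can never win against one belonging to a higher-priority thread, whatever their sizes or ages.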

• The operating system's scheduler uses each process's transaction state to mitigate the effects of high contention.
• MetaTM exposes a transaction status word that the scheduler reads to determine the status of the current transaction.
• Using this status information, the scheduler dynamically adjusts priority or deschedules processes to keep them from restarting repeatedly.
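One way to picture this decision is a small function that maps a status word to a scheduling action. The bit layout, the thresholds, and all names below (`TXS_ACTIVE`, `TXS_RESTARTS`, `tx_sched_decide`, and the `TXSCHED_*` actions) are invented for illustration. The slides do not specify the format of the status word or the exact heuristics TxLinux uses.

```c
#include <stdint.h>

/* Hypothetical status word layout: bit 0 = transaction active,
 * bits 8..15 = number of restarts of the current transaction. */
#define TXS_ACTIVE       0x1u
#define TXS_RESTARTS(w)  (((w) >> 8) & 0xFFu)

/* Hypothetical thresholds. */
#define RESTARTS_ADJUST     3u
#define RESTARTS_DESCHEDULE 10u

typedef enum {
    TXSCHED_KEEP,             /* leave the process as it is */
    TXSCHED_ADJUST_PRIORITY,  /* change its dynamic priority */
    TXSCHED_DESCHEDULE        /* take it off the CPU for now */
} txsched_action;

/* Decide what the scheduler should do with the current process. */
txsched_action tx_sched_decide(uint32_t status_word)
{
    if (!(status_word & TXS_ACTIVE))
        return TXSCHED_KEEP;              /* not in a transaction */

    unsigned restarts = TXS_RESTARTS(status_word);
    if (restarts >= RESTARTS_DESCHEDULE)
        return TXSCHED_DESCHEDULE;        /* stop repeated restarts */
    if (restarts >= RESTARTS_ADJUST)
        return TXSCHED_ADJUST_PRIORITY;   /* contention is building up */
    return TXSCHED_KEEP;
}
```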

• On average, 9.5% of all transactional conflicts are resolved in favor of the thread with lower OS priority when using a simple "SizeMatters" contention management policy.
• Using OS priority in contention management eliminates these inversions entirely, at a performance cost of 2.5% with the default Linux scheduler and 1.0% with a modified scheduler.

• The cxspinlock primitive is a solution to the long-standing problem of I/O in transactions.
• The cxspinlock API eases conversion from locking primitives to transactions.
• HTM-aware scheduling eliminates priority inversion and provides better management of very high contention.