On the Performance of Window-Based Contention Managers for Transactional Memory Gokarna Sharma and Costas Busch Louisiana State University.

Slides:



Advertisements
Similar presentations
Time-based Transactional Memory with Scalable Time Bases Torvald Riegel, Christof Fetzer, Pascal Felber Presented By: Michael Gendelman.
Advertisements

Raphael Eidenbenz Roger Wattenhofer Roger Wattenhofer Good Programming in Transactional Memory Game Theory Meets Multicore Architecture.
Steal-on-abort Improving Transactional Memory Performance through Dynamic Transaction Reordering Mohammad Ansari University of Manchester.
SLA-Oriented Resource Provisioning for Cloud Computing
Presented by: Dmitri Perelman.  Intro  “Don’t touch my read-set” approach  “Precedence graphs” approach  On avoiding spare aborts  Your questions.
IDIT KEIDAR DMITRI PERELMAN RUI FAN EuroTM 2011 Maintaining Multiple Versions in Software Transactional Memory 1.
Evaluating Database-Oriented Replication Schemes in Software Transacional Memory Systems Roberto Palmieri Francesco Quaglia (La Sapienza, University of.
Transactional Contention Management as a Non-Clairvoyant Scheduling Problem Alessia Milani [Attiya et al. PODC 06] [Attiya and Milani OPODIS 09]
Concurrent Data Structures in Architectures with Limited Shared Memory Support Ivan Walulya Yiannis Nikolakopoulos Marina Papatriantafilou Philippas Tsigas.
Transactional Memory (TM) Evan Jolley EE 6633 December 7, 2012.
Variability in Architectural Simulations of Multi-threaded Workloads Alaa R. Alameldeen and David A. Wood University of Wisconsin-Madison
Progress Guarantee for Parallel Programs via Bounded Lock-Freedom Erez Petrank – Technion Madanlal Musuvathi- Microsoft Bjarne Steensgaard - Microsoft.
Presented by: Ofer Kiselov & Omer Kiselov Supervised by: Dmitri Perelman Final Presentation.
TOWARDS A SOFTWARE TRANSACTIONAL MEMORY FOR GRAPHICS PROCESSORS Daniel Cederman, Philippas Tsigas and Muhammad Tayyab Chaudhry.
DMITRI PERELMAN IDIT KEIDAR TRANSACT 2010 SMV: Selective Multi-Versioning STM 1.
Idit Keidar and Dmitri Perelman Technion 1 SPAA 2009.
Lock-free Cuckoo Hashing Nhan Nguyen & Philippas Tsigas ICDCS 2014 Distributed Computing and Systems Chalmers University of Technology Gothenburg, Sweden.
Database Replication techniques: a Three Parameter Classification Authors : Database Replication techniques: a Three Parameter Classification Authors :
1 Johannes Schneider Transactional Memory: How to Perform Load Adaption in a Simple And Distributed Manner Johannes Schneider David Hasenfratz Roger Wattenhofer.
Distributed Systems 2006 Styles of Client/Server Computing.
1 MetaTM/TxLinux: Transactional Memory For An Operating System Hany E. Ramadan, Christopher J. Rossbach, Donald E. Porter and Owen S. Hofmann Presenter:
Lock vs. Lock-Free memory Fahad Alduraibi, Aws Ahmad, and Eman Elrifaei.
University of Michigan Electrical Engineering and Computer Science 1 Parallelizing Sequential Applications on Commodity Hardware Using a Low-Cost Software.
Transactional contention Management as a Non- Clairvoyant Scheduling Problem Hagit Attiya, Leah Epstein, Hadas Shachnai, Tami Tamir Presented by Anastasia.
Selfishness in Transactional Memory Raphael Eidenbenz, Roger Wattenhofer Distributed Computing Group Game Theory meets Multicore Architecture.
1 Scalable Transactional Memory Scheduling Gokarna Sharma (A joint work with Costas Busch) Louisiana State University.
A Dynamic Elimination-Combining Stack Algorithm Gal Bar-Nissan, Danny Hendler and Adi Suissa Department of Computer Science, BGU, January 2011 Presnted.
By- Jaideep Moses, Ravi Iyer , Ramesh Illikkal and
Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.
A Transaction-Friendly Dynamic Memory Manager for Embedded Multicore Systems Maurice Herlihy Joint with Thomas Carle, Dimitra Papagiannopoulou Iris Bahar,
An Introduction to Software Transactional Memory
Authors: Tong Li, Dan Baumberger, David A. Koufaty, and Scott Hahn [Systems Technology Lab, Intel Corporation] Source: 2007 ACM/IEEE conference on Supercomputing.
Efficient Scheduling of Heterogeneous Continuous Queries Mohamed A. Sharaf Panos K. Chrysanthis Alexandros Labrinidis Kirk Pruhs Advanced Data Management.
Task Scheduling for Highly Concurrent Analytical and Transactional Main-Memory Workloads Iraklis Psaroudakis (EPFL), Tobias Scheuer (SAP AG), Norman May.
Software Transactional Memory for Dynamic-Sized Data Structures Maurice Herlihy, Victor Luchangco, Mark Moir, William Scherer Presented by: Gokul Soundararajan.
Window-Based Greedy Contention Management for Transactional Memory Gokarna Sharma (LSU) Brett Estrade (Univ. of Houston) Costas Busch (LSU) 1DISC 2010.
Architectural Support for Fine-Grained Parallelism on Multi-core Architectures Sanjeev Kumar, Corporate Technology Group, Intel Corporation Christopher.
1 Parallelizing FPGA Placement with TMSteffan Parallelizing FPGA Placement with Transactional Memory Steven Birk*, Greg Steffan**, and Jason Anderson**
Reduced Hardware NOrec: A Safe and Scalable Hybrid Transactional Memory Alexander Matveev Nir Shavit MIT.
School of Information Technologies Michael Cahill 1, Uwe Röhm and Alan Fekete School of IT, University of Sydney {mjc, roehm, Serializable.
A Qualitative Survey of Modern Software Transactional Memory Systems Virendra J. Marathe Michael L. Scott.
Integrating and Optimizing Transactional Memory in a Data Mining Middleware Vignesh Ravi and Gagan Agrawal Department of ComputerScience and Engg. The.
Low-Overhead Software Transactional Memory with Progress Guarantees and Strong Semantics Minjia Zhang, 1 Jipeng Huang, Man Cao, Michael D. Bond.
Transactional Lee’s Algorithm 1 A Study of a Transactional Routing Algorithm Ian Watson, Chris Kirkham & Mikel Lujan School of Computer Science University.
Wait-Free Multi-Word Compare- And-Swap using Greedy Helping and Grabbing Håkan Sundell PDPTA 2009.
CS510 Concurrent Systems Why the Grass May Not Be Greener on the Other Side: A Comparison of Locking and Transactional Memory.
Technology from seed Exploiting Off-the-Shelf Virtual Memory Mechanisms to Boost Software Transactional Memory Amin Mohtasham, Paulo Ferreira and João.
Parallelism without Concurrency Charles E. Leiserson MIT.
Software Transactional Memory Should Not Be Obstruction-Free Robert Ennals Presented by Abdulai Sei.
Improving the Performance Competitive Ratios of Transactional Memory Contention Managers Gokarna Sharma Costas Busch Louisiana State University, USA WTTM.
Vertex Coloring Distributed Algorithms for Multi-Agent Networks
Shouqing Hao Institute of Computing Technology, Chinese Academy of Sciences Processes Scheduling on Heterogeneous Multi-core Architecture.
Lecture 27 Multiprocessor Scheduling. Last lecture: VMM Two old problems: CPU virtualization and memory virtualization I/O virtualization Today Issues.
ECE 1747: Parallel Programming Short Introduction to Transactions and Transactional Memory (a.k.a. Speculative Synchronization)
Window-Based Greedy Contention Management for Transactional Memory Gokarna Sharma (LSU) Brett Estrade (Univ. of Houston) Costas Busch (LSU) DISC
Spark on Entropy : A Reliable & Efficient Scheduler for Low-latency Parallel Jobs in Heterogeneous Cloud Huankai Chen PhD Student at University of Kent.
Transactional Contention Management as a Non-Clairvoyant Scheduling Problem Hagit Attiya, Alessia Milani Technion, Haifa-LABRI, University of Bordeaux.
Maurice Herlihy and J. Eliot B. Moss,  ISCA '93
Irina Calciu Justin Gottschlich Tatiana Shpeisman Gilles Pokam
Algorithmic Improvements for Fast Concurrent Cuckoo Hashing
Transactional Memory: How to Perform Load Adaption
PHyTM: Persistent Hybrid Transactional Memory
Timothy Zhu and Huapeng Zhou
Challenges in Concurrent Computing
Department of Computer Science University of California, Santa Barbara
Gokarna Sharma Costas Busch Louisiana State University, USA
Yiannis Nikolakopoulos
CS510 - Portland State University
Department of Computer Science University of California, Santa Barbara
Dynamic Performance Tuning of Word-Based Software Transactional Memory
Presentation transcript:

On the Performance of Window-Based Contention Managers for Transactional Memory Gokarna Sharma and Costas Busch Louisiana State University

Agenda Introduction and Motivation Previous Studies and Limitations Execution Window Model  Theoretical Results  Experimental Results Conclusions and Future Directions

Retrospective 1993  A seminal paper by Maurice Herlihy and J. Eliot B. Moss: “Transactional Memory: Architectural Support for Lock-Free Data Structures” Today  Several STM/HTM implementation efforts by Intel, Sun, IBM; growing attention Why TM?  Many drawbacks of traditional approaches using Locks, Monitors: error-prone, difficult, composability, … lock data modify/use data unlock data Lock: only one thread can execute TM: many threads can execute atomic { modify/use data }

Transactional Memory Transactions perform a sequence of read and write operations on shared resources and appear to execute atomically TM may allow transactions to run concurrently but the results must be equivalent to some sequential execution Example: ACI(D) properties to ensure correctness Initially, x == 1, y == 2 atomic { x = 2; y = x+1; } atomic { r1 = x; r2 = y; } T1 T2 T1 then T2 r1==2, r2==3 T2 then T1 r1==1, r2==2 x = 2; y = 3; T1 r1 == 1 r2 = 3; T2 Incorrect r1 == 1, r2 == 3

Software TM Systems Conflicts:  A contention manager decides  Aborts or delay a transaction Centralized or Distributed:  Each thread may have its own CM Example: atomic { … x = 2; } atomic { y = 2; … x = 3; } T1 T2 Initially, x == 1, y == 1 conflict Abort undo changes (set x==1) and restart atomic { … x = 2; } atomic { y = 2; … x = 3; } T1 T2 conflict Abort (set y==1) and restart OR wait and retry

Transaction Scheduling The most common model:  m concurrent transactions on m cores that share s objects  Sequence of operations and a operation takes one time unit  Duration is fixed Throughput Guarantees:  Makespan: the time needed to commit all m transactions  Makespan of my CM Makespan of optimal CM Problem Complexity:  NP-Hard (related to vertex coloring) Challenge:  How to schedule transactions so that makespan is minimized? Competitive Ratio :

Literature Lots of proposals  Polka, Priority, Karma, SizeMatters, … Drawbacks  Some need globally shared data (i.e., global clock)  Workload dependent  Many have no theoretical provable properties i.e., Polka – but overall good empirical performance Mostly empirical evaluation using different benchmarks  Choice of a contention manager significantly affects the performance  Do not perform well in the worst-case (i.e., contention, system size, and number of threads increase)

Literature on Theoretical Bounds

Objectives Scalable transactional memory scheduling:  Design contention managers that exhibit both good theoretical and empirical performance guarantees  Design contention managers that scale well with the system size and complexity

1 23 n n m m Transactions... Threads Execution Window Model Collection of n sets of m concurrent transactions that share s objects Assuming maximum degree in conflict graph C and execution time duration τ Serialization upper bound: τ. min(Cn,mn) One-shot bound: O(sn) [Attiya et al., PODC’06] Using RandomizedRounds: O(τ. Cn log m)

Theoretical Results Offline Algorithm: (maximal independent sets)  For scheduling with conflicts environments, i.e., traffic intersection control, dining philosophers problem  Makespan: O(τ. (C + n log (mn)), (C is the conflict measure)  Competitive ratio: O(s + log (mn)) whp Online Algorithm: (random priorities)  For online scheduling environments  Makespan: O(τ. (C log (mn) + n log 2 (mn)))  Competitive ratio: O(s log (mn) + log 2 (mn))) whp Adaptive Algorithm  Conflict graph and maximum degree C both not known  Adaptively guesses C starting from 1

Intuition (1) Introduce random delays at the beginning of the execution window 1 23 n n m m Transactions... n n’ Random interval n m Random delays help conflicting transactions shift avoiding many conflicts

Intuition (2) Frame based execution to handle conflicts m Frame size q1q1 q2q2 q3q3 q4q4 F 11 F 12 F 1n F 21 F 31 F 41 F m1 F 3n Threads Makespan: max {qi} + No of frames X frame size

Experimental Results (1) Platform used  Intel i7 (4-core processor) with 8GB RAM and hyperthreading on Implemented window algorithms in DSTM2, an eager conflict management STM implementation Benchmarks used  List, RBTree, SkipList, and Vacation from STAMP suite. Experiments were run for 10 seconds and the data plotted are average of 6 experiments Contention managers used for comparison  Polka – Published best CM but no theoretical provable properties  Greedy – First CM with both theoretical and empirical properties  Priority – Simple priority-based CM

Experimental Results (2) Performance throughput:  No of txns committed per second  Measures the useful work done by a CM each time step

Experimental Results (3) Performance throughput: Conclusion #1: Window CMs always improve throughput over Greedy and Priority Conclusion #2: Throughput is comparable to Polka (outperforms in Vacation)

Experimental Results (4) Aborts per commit ratio:  No of txns aborted per txn commit  Measures efficiency of a CM in utilizing computing resources

Experimental Results (5) Aborts per commit ratio: Conclusion #3: Window CMs always reduce no of aborts over Greedy and Priority Conclusion #4: No of aborts are comparable to Polka (outperform in Vacation)

Experimental Results (6) Execution time overhead:  Total time needed to commit all transactions  Measures scalability of a CM in different contention scenarios

Experimental Results (7) Execution time overhead: Conclusion #5: Window CMs generally reduce execution time over Greedy and Priority (except SkipList) Conclusion #6: Window CMs good at high contention due to randomization overhead

Future Directions Encouraging theoretical and practical results Plan to explore (experimental)  Wasted Work  Repeat Conflicts  Average Response Time  Average committed transactions durations Plan to do experiments using more complex benchmarks  E.g., STAMP, STMBench7, and other STM implementations Plan to explore (theoretical)  Other contention managers with both theoretical and empirical guarantees

Conclusions TM contention management is an important online scheduling problem Contention managers should scale with the size and complexity of the system Theoretical as well as practical performance guarantees are essential for design decisions Need to explore mechanisms that scale well in other multi-core architectures:  ccNUMA and hierarchical multilevel cache architectures  Large scale distributed systems

Thank you for your attention!!!