TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems. Presented by: Blair Fort, Oct. 28, 2004.

Overview: Introduction and Motivation; Implementation; Experiments and Results; Conclusions; My two cents

Introduction: TreadMarks is a distributed shared memory (DSM) system for Unix workstations connected by an ATM or Ethernet network.

Cluster Configuration

Distributed Shared Memory

Motivation: No widely available DSM system existed. Eliminate the problems of other systems: poor portability, poor performance, and false sharing.

Goals: Ease of use; portability; good performance. Also show that it works for real programs.

Ease of Use: Looks a lot like pthreads; implicit message passing; implicit process creation.

Portability: Uses only standard Unix system calls, for message passing and for memory management.

Performance: Two problems to address: false sharing and excessive message passing.

Conventional DSM Implementation

Sequential vs. Release Consistency. Sequential consistency: every write is broadcast; more message passing. Release consistency: writes are broadcast only at synchronization points; more memory overhead.
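The trade-off on this slide can be made concrete with a rough message count. The sketch below is an illustration only, not the TreadMarks protocol: it assumes each broadcast costs one message per other node, and that release consistency sends one broadcast per synchronization point that has buffered writes. The function names and the cost model are hypothetical.

```python
from typing import List


def messages_sequential(num_nodes: int, writes_per_interval: List[int]) -> int:
    """Assumed model: every write is broadcast to all other nodes."""
    return sum(w * (num_nodes - 1) for w in writes_per_interval)


def messages_release(num_nodes: int, writes_per_interval: List[int]) -> int:
    """Assumed model: writes are buffered and broadcast once per
    synchronization point, and only if the interval had any writes."""
    return sum(num_nodes - 1 for w in writes_per_interval if w > 0)


# Three intervals between synchronization points, on 8 nodes.
intervals = [10, 0, 5]
print(messages_sequential(8, intervals))  # 15 writes * 7 nodes = 105
print(messages_release(8, intervals))     # 2 non-empty intervals * 7 nodes = 14
```

The saving comes at the price named on the slide: the buffered writes have to be held in memory until the synchronization point.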

Read-Write False Sharing w(x) r(y) r(x) w(x)

Read-Write False Sharing w(x) r(y) r(x) synch

Write-Write False Sharing w(x) w(y) r(x) synch w(x)

Multiple-Writer False Sharing w(x) w(y) r(x) synch w(x)

Eager vs. Lazy RC. Eager RC: sends messages at the release of a lock or at barriers; broadcasts messages to all nodes. Lazy RC: sends messages when locks are acquired; the message goes only to the node that requires it.
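A small simulation can show the difference in who receives what. This is a simplified sketch under stated assumptions: a single lock is passed along a fixed order of holders, each holder performs one write, an eager release costs one message per other node, and a lazy acquire costs one message from the last releaser. The function `simulate` and the notice labels are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple


def simulate(num_nodes: int, lock_order: List[int],
             lazy: bool) -> Tuple[int, Dict[int, Set[str]]]:
    """Pass one lock along lock_order; each holder writes once, then releases.

    Returns (messages sent, write notices known to each node).
    """
    known: Dict[int, Set[str]] = {n: set() for n in range(num_nodes)}
    messages = 0
    last_releaser: Optional[int] = None
    for step, holder in enumerate(lock_order):
        if lazy and last_releaser is not None and last_releaser != holder:
            # Lazy RC: the acquirer receives notices from the last releaser only.
            known[holder] |= known[last_releaser]
            messages += 1
        known[holder].add(f"w{step}@p{holder}")
        if not lazy:
            # Eager RC: the releaser pushes its notices to every other node.
            for other in range(num_nodes):
                if other != holder:
                    known[other] |= known[holder]
                    messages += 1
        last_releaser = holder
    return messages, known


eager_msgs, eager_known = simulate(4, [0, 1, 0, 1], lazy=False)
lazy_msgs, lazy_known = simulate(4, [0, 1, 0, 1], lazy=True)
print(eager_msgs, lazy_msgs)             # 12 3
print(len(eager_known[2]), lazy_known[2])  # 4 set()
```

Nodes 2 and 3 never touch the lock, so under the lazy model they receive nothing, while the eager model updates them on every release.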

Eager vs. Lazy RC

Memory Consistency: Done by creating diffs. Eager RC creates diffs at barriers; lazy RC creates diffs at the first use of a page.
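The diff mechanism can be sketched as follows: a twin is an unmodified copy of a page taken before it is written, and a diff records the bytes where the page now differs from its twin. The run-based encoding, the 16-byte page, and the helper names `make_twin`, `make_diff`, and `apply_diff` are assumptions made for illustration, not the actual TreadMarks data layout.

```python
from typing import List, Tuple

Diff = List[Tuple[int, bytes]]  # runs of (offset, new bytes)


def make_twin(page: bytearray) -> bytes:
    """Copy the page before the first write so changes can be found later."""
    return bytes(page)


def make_diff(twin: bytes, page: bytearray) -> Diff:
    """Compare the page with its twin and record each run of changed bytes."""
    diff: Diff = []
    i, n = 0, len(page)
    while i < n:
        if page[i] != twin[i]:
            start = i
            while i < n and page[i] != twin[i]:
                i += 1
            diff.append((start, bytes(page[start:i])))
        else:
            i += 1
    return diff


def apply_diff(page: bytearray, diff: Diff) -> None:
    """Overwrite only the byte ranges named in the diff."""
    for offset, data in diff:
        page[offset:offset + len(data)] = data


# Two writers modify disjoint parts of the same page (false sharing).
original = bytearray(16)
page_a, page_b = bytearray(original), bytearray(original)
twin_a, twin_b = make_twin(page_a), make_twin(page_b)
page_a[0:2] = b"AA"
page_b[8:11] = b"BBB"
diff_a, diff_b = make_diff(twin_a, page_a), make_diff(twin_b, page_b)

merged = bytearray(original)
apply_diff(merged, diff_a)
apply_diff(merged, diff_b)
print(diff_a, diff_b)  # [(0, b'AA')] [(8, b'BBB')]
```

Because each diff carries only the bytes its writer changed, both writers can update the same page concurrently and the results merge without either overwriting the other.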

Twin Creation

Diff Organization

Vector Timestamps [diagram: p1 performs w(x), rel; p2 performs acq, w(y), rel; p3 performs acq, r(x), r(y)]
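The scenario on this slide can be traced with a minimal vector-timestamp sketch. It assumes one entry per process counting that process's completed intervals, with a release closing the current interval and an acquire taking the element-wise maximum. The function names and the mapping of p1, p2, p3 to indices 0, 1, 2 are hypothetical.

```python
from typing import List, Tuple


def release(clock: List[int], pid: int) -> List[int]:
    """Close the current interval on pid and return the stamp sent with the lock."""
    clock[pid] += 1
    return list(clock)


def acquire(clock: List[int], releaser_stamp: List[int]) -> List[Tuple[int, int]]:
    """Merge the releaser's stamp; return the (process, interval) pairs
    the acquirer had not yet seen."""
    missing = [(p, k)
               for p in range(len(clock))
               for k in range(clock[p] + 1, releaser_stamp[p] + 1)]
    for p, value in enumerate(releaser_stamp):
        clock[p] = max(clock[p], value)
    return missing


p1, p2, p3 = [0, 0, 0], [0, 0, 0], [0, 0, 0]

stamp1 = release(p1, 0)             # p1: w(x), rel
seen_by_p2 = acquire(p2, stamp1)    # p2: acq
stamp2 = release(p2, 1)             # p2: w(y), rel
seen_by_p3 = acquire(p3, stamp2)    # p3: acq, then r(x), r(y)

print(seen_by_p2)  # [(0, 1)]
print(seen_by_p3)  # [(0, 1), (1, 1)]
```

p3 synchronizes only with p2, yet the merged timestamp tells it that p1's interval containing w(x) also precedes its reads.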

Diff chain in Proc 4

Garbage Collection: Used to merge all diffs and recover memory. Occurs only at barriers. Every node that has a copy of a page must have all diffs of that page.
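As a rough illustration of why merging lets memory be recovered: once every diff in a page's chain has been applied to the local copy, the chain itself is no longer needed. The sketch below assumes diffs are lists of (offset, bytes) runs applied in order. `collect_garbage` is a hypothetical helper, not a TreadMarks routine.

```python
from typing import List, Tuple

Diff = List[Tuple[int, bytes]]  # runs of (offset, new bytes)


def collect_garbage(page: bytearray, diff_chain: List[Diff]) -> int:
    """Apply every diff in order, then drop the chain.

    Returns the number of diff payload bytes that were freed.
    """
    freed = 0
    for diff in diff_chain:
        for offset, data in diff:
            page[offset:offset + len(data)] = data
            freed += len(data)
    diff_chain.clear()
    return freed


page = bytearray(b"........")
chain: List[Diff] = [[(0, b"ab")], [(1, b"XY"), (6, b"z")]]
freed = collect_garbage(page, chain)
print(page, freed, chain)  # bytearray(b'aXY...z.') 5 []
```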

Testing Platform: 8 DECstation-5000/240s running Ultrix V4.3. Networks: 100 Mbps ATM and 10 Mbps Ethernet.

Test Programs: Water (modified, from SPLASH); Jacobi; TSP; Quicksort; ILINK.

Unix Overhead

TreadMarks Overhead

Network Comparison - Water

Lazy vs Eager RC

Message Rate

Data Rate

Diff Creation Rate

Conclusions: An automated distributed shared memory system works for real programs! LRC improves performance over ERC in most cases.

My Thoughts: Good design that promotes re-use. Would like to see a comparison against hand-coding the message passing. Why not a partial merging of diffs?

Comments/Questions