Distributed Shared Memory

Distributed Shared Memory Chapter 9

Introduction Distributed shared memory (DSM) makes the main memory of a cluster of computers look as though it is a single memory with a single address space. Shared-memory programming techniques can then be used.

1. Distributed Shared Memory Messages or other mechanisms are still needed to get data to each processor, but these are hidden from the programmer.

1. Distributed Shared Memory Advantages of DSM
- System is scalable
- Hides the message passing - no need to explicitly specify message sending between processes
- Can use simple extensions to sequential programming
- Can handle complex and large databases without replication or sending the data to processes

1. Distributed Shared Memory Disadvantages of DSM
- May incur a performance penalty
- Must provide protection against simultaneous access to shared data (locks, etc.)
- Little programmer control over the actual messages being generated
- Irregular problems in particular may be difficult to make perform well

2. Implementing Distributed Shared Memory
- Hardware: special network interfaces and cache coherence circuits
- Software: modifying the OS kernel, or adding a software layer between the operating system and the application - the most convenient way for teaching purposes

2.1 Software DSM Systems
- Page based - using the system's virtual memory
- Shared variable approach - using routines to access shared variables
- Object based - shared data within a collection of objects; access to shared data through an object-oriented discipline (ideally)

2.1 Software DSM Systems Software Page Based DSM Implementation

2.1 Software DSM Systems Some Software DSM Systems
- TreadMarks - page based DSM system; apparently not now available
- JIAJIA - C based; some users have said it required significant modifications to work (in message-passing calls)
- Adsmith - object based; C++ library routines

2.2 Hardware DSM Implementation Special network interfaces and cache coherence circuits are added to the system to make a reference to a remote memory location look like a reference to a local memory location. Typical examples are clusters of SMP machines.

2.3 Managing Shared Data There are several ways that a processor could be given access to shared data. The simplest solution is to have a central server responsible for all read and write requests. Problem - the server becomes a bottleneck. Solution?

2.3 Managing Shared Data Multiple copies of data allow simultaneous access by different processors. How do you maintain these copies using a coherence policy? One option is Multiple Reader/Single Writer. When the writer changes the data there are two choices:
- Update policy
- Invalidate policy

2.3 Managing Shared Data Another option is Multiple Reader / Multiple Writer This is the most complex scenario.

2.4 Multiple Reader/Single Writer Policy in a Page-Based System Remember that a page holds more than one variable. Problem: processors A and B can change different variables in the same page (false sharing). How do you deal with the consistency?

3. Achieving Consistent Memory in a DSM System A memory consistency model specifies when a write to a shared variable becomes visible to other processors. There are various models with progressively fewer constraints, which allow higher performance.

3. Achieving Consistent Memory in a DSM System Consistency Models
- Strict consistency - a processor sees the most recent update, i.e. a read returns the most recent write to the location.
- Sequential consistency - the result of any execution is the same as some interleaving of the individual programs.
- Relaxed consistency - delay making a write visible in order to reduce messages.

3. Achieving Consistent Memory in a DSM System Consistency Models (cont)
- Weak consistency - the programmer must use synchronization operations to enforce sequential consistency when necessary.
- Release consistency - the programmer must use specific synchronization operators, acquire and release.
- Lazy release consistency - updates are only propagated at the time of an acquire.

3. Achieving Consistent Memory in a DSM System Strict Consistency Every write is immediately visible. Disadvantages: the number of messages, the latency, and many of the messages may be unnecessary.

3. Achieving Consistent Memory in a DSM System Release Consistency An extension of weak consistency in which the synchronization operations are specified:
- acquire operation - used before a shared variable or variables are to be read.
- release operation - used after the shared variable or variables have been altered (written); allows another process to access the variable(s).
Typically acquire is done with a lock operation and release with an unlock operation (although not necessarily).

3. Achieving Consistent Memory in a DSM System Release Consistency (Figure: arrows show messages.)

3. Achieving Consistent Memory in a DSM System Lazy Release Consistency Messages are sent only on acquire. Advantage: fewer messages.

4. Distributed Shared Memory Programming Primitives Four fundamental primitives:
- Process/thread creation (and termination)
- Shared-data creation
- Mutual-exclusion synchronization (controlled access to shared data)
- Process/thread and event synchronization

4.1 Process Creation Simple routines such as: dsm_spawn(filename, num_processes); dsm_wait();

4.2 Shared Data Creation Routines to construct shared data: dsm_malloc(); dsm_free();

4.3 Shared Data Access In a DSM system employing a relaxed-consistency model (most DSM systems): dsm_lock(lock1); dsm_refresh(sum); (*sum)++; dsm_flush(sum); dsm_unlock(lock1); (Note that with sum a pointer to the shared value, the increment must be (*sum)++; writing *sum++ would increment the pointer instead.)

4.4 Synchronization Access Two types must be provided:
- Global synchronization
- Process-pair synchronization
Typically done with identifiers to barriers: dsm_barrier(identifier);

4.5 Features to Improve Performance Overlapping computation with communication: dsm_prefetch(sum); This tells the system to fetch the variable because we will need it soon. Similar to the speculative loads used in processor/cache algorithms.

4.5 Features to Improve Performance Reducing the Number of Messages New way: dsm_acquire(sum); (*sum)++; dsm_release(sum); Old way: dsm_lock(lock1); dsm_refresh(sum); (*sum)++; dsm_flush(sum); dsm_unlock(lock1);

5. Distributed Shared Memory Programming Uses the same concepts as shared memory programming. Message passing occurs but is hidden from the user, so there are additional inefficiency considerations. You may have to modify the code to request variables before you need them. It would be nice if the compiler could do this.

6. Implementing a Simple DSM System It is relatively straightforward to write your own simple DSM system. In this section we will review how this can be done.

6.1 User Interface Using Classes and Methods Using the OO methodology of C++ we might have a wrapper class: SharedInteger *sum = new SharedInteger(); This class could wrap an integer value and provide the following methods: sum->lock() sum->refresh() (*sum)++ sum->flush() sum->unlock()

6.2 Basic Shared-Variable Implementation Option 1: Centralized Server

6.2 Basic Shared-Variable Implementation Option 2: Multiple Servers

6.2 Basic Shared-Variable Implementation Option 3: Multiple Readers (1 owner)

6.2 Basic Shared-Variable Implementation Option 4: Migrating the owner

DSM Projects
- Write a DSM system in C++ using MPI for the underlying message passing and process communication.
- (More advanced) One of the fundamental disadvantages of software DSM systems is the lack of control over the underlying message passing. Provide parameters in the DSM routines to control the message passing.
- Write routines that allow communication and computation to be overlapped.