Group Members: Hamza Zahid (131391), Fahad Nadeem Khan, Abdual Hannan. AIR UNIVERSITY MULTAN CAMPUS

Shared Memory vs. Message Passing

Topics  Message passing  Shared memory  Differences between message passing and shared memory

Message Passing

Introduction  A message-passing architecture communicates data among a set of processors without the need for a global memory.  Each processing element (PE) has its own local memory and communicates with other PEs using messages.

MP network Two important factors must be considered:  Link bandwidth – the number of bits that can be transmitted per unit of time (bits/s)  Message transfer time – the time it takes a message to travel through the network
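As a rough first-order model (an assumption added here, not stated on the slide), these two factors combine as: transfer time for an n-byte message ≈ startup latency + n / bandwidth. For example, with a 10 µs startup latency and a 1 GB/s link, a 1 MB message takes roughly 10 µs + 1 ms ≈ 1.01 ms, so bandwidth dominates for large messages and latency dominates for small ones.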

Process communication  Processes running on a given processor use what is called internal channels to exchange messages among themselves  Processes running on different processors use the external channesls to exchange messages

Data exchanged  Data exchanged among processors cannot be shared; it is rather copied (using send/ receive messages)  An important advantage of this form of data exchange is the elimination of the need for synchronization constructs, such as semaphores, which results in performance improvement

Message-Passing Interface – MPI Standardization  MPI is the only message-passing library that can be considered a standard. It is supported on virtually all HPC platforms and has practically replaced all previous message-passing libraries. Portability  There is no need to modify your source code when you port your application to a different platform that supports the MPI standard.

Message-Passing Interface – MPI Performance Opportunities  Vendor implementations should be able to exploit native hardware features to optimize performance. Functionality  Over 115 routines are defined. Availability  A variety of implementations are available, both vendor and public domain.

MPI basics  Start Processes  Send Messages  Receive Messages  Synchronize  With these four capabilities, you can construct any program.  MPI offers over 125 functions.

Shared Memory

Introduction  Processors communicate with shared address space  Easy on small-scale machines  Shared memory allows multiple processes to share virtual memory space.  This is the fastest but not necessarily the easiest (synchronization- wise) way for processes to communicate with one another.  In general, one process creates or allocates the shared memory segment.  The size and access permissions for the segment are set when it is created.

Uniform Memory Access (UMA)  Most commonly represented today by Symmetric Multiprocessor (SMP) machines  Identical processors  Equal access and access times to memory  Sometimes called CC-UMA (Cache Coherent UMA). Cache coherent means that if one processor updates a location in shared memory, all the other processors know about the update. Cache coherency is accomplished at the hardware level.

Shared Memory (UMA)

Non-Uniform Memory Access (NUMA)  Often made by physically linking two or more SMPs  One SMP can directly access the memory of another SMP  Not all processors have equal access time to all memories  Memory access across the link is slower  If cache coherency is maintained, it may also be called CC-NUMA (Cache Coherent NUMA)

Shared Memory (NUMA)

Advantages  Global address space provides a user-friendly programming perspective to memory  Model of choice for uniprocessors, small-scale MPs  Ease of programming  Lower latency  Easier to use hardware controlled caching  Data sharing between tasks is both fast and uniform due to the proximity of memory to CPUs

Disadvantages  Primary disadvantage is the lack of scalability between memory and CPUs. Adding more CPUs can geometrically increases traffic on the shared memory-CPU path, and for cache coherent systems, geometrically increase traffic associated with cache/memory management.  Programmer responsibility for synchronization constructs that ensure "correct" access of global memory.  Expense: it becomes increasingly difficult and expensive to design and produce shared memory machines with ever increasing numbers of processors.

Difference

Message Passing vs. Shared Memory Difference: how communication is achieved between tasks  Message-passing programming model – Explicit communication via messages – Loose coupling of program components – Analogy: a telephone call or letter; no shared location accessible to all  Shared-memory programming model – Implicit communication via memory operations (load/store) – Tight coupling of program components – Analogy: a bulletin board; information is posted at a shared space The suitability of the programming model depends on the problem to be solved. Issues affected by the model include:  overhead, scalability, and ease of programming

Message Passing vs. Shared Memory Hardware Difference: how task communication is supported in hardware  Shared-memory hardware (or machine model) – All processors see a global shared address space, with the ability to access all memory from each processor – A write to a location is visible to the reads of other processors  Message-passing hardware (machine model) – No global shared address space – Send and receive variants are the only method of communication between processors (much like networks of workstations today, i.e. clusters) The suitability of the hardware depends on the problem to be solved as well as the programming model.

Programming Model vs. Architecture Machine → Programming Model – Join at the network, so program with the message-passing model – Join at memory, so program with the shared-memory model – Join at the processor, so program with SIMD or data parallel Programming Model → Machine – Message-passing programs on a message-passing machine – Shared-memory programs on a shared-memory machine – SIMD/data-parallel programs on a SIMD/data-parallel machine

Separation of Model and Architecture  Shared Memory – Single shared address space – Communicate and synchronize using load/store – Can support message passing  Message Passing – Send/Receive – Communication + synchronization – Can support shared memory