April 26, 2005. CSE8380 Parallel and Distributed Processing Presentation. Hong Yue, Department of Computer Science & Engineering, Southern Methodist University.

Parallel Processing Multianalysis: Comparing Parallel Processing with Sequential Processing

Why did I select this topic?

Outline  Definition  Characteristics of Parallel Processing and Sequential Processing  Implementation of Parallel Processing and Sequential Processing  Performance of Parallel Processing and Sequential Processing  Parallel Processing Evaluation  Major Applications of Parallel Processing

April 26, Definition  Parallel Processing Definition Parallel Processing refers to the simultaneous use of multiple processors to execute the same task in order to obtain faster results. These processors either communicate each other to solve a problem or work completely independent, under the control of another processor which divides the problem into a number of parts to other processors and collects results from them.

Definition. 2  Sequential Processing Definition: Sequential processing refers to a computer architecture in which a single processor carries out a single task through a series of operations performed in sequence. It is also called serial processing.

Characteristics of Parallel Processing and Sequential Processing  Characteristics of Parallel Processing ● Each processor can perform tasks concurrently. ● Tasks may need to be synchronized. ● Processors usually share resources, such as data, disks, and other devices.

Characteristics of Parallel Processing and Sequential Processing. 2  Characteristics of Sequential Processing ● Only one processor performs the task. ● The single processor performs a single task at a time. ● The task is executed in sequence.

Implementation of parallel processing and sequential processing  Executing a single task: In sequential processing, the task is executed as one single large task. In parallel processing, the task is divided into multiple smaller component tasks, and each component task is executed on a separate processor.
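A minimal sketch of this idea (not from the original slides): one large task, summing a million squares, is divided into component tasks, and Python's standard multiprocessing module runs each component task in a separate worker process. The function name and chunk sizes are illustrative assumptions.

from multiprocessing import Pool

def component_task(chunk):
    # Hypothetical unit of work: each processor sums the squares of one slice.
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))          # the single large task
    n_workers = 4                          # one component task per processor
    size = len(data) // n_workers
    chunks = [data[i * size:(i + 1) * size] for i in range(n_workers)]

    with Pool(processes=n_workers) as pool:
        partials = pool.map(component_task, chunks)   # component tasks run in parallel

    print(sum(partials))                   # combine the partial results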

Implementation of parallel processing and sequential processing. 2

Implementation of parallel processing and sequential processing. 3  Figure 2. Parallel Processing: Executing Component Tasks in Parallel (component task runtimes on Processors 1-7, plotted against total elapsed time)

Implementation of parallel processing and sequential processing. 4  Executing multiple independent tasks ● In sequential processing, independent tasks compete for a single resource. Only task 1 runs without having to wait. Task 2 must wait until task 1 has completed; task 3 must wait until tasks 1 and 2 have completed, and so on.

Implementation of parallel processing and sequential processing. 5  Executing multiple independent tasks ● By contrast, in parallel processing, for example on a parallel server running on a symmetric multiprocessor, more CPU power is assigned to the tasks. Each independent task executes immediately on its own processor: no wait time is involved.
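The contrast can be sketched with Python's concurrent.futures (an illustrative example, not from the slides): three independent tasks of roughly two seconds each finish in about two seconds when each gets its own worker, instead of about six seconds one after another.

import time
from concurrent.futures import ProcessPoolExecutor

def independent_task(task_id, seconds):
    time.sleep(seconds)                    # stand-in for real, independent work
    return f"task {task_id} done"

if __name__ == "__main__":
    jobs = [(1, 2), (2, 2), (3, 2)]        # three independent tasks, about 2 s each

    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(independent_task, tid, secs) for tid, secs in jobs]
        results = [f.result() for f in futures]
    print(results, round(time.perf_counter() - start, 1), "s")   # about 2 s, not about 6 s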

Implementation of parallel processing and sequential processing. 6  Figure 3. Sequential Processing of Multiple Independent Tasks (task runtimes and wait times on Processors 1-7, plotted against total elapsed time)

Implementation of parallel processing and sequential processing. 7  Figure 4. Parallel Processing: Executing Independent Tasks in Parallel (task runtimes on Processors 1-7, plotted against total elapsed time)

Performance of parallel processing and sequential processing  Sequential Processing Performance ● Takes a long time to execute a task. ● Cannot handle very large tasks. ● Cannot handle large loads well. ● Returns are diminishing. ● It becomes increasingly expensive to make a single processor faster.

Performance of parallel processing and sequential processing. 2  Solution: use parallel processing, that is, use lots of relatively fast, cheap processors in parallel.

Performance of parallel processing and sequential processing. 3  Parallel Processing Performance ● Cheaper, in terms of price and performance. ● Faster than equivalently expensive uniprocessor machines. ● Scalable: the performance of a particular program may be improved by executing it on a larger machine.

Performance of parallel processing and sequential processing. 4  Parallel Processing Performance ● Reliable: in theory, if some processors fail, we can simply use the others. ● Can handle bigger problems. ● Processors can communicate with each other readily, which is important in many calculations.

Parallel Processing Evaluation  Several ways to evaluate parallel processing performance: ● Scale-up ● Speedup ● Efficiency ● Overall solution time ● Price/performance

Parallel Processing Evaluation. 2  Scale-up: Scale-up, or enhanced throughput, refers to the ability of a system n times larger to perform an n times larger job in the same time period as the original system. With added hardware, a scale-up formula holds the time constant and measures the increased size of the job that can be done.

Parallel Processing Evaluation. 3  Figure 5. Scale-up (the sequential system completes a 100% task with given hardware and time; the parallel system, with more hardware, completes a 200% task in the same time)

Parallel Processing Evaluation. 4  Scale-up measurement formula: Scale-up = (transaction volume of the multiprocessor system) / (transaction volume of the uniprocessor system)

Parallel Processing Evaluation. 5  For example, if the uniprocessor system can process 100 transactions in a given amount of time, and the parallel system can process 200 transactions in that amount of time, then the scale-up is 200/100 = 2.  A value of 2 indicates the ideal of linear scale-up: twice as much hardware can process twice the data volume in the same amount of time.
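The same arithmetic as a tiny helper (illustrative only; the numbers are those of the example above):

def scale_up(multi_volume, uni_volume):
    # Transaction volume of the multiprocessor system / volume of the uniprocessor.
    return multi_volume / uni_volume

print(scale_up(200, 100))    # 2.0 -> linear scale-up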

Parallel Processing Evaluation. 6  Speedup: Speedup, or improved response time, is defined as the time it takes a program to execute sequentially (with one processor) divided by the time it takes to execute in parallel (with many processors). It can be achieved in two ways: by breaking a large task into many small fragments and by reducing wait time.

Parallel Processing Evaluation. 7  Figure 6. Speedup (the parallel system, with more hardware, completes the same task in 50% of the time needed by the sequential system)

Parallel Processing Evaluation. 8  Speedup measurement formula: Speedup = (elapsed time on a uniprocessor) / (elapsed time on the multiprocessor system)

Parallel Processing Evaluation. 9  For example, if the uniprocessor took 40 seconds to perform a task, and a parallel system with two processors took 20 seconds, then the speedup is 40 / 20 = 2.  A value of 2 indicates the ideal of linear speedup: twice as much hardware can perform the same task in half the time.
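Again as a tiny helper (illustrative only; the numbers are those of the example above):

def speedup(uni_time, multi_time):
    # Elapsed time on a uniprocessor / elapsed time on the multiprocessor system.
    return uni_time / multi_time

print(speedup(40, 20))       # 2.0 -> linear speedup on two processors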

Parallel Processing Evaluation. 10  Table 1. Scale-up and Speedup for Different Types of Workload
Workload          Scale-up    Speedup
OLTP              Yes         No
DSS               Yes
Batch (Mixed)     Yes         Possible
Parallel Query    Yes

Parallel Processing Evaluation. 11  Figure 7. Linear and actual speedup

Parallel Processing Evaluation. 12  Amdahl's Law: Amdahl's Law governs the speedup obtained by using parallel processors on a problem, versus using only one sequential processor. It gives a maximum bound on the speedup, determined by the nature of the algorithm:

Parallel Processing Evaluation. 13  Amdahl's Law: let S be the purely sequential part and P the parallel part, with S + P = 1 (for simplicity), and let n be the number of processors. Maximum speedup = (S + P) / (S + P/n) = 1 / (S + P/n)
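A minimal sketch of this bound in Python (the 5% sequential fraction is an illustrative assumption, not a value from the slides):

def amdahl_speedup(s, n):
    # Maximum speedup with sequential fraction s, parallel fraction p = 1 - s,
    # on n processors: 1 / (s + p / n).
    p = 1.0 - s
    return 1.0 / (s + p / n)

for n in (2, 4, 8, 1024):
    print(n, round(amdahl_speedup(0.05, n), 2))
# Even with 1024 processors, a 5% sequential part caps speedup near 1 / s = 20.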

Parallel Processing Evaluation. 14  Figure 8. Example speedup: Amdahl & Gustafson

Parallel Processing Evaluation. 15  Gustafson's Law: If the size of a problem is scaled up as the number of processors increases, a speedup very close to the ideal (linear) speedup is possible. In practice, problem size is virtually never independent of the number of processors: larger machines are generally used to run larger problems.

Parallel Processing Evaluation. 16  Gustafson's Law: Maximum (scaled) speedup = (S + P * n) / (S + P) = n + (1 - n) * S, with S + P = 1
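A matching sketch of Gustafson's scaled speedup, using the same illustrative 5% sequential fraction:

def gustafson_speedup(s, n):
    # Scaled speedup with S + P = 1: n + (1 - n) * s.
    return n + (1 - n) * s

for n in (2, 4, 8, 1024):
    print(n, round(gustafson_speedup(0.05, n), 2))
# When the problem grows with n, the achievable speedup stays close to n.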

Parallel Processing Evaluation. 17  Efficiency: The relative efficiency is a useful measure of what percentage of the processors' time is being spent in useful computation. Efficiency = (Speedup * 100) / (number of processors)
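For example (illustrative numbers, not measurements from the slides), a speedup of 3.2 on 4 processors gives an efficiency of 80%:

def efficiency(speedup_value, num_processors):
    # Percentage of the processors' time spent in useful computation.
    return speedup_value * 100.0 / num_processors

print(efficiency(3.2, 4))    # 80.0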

Parallel Processing Evaluation. 18  Figure 9. Optimum efficiency & actual efficiency

Parallel Processing Evaluation. 19  Figure 10. Optimum number of processors in actual speedup

Parallel Processing Evaluation. 20  Problems in Parallel Processing "Parallel processing is like a dog's walking on its hind legs. It is not done well, but you are surprised to find it done at all." -- Steve Fiddes (University of Bristol)

Parallel Processing Evaluation. 21  Problems in Parallel Processing ● Its software is heavily platform-dependent and has to be written for a specific machine. ● It also requires a different, more difficult method of programming, since the software must use algorithms to divide the work appropriately across the processors.

Parallel Processing Evaluation. 22  Problems in Parallel Processing ● There isn't a wide array of shrink-wrapped software ready for use with parallel machines. ● Parallelization is problem-dependent and cannot be automated. ● Speedup is not guaranteed.

Parallel Processing Evaluation. 23  Solution 1: ● Decide which architecture is most appropriate for a given application. The characteristics of the application should drive the decision as to how it should be parallelized; the form of the parallelization should then determine what kind of underlying system, both hardware and software, is best suited to running the parallelized application.

Parallel Processing Evaluation. 24  Solution 2: ● Clustering

Major Applications of parallel processing  Clustering ● Clustering is a form of parallel processing that takes a group of workstations connected together in a local-area network and applies middleware to make them act like a parallel machine.

Major Applications of parallel processing. 2  Clustering ● Parallel processing using Linux clusters can yield supercomputer performance for some programs that perform complex computations or operate on large data sets, and it can accomplish this using cheap hardware. ● Because clustering can use networked machines at night when the networks are otherwise idle, it is an inexpensive alternative to dedicated parallel-processing machines.

Major Applications of parallel processing. 3  Clustering can work with two separate but similar implementations: ● A Parallel Virtual Machine (PVM) is an environment that allows messages to pass between computers as they would in an actual parallel machine. ● A Message-Passing Interface (MPI) allows programmers to create message-passing parallel applications, using parallel input/output functions and dynamic process management.
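A minimal MPI sketch, assuming the mpi4py package and an MPI runtime such as Open MPI are installed (this example is not part of the original slides): each process computes a partial sum of squares and rank 0 collects the total. It would typically be launched with something like "mpiexec -n 4 python script.py".

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()        # this process's id
size = comm.Get_size()        # number of processes started by mpiexec

n = 1_000_000
local = sum(i * i for i in range(rank, n, size))     # this rank's share of the work
total = comm.reduce(local, op=MPI.SUM, root=0)       # combine partial sums on rank 0

if rank == 0:
    print(f"{size} processes computed total = {total}")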

References  Andrew Boucher, "Parallel Machines".  Stephane Vialle, "Past and Future Parallelism Challenges to Encompass Sequential Processor Evolution".

The end. Thank you!