What is Serial Computing?
Traditionally, software has been written for serial computation:
- It runs on a single computer with a single Central Processing Unit (CPU).
- A problem is broken into a discrete series of instructions.
- Instructions are executed one after another.
- Only one instruction may execute at any moment in time.
What is Parallel Computing?
In the simplest sense, parallel computing is the simultaneous use of multiple compute resources to solve a computational problem:
- It runs on multiple CPUs.
- A problem is broken into discrete parts that can be solved concurrently.
- Each part is further broken down into a series of instructions.
- Instructions from each part execute simultaneously on different CPUs.
The compute resources might be:
- A single computer with multiple processors or cores;
- An arbitrary number of computers connected by a network;
- A combination of both;
- A graphics card (GPU) used for general-purpose computation.
Hardware: Cluster
A computer cluster is a group of linked computers that work together so closely that in many respects they form a single computer. The components of a cluster are commonly, but not always, connected to each other through fast local area networks.
K computer
- The K computer is a supercomputer being produced by Fujitsu at the RIKEN Advanced Institute for Computational Science campus in Kobe, Japan.
- K became the world's fastest supercomputer in June 2011, as recorded by the TOP500 list; it is expected to become fully operational in November 2012.
- K topped the LINPACK benchmark with a performance of 8.162 petaflops, or 8.162 quadrillion floating-point operations per second.
- It uses 68,544 2.0 GHz 8-core SPARC64 VIIIfx processors packed in 672 cabinets, for a total of 548,352 cores.
Multi-core processor
A multi-core processor is a single computing component with two or more independent processors (called "cores"), which are the units that read and execute program instructions.
Nvidia CUDA
CUDA (Compute Unified Device Architecture) is a parallel computing architecture developed by Nvidia. With CUDA, recent Nvidia GPUs can be used for general-purpose computation, much as CPUs are.
Parallel Computing
The computational problem should be able to:
- Be broken apart into discrete pieces of work that can be solved simultaneously;
- Execute multiple program instructions at any moment in time;
- Be solved in less time with multiple compute resources than with a single compute resource.
Parallel Computing Example: PI Calculation
The value of PI can be calculated in a number of ways. Consider the following method of approximating PI:
1. Inscribe a circle in a square.
2. Randomly generate points in the square.
3. Determine the number of points in the square that are also in the circle.
4. Let r be the number of points in the circle divided by the number of points in the square.
5. PI ~ 4 r, because the inscribed circle covers PI/4 of the square's area.
6. Note that the more points generated, the better the approximation.
Serial pseudo code for this procedure
Note that most of the time in running this program would be spent executing the loop.
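A minimal Python sketch of the serial procedure, offered as an illustration rather than the original pseudo code. It assumes the square is the unit square and the circle is centred at (0.5, 0.5) with radius 0.5; the function name `serial_pi` and the `seed` parameter are hypothetical choices made for this example.

```python
import random


def serial_pi(npoints, seed=None):
    """Approximate PI by counting random points that land inside the circle."""
    rng = random.Random(seed)  # seed is optional; it only makes runs repeatable
    circle_count = 0
    # Nearly all of the running time is spent in this loop.
    for _ in range(npoints):
        x = rng.random()  # random x coordinate in [0, 1)
        y = rng.random()  # random y coordinate in [0, 1)
        # Assumed geometry: circle of radius 0.5 centred in the unit square.
        if (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25:
            circle_count += 1
    # r = circle_count / npoints, and PI ~ 4 r
    return 4.0 * circle_count / npoints


print(serial_pi(100_000, seed=1))
```

Every loop iteration is independent of the others, which is what makes the loop a natural candidate for splitting among tasks.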
Parallel solution
The problem is well suited to parallel execution:
- Computationally intensive
- Minimal communication
- Minimal I/O
Parallel solution
Parallel strategy: break the loop into portions that can be executed by the tasks. For the task of approximating PI:
- Each task executes its portion of the loop iterations.
- Each task can do its work without requiring any information from the other tasks (there are no data dependencies).
- The solution uses the SPMD (Single Program, Multiple Data) model: every task runs the same program on its own portion of the work, and one task acts as master and collects the results.
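The strategy above can be sketched in Python as follows. This is one possible illustration, not the implementation the slides refer to: it uses the standard library's `concurrent.futures` worker processes rather than a message-passing library, and the names `split_points`, `count_in_circle`, `parallel_pi`, and `run_with_processes` are hypothetical. Giving each task its own seeded generator is also an assumption made so that tasks do not draw the same points.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def split_points(npoints, ntasks):
    """Divide npoints loop iterations as evenly as possible among ntasks."""
    base, extra = divmod(npoints, ntasks)
    return [base + (1 if i < extra else 0) for i in range(ntasks)]


def count_in_circle(job):
    """One task's portion of the loop; it needs no data from other tasks."""
    n, seed = job
    rng = random.Random(seed)  # assumption: one independent generator per task
    hits = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        # Assumed geometry: circle of radius 0.5 centred in the unit square.
        if (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25:
            hits += 1
    return hits


def parallel_pi(npoints, ntasks, map_fn=map):
    """Master: hand out portions, collect each task's count, combine them."""
    jobs = [(n, seed) for seed, n in enumerate(split_points(npoints, ntasks))]
    total_hits = sum(map_fn(count_in_circle, jobs))
    return 4.0 * total_hits / npoints


def run_with_processes(npoints=1_000_000, ntasks=4):
    """Run every task in its own worker process and return the estimate."""
    with ProcessPoolExecutor(max_workers=ntasks) as pool:
        return parallel_pi(npoints, ntasks, map_fn=pool.map)
```

Calling `parallel_pi(1_000_000, 4)` with the default `map_fn` runs the same tasks one after another, which is convenient for checking the decomposition. `run_with_processes()` distributes them across worker processes; it should be called from under an `if __name__ == "__main__":` guard on platforms that start workers by spawning a fresh interpreter.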