1 PARALLEL COMPUTERS - 3
Computer Engg, IIT(BHU), 3/12/2013

2 Computing Technology
● Hardware
➢ Vacuum tubes, relay memory
➢ Discrete transistors, core memory
➢ Integrated circuits, pipelined CPUs
➢ VLSI microprocessors, solid-state memory
● Languages and Software
➢ Machine / assembly languages
➢ Algol / Fortran with compilers, batch-processing OS
➢ C, multiprocessing, timesharing OS
➢ C++ / Java, parallelizing compilers, distributed OS

3 Computing Technology
● The driving force behind these technology advances is the ever-increasing demand for computing power:
✔ Scientific computing (e.g., large-scale simulations)
✔ Commercial computing (e.g., databases)
✔ 3D graphics and realistic animation
✔ Multimedia internet applications

4 Challenge Problems
● Simulation of the earth's climate
➢ Resolution: 10 kilometers; period: 1 year; simple ocean and biosphere models
➢ Total requirement: about 10^16 floating-point operations
➢ With a supercomputer capable of 10 GFLOPS, it would take about 10 days to execute
● Real-time processing of 3D graphics
➢ Number of data elements: 10^9 (1024 in each dimension)
➢ Number of operations per element: 200
➢ Update rate: 30 times per second
➢ Total requirement: 6.4 × 10^12 operations per second
➢ With processors capable of 10 Giga IOPS each, we need 640 of them
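As a sanity check, the arithmetic behind these figures works out as follows (a sketch; it assumes the climate requirement of 10^16 operations is a total amount of work, not a rate):

```latex
% Climate simulation: total work divided by machine speed
\frac{10^{16}\ \text{flop}}{10^{10}\ \text{flop/s}} = 10^{6}\ \text{s}
  \approx 11.6\ \text{days} \approx 10\ \text{days}

% 3D graphics: elements x operations per element x update rate
1024^{3} \times 200 \times 30 \approx 6.4 \times 10^{12}\ \text{ops/s},
\qquad \frac{6.4 \times 10^{12}}{10^{10}} = 640\ \text{processors}
```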

5 Motivations for Parallelism
● Conventional computers are sequential: a single CPU, a single stream of instructions, executing one instruction at a time (not completely true in modern CPUs)
➢ A single-CPU processor has a performance limit
➢ Moore's Law can't go on forever

6 Motivation for Parallelism
● How to increase computing power?
➢ Better processor design: more transistors, larger caches, advanced architectures
➢ Better system design: faster / larger memory, faster buses, better OS
➢ Scale up the computer (parallelism): replicate hardware at the component or whole-computer level
● A parallel processor's power is virtually unlimited:
➢ 10 processors @ 500 MFLOPS each = 5 GFLOPS
➢ 100 processors @ 500 MFLOPS each = 50 GFLOPS
➢ 1,000 processors @ 500 MFLOPS each = 500 GFLOPS

7 Motivation for Parallelism
● Additional motivations
➢ Solving bigger problems
➢ Lowering cost

8 Terminology
● Hardware
➢ Multicomputers: tightly networked, multiple uniform computers
➢ Multiprocessors: tightly networked, multiple uniform processors with additional memory units
➢ Supercomputers: general-purpose and high-performance, nowadays almost always parallel
➢ Clusters: loosely networked commodity computers

9 Terminology
● Programming
➢ Pipelining: divide the computation into stages (segments) and assign a separate functional unit to each stage
➢ Data parallelism: multiple (uniform) functional units apply the same operation simultaneously to different elements of the data set
➢ Control parallelism: multiple (specialized) functional units apply distinct operations to data elements concurrently
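The sieve example on the later slides makes these concrete; as a quick first illustration, here is a minimal Python sketch (not from the slides; the operations chosen are arbitrary) contrasting data and control parallelism:

```python
from concurrent.futures import ThreadPoolExecutor

data = list(range(16))

# Data parallelism: the SAME operation (squaring) is applied
# to different elements of the data set at the same time.
with ThreadPoolExecutor() as pool:
    squares = list(pool.map(lambda x: x * x, data))

# Control parallelism: DISTINCT operations (a sum and a maximum)
# run concurrently over the data.
with ThreadPoolExecutor() as pool:
    total_future = pool.submit(sum, data)
    max_future = pool.submit(max, data)
    total, largest = total_future.result(), max_future.result()

print(squares, total, largest)
```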

10 Terminology
● Performance
➢ Throughput: the number of results per unit time
➢ Speedup: S = (time needed by the most efficient sequential algorithm) / (time needed on a pipelined or parallel machine)

11 Terminology
➢ Scalability
An algorithm is scalable if the available parallelism increases at least linearly with problem size.
An architecture is scalable if it gives the same performance per processor as the number of processors and the size of the problem are both increased.
Data-parallel algorithms tend to be more scalable than control-parallel algorithms.

12 Example
● Problem
➢ Find all primes less than or equal to some positive integer n
● Method (the sieve algorithm)
➢ Write down all the integers from 1 to n
➢ Cross out from the list all multiples of 2, 3, 5, 7, … up to sqrt(n)

13 Example
● Sequential implementation
➢ A boolean array representing the integers from 1 to n
➢ A buffer for holding the current prime
➢ An index for the loop iterating through the array
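A minimal Python sketch of this sequential implementation (names are illustrative, not from the slides):

```python
def sieve(n):
    """Sequential sieve of the primes up to n."""
    is_prime = [True] * (n + 1)     # boolean array representing 0..n
    is_prime[0] = is_prime[1] = False
    prime = 2                       # buffer holding the current prime
    while prime * prime <= n:
        for idx in range(prime * prime, n + 1, prime):  # loop index
            is_prime[idx] = False
        prime += 1                  # advance to the next unstruck
        while not is_prime[prime]:  # integer: the next prime
            prime += 1
    return [i for i in range(2, n + 1) if is_prime[i]]

assert len(sieve(1000)) == 168      # there are 168 primes <= 1,000
```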

14 Example
● Control-parallel approach
➢ Different processors strike out multiples of different primes
➢ The boolean array and the current prime are shared; each processor has its own private copy of the loop index
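A hedged Python sketch of this control-parallel decomposition; threads stand in for processors, and CPython's GIL means it shows the structure rather than real speedup:

```python
import threading

def control_parallel_sieve(n, p=4):
    is_prime = [True] * (n + 1)   # shared boolean array
    is_prime[0] = is_prime[1] = False
    cursor = [2]                  # shared current-prime cursor
    lock = threading.Lock()

    def worker():                 # each thread has a private k and idx
        while True:
            with lock:            # claim the next unstruck value
                k = cursor[0]
                while k * k <= n and not is_prime[k]:
                    k += 1
                if k * k > n:
                    return
                cursor[0] = k + 1
            # A worker may occasionally claim a composite whose striking
            # is still in flight; that wastes work but stays correct,
            # since every multiple of a composite is itself composite.
            for idx in range(k * k, n + 1, k):
                is_prime[idx] = False

    threads = [threading.Thread(target=worker) for _ in range(p)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [i for i in range(2, n + 1) if is_prime[i]]

assert len(control_parallel_sieve(1000)) == 168
```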

15 Example
● Data-parallel approach
➢ Each processor is responsible for a unique range of the integers and does all the striking in that range
➢ Processor 1 is responsible for broadcasting its findings to the other processors
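A hedged Python sketch of the data-parallel scheme; plain loops stand in for the p processors and the broadcast, and it assumes n/p >= sqrt(n) so that every sieving prime falls in processor 1's subrange:

```python
import math

def data_parallel_sieve(n, p=4):
    limit = math.isqrt(n)             # sieve with primes up to sqrt(n)
    # Partition 2..n into p contiguous subranges; processor i owns
    # the half-open range [bounds[i], bounds[i+1]).
    bounds = [2 + i * (n - 1) // p for i in range(p + 1)]
    struck = [[False] * (bounds[i + 1] - bounds[i]) for i in range(p)]

    k = 2
    while k <= limit:                 # processor 1 finds prime k and
        for i in range(p):            # "broadcasts" it to everyone
            lo, hi = bounds[i], bounds[i + 1]
            first = max(k * k, ((lo + k - 1) // k) * k)
            for m in range(first, hi, k):   # each strikes its own range
                struck[i][m - lo] = True
        k += 1                        # processor 1 advances to the next
        while k <= limit and struck[0][k - 2]:  # unstruck value
            k += 1
    return [bounds[i] + j
            for i in range(p)
            for j, dead in enumerate(struck[i]) if not dead]

assert len(data_parallel_sieve(1000)) == 168
```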

16 Example
● Performance Analysis
➢ Sequential algorithm
Cost of sieving multiples of 2: ⌈(n−3)/2⌉
Cost of sieving multiples of 3: ⌈(n−8)/3⌉
Cost of sieving multiples of 5: ⌈(n−24)/5⌉
…
For n = 1,000, T = 1,411
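The slide's total can be reproduced by summing ⌈(n − p² + 1)/p⌉ (the bracketed costs above) over the primes p ≤ sqrt(n); a small check in Python, assuming that cost model:

```python
import math

def sequential_sieve_cost(n):
    limit = math.isqrt(n)
    is_prime = [True] * (limit + 1)
    total = 0
    for p in range(2, limit + 1):
        if is_prime[p]:
            # multiples of p struck: p*p, p*p + p, ..., up to n
            total += math.ceil((n - p * p + 1) / p)
            for m in range(p * p, limit + 1, p):
                is_prime[m] = False
    return total

print(sequential_sieve_cost(1000))   # -> 1411, matching the slide
```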

17 Example
➢ Control-parallel algorithm
For p = 2, n = 1,000, T = 706
For p = 3, n = 1,000, T = 499
For p = 4, n = 1,000, T = 499
Adding a fourth processor does not help: the running time is bounded by the largest single task, sieving the multiples of 2, which alone costs ⌈(n−3)/2⌉ = 499 operations.

18 Example
➢ Data-parallel algorithm
Cost of broadcasting: k(p − 1), where k is the number of primes up to sqrt(n)
Cost of striking: ⌈(n/p)/2⌉ + ⌈(n/p)/3⌉ + … + ⌈(n/p)/p_k⌉, where p_k is the k-th (largest) such prime
For p = 2, n = 1,000, T ≈ 781
For p = 3, n = 1,000, T ≈ 471
For p = 4, n = 1,000, T ≈ 337

