Parallel Processing (CS 676) Overview
Jeremy R. Johnson

Goals
Parallelism: to run large and difficult programs fast.
Course: to become effective parallel programmers.
From "How to Write Parallel Programs":
–"Parallelism will become, in the not too distant future, an essential part of every programmer's repertoire."
–"Coordination – a general phenomenon of which parallelism is one example – will become a basic and widespread phenomenon in CS."
Why?
–Some problems require extensive computing power to solve.
–The most powerful computer at any given time is, by definition, a parallel machine.
–Parallel computing is becoming ubiquitous.
–Distributed and networked computers with simultaneous users require coordination.

Top 500

LINPACK Benchmark
Solve a dense N × N system of linear equations, Ax = b, using Gaussian elimination with partial pivoting.
–(2/3)N^3 + 2N^2 FLOPs
High Performance LINPACK (HPL), introduced by Jack Dongarra, is used to measure performance for the TOP500.
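The operation count above can be turned into a quick back-of-the-envelope script (the function names are mine, not part of the benchmark):

```python
def linpack_flops(n):
    """Floating-point operations for Gaussian elimination with
    partial pivoting on a dense n x n system: (2/3)n^3 + 2n^2."""
    return (2 * n**3) / 3 + 2 * n**2

def solve_seconds(n, sustained_flops_per_sec):
    """Rough time to solve at a given sustained rate."""
    return linpack_flops(n) / sustained_flops_per_sec

print(linpack_flops(1000))  # ~6.687e8 operations
```

For large N the cubic term dominates, which is why doubling the problem size costs roughly eight times the work.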

Example: LU Decomposition
Solve the following linear system by finding an LU decomposition A = PLU.
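A = PLU can be computed by Gaussian elimination with partial pivoting. A minimal pure-Python sketch (list-of-lists matrices, no error handling; the code and its conventions are mine, not the slide's worked example) returns a row permutation perm encoding P via P[perm[i]][i] = 1:

```python
def lu_decompose(A):
    """LU decomposition with partial pivoting.
    Returns (perm, L, U) such that row k of L @ U equals
    row perm[k] of A, i.e. A = P L U with P[perm[i]][i] = 1."""
    n = len(A)
    U = [row[:] for row in A]          # working copy, becomes upper triangular
    L = [[0.0] * n for _ in range(n)]  # unit lower triangular multipliers
    perm = list(range(n))
    for k in range(n):
        # partial pivoting: bring the largest |entry| in column k up to row k
        p = max(range(k, n), key=lambda i: abs(U[i][k]))
        U[k], U[p] = U[p], U[k]
        L[k], L[p] = L[p], L[k]        # swap already-computed multipliers too
        perm[k], perm[p] = perm[p], perm[k]
        L[k][k] = 1.0
        for i in range(k + 1, n):      # eliminate below the pivot
            m = U[i][k] / U[k][k]
            L[i][k] = m
            for j in range(k, n):
                U[i][j] -= m * U[k][j]
    return perm, L, U
```

The inner elimination loops are where a parallel implementation spends its time, and they are what HPL distributes across processors.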

Big Machines
Cray-2, DoE Lawrence Livermore National Laboratory (1985): 3.9 gigaflops, 8-processor vector machine.
Cray X-MP/4, DoE, LANL, … (1983): 941 megaflops, 4-processor vector machine.

Big Machines
Cray Jaguar, ORNL (2009): 1.75 petaflops, 224,256 AMD Opteron cores.
Tianhe-1A, NSC Tianjin, China (2010): 2.507 petaflops, 14,336 Xeon X5670 processors and 7,168 Nvidia Tesla M2050 GPUs.

Need for Parallelism

Multicore
Intel Core i7

Multicore
IBM Blue Gene/L (2004–2007): 478.2 teraflops, 65,536 compute nodes.
Cyclops64: 80 gigaflops, 80 multiply-accumulate cores at 500 MHz.


GPU
Nvidia GTX 480: 1.34 teraflops, 480 SPs (700 MHz), Fermi chip with 3 billion transistors.
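The 1.34 teraflops figure is a single-precision peak that falls out of a simple product. On the GTX 480 the stream processors run at the shader clock, roughly 1401 MHz (about twice the 700 MHz core clock quoted above; this clock value is my assumption from published GTX 480 specifications), and each can issue one fused multiply-add, i.e. 2 floating-point operations, per cycle:

```python
cores = 480                # stream processors
shader_clock_hz = 1401e6   # assumed shader clock, ~2x the 700 MHz core clock
flops_per_cycle = 2        # one fused multiply-add = 2 floating-point ops

peak = cores * shader_clock_hz * flops_per_cycle
print(peak / 1e12)         # ~1.345 TFLOPS, matching the quoted 1.34
```

Sustained application performance is typically well below this peak, since the figure assumes every core issues an FMA every cycle with no memory stalls.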

Google Servers
2003: 15,000 servers, ranging from 533 MHz Intel Celerons to dual 1.4 GHz Intel Pentium IIIs.
2005: 200,000 servers.
2006: upwards of … servers.

Drexel Machines
Tux: 5 nodes, each with 4 quad-core AMD Opteron 8378 processors (2.4 GHz) and 32 GB RAM.
Draco: 20 nodes, each with dual Xeon X5650 processors (2.66 GHz), 6 GTX 480 GPUs, and 72 GB RAM; plus 4 nodes with 6 C2070 GPUs each.

Programming Challenge
"But the primary challenge for an 80-core chip will be figuring out how to write software that can take advantage of all that horsepower."
Source: http://news.cnet.com/Intel-shows-off-80-core-processor/21001006_36158181.html?tag=mncol#ixzz1AHCK1LEc

Basic Idea
One way to solve a problem fast is to break it into pieces and arrange for all of the pieces to be solved simultaneously. The more pieces, the faster the job goes, up to the point where the pieces become too small for the effort of breaking them up and distributing them to be worth the bother. A "parallel program" is a program that uses this breaking-up-and-handing-out approach to solve large or difficult problems.
