Parallel Computers Past and Present, Yenchi Lin, Apr 17, 2003


1 Parallel Computers Past and Present
Yenchi Lin
Apr 17, 2003

2 Outline
- Concepts/Background on Parallel Computers
- Connection Machines
- Earth Simulator
- Conclusion

3 Quick architecture overview
- SIMD, MIMD
- Shared memory, distributed memory
- MPP, PVP, SMP
- NOW: Network of Workstations (clusters)

4 SIMD, MIMD
- SIMD: Single Instruction, Multiple Data
  - All processors perform the same instruction on different pieces of data
  - Some processors can be masked out from executing certain instructions
- MIMD: Multiple Instruction, Multiple Data
  - Each processor executes a different instruction on different data
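
The masking behavior described for SIMD can be illustrated with a small sketch. It is plain Python standing in for hardware lanes, and the names `simd_step`, `data` and `mask` are made up for this example rather than taken from any Connection Machine language.

```python
def simd_step(op, data, mask):
    """Apply one instruction `op` to every lane at once.

    Lanes whose mask entry is False are masked out and keep their old value.
    """
    return [op(x) if active else x for x, active in zip(data, mask)]


data = [1, 2, 3, 4]
mask = [True, False, True, False]  # lanes 1 and 3 sit this instruction out

result = simd_step(lambda x: x * 10, data, mask)
print(result)  # [10, 2, 30, 4]
```

In a MIMD machine there is no shared instruction to mask: each processor would simply run its own `op`.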

5 Memory
- Shared Memory
  - Single, unified address space across all processors
- Distributed Memory
  - Each processor has its own address space
- Hybrid
  - Multiple processors within a computing node share the same address space, while the whole system has many different address spaces.

6 Processors
- PVP: parallel vector processors
  - Cray, NEC, Hitachi
- MPP: massively parallel processors
  - Connection Machines
- SMP: symmetric multiprocessors
  - Sun SunFire, DEC (Compaq/HP) AlphaServer

7 D.E. Culler, J.P. Singh, A. Gupta “Parallel Computer Architecture – A Hardware/Software Approach”

8 Trends (cont.)
D.E. Culler, J.P. Singh, A. Gupta “Parallel Computer Architecture – A Hardware/Software Approach”
- The trend of MPP overtaking SMP has continued, as the number of NOWs (clusters) in the TOP 500 list grows.

9 Connection Machines
- Invented by Danny Hillis of Thinking Machines Corp. while at MIT.
- Originally designed to run artificial intelligence applications
  - First working application on the CM-1: Game of Life
- CM-1 (1985), CM-2 (1986) and CM-5 (1992)
- Richard Feynman helped in building the first CM-1s.
- At its peak, 70 machines were installed around the world, all of them in the TOP 500 list.
- Thinking Machines Corp. filed for bankruptcy in 1994, became a pure software company in 1996, and was bought by Oracle in 1999.

10 CM-2 (1986)
- SIMD
- Hypercube connection
- 1-bit processors in groups of 16
- 9 dimensions for the 8192-processor configuration, 12 dimensions for the 65536-processor configuration
- Programming languages: C*, *Lisp, CM Fortran
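
The hypercube dimension can be worked out from the processor count. The sketch below assumes one hypercube node per group of 16 processors, so the dimension is log2(processors / 16); the function name `hypercube_dimension` is hypothetical. Under that assumption the 8192-processor machine has 512 nodes (9 dimensions) and the 65536-processor machine has 4096 nodes (12 dimensions).

```python
def hypercube_dimension(processors, group_size=16):
    """Dimension of a hypercube with one node per processor group (assumed)."""
    nodes, leftover = divmod(processors, group_size)
    # A hypercube needs a power-of-two node count.
    if leftover or nodes < 1 or nodes & (nodes - 1):
        raise ValueError("node count must be a power of two")
    return nodes.bit_length() - 1


for processors in (8192, 65536):
    print(processors, "processors ->", hypercube_dimension(processors), "dimensions")
```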

11 Sprint Node in CM-2
- 1-bit serial processors, 16 in a group, two groups on the board
- The two groups share the same memory and floating-point unit
- Router has limited processing power
- Degree-12 connectivity!

12 Hypercube Connection in CM-2
- Maximum hop count in a hypercube = dimension of the hypercube
- The router randomly picks the next hop
- High wire count
(Figure: four-dimensional hypercube)
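
The hop-count bound can be seen in a short sketch. Two hypercube nodes are neighbors when their addresses differ in exactly one bit, so a message needs one hop per differing bit and never more hops than the dimension. The routine below picks the next dimension to fix at random; it is an illustration of the idea, not the CM-2 router's actual algorithm, and the name `route` is hypothetical.

```python
import random


def route(src, dst, rng=random):
    """Return the nodes visited from src to dst, fixing one differing bit per hop."""
    path = [src]
    node = src
    while node != dst:
        diff = node ^ dst
        # Dimensions in which the current node still differs from the target.
        dims = [d for d in range(diff.bit_length()) if (diff >> d) & 1]
        node ^= 1 << rng.choice(dims)  # random choice of the next hop
        path.append(node)
    return path


path = route(0b0000, 0b1111)  # opposite corners of a 4-dimensional hypercube
print([format(n, "04b") for n in path], "hops:", len(path) - 1)
```

Opposite corners differ in every bit, which is the worst case: 4 hops in a 4-dimensional hypercube.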

13 CM-5 (1992)
- Distributed-memory multiprocessor
- SPARC + custom vector units
- Fat-tree structure
- Programming languages: C*, *Lisp, CM Fortran, HPF, C++, etc.
- Supports partitioning, multi-user

14 Processing Element in CM-5
- 33 MHz SPARC
- Vector processor
- Network interface
- 32 MB memory
- Connected using Sun MBus
- Network access is treated the same as memory access, which is expensive for larger messages

15 Fat-Tree of CM-5
- Three networks (data, control and diagnostic), synchronized on a 40 MHz clock
- 4-ary fat tree, with each processor as a leaf
  - Two parents per child for the first two levels
  - Four parents per child for higher levels
(Figure: data network of CM-5)
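
In a tree-shaped network a message climbs to the lowest common ancestor of sender and receiver and then descends. The sketch below estimates how far it has to climb in a 4-ary tree, assuming leaves are numbered left to right starting at 0. That numbering and the names `lca_level` and `link_hops` are assumptions for illustration. The multiple parents of a fat tree add alternative upward paths but do not change the height computed here.

```python
def lca_level(a, b, arity=4):
    """Levels above the leaves at which leaves a and b first share an ancestor."""
    level = 0
    while a != b:
        a //= arity  # move both leaves up to their parent's index
        b //= arity
        level += 1
    return level


def link_hops(a, b, arity=4):
    """Links traversed: up to the common ancestor, then the same distance down."""
    return 2 * lca_level(a, b, arity)


print(link_hops(0, 3))   # same first-level switch: 2
print(link_hops(0, 4))   # neighboring first-level switches: 4
print(link_hops(0, 63))  # far apart among 64 leaves: 6
```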

16 Transition from CM-2 to CM-5
- 1-bit serial processors -> 32-bit SPARCs (with 64-bit vector units)
- SIMD -> MIMD
  - Use SPMD to emulate SIMD behavior
- Hypercube -> Fat-Tree
  - Randomness preserved by random routing

17 Earth Simulator (2002)
- Collection of modified NEC SX-6
- 640 nodes, 8-way each
- 12.3 GB/s x 2 network
- Theoretical throughput: 40 TFlops
- Max throughput: 36 TFlops running Linpack
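
The figures on this slide imply a per-processor peak and a Linpack efficiency. The sketch below only does arithmetic on the slide's rounded numbers, so the results are approximate; the variable names are chosen for this example.

```python
nodes = 640
processors_per_node = 8   # "8-way" nodes
peak_tflops = 40.0        # theoretical throughput from the slide (rounded)
linpack_tflops = 36.0     # measured Linpack throughput from the slide (rounded)

total_processors = nodes * processors_per_node
peak_per_processor_gflops = peak_tflops * 1000 / total_processors
linpack_efficiency = linpack_tflops / peak_tflops

print(total_processors)                    # 5120
print(peak_per_processor_gflops)           # 7.8125, i.e. roughly 8 GFlops each
print(f"{linpack_efficiency:.0%} of peak") # 90% of peak
```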

18 Programming Models of ES
- MPI/HPF on node level and process level
- OpenMP, threads
- Automatic vectorization

19 Organization of ES
- 320 processor node (PN) cabinets, 2 nodes each
- 65 interconnect (IN) cabinets
- Crossbar of 640 nodes
  - 12.3 GB/s x 2 (bidirectional) node-to-node, 8 TB/s aggregated
- 900 TB disk space, 1.6 PB tape storage
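
The aggregate bandwidth can be checked against the per-node figure. This sketch assumes the aggregate counts one direction per node and uses decimal units (1 TB/s = 1000 GB/s); both are assumptions, not statements from the slide.

```python
nodes = 640
per_node_gb_s = 12.3  # one direction of the 12.3 GB/s x 2 link

aggregate_tb_s = nodes * per_node_gb_s / 1000
print(round(aggregate_tb_s, 2))  # 7.87, which rounds to the quoted 8 TB/s

# Cross-check of the cabinet count: 320 cabinets with 2 nodes each.
cabinet_nodes = 320 * 2
print(cabinet_nodes)  # 640
```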

20 PN of ES Arithmetic Processor (SX-6) Memory (512MB)

21 Arithmetic Processor
- Total of 640 x 8 = 5120 arithmetic processors

22 Remarks
- Initial cost:
  - Development: 40 billion yen (USD $400M)
  - Physical building: 7 billion yen (USD $70M)
- Operating cost:
  - Maintenance: 8 billion yen/year (USD $80M), or USD $2.54/sec
  - Electricity: 800 million yen/year (USD $8M)
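
The per-second figure follows from the yearly maintenance cost. The sketch assumes a 365-day year and the exchange rate implied by the slide's own conversions (100 yen per US dollar).

```python
YEN_PER_USD = 100                   # implied by 40 billion yen = USD $400M
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000

maintenance_yen_per_year = 8_000_000_000
maintenance_usd_per_year = maintenance_yen_per_year / YEN_PER_USD
maintenance_usd_per_sec = maintenance_usd_per_year / SECONDS_PER_YEAR

print(f"${maintenance_usd_per_sec:.2f} per second")  # $2.54 per second
```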

23 Eye Candies
(Figure captions: PN cabinet, 9 APs in one; back of a PN cabinet; 1 AP, 9 in one cabinet; SX-6i)

24 Conclusion
- Connection Machines were interesting
- Earth Simulator is also interesting
- Early designs versus recent designs
  - GigaFlops vs. TeraFlops
- When will Americans take back the crown in supercomputing?

25 References
- TOP500: http://www.top500.org/ORSC/
- Earth Simulator: http://www.es.jamstec.go.jp/
- http://ails.arc.nasa.gov/Images/InfoSys/AC93-0146-2.html
- http://ails.arc.nasa.gov/Images/InfoSys/AC90-0563-7.html
- http://archive.ncsa.uiuc.edu/Pubs/TechReports/TR023/Summary.html
- http://www.netlib.org/benchmark/top500/reports/report94/Architec/node32.html
- http://mission.base.com/tamiko/cm/cm-text.htm
- http://www.longnow.org/about/articles/ArtFeynman.html
- D.E. Culler, J.P. Singh and A. Gupta. “Parallel Computer Architecture – A Hardware/Software Approach” 1999
- Hennessy, Patterson. “Computer Architecture – A Quantitative Approach, 2nd Ed.” 2002
- D. J. Kerbyson, A. Hoisie, H. Wasserman. “A Comparison Between the Earth Simulator and AlphaServer Systems using Predictive Application Performance Models” 2002
- Thinking Machines Corp. “The Network Architecture of the Connection Machine CM-5” 1992
- G. E. Blelloch et al. “A Comparison of Sorting Algorithms for the Connection Machine CM-2” 1991

