1 THE EARTH SIMULATOR SYSTEM By: Shinichi HABATA, Mitsuo YOKOKAWA, Shigemune KITAWAKI Presented by: Anisha Thonour

2 Extracted from the government website: A high-end supercomputer (the Earth Simulator) is just like an alien with a very big head (brain) but small arms and legs. To make the most of its CPU power, thousands of arms and legs are necessary.

3 Definitions
- Supercomputer: a computer that leads the world in terms of processing capacity, particularly speed of calculation, at the time of its introduction. "Cost is no object with advanced technologies" – Dr. Pfeiffer
- Parallel processing: processing in which multiple processors work on a single application simultaneously.

4 Cross-sectional View of the Earth Simulator Building

5 Topics to be introduced…
- Introduction
- System Overview
- Processor Node
- Interconnection Network
- Performance
- Conclusion

6 Introduction
- Global change prediction using computer simulation
- Goal: roughly 1000 times the performance of the supercomputers in common use when the project started
- Developed from 1997 to February 2002
- 87.5% of peak performance (35.86 TFLOPS) – LINPACK
- 64.9% of peak performance (26.58 TFLOPS) – global atmospheric circulation model with the spectral method

7 System Overview
- Parallel vector supercomputer
- 640 processor nodes connected by an interconnection network
- Each processor node holds 8 arithmetic processors and a main memory system
- Peak performance of the whole system = 40 TFLOPS (arithmetic checked below)
- Achieved performance (LINPACK) = 35.86 TFLOPS
- Interconnection network = 640 x 640 non-blocking crossbar switch
- Inter-node bandwidth = 12.3 GB/s
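A quick sanity check on these figures, as a minimal C sketch: it multiplies out the per-AP and per-node numbers quoted on the Processor Node slide (8 GFLOPS per AP, 8 APs per node, 640 nodes) and compares the result with the LINPACK figure. The constants come from the slides themselves; the rest is just illustrative arithmetic.

    /* Back-of-the-envelope check of the peak-performance figures on this slide.
       Constants (8 GFLOPS per AP, 8 APs per node, 640 nodes, 35.86 TFLOPS LINPACK)
       are taken from the presentation itself. */
    #include <stdio.h>

    int main(void)
    {
        const double gflops_per_ap = 8.0;
        const int aps_per_node = 8;
        const int nodes = 640;

        double node_peak   = gflops_per_ap * aps_per_node;   /* 64 GFLOPS per node  */
        double system_peak = node_peak * nodes / 1000.0;      /* ~40.96 TFLOPS total */
        double linpack     = 35.86;                           /* reported TFLOPS     */

        printf("node peak:   %.1f GFLOPS\n", node_peak);
        printf("system peak: %.2f TFLOPS\n", system_peak);
        printf("LINPACK efficiency: %.1f %%\n", 100.0 * linpack / system_peak);
        return 0;
    }

The 35.86 / 40.96 ratio reproduces the 87.5% efficiency quoted on the Introduction slide.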

8 System Overview ctd…

9 System Overview ctd…
- 1 cluster consists of 16 processor nodes, a cluster control station, an I/O control station and system disks
- The 640 nodes are divided into 40 clusters
- 2 types of clusters – S cluster (1) and L clusters (39)
- S cluster – 2 of its nodes are used for interactive work and the others for small-size batch jobs
- User disks – store user files
- Mass storage system – cartridge tape library system

10 System Overview ctd…
- The super cluster control station manages all 40 clusters and provides a single-system-image operational environment
- High-performance and high-efficiency architectural features:
  - Vector processor
  - Shared memory
  - High-bandwidth, non-blocking crossbar interconnection network
- Three levels of parallelism give high sustained performance (sketched below):
  - Vector processing on a processor
  - Parallel processing with shared memory within a node
  - Parallel processing among distributed nodes via the interconnection network
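The three levels of parallelism listed above can be illustrated, in today's common idioms, with a hybrid MPI + OpenMP program whose inner loop is a plain stride-1 loop that a vectorizing compiler can handle: MPI ranks stand in for nodes, OpenMP threads for the APs sharing a node's memory, and the loop body for the vector pipelines. This is only an analogy for the programming model, not the Earth Simulator's own environment, and every name and size in it is an illustrative choice.

    /* Hybrid sketch of the three parallelization levels: MPI across "nodes",
       OpenMP threads inside a "node", and a vectorizable inner loop. */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    #define N 4096

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, nodes;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nodes);

        static double a[N], b[N];
        double local = 0.0, total = 0.0;

        /* Contiguous block of the arrays owned by this rank ("node"). */
        int chunk = N / nodes;
        int lo = rank * chunk;
        int hi = (rank == nodes - 1) ? N : lo + chunk;

        /* Level 2: threads share the node's memory.
           Level 3: the stride-1 loop body is what the vector units would run. */
        #pragma omp parallel for reduction(+:local)
        for (int i = lo; i < hi; i++) {
            a[i] = 1.0;
            b[i] = 2.0;
            local += a[i] * b[i];
        }

        /* Level 1: distributed nodes combine partial results over the network. */
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("dot product = %.1f (expected %.1f)\n", total, 2.0 * N);

        MPI_Finalize();
        return 0;
    }

Built with something like mpicc -fopenmp, this runs one rank per "node" and one thread per "processor", mirroring the three levels on the slide.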

11 Processor Node
- Each PN consists of 8 APs, a main memory system, a remote access control unit (RCU) and an I/O processor.
- Each arithmetic processor can deliver up to 8 GFLOPS, so a node peaks at 64 GFLOPS.
- Each AP uses a high-efficiency heat sink based on heat pipes.
- High-speed main memory devices reduce the memory access latency.
- The paradigms provided within a processor node are:
  - Vector processing on a processor
  - Parallel processing with shared memory

12 Processor Node Configuration

13

14 Interconnection Network
- 640 x 640 non-blocking crossbar switch
- Byte-slicing technique (illustrated below)
- 1 control unit and 128 data switch units
- 320 PN cabinets and 65 IN cabinets
- Each PN cabinet contains 2 processor nodes; the 65 IN cabinets house the interconnection network.
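Byte slicing means that a wide transfer is striped across many narrow switch planes, one byte lane per data switch unit. The toy C sketch below spreads a message over 128 lanes to show the idea only; the Earth Simulator's real striping pattern and lane widths are not given on this slide, so the round-robin scheme here is purely an assumption for illustration.

    /* Toy byte-slicing illustration: byte i of a message travels on plane i % 128,
       so all 128 data switch units carry part of every transfer in parallel. */
    #include <stdio.h>
    #include <string.h>

    #define PLANES 128
    #define LANE_MAX 64

    static void slice(const unsigned char *msg, size_t len,
                      unsigned char lanes[PLANES][LANE_MAX], size_t lane_len[PLANES])
    {
        memset(lane_len, 0, PLANES * sizeof lane_len[0]);
        for (size_t i = 0; i < len; i++) {
            size_t p = i % PLANES;              /* which switch plane carries this byte */
            lanes[p][lane_len[p]++] = msg[i];
        }
    }

    int main(void)
    {
        const unsigned char msg[256] = "payload crossing the crossbar";
        unsigned char lanes[PLANES][LANE_MAX];
        size_t lane_len[PLANES];

        slice(msg, sizeof msg, lanes, lane_len);
        printf("each of the %d planes carries %zu of the %zu bytes\n",
               PLANES, lane_len[0], sizeof msg);
        return 0;
    }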

15

16 Interconnection Network Wiring

17 Inter-node communication mechanism
- Node A requests the control unit to reserve a data path from node A to node B; the control unit reserves the path and then replies to node A.
- Node A begins the data transfer to node B.
- Node B receives all the data, then sends a data-transfer completion code to node A.
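Below is a toy sequential model, in C, of the three-step protocol just described. The control unit and the reserved path are just a struct and function calls; it only shows the ordering of events (reserve, transfer, complete), not the real hardware interfaces or signal formats.

    /* Toy model of the path-reservation protocol: reserve, transfer, complete. */
    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    struct control_unit {
        bool path_reserved;
        int  src, dst;
    };

    /* Step 1: the source node asks the control unit for a path and gets a reply. */
    static bool reserve_path(struct control_unit *cu, int src, int dst)
    {
        if (cu->path_reserved)
            return false;                     /* crossbar path busy; request waits */
        cu->path_reserved = true;
        cu->src = src;
        cu->dst = dst;
        printf("control unit: path %d -> %d reserved\n", src, dst);
        return true;
    }

    /* Step 2: data moves over the reserved crossbar path. */
    static void transfer(const char *data, char *remote_memory, size_t len)
    {
        memcpy(remote_memory, data, len);
    }

    /* Step 3: the receiver returns a completion code and the path is released. */
    static void complete(struct control_unit *cu)
    {
        printf("node %d: all data received, completion code sent to node %d\n",
               cu->dst, cu->src);
        cu->path_reserved = false;
    }

    int main(void)
    {
        struct control_unit cu = {0};
        char node_b_memory[16] = {0};

        if (reserve_path(&cu, 0, 1)) {
            transfer("hello", node_b_memory, 6);
            complete(&cu);
            printf("node 1 now holds: %s\n", node_b_memory);
        }
        return 0;
    }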

18 Inter-node interface with ECC codes

19 Inter-node interface with ECC codes
- To cope with the error occurrence rate, ECC codes are added to the transferred data.
- A receiver node detects an intermittent inter-node communication failure by checking the ECC codes, and the erroneous byte can almost always be corrected by the RCU within the receiver node (see the sketch below).
- ECC is also used to recover from inter-node communication failures caused by a data switch unit malfunction.
- Correction continues in this way until the faulty switch unit is repaired.
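The slide does not say which ECC format the RCU uses, so the sketch below substitutes a classic Hamming(7,4) code to show the general mechanism: check bits are added before transfer, and the receiver can locate and flip a single corrupted bit. It illustrates the idea of correcting errors caused by a misbehaving switch unit, not the Earth Simulator's actual code.

    /* Illustrative single-error-correcting Hamming(7,4) code (not the real ECC).
       Codeword bit i corresponds to Hamming position i+1: p1 p2 d1 p3 d2 d3 d4. */
    #include <stdio.h>

    static unsigned encode(unsigned data)          /* 4-bit payload -> 7-bit codeword */
    {
        unsigned d1 = (data >> 0) & 1, d2 = (data >> 1) & 1;
        unsigned d3 = (data >> 2) & 1, d4 = (data >> 3) & 1;
        unsigned p1 = d1 ^ d2 ^ d4;
        unsigned p2 = d1 ^ d3 ^ d4;
        unsigned p3 = d2 ^ d3 ^ d4;
        return p1 | (p2 << 1) | (d1 << 2) | (p3 << 3) |
               (d2 << 4) | (d3 << 5) | (d4 << 6);
    }

    static unsigned decode(unsigned cw)     /* correct one flipped bit, return payload */
    {
        unsigned s1 = ((cw >> 0) ^ (cw >> 2) ^ (cw >> 4) ^ (cw >> 6)) & 1;
        unsigned s2 = ((cw >> 1) ^ (cw >> 2) ^ (cw >> 5) ^ (cw >> 6)) & 1;
        unsigned s3 = ((cw >> 3) ^ (cw >> 4) ^ (cw >> 5) ^ (cw >> 6)) & 1;
        unsigned pos = s1 | (s2 << 1) | (s3 << 2);  /* error position 1..7, 0 = clean */
        if (pos)
            cw ^= 1u << (pos - 1);                  /* flip the corrupted bit back */
        return ((cw >> 2) & 1) | (((cw >> 4) & 1) << 1) |
               (((cw >> 5) & 1) << 2) | (((cw >> 6) & 1) << 3);
    }

    int main(void)
    {
        unsigned data = 0xB;                            /* 4-bit payload 1011 */
        unsigned corrupted = encode(data) ^ (1u << 4);  /* one bit flipped in transit */
        printf("sent 0x%X, recovered 0x%X after correction\n", data, decode(corrupted));
        return 0;
    }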

20 Barrier Synchronization mechanism using GBC

21 Barrier synchronization mechanism using GBC
- GBC – global barrier counter; GBF – global barrier flag
- Barrier synchronization mechanism (a software analogue is sketched below):
  - The master node sets the number of nodes used by the parallel program into the GBC within the IN's control unit
  - The control unit resets the GBFs of all the nodes used by the program
  - Each node, when its task completes, decrements the GBC within the control unit and then repeatedly checks its GBF until the GBF is asserted
  - When GBC = 0, the control unit asserts the GBFs of all the nodes used by the program
  - All the nodes begin to process their next tasks
- The barrier synchronization time is constantly less than 3.5 µsec
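A minimal software analogue of this mechanism, assuming C11 atomics and POSIX threads stand in for the nodes and the control unit's registers: the master sets the counter, each worker decrements it when its task ends, the last arrival asserts the flag, and the rest spin on the flag. The real mechanism lives in the IN control unit's hardware; this sketch only mirrors the sequence of operations on the slide.

    /* Software analogue of the GBC/GBF barrier: shared counter + shared flag. */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    #define NODES 8

    static atomic_int gbc;                     /* global barrier counter */
    static atomic_int gbf;                     /* global barrier flag    */

    static void *node(void *arg)
    {
        long id = (long)arg;
        /* ... this node's share of the parallel task would run here ... */
        if (atomic_fetch_sub(&gbc, 1) == 1)    /* last node to finish its task   */
            atomic_store(&gbf, 1);             /* "control unit" asserts the GBF */
        while (!atomic_load(&gbf))             /* everyone else spins on the GBF */
            ;
        printf("node %ld passes the barrier\n", id);
        return NULL;
    }

    int main(void)
    {
        pthread_t t[NODES];

        atomic_store(&gbc, NODES);             /* master sets GBC to the node count */
        atomic_store(&gbf, 0);                 /* control unit resets the GBF       */

        for (long i = 0; i < NODES; i++)
            pthread_create(&t[i], NULL, node, (void *)i);
        for (int i = 0; i < NODES; i++)
            pthread_join(t[i], NULL);
        return 0;
    }

Compile with -pthread; the last thread to decrement the counter releases all the others, just as the control unit does when GBC reaches 0.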

22 Bird's-eye View of the Earth Simulator System

23

24 Performance
- Using the GBC feature, the MPI_Barrier synchronization time is constantly less than 3.5 µsec (a measurement sketch follows).
- The software barrier synchronization time, by contrast, grows in proportion to the number of nodes.
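A measurement of the kind quoted above could be made by timing a loop of MPI_Barrier calls with MPI_Wtime, as sketched below. The iteration count is an arbitrary illustrative choice; the 3.5 µsec figure comes from the slide, not from running this code.

    /* Time repeated MPI_Barrier calls and report the average per-barrier cost. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int iters = 1000;
        MPI_Barrier(MPI_COMM_WORLD);           /* align all ranks before timing */

        double t0 = MPI_Wtime();
        for (int i = 0; i < iters; i++)
            MPI_Barrier(MPI_COMM_WORLD);
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("average barrier time: %.2f usec\n", (t1 - t0) / iters * 1e6);

        MPI_Finalize();
        return 0;
    }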

25 Performance
- The interconnection network is a single-stage network, so this performance is achieved for every pair of communicating nodes (ping-pong sketch below).
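Because the crossbar is single-stage, a point-to-point ping-pong test should report essentially the same bandwidth for any pair of nodes. The sketch below is a standard MPI ping-pong between ranks 0 and 1; the message size and repetition count are illustrative choices, not the benchmark configuration used by the authors.

    /* MPI ping-pong between ranks 0 and 1 to estimate point-to-point bandwidth. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int n = 1 << 20;                  /* 1 MiB message */
        const int reps = 100;
        char *buf = malloc(n);
        memset(buf, 0, n);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < reps; i++) {
            if (rank == 0) {
                MPI_Send(buf, n, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, n, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, n, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(buf, n, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("bandwidth: %.2f MB/s\n", 2.0 * reps * n / (t1 - t0) / 1e6);

        free(buf);
        MPI_Finalize();
        return 0;
    }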

26 Performance
- The ratio of sustained to peak performance is more than 85%.
- Performance scales in proportion to the number of nodes.

27 Conclusion
- High-performance and high-efficiency architectural features:
  - Vector processor
  - Shared memory
  - High-bandwidth, non-blocking crossbar interconnection network
- Three levels of parallelism give high sustained performance:
  - Vector processing on a processor
  - Parallel processing with shared memory within a node
  - Parallel processing among distributed nodes via the interconnection network
- 87.5% of peak performance (35.86 TFLOPS) – LINPACK
- 64.9% of peak performance (26.58 TFLOPS) – global atmospheric circulation model with the spectral method

28 Applications – Solid Earth Simulation Group
- We are developing new algorithms for geophysical simulations as well as new grid systems in spherical geometry.

29 Solid Earth Simulation Group

30 To understand the mechanism of variability on time scales from a few days to decades and to study predictability in the atmosphere.

31 To study the effects of meso-scale phenomena on the ocean general circulation and material transport.

32 To understand the mechanism of variability and to study predictability in the coupled atmosphere–ocean system.

33 References
- http://www.thocp.net/hardware/nec_ess.htm
- http://www.es.jamstec.go.jp/esc/eng/Hardware/in.html

34 Thank You

