Presentation transcript: "An Introduction of GridMPI", Yutaka Ishikawa and Motohiko Matsuda, University of Tokyo and Grid Technology Research Center, AIST

Slide 1: An Introduction of GridMPI
Yutaka Ishikawa (1,2) and Motohiko Matsuda (2)
(1) University of Tokyo
(2) Grid Technology Research Center, AIST (National Institute of Advanced Industrial Science and Technology)
This work is partially supported by the NAREGI project.
http://www.gridmpi.org/

Slide 2: Motivation
MPI, the Message Passing Interface, has been widely used to program parallel applications. Users want to run such applications over a Grid environment without modifying the programs. However, the performance of existing MPI implementations does not scale in a Grid environment.
[Figure: a single (monolithic) MPI application spanning computing resource sites A and B over a wide-area network]

Slide 3: Motivation
Focus on a metropolitan-area, high-bandwidth environment: 10 Gbps bandwidth and distances within about 500 miles (less than 10 ms one-way latency).
– We have already demonstrated, using an emulated WAN environment, that the performance of the NAS Parallel Benchmark programs scales if the one-way latency is smaller than 10 ms.
[Figure: a single (monolithic) MPI application spanning computing resource sites A and B over a wide-area network]
Reference: Motohiko Matsuda, Yutaka Ishikawa, and Tomohiro Kudoh, "Evaluation of MPI Implementations on Grid-connected Clusters using an Emulated WAN Environment," CCGRID 2003, 2003.

Slide 4: Issues
High-performance communication facilities for MPI on long, fat networks:
– TCP vs. MPI communication patterns
– Network topology: latency and bandwidth
Interoperability:
– Most MPI library implementations use their own network protocol.
Fault tolerance and migration:
– To survive a site failure.
Security.
TCP vs. MPI: TCP is designed for streams, whereas MPI traffic is bursty, because applications repeat computation and communication phases and the traffic changes with the communication pattern.
[Figure: computing sites connected over the Internet]

Slide 5: Issues (continued)
High-performance communication facilities for MPI on long, fat networks:
– TCP vs. MPI communication patterns
– Network topology: latency and bandwidth
Interoperability:
– There are many MPI library implementations, and most use their own network protocol.
Fault tolerance and migration:
– To survive a site failure.
Security.
TCP vs. MPI: TCP is designed for streams, whereas MPI traffic is bursty, because applications repeat computation and communication phases and the traffic changes with the communication pattern (see the sketch below).
[Figure: computing sites connected over the Internet, each running a different vendor's MPI library (vendors A, B, C, and D)]
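To make the TCP vs. MPI contrast concrete, here is a minimal sketch of the alternating compute/communicate structure that produces bursty traffic. The kernel below is hypothetical and not taken from the talk: no data crosses the WAN during the computation phase, then all processes exchange data at once.

```c
/* Minimal sketch (not from the talk): a typical MPI program alternates
 * computation phases with communication phases, so the WAN link sees
 * bursts of traffic rather than a steady stream. */
#include <mpi.h>
#include <stdlib.h>

#define N 1000000

int main(int argc, char **argv)
{
    int rank, size, iter;
    double *local, *exch;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    local = malloc(N * sizeof(double));
    exch  = malloc(N * sizeof(double));
    for (int i = 0; i < N; i++) local[i] = rank;

    for (iter = 0; iter < 100; iter++) {
        /* Computation phase: no traffic on the network. */
        for (int i = 0; i < N; i++)
            local[i] = local[i] * 0.5 + 1.0;

        /* Communication phase: every process exchanges data at once,
         * producing a synchronized burst that a stream-oriented TCP
         * stack handles poorly on a long, fat network. */
        MPI_Alltoall(local, N / size, MPI_DOUBLE,
                     exch,  N / size, MPI_DOUBLE, MPI_COMM_WORLD);
    }

    free(local);
    free(exch);
    MPI_Finalize();
    return 0;
}
```

Because every process enters the communication phase at roughly the same time, the link alternates between idle periods and bursts, which is the mismatch with stream-oriented TCP that the slide points out.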

Slide 6: GridMPI Features
MPI-2 implementation.
IMPI (Interoperable MPI) protocol and extensions for the Grid:
– MPI-2
– New collective protocols
– Checkpointing
Integration of vendor MPI: IBM, Solaris, Fujitsu, and MPICH2.
High-performance TCP/IP implementation on long, fat networks:
– Pacing the transmission rate so that burst transmission is controlled according to the MPI communication pattern (see the pacing sketch below).
[Figure: clusters X and Y, running vendor MPI and YAMPII respectively, connected via IMPI with checkpointing]
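The transmission-pacing bullet can be illustrated with a simple software-pacing sketch. This is a hypothetical illustration, not GridMPI's actual code (GridMPI's mechanism is described in the pacing papers cited on the concluding slide): segments are spaced out with an inter-segment gap so that the sending rate matches a target instead of arriving as one burst.

```c
/* Hypothetical sketch of software pacing (not GridMPI's implementation):
 * send a message in fixed-size segments, inserting a gap between segments
 * so the average rate matches rate_bps instead of one large burst. */
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <time.h>

static ssize_t paced_send(int sock, const char *buf, size_t len,
                          size_t seg, double rate_bps)
{
    size_t sent = 0;
    /* Gap so that one segment of seg bytes every gap_ns nanoseconds
     * equals the target rate. */
    long gap_ns = (long)((double)seg * 8.0 / rate_bps * 1e9);
    struct timespec gap = { gap_ns / 1000000000L, gap_ns % 1000000000L };

    while (sent < len) {
        size_t chunk = (len - sent < seg) ? len - sent : seg;
        ssize_t n = send(sock, buf + sent, chunk, 0);
        if (n < 0)
            return -1;
        sent += (size_t)n;
        /* nanosleep() is far too coarse for precise pacing on fast links;
         * this only illustrates the idea of smoothing the burst. */
        nanosleep(&gap, NULL);
    }
    return (ssize_t)sent;
}
```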

Slide 7: Evaluation
It is almost impossible to reproduce the communication behavior of a wide-area network. A WAN emulator, GtrcNET-1, is therefore used to examine implementations, protocols, communication algorithms, etc. under controlled, reproducible conditions.
GtrcNET-1, developed at AIST:
– Injection of delay, jitter, errors, ...
– Traffic monitoring and frame capture
– Four 1000Base-SX ports
– One USB port for the host PC
– FPGA (XC2V6000)
http://www.gtrc.aist.go.jp/gnet/

Slide 8: Experimental Environment
8 + 8 PCs:
– CPU: Pentium 4 / 2.4 GHz, Memory: DDR400 512 MB
– NIC: Intel PRO/1000 (82547EI)
– OS: Linux 2.6.9-1.6 (Fedora Core 2)
– Socket buffer size: 20 MB
[Figure: two 8-node clusters (nodes 0-7 and 8-15), each behind a Catalyst 3750 switch, connected through the GtrcNET-1 WAN emulator; bandwidth 1 Gbps, delay 0 ms to 10 ms]
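The 20 MB socket buffer is comfortably larger than the bandwidth-delay product of the emulated link (1 Gbps at a 20 ms round trip is about 2.5 MB). A minimal sketch of requesting such buffers with setsockopt() follows; how GridMPI actually configures its sockets is not shown in the slides, and Linux additionally caps these values via net.core.wmem_max and net.core.rmem_max.

```c
/* Sketch: request 20 MB send and receive buffers on a TCP socket,
 * matching the experimental setup above. Illustrative only. */
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

static int set_socket_buffers(int sock)
{
    int size = 20 * 1024 * 1024;   /* 20 MB, as in the experiment */

    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &size, sizeof(size)) < 0) {
        perror("SO_SNDBUF");
        return -1;
    }
    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &size, sizeof(size)) < 0) {
        perror("SO_RCVBUF");
        return -1;
    }
    return 0;
}
```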

Slide 9: GridMPI vs. MPICH-G2 (1/4)
FT (Class B) of NAS Parallel Benchmarks 3.2 on 8 x 8 processes.
[Figure: relative performance vs. one-way delay (msec)]

Slide 10: GridMPI vs. MPICH-G2 (2/4)
IS (Class B) of NAS Parallel Benchmarks 3.2 on 8 x 8 processes.
[Figure: relative performance vs. one-way delay (msec)]

Slide 11: GridMPI vs. MPICH-G2 (3/4)
LU (Class B) of NAS Parallel Benchmarks 3.2 on 8 x 8 processes.
[Figure: relative performance vs. one-way delay (msec)]

Slide 12: GridMPI vs. MPICH-G2 (4/4)
NAS Parallel Benchmarks 3.2, Class B, on 8 x 8 processes. No parameters were tuned in GridMPI.
[Figure: relative performance vs. one-way delay (msec)]

Slide 13: GridMPI on an Actual Network
NAS Parallel Benchmarks run on 16 nodes: an 8-node (2.4 GHz) cluster at Tsukuba and an 8-node (2.8 GHz) cluster at Akihabara, each internally connected by Gigabit Ethernet and linked by the JGN2 network (10 Gbps bandwidth, 1.5 msec RTT, about 60 km / 40 mi.).
The performance is compared with:
– the result using 16 nodes (2.4 GHz)
– the result using 16 nodes (2.8 GHz)
[Figure: relative performance per benchmark]

Slide 14: Demonstration
Easy installation:
– Download the source.
– Build it and set up the configuration files.
Easy use:
– Compile your MPI application.
– Run it! (A minimal example program follows below.)
[Figure: the Tsukuba and Akihabara clusters connected by the JGN2 network (10 Gbps bandwidth, 1.5 msec RTT, about 60 km / 40 mi.)]
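For the "compile and run" step, any standard MPI program works unmodified; the example below is illustrative. The exact GridMPI compile and launch commands and the configuration file format are documented at http://www.gridmpi.org/ rather than on this slide.

```c
/* Minimal MPI program to illustrate the "compile and run" step.
 * Any MPI-1.2 program like this should run unmodified. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &len);

    printf("rank %d of %d on %s\n", rank, size, name);

    MPI_Finalize();
    return 0;
}
```

With a typical MPI toolchain this would be built with something like `mpicc hello.c -o hello` and launched with `mpirun`; the specific GridMPI invocation across the two sites is not shown here and is left as an assumption.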

Slide 15: NAREGI Software Stack (Beta Ver. 2006)
[Diagram: stack components including Grid-Enabled Nano-Applications; Grid PSE; Grid Programming (Grid RPC, GridMPI); Grid Visualization; Grid Workflow; Super Scheduler; Grid VM; Distributed Information Service; Data; High-Performance & Secure Grid Networking; built on Globus, Condor, and UNICORE moving toward OGSA / WSRF]

Slide 16: GridMPI Current Status
GridMPI version 0.9 has been released:
– MPI-1.2 features are fully supported.
– MPI-2.0 features are supported except for MPI-IO and the one-sided communication primitives.
– Conformance tests: MPICH Test Suite 0/142 (fails/tests), Intel Test Suite 0/493 (fails/tests).
GridMPI version 1.0 will be released this spring:
– MPI-2.0 fully supported.
http://www.gridmpi.org/

Slide 17: Concluding Remarks
GridMPI is integrated into the NAREGI package.
GridMPI is not only for production but also our research vehicle for the Grid environment, in the sense that new ideas for the Grid are implemented and tested in it.
We are currently studying high-performance communication mechanisms for long, fat networks:
– Modifications of TCP behavior: M. Matsuda, T. Kudoh, Y. Kodama, R. Takano, and Y. Ishikawa, "TCP Adaptation for MPI on Long-and-Fat Networks," IEEE Cluster 2005, 2005.
– Precise software pacing: R. Takano, T. Kudoh, Y. Kodama, M. Matsuda, H. Tezuka, and Y. Ishikawa, "Design and Evaluation of Precise Software Pacing Mechanisms for Fast Long-Distance Networks," PFLDnet 2005, 2005.
– Collective communication algorithms with respect to network latency and bandwidth (a simple cost-model sketch follows below).
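For the last bullet, a common way to reason about collectives on a high-latency WAN is the alpha-beta cost model, where alpha is the per-message latency and beta the per-byte transfer time. The sketch below is illustrative only and is not GridMPI's implementation: it compares a binomial-tree broadcast against a scatter-plus-allgather broadcast and picks whichever the model predicts to be cheaper for the given message size and process count.

```c
/* Illustrative alpha-beta model for choosing a broadcast algorithm
 * (not GridMPI's code):
 *   binomial tree:        ceil(log2(p)) * (alpha + n*beta)
 *   scatter + allgather:  ~2*log2(p)*alpha + 2*(p-1)/p * n*beta
 * On a high-latency link the alpha term dominates for small messages,
 * so the tree wins; for large messages the bandwidth term dominates. */
#include <math.h>

enum bcast_alg { BCAST_BINOMIAL, BCAST_SCATTER_ALLGATHER };

static enum bcast_alg choose_bcast(double alpha, double beta,
                                   double nbytes, int nprocs)
{
    double logp = ceil(log2((double)nprocs));
    double cost_tree = logp * (alpha + nbytes * beta);
    double cost_scag = 2.0 * logp * alpha
                     + 2.0 * ((double)(nprocs - 1) / nprocs) * nbytes * beta;
    return (cost_tree <= cost_scag) ? BCAST_BINOMIAL
                                    : BCAST_SCATTER_ALLGATHER;
}
```

On the emulated links used above (one-way delays up to 10 ms), the latency term is large, which is why latency- and bandwidth-aware algorithm selection matters in the Grid setting.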

Slide 18: BACKUP

Slide 19: GridMPI Version 1.0
– YAMPII, developed at the University of Tokyo, is used as the core implementation.
– Intra-cluster communication by YAMPII (TCP/IP, SCore).
– Inter-cluster communication by IMPI (TCP/IP); see the routing sketch below.
[Figure: layered architecture, with the MPI API on top of a Request Interface and Request Layer; a P2P Interface over TCP/IP, PMv2, MX, O2G, and vendor MPI; IMPI with the LACT layer (collectives); and an RPIM Interface over ssh, rsh, SCore, Globus, and vendor MPI]
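As a rough illustration of the intra/inter split described above, the sketch below routes messages between ranks on the same cluster through a local (YAMPII-style) transport and messages crossing clusters through an IMPI-style path. This is hypothetical code with made-up names, not the actual GridMPI source.

```c
/* Hypothetical routing sketch (not the GridMPI source): same-cluster
 * traffic uses the local transport, cross-cluster traffic uses IMPI. */
#include <stdio.h>
#include <stddef.h>

/* Stubs standing in for the real P2P (YAMPII) and IMPI request layers. */
static int yampii_send(int dest, const void *buf, size_t len)
{
    (void)buf;
    printf("intra-cluster send to rank %d (%zu bytes) via YAMPII\n", dest, len);
    return 0;
}

static int impi_send(int dest, const void *buf, size_t len)
{
    (void)buf;
    printf("inter-cluster send to rank %d (%zu bytes) via IMPI\n", dest, len);
    return 0;
}

/* cluster_of[rank] gives the site a rank runs on. */
static int route_send(const int *cluster_of, int self, int dest,
                      const void *buf, size_t len)
{
    if (cluster_of[self] == cluster_of[dest])
        return yampii_send(dest, buf, len);  /* same site: TCP/IP or SCore */
    return impi_send(dest, buf, len);        /* other site: IMPI over TCP/IP */
}

int main(void)
{
    int cluster_of[4] = { 0, 0, 1, 1 };  /* ranks 0-1 on cluster X, 2-3 on Y */
    char msg[64] = "hello";

    route_send(cluster_of, 0, 1, msg, sizeof(msg));  /* stays inside cluster X */
    route_send(cluster_of, 0, 3, msg, sizeof(msg));  /* crosses to cluster Y */
    return 0;
}
```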

Slide 20: GridMPI vs. Others (1/2)
NAS Parallel Benchmarks 3.2, Class B, on 8 x 8 processes.
[Figure: relative performance vs. one-way delay (msec)]

Slide 21: GridMPI vs. Others (1/2)
NAS Parallel Benchmarks 3.2, Class B, on 8 x 8 processes.
[Figure: relative performance]

Slide 22: GridMPI vs. Others (2/2)
NAS Parallel Benchmarks 3.2, Class B, on 8 x 8 processes.
[Figure: relative performance]

Slide 23: GridMPI vs. Others
NAS Parallel Benchmarks 3.2 on 16 x 16 processes.
[Figure: relative performance]

Slide 24: GridMPI vs. Others
NAS Parallel Benchmarks 3.2.
[Figure: relative performance]

