Using HPS Switch on Bassi. Jonathan Carter, User Services Group Lead. NERSC User Group Meeting, June 12, 2006.


1 Using HPS Switch on Bassi. Jonathan Carter, User Services Group Lead, jtcarter@lbl.gov. NERSC User Group Meeting, June 12, 2006.

2 IBM Switch Evolution

3
Year  Name                  Peak BW                                  Latency   Processor
1996  SP Switch             300 MB/s per node (2x150 MB/s channel)   20-35 us  Power2/Power3
2000  SP Switch2 (Colony)   2 GB/s per node (2x500 MB/s per port)    ~17 us    Power3/Power4
2003  HPS (Federation)      2 GB/s per port                          5-14 us   Power4/Power5

4 HPS Switch Configuration

5 Bassi Switch Configuration
B0101 B0201 B0301 B0401 B0501 B0601 B0701 B0801 B0901 B1001 B1101 B1201
B0102 B0202 B0302 B0402 B0502 B0602 B0702 B0802 B0902 B1002 B1102 B1202
B0103 B0203 B0303 B0403 B0503 B0603 B0703 B0803 B0903 B1003 B1103 B1203
B2904 B0304 B0404 B0504 B0704 B0804 B0904 B1004 B1104 B1204
B0205 B0305 B0405 B0505 B0705 B8905 B0905 B1005 B1105 B1205
B0206 B0306 B0406 B0506 B0706 B0806 B0906 B1006 B1106 B1206
B0207 B0307 B0407 B0507 B0707 B8907 B0907 B1007 B1107 B1207
B2908 B0308 B0408 B0508 B0708 B0808 B0908 B1008 B1108 B1208
B0209 B0309 B0409 B0709 B0809 B0909 B1009 B1109 B1209
B0210 B0310 B0410 B0710 B0810 B0910 B1010 B1110 B1210
B0211 B0311 B0411 B0711 B0811 B0911 B1011 B1111 B1211
B0212 B0312 B0412 B0712 B0812 B0912 B1012 B1112 B1212

6 IBM Software
Parallel Environment (PE 4.2.2), which contains poe and MPI, remains unchanged
Parallel System Support Package (PSSP 3.5.0), which contains LAPI, has been absorbed into the Reliable Scalable Clustering Technology (RSCT 2.4.2) software stack

7 IBM Software
MPI 4.2.2
  – Uses LAPI as reliable transport layer
  – Uses threads, not signals, for asynchronous activities
Binary compatible
New performance characteristics
  – Eager
  – Bulk transfer
  – Collectives

8 IBM Software Stack
Diagram components: HPS SMA3+ Adapter, HAL, LAPI, IF_LS, IP, MPI, Application, ESSL, PESSL, GPFS, Sockets, VSD, TCP, UDP

9 Communication Modes
FIFO mode
  – Chopped into 2KB chunks on host, copied by CPU
Remote Direct Memory Access (RDMA)
  – CPU offload
  – One I/O bus crossing
Diagram labels: Adapter, CPU, User Buffer, FIFO, RDMA, DMA
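To give a feel for the host-side work in FIFO mode, here is a minimal sketch that counts how many 2KB chunks the CPU would have to copy for one message. It assumes 2KB means 2048 bytes of payload and ignores packet headers; the function name `fifo_chunks` is hypothetical and not part of any IBM API.

```python
# Hypothetical helper: estimate CPU-copied chunks for one message in FIFO mode.
# Assumption: "2KB" on the slide means 2048 payload bytes; headers are ignored.
FIFO_CHUNK_BYTES = 2 * 1024


def fifo_chunks(message_bytes: int) -> int:
    """Return the number of 2KB chunks needed to carry message_bytes."""
    if message_bytes < 0:
        raise ValueError("message size must be non-negative")
    # Integer ceiling division: a partial chunk still costs one copy.
    return (message_bytes + FIFO_CHUNK_BYTES - 1) // FIFO_CHUNK_BYTES


if __name__ == "__main__":
    for size in (1, 2048, 2049, 150 * 1024, 1024 * 1024):
        print(f"{size:>8} bytes -> {fifo_chunks(size)} chunk(s)")
```

Under these assumptions a 1 MB message costs 512 host copies in FIFO mode, which is the work RDMA is meant to take off the CPU.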

10 RDMA (Bulk transfer)
Overlap of communication and computation possible
  – Asynchronous-messaging applications
  – One-sided communications
Reduce CPU work
  – Offload fragmentation and reassembly
  – Minimize packet arrival interrupts
Reduce memory subsystem load
  – Zero copy transport
Striping across adapters

11 RDMA vs. Packet

12 MPI Transfer Protocols
Eager: send data immediately; store in remote buffer
  – No synchronization
  – Only one message sent
  – Uses memory for buffering (less for application)
Rendezvous: send message header; wait for recv to be posted; send data
  – No data copy may be required
  – No memory required for buffering (more for application)
  – More messages required
  – Synchronization (standard send blocks until recv posted)
Diagram labels (P0 to P1): data, ack; req, ack, data, ack
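The two protocols can be sketched as a small model. This is an illustration only, not IBM's implementation: it assumes a message is sent eagerly when its size is at or below the eager limit, and it reads the diagram labels as two message sequences between P0 and P1. The names `choose_protocol` and `wire_trace` are hypothetical.

```python
# Illustrative model of eager vs. rendezvous; not the IBM MPI implementation.

def choose_protocol(message_bytes: int, eager_limit_bytes: int) -> str:
    """Assumption: sizes at or below the eager limit go eagerly."""
    if message_bytes < 0 or eager_limit_bytes < 0:
        raise ValueError("sizes must be non-negative")
    return "eager" if message_bytes <= eager_limit_bytes else "rendezvous"


def wire_trace(protocol: str) -> list:
    """Return (sender, receiver, kind) tuples, as read from the slide diagram."""
    traces = {
        # Data goes out at once and is buffered remotely if no recv is posted.
        "eager": [("P0", "P1", "data"), ("P1", "P0", "ack")],
        # Header first; data moves only after the receiver has posted its recv.
        "rendezvous": [
            ("P0", "P1", "req"),
            ("P1", "P0", "ack"),
            ("P0", "P1", "data"),
            ("P1", "P0", "ack"),
        ],
    }
    return traces[protocol]


if __name__ == "__main__":
    limit = 32 * 1024  # largest MP_EAGER_LIMIT value quoted in the talk
    for size in (1024, limit, limit + 1):
        proto = choose_protocol(size, limit)
        print(size, proto, [kind for _, _, kind in wire_trace(proto)])
```

The model makes the trade-off visible: rendezvous doubles the number of wire messages but never needs a remote buffer for the payload.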

13 Eager vs. Rendezvous

14 Latency
System    Intra (us)  Inter (us)
Seaborg   10.5        24.5
Jacquard  0.6         4.7
Bassi     1.1         4.5

15 Internode Comparison

16 Internode Comparison

17 Intranode Comparison

18 Intranode Comparison

19 Packed-node Comparison

20 Packed-node Comparison

21 POE environment variables
MP_SINGLE_THREAD
  – Set to Yes for slight latency decrease, set to No for MPI I/O and OpenMP, etc.
MP_USE_BULK_XFER
  – Default to Yes
MP_BULK_MIN_MSG_SIZE
  – Default to ~150KB
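One way to apply these settings is to build the job environment programmatically before launching. The sketch below is an assumption-laden illustration: the helper `poe_environment` is hypothetical, the values are written as lowercase `yes`/`no`, and `MP_BULK_MIN_MSG_SIZE` is assumed to accept a plain byte count. Check the PE Operation and Use guide for the accepted value formats.

```python
import os

# Hypothetical helper: build an environment dict carrying the POE variables
# named on the slide. Value formats here are assumptions, not verified syntax.

def poe_environment(uses_threads_or_mpi_io: bool,
                    bulk_min_bytes=None,
                    base_env=None) -> dict:
    env = dict(os.environ if base_env is None else base_env)
    # "No" is needed for MPI I/O, OpenMP, etc.; "Yes" trims latency slightly.
    env["MP_SINGLE_THREAD"] = "no" if uses_threads_or_mpi_io else "yes"
    # Bulk transfer (RDMA) defaults to Yes per the talk; set it explicitly.
    env["MP_USE_BULK_XFER"] = "yes"
    if bulk_min_bytes is not None:
        # Messages at or above this size are candidates for bulk transfer.
        env["MP_BULK_MIN_MSG_SIZE"] = str(int(bulk_min_bytes))
    return env


if __name__ == "__main__":
    env = poe_environment(uses_threads_or_mpi_io=False,
                          bulk_min_bytes=64 * 1024, base_env={})
    for name in sorted(env):
        print(f"export {name}={env[name]}")
    # The dict could then be passed as env= to subprocess.run when launching
    # the job; the launch command itself is site-specific and not shown.
```

Passing `base_env={}` keeps the example output limited to the three variables; in real use the default inherits the current environment.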

22 POE environment variables
MP_BUFFER_MEM
  – Default is 64MB
MP_EAGER_LIMIT
  – Varies from 32KB to 1KB depending on job size; can be increased in conjunction with MP_BUFFER_MEM
LAPI parameters for apps with many blocking sends of small messages:
  – MP_REXMIT_BUF_SIZE: default is 128 bytes
  – MP_REXMIT_BUF_CNT: default is 128 buffers
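The link between MP_EAGER_LIMIT and MP_BUFFER_MEM can be illustrated with back-of-envelope arithmetic. This sketch only divides the buffer by the eager limit to bound how many maximum-size eager messages could be held at once; it does not model the flow control that PE actually applies, and the helper names are hypothetical.

```python
# Rough sizing arithmetic using the defaults quoted in the talk.
# Assumption: KB and MB are binary units (1024-based).
KB = 1024
MB = 1024 * KB


def eager_messages_that_fit(buffer_mem_bytes: int, eager_limit_bytes: int) -> int:
    """Upper bound on full-size eager messages held in the buffer at once."""
    if eager_limit_bytes <= 0:
        raise ValueError("eager limit must be positive")
    return buffer_mem_bytes // eager_limit_bytes


def rexmit_pool_bytes(buf_size: int = 128, buf_cnt: int = 128) -> int:
    """Total retransmit buffer space: MP_REXMIT_BUF_SIZE * MP_REXMIT_BUF_CNT."""
    return buf_size * buf_cnt


if __name__ == "__main__":
    for limit in (32 * KB, 1 * KB):
        n = eager_messages_that_fit(64 * MB, limit)
        print(f"eager limit {limit // KB}KB: up to {n} buffered messages")
    print("default retransmit pool:", rexmit_pool_bytes(), "bytes")
```

Doubling the eager limit halves this bound, which is why the talk suggests raising MP_BUFFER_MEM alongside MP_EAGER_LIMIT.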

23 IBM Documentation
RSCT for AIX 5L LAPI Programming Guide (SA22-7936-03)
  – LAPI programming
Parallel Environment for AIX 5L V4.2.2 Operation and Use, Vol 1 (SA22-7948-04)
  – Running jobs
Parallel Environment for AIX 5L V4.2.2 Operation and Use, Vol 2 (SA22-7949-04)
  – Performance tools
Parallel Environment for AIX 5L V4.2.2 MPI Programming Guide (SA22-7945-04)
  – IBM MPI implementation

