
1 PHENIX Computing Center in Japan (CC-J) Takashi Ichihara (RIKEN and RIKEN BNL Research Center) Presented on 08/02/2000 at the CHEP2000 conference, Padova, Italy

2 Takashi Ichihara (RIKEN / RIKEN BNL Research Center) Contents
1. Overview
2. Concept of the system
3. System requirements
4. Other requirements as a Regional Computing Center
5. Plan and current status
6. WG for constructing the CC-J (CC-J WG)
7. Current configuration of the CC-J
8. Photographs of the CC-J
9. Linux CPU farm
10. Linux NFS performance vs. kernel
11. HPSS current configuration
12. HPSS performance test
13. WAN performance test
14. Summary

3 Takashi Ichihara (RIKEN / RIKEN BNL Research Center) PHENIX CC-J: Overview
- PHENIX Regional Computing Center in Japan (CC-J) at RIKEN
- Scope
  - Principal site of computing for PHENIX simulation
  - The PHENIX CC-J aims to cover most of the simulation tasks of the whole PHENIX experiment
  - Regional Asian computing center
  - Center for the analysis of RHIC spin physics
- Architecture
  - Essentially follows the architecture of the RHIC Computing Facility (RCF) at BNL
- Construction
  - R&D for the CC-J started in April 1998 at RBRC
  - Construction began in April 1999 over a three-year period
  - A 1/3 scale of the CC-J will be operational in April 2000

4 Takashi Ichihara (RIKEN / RIKEN BNL Research Center) Concept of the CC-J System

5 Takashi Ichihara (RIKEN / RIKEN BNL Research Center) System Requirements for the CC-J
- Annual data amount
  DST: 150 TB
  micro-DST: 45 TB
  Simulated data: 30 TB
  Total: 225 TB
- Hierarchical storage system
  Handles a data amount of 225 TB/year
  Total I/O bandwidth: 112 MB/s
  HPSS system
- Disk storage system
  15 TB capacity, all RAID
  I/O bandwidth: 520 MB/s
- CPU (SPECint95)
  Simulation: 8200
  Sim. reconstruction: 1300
  Sim. analysis: 170
  Theor. model: 800
  Data analysis: 1000
  Total: 11470
- Data duplication facility
  Export/import of DST and simulated data
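
As a quick cross-check of the figures above, here is a minimal Python sketch (not part of the original presentation) that sums the CPU budget and converts the 225 TB/year total into an average sustained data rate:

# Back-of-the-envelope check of the requirement figures (all inputs are
# copied from the slide above; the script itself is an added illustration).

cpu_budget_specint95 = {
    "Simulation": 8200,
    "Sim. reconstruction": 1300,
    "Sim. analysis": 170,
    "Theor. model": 800,
    "Data analysis": 1000,
}
print(sum(cpu_budget_specint95.values()))   # 11470, as quoted on the slide

annual_volume_tb = 150 + 45 + 30            # DST + micro-DST + simulated data
seconds_per_year = 365 * 24 * 3600
avg_ingest_mb_s = annual_volume_tb * 1e6 / seconds_per_year   # 1 TB = 1e6 MB
print(f"{avg_ingest_mb_s:.1f} MB/s")        # ~7.1 MB/s yearly average ingest

The 112 MB/s aggregate HPSS bandwidth is thus far above the yearly average rate, presumably to leave headroom for concurrent staging, reconstruction passes, and analysis.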

6 Takashi Ichihara (RIKEN / RIKEN BNL Research Center) Other Requirements as a Regional Computing Center
- Software environment
  The software environment of the CC-J should be compatible with the PHENIX offline software environment at the RHIC Computing Facility (RCF) at BNL
  AFS accessibility (/afs/rhic)
  Objectivity/DB accessibility (replication to be tested soon)
- Data accessibility
  Need to exchange 225 TB/year of data with the RCF
  Most of the data exchange will be done with SD3 tape cartridges (50 GB/volume)
  Some of the data exchange will be done over the WAN
  The CC-J will use the Asia-Pacific Advanced Network (APAN) for the US-Japan connection: http://www.apan.net/
  APAN currently has 70 Mbps of bandwidth on the Japan-US connection
  Expecting that 10-30% of the APAN bandwidth (7-21 Mbps) can be used for this project: 75-230 GB/day (27-82 TB/year) will be transferred over the WAN
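
The WAN estimate in the last bullet follows directly from the quoted bandwidth share. The following Python sketch (an added illustration, not part of the slides) reproduces the conversion, assuming the link share is used around the clock:

# How much data a 10-30% share of the 70 Mbps APAN link can carry.

APAN_MBPS = 70.0

def daily_volume_gb(share):
    """GB/day carried by the given fraction of the APAN bandwidth."""
    mbps = APAN_MBPS * share
    mb_per_s = mbps / 8.0                 # megabits -> megabytes
    return mb_per_s * 86400 / 1000.0      # MB/day -> GB/day

for share in (0.10, 0.30):
    gb_day = daily_volume_gb(share)
    tb_year = gb_day * 365 / 1000.0
    print(f"{share:.0%}: {gb_day:.0f} GB/day, {tb_year:.0f} TB/year")
# 10%: 76 GB/day, 28 TB/year
# 30%: 227 GB/day, 83 TB/year
# -> approximately the 75-230 GB/day and 27-82 TB/year quoted on the slide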

7 Takashi Ichihara (RIKEN / RIKEN BNL Research Center) Plan and current status of the CC-J

8 Takashi Ichihara (RIKEN / RIKEN BNL Research Center) Working Group for the CC-J construction (CC-J WG)
- The CC-J WG is the main body for constructing the CC-J
- It holds regular bi-weekly meetings at RIKEN Wako to discuss technical items, project plans, etc.
- A mailing list of the CC-J WG has been created (mail traffic: 1600 mails/year)

9 Current configuration of the CC-J

10 Photographs of the PHENIX CC-J at RIKEN

11 Takashi Ichihara (RIKEN / RIKEN BNL Research Center) Linux CPU farm
- Memory requirement: 200-300 MB/CPU for a simulation chain
- Node specification
  Motherboard: ASUS P2B
  Dual CPUs/node (currently 64 CPUs in total): Pentium II (450 MHz) x 32 + Pentium III (600 MHz) x 32
  512 MB memory/node (1 GB swap/node)
  14 GB HD/node (system 4 GB, work 10 GB)
  100BaseT Ethernet interface (DECchip Tulip)
  Linux Red Hat 5.2 (kernel 2.2.11 + nfsv3 patch)
  Portable Batch System (PBS V2.1) for batch queuing
  AFS is accessed through NFS (no AFS client is installed on the Linux PCs)
  Daily mirroring of the /afs/rhic contents to a local disk file system is carried out
- PC assembly (Alta cluster)
  Remote hardware reset/power control, remote CPU temperature monitoring
  Serial-port login from the neighboring node (minicom) for maintenance (fsck etc.)
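
The slide does not say how the daily /afs/rhic mirror is implemented. As one plausible illustration only (the tool choice and the paths below are assumptions, not from the presentation), a small Python wrapper around rsync that could be run daily from cron:

# Sketch of a daily /afs/rhic -> local-disk mirror pass (hypothetical paths;
# the actual mirroring mechanism used at the CC-J is not described).
import subprocess

AFS_SOURCE = "/afs/rhic/"                 # hypothetical: AFS tree on the mirror host
LOCAL_MIRROR = "/data/afs-mirror/rhic/"   # hypothetical local disk target

def mirror_afs_tree():
    """Run one mirror pass; intended to be invoked once per day."""
    subprocess.run(
        ["rsync", "-a", "--delete", AFS_SOURCE, LOCAL_MIRROR],
        check=True,
    )

if __name__ == "__main__":
    mirror_afs_tree()

The Linux nodes then read the mirrored tree over NFS without needing an AFS client, which is the arrangement the slide describes.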

12 Takashi Ichihara (RIKEN / RIKEN BNL Research Center) Linux NFS performance vs. kernel
- NFS performance test using the bonnie benchmark with a 2 GB file
  NFS server: Sun Enterprise 450 (Solaris 2.6), 4 CPUs (400 MHz), 1 GB memory
  NFS client: Linux RH5.2, dual Pentium II (600 MHz), 512 MB memory
- The NFS performance of recent Linux kernels appears to have improved
- The nfsv3 patch is still useful for the recent kernel (2.2.14)
  Currently we are using kernel 2.2.11 + the nfsv3 patch
  The nfsv3 patch is available from http://www.fys.uio.no/~trondmy/src/

13 Takashi Ichihara (RIKEN / RIKEN BNL Research Center) Current HPSS hardware configuration
- IBM RS6000-SP, 5 nodes (silver nodes: quad PowerPC 604e 332 MHz CPUs per node)
  Core server: 1, disk movers: 2, tape movers: 2
  SP switch (300 MB/s) and 1000BaseSX NIC (OEM of Alteon)
- A StorageTek Powderhorn tape robot
  4 Redwood drives and 2000 SD3 cartridges (100 TB) dedicated to HPSS
  The robot is shared with other HSM systems (6 drives and 3000 cartridges for the other HSM systems)
- Gigabit Ethernet
  Alteon ACE180 switch for jumbo frames (9 kB MTU)
  Use of jumbo frames reduces the CPU utilization for transfers
  Cisco Catalyst 2948G for distribution to 100BaseT
- Cache disk: 700 GB total, 5 components
  3 SSA loops (50 GB each)
  2 FW-SCSI RAID arrays (270 GB each)
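
As a rough illustration of why the 9 kB jumbo-frame MTU lowers CPU utilization (the reasoning is implied by the slide; the numbers below are an added example): per-frame processing cost dominates, so for a given throughput fewer, larger frames mean fewer interrupts and protocol operations. A small Python sketch, using the ~50 MB/s aggregate rate quoted on the next slide:

# Approximate frame rate needed to carry a given payload throughput
# (ignoring protocol headers).

def frames_per_second(throughput_mb_s, mtu_bytes):
    return throughput_mb_s * 1e6 / mtu_bytes

for mtu in (1500, 9000):
    print(f"MTU {mtu}: {frames_per_second(50, mtu):,.0f} frames/s at 50 MB/s")
# MTU 1500: ~33,333 frames/s
# MTU 9000: ~5,556 frames/s  -> roughly 6x fewer frames to process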

14 Takashi Ichihara (RIKEN / RIKEN BNL Research Center) Performance test of parallel FTP (pftp) of HPSS
- pput from the Sun E450: 12 MB/s for one pftp connection (Gigabit Ethernet, jumbo frames, 9 kB MTU)
- pput from Linux: 6 MB/s for one pftp connection (100BaseT - Gigabit Ethernet - jumbo frames; defragmentation on a switch)
- In total, ~50 MB/s pftp performance was obtained for pput

15 Takashi Ichihara (RIKEN / RIKEN BNL Research Center) WAN performance test
- Path: RIKEN (12 Mbps) - IMnet - APAN (70 Mbps) - STAR TAP - ESnet - BNL
- Round-trip time for RIKEN-BNL: 170 ms
- The file transfer rate is 47 kB/s for an 8 kB TCP window size (the Solaris default)
- A large TCP window size is necessary to obtain a high transfer rate
  RFC 1323 (TCP Extensions for High Performance, May 1992) describes the method for using a large TCP window size (> 64 kB)
- Large ftp performance (641 kB/s = 5 Mbps) was obtained for a single ftp connection using a large TCP window size (512 kB) across the Pacific Ocean (RTT = 170 ms)
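
The 47 kB/s figure follows directly from the window/RTT limit on a single TCP stream. A small Python sketch (added for illustration, not from the slides):

# Upper bound on single-stream TCP throughput: at most one window per RTT.

RTT_S = 0.170  # RIKEN-BNL round-trip time from the slide

def max_throughput_kb_s(window_kb, rtt_s=RTT_S):
    return window_kb / rtt_s

for window_kb in (8, 64, 512):
    print(f"{window_kb:>4} kB window: <= {max_throughput_kb_s(window_kb):.0f} kB/s")
#    8 kB window: <= 47 kB/s   (matches the observed Solaris-default rate)
#   64 kB window: <= 376 kB/s
#  512 kB window: <= 3012 kB/s

With the 512 kB window the window/RTT ceiling (~3 MB/s) is well above the observed 641 kB/s, so at that point the shared 12 Mbps RIKEN access link, rather than the window size, is presumably the practical limit.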

16 Takashi Ichihara (RIKEN / RIKEN BNL Research Center) Summary
- The construction of the PHENIX Computing Center in Japan (CC-J) at the RIKEN Wako campus, which will extend over a three-year period, began in April 1999.
- The CC-J is intended as the principal site of computing for PHENIX simulation, a regional PHENIX Asian computing center, and a center for the analysis of RHIC spin physics.
- The CC-J will handle about 220 TB of data per year, and the total CPU performance is planned to be 10,000 SPECint95 in 2002.
- The CPU farm of 64 processors (RH5.2, kernel 2.2.11 with the nfsv3 patch) is stable.
- About 50 MB/s pftp performance was obtained for HPSS access.
- Large ftp performance (641 kB/s = 5 Mbps) was obtained for a single ftp connection using a large TCP window size (512 kB) across the Pacific Ocean (RTT = 170 ms).
- Stress tests of the entire system were carried out successfully.
- Replication of the Objectivity/DB over the WAN will be tested soon.
- CC-J operation will start in April 2000.

