Gfarm Grid File System for Distributed and Parallel Data Computing
Osamu Tatebe, National Institute of Advanced Industrial Science and Technology


1 Gfarm Grid File System for Distributed and Parallel Data Computing
National Institute of Advanced Industrial Science and Technology
Osamu Tatebe, o.tatebe@aist.go.jp
Grid Technology Research Center, AIST
APAN Workshop on Exploring eScience, Aug 26, 2005, Taipei, Taiwan

2 [Background] Petascale Data Intensive Computing
High Energy Physics
- CERN LHC, KEK-B Belle
- ~MB/collision, 100 collisions/sec
- ~PB/year
- 2000 physicists, 35 countries
- [Figures: detector for ALICE experiment, detector for LHCb experiment]
Astronomical Data Analysis
- Data analysis of the whole data
- TB~PB/year/telescope
- Subaru telescope: 10 GB/night, 3 TB/year
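The data rates on this slide can be checked with a few lines of arithmetic. The sketch below is only an order-of-magnitude estimate: it assumes exactly 1 MB per collision, decimal units, and a detector that records continuously all year (the `duty_cycle` parameter is a hypothetical knob, not something stated on the slide). For Subaru it derives the number of observing nights implied by the two quoted figures.

```python
# Order-of-magnitude check of the slide's data rates (decimal units assumed).
MB, GB, TB, PB = 10**6, 10**9, 10**12, 10**15
SECONDS_PER_YEAR = 365 * 24 * 3600

def yearly_volume(bytes_per_event, events_per_sec, duty_cycle=1.0):
    """Bytes recorded per year; duty_cycle is a hypothetical uptime fraction."""
    return bytes_per_event * events_per_sec * SECONDS_PER_YEAR * duty_cycle

# ~1 MB per collision at 100 collisions per second, running all year.
hep_pb_per_year = yearly_volume(1 * MB, 100) / PB

# Observing nights implied by 10 GB/night and 3 TB/year.
subaru_nights = (3 * TB) / (10 * GB)

print(f"HEP detector: about {hep_pb_per_year:.2f} PB/year")
print(f"Subaru: about {subaru_nights:.0f} observing nights/year")
```

Under these assumptions a single detector produces a few PB per year, which matches the "~PB/year" figure on the slide.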

3 Petascale Data-intensive Computing Requirements
- Peta/Exabyte scale files, millions of millions of files
- Scalable computational power: > 1 TFLOPS, hopefully > 10 TFLOPS
- Scalable parallel I/O throughput: > 100 GB/s, hopefully > 1 TB/s, within a system and between systems
- Efficient global sharing with group-oriented authentication and access control
- Fault tolerance / dynamic re-configuration
- Resource management and scheduling
- System monitoring and administration
- Global computing environment

4 Goal and features of Grid Datafarm
Goal
- Dependable data sharing among multiple organizations
- High-speed data access, high-performance data computing
Grid Datafarm
- Gfarm Grid File System: global dependable virtual file system
  - Federates scratch disks in PCs
- Parallel & distributed data computing
  - Associates Computational Grid with Data Grid
Features
- Secured based on Grid Security Infrastructure
- Scalable depending on data size and usage scenarios
- Location-transparent data access
- Automatic and transparent replica selection for fault tolerance
- High-performance data access and computing by accessing multiple dispersed storages in parallel (file affinity scheduling)

5 Gfarm file system (1)
- Virtual file system that federates local disks of cluster nodes or Grid nodes
- Enables transparent access, through a global namespace, to file data dispersed in a Grid
- Supports fault tolerance and avoids access concentration by automatic and transparent replica selection
- It can be shared among all cluster nodes and clients
[Figure: global namespace mapping and file replica creation in the Gfarm File System. Labels: /gfarm, ggf, jp, aist, gtrc, file1, file2, file3, file4]
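Automatic replica selection can be illustrated with a small in-memory model. This is a minimal sketch, not Gfarm's actual algorithm: the catalog contents, host names, and the "prefer the local replica, otherwise pick any live one" policy are all assumptions made for illustration.

```python
import random

# Hypothetical replica catalog: global path -> hosts holding a replica.
REPLICA_CATALOG = {
    "/gfarm/example/file1": ["node01", "node07"],
    "/gfarm/example/file2": ["node03"],
}

def select_replica(path, alive_hosts, local_host=None):
    """Return a live host that holds a replica of path.

    A replica on local_host is preferred; otherwise a random live replica
    is chosen, which also spreads load across copies.
    """
    candidates = [h for h in REPLICA_CATALOG.get(path, []) if h in alive_hosts]
    if not candidates:
        raise FileNotFoundError(f"no reachable replica for {path}")
    if local_host in candidates:
        return local_host
    return random.choice(candidates)

# node01 is down, so the surviving replica on node07 is used transparently.
print(select_replica("/gfarm/example/file1", alive_hosts={"node03", "node07"}))
```

The caller only names the global path; which physical copy serves the request is decided at access time, which is what makes a node failure invisible to the application as long as one replica survives.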

6 Gfarm file system (2)
- A file can be shared among all nodes and clients
- Physically, it may be replicated and stored on any file system node
- Applications can access it regardless of its location
- In a cluster environment, a shared secret key is used for authentication
[Figure: a client PC, a notebook PC, compute nodes, and a GridFTP, samba, NFS server access the Gfarm file system at /gfarm; the Gfarm metadata server holds the metadata; replicas of File A, File B, and File C are spread over the nodes]

7 Grid-wide configuration
- Grid-wide file system by integrating local disks in several areas
- GSI authentication
- It can be shared among all cluster nodes and clients
- GridFTP and samba servers in each site
[Figure: sites in the US, Japan, and Singapore, each with a GridFTP server, samba server, (NFS server) and compute & fs nodes, together with a metaserver node, form the Gfarm Grid file system at /gfarm]

8 Features of Gfarm file system
- A file can be stored on any file system (compute) node (distributed file system)
- A file can be replicated and stored on different nodes (fault tolerant, access concentration tolerant)
- When there is a file replica on a compute node, it can be accessed without overhead (high performance, scalable I/O)

9 More Scalable I/O Performance
User's view
- User A submits Job A, which accesses File A
- User B submits Job B, which accesses File B
- The Gfarm file system appears as a shared network file system across the cluster or Grid
Physical execution view in Gfarm (file-affinity scheduling)
- Job A is executed on a node that has File A
- Job B is executed on a node that has File B
- File system nodes = compute nodes
- Do not separate storage and CPU (SAN not necessary)
- Move and execute the program instead of moving large-scale data
- Scalable file I/O by exploiting local I/O
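File-affinity scheduling, as described on this slide, places each job on a node that already stores its input. The following is a minimal greedy sketch under assumed data structures (the `jobs`, `replica_hosts`, and `free_nodes` arguments are hypothetical); it is not Gfarm's scheduler.

```python
def schedule(jobs, replica_hosts, free_nodes):
    """Greedy file-affinity placement.

    jobs:          dict mapping job name -> input file
    replica_hosts: dict mapping file -> set of nodes holding a replica
    free_nodes:    iterable of idle node names
    Returns a dict mapping job name -> (node, reads_locally).
    """
    free = list(free_nodes)
    placement = {}
    for job, path in jobs.items():
        if not free:
            raise RuntimeError(f"no idle node left for {job}")
        # Prefer an idle node that already stores the job's input file.
        local = [n for n in free if n in replica_hosts.get(path, set())]
        node = local[0] if local else free[0]
        free.remove(node)
        placement[job] = (node, bool(local))
    return placement

jobs = {"jobA": "fileA", "jobB": "fileB"}
replica_hosts = {"fileA": {"n2"}, "fileB": {"n1"}}
print(schedule(jobs, replica_hosts, ["n1", "n2"]))
```

Both jobs end up reading from a local disk, so no file data crosses the network. A greedy pass like this can take a node that a later job would have preferred; a production scheduler would need a smarter matching, or would create an extra replica.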

10 Gfarm™ Data Grid middleware
- Open source development
  - Gfarm™ version 1.1.1 released on May 17th, 2005 (http://datafarm.apgrid.org/)
  - Read-write mode support, more support for existing binary applications, metadata cache server
- A shared file system in a cluster or a grid
  - Accessibility from legacy applications without any modification
  - Standard protocol support by scp, GridFTP server, samba server, ...
- Existing applications can access the Gfarm file system without any modification by using LD_PRELOAD of the syscall hooking library or GfarmFS-FUSE
[Figure: an application linked with the Gfarm client library communicates with the metadata server (gfmd, slapd) and with gfsd on the compute and file system nodes]

11 Gfarm™ Data Grid middleware (2)
- libgfarm: Gfarm client library
  - Gfarm API
- gfmd, slapd: metadata server
  - Namespace, replica catalog, host information, process information
- gfsd: I/O server
  - Remote file access
[Figure: the application, through the Gfarm client library, obtains file and host information from the metadata server (gfmd, slapd) and performs remote file access through gfsd on the compute and file system nodes]

12 Access from legacy applications
libgfs_hook.so: system call hooking library
- Emulates mounting the Gfarm file system at /gfarm by hooking open(2), read(2), write(2), ...
- When an access falls under /gfarm, it calls the appropriate Gfarm API
- Otherwise, it calls the ordinary system call
- Re-linking is not necessary when LD_PRELOAD is specified
- Linux, FreeBSD, NetBSD, ...
- Higher portability than developing a kernel module
Mounting the Gfarm file system
- GfarmFS-FUSE enables mounting the Gfarm file system using the FUSE mechanism in Linux
  - Released on Jul 12, 2005
- Need to develop a kernel module for other OSs
  - Need volunteers
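The routing decision made by the hooking library can be sketched in a few lines. The real libgfs_hook.so is a shared object that interposes on libc entry points; the Python below only models the decision logic described on the slide, and `hooked_open`, `gfarm_open`, and `system_open` are hypothetical names introduced for illustration.

```python
import builtins

GFARM_PREFIX = "/gfarm"

def is_gfarm_path(path):
    """True for /gfarm itself and anything below it, but not for /gfarmX."""
    return path == GFARM_PREFIX or path.startswith(GFARM_PREFIX + "/")

def hooked_open(path, mode="r", gfarm_open=None, system_open=builtins.open):
    """Route an open() call to a Gfarm handler or to the ordinary open()."""
    if is_gfarm_path(path):
        if gfarm_open is None:
            raise NotImplementedError("no Gfarm handler configured")
        return gfarm_open(path, mode)
    return system_open(path, mode)

# Stand-in handlers that only record which branch was taken.
via_gfarm = hooked_open("/gfarm/data.txt", gfarm_open=lambda p, m: ("gfarm", p, m))
via_system = hooked_open("/tmp/data.txt", system_open=lambda p, m: ("system", p, m))
print(via_gfarm, via_system)
```

Because the check happens inside the intercepted call, an unmodified binary keeps calling open(2) as usual and never needs to know that some paths are served by Gfarm.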

13 Gfarm – Application and performance result
http://datafarm.apgrid.org/

14 Scientific Application (1)
ATLAS Data Production
- Distribution kit (binary)
- Atlfast: fast simulation
  - Input data stored in the Gfarm file system, not NFS
- G4sim: full simulation
- (Collaboration with ICEPP, KEK)
Belle Monte-Carlo/Data Production
- Online data processing
- Distributed data processing
- Realtime histogram display
- 10 M events generated in a few days using a 50-node PC cluster
- (Collaboration with KEK, U-Tokyo)

15 Scientific Application (2)
Astronomical Object Survey
- Data analysis on the whole archive
- 652 GBytes of data observed by the SUBARU telescope
Large configuration data from Lattice QCD
- Three sets of hundreds of gluon field configurations on a 24^3*48 4-D space-time lattice (3 sets x 364.5 MB x 800 = 854.3 GB)
- Generated by the CP-PACS parallel computer at the Center for Computational Physics, Univ. of Tsukuba (300 Gflops x years of CPU time)
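The Lattice QCD sizes quoted above can be reproduced from the lattice dimensions. The storage layout used here is an assumption, not something stated on the slide: four link variables per site, each a 3x3 complex matrix in double precision, with MB and GB read as binary units (2^20 and 2^30 bytes).

```python
# Assumed layout: 4 links per site, 3x3 complex matrix per link,
# 16 bytes per double-precision complex number.
LINKS_PER_SITE = 4
COMPLEX_PER_LINK = 3 * 3
BYTES_PER_COMPLEX = 16

sites = 24**3 * 48                       # 24^3 x 48 space-time lattice
config_bytes = sites * LINKS_PER_SITE * COMPLEX_PER_LINK * BYTES_PER_COMPLEX
config_mib = config_bytes / 2**20        # size of one configuration

# 3 sets x 800 configurations each, converted from MiB to GiB.
total_gib = 3 * 800 * config_mib / 1024

print(f"{sites} sites, {config_mib} MB per configuration, {total_gib:.1f} GB in total")
```

Under these assumptions one configuration comes to 364.5 MB and the three sets to 854.3 GB, matching both figures on the slide.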

16 Performance result of parallel grep
- 25 GBytes text file
- Xeon 2.8GHz/512KB, 2GB memory
- NFS: 340 sec (sequential grep)
- Gfarm: 15 sec (16 fs nodes, 16 parallel processes)
- 22.6 times superlinear speed up
[Figure: compute nodes sharing one NFS server vs. compute nodes forming the Gfarm file system]
*Gfarm file system consists of local disks of compute nodes
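The "superlinear" claim follows directly from the two timings. The short calculation below derives the speedup, the parallel efficiency, and the implied throughput; it assumes decimal units (25 GBytes = 25,000 MB) and that both runs scanned the whole file. The quotient is about 22.67, which the slide reports as 22.6.

```python
def speedup_and_efficiency(t_baseline, t_parallel, workers):
    """Speedup over the baseline and speedup per worker."""
    speedup = t_baseline / t_parallel
    return speedup, speedup / workers

speedup, efficiency = speedup_and_efficiency(340, 15, 16)

FILE_MB = 25_000                      # 25 GBytes, decimal units assumed
nfs_mb_per_s = FILE_MB / 340          # sequential grep over NFS
gfarm_mb_per_s = FILE_MB / 15         # 16 parallel greps on local disks
per_node_mb_per_s = gfarm_mb_per_s / 16

print(f"speedup {speedup:.2f}x, efficiency {efficiency:.2f}")
print(f"NFS {nfs_mb_per_s:.1f} MB/s, Gfarm {gfarm_mb_per_s:.1f} MB/s "
      f"({per_node_mb_per_s:.1f} MB/s per node)")
```

An efficiency above 1.0 means each of the 16 nodes reads faster from its own disk than the single client did over NFS, which is why the speedup exceeds the number of processes.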

17 GridFTP data transfer performance
- Two GridFTP servers can provide almost peak performance (1 Gbps)
[Figure: transfer between client and ftpd, local disk vs Gfarm (1~2 nodes)]

18 Gaussian 03 in Gfarm
- Ab initio quantum chemistry package
  - Install once and run everywhere
  - No modification required to access Gfarm
- Test415 (IO intensive test input)
  - 1h 54min 33sec (NFS)
  - 1h 0min 51sec (Gfarm)
- Parallel analysis of all 666 test inputs using 47 nodes
  - Write error! (NFS), due to heavy IO load
  - 17h 31m 02s (Gfarm)
  - Quite good scalability of IO performance
  - Elapsed time can be reduced by re-ordering test inputs
[Figure: NFS vs Gfarm; compute nodes sharing one NFS server vs. compute nodes forming Gfarm]
*Gfarm consists of local disks of compute nodes
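The remark that elapsed time can be reduced by re-ordering test inputs can be illustrated with a toy simulation. The slide does not say which ordering was intended; the sketch below assumes a longest-first ordering (a common heuristic for this kind of job list) and uses made-up durations, not the Gaussian 03 measurements.

```python
import heapq

def makespan(durations, workers):
    """Elapsed time when each idle worker takes the next input in list order."""
    finish_times = [0.0] * workers
    heapq.heapify(finish_times)
    for d in durations:
        earliest = heapq.heappop(finish_times)   # worker that frees up first
        heapq.heappush(finish_times, earliest + d)
    return max(finish_times)

# Hypothetical run times: four short inputs followed by one long input.
durations = [1, 1, 1, 1, 4]

as_given = makespan(durations, workers=2)
longest_first = makespan(sorted(durations, reverse=True), workers=2)
print(f"original order: {as_given}, longest first: {longest_first}")
```

When a long input is started last, one node works alone while the others sit idle. Starting long inputs first lets the short ones fill in the gaps.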

19 Bioinformatics in Gfarm
iGAP (Integrative Genome Annotation Pipeline)
- A suite of bioinformatics software for protein structural and functional annotation
- More than 140 complete or partial proteomes analyzed
iGAP on Gfarm
- Install once and run everywhere using Gfarm's high performance file replication and transfer
- No modifications required to use distributed compute and storage resources
Burkholderia mallei (Bacteria)
Gfarm makes it possible to use iGAP to analyze the complete proteome (available 9/28/04) of the bacterium Burkholderia mallei, a known biothreat agent, on distributed resources. This is a collaboration under PRAGMA, and the data is available through http://eol.sdsc.edu.
Participating sites: SDSC/UCSD (US), BII (Singapore), Osaka Univ, AIST (Japan), Konkuk Univ, Kookmin Univ, KISTI (Korea)

20 [Figure: iGAP annotation pipeline, Step 1 to Step 6]
Input: protein sequences
Prediction of:
- signal peptides (SignalP, PSORT)
- transmembrane (TMHMM, PSORT)
- coiled coils (COILS)
- low complexity regions (SEG)
Structural assignment of domains by PSI-BLAST profiles on FOLDLIB
Structural assignment of domains by 123D on FOLDLIB
Structural assignment of domains by WU-BLAST
Functional assignment by PFAM, NR assignments
Domain location prediction by sequence
Building FOLDLIB:
- PDB chains, SCOP domains, PDP domains, CE matches PDB vs. SCOP
- 90% sequence non-identical
- minimum size 25 aa
- coverage (90%, gaps <30, ends <30)
Other labels: structure info, sequence info, Data Warehouse, NR, PFAM, SCOP, PDB

21 Cluster configuration of Worldwide iGAP/Gfarm data analysis
Site     N   P   Type        Proc Speed (GHz)  Memory (GB)  Hard Disk (GB)  OS
Kookmin  8   1   Athlon      1                 0.256        40              RH 7.3
Konkuk   8   1   Intel P4    3                 1            40              RH 7.3
KISTI    8   1   Intel P4    2.4               1            80              RH 7.3
UCSD     5   2   Intel Xeon  3.06              2            40              Rocks 3.2
BII      4   1   Intel P3    1.3               1            30              RH 8.0
Osaka    10  1   Intel P3    1.4               1            70              RH 7.2
Total    43  48              202               52.048       2500

22 Preliminary performance result
Multiple cluster data analysis
- 4-node cluster A: 30.07 min
- 4-node cluster A + 4-node cluster B: 17.39 min
[Figure labels: NFS, Gfarm]
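The benefit of adding the second cluster can be quantified from the two timings. This is a minimal sketch that assumes both runs processed the same workload and that the node count exactly doubled from 4 to 8; the slide does not state which file system each timing used.

```python
t_one_cluster = 30.07    # minutes, 4-node cluster A
t_two_clusters = 17.39   # minutes, cluster A plus 4-node cluster B

speedup = t_one_cluster / t_two_clusters
scaling_efficiency = speedup / 2      # ideal speedup for twice the nodes is 2

print(f"speedup {speedup:.2f}x with twice the nodes, "
      f"scaling efficiency {scaling_efficiency:.0%}")
```

Under these assumptions the second cluster delivers roughly 1.7 times the throughput, or about 86% of ideal scaling.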


24 Development Status and Future Plan
Gfarm – Grid file system
- Global virtual file system
  - A dependable network shared file system in a cluster or a grid
  - High performance data computing support
  - Associates Computational Grid with Data Grid
Gfarm Grid software
- Version 1.1.1 released on May 17, 2005 (http://datafarm.apgrid.org/)
- Version 1.2 available real soon now
- Existing programs can access the Gfarm file system using the syscall hooking library or GfarmFS-FUSE
Distributed analysis shows scalable I/O performance
- iGAP/Gfarm: bioinformatics package
- Gaussian 03: ab initio quantum chemistry package
Standardization effort with GGF Grid File System WG (GFS-WG)
https://datafarm.apgrid.org/

