Grid IO APIs William Gropp Mathematics and Computer Science Division.


1 Grid IO APIs
William Gropp
Mathematics and Computer Science Division

2 Issues in Grid IO
University of Chicago Department of Energy
- Latency
- Bandwidth
- Fault tolerance
- Maintaining transparency to the user
- APIs that capture the user’s intent while preserving well defined semantics and performance

3 One User Model
- Parallel application expresses IO using a higher-level library (HDF) or a parallel IO interface (MPI-IO)
- For many apps, files are not byte streams
  - Sequences of objects such as solution arrays
- Files are either read or written, not both at the same time
  - Caching of data on client or server side can be used
[Figure: layers labeled App, HDF5, MPI-IO, ADIO, GlobusIO, UDP]

4 Grid IO with ROMIO
- Implementation strategy: Add a new “filesystem” type: RIO (remote I/O)
- Use ADIO as generic parallel file interface
[Figure: ADIO layered between MPI and the file systems PVFS, NFS, Unix, Others; a second ADIO connected over the network through RIO to RIOD]

5 Extending ROMIO
[Figure: call chain MPI_File_read_all, ADIO_ReadStridedColl, ADIOI_PVFS_ReadStridedColl, ADIO_PVFS_ReadContig, …]
- Relatively easy to define new devices
- Layered definition makes simple ports possible
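The layering idea on this slide can be illustrated with a toy model. The sketch below is plain Python, not ROMIO or ADIO code, and every name in it (`ContigOnlyDevice`, `CoalescingDevice`, `read_strided`) is hypothetical. It only shows why a simple port is possible: a new device needs to supply just a contiguous read, and the generic layer builds the strided read on top of it unless the device offers something better.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (offset, length); hypothetical request format


class ContigOnlyDevice:
    """Hypothetical minimal device: it only knows how to read one contiguous range."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.contig_calls = 0  # counts low-level requests, for illustration

    def read_contig(self, offset: int, length: int) -> bytes:
        self.contig_calls += 1
        return self.data[offset:offset + length]


class CoalescingDevice(ContigOnlyDevice):
    """Hypothetical richer device: serves a strided request with one contiguous
    read that covers all segments, then picks the requested pieces out of it."""

    def read_strided(self, segments: List[Segment]) -> bytes:
        lo = min(off for off, _ in segments)
        hi = max(off + n for off, n in segments)
        block = self.read_contig(lo, hi - lo)
        return b"".join(block[off - lo:off - lo + n] for off, n in segments)


def read_strided(device, segments: List[Segment]) -> bytes:
    """Generic layer: use the device's own strided read if it defines one,
    otherwise fall back to one contiguous read per segment."""
    native = getattr(device, "read_strided", None)
    if native is not None:
        return native(segments)
    return b"".join(device.read_contig(off, n) for off, n in segments)
```

With three segments, the minimal device issues three contiguous reads while the coalescing device issues one; both return the same bytes.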

6 Exploiting Collective Operations
[Figure: labels Write_all, RIO, RIOD, "Aggregated data written"]
- Must preserve collective I/O operation to retain performance
- # of servers vs. # processes
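Why preserving the collective operation matters can be seen in a small simulation. This is a sketch in plain Python under simplifying assumptions: no MPI is involved, the function names (`interleaved_pattern`, `coalesce`, `collective_write`) are hypothetical, and pieces are assumed not to overlap. When every process writes small interleaved blocks, handling them independently means many small requests; seeing them together lets adjacent pieces be merged into a few large ones.

```python
from typing import List, Tuple

Piece = Tuple[int, bytes]  # (file offset, data)


def interleaved_pattern(num_procs: int, blocks: int, block_size: int) -> List[List[Piece]]:
    """Hypothetical access pattern: process r owns every num_procs-th block."""
    return [
        [((i * num_procs + rank) * block_size, bytes([rank]) * block_size)
         for i in range(blocks)]
        for rank in range(num_procs)
    ]


def coalesce(pieces: List[Piece]) -> List[Piece]:
    """Merge pieces whose byte ranges are adjacent into contiguous runs."""
    runs: List[Piece] = []
    for offset, data in sorted(pieces):
        if runs and runs[-1][0] + len(runs[-1][1]) == offset:
            runs[-1] = (runs[-1][0], runs[-1][1] + data)
        else:
            runs.append((offset, data))
    return runs


def collective_write(per_process: List[List[Piece]]) -> List[Piece]:
    """Model of a collective write: gather every process's pieces, then
    emit the merged runs as the requests that would go to the server."""
    all_pieces = [piece for pieces in per_process for piece in pieces]
    return coalesce(all_pieces)
```

For 4 processes writing 3 blocks each, independent writes would issue 12 requests, while the collective version produces a single contiguous run.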

7 Caching for Noncollective I/O
[Figure: labels Read, RIO, RIOD]
- Note that caching is nearly impossible if full POSIX semantics are retained
- Replicas are a related, higher-level approach already supported by grid toolkits like Globus

8 Optimizing WAN Data Transfers
- TCP: Stream interface (in-order delivery)
  - Implementations use window to optimize for occasional out-of-order delivery
- File ops: Commonly read/write an object, not a stream
  - Object may be large (MB to TB)
  - Exploit by deferring/aggregating acks and retries on an object basis (one kind of fault tolerance)
  - Stripe data paths (GridFTP uses just this)
- Preliminary work is promising (Dickens et al., PDPTA 2001)
  - Exploits user intent at MPI level
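The object-based acknowledgment idea can be sketched as a toy simulation. This is not the protocol from the cited work or from GridFTP; it is a minimal Python model under stated assumptions: loss is simulated by a caller-supplied `drop(round, index)` function, chunks are striped round-robin over `num_paths` paths, and all names are hypothetical. The receiver sends one aggregated acknowledgment per round, and only the missing chunks are retried.

```python
from typing import Callable, Dict, List, Tuple


def split_object(obj: bytes, chunk_size: int) -> List[bytes]:
    """Cut the object into fixed-size chunks (the last one may be shorter)."""
    return [obj[i:i + chunk_size] for i in range(0, len(obj), chunk_size)]


def transfer_object(obj: bytes, chunk_size: int, num_paths: int,
                    drop: Callable[[int, int], bool],
                    max_rounds: int = 10) -> Tuple[bytes, int, List[int]]:
    """Send all chunks, collect one ack per round, retry only what is missing.

    Returns (reassembled object, number of acks, chunks sent per path).
    """
    chunks = split_object(obj, chunk_size)
    received: Dict[int, bytes] = {}
    pending = list(range(len(chunks)))
    path_load = [0] * num_paths
    acks = 0
    rounds = 0
    while pending:
        if rounds >= max_rounds:
            raise RuntimeError("transfer did not complete")
        for index in pending:
            path_load[index % num_paths] += 1  # stripe chunks across paths
            if not drop(rounds, index):        # simulated loss
                received[index] = chunks[index]
        acks += 1  # one aggregated ack covers the whole round
        pending = [i for i in pending if i not in received]
        rounds += 1
    data = b"".join(received[i] for i in range(len(chunks)))
    return data, acks, path_load
```

In this model, a 10-chunk object that loses two chunks on the first attempt completes with two acknowledgments rather than one per chunk.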

9 Quality of Service for Data Transfers
- What is the user API?
- One approach: MPI Attributes
  - Attributes are a (key, value) pair; value is a pointer
  - May be attached to communicators, datatypes, and MPI memory windows
  - Used in MPICH-GQ (Globus-enabled version of MPICH with QoS)
- MPI File objects do not have attributes
  - But do have “info”: (key, value) pairs, key and value both a string
- Allows the use of communicator attributes to implement the underlying communication and info hints on the MPI File as the user API
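The split between string-valued info hints (user side) and attributes on the communicator (implementation side) can be modeled in a few lines. This is a sketch in plain Python, not MPI or MPICH-GQ code: `FakeComm` stands in for a communicator, and the hint keys `qos_bandwidth_kbps` and `qos_max_latency_ms` are invented for illustration. Unknown or malformed hints are ignored, in keeping with hints being advisory.

```python
from dataclasses import dataclass
from typing import Dict, Optional

QOS_KEY = "qos_request"  # hypothetical attribute key


@dataclass
class QosRequest:
    bandwidth_kbps: Optional[int] = None
    max_latency_ms: Optional[int] = None


class FakeComm:
    """Stand-in for a communicator that can carry (key, value) attributes."""

    def __init__(self) -> None:
        self.attrs: Dict[str, object] = {}

    def set_attr(self, key: str, value: object) -> None:
        self.attrs[key] = value


def _as_int(text: Optional[str]) -> Optional[int]:
    """Parse a string hint; return None if it is absent or not an integer."""
    try:
        return int(text) if text is not None else None
    except ValueError:
        return None


def qos_from_info(info: Dict[str, str]) -> QosRequest:
    """Translate string hints into a typed request, skipping unknown keys."""
    return QosRequest(
        bandwidth_kbps=_as_int(info.get("qos_bandwidth_kbps")),
        max_latency_ms=_as_int(info.get("qos_max_latency_ms")),
    )


def open_with_hints(comm: FakeComm, info: Dict[str, str]) -> QosRequest:
    """Model of a file open: the user passes hints, and the implementation
    records the resulting request as an attribute on its communicator."""
    request = qos_from_info(info)
    comm.set_attr(QOS_KEY, request)
    return request
```

The user only ever supplies strings; the typed request lives on the communicator, where the communication layer could consult it.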

10 Issues for Grid IO APIs
- A Grid I/O API must support
  - Collective I/O
    - Performance
    - Latency
  - Object-based transfer completion
    - WAN optimizations
    - Appropriate atomicity
- Exploit opportunities for extensions in standards
  - Use existing mechanisms (APIs) to transfer QoS, intent, other information
- Use Grid infrastructure
  - E.g., Globus Access to Secondary Storage (GASS)

