Presentation on theme: "Global Terascale Data Management Scott Atchley, Micah Beck, Terry Moore Vanderbilt ACCRE Workshop 7 April 2005."— Presentation transcript:
Global Terascale Data Management Scott Atchley, Micah Beck, Terry Moore Vanderbilt ACCRE Workshop 7 April 2005
Talk Outline »What is Terascale Data Management? »Accessing Globally Distributed Data »What is Logistical Networking? »Performing Globally Distributed Computation
Terascale Data Management »Scientific Data Sets are Growing Rapidly Growing from 100s of GBs to 10s of TBs Terascale Supernova Initiative (ORNL) Princeton Plasma Physics Lab fusion simulation »Distributed resources Simulate on supercomputer at ORNL or NERSC Analyze/Visualize at ORNL, NCSU, PPPL, etc. »Distributed Collaborators TSI has members at ORNL, NCSU, SUNYSB, UCSD, FSU, FAU, UC-Davis, others
Terascale Data Management »Scientists use NetCDF or HDF to manage dataset Scientist has logical view of data as variables, dimensions, metadata, etc. NetCDF or HDF manages serialization of data to local file Scientist can query for specific variables, metadata, etc. Requires local file access to use (local disk, NFS, NAS, SAN, etc.) Datasets can be too large for the scientist to store locally - if so, can’t browse
NetCDF (Common Data Format) Variable1 Variable2 Variable3 Header NetCDF file stored on disk Scientist generates data Scientist uses NetCDF to organize data
Accessing Globally Distributed Data »Collaboration requires shared file system Nearly impossible over the wide-area network »Other methods include HTTP, FTP, GridFTP, scp Can be painfully slow to transfer large data Can be cumbersome to set up/manage accounts »How to manage replication? More copies can improve performance, but… If more than one copy exists, how to let others know? How to choose which replica to use?
What is Logistical Networking? »An architecture for globally distributed, sharable resources Storage Computation »A globally deployed infrastructure Over 400 storage servers in 30 countries Serving 30 TB of storage »Open-source client tools and libraries Linux, Solaris, MacOSX, Windows, AIX, others Some Java tools
Logistical Networking »Modeled on Internet Protocol »IBP provides generic storage and generic computation Weak semantics Highly scalable »L-Bone provides resource discovery »exNode provides data aggregation (length and replication) and annotation »LoRS provides fault-tolerance, high performance, security »Multiple applications available Physical Layer IBP Logistical Runtime System Applications LBoneexNode
Current Infrastructure Deployment The public deployment includes 400 IBP depots in 30 countries serving 30 TB storage (leverages PlanetLab). Private deployments for DOE, Brazilian and Czech backbones.
Available LoRS Client Tools Binaries for Windows and MacOSX. Source for Linux, Solaris, AIX, others.
LoDN - Web-based File System Store files into the Logistical Network using Java upload/download tools. Manages exNode “warming” (lease renewal and migration). Provides account (single user or group) as well as “world” permissions.
NetCDF/L »Modified NetCDF that stores data in logistical network (lors://) »Uses libxio (Unix IO wrapper) »Ported NetCDf with 13 lines of code »NetCDF 3.6 provides for >2 GB files (64-bit offset) »LoRS parameters available via environment variables
Libxio »Unix IO wrapper (open(), read(), write(), etc.) »Developed in Czech Republic for Distributed Data Storage (DiDaS) project »Port any Unix IO app using 12 lines #ifdef HAVE_LIBXIO #define open xio_open #define close xio_close... #endif »DiDaS ported transcode (video transcoding) and mplayer (video playback) apps using libxio
High Performance Streaming for Large Scale Simulations »Viraj Bhat, Scott Klasky, Scott Atchley, Micah Beck, Doug McCune, Manish Parashar, High Performance Threaded Data Streaming for Large Scale Simulations, in the proceedings of 5th IEEE/ACM International Workshop on Grid Computing, Pittsburgh, PA, Nov 2004. »Streamed data from NERSC to PPPL using LN »Sent data from previous timestep while computation proceeds on next timestep »Automatically adjusts to network latencies Adds threads when needed »Imposed <3% overhead (as compared to no IO) »Writing to GPFS at NERSC imposed 3-10% overhead (depending on block size)
Failsafe Mechanisms for Data Streaming in Energy Fusion Simulation Buffer Overflow PPPL depots depots close by Signal Data Flow Network Failure Write to GPFS or Simulation Depot GPFS file sys exnodercv Re-fetch failed transfers from GPFS/depots Supercomputer nodes depots on simulation end Post- processing routines Replication
LN in the Ctr for Plasma Edge Simulation »Implementing workflow automation Reliable communication between stages Exposed mapping of parallel files »Collaborative use of detailed simulation traces Terabytes per time step Accessed globally Distributed postprocessing and redistribution »Vizualization of partial datasets Heirarchy: Workstation/cluster/disk cache/tape Access prediction, prestaging is vital
Performing Globally Distributed Computation »Adding generic, restricted computing within the depot »Side-effect-free programming only (no system calls or external libraries except malloc allowed) »Uses IBP capabilities to pass arguments »Mobile code “oplets” with restricted environment C/compiler based Java byte code based »Test applications include Text mining: parallel grep Medical vizualization: brain fiber tracing
Transgrep »Stores 1.3 GB (web server) log file »Data is striped and replicated on dozens of depots »Transgrep breaks job into uniform blocks (10 MB) »Has depots search within blocks »Depots return matches as well as partial first and last lines »Client sorts results and searches lines that overlap block boundaries »15-20 times speed up versus local search »Automatically handles slow or failed depots
Logistical Networking First Steps »Try LoDN by browsing as a guest (Java) »Download the LoRS tools and try them out »Download and use NetCDF/L »Use libxio to port your Unix IO apps to use LN »Run your own IBP depots (publicly or privately)
More information http://loci.cs.utk.edu email@example.com