HPSS The High Performance Storage System Developed by IBM, LANL, LLNL, ORNL, SNL, NASA Langley, NASA Lewis, Cornell, MHPCC, SDSC, UW with funding from.

1 HPSS: The High Performance Storage System
Developed by IBM, LANL, LLNL, ORNL, SNL, NASA Langley, NASA Lewis, Cornell, MHPCC, SDSC, UW, with funding from DoE, NASA & NSF
Presented by Christopher Ho, CSci 599

2 Motivation
- In the last 10 years, processor speeds have increased 50-fold
- Disk transfer rates have increased < 4x
  - RAID now successful, inexpensive
- Tape speeds have increased < 4x
  - tape striping not widespread
- Performance gap is widening!
- Bigger & bigger files (10s, 100s of GB, soon TB)
- => Launch scalable storage initiative

3 IEEE Mass Storage Reference Model
- Defines layers of abstraction & transparency
  - device, location independence
- Separation of policy and mechanism
- Logical separation of control and data flow
- Defines common terminology
  - compliance does not imply inter-operability
- Scalable, Hierarchical Storage Management
- see http://www.ssswg.org/sssdocs.html

4 Introduction: Hierarchical Storage
- Storage pyramid
[Figure: storage pyramid with levels Memory, Disk, Optical disk, Magnetic Tape; axis label: decreasing cost & speed, increasing capacity]

5 HPSS Objectives
- Scalable
  - transfer rate, file size, name space, geography
- Modular
  - software subsystems replaceable, network/tape technologies updateable, API access
- Portable
  - multiple vendor platforms, no kernel modifications, multiple storage technologies, standards-based, leverage commercial products

6 HPSS Objectives (cont)
- Reliable
  - distributed software and hardware components
  - atomic transactions
  - mirror metadata
  - failed/restarted servers can reconnect
  - storage units can be varied on/offline

7 Access into HPSS
- FTP
  - protocol already supports 3rd party transfers
  - new: partial file transfer (offset & size)
- Parallel FTP
  - pget, pput, psetblocksize, psetstripewidth
- NFS version 2
  - most like traditional file system, slower than FTP
- PIOFS
  - parallel distributed FS on IBM SP2 MPP
- futures: AFS/DCE DFS, DMIG-API
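The partial file transfer option (offset & size) is what lets a large file be fetched as several independent pieces. A minimal sketch of how a client might divide a file into such pieces is given here; the function name `partial_ranges` and the even-split policy are assumptions for illustration, not the behavior of the real pget command.

```python
def partial_ranges(file_size, n_streams):
    """Split a file of file_size bytes into up to n_streams (offset, size) pieces.

    Hypothetical helper: pieces are contiguous and as equal in size as possible.
    """
    base, extra = divmod(file_size, n_streams)
    ranges = []
    offset = 0
    for i in range(n_streams):
        # The first `extra` pieces absorb the remainder, one byte each.
        size = base + (1 if i < extra else 0)
        if size:
            ranges.append((offset, size))
        offset += size
    return ranges


if __name__ == "__main__":
    for offset, size in partial_ranges(10, 3):
        print(f"request offset={offset} size={size}")
```

Each (offset, size) pair could then be issued as one partial transfer, possibly on its own connection.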

8 HPSS architecture
[Figure: architecture diagram. Labels: Network Attached Disk, Network Attached Tape, HiPPI/FC/ATM data network, I/O nodes, MPP interconnect, Processing node, HPSS server, Storage System Mgmt, NFS, FTP, DMIG-API, Control Network]

9 Software infrastructure
- Encina transaction processing manager
  - two-phase commit, nested transactions
  - guarantees consistency of metadata, server state
- OSF Distributed Computing Environment
  - RPC calls for control messages
  - Thread library
  - Security (registry & privilege service)
    - Kerberos authentication
- 64-bit arithmetic functions
  - file sizes up to 2^64 bytes
  - 32-bit platforms, big/little endian architectures
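To illustrate why 64-bit arithmetic functions are needed on 32-bit platforms, the sketch below adds two unsigned 64-bit values held as pairs of 32-bit words, carrying from the low word into the high word. The names `split64`, `join64` and `add64` are hypothetical and are not taken from the HPSS library.

```python
MASK32 = 0xFFFFFFFF


def split64(n):
    """Represent an unsigned 64-bit integer as a (high, low) pair of 32-bit words."""
    return ((n >> 32) & MASK32, n & MASK32)


def join64(hi, lo):
    """Recombine a (high, low) pair of 32-bit words into one integer."""
    return (hi << 32) | lo


def add64(a, b):
    """Add two (high, low) pairs using only 32-bit-wide results.

    Returns (high, low, overflow), where overflow is True if the sum
    does not fit in 64 bits.
    """
    a_hi, a_lo = a
    b_hi, b_lo = b
    lo = a_lo + b_lo
    carry = lo >> 32          # 1 if the low words overflowed, else 0
    hi = a_hi + b_hi + carry
    overflow = (hi >> 32) != 0
    return (hi & MASK32, lo & MASK32, overflow)


if __name__ == "__main__":
    # A 6 GB offset plus a 3 GB length does not fit in one 32-bit word.
    hi, lo, _ = add64(split64(6 * 2**30), split64(3 * 2**30))
    print(join64(hi, lo) == 9 * 2**30)
```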

10 Software components
- Name server
  - map POSIX filenames to internal file, directory or link
- Migration/Purge policy manager
  - when/where to migrate to next level in hierarchy
  - after migrated, when to purge copy on this level
    - purge initiated when usage exceeds administrator-configured high-water mark
    - each file evaluated by size, time since last read
  - migration, purge can also be manually initiated
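The purge rule above can be sketched roughly as follows. The slide only states that purging starts at a high-water mark and that files are evaluated by size and time since last read; the low-water stopping point, the scoring formula (size times idle time) and every name in the code are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    name: str
    size: int         # bytes held on this storage level
    last_read: float  # timestamp of the last read
    migrated: bool    # True once a copy exists on the next level


def select_purge_candidates(files, capacity, high_water, low_water, now):
    """Return names of files to purge, or [] if usage is below the high-water mark.

    high_water and low_water are fractions of capacity. The low-water mark
    and the size * idle-time score are assumed, not documented, policy.
    """
    used = sum(f.size for f in files)
    if used <= high_water * capacity:
        return []
    target = low_water * capacity
    # Only files that already have a copy on the next level are safe to purge.
    candidates = [f for f in files if f.migrated]
    # Large files that have been idle for a long time go first.
    candidates.sort(key=lambda f: f.size * (now - f.last_read), reverse=True)
    purged = []
    for f in candidates:
        if used <= target:
            break
        purged.append(f.name)
        used -= f.size
    return purged
```

A manual purge would simply bypass the high-water check and call the selection step directly.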

11 Software components (cont)
- Bitfile server
  - provides abstraction of bitfiles to client
  - provides scatter/gather capability
  - supports access by file offset, length
  - supports random and parallel reads/writes
  - works with file segment abstraction (see Storage server)

12 Software components (cont)
- Storage server
  - map segments onto virtual volumes, virtual volumes onto physical volumes
  - virtual volumes allow tape striping
- Mover
  - transfers data from a source to a sink
    - tape, disk, network, memory
  - device control: seek, load/unload, write tape mark, etc.
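One way to picture how a virtual volume can stripe data over several physical volumes is a round-robin block layout. The sketch below assumes that layout and a fixed block size; the actual HPSS mapping is not described on the slide, and the function name `locate` is hypothetical.

```python
def locate(offset, stripe_width, block_size):
    """Map a byte offset in a striped virtual volume to a physical location.

    Assumes blocks are dealt round-robin across stripe_width physical volumes.
    Returns (physical_volume_index, offset_on_that_volume).
    """
    block = offset // block_size   # which logical block holds the byte
    pv = block % stripe_width      # physical volume that stores the block
    stripe = block // stripe_width # how many full stripes precede it
    return pv, stripe * block_size + offset % block_size


if __name__ == "__main__":
    for off in (0, 1024, 5000):
        print(off, locate(off, stripe_width=4, block_size=1024))
```

With this layout, consecutive blocks land on different volumes, so a sequential read can keep all the drives in a stripe busy at once.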

13 Software components (cont)
- Physical Volume Library
  - map physical volume to cartridge, cartridge to PVR
- Physical Volume Repository
  - control cartridge mount/dismount functions
  - modules for Ampex D2, STK 4480/90 & SD-3, IBM 3480 & 3590 robotic libraries
- Repack server
  - deletions leave gaps on sequential media
    - read live data, rewrite on new sequential volume, free up previous volume
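The repack step (read live data, rewrite it on a new sequential volume, free the previous one) can be sketched as a simple compaction pass. The volume representation, a list of (segment_id, length, live) tuples, and the function name `repack` are assumptions for illustration only.

```python
def repack(old_volume):
    """Copy live segments of a sequential volume onto a fresh volume.

    old_volume is a list of (segment_id, length, live) tuples in tape order.
    Returns (new_volume, new_offsets, reclaimed_bytes); after the copy the
    old volume can be freed as a whole.
    """
    new_volume = []
    new_offsets = {}
    position = 0
    reclaimed = 0
    for segment_id, length, live in old_volume:
        if not live:
            # Deleted data is skipped, which closes the gap it left behind.
            reclaimed += length
            continue
        new_volume.append((segment_id, length, True))
        new_offsets[segment_id] = position
        position += length
    return new_volume, new_offsets, reclaimed
```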

14 Software components (cont)
- Storage system management
  - GUI to monitor/control HPSS
  - stop/start software servers
  - monitor events and alarms, manual mounts
  - vary devices on/offline

15 Parallel transfer protocol - goals
- Provide parallel data exchange between heterogeneous systems and devices
- Support different combinations of parallel and sequential source/sink
- Support gather/scatter and random access
  - combinations of stripe width, both regular and irregular data block size
- Scalable I/O bandwidth
- Transport independent (TCP/IP, HiPPI, FCS, ATM)

16 Gather/scatter lists
[Figure: sources S1, S2, S3 and destinations D1, D2, D3 connected through a logical window; block patterns shown: AB, ABABAB, xyzxyzxyz]
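As a rough illustration of the gather/scatter idea, the sketch below gathers fixed-size blocks from several sources into one logical byte sequence and then scatters that sequence over a different number of sinks with a different block size. The round-robin ordering and the names `gather` and `scatter` are assumptions; the slide's figure does not specify the actual list format used by HPSS.

```python
def gather(sources, block_size):
    """Interleave block_size-byte blocks from each source, round-robin,
    into one logical byte string (the "logical window")."""
    out = bytearray()
    pos = 0
    while any(pos < len(s) for s in sources):
        for s in sources:
            out += s[pos:pos + block_size]
        pos += block_size
    return bytes(out)


def scatter(data, n_sinks, block_size):
    """Deal block_size-byte blocks of data round-robin to n_sinks sinks."""
    sinks = [bytearray() for _ in range(n_sinks)]
    for i in range(0, len(data), block_size):
        sinks[(i // block_size) % n_sinks] += data[i:i + block_size]
    return [bytes(s) for s in sinks]


if __name__ == "__main__":
    window = gather([b"AAAA", b"BBBB"], block_size=2)
    print(window)
    print(scatter(window, n_sinks=3, block_size=1))
```

Because the source and sink sides only agree on the logical window, the number of sources, the number of sinks, and their block sizes can all differ.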

17 Parallel transport architecture
[Figure: sources S1..Sn and destinations D1..Dn joined by parallel data flow; labels: control connections, client control connections]

18 Parallel FTP transfer (pget)
[Figure: numbered steps 1 to 6 among Parallel FTP client, Parallel FTPd, Name server, Bitfile server, Storage server, Mover, and Client mover]

19 Summary
- High performance
  - up to 1 GB/s aggregate transfer rates
- Scalable storage
  - parallel architecture
  - terabyte-sized files
  - petabytes in archive
- Robust
  - transaction processing manager
- Portable
  - IBM, Sun implementations available

20 Conclusion
- Feasibility has been demonstrated for large, scalable storage
- Software exists, is shipping, and is actively used in the national labs on a daily basis
- Distributed architecture and parallel capabilities mesh well with grid computing
