Published by Irene Stevens. Modified over 9 years ago.
2
What is it?
HPSS (High Performance Storage System): hierarchical storage software developed since 1992 in collaboration with five US Department of Energy labs
Manages hundreds of billions of files spanning hundreds of petabytes for the HPC community
Licensed and supported by IBM
3
Why?
Reduced cost
Scalability
Lower power usage
Reliability
Speed
Long-term storage
4
How?
Distributed cluster architecture
Metadata engine: IBM DB2
Multiple storage classes
Striped disks and tapes
5
Who uses it?
NCSA Blue Waters
Argonne National Laboratory
Indiana University
6
Disk and Tape
Hierarchical storage management (HSM)
  Frequently used data is cached on disk
  Archival data is kept on tape
  Automatic migration (mirror offsite)
Scalable: any instance of HPSS can access many tapes at the same time to provide parallel transfer rates
Tape
  Pros: lower cost, no power usage, reliable
  Cons: high latency
Disk
  Pros: low latency
  Cons: power usage, reliability, higher cost
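The disk-cache-over-tape idea above can be illustrated with a toy model. This is only a sketch: the class name TwoTierStore and its methods are hypothetical, both tiers are plain dictionaries, and none of it reflects HPSS's actual interfaces or policies.

```python
class TwoTierStore:
    """Toy model of a disk cache in front of a tape archive (not HPSS code)."""

    def __init__(self):
        self.disk = {}  # file name -> bytes, the low-latency tier
        self.tape = {}  # file name -> bytes, the archival tier

    def write(self, name, data):
        # New data lands on disk first.
        self.disk[name] = data

    def migrate(self):
        # Copy every disk-resident file down to tape; disk copies remain.
        for name, data in self.disk.items():
            self.tape[name] = data

    def purge(self):
        # Free disk space, but only for files that already have a tape copy.
        safe = [n for n in self.disk if self.tape.get(n) == self.disk[n]]
        for name in safe:
            del self.disk[name]

    def read(self, name):
        # A disk hit is served directly; a miss stages the file back from tape.
        if name not in self.disk:
            self.disk[name] = self.tape[name]  # KeyError if the file is unknown
        return self.disk[name]
```

Purging only files that already have a tape copy is the property that makes it safe to treat disk as a cache: a read after a purge costs tape latency but never loses data.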
7
Standard POSIX interface
Users can access files using several methods:
  FTP – standard FTP from a mover
  PFTP – parallel transfer of data from multiple movers
  Client API
  HSI – put/get files to and from HPSS
  HTAR – archive multiple files together and transfer the archive to HPSS
  VFS Client
  XFS
8
Components
Core Server
  Translates human-readable names into HPSS object identifiers
  Translates virtual volumes into physical volumes
  Allows parallel I/O to the resources
  Schedules mounting/dismounting of media
Migration/Purge Server
  Manages migration and purge policies
  Disk Migration/Purge
    Once files have been moved down the hierarchy they are purged from disk
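To make the virtual-to-physical volume translation concrete, here is a minimal sketch of round-robin striping. It assumes fixed-size blocks dealt out in turn across the physical volumes of a stripe; the function name locate and its parameters are hypothetical, and the real Core Server mapping is not described in these slides.

```python
def locate(offset, stripe_width, block_size):
    """Map a byte offset in a virtual volume to (physical volume index, offset).

    Assumes simple round-robin striping with fixed-size blocks.
    """
    block = offset // block_size   # which stripe block holds this byte
    within = offset % block_size   # position inside that block
    pv_index = block % stripe_width
    pv_offset = (block // stripe_width) * block_size + within
    return pv_index, pv_offset
```

Because consecutive blocks land on different physical volumes, a large sequential read can be served by several devices at once, which is what allows parallel I/O to the resources.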
9
Components
Migration/Purge Server (continued)
  Tape File Migration
    Makes additional copies for multi-site setups
  Tape Volume Migration
    Moves data between tapes to fill tapes optimally
Gatekeeper (GK)
  Account validation service
  Site authorization, etc.
Location Server (LS)
  Allows clients to determine which location they should contact
  Improves speed in multi-site setups
Physical Volume Library (PVL)
  Manages all HPSS physical volumes
  Mounting and dismounting (=> PVR)
  Atomic mounts for sets of cartridges for parallel access to data
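An atomic mount of a cartridge set means either every cartridge in the set gets a drive or none does, so a striped read never starts with only part of its tapes available. The sketch below shows that all-or-nothing check under simplified assumptions: drives are a plain list, and mount_set is a hypothetical helper, not a PVL call.

```python
def mount_set(cartridges, free_drives):
    """Mount all cartridges or none.

    Returns a {cartridge: drive} mapping, or None if there are too few drives.
    free_drives is modified only when the whole set can be mounted.
    """
    if len(cartridges) > len(free_drives):
        return None  # not enough drives: leave free_drives untouched
    assignment = {}
    for cartridge in cartridges:
        assignment[cartridge] = free_drives.pop()
    return assignment
```

Checking capacity before taking any drive avoids the situation where two striped requests each hold half of the drives they need and block each other.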
10
Components
Physical Volume Repository (PVR)
  Interface to request cartridge mounts and dismounts
  One-to-one with tape libraries
Mover Servers
  Handle the actual data transfers
  Communicate with the Core Server to determine source and destination
  Retry moves on failure
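The retry behaviour of a mover can be sketched as a bounded retry loop. This is an illustration only: transfer_with_retry and the move callback are hypothetical names, and the slides do not say how many attempts HPSS makes or which errors it retries.

```python
def transfer_with_retry(move, max_attempts=3):
    """Call move(attempt) until it succeeds or max_attempts is reached.

    move is any callable that performs one transfer attempt and raises
    IOError on failure; the last error is re-raised if every attempt fails.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return move(attempt)
        except IOError as err:
            last_error = err  # remember the failure and try again
    raise last_error
```

Bounding the number of attempts keeps a persistently failing drive or network path from stalling a transfer forever.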
11
Components
12
Scalability
Scales horizontally:
  Add more movers
  Add more tape drives
13
Blue Waters
“RAIT” software is being developed jointly by IBM and NCSA
  Adds 8+2 reliability to HPSS striping
40 GbE network
100,000 tape cartridges
38.5 TB per hour
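The figures on this slide can be put in perspective with a little arithmetic. The sketch assumes that "8+2" means 8 data stripes plus 2 parity stripes per stripe group and that TB is decimal (1 TB = 1000 GB); both are assumptions, not statements from the slides.

```python
DATA_STRIPES = 8    # assumed: data tapes per stripe group
PARITY_STRIPES = 2  # assumed: parity tapes per stripe group

total = DATA_STRIPES + PARITY_STRIPES
usable_fraction = DATA_STRIPES / total  # share of raw tape holding user data
raw_per_usable = total / DATA_STRIPES   # raw TB written per usable TB

TB_PER_HOUR = 38.5
gb_per_second = TB_PER_HOUR * 1000 / 3600  # decimal TB -> GB, hours -> seconds
```

Under these assumptions 80% of raw tape capacity holds user data (1.25 TB written per usable TB), a stripe group would tolerate the loss of any two tapes, and 38.5 TB per hour corresponds to roughly 10.7 GB/s of sustained throughput.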
14
Indiana University
Multi-site setup
Centralized archival storage for all campus clusters