
1 High Performance Storage System (HPSS)
Jason Hick, Mass Storage Group, jhick@lbl.gov
HEPiX, October 26-30, 2009

2 Agenda
–How HPSS Works
–Current Features
–Future Directions (to Extreme Scale)

3 HPSS as a Hierarchical Storage Manager
–The top of the pyramid is the Class of Service (COS)
–The pyramid is a single hierarchy; we have many of these
–Each level is a storage class; each storage class can be striped (disk & tape) and produce multiple copies (tape only)
–Migration copies files to lower levels
–Files can exist at all levels within a hierarchy
–We continually replace all hardware within a level for technology refresh
(Pyramid figure: levels span Fast Disk, High Capacity Disk, Local Disk or Tape, and Remote Disk or Tape, with capacity and latency as the axes.)
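
The pyramid above maps naturally onto a small data model. The sketch below is illustrative only: the class and field names are hypothetical, not the actual HPSS configuration schema, but it captures the idea that a Class of Service selects a hierarchy of storage classes, each level can be striped and (on tape) duplicated, and migration copies files down the levels.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical model of an HPSS Class of Service (COS) hierarchy.
# Field names are illustrative; this is not the real HPSS config schema.

@dataclass
class StorageClass:
    name: str          # e.g. "fast-disk", "high-capacity-disk", "tape"
    media: str         # "disk" or "tape"
    stripe_width: int  # disk and tape levels can be striped
    copies: int = 1    # multiple copies are supported on tape only

@dataclass
class Hierarchy:
    levels: List[StorageClass]  # ordered top (fastest) to bottom (largest)

    def migrate(self, file_id: str, from_level: int) -> None:
        """Copy a file to the next level down; the file can then exist
        at both levels of the hierarchy at once."""
        target = self.levels[from_level + 1]
        print(f"migrating {file_id} from {self.levels[from_level].name} "
              f"to {target.name} ({target.copies} cop(ies))")

# One COS sits at the top of each pyramid; a site runs many of these.
cos_scratch = Hierarchy(levels=[
    StorageClass("fast-disk", "disk", stripe_width=4),
    StorageClass("high-capacity-disk", "disk", stripe_width=2),
    StorageClass("tape", "tape", stripe_width=2, copies=2),
])
cos_scratch.migrate("file0001", from_level=0)
```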

4 An HPSS Transfer
(Diagram: Client Cluster connected by a LAN switch to the HPSS Core Server with metadata disks and to HPSS Movers with data disks and tape.)
1. Client issues READ to Core Server
2. Core Server accesses metadata on disk
3. Core Server commands Mover to stage file from tape to disk
4. Mover stages file from tape to disk
5. Core Server sends lock and ticket back to client
6. Mover reads data and sends to client over LAN
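
The six numbered steps amount to a small control/data protocol: metadata and control flow through the Core Server, while bulk data moves directly from a Mover to the client over the LAN. Below is a minimal sketch of that flow, using hypothetical class and method names rather than the real HPSS client API.

```python
# Minimal sketch of the read path described above. CoreServer, Mover and
# the method names are hypothetical stand-ins, not the HPSS client API.

class Mover:
    def stage_from_tape(self, file_id):          # step 4
        print(f"mover: staging {file_id} tape -> disk")

    def send_to_client(self, file_id, client):   # step 6
        print(f"mover: streaming {file_id} to {client} over the LAN")
        return b"...file bytes..."

class CoreServer:
    def __init__(self, mover):
        self.mover = mover
        self.metadata = {"file0001": {"on_disk": False}}

    def read(self, client, file_id):
        meta = self.metadata[file_id]              # step 2: metadata lookup
        if not meta["on_disk"]:
            self.mover.stage_from_tape(file_id)    # step 3: command the mover
            meta["on_disk"] = True
        return {"lock": "read-lock", "ticket": "xyz"}  # step 5: lock + ticket

core = CoreServer(Mover())
ticket = core.read("client-42", "file0001")        # step 1: client READ
data = core.mover.send_to_client("file0001", "client-42")
```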

5 HPSS Current Features (v7)
Single client transfer optimizations
–Globus gridFTP service
–Striping a single file across disk or tape drives
–Aggregation-capable clients (HTAR, PSI) (see the HTAR sketch below)
Manage tens of PBs effectively
–Dual copy on tape, delayed or real-time
–Technology insertion
–Recover data from another copy
–Aggregation on migration to tape
Data Management Possibilities
–User-defined attributes on files
File System Interfaces
–GPFS/HPSS Integration – IBM
–Lustre/HPSS Integration – CEA/Sun-CFS
–Virtual File System interface
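
As a concrete illustration of the aggregation-capable clients listed above, HTAR bundles many small files into one tar-format archive stored directly in HPSS, which keeps tape drives streaming instead of handling millions of tiny files. The invocation below is a sketch: the HPSS path is hypothetical and exact options can vary by site and HTAR version.

```python
import subprocess

# Sketch only: aggregate a directory of small files into a single HPSS
# archive member with HTAR (tar-like flags; options may vary by site).
subprocess.run(
    ["htar", "-cvf", "/archive/example/run42.tar", "run42/"],
    check=True,
)

# List the archive's table of contents later without recalling every member.
subprocess.run(["htar", "-tvf", "/archive/example/run42.tar"], check=True)
```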

6 HPSS Feature – gridFTP Transfers
Data Transfer Working Group
–Data transfer nodes at ORNL-LCF, ANL-LCF, and LBNL-NERSC with ESnet
–Optimize WAN transfers between global file systems and archives at the sites (see the transfer sketch below)
Dedicated WAN nodes are helping users
–Several 20 TB days between HPSS and the DTN global file system
–Several large data set/project movements between sites
Plans for
–SRM: BeStMan to aid in scheduling and persistent transfers between sites
–Increasing network capacity (ESnet) and transfer nodes as usage increases
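
For the WAN transfers these data transfer nodes handle, a gridFTP move between a global file system and an HPSS gridFTP endpoint can be driven with globus-url-copy. The sketch below uses made-up hostnames and paths, and the parallelism and TCP buffer values are examples rather than site recommendations.

```python
import subprocess

# Sketch of a gridFTP transfer from a global file system into HPSS via a
# gridFTP endpoint. Hostnames and paths are hypothetical; -p sets the
# number of parallel TCP streams and -tcp-bs the TCP buffer size in bytes.
subprocess.run(
    [
        "globus-url-copy", "-vb", "-p", "8", "-tcp-bs", "4194304",
        "file:///global/scratch/example/dataset.tar",
        "gsiftp://dtn.example.gov/hpss/archive/example/dataset.tar",
    ],
    check=True,
)
```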

7 HPSS Feature – Striping Transfers Across Disk/Tape
(Diagram: a single client I/O node in the Client Cluster talks over the LAN switch to the HPSS Core Server, metadata, HPSS Movers, data disks, and tape.)
Client network BW is the bottleneck.

8 HPSS Feature – Multi-node Transfers & Striping in HPSS
(Diagram: multiple client I/O nodes in the Client Cluster transfer in parallel over the LAN switch to multiple HPSS Movers, with the Core Server, metadata, data disks, and tape as before.)
Match client BW to HPSS mover BW.
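
The point of these two diagrams is bandwidth matching: with one client node, the client NIC caps the transfer no matter how many movers the file is striped across; with one client I/O node per stripe, aggregate client bandwidth can match aggregate mover bandwidth. The numbers below are purely illustrative.

```python
# Illustrative arithmetic only: why multi-node striped transfers help.
client_nic_gbps = 1.0    # bandwidth of a single client I/O node
mover_gbps = 1.0         # bandwidth of a single HPSS mover
movers = 8               # movers the file is striped across

# Single client node: capped by its one NIC regardless of stripe width.
single_node_bw = min(client_nic_gbps, movers * mover_gbps)            # 1.0 Gb/s

# One client I/O node per stripe: client BW matches mover BW.
io_nodes = 8
multi_node_bw = min(io_nodes * client_nic_gbps, movers * mover_gbps)  # 8.0 Gb/s

print(f"single node: {single_node_bw} Gb/s, multi-node: {multi_node_bw} Gb/s")
```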

9 HPSS Feature – Virtual File System
(Diagram: a UNIX/POSIX application on a Linux client goes through the POSIX file system interface and the HPSS VFS extensions & daemons to the HPSS Client API, then to the HPSS Core Server and Data Movers in the HPSS cluster (AIX or Linux); control and data paths are shown, with an optional SAN data path.)
–HPSS accessed using standard UNIX/POSIX semantics
–Run standard applications on HPSS such as IBM DB2, IBM TSM, NFSv4, and Samba
–VFS available for Linux
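
Because the VFS interface exposes HPSS with standard UNIX/POSIX semantics, unmodified applications can use ordinary file calls against it. The snippet below assumes a hypothetical mount point /hpss; everything else is plain POSIX-style I/O of the kind DB2, TSM, NFSv4, or Samba would issue.

```python
import os
import shutil

# Assumes HPSS is mounted through the VFS interface at /hpss (hypothetical
# mount point). All operations below are ordinary POSIX file calls.
src = "/hpss/projects/example/results.dat"

with open(src, "wb") as f:               # create a file in the archive
    f.write(b"detector run 42\n")

print(os.stat(src).st_size)              # stat it like any local file
shutil.copy(src, "/tmp/results.dat")     # read it back with normal reads
os.rename(src, src + ".archived")        # rename within the HPSS namespace
```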

10 HPSS Feature – User-defined Attributes
Goals:
–Provide an extensible set of APIs to insert/update/delete/select UDAs from the database
–Provide robust search capability
Storage based on DB2 pureXML
Possible uses:
–Checksum type with value
–Application-specific attributes
–Expiration/action date
–File version
–Lustre path
–Tar file TOCs
Planned uses:
–HSI: cksum, expiration date, trashcan, annotation, some application-specific attributes
–HTAR: creator code and expiration date
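
Since UDAs are stored as DB2 pureXML, each file's attributes can be carried as a small XML document that the insert/update/delete/select APIs manage and that searches can query. The element names below are invented for illustration; the real HPSS UDA schema and API calls are not shown.

```python
import xml.etree.ElementTree as ET

# Sketch of a per-file user-defined attribute (UDA) document of the kind
# that could be stored in DB2 pureXML. Element names are illustrative only.
uda = ET.Element("hpss_uda")
ET.SubElement(uda, "checksum", type="md5").text = "9e107d9d372bb6826bd81d3542a419d6"
ET.SubElement(uda, "expiration_date").text = "2010-10-26"
ET.SubElement(uda, "file_version").text = "3"
ET.SubElement(uda, "lustre_path").text = "/lustre/scratch/example/file0001"
ET.SubElement(uda, "annotation").text = "HEPiX 2009 demo file"

xml_doc = ET.tostring(uda, encoding="unicode")
print(xml_doc)  # an insert/update API would store this document; a
                # select/search API could query it with XPath or XQuery.
```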

11 Extreme Scale (2018-2020)
–Series of workshops conducted by users, applications, and organizations starting in 2007
–Proposed new program within DOE to realize computing at exascale levels
Challenges:
–Power: 20 MW - ?
–Cost (size of the system, number of racks): 3.6-300 PB of memory
–Storage: exabytes of data, millions of concurrent accesses, PB-scale dataset movement between sites
HPSS held an Extreme Scale workshop and identified the following challenges:
–Scalability
–Data Management
–System Management
–Hardware

12 HPSS v8.1
Multiple Metadata Servers
–Optimizes multiple client transfers
–Enables managing exabytes of data effectively
On-line Upgrades
–Ability to upgrade HPSS software while the system remains available to users
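
A common way to make several metadata servers scale is to partition the namespace deterministically, so each client operation is routed directly to the server that owns that part of the tree. The hash-partitioning sketch below illustrates only the general idea; it is not HPSS's actual v8.1 design.

```python
import hashlib

# Illustration only: deterministic partitioning of a namespace across
# several metadata servers, so metadata load scales with server count.
METADATA_SERVERS = ["core-md-0", "core-md-1", "core-md-2", "core-md-3"]

def server_for(path: str) -> str:
    # Hash the top-level fileset so an entire subtree maps to one server.
    fileset = "/".join(path.split("/")[:3])          # e.g. "/home/projA"
    digest = hashlib.sha1(fileset.encode()).digest()
    return METADATA_SERVERS[digest[0] % len(METADATA_SERVERS)]

for p in ["/home/projA/run1/file.dat", "/home/projB/cal.db"]:
    print(p, "->", server_for(p))
```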

13 HPSS post 8.1
–Advanced data management: collaboration with the data management community (SRMs, content managers, ...)
–Integration with 3rd-party tape monitoring applications: Crossroads, HiStor, Sun solutions?
–Metadata footprint reduction
–New client caching for faster pathname operations

14 Thank you. Questions?

