High Performance Storage System (HPSS)
Jason Hick, Mass Storage Group
HEPiX, October 26-30, 2009

Agenda
– How HPSS Works
– Current Features
– Future Directions (to Extreme Scale)

HPSS as a Hierarchical Storage Manager
– The top of the pyramid is the Class of Service (COS)
– Each pyramid is a single hierarchy; we have many of these
– Each level is a storage class; each storage class can be striped (disk and tape) and hold multiple copies (tape only)
– Migration copies files to lower levels
– Files can exist at all levels within a hierarchy
– Hardware within a level is continually replaced for technology refresh
[Figure: storage pyramid from low latency to high capacity: fast disk, high-capacity disk, local disk or tape, remote disk or tape]
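A minimal sketch of how a class of service maps onto an ordered hierarchy of storage classes with downward migration. This is illustrative Python, not HPSS code; the class, level, and file names are hypothetical.

    # Illustrative sketch of an HPSS-style Class of Service hierarchy.
    # Names and levels are hypothetical; real HPSS configures these in metadata.

    from dataclasses import dataclass, field

    @dataclass
    class StorageClass:
        name: str
        media: str          # "disk" or "tape"
        stripe_width: int   # striping is allowed on disk and tape
        copies: int         # multiple copies are supported on tape only

    @dataclass
    class ClassOfService:
        name: str
        hierarchy: list                                 # ordered fast -> high capacity
        files: dict = field(default_factory=dict)       # path -> set of levels holding it

        def put(self, path):
            # New files land at the top level of the hierarchy.
            self.files[path] = {0}

        def migrate(self, path):
            # Migration copies the file to the next lower level;
            # a file may exist at every level of its hierarchy.
            levels = self.files[path]
            nxt = max(levels) + 1
            if nxt < len(self.hierarchy):
                levels.add(nxt)

    cos = ClassOfService("default", [
        StorageClass("fast-disk", "disk", stripe_width=4, copies=1),
        StorageClass("high-capacity-disk", "disk", stripe_width=2, copies=1),
        StorageClass("local-tape", "tape", stripe_width=1, copies=2),
    ])
    cos.put("/home/user/run42.dat")
    cos.migrate("/home/user/run42.dat")   # now on fast disk and high-capacity disk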

An HPSS Transfer
1. Client issues a READ to the Core Server
2. Core Server accesses metadata on disk
3. Core Server commands a Mover to stage the file from tape to disk
4. Mover stages the file from tape to disk
5. Core Server sends a lock and ticket back to the client
6. Mover reads the data and sends it to the client over the LAN
[Figure: client cluster connected by a LAN switch to the HPSS Core Server (with metadata disk), HPSS Movers, data disks, and tape]
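The read path can be followed as a sequence of messages. The toy simulation below mirrors the six numbered steps; the classes are hypothetical stand-ins, not the HPSS client API.

    # Toy walk-through of the staged read above; CoreServer and Mover are
    # hypothetical stand-ins for HPSS components, not real HPSS interfaces.

    class Mover:
        def stage(self, path):
            print(f"staging {path} from tape to disk cache")            # step 4

        def send(self, ticket):
            print(f"streaming {ticket['path']} to client over the LAN")  # step 6

    class CoreServer:
        def __init__(self, metadata, mover):
            self.metadata, self.mover = metadata, mover

        def read(self, path):
            info = self.metadata[path]               # 2. look up file metadata on disk
            if info["location"] == "tape":
                self.mover.stage(path)               # 3. command the mover to stage
                info["location"] = "disk"            #    file now cached on disk
            return {"lock": True, "path": path}      # 5. lock and ticket back to client

    mover = Mover()
    core = CoreServer({"/archive/run42.dat": {"location": "tape"}}, mover)
    ticket = core.read("/archive/run42.dat")         # 1. client issues READ
    mover.send(ticket)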

HPSS Current Features (v7)
Single client transfer optimizations
– Globus gridFTP service
– Striping a single file across disk or tape drives
– Aggregation-capable clients (HTAR, PSI)
Manage 10s of PBs effectively
– Dual copy on tape, delayed or real-time
– Technology insertion
– Recover data from another copy
– Aggregation on migration to tape (see the aggregation sketch below)
Data management possibilities
– User-defined attributes on files
File system interfaces
– GPFS/HPSS integration (IBM)
– Lustre/HPSS integration (CEA/Sun-CFS)
– Virtual File System interface
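Aggregation, whether done by HTAR-style clients or on migration to tape, amounts to bundling many small files into one large object before it is written to tape. A generic sketch using Python's tarfile module as a stand-in for the real aggregate formats; the paths and sizes are made up.

    # Generic small-file aggregation sketch (tarfile stands in for HTAR/PSI,
    # which produce their own aggregate formats); paths are hypothetical.

    import tarfile
    from pathlib import Path

    out = Path("out")
    out.mkdir(exist_ok=True)
    for i in range(1, 4):                              # create a few stand-in small files
        (out / f"step{i:04d}.h5").write_bytes(b"\0" * 1024)

    with tarfile.open("run42_members.tar", "w") as agg:
        for f in sorted(out.glob("step*.h5")):
            agg.add(f)   # one large aggregate keeps the tape drive streaming

    # The single aggregate, rather than thousands of small files, is what
    # migration would then write to tape.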

HPSS Feature – gridFTP Transfers
Data Transfer Working Group
– Data transfer nodes at ORNL-LCF, ANL-LCF, and LBNL-NERSC, together with ESnet
– Optimize WAN transfers between global file systems and archives at the sites
Dedicated WAN nodes are helping users
– Several 20 TB days between HPSS and the DTN global file system
– Several large data set/project movements between sites
Plans
– SRM: BeStMan to aid in scheduling and persistent transfers between sites
– Increasing network capacity (ESnet) and transfer nodes as usage increases
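For WAN transfers between data transfer nodes, gridFTP clients such as globus-url-copy support parallel streams. A hedged wrapper sketch: the endpoints, paths, and stream count are made up, and the exact flags should be checked against the installed client version.

    # Hedged sketch of driving a gridFTP transfer from Python; endpoints and
    # parallelism are illustrative, and flags should be verified against the
    # installed globus-url-copy version.

    import subprocess

    src = "gsiftp://dtn01.nersc.gov/global/scratch/run42.tar"
    dst = "gsiftp://dtn01.ornl.gov/lustre/scratch/run42.tar"

    subprocess.run(
        ["globus-url-copy", "-vb", "-p", "8", src, dst],  # -p: parallel TCP streams
        check=True,
    )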

HPSS Feature – Striping Transfers Across Disk/Tape
[Figure: a single client I/O node transferring through the LAN switch to the HPSS Movers, Core Server, metadata, data disks, and tape; the client's network bandwidth is the bottleneck]

HPSS Feature – Multi-node Transfers and Striping in HPSS
[Figure: multiple client I/O nodes transferring in parallel through the LAN switch to multiple HPSS Movers, data disks, and tape; aggregate client bandwidth is matched to HPSS mover bandwidth]
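The point of the multi-node picture is matching aggregate client bandwidth to aggregate mover bandwidth. A back-of-the-envelope calculation with made-up per-node rates:

    # Back-of-the-envelope stripe sizing; all rates are made-up examples,
    # not measured HPSS or client numbers.

    client_node_bw_mb_s = 400      # per client I/O node
    mover_bw_mb_s = 500            # per HPSS mover / disk stripe
    target_bw_mb_s = 2000          # what the application wants end to end

    client_nodes = -(-target_bw_mb_s // client_node_bw_mb_s)   # ceiling division -> 5
    movers = -(-target_bw_mb_s // mover_bw_mb_s)                # ceiling division -> 4

    print(f"use {client_nodes} client I/O nodes striped over {movers} movers")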

HPSS Feature – Virtual File System
– HPSS is accessed using standard UNIX/POSIX semantics
– Run standard applications on HPSS such as IBM DB2, IBM TSM, NFSv4, and Samba
– VFS is available for Linux
[Figure: a Linux client runs a UNIX/POSIX application on top of the POSIX file system interface, the HPSS VFS extensions and daemons, and the HPSS Client API; control flows to the HPSS Core Server (AIX or Linux) and data to the HPSS Data Movers, with an optional SAN data path]
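With the VFS interface, HPSS looks like any other mounted file system, so ordinary POSIX I/O works unchanged. A sketch assuming a hypothetical mount point /hpss:

    # Ordinary POSIX I/O against an HPSS VFS mount; /hpss is a hypothetical
    # mount point, and a read may imply a tape stage on the HPSS side.

    import shutil

    with open("/hpss/archive/run42.dat", "rb") as src, \
         open("/scratch/run42.dat", "wb") as dst:
        shutil.copyfileobj(src, dst, length=8 * 1024 * 1024)  # 8 MiB buffers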

HPSS Feature – User-defined Attributes
Goals
– Provide an extensible set of APIs to insert/update/delete/select UDAs in the database
– Provide robust search capability
Storage is based on DB2 pureXML
Possible uses
– Checksum type with value
– Application-specific attributes
– Expiration/action date
– File version
– Lustre path
– Tar file TOCs
Planned uses
– HSI: cksum, expiration date, trashcan, annotation, some application-specific attributes
– HTAR: creator code and expiration date
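UDAs are stored as XML in DB2 pureXML. A sketch of the kind of attribute document a client might attach to a file; the element names are hypothetical and are not the actual HPSS UDA schema.

    # Illustrative UDA fragment; element names are hypothetical and do not
    # reflect the actual HPSS UDA schema stored in DB2 pureXML.

    import xml.etree.ElementTree as ET

    uda = ET.Element("hpss-uda")
    ET.SubElement(uda, "checksum", type="md5").text = "9e107d9d372bb6826bd81d3542a419d6"
    ET.SubElement(uda, "expiration-date").text = "2010-12-31"
    ET.SubElement(uda, "file-version").text = "3"

    print(ET.tostring(uda, encoding="unicode"))
    # A search API could then select, say, all files carrying an md5 checksum.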

Extreme Scale
Series of workshops conducted by users, applications, and organizations starting in 2007
Proposed new program within DOE to realize computing at exascale levels
Challenges
– Power: 20 MW to ?
– Cost (size of the system, number of racks)
– Memory: PBs of memory
– Storage: exabytes of data, millions of concurrent accesses, PBs of dataset movement between sites
HPSS held an Extreme Scale workshop and identified the following challenges:
– Scalability
– Data Management
– System Management
– Hardware
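To see why PB-scale dataset movement between sites is on the challenge list, a quick arithmetic check with illustrative numbers:

    # Transfer-time arithmetic for inter-site dataset movement; the dataset
    # size, link rate, and efficiency are illustrative, not measurements.

    dataset_pb = 1.0
    link_gbit_s = 10.0
    efficiency = 0.8                      # rough allowance for protocol overhead

    bytes_total = dataset_pb * 10**15
    bits_per_s = link_gbit_s * 10**9 * efficiency
    days = bytes_total * 8 / bits_per_s / 86400
    print(f"{dataset_pb} PB over {link_gbit_s} Gb/s ~ {days:.1f} days")  # ~11.6 days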

HPSS v8.1
Multiple metadata servers
– Optimizes multiple client transfers
– Enables managing exabytes of data effectively
On-line upgrades
– Ability to upgrade HPSS software while the system remains available to users

HPSS post-8.1
Advanced data management
– Collaboration with the data management community (SRMs, content managers, ...)
Integration with third-party tape monitoring applications
– Crossroads, HiStor, Sun solutions?
Metadata footprint reduction
New client caching for faster pathname operations

Thank you. Questions?