OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY
HPSS Features and Futures
Presentation to SCICOMP4
Randy Burris, ORNL's Storage Systems Manager

Table of Contents
 Background – design goals and descriptions
   General information
   Architecture
   How it works
   Infrastructure
 HPSS 4.3 – current release (as of Sept. 1)
 HPSS 4.5
 HPSS 5.1
   Background
   Main features

HPSS is…
 File-based storage system – software only.
 Extremely scalable, targeting:
   Millions of files;
   Multiple-petabyte capacity;
   Gigabyte/second transfer rates;
   Single files ranging to terabyte size.
 Distributed:
   Multiple nodes;
   Multiple instances of most servers.
 Winner of an R&D 100 award (1997).

HPSS is…
 Developed by LLNL, Sandia, LANL, ORNL, NERSC, IBM
 Used in >40 very large installations:
   ASCI (Livermore, Sandia, Los Alamos Labs)
   High-energy physics sites (SLAC, Brookhaven, other US sites and sites in Europe and Japan)
   NASA
   Universities
 Examples at ORNL:
   As an archiving system: ARM
   As a backup system: backups of servers, O2000
   As an active repository: climate, bioinformatics, …

Example of the type of configuration HPSS is designed to support
[Diagram: HPSS server(s) and secondary server(s) control a parallel RAID disk farm, a parallel tape farm and local devices; a HIPPI/GigE/ATM network connects them to workstation clusters, parallel systems, sequential systems, visualization engines and frame buffers; clients reach HPSS via HSI, NFS, FTP and DFS over LANs, WANs and the Internet. Throughput scalable to the GB/s region.]

HPSS Software Architecture Diagram
[Diagram: a common infrastructure layer (communications, security, transaction manager, metadata manager, logging, 64-bit math libraries) underlies the HPSS servers (bitfile servers, storage servers, name servers, location servers, migration/purge, repack, movers, physical volume library, physical volume repositories) and data-management daemons (HSI, FTP & PFTP, NFS, DFS), with clients using the Client API and PFS; storage system management spans all components, plus NSL UniTree migration, installation and other modules. Green components are defined in the IEEE Mass Storage Reference Model.]

How's it work?
 The user stores a file using hsi, ftp, parallel ftp or nfs.
 The file is sent to a particular Class of Service (COS) depending upon user selection or defaults.
 The default COS specifies a hierarchy with disk at the top level and tape below it.
 So the file is first stored on disk (the HPSS cache).
 When enough time elapses or the cache gets full enough, the file is automatically copied to the next level - tape - and purged from disk (sketched below).
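A minimal sketch of the cache behavior just described, with made-up thresholds and helper names (this is illustrative Python, not HPSS code): files land on disk, a migration policy copies aged files to tape, and a purge policy frees disk space once the cache passes a high-water mark.

```python
import time

# Illustrative model of an HPSS-style disk cache over tape.
# All thresholds and helper names here are hypothetical.
MIGRATE_AFTER_SECONDS = 3600   # copy a file to tape once it is an hour old
PURGE_START = 0.90             # begin purging at 90% cache usage
PURGE_STOP = 0.70              # stop purging at 70% cache usage

class CachedFile:
    def __init__(self, name, size):
        self.name, self.size = name, size
        self.stored_at = time.time()
        self.on_tape = False   # set once migration has copied it to tape

def run_policies(cache, capacity, copy_to_tape, remove_from_disk):
    """One pass of the migration and purge policies over the disk cache."""
    now = time.time()
    # Migration: copy aged files down the hierarchy (disk -> tape).
    for f in cache:
        if not f.on_tape and now - f.stored_at >= MIGRATE_AFTER_SECONDS:
            copy_to_tape(f)            # the file now exists at both levels
            f.on_tape = True
    # Purge: remove migrated files, oldest first, until below the stop mark.
    used = sum(f.size for f in cache)
    if used / capacity >= PURGE_START:
        for f in sorted(cache, key=lambda f: f.stored_at):
            if f.on_tape:              # never purge the only copy
                remove_from_disk(f)
                cache.remove(f)
                used -= f.size
                if used / capacity <= PURGE_STOP:
                    break
```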

HPSS Infrastructure
 HPSS depends upon (i.e., is layered over):
   The operating system (AIX or Solaris for core servers)
   Distributed Computing Environment (DCE):
    Security – authentication and authorization
    Name service
    Remote Procedure Calls
   Encina Structured File System – a flat-file system used to store metadata such as file names, segment locations, etc. Encina is built upon DCE.
   GUI – the Sammi product from Kinesix
   Distributed File System (DFS) – for some installations. DFS is built upon DCE.

HPSS 4.3 (newest released version)
 Support for new hardware:
   StorageTek 9940 tape drives
   IBM Linear Tape Open (LTO) tape drives and robots
   Sony GY-8240 tape drives
 Redundant Arrays of Independent Tapes (RAIT):
   An ASCI PathForward project contracted with StorageTek
   Target is multiple tape drives striped with parity, as sketched below
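To make the parity-striping target concrete, here is a small sketch of the general RAID-style idea behind RAIT: a block is split across several drives plus one XOR parity stripe, and the parity lets any single lost stripe be rebuilt. This illustrates the concept only, not StorageTek's actual design.

```python
from functools import reduce

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def stripe_with_parity(block: bytes, n_data_drives: int):
    """Split a block into n equal data stripes plus one XOR parity stripe."""
    stripe_len = -(-len(block) // n_data_drives)              # ceiling division
    padded = block.ljust(stripe_len * n_data_drives, b"\x00") # pad last stripe
    stripes = [padded[i * stripe_len:(i + 1) * stripe_len]
               for i in range(n_data_drives)]
    return stripes, reduce(xor_bytes, stripes)

def rebuild_lost_stripe(surviving, parity):
    """XORing the parity with all surviving stripes recovers the lost one."""
    return reduce(xor_bytes, surviving, parity)

# Stripe a block over 4 tape drives, "lose" drive 2, rebuild its stripe.
stripes, parity = stripe_with_parity(b"data destined for striped tape", 4)
assert rebuild_lost_stripe(stripes[:2] + stripes[3:], parity) == stripes[2]
```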

HPSS 4.3 (continued)
 Mass configuration:
   Earlier, each device or server had to be individually configured through the GUI
   That could be tedious and error-prone for installations with hundreds of drives or servers
   Mass configuration takes advantage of the command-line interface (new with HPSS 4.2)
   Allows scripted configuration of devices and various types of servers, along the lines sketched below
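The gain from scripting is that one loop replaces hundreds of GUI dialogs. A hedged sketch of the idea, in which the command name and options are placeholders rather than real HPSS CLI syntax:

```python
# Illustrative only: emit a configuration script for a large tape farm.
# "hpssadm_cmd" and its options are hypothetical, not actual HPSS CLI syntax.
def device_config_commands(n_drives, drive_type="stk_9940"):
    for i in range(n_drives):
        yield (f"hpssadm_cmd device create"
               f" --name tape{i:03d} --type {drive_type}"
               f" --mover mover{i % 8}")

# One scripted pass configures all 128 drives.
with open("configure_drives.sh", "w") as script:
    script.write("\n".join(device_config_commands(128)) + "\n")
```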

HPSS 4.3 (continued)
 Support for IBM High Availability configurations:
   HACMP (High Availability Cluster MultiProcessor) hardware feature
   HACMP-supporting AIX software
   Handles node and network interface failures
   Essentially a controlled failover to a spare node
   Initiated manually

HPSS 4.3 (continued)
 Other features:
   Support for Solaris 8
   Client API ported to Red Hat Linux
   Support for NFS v3
 By the way:
   In our Probe testbed, we're running HPSS 4.3 on AIX 5L on our S80
   Not certified; just trying it to see what happens

HPSS 4.5 – target date 7/1/2002
 Features:
   Implements an efficient, transparent interface for users to access their HPSS data
   Uses HPSS as an archive
   Freely available for Linux (no licensing fee)
 Key requirements:
   Support HPSS access via XFS using DMAPI
   XFS/HPSS filesystems shall be accessible via NFS for transparent access
   Support archived filesets (rename/delete)
   Support on Linux

HPSS 4.5 (continued)
 Provide migration and purge from XFS based on policy
 Stage data from HPSS when the data has been purged from XFS (see the sketch after this list)
 Support whole and partial file migration
 Support utilities for the following:
   Create/delete XFS fileset metadata in HPSS
   List HPSS filenames in an archived fileset
   List XFS names of files
   Compare archive dumps from HPSS and XFS
   Delete all files from the HPSS side of an XFS fileset
   Delete files older than a specified age from the HPSS side
   Recover files deleted from XFS filesets but not yet deleted from HPSS
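The migrate/purge/stage cycle behind these requirements can be pictured abstractly. HPSS 4.5 does this through the XFS DMAPI, which lets an HSM intercept reads of purged files; the fragment below only mimics that control flow with hypothetical helper names and is not DMAPI code.

```python
# Hypothetical control flow for an XFS-fronted HPSS archive (not DMAPI code).
class ArchivedFile:
    def __init__(self, xfs_path, hpss_path):
        self.xfs_path, self.hpss_path = xfs_path, hpss_path
        self.resident = True    # data currently on XFS disk?
        self.migrated = False   # copy exists in HPSS?

def migrate(f, hpss_put):
    """Policy-driven migration: copy XFS data into the HPSS archive."""
    if not f.migrated:
        hpss_put(f.xfs_path, f.hpss_path)
        f.migrated = True

def purge(f, free_blocks):
    """Free XFS space, but only after the HPSS copy exists."""
    if f.migrated and f.resident:
        free_blocks(f.xfs_path)   # leave a stub; data now lives only in HPSS
        f.resident = False

def read(f, hpss_get, read_local):
    """Transparent access: stage the data back if it was purged."""
    if not f.resident:
        hpss_get(f.hpss_path, f.xfs_path)
        f.resident = True
    return read_local(f.xfs_path)
```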

HPSS 5.1 – release date Jan.
 Background:
   HPSS was designed in 1992/1993 as a total rewrite of NSL UniTree.
   Goal – achieve speed using many parallel servers.
   The Distributed Computing Environment (DCE) was a prominent and promising infrastructure product.
   Encina's Structured File System (SFS) was the only product supporting distributed nested transactions.
   The management GUI was mandated to be Sammi, from Kinesix, because of anticipated reuse of NSL UniTree screens.

HPSS 5.1 Background (continued)
 Today:
   DCE – future in doubt
   Encina's Structured File System:
    Future in doubt
    Performance problems
    Nested transactions no longer needed
    Nor are distributed transactions
   Sammi relatively expensive and feature-poor

HPSS 5.1 Features
 New basic structure:
   DCE still used – still no alternative
   Designing a "core" server combining the name server, the bitfile server, the storage server and parts of the Client API
   Replacing SFS with a commercial DBMS – DB2 – but the design and coding goal is easy replacement of the DBMS (see the sketch below)
   Expect considerable speed improvement:
    Oracle and DB2 were both ~10 times faster than SFS in a model run in ORNL's Probe testbed
    Reduced communication between servers
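The "easy replacement of the DBMS" goal amounts to hiding the database behind a narrow metadata interface, so that swapping databases means writing one new adapter. A minimal sketch of that design choice, with hypothetical names (this is not the actual HPSS metadata manager):

```python
import json
from abc import ABC, abstractmethod

class MetadataStore(ABC):
    """The only surface the HPSS servers would see; a DBMS plugs in behind it."""
    @abstractmethod
    def insert_bitfile(self, bitfile_id: str, attrs: dict): ...
    @abstractmethod
    def lookup_bitfile(self, bitfile_id: str) -> dict: ...

class Db2Store(MetadataStore):
    """One concrete adapter; conn is a DB-API style connection."""
    def __init__(self, conn):
        self.conn = conn
    def insert_bitfile(self, bitfile_id, attrs):
        self.conn.execute("INSERT INTO bitfile (id, attrs) VALUES (?, ?)",
                          (bitfile_id, json.dumps(attrs)))
    def lookup_bitfile(self, bitfile_id):
        row = self.conn.execute("SELECT attrs FROM bitfile WHERE id = ?",
                                (bitfile_id,)).fetchone()
        return json.loads(row[0]) if row else {}

# Swapping DBMSs means writing another MetadataStore subclass; the core,
# bitfile, name and storage servers are untouched.
```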

HPSS Software Architecture Diagram (repeated)
[Verbatim repeat of the architecture diagram slide shown earlier.]

New Java Admin Interface
 User benefits:
   Fast
   Immediately portable to Unix, Windows, Macintosh
   Picks up various manageability improvements
 Developer benefits:
   Object-oriented
   Much code sharing (sketched below):
    Central communication and processing engine
    Different presentation engines:
     GUI
     ASCII for the command-line interface
     A third one, a Web interface, would be easy to add later
   Overall maintenance much easier – code generated from HPSS C structures
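The code-sharing point is the classic one-engine, many-front-ends split. A sketch of the shape (in Python for brevity, though the real interface is Java; all names are hypothetical):

```python
class AdminEngine:
    """Central communication and processing engine shared by all front ends."""
    def server_status(self, name: str) -> dict:
        return {"server": name, "state": "UP"}   # placeholder for a real query

class AsciiPresentation:
    """Command-line front end: same engine, text rendering."""
    def __init__(self, engine: AdminEngine):
        self.engine = engine
    def show_status(self, name):
        s = self.engine.server_status(name)
        print(f"{s['server']}: {s['state']}")

class GuiPresentation:
    """GUI front end renders the same data in widgets; a Web front end
    would be just one more presentation class over the same engine."""
    def __init__(self, engine: AdminEngine):
        self.engine = engine
    def show_status(self, name):
        ...  # populate a status panel instead of printing

AsciiPresentation(AdminEngine()).show_status("Bitfile Server")
```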

Future futures
 These topics are under discussion; no guarantees.
 In each case, a gating factor is the availability of staff to do the development.
 Modification of HPSS's parallel ftp to comply with the GridFTP specs. Interest from ASCI, Argonne and others.
 GPFS/HPSS interface:
   Participants – LLNL, LBNL, Indiana University and IBM
   Seeking further help
 SAN exploitation – a gleam in the eye right now

Questions?
 HPSS home page
 HPSS tutorial for Comp. Sci. and Math Div.