The NERSC Global File System (NGF)
Jason Hick, Storage Systems Group Lead
CAS2K11, September 11-14, 2011

The Production Facility for the DOE Office of Science

Operated by the University of California for the US Department of Energy.

NERSC serves a large population:
– Approximately 4,000 users, 400 projects, 500 codes
– Focus on "unique" resources:
  – High-end computing systems
  – High-end storage systems: a large shared GPFS (a.k.a. NGF) and a large archive (a.k.a. HPSS)
  – Interface to high-speed networking: ESnet, soon to be 100Gb (a.k.a. the Advanced Networking Initiative)

Our mission is to accelerate the pace of discovery by providing high performance computing, data, and communication services to the DOE Office of Science community.

2011 storage allocation by area of science: Climate, Applied Math, Astrophysics, and Nuclear Physics account for 75% of the total.

Servicing Data Needs

Storing data:
– We provide efficient center-wide storage solutions. NGF minimizes multiple online copies of data and reduces extraneous data movement between systems. HPSS enables exponential user data growth without exponential impact to the facility.

Analyzing data:
– Large-memory systems, Hadoop/Eucalyptus file systems.

Sharing data:
– Science Gateway Nodes make data in /project available via your browser (authenticated and unauthenticated options).
– Goals of Science Gateways:
  – Expose scientific data on NGF file systems and the HPSS archive to larger communities
  – Work with large data remotely
  – Outreach: broadens the scientific impact of computational science ("as easy as online banking")
– NEWT, the NERSC Web Toolkit/API:
  – Building blocks for science on the web; write a Science Gateway with HTML + JavaScript (a sketch of the underlying REST calls follows below)
  – Support for authentication, job submission, file access, accounting, viewing queues, and user data tables
– 30+ projects use the NGF-to-web gateway.

Protecting data:
– Frequent (near-daily) incremental backups of file system data. Request information on PIBS from Matt Andrews.
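The NEWT bullet above describes a REST layer that gateway pages call from HTML + JavaScript. As a minimal sketch of that interaction, the Python below authenticates, reads a file from /project, and polls a batch queue. The endpoint paths, field names, and the machine name "hopper" are assumptions based on NEWT's REST style, not taken verbatim from the slides.

```python
"""Hedged sketch of a NEWT-style science-gateway interaction (paths assumed)."""
import requests

NEWT = "https://newt.nersc.gov/newt"  # assumed service base URL

# Authenticate and keep the session cookie the service returns (assumed flow).
session = requests.Session()
session.post(f"{NEWT}/auth", data={"username": "jdoe", "password": "..."})

# Read a file that lives in /project so a browser-based gateway could render it.
resp = session.get(f"{NEWT}/file/hopper/project/mygroup/results.csv",
                   params={"view": "read"})
print(resp.text[:200])

# Check the batch queue -- the kind of call a gateway page would poll.
print(session.get(f"{NEWT}/queue/hopper").json())
```

In a production gateway the same calls would be issued from JavaScript in the browser, which is what makes the "HTML + Javascript" building-block model on the slide work.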

Optimizing Inter-Site Data Movement

NERSC is a net importer of data.
– Since 2007, we have received more bytes than we transfer out to other sites.
– Leadership in inter-site data transfers (Data Transfer Working Group).
– Have been an archive site for several valuable scientific projects: SNO project (70TB), 20th Century Reanalysis project (140TB).

Partnering with ANL and ESnet to advance HPC network capabilities.
– Magellan: ANL and NERSC cloud research infrastructure.
– Advanced Networking Initiative with ESnet (100Gb Ethernet).

Data Transfer Working Group
– Coordination between stakeholders (ESnet, ORNL, ANL, LANL, and LBNL/NERSC).

Data Transfer Nodes
– Available to all NERSC users
– Optimized for WAN transfers and regularly tested
– Close to the NERSC border
– GlobusOnline endpoints
– hsi for HPSS-to-HPSS transfers (OLCF and NERSC)
– gridFTP for system-to-system or system-to-HPSS transfers
– bbcp for system-to-system transfers
(A sketch of how these tools map to transfer types follows below.)
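As an illustration of how the listed tools map onto transfer types, here is a hedged Python sketch that wraps hsi, globus-url-copy, and bbcp via subprocess. The data transfer node host name and the flag values are assumptions chosen for readability, not NERSC-published settings.

```python
"""Illustrative wrappers for the transfer tools named on the slide."""
import subprocess

DTN = "dtn01.nersc.gov"  # hypothetical data transfer node name


def archive_to_hpss(local_path: str, hpss_path: str) -> None:
    # hsi: system-to-HPSS (the slide also notes HPSS-to-HPSS between OLCF and NERSC).
    subprocess.run(["hsi", f"put {local_path} : {hpss_path}"], check=True)


def gridftp_copy(src_url: str, dst_url: str, streams: int = 4) -> None:
    # gridFTP: system-to-system or system-to-HPSS; -p sets parallel streams.
    subprocess.run(["globus-url-copy", "-vb", "-p", str(streams), src_url, dst_url],
                   check=True)


def bbcp_copy(local_path: str, remote_path: str) -> None:
    # bbcp: system-to-system; -s sets the number of TCP streams (value illustrative).
    subprocess.run(["bbcp", "-s", "16", local_path, f"{DTN}:{remote_path}"],
                   check=True)


if __name__ == "__main__":
    bbcp_copy("/global/scratch/jdoe/run42.tar", "/global/scratch/jdoe/run42.tar")
```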

Minimizing Intra-Site Data Movement

Mount file systems on as many clusters as possible.
– Work with the vendor to port the client to a wide variety of operating systems.
– Today this is easier because systems are primarily variants of Linux.

Architect file systems for different use cases.
– Write-optimized, large-file bandwidth, large quota (/global/scratch)
– Read-optimized, aggregate file bandwidth, long-term storage (/project)
– Small-file optimized, common environment (/global/homes, /global/common)

Deliver high bandwidth to each system.
– A large Fibre Channel SAN delivers bandwidth to each cluster using private-network storage devices.

The NERSC Global File Systems

/project is for sharing and long-term residence of data on all NERSC computational systems.
– 4% monthly growth, ~60% growth per year (see the compounding sketch after this list)
– Not purged; quota enforced (4TB default per project); projects under 5TB backed up daily
– Serves 200 projects, primarily over FC8, with 10Gb Ethernet as the alternative
– 1.5PB total capacity, increasing to ~4.4PB total in 2012
– ~5TB average daily I/O

/global/homes provides a common login environment for users across systems.
– 7% monthly growth, ~125% growth per year
– Not purged but archived; quota enforced (40GB per user); backed up daily
– Oversubscribed: needs ~150TB of capacity unless we increase quotas
– Serves 4,000 users (400 per day) over 10Gb Ethernet
– 50TB total capacity, increasing to ~100TB total in 2012
– 100s of GBs average daily I/O

/global/common provides a common installed software environment across systems.
– 5TB total capacity, increasing to ~10TB total in 2012
– Provides software packages common across platforms

/global/scratch provides high-bandwidth, high-capacity storage across systems.
– Purged; quota enforced (20TB per user); not backed up
– Serves 4,000 users, primarily over FC8, with 10Gb Ethernet as the alternative
– 1PB total capacity, looking to increase to ~2.4PB total in 2012
– Bandwidth is about 16GB/s
– Aided us in transitioning from Hopper phase 1 to phase 2, in reconfiguring local scratch file systems, and in inter-site data movement
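A quick compounding check of the quoted growth rates: a constant monthly rate r gives (1 + r)^12 - 1 per year, so 4% per month is roughly 60% per year and 7% per month is roughly 125% per year, matching the figures above.

```python
"""Verify that the monthly and annual growth figures on the slide are consistent."""


def annual_growth(monthly_rate: float) -> float:
    # Compound a constant monthly rate over 12 months.
    return (1 + monthly_rate) ** 12 - 1


print(f"/project:      4%/month -> {annual_growth(0.04):.0%}/year")   # ~60%
print(f"/global/homes: 7%/month -> {annual_growth(0.07):.0%}/year")   # ~125%
```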

NGF Project (architecture diagram)
/project: 1.5 PB at ~10 GB/s, increase planned FY12. GPFS servers, SGNs, DTNs, and pNSDs serve Euclid, Carver, Magellan, Dirac, PDSF (including PDSF planck), Franklin (via DVS over SeaStar), and Hopper (via DVS over Gemini). Connectivity: Ethernet (2x10Gb) to the GPFS servers, IB to the pNSDs, and Fibre Channel links of 20x2 FC4, 8x4 FC8, 12x2 FC8, 2x4 FC4, and 4x2 FC4 to the storage.

NGF Global Homes & Common (architecture diagram)
/global/homes (58 TB) and /global/common (5 TB), both with increases planned for FY12, are served by GPFS servers over Ethernet (4x1 10Gb) and FC to Euclid, Carver, Magellan, Dirac, Franklin, Hopper, the SGNs, the DTNs, and PDSF.

NGF Global Scratch (architecture diagram)
/global/scratch: 920 TB at ~16 GB/s, increase planned FY12. Serves Euclid, Carver, Magellan, Dirac, Franklin, Hopper (via DVS over Gemini), PDSF, and PDSF planck through the SGNs, DTNs, and pNSDs, over IB and Fibre Channel links of 2x4 FC4, 12x2 FC8, 8x4 FC8, and 4x2 FC4.

Lesson Learned: Avoid Direct-Attached FC Clients

Storage direct-attached via FC to clients:
– In principle a good idea; in practice it is not.
– Requires a very "broad" Fibre Channel network (200+ initiators).
– Leads to de-tuning the FC HBAs (queue depth of 1 or 2) to keep the storage arrays from panicking.
– Keeping client FC HBAs tuned correctly is difficult, and it only takes one to cause havoc.
– Changes in device configuration (arrays) are difficult to propagate to every client (some OSs allow dynamic changes, some don't).

The FC network is at its final destination:
– Cisco 9513s are oversubscribed with 24-port 8Gb linecards.
– This could be fixed with 32-port 8Gb linecards at full rate, but that is a large reinvestment.
– Cisco 16Gb linecards are only intended for ISLs.
– Having increasing trouble matching client-side (IB) speeds with back-end (storage) speeds.

(A sketch of the queue-depth de-tuning follows below.)
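As a rough illustration of the de-tuning workaround mentioned above, the sketch below caps the SCSI queue depth on FC-attached block devices via sysfs. The device selection and depth value are assumptions; a production setup would target only the array LUNs, typically through udev rules rather than a script.

```python
"""Sketch of capping HBA queue depth on SCSI block devices (illustrative only)."""
import glob

TARGET_DEPTH = 2  # the slide cites a queue depth of 1 or 2 to protect the arrays

for path in glob.glob("/sys/block/sd*/device/queue_depth"):
    with open(path, "w") as f:  # requires root
        f.write(str(TARGET_DEPTH))
    print(f"set {path} -> {TARGET_DEPTH}")
```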

NGF SAN (topology diagram)
Two switches ("N5" and "NGF") connect the /global/scratch, /project, and /global/homes storage to the GPFS servers, PDSF pNSDs, Hopper pNSDs, DTNs, Carver pNSDs, and Franklin DVSs. Link capacities shown include FC4 20x2 at 19 GB/s, FC8 12x2 at 23 GB/s, FC8 16 at 15 GB/s, FC4 2x2 at 2 GB/s, FC8 8x4 at 30 GB/s, FC4 3x8 + FC8 5x8 at 50 GB/s, FC4 4x2 at 4 GB/s, FC8 36x2 at 69 GB/s, and FC8 4x8 at 33 GB/s.

Lesson Learned: Multiple GPFS File Systems in One Owning Cluster

To keep one file system from affecting another:
– Run the file system managers on different NSD servers (see the sketch below).
– This seems to eliminate the ~2 minute delays observed in file system A's operations while certain commands (e.g., the policy manager, quota commands, ...) run against file system B.

Otherwise this works well and allows a limited amount of server hardware to maintain a multi-file-system GPFS deployment.
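A minimal sketch of that separation using the standard GPFS administration commands mmchmgr and mmlsmgr; the file system and NSD server names are placeholders, and the mapping itself is illustrative rather than NERSC's actual assignment.

```python
"""Pin each file system's manager to a different NSD server (names are placeholders)."""
import subprocess

# Hypothetical mapping: keep the /project and /global/scratch managers apart so a
# policy scan or quota command on one file system cannot stall the other.
MANAGER_MAP = {
    "project": "ngfserver01",
    "gscratch": "ngfserver02",
}

for fs, node in MANAGER_MAP.items():
    # mmchmgr <device> <node> moves the file system manager role to that node.
    subprocess.run(["mmchmgr", fs, node], check=True)

# Show where the managers ended up.
subprocess.run(["mmlsmgr"], check=True)
```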

Future Plans

Moving to IB-based client connections to the file systems:
– Connect storage directly via FC to the GPFS servers.
– Connect GPFS servers to each cluster's IB network; if a cluster doesn't have one, create a small IB network that scales with the cluster's proprietary network (SeaStar, Gemini, ...).
– Reduces the FC SAN, with the eventual goal of eliminating it.

Move from a pNSD deployment to an NSD deployment:
– Scalability of server and data network resources.

NIM accounting service and NGF information integration:
– Move from administrator-level information to user-level information.
– Move from administrative backups to user backups; the common value in backups is restoring unintentionally deleted files.
– Aid users in knowing their utilization, allocation, and quota information.

Using the 100Gb WAN for data:
– Multi-site GPFS? The 100Gb WAN makes this feasible, but new GPFS features are desired for it to work well.
– Inter-site data transfer (bulk data movement).

Thank you! Questions?