CENTER FOR HIGH PERFORMANCE COMPUTING Introduction to I/O in the HPC Environment Brian Haymore, Sam Liston, Center for High Performance Computing Spring 2013

CENTER FOR HIGH PERFORMANCE COMPUTING Overview
CHPCFS Topology

CENTER FOR HIGH PERFORMANCE COMPUTING Overview
Couplet Topology
–Redundancy
–Performance
–Used on: Redbutte, Drycreek, Greenriver, Saltflats, IBrix Scratch
–Vertically/horizontally scalable
[diagram: couplet pairs serving network clients over 10GigE/40GigE]

CENTER FOR HIGH PERFORMANCE COMPUTING Overview
SAN storage redundancy: RAID
–RAID = Redundant Array of Inexpensive Disks

CENTER FOR HIGH PERFORMANCE COMPUTING Overview
Types of storage available at CHPC (see the probe sketch after this list)
–Home Directory (e.g. /uufs/chpc.utah.edu/common/home/uNID)
  Backed up per department (except the CHPC_HPC file system)
  Intended for critical/volatile data
  Expected to maintain a high level of responsiveness
–Group Data Space (e.g. /uufs/chpc.utah.edu/common/home/pi_grp)
  Optional per-department archive
  Intended for active projects, persistent data, etc.
  Usage expectations set by the group
–Network Mounted Scratch (e.g. /scratch/serial)
  No expectation of data retention (it's scratch)
  Expected to maintain a high level of I/O performance under significant load
–Local Disk (e.g. /tmp)
  Most consistent I/O
  No expectation of data retention
  Unique per machine
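A quick way to get a feel for these tiers from a login node is to probe their free space. The sketch below is illustrative only, assuming a hypothetical uNID of u0123456; the paths are the ones named on this slide.

    # storage_probe.py - a minimal sketch (not a CHPC-provided tool): report
    # free space on each storage tier named above. "u0123456" is a placeholder.
    import shutil

    TIERS = [
        ("home",    "/uufs/chpc.utah.edu/common/home/u0123456"),
        ("scratch", "/scratch/serial"),
        ("local",   "/tmp"),
    ]

    for label, path in TIERS:
        try:
            usage = shutil.disk_usage(path)
            print(f"{label:8s} {usage.free / 2**30:10.1f} GiB free "
                  f"of {usage.total / 2**30:10.1f} GiB")
        except FileNotFoundError:
            print(f"{label:8s} not mounted on this machine")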

CENTER FOR HIGH PERFORMANCE COMPUTING Overview
Good Citizens
–Shared Environment Characteristics
  Many-to-one relationship (over-subscribed)
  Globally accessible
  Global resources are still finite
–Consider your usage impact when choosing a storage location
–Be aware of any usage policies
–Evaluate different I/O methodologies
–Seek additional assistance from CHPC

CENTER FOR HIGH PERFORMANCE COMPUTING Best Practices
Data Segregation: Where should files be stored?
–Classify your data
  How important is it? Can it be recreated?
  Is this dataset currently in use?
  Will this data be used by others?
  Does this data need to be backed up?
–Put your data in the appropriate space (see the staging sketch below)
  Home Directory
  Group Space
  Scratch
  /tmp
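To make the placement advice concrete, here is a minimal staging sketch, assuming a hypothetical uNID (u0123456), job directory, and file names. It keeps the heavy I/O on scratch and copies only the results back to the backed-up home space.

    # staging_sketch.py - a hedged sketch of "right data in the right space".
    # All names below are hypothetical examples, not CHPC-prescribed paths.
    import shutil
    from pathlib import Path

    HOME = Path("/uufs/chpc.utah.edu/common/home/u0123456")
    SCRATCH = Path("/scratch/serial/u0123456/job_42")

    def run_simulation(inp: Path, out: Path) -> None:
        # Stand-in for the real application: reads inp, writes out.
        out.write_bytes(inp.read_bytes())

    def stage_and_run() -> None:
        SCRATCH.mkdir(parents=True, exist_ok=True)
        # Stage input once; do all heavy I/O against scratch, not home.
        shutil.copy2(HOME / "input.dat", SCRATCH / "input.dat")
        run_simulation(SCRATCH / "input.dat", SCRATCH / "output.dat")
        # Copy back only what must survive; scratch has no retention guarantee.
        (HOME / "results").mkdir(exist_ok=True)
        shutil.copy2(SCRATCH / "output.dat", HOME / "results" / "output.dat")

    if __name__ == "__main__":
        stage_and_run()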

CENTER FOR HIGH PERFORMANCE COMPUTING Considerations
Backup Impact
–Performance Characteristics
  Time (when, duration)
  Competition/concurrent access
  Capacity of files backed up
  Quantity of files backed up
  Unintended consequences

CENTER FOR HIGH PERFORMANCE COMPUTING Considerations
Network Performance

CENTER FOR HIGH PERFORMANCE COMPUTING Considerations
Data Migration
–What are we moving; what does it look like?
–From where to where are we moving the data?
–What transfer performance do we expect?
–What tool will we use to make the transfer? (see the sketches below)
  SSH/SFTP: simple, very portable
  rsync: restartable, file verification
  tar via SSH: more efficient with many small files; optional compression; secure
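The sketch below shows typical invocations of the last two tools, driven from Python for consistency with the other examples. The host name and paths are hypothetical, and the flags shown are common choices rather than CHPC requirements.

    # transfer_sketch.py - hedged examples of the transfer styles named above.
    import subprocess

    SRC = "/scratch/serial/u0123456/run_42/"          # hypothetical source
    DEST = "host.example.edu:/home/u0123456/run_42/"  # hypothetical target

    # rsync: -a preserves metadata, --partial keeps partially transferred
    # files so an interrupted copy can restart, and rsync checksums each
    # file as it sends it.
    subprocess.run(["rsync", "-a", "--partial", SRC, DEST], check=True)

    # tar over ssh: one stream instead of per-file negotiation, which is why
    # it is more efficient for many small files; add "z" to the tar flags to
    # compress if the data is not already compressed.
    subprocess.run(
        "tar cf - -C /scratch/serial/u0123456 run_42"
        " | ssh host.example.edu 'tar xf - -C ~/migrated'",
        shell=True, check=True,
    )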

CENTER FOR HIGH PERFORMANCE COMPUTING File Operations
Directory Structure
–Poor performance when too many files are in the same directory
–Organizing files in a tree avoids this issue (see the fan-out sketch below)
–Directory block count significance
Network vs. Local
–IOPS vs. Bandwidth
–Network I/O
  Overhead
  Limited by the network pipe
  More efficient for bandwidth than for IOPS
–Local I/O
  Limited size
  Not globally accessible
  Depending on hardware, offers a fair balance between bandwidth and IOPS
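One common way to "organize files in a tree" is a hash-based fan-out, sketched below; the two-level scheme and the root path are illustrative assumptions, not a CHPC-mandated layout.

    # fanout_sketch.py - spread many small files across a two-level directory
    # tree instead of one flat directory. ROOT is a hypothetical example path.
    import hashlib
    from pathlib import Path

    ROOT = Path("/scratch/serial/u0123456/dataset")

    def path_for(name: str) -> Path:
        # The first four hex digits of the hash select a subdirectory such as
        # ROOT/ab/cd/, keeping each directory's entry count small.
        digest = hashlib.md5(name.encode()).hexdigest()
        subdir = ROOT / digest[:2] / digest[2:4]
        subdir.mkdir(parents=True, exist_ok=True)
        return subdir / name

    # Usage: path_for("record_000001.dat").write_bytes(b"...")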

CENTER FOR HIGH PERFORMANCE COMPUTING File Operations
Metadata Operations: Performance Considerations
–Create, destroy, stat, etc.
–IOPS-oriented performance
Application I/O: Performance Considerations (see the sketch below)
–How often do we open and close files?
–At what granularity do our applications read and write files?
–Are we doing anything else silly?
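The sketch below contrasts the two habits those questions are probing: reopening a file for every small record versus opening it once and letting large buffered writes coalesce the I/O. The record size and count are arbitrary examples.

    # granularity_sketch.py - fine-grained vs. coarse-grained writes.
    import os

    RECORD = b"x" * 512   # a small record
    N = 10_000

    # Anti-pattern: an open/append/close round trip per record, paying
    # metadata operations on top of each tiny write.
    for _ in range(N):
        with open("fine.dat", "ab") as f:
            f.write(RECORD)

    # Better: open once and batch records through a ~1 MiB buffer, so the
    # file system sees a few large writes instead of thousands of small ones.
    with open("coarse.dat", "wb", buffering=1024 * 1024) as f:
        for _ in range(N):
            f.write(RECORD)

    os.remove("fine.dat")
    os.remove("coarse.dat")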

CENTER FOR HIGH PERFORMANCE COMPUTING Examples
Code (user example)
–The code wrote millions of small files (<10 kB) to a single directory
–The code wrote thousands of larger files (100s of MB)
–The code uses a compression algorithm to speed up I/O of the larger files
Observations (see the chunking sketch below)
–Changing the compression I/O's default read/write chunk from 16 kB to 32 kB or 64 kB improved performance by 10%-20%
–Changing the code to write files into a hierarchical directory structure produced a several-fold speedup
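Here is a sketch of the chunk-size change described above, using zlib as a stand-in for whatever compressor the user's code actually used; the file names are placeholders.

    # chunk_sketch.py - the same compression stream driven with different
    # read/write chunk sizes; "chunk" is the knob the observation above varied.
    import zlib

    def compress_file(src: str, dst: str, chunk: int) -> None:
        comp = zlib.compressobj()
        with open(src, "rb") as fin, open(dst, "wb") as fout:
            while block := fin.read(chunk):
                fout.write(comp.compress(block))
            fout.write(comp.flush())

    # The user's default was a 16 kB chunk; 32-64 kB gave the 10%-20% gain.
    # compress_file("big.dat", "big.z", chunk=64 * 1024)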

CENTER FOR HIGH PERFORMANCE COMPUTING Examples
Single Directory vs. Hierarchical Directory Structure
[benchmark chart]

CENTER FOR HIGH PERFORMANCE COMPUTING Examples
Processing Many Small Files over Network vs. Local
[benchmark chart]

CENTER FOR HIGH PERFORMANCE COMPUTING Examples
Bonnie Test
[benchmark chart]

CENTER FOR HIGH PERFORMANCE COMPUTING Examples
Tar (Linux Kernel)
[benchmark chart]

CENTER FOR HIGH PERFORMANCE COMPUTING Examples
Compile (Linux Kernel)
[benchmark chart]

CENTER FOR HIGH PERFORMANCE COMPUTING Examples
Fine vs. Coarse I/O (Read)
[benchmark chart]

CENTER FOR HIGH PERFORMANCE COMPUTING Examples
Fine vs. Coarse I/O (Writes)
[benchmark chart]

CENTER FOR HIGH PERFORMANCE COMPUTING Troubleshooting
Diagnosing Slowness
–Open a ticket
–File system
–System load
–Network load
Future Monitoring
–Ganglia
Additional Information
–