I/O Types and Usage in DoD
Henry Newman, Instrumental, Inc / DoD HPCMP / DARPA HPCS
May 24, 2004
© Copyright 2004 Instrumental, Inc

Evaluation Issues
What are the scaling problems?

Facts About Performance (1)

System Feature | Earlier System | Current System
CPU Performance | CDC MFLOPS | Earth Simulator 40 TFLOPS
Disk Technology | CDC Cyber RPMs | Seagate Cheetah 15K RPMs
Disk Density | 80 MB | 146 GB
Disk Transfer Rate | 3 MB/sec half duplex | 71.5 MB/sec avg. per disk; 200 MB/sec full duplex RAID
Disk Seek+Latency | 24 ms | 6.0 ms write; 5.6 ms read

Facts About Performance (2)

Item | Times Increase
CPU | 1.6M
RPMs | 4.1
Density | 1814
Transfer Rate (disk) | 23.8
Transfer Rate (RAID) | 133
Seek+Latency (read) | 4.3
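Several of these ratios can be re-derived from the figures on the previous slide. The sketch below uses only values that survive in the transcript. The RAID ratio rests on an assumption that is not stated in the slides: the 200 MB/sec full duplex figure is counted in both directions (400 MB/sec combined) and compared with the 3 MB/sec half duplex disk.

```python
# Figures taken from the "Facts About Performance (1)" slide.
OLD_DISK_MB_S = 3.0           # earlier disk, half duplex
NEW_DISK_MB_S = 71.5          # current disk, average per disk
NEW_RAID_MB_S = 200.0         # current RAID, per direction (full duplex)
OLD_SEEK_LATENCY_MS = 24.0    # earlier disk seek + latency
NEW_SEEK_LATENCY_MS = 5.6     # current disk seek + latency, read


def times_increase(better, worse):
    """Return how many times larger `better` is than `worse`."""
    return better / worse


disk_ratio = times_increase(NEW_DISK_MB_S, OLD_DISK_MB_S)
# Assumption: full duplex is counted as both directions combined.
raid_ratio = times_increase(2 * NEW_RAID_MB_S, OLD_DISK_MB_S)
# For latency, smaller is better, so the old value goes on top.
seek_ratio = times_increase(OLD_SEEK_LATENCY_MS, NEW_SEEK_LATENCY_MS)

print(f"Transfer rate (disk): {disk_ratio:.1f}x")
print(f"Transfer rate (RAID): {raid_ratio:.0f}x")
print(f"Seek+latency (read):  {seek_ratio:.1f}x")
```

The density ratio is left out on purpose: 146 GB over 80 MB gives roughly 1825 with decimal units, so the slide's 1814 presumably rests on capacities that are not shown in the transcript.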

Device Utilization

Tape Facts

Vendor | Drive | Media | Year Introduced | Capacity (MB) | Peak Transfer Rate (MB/sec, uncompressed) | Performance Increase
IBM 3420 Reel-to-Reel IBM IBM IBM 3490E IBM 3490E IBM E IBM StorageTek SD StorageTek T9840A IBM 3590E IBM 3950E 3590E StorageTek T9940A LTO Sony GY-8240FC DTF GB/60GB *** StorageTek T9840B StorageTek T9940B IBM 3590H LTO LTO-II LTO

File System Concerns
- Data fragmentation and allocation
- Metadata fragmentation and allocation
- Recovery from crash or metadata loss
- Performance that scales
- Support for >2TB LUNs
- Failover

Fragmentation
- Fragmentation is becoming a performance problem as file systems grow
- No major technology enhancements have been seen in decades
  - Object Storage Device (OSD, new T10 spec) will change this
- Fragmentation of metadata can have dramatic impact on performance
  - Recently observed 600x slowdown in access at a site
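To see why fragmentation costs so much, a rough single-disk model helps: every extra extent adds one seek plus rotational latency, while the transfer time stays the same. The sketch below is a simplified illustration, not a measurement. It reuses the 5.6 ms read seek+latency and 71.5 MB/sec transfer rate from the performance slide, and the function names are hypothetical. It ignores caching, RAID striping, and metadata lookups, so it is not meant to reproduce the 600x slowdown reported above.

```python
SEEK_LATENCY_S = 0.0056   # 5.6 ms read seek + latency (from the slides)
TRANSFER_MB_S = 71.5      # average per-disk transfer rate (from the slides)


def read_time_s(size_mb, extents):
    """Seconds to read a file of `size_mb` stored in `extents` contiguous pieces.

    Simplified model: one seek + latency per extent, plus pure transfer time.
    """
    if extents < 1:
        raise ValueError("a file has at least one extent")
    return extents * SEEK_LATENCY_S + size_mb / TRANSFER_MB_S


def slowdown(size_mb, extents):
    """Read time relative to the same file stored in a single extent."""
    return read_time_s(size_mb, extents) / read_time_s(size_mb, 1)


if __name__ == "__main__":
    for extents in (1, 100, 10_000):
        print(f"{extents:>6} extents: {read_time_s(715.0, extents):8.2f} s, "
              f"{slowdown(715.0, extents):5.2f}x")
```

In this model the penalty grows linearly with the number of extents, which is why small allocation units on large file systems hurt sequential workloads most.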

USG Types Requirements
What is DoD currently using?

Current Types of Requirements
- Database
  - Used by most sites, big and small, for data reference, especially in the intelligence community
  - Not used by MSRCs much
- Real-time data capture
  - Requirement in intelligence community
- Application
  - Homogeneous shared file system access

Current Types of Requirements
- Archival
  - Used by MSRCs
  - Intelligence community
- Process Flow
  - Used after real-time capture
  - Could be used by MSRCs if a shared file system between HPC and HSM systems were implemented

Database
- 4, 8 or 16 KB I/Os for indexes
  - Random
- 64 KB I/Os for log updates
  - Sequential
- Read and write up to 256 KB
- Just about everyone uses a database somewhere in their HPC systems
  - Although some don't have performance requirements

Real-Time Data Capture
- Large Block Requirement
  - 4 MB-128 MB I/O requests
- Small Block Requirement
  - 1 KB-8 KB files with millions of files per day
- Multiple threads to keep the devices busy with either type of I/O
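The large-block, multi-threaded pattern described above can be sketched as several threads, each issuing fixed-size write requests to its own stream. This is a minimal illustration under stated assumptions: the function names, the file naming, and the thread count are hypothetical (the slide does not give a thread count), and a real capture system would write to RAID-backed devices rather than a temporary directory.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_stream(path, block_size, blocks):
    """Issue `blocks` write requests of `block_size` bytes; return bytes written."""
    buf = bytes(block_size)                     # zero-filled stand-in for captured data
    written = 0
    with open(path, "wb", buffering=0) as f:    # unbuffered: one write call per block
        for _ in range(blocks):
            written += f.write(buf)
    return written


def capture(directory, threads, block_size, blocks_per_thread):
    """Run `threads` concurrent writers, one output file each; return total bytes."""
    paths = [os.path.join(directory, f"stream{i}.dat") for i in range(threads)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        totals = pool.map(
            lambda p: write_stream(p, block_size, blocks_per_thread), paths
        )
        return sum(totals)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # 4 MB requests, the low end of the large-block range on the slide.
        total = capture(d, threads=4, block_size=4 * 1024 * 1024, blocks_per_thread=2)
        print(f"wrote {total} bytes")
```

The same structure covers the small-block case by lowering `block_size` to 1 KB-8 KB and writing one file per request instead of one file per thread.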

Real-Time Data Capture
- Generally requires an HSM
  - Usually needs 100s of MB/sec 7x24 to meet the requirements for capture
- Everything must run at rate
  - I/O Bus
  - RAID devices
  - Switches
  - Limitations of tape bandwidth are pushed
  - HBAs
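Two pieces of arithmetic sit behind this slide: what a sustained rate means in daily volume, and why every component in the path has to keep up. The sketch below works both out. The component rates in `path` are hypothetical numbers chosen for illustration, not figures from the presentation, and the function names are assumptions as well.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_tb(rate_mb_s):
    """Data captured per day, in decimal TB, at a sustained rate in MB/sec."""
    return rate_mb_s * SECONDS_PER_DAY / 1_000_000


def sustained_rate(path_rates_mb_s):
    """Return (component, rate) for the slowest component in the data path.

    The end-to-end sustained rate cannot exceed the slowest component.
    """
    name = min(path_rates_mb_s, key=path_rates_mb_s.get)
    return name, path_rates_mb_s[name]


# Hypothetical component rates in MB/sec, for illustration only.
path = {
    "I/O bus": 500.0,
    "HBAs": 400.0,
    "switches": 400.0,
    "RAID devices": 300.0,
    "tape": 120.0,
}

if __name__ == "__main__":
    print(f"100 MB/sec sustained = {daily_volume_tb(100):.2f} TB/day")
    component, rate = sustained_rate(path)
    print(f"bottleneck: {component} at {rate:.0f} MB/sec")
```

At 100 MB/sec around the clock the system takes in 8.64 TB per day, which is why the archive tier behind the capture path is usually the component that gets pushed hardest.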

Application
- Homogeneous shared file system access
  - Must be able to get the data from the nodes to a single file over Fibre Channel
- High performance I/O from those nodes
  - Depends on the application, but given that the GPFS peak is about 400 MB/sec, that seems to be the current requirement
- Support for a few 100,000 files
  - Nowhere near the HSM requirements
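Getting data from many nodes into a single file usually means each writer owns a disjoint byte range and writes to it with positional I/O, so no shared file pointer is involved. The sketch below illustrates the idea on one host, with threads standing in for compute nodes. That substitution is an assumption made so the example is runnable; on a real system the writers would be separate hosts going through the shared file system. The function names are hypothetical, and `os.pwrite` is available on Unix-like systems only.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_region(fd, rank, block_size, blocks_per_writer):
    """Writer `rank` fills its own contiguous region of the shared file."""
    base = rank * block_size * blocks_per_writer
    payload = bytes([rank % 256]) * block_size      # tag each region with its rank
    for i in range(blocks_per_writer):
        # Positional write: the offset is explicit, so writers never collide.
        os.pwrite(fd, payload, base + i * block_size)


def shared_write(path, writers, block_size, blocks_per_writer):
    """Have `writers` threads write disjoint regions of one file; return its size."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
    try:
        with ThreadPoolExecutor(max_workers=writers) as pool:
            futures = [
                pool.submit(write_region, fd, rank, block_size, blocks_per_writer)
                for rank in range(writers)
            ]
            for future in futures:
                future.result()                     # re-raise any writer error
    finally:
        os.close(fd)
    return os.path.getsize(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        size = shared_write(os.path.join(d, "shared.dat"), 4, 1024 * 1024, 2)
        print(f"shared file size: {size} bytes")
```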

Archival
- Large HSM Systems
  - MSRCs are a good example
  - High speed networks
  - TCP/IP (ftp) data movement
- Future movement to shared file systems, which will make these look more like real-time capture requirements

Process Flow
- These are applications and processes that are done via an assembly-line-like concept
  - Each step uses a machine or machines, sometimes specialized, to move the task along
  - Data communication via a shared file system with multi-threaded large block I/O requests from each of the hosts to various data sets
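The assembly-line idea can be sketched as a chain of stages that hand data to each other through directories on a shared file system. This is a toy model under stated assumptions: all stages run sequentially on one host, the `stageN` directory layout and function names are hypothetical, and the transforms are placeholders for whatever each specialized machine would do.

```python
import os
import tempfile


def run_stage(src_dir, dst_dir, transform):
    """Apply `transform` to every file in `src_dir`, writing results to `dst_dir`."""
    os.makedirs(dst_dir, exist_ok=True)
    done = []
    for name in sorted(os.listdir(src_dir)):
        with open(os.path.join(src_dir, name), "rb") as f:
            data = f.read()
        with open(os.path.join(dst_dir, name), "wb") as f:
            f.write(transform(data))
        done.append(name)
    return done


def run_pipeline(root, transforms):
    """Chain stages stage0 -> stage1 -> ...; return the final output directory."""
    for i, transform in enumerate(transforms):
        run_stage(
            os.path.join(root, f"stage{i}"),
            os.path.join(root, f"stage{i + 1}"),
            transform,
        )
    return os.path.join(root, f"stage{len(transforms)}")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "stage0"))
        with open(os.path.join(root, "stage0", "capture.dat"), "wb") as f:
            f.write(b"raw sensor data")
        out = run_pipeline(root, [bytes.upper, lambda b: b[::-1]])
        print(sorted(os.listdir(out)))
```

Because every stage reads and writes the same file system, no copy step is needed between machines, which is the property the slide relies on.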

Current MSRC Requirements
- Homogeneous shared file system for applications running on the HPC system
- HSM support and access via TCP/IP
- Process Flow should be supported for visualization
- Support for database but no performance requirement

Conclusion
- The future for HPCS machines and most application environments will be shared file systems
- Shared file systems were pioneered for the real-time capture world
- Large file systems are seeing problems with fragmentation and scaling