Storage on the Lunatic Fringe
Thomas M. Ruwart
University of Minnesota Digital Technology Center
Intelligent Storage Consortium

Orientation
- Who are the lunatics? What are their requirements?
- Why is this interesting to the Storage Industry?
- What is SNIA doing about this?
- Conclusions

Who are the Lunatics?
- DoE Accelerated Strategic Computing Initiative (ASCI)
  - BIG data, locally and widely distributed, high-bandwidth access, relatively few users, secure, short-term retention
- High Energy Physics (HEP) – Fermilab, CERN, DESY
  - BIG data, locally distributed, widely available, moderate number of users, sparse access, long-term retention
- NASA – Earth Observing System Data and Information System (EOSDIS)
  - Moderately sized data, locally distributed, widely available, large number of users, very long-term retention
- DoD – NSA
  - Lots of little data – trillions of files, locally distributed, relatively few users, secure, long-term retention
- DoD – Army High Performance Computing Centers and Naval Research Center
  - BIG data, locally and widely distributed, relatively few users, high-bandwidth access, secure, very long-term reliable retention

A bit of History
- 1990 – Supercomputer centers operating with HUGE disk farms measured in GB!
- 1990 – Laptop computers have 50MB internal disk drives!
- 1992 – Fast/wide SCSI runs at breakneck speeds of 20 MB/sec!
- 1994 – Built a 1+TB array of disks with a single SGI xFS file system and wrote a single 1TB file (see the capacity check below)
  - Used 4GB disks in 7+1 RAID 5 disk arrays
  - 36 disk arrays mounted in 5 racks
- 1997 – ASCI Mountain Blue – 75TB – distributed
- 2002 – ASCI Q – 700TB – online, high performance, pushing the limits of traditional [legacy] block-based file systems
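A quick sanity check on that 1994 configuration (a back-of-envelope sketch; the 4GB drive size, the 7+1 RAID 5 layout, and the 36-array count come from the slide, the rest is plain arithmetic):

```python
# Usable capacity of the 1994 array: 36 RAID 5 arrays, each with 7 data disks
# plus 1 parity disk, built from 4 GB drives (figures from the slide above).
DISK_GB = 4
DATA_DISKS_PER_ARRAY = 7   # 7+1 RAID 5: parity consumes one disk's worth
ARRAYS = 36

usable_gb = ARRAYS * DATA_DISKS_PER_ARRAY * DISK_GB
print(f"Usable capacity: {usable_gb} GB (~{usable_gb / 1000:.2f} TB)")
# -> 1008 GB, just over 1 TB -- enough to hold the single 1 TB test file.
```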

The not-too-distant Future
- 2004 – ASCI Red Storm – 240TB – online, high bandwidth, massively parallel
- 2005 – ASCI Purple – 3000TB – online, high performance, OSD/Lustre
- 2006 – NASA RDS – 6000TB – online, global access, CAS, OSD, Data Grids, Lustre?
- 2007 – DoE Fermilab / CERN – 3 PB/year online/nearline, global sparse access
- 2010 – Your laptop will have a 1TB internal disk that will still be barely adequate for MS Office™

DoE ASCI
- 1998 – Mountain Blue – Los Alamos
  - Multi-processor SGI Origin 2000 systems
  - 75TB disk storage
- 2002 – Q
  - Multi-processor compute nodes plus dedicated I/O nodes
  - FC connections to 64 I/O nodes
  - FC connections to the disk storage subsystem
  - 692 TB disk storage, 20GB/sec bandwidth (see the sketch below)
  - 2 file systems of 346TB each
  - 4 file system layers between the application and the disk media
- 2004 – Red Storm
  - 10,000 processors, 10TB main memory
  - 240TB disk, 50 GB/sec bandwidth
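For perspective on those bandwidth figures, here is a simple sketch (assuming the quoted aggregate bandwidths can be fully sustained, which real workloads rarely achieve) of how long one full pass over each machine's disk store would take:

```python
# Time for one full sweep of the disk store at the quoted aggregate bandwidth.
systems = {
    "ASCI Q (692 TB @ 20 GB/s)": (692e12, 20e9),
    "Red Storm (240 TB @ 50 GB/s)": (240e12, 50e9),
}
for name, (capacity_bytes, bandwidth_bytes_per_s) in systems.items():
    hours = capacity_bytes / bandwidth_bytes_per_s / 3600
    print(f"{name}: ~{hours:.1f} hours per full pass")
# ASCI Q: ~9.6 hours; Red Storm: ~1.3 hours
```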

DoE ASCI Purple Requirements
- Parallel I/O bandwidth – multiple (up to 60,000) clients access one file at hundreds of GB/sec.
- Support for very large (multi-petabyte) file systems – single files of multi-terabyte size must be permitted.
- Scalable file creation & metadata operations
  - Tens of millions of files in one directory
  - Thousands of file creates per second within the same directory (see the sketch below)
- Archive-driven performance – the file system should support high-bandwidth data movement to tertiary storage.
- Adaptive pre-fetching – sophisticated pre-fetch and write-behind schemes are encouraged, but a method to disable them must accompany them.
- Flow control & quality of I/O service
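To illustrate why thousands of creates per second in a single directory is a demanding target, here is a minimal, hypothetical microbenchmark sketch (not from the original talk): it measures the create rate a single client sees on a local file system, whereas Purple needed comparable aggregate rates sustained across tens of thousands of clients.

```python
import os
import tempfile
import time

def measure_create_rate(num_files: int = 10_000) -> float:
    """Create empty files in one directory and return files created per second."""
    with tempfile.TemporaryDirectory() as directory:
        start = time.perf_counter()
        for i in range(num_files):
            # open + close is enough to create the file and its metadata entry
            open(os.path.join(directory, f"f{i:07d}"), "w").close()
        elapsed = time.perf_counter() - start
    return num_files / elapsed

if __name__ == "__main__":
    print(f"~{measure_create_rate():.0f} creates/sec (single client, local file system)")
```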

HEP – Fermilab and CMS
- The Compact Muon Solenoid (CMS)
  - $750M experiment being built at CERN in Switzerland
  - Will be active in 2007
  - Data rate from the detectors is ~1 PB/sec
  - Data rate after filtering is ~hundreds of MB/sec (see the back-of-envelope below)
- The Data Problem
  - Dataset for a single experiment is ~1PB
  - Several experiments per year are run
  - Must be made available to 5000 scientists all over the planet (Earth, primarily)
  - Dense dataset, sparse data access by any one user
  - Access patterns are not deterministic
HEP experiments cost ~$US 1B, last 20 years, involve thousands of collaborators at hundreds of institutions worldwide, and collect and analyze several petabytes of data per year.
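A hedged back-of-envelope ties these numbers together: assuming ~300 MB/sec of filtered data (an assumed point within "hundreds of MB/sec") and roughly 10^7 seconds of effective running time per year (a common accelerator rule of thumb, not a figure from the slide), the filtered stream alone produces a few petabytes per year, consistent with the ~1PB-per-experiment datasets and the 3 PB/year figure earlier in the talk.

```python
# Filtered detector data rate -> approximate annual dataset size.
filtered_rate_bytes_per_s = 300e6    # assumed 300 MB/sec ("hundreds of MB/sec")
live_seconds_per_year = 1e7          # assumed ~10^7 s of effective beam time/year

petabytes_per_year = filtered_rate_bytes_per_s * live_seconds_per_year / 1e15
print(f"~{petabytes_per_year:.0f} PB/year of filtered data")   # ~3 PB/year
```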

LHC Data Grid Hierarchy (CMS as example; Atlas is similar)
[Diagram: the online system feeds a physics data cache at ~PByte/sec; reconstructed events flow at ~100 MBytes/sec to the Tier 0+1 center at CERN, then over ~2.5 Gbits/sec links to Tier 1 regional centers (FermiLab in the USA and the French, German, and Italian regional centers), and on to Tier 2 centers, Tier 3 institutes (~0.25 TIPS), and Tier 4 workstations for analysis and event simulation.]
CERN/CMS data goes to 6-8 Tier 1 regional centers, and from each of these to 6-10 Tier 2 centers. Physicists work on analysis "channels" at 135 institutes. Each institute has ~10 physicists working on one or more channels. Physicists in 31 countries are involved in this 20-year experiment, in which DOE is a major player. CMS detector: 15m x 15m x 22m, 12,500 tons, $700M (human = 2m for scale).
Courtesy Harvey Newman, CalTech and CERN

NASA EOSDIS
- Remote Data Store Project:
  - Build a 6PB data archive with a life expectancy of at least 20 years, probably more
  - Make data and data products available to 2 million users
- What to use?
  - Online versus nearline
  - SCSI vs ATA
  - Tape vs optical
  - How much of each, and when?
  - Data Grids?
- Dealing with technology life cycles – continual migration (see the sketch below)
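One way to see why continual migration is unavoidable (a sketch under an assumed five-year media/technology refresh cycle, which is not a figure from the talk): a 6PB archive has to be copied onto new media more or less continuously just to stay readable.

```python
# Sustained bandwidth needed to migrate the whole archive once per technology cycle.
archive_bytes = 6e15                 # 6 PB archive (from the slide)
refresh_cycle_years = 5              # assumed media/technology refresh cycle
cycle_seconds = refresh_cycle_years * 365 * 24 * 3600

mb_per_sec = archive_bytes / cycle_seconds / 1e6
print(f"~{mb_per_sec:.0f} MB/sec sustained, around the clock, just to keep pace")
# -> roughly 38 MB/sec of migration traffic on top of all normal user access.
```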

DoD NSA
- How to deal with a trillion files?
  - At 256 bytes of metadata per file -> 256TB just for the file system metadata for one trillion files
  - File system resiliency
  - Backups? Forget it.
- File creation rate is a challenge – 32,000 files per second sustained for a year will generate roughly 1 trillion files (see the arithmetic below)
- How to search for any given file
- How to search for any given piece of information inside all the files
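Both figures on this slide are straightforward arithmetic (only the 256-byte metadata size and the 32,000 creates/sec rate come from the slide):

```python
# Metadata footprint and creation time for one trillion files.
files = 1e12
metadata_bytes_per_file = 256
creates_per_second = 32_000
seconds_per_year = 365 * 24 * 3600

metadata_tb = files * metadata_bytes_per_file / 1e12
years_to_create = files / creates_per_second / seconds_per_year
print(f"Metadata alone: {metadata_tb:.0f} TB")                        # 256 TB
print(f"Time to create at 32,000/sec: ~{years_to_create:.2f} years")  # ~1 year
```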

DoD MSRC
- 500TB per year data growth (see the arithmetic below)
- Longevity of data retention is critical
  - 100% reliable access to any piece of data for 20+ years
- Security is critical
- Reasonably quick access to any piece of data from anywhere at any time
- Heterogeneous computing and storage environment
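For scale, 500TB per year is a modest average ingest rate but a very large retention obligation (simple arithmetic; only the 500TB/year and 20+ year figures come from the slide):

```python
# Average ingest rate and retained volume implied by 500 TB/year of growth.
growth_bytes_per_year = 500e12
seconds_per_year = 365 * 24 * 3600

avg_mb_per_sec = growth_bytes_per_year / seconds_per_year / 1e6
retained_pb_after_20_years = growth_bytes_per_year * 20 / 1e15
print(f"Average ingest: ~{avg_mb_per_sec:.0f} MB/sec")                   # ~16 MB/sec
print(f"Retained after 20 years: ~{retained_pb_after_20_years:.0f} PB")  # ~10 PB
```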

History has shown…
- The problems that the Lunatic Fringe is working on today are the problems that the mainstream storage industry will face in 5-10 years
- Legacy block-based file systems break at these scales
- Legacy network file system protocols cannot scale to meet these extreme requirements

Looking Forward

What happens when…
- NEC announces a 10Tbit memory chip
- Disk drives reach 1TByte and beyond
- MEMS devices become commercially viable
- Holographic storage devices become commercially viable
- Interface speeds reach 1Tbit/sec
- Intel develops the sub-space channel
Vendors need better ways to exploit the capabilities of these technologies rather than merely react to them

Common thread
Their data storage capacity, access, and retention requirements are continually increasing
Some of the technologies and concepts the Lunatic Fringe is looking at include:
- Object-based Storage Devices (OSD)
- Intelligent Storage
- Data Grids
- Borg Assimilation Technologies, …etc.

How does SNIA make a difference?
- Act as a point to achieve critical mass behind emerging technologies such as OSD, SMI, and Intelligent Storage
- Make sure that these emerging technologies come to market from the beginning as standards (not proprietary implementations that migrate to standards)
- Help emerging technologies such as OSD and Intelligent Storage get over the potential barrier
- Help to generate vendor and user awareness and education regarding future trends and emerging technologies

Conclusions
- Lunatic Fringe users will continue to push the limits of existing hardware and software technologies
- The Lunatic Fringe is a moving target – there will always be a Lunatic Fringe well beyond where you are
- The Storage Industry at large should pay more attention to
  - What they are doing
  - Why they are doing it
  - What they learn

References
- University of Minnesota Digital Technology Center –
- ASCI –
- Fermilab –
- NASA EOSDIS –
- NSA –

Contact Info