Presentation transcript:

BlueArc IOZone / Root Benchmark

How well do VM clients perform vs. Bare Metal (BM) clients?
- IOZone: Bare Metal reads are ~10% faster than VM reads; Bare Metal writes are ~5% faster than VM writes. Note: results vary depending on the overall system conditions (network, storage, etc.).
- Root benchmark: read bandwidth is essentially the same on Bare Metal and VM.

Do transmit ethernet buffer sizes (txqueuelen) affect write performance?
- For these "reasonable" values of txqueuelen we do NOT see any effect on write performance: varying the ethernet buffer size for the client VMs does not change read/write bandwidth.

  Eth interface    | txqueuelen
  Host             | 1000
  Host / VM bridge | 500, 1000, 2000
  VM               | 1000

[Plots: read and write bandwidth (MB/s) for 21 VM clients, 11 BM clients, and 49 VM clients; 3 slow clients at ~3 MB/s; outliers consistently slow.]

Root-app read rates, 21 clients: 8.15 ± 0.03 MB/s (Lustre: 12.6 ± 0.06 MB/s, Hadoop: ~7.9 ± 0.1 MB/s).
Note: the NOvA skimming app reads 50% of the events by design. On BlueArc, OrangeFS, and Hadoop, clients transfer 50% of the file; on Lustre they transfer 85%, because the default read-ahead configuration is inadequate for this use case.

Hadoop IOZone / Root Benchmark

How well do VM clients perform vs. Bare Metal clients? Is there a difference for External vs. On-Board clients? How does the number of replicas change performance?

[Plots: read and write bandwidth for On-Board vs. External clients and vs. the number of replicas.]

IOZone reads:
- On-Board Bare Metal clients gain from kernel caching for a few clients; for many clients they are the same or ~50% faster than On-Board VM clients and ~100% faster than External VM clients.
- External VM clients are up to 100% faster than On-Board VM clients.
- Multiple replicas have little effect on read bandwidth.

IOZone writes:
- On-Board Bare Metal client writes gain from kernel caching: generally faster than VM clients.
- On-Board VM clients are 50%-200% faster than External VM clients.
- All VM writes scale well with the number of clients.
- For External VM clients, write speed scales almost linearly with the number of replicas.

Root benchmark:
- Root-app read rate: ~7.9 ± 0.1 MB/s (Lustre on Bare Metal was 12.6 ± 0.06 MB/s read).
- External (ITB) clients read ~5% faster than On-Board (FCL) clients; the number of replicas has minimal impact on read bandwidth.
- External and On-Board clients get the same share of the bandwidth among themselves (within ~2%). At saturation, External clients read ~10% faster than On-Board clients (same as OrangeFS, different from Lustre).
- 49 clients (1 process / VM / core) saturate the bandwidth to the server. Is the distribution of the bandwidth fair? Read time is up to 233% of the minimum time (113% without outliers); 10 files / 15 GB, minimum read time 1117 s (the fairness metric is sketched below).
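The "read time up to 233% of the minimum (113% without outliers)" figure is simply the slowest client's read time expressed as a percentage of the fastest client's. A minimal sketch of that calculation follows; the per-client times are invented for illustration (only the 1117 s minimum appears on the slide, and the values were chosen so the spread matches the quoted percentages), and the outlier cut (more than twice the median time) is an assumption, since the slides do not say how outliers were identified.

```python
import statistics

def fairness_spread(read_times_s, outlier_factor=2.0):
    """Return (max/min read time in %, same after dropping outliers).

    Outlier rule is an assumption: a client counts as an outlier if its
    read time exceeds `outlier_factor` times the median time."""
    t_min = min(read_times_s)
    full_spread = 100.0 * max(read_times_s) / t_min

    median = statistics.median(read_times_s)
    kept = [t for t in read_times_s if t <= outlier_factor * median]
    trimmed_spread = 100.0 * max(kept) / min(kept)
    return full_spread, trimmed_spread

if __name__ == "__main__":
    # Hypothetical per-client read times (s) for a 49-client run over 10 files / 15 GB;
    # not the real measurements.
    times = [1117, 1150, 1180, 1210, 1230, 1262, 2540, 2600] + [1190] * 41
    full, trimmed = fairness_spread(times)
    print(f"read time up to {full:.0f}% of min ({trimmed:.0f}% without outliers)")
```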
OrangeFS IOZone / Root Benchmark

How well do VM clients perform vs. Bare Metal clients? Is there a difference for External vs. On-Board clients? How does the number of name nodes change performance?

IOZone reads:
- On-Board Bare Metal clients read almost as fast as On-Board VM clients (faster configuration, with 4 name nodes and 3 data nodes), but 50% slower than the VM clients for many processes: possibly too many processes for the OS to manage.
- On-Board VM clients read 10%-60% faster than External VM clients.
- Using 4 name nodes improves read performance by 10%-60% compared to 1 name node (different from write performance). Performance is best when each name node serves a fraction of the clients.

IOZone writes:
- On-Board Bare Metal clients write 80% slower than VM clients: possibly too many processes for the OS to manage. Write performance is NOT consistent.
- On-Board VM clients generally have the same performance as External VM clients.
- One reproducible measurement shows 70% slower writes for External VM clients when using 4 name nodes with each name node serving a fraction of the clients; this differs from the read behaviour.

Root benchmark:
- External (ITB) clients read ~7% faster than On-Board (FCL) clients (same as Hadoop, opposite of Lustre with a virtual server).
- 49 clients (1 process / VM / core) saturate the bandwidth to the server. Is the distribution of the bandwidth fair? Read time is up to 125% of the minimum time; 10 files / 15 GB, minimum read time 948 s.
- External clients get the same share of the bandwidth among themselves (within ~2%), as for Hadoop; On-Board clients have a larger spread (~20%), as for the Lustre virtual server. At saturation, External clients read on average ~10% faster than On-Board clients (same as Hadoop, different from the Lustre virtual server).

[Plot: read bandwidth for On-Board vs. External clients.]

Lustre IOZone / Root Benchmark

How well does Lustre perform with the server on Bare Metal vs. on a VM (Lustre server VM: SL5 and KVM)?
- Read: same for the Bare Metal and VM server (with the virtio network driver).
- Read: On-Board clients are 15% slower than External clients (not significant).
- Write: the Bare Metal server is 3x faster than the VM server. Note: SL6 may have better write performance.
- Striping has a 5% effect on reading and none on writing. Changing the number of cores on the server VM has no effect.

Read performance: how does the Lustre server on a VM compare with the Lustre server on Bare Metal for External and On-Board clients?
- Non-striped Bare Metal (BM) server, the baseline for reads (External clients): BW = … ± 0.06 MB/s (1 client vs. bare metal: 15.6 ± 0.2 MB/s).
- External clients vs. virtual server: BW = … ± 0.08 MB/s (1 ITB client: 15.3 ± 0.1 MB/s). The virtual server is almost as fast as Bare Metal for reads (External clients).
- On-Board clients vs. virtual server: BW = … ± 0.05 MB/s (1 FCL client: 14.4 ± 0.1 MB/s). Virtual clients are as fast as Bare Metal clients for reads; On-Board clients are 6% faster than External clients (opposite of Hadoop and OrangeFS).

Does striping affect read bandwidth for External and On-Board clients?
- With 4 MB stripes on 3 OSTs, External clients vs. virtual server: BW = … ± 0.01 MB/s. Data striping gives a more "consistent" bandwidth (a configuration sketch follows below).

49 clients saturate the bandwidth: is the distribution fair? Clients do NOT all get the same share of the bandwidth (differences of up to ~20%).
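As a concrete illustration of the striping layout referenced above ("4 MB stripes on 3 OSTs"), the sketch below shows how a test directory could be given such a layout with the standard Lustre `lfs setstripe` tool. The mount point and the Python wrapper are assumptions for illustration; this is not the testbed's actual configuration script.

```python
"""Sketch: apply a 4 MB / 3-OST stripe layout to a Lustre test directory.

Assumes a Lustre client with the `lfs` utility installed and a mounted
filesystem at TEST_DIR (placeholder path)."""

import subprocess

TEST_DIR = "/mnt/lustre/iozone-test"   # placeholder mount point

def set_striping(directory: str, stripe_size: str = "4M", stripe_count: int = 3) -> None:
    # New files created under `directory` inherit this stripe layout.
    subprocess.run(
        ["lfs", "setstripe", "-S", stripe_size, "-c", str(stripe_count), directory],
        check=True,
    )

def show_striping(path: str) -> str:
    # Report the layout actually applied, for the test log.
    return subprocess.run(
        ["lfs", "getstripe", path], check=True, capture_output=True, text=True
    ).stdout

if __name__ == "__main__":
    set_striping(TEST_DIR)
    print(show_striping(TEST_DIR))
```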
MetaData Comparison

How well do the name nodes scale with the number of clients? The Hadoop Name Node scales better than OrangeFS.

Acknowledgments

Ted Hesselroth, Doug Strain – IOZone performance measurements.
Andrei Maslennikov – HEPiX storage group.
Andrew Norman, Denis Perevalov – NOvA framework for the storage benchmarks and HEPiX work.
Robert Hatcher, Art Kreymer – MINOS framework for the storage benchmarks and HEPiX work.
Steve Timm, Neha Sharma, Hyunwoo Kim – FermiCloud support.
Alex Kulyavtsev, Amitoj Singh – Consulting.
Fermilab is operated by the Fermi Research Alliance, LLC under Contract No. DE-AC02-07CH11359 with the United States Department of Energy.

Storage Testbed

[Diagram: "Bare Metal" clients / servers and On-Board vs. External clients. An On-Board client VM runs on the same host as the storage server; External (ITB) clients run on separate hosts.]

Conclusions

Lustre on Bare Metal has the best performance as an external storage solution for the root skim application use case (fast read, little write) and consistent performance for general operations (tested via IOZone). Consider the operational drawback of the special kernel. On-board clients are possible only via virtualization, but the server VM allows only slow writes.
Hadoop, OrangeFS, and BlueArc have equivalent performance for the root skim use case. Hadoop has good operational properties (maintenance, fault tolerance) and a fast name server, but its performance is not impressive. BlueArc at FNAL is a good alternative for general operations, since it is a well-known, production-quality solution.
The results of the study support the growing deployment of Lustre at Fermilab, while maintaining the BlueArc infrastructure.

  Storage  | Benchmark  | Read (MB/s) | Write (MB/s) | Notes
  Lustre   | IOZone     |             |              | (70 on VM)
  Lustre   | Root-based | 12.6        | -            |
  Hadoop   | IOZone     |             |              | varies with number of replicas
  Hadoop   | Root-based | 7.9         | -            |
  BlueArc  | IOZone     |             |              | varies with overall conditions
  BlueArc  | Root-based | 8.4         | -            |
  OrangeFS | IOZone     |             |              | varies with number of name nodes
  OrangeFS | Root-based | 8.1         | -            |

Introduction

- Set the scale: measure storage metrics from running experiments to set the scale of expected bandwidth, typical file size, number of clients, etc.
- Install storage solutions on the FermiCloud testbed: Lustre, BlueArc, Hadoop, OrangeFS.
- Measure performance: run standard benchmarks on the storage installations; study the response of each technology to the access patterns of real-life (root-based skim) applications; use the HEPiX storage group infrastructure to characterize the response to Intensity Frontier (IF) applications.
- Fault tolerance: simulate faults and study the reactions.
- Operations: comment on potential operational issues.
- Clients on virtual machines: can we take advantage of the flexibility of cloud resources?

Data Access Tests

- IOZone: writes a 2 GB file from each client and performs read/write tests (an invocation sketch follows this list). Setup: 3-60 clients on Bare Metal (BM) and 3-21 VMs/nodes.
- Root-based applications: the off-line root-based framework (ana) of the NOvA neutrino Intensity Frontier (IF) experiment, running a "skim job" that reads a data file and discards a large fraction of the events. Reads stressed storage access; writes proved CPU-bound. Setup: 3-60 clients on Bare Metal and 3-21 VMs/nodes.
- MDTest: different metadata filesystem operations on up to 50k files/directories using different access patterns. Setup: clients on 21 VMs.
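The list above only summarizes the IOZone setup; a minimal sketch of the kind of per-client invocation it implies is given below. The 2 GB file size comes from the slide, while the record size, file path, and the Python wrapper are assumptions; the exact options used on the testbed are not documented here.

```python
"""Sketch: one client's IOZone run, writing a 2 GB file and timing
write/rewrite and read/reread, roughly as described in the Data Access Tests."""

import subprocess

def run_iozone(test_file: str, file_size: str = "2g", record_size: str = "1m") -> str:
    cmd = [
        "iozone",
        "-i", "0",          # test 0: write / rewrite
        "-i", "1",          # test 1: read / reread
        "-s", file_size,    # 2 GB file per client, as in the testbed setup
        "-r", record_size,  # record size (assumed; not given in the slides)
        "-e", "-c",         # include fsync and close in the timings
        "-f", test_file,    # per-client test file on the storage under test
    ]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

if __name__ == "__main__":
    # e.g. one of the 3-60 bare-metal or 3-21 VM clients writing to the mounted storage
    print(run_iozone("/mnt/storage-under-test/client01.dat"))
```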