Storage at SMU
OSG Storage, 9/22/2010
Justin Ross, Southern Methodist University

Tier 3 Hardware
– 20 Dell C6100 (80 nodes)
– 32 Dell M-series blades and Dell M610 blades with InfiniBand
– 10 Gb cluster network backbone, 1 Gb to each node
– 6 GB RAM per core
– 14 Dell C6100 (56 nodes) just delivered
– Dell 8024F and 6248 switches
– Experiments: ATLAS, NOvA, CDMS, BaBar, D-Zero, Biology, Math, Mechanical Engineering, Statistics

Tier 3 Config
– Lustre FS (ext3 on the individual storage targets)
– NFS for special-case use
– BeStMan SRM
– Condor for batch jobs
– SGE for parallel jobs
– Nagios for monitoring
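As a rough sketch of how this mix looks from a compute node: the filesystem name smuhpc appears in the listings later in the talk, but the MGS hostname and the NFS export path below are placeholders, not details from the slides.

  # /etc/fstab fragment on a compute node (sketch; hostnames and paths assumed)
  mds1@tcp0:/smuhpc    /lustre    lustre  defaults,_netdev  0 0
  nfssrv:/export/bio   /data/bio  nfs     defaults,_netdev  0 0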

Lustre Layout
Dell MD3000 (MDT)
– 14 x 15K RPM SAS drives
– Snapshots used for MDT backup
Nexsan SATABeast (OSTs)
– 42 x 2 TB SATA drives
– Single controller
– 2 x 4 Gb fibre ports
Dell R710 servers
– Intel CPUs
– 2 x 500 GB SATA drives in RAID 1 for the OS
– 16 GB RAM
– 4 x 1 Gb bonded NICs
– QLogic 4 Gb HBA / 6 Gb SAS HBA
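For context, targets in a layout like this are created with mkfs.lustre. The sketch below reuses the smuhpc filesystem name from the later listings; the device paths and the MGS node name are placeholders, not details from the slide.

  # Formatting sketch for the targets above (device paths and hostnames assumed)
  mkfs.lustre --mgs /dev/mapper/mgs_lun                         # management target
  mkfs.lustre --fsname=smuhpc --mdt --mgsnode=mds1@tcp0 \
              /dev/mapper/mdt_lun                               # MD3000-backed MDT
  mkfs.lustre --fsname=smuhpc --ost --mgsnode=mds1@tcp0 \
              /dev/mapper/satabeast_lun                         # one SATABeast OST

Each target is then brought into service simply by mounting it, as the Startup/Shutdown slide shows.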

Config
– Lustre-provided kernel on all MDS/OSS nodes
– Patchless kernel module on the clients; the module must be recompiled every time a new client kernel is released (a build sketch follows this list)
– Performance is good
  – Performance increases when more OSSes/OSTs are added
  – The one drawback is small files
  – A biology professor with millions of small files uses NFS instead
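A minimal sketch of that client-module rebuild, assuming a Lustre 1.8-era source tree and a matching kernel-devel package already on the node; the version number and paths are illustrative only.

  # Rebuild the patchless client modules against the new kernel (sketch)
  cd /usr/src/lustre-1.8.4                      # version illustrative
  ./configure --disable-server \
              --with-linux=/usr/src/kernels/$(uname -r)-$(uname -m)
  make rpms                                     # builds lustre-client(-modules) RPMs
  # install the resulting client RPMs, then: modprobe lustre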

Startup/Shutdown
Lustre is like a freight train: it is big and takes a long time to start and stop.
To start, just mount everything:
– mount /mdt
– mount /mgs
– mount /[2345]-ost[123456] on the appropriate server
To shut down, unmount in the reverse direction:
– umount /[2345]-ost[123456] on the appropriate server
– umount /mdt
– umount /mgs
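A small wrapper can serialize those mounts across the servers. The sketch below assumes the mount points are already in /etc/fstab (as the bare "mount /mdt" style above implies); the MDS/OSS hostnames are placeholders.

  #!/bin/bash
  # Hypothetical start/stop wrapper for the Lustre mounts listed above.
  MDS=mds1                        # assumed to serve /mgs and /mdt
  OSS_LIST="oss2 oss3 oss4 oss5"  # each serves its /N-ostM targets
  case "$1" in
    start)
      ssh "$MDS" 'mount /mgs; mount /mdt'   # MGS is usually brought up before the MDT
      for oss in $OSS_LIST; do
        ssh "$oss" 'for t in /[2345]-ost[123456]; do grep -qs " $t " /proc/mounts || mount "$t"; done'
      done
      ;;
    stop)
      for oss in $OSS_LIST; do
        ssh "$oss" 'for t in /[2345]-ost[123456]; do grep -qs " $t " /proc/mounts && umount "$t"; done'
      done
      ssh "$MDS" 'umount /mdt; umount /mgs'
      ;;
    *) echo "usage: $0 start|stop" >&2; exit 1 ;;
  esac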

Traffic Flow/Monitoring
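One low-tech way to watch traffic is to sample the Lustre /proc counters directly. The sketch below assumes it runs on an OSS with the 1.8-era obdfilter layout; that layout and the fact that monitoring was done this way are assumptions, not something shown on the slide.

  #!/bin/bash
  # Sample cumulative per-OST read/write byte counters on an OSS (sketch).
  for f in /proc/fs/lustre/obdfilter/smuhpc-OST*/stats; do
      ost=$(basename "$(dirname "$f")")
      # the read_bytes/write_bytes rows end with the cumulative byte count
      awk -v ost="$ost" '/^(read|write)_bytes/ {printf "%s %s %s bytes\n", ost, $1, $NF}' "$f"
  done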

lustre]$ pwd
/proc/fs/lustre
lustre]$ cat devices
  0 UP mgc 124b7d77-8f cf-80a0098aa43b 5
  1 UP lov smuhpc-clilov-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 4
  2 UP mdc smuhpc-MDT0000-mdc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
  3 UP osc smuhpc-OST0000-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
  4 UP osc smuhpc-OST0001-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
  5 UP osc smuhpc-OST0002-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
  6 UP osc smuhpc-OST0003-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
  7 UP osc smuhpc-OST0004-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
  8 UP osc smuhpc-OST0005-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
  9 UP osc smuhpc-OST0006-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 10 UP osc smuhpc-OST0007-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 11 UP osc smuhpc-OST000b-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 12 UP osc smuhpc-OST000c-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 13 UP osc smuhpc-OST000d-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 14 UP osc smuhpc-OST000e-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 15 UP osc smuhpc-OST000f-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 16 UP osc smuhpc-OST0010-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 17 UP osc smuhpc-OST0011-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 18 UP osc smuhpc-OST0012-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 19 UP osc smuhpc-OST0013-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 20 UP osc smuhpc-OST0014-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 21 UP osc smuhpc-OST0015-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 22 UP osc smuhpc-OST0016-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 23 UP osc smuhpc-OST0008-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 24 UP osc smuhpc-OST0009-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 25 UP osc smuhpc-OST000a-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 26 UP osc smuhpc-OST0017-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 27 UP osc smuhpc-OST0018-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 28 UP osc smuhpc-OST0019-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 29 UP osc smuhpc-OST001a-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 30 UP osc smuhpc-OST001b-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 31 UP osc smuhpc-OST001c-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 32 UP osc smuhpc-OST001d-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 33 UP osc smuhpc-OST001e-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 34 UP osc smuhpc-OST001f-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 35 UP osc smuhpc-OST0020-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
 36 UP osc smuhpc-OST0021-osc-ffff810c6f e6d03b78-edcd-a8ee c81b2fa0 5
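The same /proc files drive a simple probe that Nagios can call on every node. A minimal sketch, assuming the 1.8-era behavior where health_check reads "healthy" when everything is fine:

  #!/bin/bash
  # Lustre health probe based on /proc/fs/lustre (sketch).
  if ! grep -q '^healthy' /proc/fs/lustre/health_check; then
      echo "CRITICAL: Lustre reports NOT HEALTHY"
      exit 2
  fi
  down=$(awk '$2 != "UP"' /proc/fs/lustre/devices)
  if [ -n "$down" ]; then
      echo "WARNING: Lustre devices not UP:"
      echo "$down"
      exit 1
  fi
  echo "OK: all Lustre devices UP"
  exit 0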

~]$ lfs df -h
UUID bytes Used Available Use% Mounted on
smuhpc-MDT0000_UUID 656.2G 3.6G 652.6G 0% /lustre[MDT:0]
smuhpc-OST0000_UUID 4.5T 3.6T 907.1G 80% /lustre[OST:0]
smuhpc-OST0001_UUID 4.5T 3.6T 919.7G 79% /lustre[OST:1]
smuhpc-OST0002_UUID 4.5T 3.6T 897.7G 80% /lustre[OST:2]
smuhpc-OST0003_UUID 4.5T 3.6T 872.7G 80% /lustre[OST:3]
smuhpc-OST0004_UUID 4.5T 3.6T 915.9G 80% /lustre[OST:4]
smuhpc-OST0005_UUID 4.5T 3.6T 911.2G 80% /lustre[OST:5]
smuhpc-OST0006_UUID 4.5T 3.6T 915.5G 80% /lustre[OST:6]
smuhpc-OST0007_UUID 4.5T 3.6T 872.0G 80% /lustre[OST:7]
smuhpc-OST0008_UUID 4.5T 22.8G 4.5T 0% /lustre[OST:8]
smuhpc-OST0009_UUID 4.5T 22.5G 4.5T 0% /lustre[OST:9]
smuhpc-OST000a_UUID 4.5T 22.0G 4.5T 0% /lustre[OST:10]
smuhpc-OST000b_UUID 4.5T 3.5T G 77% /lustre[OST:11]
smuhpc-OST000c_UUID 4.5T 3.5T G 77% /lustre[OST:12]
smuhpc-OST000d_UUID 4.5T 3.5T G 78% /lustre[OST:13]
smuhpc-OST000e_UUID 4.5T 3.5T G 77% /lustre[OST:14]
smuhpc-OST000f_UUID 4.5T 3.2T 1.3T 70% /lustre[OST:15]
smuhpc-OST0010_UUID 4.5T 3.2T 1.3T 70% /lustre[OST:16]
smuhpc-OST0011_UUID 4.5T 3.1T 1.4T 69% /lustre[OST:17]
smuhpc-OST0012_UUID 4.5T 3.2T 1.3T 70% /lustre[OST:18]
smuhpc-OST0013_UUID 4.5T 2.9T 1.6T 64% /lustre[OST:19]
smuhpc-OST0014_UUID 4.5T 2.9T 1.6T 65% /lustre[OST:20]
smuhpc-OST0015_UUID 4.5T 2.8T 1.6T 63% /lustre[OST:21]
smuhpc-OST0016_UUID 4.5T 2.8T 1.6T 63% /lustre[OST:22]
smuhpc-OST0017_UUID 4.5T 16.7G 4.5T 0% /lustre[OST:23]
smuhpc-OST0018_UUID 4.5T 11.9G 4.5T 0% /lustre[OST:24]
smuhpc-OST0019_UUID 4.5T 16.8G 4.5T 0% /lustre[OST:25]
smuhpc-OST001a_UUID 4.5T 15.5G 4.5T 0% /lustre[OST:26]
smuhpc-OST001b_UUID 4.5T 12.4G 4.5T 0% /lustre[OST:27]
smuhpc-OST001c_UUID 4.5T 12.6G 4.5T 0% /lustre[OST:28]
smuhpc-OST001d_UUID 4.5T 11.5G 4.5T 0% /lustre[OST:29]
smuhpc-OST001e_UUID 4.5T 14.1G 4.5T 0% /lustre[OST:30]
smuhpc-OST001f_UUID 4.5T 16.1G 4.5T 0% /lustre[OST:31]
smuhpc-OST0020_UUID 4.5T 11.2G 4.5T 0% /lustre[OST:32]
smuhpc-OST0021_UUID 4.5T 9.3G 4.5T 0% /lustre[OST:33]
filesystem summary: 152.2T 67.0T 85.2T 44% /lustre

lustre]$ pwd
/proc/fs/lustre
lustre]$ ll
total 0
-r--r--r--  1 root root 0 Sep 17 08:43 devices
-r--r--r--  1 root root 0 Sep 17 08:43 health_check
dr-xr-xr-x  4 root root 0 Sep 17 08:43 ldlm
dr-xr-xr-x  3 root root 0 Sep 17 08:43 llite
dr-xr-xr-x  3 root root 0 Sep 17 08:43 lov
dr-xr-xr-x  2 root root 0 Sep 17 08:43 lquota
dr-xr-xr-x  3 root root 0 Sep 17 08:43 mdc
dr-xr-xr-x  3 root root 0 Sep 17 08:43 mgc
dr-xr-xr-x 36 root root 0 Sep 17 08:43 osc
-r--r--r--  1 root root 0 Sep 17 08:43 pinger
-r--r--r--  1 root root 0 Sep 17 08:43 version
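Output like this is easy to watch automatically. The sketch below flags OSTs above a fill threshold; the threshold is arbitrary, while /lustre is the mount point shown above.

  #!/bin/bash
  # Report OSTs above a fill threshold, using lfs df output (sketch).
  THRESHOLD=85   # percent; arbitrary
  lfs df -h /lustre | awk -v t="$THRESHOLD" '
      /\[OST:/ { use = $5; sub(/%/, "", use); if (use + 0 >= t) print $1, $5, $6 }'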