1 Cluster Development at Fermilab
Don Holmgren
All-Hands Meeting, Jefferson Lab, June 1-2, 2005

2 Outline
- Status of current clusters
- New cluster details
  - Processors
  - Infiniband
  - Performance
- Cluster architecture (I/O)
- The coming I/O crunch

3 Current Clusters (1): QCD80
- Running since March 2001
- Dual 700 MHz Pentium III nodes
- In June 2004, moved off of Myrinet and onto a gigE switch
- 67 nodes still alive
- Now used for automated perturbation theory

4 Current Clusters (2): nqcd
- Running since July
- Dual 2.0 GHz Xeons, originally Myrinet
- 32 nodes now running as a prototype Infiniband cluster (since July 2004)
- Now used for Infiniband testing, plus some production work (accelerator simulation, another SciDAC project)

5 Cluster Status (3): W
- Running since January
- Dual 2.4 GHz Xeons (400 MHz FSB), Myrinet
- One of our two main production clusters
- To be retired in October 2005 to vacate the computer room for construction work

6 Cluster Status (4): qcd
- Running since June
- Single 2.8 GHz Pentium 4 "Prescott", 800 MHz FSB
- Reused Myrinet from qcd80 and nqcd; PCI-X, but only 290 MB/sec bidirectional bandwidth
- Our second main production cluster
- $900 each
- ~1.05 Gflop/sec asqtad (14^4)
- Gflop/sec DWF

7 Cluster Status (5): pion
- Being integrated now; production mid-June
- 260 single 3.2 GHz Pentium 640s, 800 MHz FSB
- Infiniband using PCI-E (8X slot), 1300 MB/sec bidirectional bandwidth
- $1000/node + $890/node for Infiniband
- ~1.4 Gflop/sec asqtad (14^4)
- ~2.0 Gflop/sec DWF
- Expand to 520 CPUs by the end of September

8 Details – Processors (1)
Cache size and memory bus of the processors on W, qcd, and pion:
- W: 0.5 MB cache, 400 MHz FSB
- qcd: 1.0 MB cache, 800 MHz FSB
- pion: 2.0 MB cache, 800 MHz FSB

9 Details – Processors (2)
Processor alternatives:
- PPC970/G5: split frontside bus
- AMD FX-55: the fastest Opteron-class part
- Pentium 640: best price/performance for the pion cluster

10 Details – Infiniband (1)
[Infiniband cluster schematic]
- "Oversubscription": the ratio of nodes to uplinks on the leaf switches
- Increasing oversubscription lowers switch costs
- LQCD codes are very tolerant of oversubscription
- pion uses 2:1 now, and will likely go to 4:1 or 5:1 during expansion
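To make the oversubscription trade-off concrete, here is a back-of-the-envelope sketch (not from the slides): the worst-case uplink share per node is simply the link bandwidth divided by the oversubscription ratio. The 1300 MB/sec figure is borrowed from the pion description above and stands in for the leaf-to-spine link speed, which is an assumption.

```c
/* Worst-case uplink bandwidth per node on an oversubscribed leaf switch.
 * The link speed is an assumed value (borrowed from the pion PCI-E
 * Infiniband figure above); real contention depends on traffic patterns. */
#include <stdio.h>

int main(void)
{
    const double link_mb_per_sec = 1300.0;   /* assumed uplink bandwidth, MB/sec */
    const int ratios[] = {1, 2, 4, 5};       /* nodes per uplink (oversubscription) */
    const int n = sizeof(ratios) / sizeof(ratios[0]);

    for (int i = 0; i < n; i++)
        printf("oversubscription %d:1 -> worst-case share %.0f MB/sec per node\n",
               ratios[i], link_mb_per_sec / ratios[i]);
    return 0;
}
```

One reason LQCD codes tolerate this well is that their communication is dominated by nearest-neighbor exchanges, much of which can stay within a leaf switch when jobs are placed accordingly, so the uplinks rarely see the worst case.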

11 Details – Infiniband (2)
[Netpipe bidirectional bandwidth data: Myrinet (W cluster) vs. Infiniband (pion)]
- A native QMP implementation (over VAPI instead of MPI) would boost application performance
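Not from the slides: a minimal sketch of the kind of bidirectional ping-pong that Netpipe measures, written directly against MPI rather than Netpipe itself, so it runs unchanged over either interconnect's MPI stack. Message size and iteration count are arbitrary choices.

```c
/* Minimal bidirectional bandwidth probe, in the spirit of the Netpipe
 * numbers above (a sketch, not Netpipe itself).  Run with two ranks:
 *     mpicc bibw.c -o bibw && mpirun -np 2 ./bibw                      */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int nbytes = 1 << 20;      /* 1 MB messages (arbitrary choice) */
    const int iters  = 100;
    int rank;
    MPI_Request req[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    int peer = 1 - rank;             /* assumes exactly two ranks */
    char *sbuf = malloc(nbytes);
    char *rbuf = malloc(nbytes);

    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        /* send and receive simultaneously to exercise both directions */
        MPI_Irecv(rbuf, nbytes, MPI_BYTE, peer, 0, MPI_COMM_WORLD, &req[0]);
        MPI_Isend(sbuf, nbytes, MPI_BYTE, peer, 0, MPI_COMM_WORLD, &req[1]);
        MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
    }
    double secs = MPI_Wtime() - t0;

    if (rank == 0)   /* bidirectional = bytes sent plus bytes received */
        printf("bidirectional bandwidth: %.0f MB/sec\n",
               2.0 * nbytes * iters / secs / 1.0e6);

    free(sbuf);
    free(rbuf);
    MPI_Finalize();
    return 0;
}
```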

12 Details – Performance (1): asqtad
[Weak scaling curves on the new cluster]

13 Details – Performance (2): asqtad
[Weak scaling on NCSA "T2" and FNAL "pion"]
- T2: dual 3.6 GHz Xeon, Infiniband on PCI-X, standard MILC v6
- pion: MILC using QDP

14 Details – Performance (3): DWF

15 Cluster Architecture – I/O (1)
Current architecture:
- RAID storage on the head node: /data/raid1, /data/raid2, etc.
- Files moved from the head node to the tape silo (encp)
- Users' jobs stage data to scratch disk on the worker nodes
Problems:
- Users must keep track of /data/raidx
- Users have to manage storage
- Data rate is limited by the performance of the head node
- Disk thrashing – we use fcp to throttle copies

16 Cluster Architecture – I/O (2)
New architecture:
- Data storage on dCache nodes
- Add capacity and bandwidth by adding spindles and/or buses
- Throttling on reads
- Load balancing on writes
- Flat directory space (/pnfs/lqcd)
User access to files:
- Shell copy (dccp) to worker scratch disks
- User binary access via popen(), pread(), pwrite() [#ifdefs] (see the sketch after this list)
- User binary access via open(), read(), write() [$LD_PRELOAD]
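The access modes above can be sketched as follows. This is an assumed example, not project code: it uses the dcap client library's dc_open()/dc_read()/dc_close() calls for the "#ifdefs" path (the slide lists popen()/pread()/pwrite(); the dcap library also provides positional variants), and the file name is hypothetical. An unmodified binary can instead keep plain open()/read() and run under the dcap LD_PRELOAD wrapper, or stage the file first with dccp.

```c
/* Sketch of reading a file from dCache (assumed example, not project code).
 * Compile the dCache branch with -DHAVE_DCAP and link against libdcap
 * (-ldcap); without the macro it falls back to ordinary POSIX I/O.
 * Alternatives from the slide:
 *   - stage to worker scratch first:  dccp /pnfs/lqcd/<file> /scratch/<file>
 *   - run an unmodified binary under the dcap LD_PRELOAD wrapper        */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

#ifdef HAVE_DCAP
#include <dcap.h>                /* dc_open/dc_read/dc_close from libdcap */
#define OPEN  dc_open
#define READ  dc_read
#define CLOSE dc_close
#else
#define OPEN  open
#define READ  read
#define CLOSE close
#endif

int main(void)
{
    const char *path = "/pnfs/lqcd/example_prop.dat";   /* hypothetical file name */
    char buf[1 << 16];
    ssize_t n;

    int fd = OPEN(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    while ((n = READ(fd, buf, sizeof buf)) > 0)
        ;                          /* process the data here */
    CLOSE(fd);
    return 0;
}
```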

17 Cluster Architecture Questions
Now:
- Two clusters, W and qcd
- Users optionally steer jobs via batch commands
- The clusters are binary compatible
After the Infiniband cluster (pion) comes online:
- Do users want to preserve binary compatibility? If so, we would use VMI from NCSA
- Do users want a 64-bit operating system? That gives > 2 Gbyte file access without all the #defines (illustrated below)
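The "#defines" in question are the large-file-support macros. A minimal illustration (an assumed example, not project code): on a 32-bit build, off_t is 32 bits unless the code is compiled with -D_FILE_OFFSET_BITS=64, so offsets past 2 GB are unreachable; on a 64-bit operating system off_t is 64 bits by default and the macro is unnecessary.

```c
/* Illustration of ">2 Gbyte file access without all the #defines".
 * On 32-bit Linux, build with
 *     cc -D_FILE_OFFSET_BITS=64 largefile.c
 * to get a 64-bit off_t; on a 64-bit OS it is 64 bits already.
 * The file path is hypothetical.                                   */
#include <stdio.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    printf("sizeof(off_t) = %zu bytes\n", sizeof(off_t));

    if (sizeof(off_t) == 8) {
        off_t three_gb = (off_t)(3LL << 30);            /* 3 GB offset */
        int fd = open("/scratch/example_prop.dat", O_RDONLY);
        if (fd >= 0) {
            if (lseek(fd, three_gb, SEEK_SET) == (off_t)-1)
                perror("lseek");
            close(fd);
        }
    } else {
        printf("32-bit off_t: offsets past 2 GB are not addressable\n");
    }
    return 0;
}
```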

18 The Coming I/O Crunch (1)
LQCD already represents a noticeable fraction of the data coming into Fermilab.

19 The Coming I/O Crunch (2)
Propagator storage is already consuming tens of Tbytes of tape silo capacity.

20 The Coming I/O Crunch (3)
Analysis is already demanding 10+ MB/sec sustained I/O rates to the robots, for both reading and writing simultaneously.

21 The Coming I/O Crunch (4)
These storage requirements were presented last week to the DOE reviewers:
- Heavy-light analysis (S. Gottlieb): 594 TB
- DWF analysis (G. Fleming): 100 TB
Should propagators be stored to tape?
- Assume yes, since the current workflow uses several independent jobs
Duration of storage: permanent? 1 year? 2 years?
Budget: $200/Tbyte now, $100/Tbyte in 2006 when LTO2 drives are available (rough totals sketched below)
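A back-of-the-envelope total using only the figures on this slide (594 TB + 100 TB at the two quoted media prices):

```c
/* Tape-media cost implied by the requirements above:
 * 594 TB (heavy-light) + 100 TB (DWF) at $200/TB now and $100/TB in 2006. */
#include <stdio.h>

int main(void)
{
    const double tb_total = 594.0 + 100.0;   /* TB, from the slide */
    const double usd_now  = 200.0;           /* $/TB today         */
    const double usd_2006 = 100.0;           /* $/TB in 2006       */

    printf("%.0f TB at $%.0f/TB = $%.1fk\n", tb_total, usd_now,
           tb_total * usd_now / 1000.0);
    printf("%.0f TB at $%.0f/TB = $%.1fk\n", tb_total, usd_2006,
           tb_total * usd_2006 / 1000.0);
    return 0;
}
```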

22 The Coming I/O Crunch (5)
What I/O bandwidths are necessary?
- One copy of 600 Tbytes in one year works out to 18.7 MB/sec sustained for 365 days, 24 hours/day (arithmetic sketched below)
- Need at least 2X this (write once, read once)
- Need multiple dedicated tape drives
- Need to plan for site-local networks (multiple gigE required)
What files need to move between sites?
- Configurations: assume O(5 Tbytes)/year QCDOC to FNAL, and similar for JLab (and UKQCD?)
- Interlab network bandwidth requirements?
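The sustained-rate figure is straightforward arithmetic; the sketch below reproduces it approximately (the exact value depends on whether decimal or binary Tbytes/Mbytes are used, which the slide does not specify).

```c
/* Sustained rate needed to write one copy of 600 TB over one year,
 * the ballpark of the 18.7 MB/sec figure quoted above.              */
#include <stdio.h>

int main(void)
{
    const double tbytes  = 600.0;
    const double seconds = 365.0 * 24.0 * 3600.0;      /* one year, 24x7 */
    const double mb_s    = tbytes * 1.0e6 / seconds;   /* 1 TB = 1e6 MB  */

    printf("write once:             %.1f MB/sec sustained\n", mb_s);
    printf("write once + read once: %.1f MB/sec sustained\n", 2.0 * mb_s);
    return 0;
}
```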

23 Questions?