U.S. Department of Energy’s Office of Science Midrange Scientific Computing Requirements Jefferson Lab Robert Edwards October 21, 2008.


1 U.S. Department of Energy’s Office of Science Midrange Scientific Computing Requirements Jefferson Lab Robert Edwards edwards@jlab.org October 21, 2008

2 Office of Science U.S. Department of Energy Jefferson Lab
• Lab doubling beam energy
• Adding new experimental Hall
CD-3: JLab Receives DOE Approval to Start Construction of $310 Million Upgrade

3 Science Goals
• Understanding the structure, spectroscopy and interactions of hadrons from QCD is the central challenge of nuclear physics
• How are charge, current and spin distributed in the nucleon?
• What are the effective degrees of freedom describing the low-energy spectrum of the theory?
• How do the nucleon-nucleon and hadron-hadron interactions arise from QCD?

4 NP Milestones in Hadron Physics
• NP2009 and NP2012: measurement & determination of mass & electromagnetic properties of low lying baryons (hadrons)
• New Hall D: seek information about exotic measurements. Flagship of JLab upgrade
• NP2014: determination of Generalized Parton Distributions within nucleons. These characterize how quarks interact with nucleons

5 Computing efforts in support of mission
• Experimental physics data acquisition, storage, and analysis (farm computing)
• Lattice QCD theory calculations of fundamental quantities

6 Theory context: USQCD Collaboration
• Consists of nearly all high energy and nuclear physicists in the US involved in lattice QCD. Formed nine years ago to develop infrastructure for these studies
• Research directly in support of DOE experimental facilities at BNL, FNAL, JLab
• SciDAC I & II: software development for lattice QCD
  • FY06-11: ~$2.0M/yr HEP+NP, $0.4M/yr JLab
• USQCD Facilities: LQCD I – Midrange computing
  • FY06-09: ~$2.5M/yr HEP+NP
• Access to USQCD dedicated hardware is allocated through a peer-reviewed process
• LQCD II: FY10-14 – proposal submitted & reviewed

7 Example: Spectroscopy
• Experimental and ab initio N* and exotic-meson programs aim at discovering effective degrees of freedom of QCD
  • NP2009 and NP2012 milestones
• Excited Baryon Analysis Center (EBAC) at Jefferson Lab
• Spectroscopy of Exotic Mesons is a flagship component of CEBAF@12GeV
[Figure: nucleon spectrum (MeV), with levels grouped by spin-parity ½+, 3/2+, 5/2+, ½−, 3/2−, 5/2−]

8 Nature of the LQCD calculations
Calculations are performed in two steps:
• Monte Carlo methods are used to generate gauge configurations with a probability proportional to their weight in the Feynman path integrals that define QCD
• These configurations are stored, and used to calculate a wide variety of physical observables
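The two-step workflow can be illustrated with a deliberately tiny stand-in. The sketch below is not lattice QCD: it assumes a one-dimensional periodic lattice of real variables with a harmonic-oscillator action, and uses a plain Metropolis update. All function names and parameters are hypothetical. It only shows the pattern of sampling configurations with probability proportional to exp(-S), storing them, and measuring an observable on the stored ensemble afterwards.

```python
import math
import random


def local_action(x, i, value, mass2=1.0):
    """Terms of the toy action that involve site i (periodic 1-D lattice)."""
    n = len(x)
    left, right = x[(i - 1) % n], x[(i + 1) % n]
    kinetic = 0.5 * ((value - left) ** 2 + (right - value) ** 2)
    return kinetic + 0.5 * mass2 * value ** 2


def metropolis_sweep(x, rng, step=1.0):
    """One Metropolis pass over all sites; accepts with probability min(1, exp(-dS))."""
    for i in range(len(x)):
        old = x[i]
        new = old + rng.uniform(-step, step)
        delta = local_action(x, i, new) - local_action(x, i, old)
        if delta <= 0 or rng.random() < math.exp(-delta):
            x[i] = new


def generate_configurations(n_sites=16, n_configs=200,
                            thermalization=200, separation=5, seed=1):
    """Step 1: produce and store a sequence of configurations (a single Markov chain)."""
    rng = random.Random(seed)
    x = [0.0] * n_sites
    for _ in range(thermalization):
        metropolis_sweep(x, rng)
    configs = []
    for _ in range(n_configs):
        for _ in range(separation):
            metropolis_sweep(x, rng)
        configs.append(list(x))  # store a copy of the current configuration
    return configs


def mean_x2(configs):
    """Step 2: measure an observable on the stored ensemble (each config is independent work)."""
    total = sum(v * v for c in configs for v in c)
    return total / (len(configs) * len(configs[0]))


if __name__ == "__main__":
    ensemble = generate_configurations()
    print(len(ensemble), mean_x2(ensemble))
```

Generation is one sequential chain, while the measurement step treats each stored configuration separately. This mirrors the split between a single long gauge-generation job and many independent analysis jobs.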

9 Nature of the LQCD calculations (2)
• Leadership machines: gauge generation uses large core count, single sequence of computing, requires multi TF sustained on one job
• Midrange machines: analysis jobs typically smaller core count, each configuration an independent job, typically a few 100 GF sustained
• Fine grained parallelism; regular hypercubic problems
• Computations and communication equally important; low latency required
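One way to see why communication matters as much as computation for a regular hypercubic problem is to count how many sites of a per-core sub-lattice sit on its boundary. The sketch below assumes a four-dimensional sub-lattice with the same extent in every direction and a nearest-neighbor stencil, so only the outermost layer of sites needs data from a neighboring core. The function name is hypothetical and the numbers are purely geometric, not measurements.

```python
def boundary_fraction(local_extent, dims=4):
    """Fraction of sites in an L^dims sub-lattice that touch the sub-lattice boundary."""
    volume = local_extent ** dims
    interior = max(local_extent - 2, 0) ** dims  # sites with all neighbors held locally
    return (volume - interior) / volume


if __name__ == "__main__":
    for extent in (4, 8, 16):
        print(extent, boundary_fraction(extent))
```

With only 4 sites per direction on each core, 240 of the 256 local sites are boundary sites. Spreading a fixed lattice over more cores therefore shifts work from arithmetic toward many small neighbor exchanges, which is where low latency matters.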

10 Leadership Machines
• USQCD aggressively pursuing national resources:
  • INCITE 07: ORNL: Cray XT4 - 10M-hrs (largest allocation)
  • INCITE 08-10: yearly allocations, will increase
    • ORNL: Cray XT4 - 7M-hrs
    • ANL: BG/P – 20M-hrs
  • ESP 08: ANL: BG/P - 250M-hrs [11 TF-yr]
  • ESP 09: ORNL: XT-5 ~ 60M-hrs ??? [7 TF-yr]
• Groups (JLab) also pursuing NSF + other resources
  • 2007: 1 TF-yr (PSC+SDSC)
  • 2008: 2 TF-yr (PSC+SDSC+TACC)
  • 2008: 1 TF-yr (LANL/NNSA)
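The bracketed TF-yr figures can be related to the hour allocations with simple arithmetic. The sketch below assumes that "M-hrs" means millions of core-hours, that the TF-yr values are sustained performance integrated over a year, and that a year has 8760 hours. None of these readings is stated on the slide, and the function name is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8760; assumption, ignoring leap years and downtime


def implied_gflops_per_core(tf_years, million_core_hours):
    """Sustained GFlop/s per core implied by a TF-yr figure and a core-hour allocation."""
    flops_hours = tf_years * 1e12 * HOURS_PER_YEAR   # (flop/s) x hours
    core_hours = million_core_hours * 1e6
    return flops_hours / core_hours / 1e9


if __name__ == "__main__":
    print("ESP 08, BG/P:", implied_gflops_per_core(11, 250))
    print("ESP 09, XT-5:", implied_gflops_per_core(7, 60))
```

Under these assumptions the two ESP entries correspond to roughly 0.39 GFlop/s per core on BG/P and about 1.0 GFlop/s per core on the XT-5.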

11 LQCD Computing Project Hardware (Midrange)

Year  Computer  Site  Nodes  Performance (TF/s)
2002  QCD       FNAL    127  0.15
2004  4g        JLab    384  0.36
2005  Pion      FNAL    518  0.86
2005  QCDOC     BNL   12288  4.20
2006  6n        JLab    256  0.62
2006  Kaon      FNAL    600  2.56
2007  7n        JLab    396  2.98
2008  J/Psi     FNAL    400  6.00

Funding labels on the slide: w/ OASCR, SciDAC I, LQCD I

12 Distribution of Jobs
• Leadership machines: few K to 128K cores
• Clusters: currently up to 1K cores. Will grow as lattice sizes scale up on leadership machines

13 Future Hardware goals (BNL+FNAL+JLab) by fiscal year
• Initial year based upon initial INCITE and leadership class awards. Moore’s law growth thereafter
• Dedicated hardware is for all resources running that year; assumes clusters with 3.5 year life

14 Distribution of Resources (2010-2014)

15 LQCD II budget request

16 LQCD II review: Jan. 2008
Charges to the collaboration by DOE:
• Why is a new project needed if OASCR is providing access to Leadership Class machines? In particular, is dedicated hardware, such as additional clusters, essential and cost effective in such an environment? What is the optimal mix of machines, given realistic budget constraints?
• What are the plans at FNAL, TJNAF, & BNL for LQCD computing? How are these plans incorporated into your plans for LQCD II?

17 LQCD II review: Jan. 2008
Review findings:
• The 1-1 mix of Leadership and clusters advocated by LQCD II is the most suitable hardware mix for this project.
• The review committee advocates full funding for LQCD II at the level described in their proposal. The scenario in which the funding for LQCD II would be flat with LQCD I would cause the project to miss opportunities that would otherwise enhance several other fields that the Office of Science supports, such as computer hardware and software, nuclear and astrophysics.

18 Characteristics of computing:
• Dominant part of calculation: large sparse matrix system solve; application of matrix on vector
• Regular grid per compute core: moving to hybrid threaded + MPI model.
• SciDAC II software development effort
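The "sparse solve dominated by matrix-on-vector applications" pattern can be sketched with a matrix-free conjugate gradient solver. The example below is a stand-in under stated assumptions: it uses a real, symmetric positive-definite operator (a mass term minus the nearest-neighbor Laplacian on a small periodic two-dimensional grid) rather than the complex four-dimensional lattice Dirac operator with gauge links. All names and parameter values are hypothetical.

```python
def apply_operator(v, n, mass2=0.1):
    """Apply (mass2 - Laplacian) on a periodic n x n grid; v is stored row-major."""
    out = [0.0] * (n * n)
    for r in range(n):
        for c in range(n):
            i = r * n + c
            neighbors = (v[((r - 1) % n) * n + c] + v[((r + 1) % n) * n + c]
                         + v[r * n + (c - 1) % n] + v[r * n + (c + 1) % n])
            out[i] = (4.0 + mass2) * v[i] - neighbors
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(b, n, tol=1e-10, max_iter=1000):
    """Solve A x = b without ever forming A; returns (solution, iterations used)."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for the zero initial guess
    p = list(r)          # search direction
    rr = dot(r, r)
    b_norm2 = rr
    if b_norm2 == 0.0:
        return x, 0
    for iteration in range(1, max_iter + 1):
        ap = apply_operator(p, n)            # the dominant cost: one operator application
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        if rr_new <= tol * tol * b_norm2:    # relative residual below tol
            return x, iteration
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter


if __name__ == "__main__":
    n = 8
    source = [0.0] * (n * n)
    source[0] = 1.0                           # point source
    solution, iterations = conjugate_gradient(source, n)
    print(iterations, solution[0])
```

Each iteration needs one operator application plus a few dot products. In a distributed setting the operator application requires neighbor exchanges and the dot products require global reductions, which is one reason a hybrid threaded + MPI layout is attractive.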

19 Opportunities for Optimization
• Clusters configured for a single application (LQCD) can be better optimized than those serving a dozen applications
  • Memory and disk lean
  • Pruned fat tree network (e.g. InfiniBand) due to highly local communications pattern
  • Lower aggregate bandwidth to disk compared to checkpoint intensive simulations
• Overall impact compared to generic clusters: 50% more computing capacity/dollar
• 2x-5x more cost effective for analysis jobs than leadership class machines

20 Performance/$ for LQCD Applications
• Commodity compute nodes (leverage marketplace & Moore’s law)
• Low latency, high bandwidth network to exploit full I/O capability
[Figure: Mflops/$ (log scale, 10^-2 to 10^1) versus year (1990 to 2010). Labels on the plot: QCDSP; QCDOC; JLab SciDAC Prototype Clusters; JLab clusters; 2008/9 cluster at FNAL; Supercomputers, leadership class machines; Japanese Earth Simulator; BlueGene/L; BlueGene/P; year markers 2002, 2003, 2004]

21 NP Approved Missions
Data analysis and support for 12 GeV:
• Why: Needed to support the current 6 GeV and future 12 GeV programs (detector simulation)
• How it’s done today: small cluster / farm, plus all necessary infrastructure (tape library, cache disk, …)
• Status: Constrained by tight budgets, but currently keeping up with requirements (barely)
• Barriers: Future: multi-threading to prevent memory requirement blow-up on many-core architectures
• Special Features: integer intensive and I/O intensive

22 NP Approved Missions
Lattice QCD:
• Why: Theory calculations in support of JLab experimental physics program
• How it’s done today: NSF+INCITE computers + midrange computing within USQCD
• Status: need more runs of larger size (more cores), plus more statistics (longer runs)
• Barriers: multi-threading to reduce communications. Need LQCD II.
• Special Features: floating point intensive + balanced communications; fine grained parallelism

23 NP Proposed Missions or Initiatives
• No new proposed missions
• Currently fulfilling approved missions, with funding requested to continue those approved missions

