
1 SciDAC Software Infrastructure for Lattice Gauge Theory. Richard C. Brower. All Hands Meeting, JLab, June 1-2, 2005. Code distribution: see (links to documentation at bottom).

2 Major Participants in Software Project (* = Software Coordinating Committee). UK: Peter Boyle, Balint Joo, (Mike Clark).

3 OUTLINE
• QCD API Design
• Completed Components
• Performance
• Future
Note: Thursday 9:30, Applications of the SciDAC API (TBA slot: Carleton DeTar & Balint Joo)

4 Lattice QCD: extremely uniform
• (Perfect) load balancing: uniform periodic lattices & identical sublattices per processor.
• (Complete) latency hiding: overlap computation/communications.
• Data parallel: operations on small 3x3 complex matrices per link.
• Critical kernel: the Dirac solver is ~90% of flops.
Lattice Dirac operator:
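The operator itself was an image on the slide and did not survive the transcript; for reference, the standard Wilson form (one common discretization, not necessarily the exact variant shown) is

    D(x,y) = \delta_{x,y} - \kappa \sum_{\mu} \left[ (1 - \gamma_\mu)\, U_\mu(x)\, \delta_{x+\hat\mu,\,y} + (1 + \gamma_\mu)\, U_\mu^\dagger(x-\hat\mu)\, \delta_{x-\hat\mu,\,y} \right]

Each application couples a site only to its nearest neighbors through the 3x3 link matrices U_\mu(x), which is what makes the kernel so uniform and communication-friendly.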

5 SciDAC QCD API
Level 3: Optimized Dirac operators, inverters (optimized for P4 and QCDOC)
Level 2: QDP (QCD Data Parallel): lattice-wide operations, data shifts (exists in C/C++)
Level 1: QMP (QCD Message Passing) and QLA (QCD Linear Algebra): C/C++, implemented over MPI, native QCDOC, M-via GigE mesh
QIO: binary data files / XML metadata (ILDG collab)

6 Fundamental Software Components are in Place:
• QMP (Message Passing): version 2.0; GigE, native QCDOC & over MPI.
• QLA (Linear Algebra): C routines (24,000, generated by Perl with test code); optimized subset for P4 clusters in SSE assembly; QCDOC optimized using BAGEL, an assembly generator for PowerPC, IBM Power, UltraSparc, Alpha, ... by P. Boyle (UK).
• QDP (Data Parallel) API: C version for the MILC applications code; C++ version for the new code base Chroma.
• I/O and ILDG: read/write of XML/binary files compliant with ILDG.
• Level 3 (optimized kernels): Dirac inverters for QCDOC and P4 clusters.

7 Example of QDP++ Expression
Typical for the Dirac operator. The QDP/C++ code relies on the Portable Expression Template Engine (PETE): temporaries are eliminated and expressions optimized.
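The code image from this slide did not survive the transcript. Below is a minimal QDP++ fragment of the kind of expression the slide illustrates: a single hopping term plus a mass term. It is my reconstruction, not the slide's exact code, and assumes standard QDP++ types with an already-initialized QDP layout (spin projection omitted for brevity):

    #include "qdp.h"
    using namespace QDP;

    void hopping_term_example()
    {
        multi1d<LatticeColorMatrix> u(Nd);  // gauge field, one link per direction
        LatticeFermion psi, chi;
        Real mass = 0.1;
        int mu = 0;

        // Lattice-wide, data-parallel expression: every site updates from its
        // forward neighbor in direction mu. PETE fuses the whole right-hand
        // side into one loop over sites, with no lattice-sized temporaries.
        chi = u[mu] * shift(psi, FORWARD, mu) + mass * psi;
    }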

8 Gauge Covariant Version of QDP/C++ code?
• Lattice gauge algebra has bifundamentals (see the transformation law after this list).
• QDP new type: (Field U, displacement y-x).
• No shifts! All data movement is implicit in the * operator.
• Gauge non-invariant expressions are impossible, except with a special identity field for pure shifts (Identity[vec]).
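The transformation rule behind the first bullet (standard lattice gauge conventions, not from the slide itself): under a local gauge rotation g(x),

    \psi(x) \to g(x)\,\psi(x), \qquad U_\mu(x) \to g(x)\, U_\mu(x)\, g^\dagger(x+\hat\mu)

so a link field carries one gauge index at x and one at x+\hat\mu (a bifundamental). A type that records the displacement along with the field lets * insert exactly the parallel transport required, which is why explicit shifts become unnecessary and gauge-breaking expressions fail to type-check.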

9 Performance of Optimized Dirac Inverters
• QCDOC: Level 3 inverters on 1024-processor racks
• Level 2 on Infiniband P4 clusters at FNAL
• Level 3 on GigE P4 at JLab and Myrinet P4 at FNAL
• Future optimization on BlueGene/L et al.

10 Level 3 on QCDOC

11 Level 3 Asqtad QCDOC (double precision)

12 Level 2 Asqtad on single P4 node

13 Asqtad on Infiniband Cluster

14 Level 3 Domain Wall CG Inverter. L_s = 16; "4g" is a P4 at 2.8 GHz with 800 MHz FSB (32 nodes).

15 Alpha Testers on the 1K QCDOC Racks
• MILC C code: Level 3 Asqtad HMD, 40^3 * 96 lattice for 2+1 flavors (each trajectory: 12 hrs 18 min; Dru Renner)
• Chroma C++ code: Level 3 domain wall RHMC, 24^3 * 32 (L_s = 8) for 2+1 flavors (James Osborn, Robert Edwards & UKQCD)
• I/O: reading and writing lattices to RAID and the host AIX box
• File transfers: 1-hop file transfers between BNL, FNAL & JLab

16 BlueGene/L Assessment
• June '04: 1024-node BG/L Wilson HMC sustains 1 Teraflop (peak 5.6); Bhanot, Chen, Gara, Sexton & Vranas, hep-lat/
• Feb '05: QDP++/Chroma running on the UK BG/L machine
• June '05: BU and MIT each acquire a 1024-node BG/L
• '05-'06 software development: optimize linear algebra (BAGEL assembler from P. Boyle) at UK; optimize inverter (Level 3) at MIT; optimize QMP (Level 1) at BU

17 Future Software Directions
• Ongoing: optimization, testing and hardening of components; new Level 3 and application routines; executables for BG/L, X1E, XT3, Power 3, PNNL cluster, APEmille, ...
• Integration: uniform interfaces and a uniform execution environment for the 3 labs; middleware for data sharing with ILDG; data, code and job control as a 3-lab Meta Facility (leverage PPDG)
• Algorithm research: increased performance (leverage TOPS & NSF IRT); new physics (scattering, resonances, SUSY, fermion signs, etc.?)
• Generalize API: QCD-like extensions of the SM, SUSY, new lattices, related apps? (See FermiQCD)
• Graphics: debugger, GUI for job and data management, physics & PR?
• Today at 5:30: MG talk by James Brannick

18 Old MG results† to set context for Brannick's talk: 2x2 blocks for the Dirac operator. Gauss-Jacobi (diamond), CG (circle), MG with V cycle (square), W cycle (star). 2-d lattice, U_μ(x) on links, ψ(x) on sites, β = 1.
† R. C. Brower, R. Edwards, C. Rebbi, and E. Vicari, "Projective multigrid for Wilson fermions", Nucl. Phys. B366 (1991) 689 (aka Spectral AMG, Tim Chartier, 2002)
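To make the "MG with V cycle" comparison concrete, here is a self-contained toy two-grid cycle in C++. This is not the 1991 projective-MG code: a 1-D Laplacian stands in for the gauged Dirac operator, and all routine names are mine. It only illustrates the smooth / restrict / coarse-correct / re-smooth structure that the plotted V and W cycles repeat recursively.

    #include <cstddef>
    #include <cstdio>
    #include <vector>
    using Vec = std::vector<double>;

    // Fine-grid operator A = tridiag(-1, 2, -1) with Dirichlet boundaries,
    // a toy stand-in for the Dirac operator.
    Vec apply_A(const Vec& x) {
        std::size_t n = x.size();
        Vec y(n);
        for (std::size_t i = 0; i < n; ++i)
            y[i] = 2.0*x[i] - (i > 0 ? x[i-1] : 0.0) - (i+1 < n ? x[i+1] : 0.0);
        return y;
    }

    // Weighted Jacobi: the slide's "Gauss-Jacobi", reused here as the smoother.
    void jacobi(Vec& x, const Vec& b, int iters) {
        for (int k = 0; k < iters; ++k) {
            Vec Ax = apply_A(x);
            for (std::size_t i = 0; i < x.size(); ++i)
                x[i] += (2.0/3.0) * (b[i] - Ax[i]) / 2.0;   // diag(A) = 2
        }
    }

    // Full-weighting restriction, fine residual -> coarse grid (n = 2m+1 -> m).
    Vec restrict_r(const Vec& r) {
        Vec rc(r.size()/2);
        for (std::size_t j = 0; j < rc.size(); ++j)
            rc[j] = 0.25*r[2*j] + 0.5*r[2*j+1] + 0.25*r[2*j+2];
        return rc;
    }

    // Linear interpolation, coarse correction -> fine grid.
    Vec prolong(const Vec& ec, std::size_t n) {
        Vec e(n, 0.0);
        for (std::size_t j = 0; j < ec.size(); ++j) {
            e[2*j]   += 0.5*ec[j];
            e[2*j+1] += ec[j];
            e[2*j+2] += 0.5*ec[j];
        }
        return e;
    }

    // One two-grid V-cycle for A x = b; real MG recurses at the coarse solve
    // (once per level for a V cycle, twice for a W cycle).
    void v_cycle(Vec& x, const Vec& b) {
        jacobi(x, b, 3);                                   // pre-smooth
        Vec Ax = apply_A(x), r(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) r[i] = b[i] - Ax[i];
        Vec rc = restrict_r(r);
        for (double& v : rc) v *= 4.0;   // Galerkin coarse operator is A/4 here
        Vec ec(rc.size(), 0.0);
        jacobi(ec, rc, 200);                               // approximate coarse solve
        Vec e = prolong(ec, x.size());
        for (std::size_t i = 0; i < x.size(); ++i) x[i] += e[i];
        jacobi(x, b, 3);                                   // post-smooth
    }

    int main() {
        std::size_t n = 31;        // fine grid (odd size); coarse grid has 15 points
        Vec b(n, 1.0), x(n, 0.0);
        for (int c = 0; c < 10; ++c) v_cycle(x, b);
        Vec Ax = apply_A(x);
        double r2 = 0.0;
        for (std::size_t i = 0; i < n; ++i) r2 += (b[i] - Ax[i]) * (b[i] - Ax[i]);
        std::printf("|r|^2 after 10 V-cycles: %.3e\n", r2);
    }

The coarse correction removes the smooth error that Jacobi alone attacks only slowly, which is why the MG curves on the slide drop so much faster than Gauss-Jacobi or CG.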

19 Limitation of earlier MG methods with gauge-invariant but non-adaptive projections for the 2-d Schwinger model: Gauss-Jacobi (diamond), CG (circle), 3-level MG (square & star); β = 3 (cross), 10 (plus), 100 (square). Universality of convergence vs m_q/β^(1/2).

