The High Performance Cluster for QCD Calculations: System Monitoring and Benchmarking Lucas Fernandez Seivane Summer Student 2002.


1 The High Performance Cluster for QCD Calculations: System Monitoring and Benchmarking
Lucas Fernandez Seivane, Summer Student 2002
IT Group, DESY Hamburg
Supervisor: Andreas Gellrich
Oviedo University (Spain)

2 Topics
• Some ideas of QM
• The QFT problem
• Lattice field theory
• What can we get?
• Approaches to the computing
• Hardware
• Software
• The stuff we made: Clumon
• Possible improvements

3 Let's do some physics…
• QM: the "real behavior" of the world is a "fuzzy world"
• Relativity means causality (a cause must precede its consequence!)
• Any complete description of Nature must combine both ideas
• The only consistent way of doing this is … QUANTUM FIELD THEORY

4 The QFT Problem
• Impossible to solve exactly
• PERTURBATIVE APPROACH
• Requires a small coupling constant (like α_em ≈ 1/137)
• Example: QED (the strange theory of light and matter)
• Taylor: α_em + α_em²/2 + α_em³/6 + …
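
The schematic series on this slide can be checked numerically. The sketch below is an editorial illustration, not part of the talk: it assumes the coefficients 1/n! exactly as written on the slide (real QED coefficients depend on the process being calculated), and the function names are hypothetical. It compares how fast successive terms shrink for α_em ≈ 1/137 and for a coupling of order one, the situation the next slide describes for QCD at low energies.

```python
from math import factorial


def series_term(alpha, n):
    """n-th term of the schematic series: alpha^n / n! (coefficients assumed)."""
    return alpha ** n / factorial(n)


def partial_sums(alpha, max_order):
    """Running sums alpha + alpha^2/2 + ... up to the given order."""
    total = 0.0
    sums = []
    for n in range(1, max_order + 1):
        total += series_term(alpha, n)
        sums.append(total)
    return sums


def relative_size(alpha, n):
    """Size of the n-th term relative to the leading (first-order) term."""
    return series_term(alpha, n) / series_term(alpha, 1)


# Small coupling: the second-order term is a sub-percent correction.
qed_correction = relative_size(1.0 / 137.0, 2)

# Coupling of order one: the second-order term is half the leading term,
# so truncating the series after a few orders is no longer a safe shortcut.
strong_correction = relative_size(1.0, 2)
```

With a small coupling, stopping after one or two orders already gives a good answer; with a coupling near one, every extra order matters, which is the motivation for a non-perturbative method.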

5 … but for QCD
• The coupling constant is not small (at least at low energies)
• We cannot explain a proton (at least analytically)!
• We need something exact (the LATTICE is EXACT*)

6 Lattice field theory
• A generic tool for approaching non-perturbative QFT
• Most needed in QCD (non-perturbative aspects)
• Also of purely theoretical interest (Wilson approach)

7 What can we get?
• We are interested in the spectra (bound states, masses of particles)
• We can get them from correlation functions: if we could calculate these exactly, we would have solved the theory
• They are extracted from Path Integrals (foil1)
• The problem is calculating Path Integrals
• The lattice can calculate Path Integrals
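
The slide does not say how a mass is read off a correlation function. One standard technique in lattice calculations is the effective mass: if a correlator decays as C(t) ≈ A·exp(−m·t) at large Euclidean time t, then ln(C(t)/C(t+1)) approaches m. The sketch below is a minimal illustration under that assumption. The correlator is synthetic (a ground state plus one excited state with invented masses in lattice units), and all names are hypothetical.

```python
import math


def effective_mass(correlator):
    """Effective mass m_eff(t) = ln(C(t) / C(t+1)) for each time slice."""
    return [
        math.log(correlator[t] / correlator[t + 1])
        for t in range(len(correlator) - 1)
    ]


def synthetic_correlator(n_t, m0=0.5, m1=1.2, a0=1.0, a1=0.5):
    """Invented two-state correlator: a0*exp(-m0*t) + a1*exp(-m1*t).

    m0 is the ground-state mass and m1 an excited state; the values are
    made up for illustration and do not come from any real simulation.
    """
    return [
        a0 * math.exp(-m0 * t) + a1 * math.exp(-m1 * t)
        for t in range(n_t)
    ]


corr = synthetic_correlator(24)
m_eff = effective_mass(corr)

# At small t the excited state contaminates the estimate (m_eff is too
# large); at large t it settles onto a plateau at the ground-state mass m0.
early_estimate = m_eff[0]
plateau_estimate = m_eff[-1]
```

In a real calculation C(t) carries statistical noise from the Monte Carlo sampling, so the plateau is fitted over a range of t rather than read from a single point.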

8 A Naïve Approach
• Discretize space-time
• Monte Carlo methods for choosing field configurations (random number generators)
• Numerical evaluation of Path Integrals and correlation functions!!!
(typical lattice sizes: a = 0.05-0.1 fm, 1/a = 2 GeV, L = 32)
but…
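
The three steps on this slide can be sketched on a toy problem. The example below is not QCD and is not taken from the talk: it assumes a one-dimensional harmonic oscillator (mass 1, frequency omega) on a periodic time lattice with spacing a, and uses the Metropolis algorithm, one common Monte Carlo method, to sample paths with weight exp(−S). All names and parameter values are illustrative assumptions.

```python
import math
import random


def local_action(path, i, x, a, omega):
    """Part of the Euclidean action that depends on site i holding value x."""
    n = len(path)
    left = path[(i - 1) % n]    # periodic boundary conditions
    right = path[(i + 1) % n]
    kinetic = ((right - x) ** 2 + (x - left) ** 2) / (2.0 * a)
    potential = 0.5 * a * omega ** 2 * x ** 2
    return kinetic + potential


def metropolis_sweep(path, a, omega, step, rng):
    """Propose one local update per site; return the acceptance rate."""
    accepted = 0
    for i in range(len(path)):
        old = path[i]
        new = old + rng.uniform(-step, step)
        delta_s = (local_action(path, i, new, a, omega)
                   - local_action(path, i, old, a, omega))
        # Accept with probability min(1, exp(-delta_s)).
        if delta_s <= 0.0 or rng.random() < math.exp(-delta_s):
            path[i] = new
            accepted += 1
    return accepted / len(path)


def estimate_x2(n_sites=32, a=0.5, omega=1.0, n_therm=200,
                n_meas=2000, step=1.0, seed=1):
    """Monte Carlo estimate of <x^2> averaged over sites and sweeps."""
    rng = random.Random(seed)
    path = [0.0] * n_sites
    for _ in range(n_therm):          # discard thermalization sweeps
        metropolis_sweep(path, a, omega, step, rng)
    total = 0.0
    for _ in range(n_meas):
        metropolis_sweep(path, a, omega, step, rng)
        total += sum(x * x for x in path) / n_sites
    return total / n_meas


x2 = estimate_x2()
```

For this toy model the continuum ground-state value of ⟨x²⟩ is 1/(2·omega), so the estimate should land near 0.5, up to discretization and statistical errors. A QCD simulation follows the same pattern on a four-dimensional lattice with far more degrees of freedom per site, which is where the computing cost comes from.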

9 …but
• Huge computing power is needed:
  i. High-dimensional integrals
  ii. The calculation requires computing the inverse of an "infinite"-dimensional matrix, which takes a lot of CPU time and RAM
• That's why we need clusters, supercomputers or special machines (to divide the work)
• The amount of data transferred is not so important; the deciding factors are the LATENCY of the network and the scalability above 1 TFlops
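
In practice a very large sparse matrix is rarely inverted explicitly. Instead, a linear system A·x = b is solved iteratively, and the conjugate gradient method is a common choice when A is symmetric and positive definite. The sketch below illustrates the idea on a stand-in operator, a one-dimensional lattice Laplacian plus a mass term, which is an assumption made for this example and not the operator used on the DESY cluster. The function names are hypothetical.

```python
def apply_operator(v, mass_sq=0.1):
    """Apply (-Laplacian + m^2) on a periodic 1D lattice without storing A."""
    n = len(v)
    return [
        (2.0 + mass_sq) * v[i] - v[(i - 1) % n] - v[(i + 1) % n]
        for i in range(n)
    ]


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def conjugate_gradient(apply_a, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A given as a function.

    Returns the solution and the number of iterations used.
    """
    x = [0.0] * len(b)
    r = list(b)               # residual b - A x for the initial guess x = 0
    p = list(r)               # first search direction
    rr = dot(r, r)
    if rr == 0.0:
        return x, 0
    for iteration in range(1, max_iter + 1):
        ap = apply_a(p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        if rr_new ** 0.5 < tol:
            return x, iteration
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter


# A point source on a 16-site lattice: the solution is one column of A^-1.
source = [1.0] + [0.0] * 15
solution, iterations = conjugate_gradient(apply_operator, source)
```

Each iteration needs only one application of the operator and a few inner products. On a parallel machine the operator touches neighbouring sites held by other nodes and the inner products are global sums, so many small messages are exchanged per iteration, which fits the slide's point that latency matters more than data volume.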

10 How can we get it?
• General-purpose supercomputers:
  • Very expensive
  • Rigid (hardware upgrades are difficult)
• Fully custom parallel machines:
  • Completely optimized
  • Single-purpose (difficult to recycle)
  • Need to design, develop and build (or modify) the hardware and software
• Commodity clusters:
  • "Cheap PC" components
  • Completely customizable
  • Easy to upgrade / recycle

11 Machines
• Commercial supercomputers: Cray T3E, Fujitsu VPP77, NEC SX-4, Hitachi SR8000…
• Parallel machines:
  • APEmille/apeNEXT (INFN/DESY)
  • QCDSP/QCDOC (CU/UKQCD/Riken)
  • CP-PACS (Tsukuba/Hitachi)
• Commodity clusters + fast networking:
  • Low latency (fast networking)
  • Fast speed
  • Standard software and programming environments

12 Lattice cluster@DESY
• Cluster bought from a company (Megware), Beowulf type (1 master, 32 slaves)
• Before upgrade (some weeks ago):
  • 32 nodes: Intel Xeon P4 1.7 GHz, 256 KB cache
  • 1 GB Rambus RAM
  • 2 × 64-bit PCI slots
  • 18 GB SCSI hard disks
  • Fast Ethernet switch (normal networking, NFS disk mounting)
  • Myrinet network (low latency)
• Upgrade (August 2002):
  • 16 nodes: 2 Intel Xeon P4 1.7 GHz, 256 KB cache
  • 16 nodes: 2 Intel Xeon P4 2.0 GHz, 512 KB cache
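
The cluster has two networks, and a simple cost model shows why the low-latency one matters for parallel jobs. The sketch below assumes the usual first-order model, time = latency + size / bandwidth. The latency and bandwidth figures are rough ballpark assumptions chosen for illustration; they were not measured on this cluster and do not come from the talk.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """First-order model: fixed start-up latency plus size over bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def latency_fraction(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Share of the total transfer time spent waiting on latency."""
    total = transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s)
    return latency_s / total


# Assumed, illustrative figures (latency in seconds, bandwidth in bytes/s).
FAST_ETHERNET = (100e-6, 12.5e6)   # ~100 microseconds, ~100 Mbit/s
LOW_LATENCY_NET = (10e-6, 250e6)   # ~10 microseconds, ~2 Gbit/s

# A 1 KB message, typical of a boundary exchange or a global sum.
small_on_ethernet = latency_fraction(1024, *FAST_ETHERNET)

# A 1 MB message, typical of bulk file traffic.
large_on_ethernet = latency_fraction(1_000_000, *FAST_ETHERNET)

# The same small message on the assumed low-latency network.
speedup_small = (transfer_time(1024, *FAST_ETHERNET)
                 / transfer_time(1024, *LOW_LATENCY_NET))
```

Under these assumptions, latency accounts for more than half the cost of a small message on the slower network and is negligible for a large one. That is consistent with using Fast Ethernet for NFS traffic and the low-latency network for the message passing of the parallel jobs.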

13 Lattice cluster@DESY (2)
• Software: SuSE Linux (modified by Megware)
• MPICH-GM (implementation of MPICH, "MPI Chameleon", for the Myrinet GM system)
• Megware Clustware (modified OpenSCE/SCMS): tool for monitoring and administration (but no logs)

14 Lattice cluster@DESY (3)
• First version, by Andreas Gellrich:
  • Provides logs and monitoring
  • Written in Perl (customizable)

15 Lattice cluster@DESY (4)
• New version, by Andreas Gellrich and me:
  • Also provides graphical data and another log measure
  • Uses MRTG to graph data
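
MRTG can graph values produced by an external script, which it expects to print four lines: the first value, the second value, an uptime string and a target name. The sketch below shows that output format only. It is written in Python rather than Perl and is not the Clumon code; the choice of load averages as the two values, the scaling by 100 to get integers, and all names are assumptions made for illustration.

```python
import os


def mrtg_report(first_value, second_value, uptime, target_name):
    """Format the four lines MRTG reads from an external script."""
    return "\n".join([
        str(int(first_value)),
        str(int(second_value)),
        uptime,
        target_name,
    ])


def load_average_report(node_name="node01"):
    """Report the 1- and 5-minute load averages of this machine.

    Load averages are scaled by 100 so that MRTG receives integers.
    os.getloadavg() is only available on Unix-like systems; the node
    name and the "unknown" uptime string are placeholders.
    """
    load_1min, load_5min, _ = os.getloadavg()
    return mrtg_report(round(load_1min * 100), round(load_5min * 100),
                       "unknown", node_name)


# Example with fixed numbers, so the output format is easy to inspect.
example = mrtg_report(170, 150, "3 days", "node01")
```

A monitoring tool would run such a script for each node at a regular interval and let MRTG or RRDtool store and plot the values, while a separate log file keeps the raw numbers.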

16 Clumon v2.0 (1)

17 Clumon v2.0 (2)

18 Work done (in progress)
• Getting the flavor of a really high-performance cluster
• Learning Perl (more or less) to understand Andreas' tool
• Playing around with Andreas' tool
• Searching for how to graph this kind of data
• Learning how to use MRTG/RRDtool
• Some tests and earlier versions
• Only the final touches (polishing) remain:
  • Time info of the cluster
  • Better documentation of the tools
• Play around this last week with other stuff
• Prepare the talk, the document and the write-up

19 Possible Improvements
• The cluster is not connected to the DESY AFS
• Need for backups / archiving of the stored data (dCache theoc01)
• Maybe reinstall the cluster with DESY Linux (to fully know what's in it)
• Play around with other cluster software: OpenSCE, OSCAR, ROCKS…
