
1 A Distributed Tier-1
An example based on the Nordic Scientific Computing Infrastructure
GDB meeting – NIKHEF/SARA, 13th October 2004
John Renner Hansen – Niels Bohr Institute
With contributions from Oxana Smirnova, Peter Villemoes and Brian Vinter

2 Basis for a distributed Tier-1 structure
– External connectivity
– Internal connectivity
– Computer and storage capacity
– Maintenance and operation
– Long-term stability

3 NORDUnet network in 2003
[Network map: external connections including RUNNet, NASK, NETNOD, GÉANT at 10G (Oct ’03) and general Internet links, with capacities ranging from 12M to 10G]

4 NorthernLight
[Network map: Helsinki, Oslo, Stockholm and Copenhagen linked to NetherLight in Amsterdam; 2.5G links connected to “ONS boxes”, giving 2 GE channels between endpoints (Aug 2003, Dec 2003)]

5 NORDUnet was represented at the NREN-TIER1 Meeting (Paris, Roissy Hilton, 12:00-17:00, 22 July 2004) by Peter Villemoes

6 Denmark / Forskningsnet
Upgrade from 622 Mbit/s to a 2.5 Gbit/s ring structure finished:
– Copenhagen-Odense-Århus-Aalborg and back via Göteborg to Copenhagen
Network research:
– setting up a Danish national IPv6 activity
– dark fibre through the country; experimental equipment for 10GE channels

7 Finland / FUNET
– Upgraded to 2.5G already in 2002
– Upgrading backbone routers to 10G capability

8 Norway / UNINETT
– Network upgraded to 2.5G between major universities
– UNINETT is expanding to more services and organisations

9 Sweden / SUNET
– 10G resilient nationwide network since Nov 2002
– all 32 universities have 2.5G access
– Active participation in SweGrid, the Swedish Grid Initiative

10 Computer and Storage capacity at a Nordic Tier-1
Planned ramp-up 2004–2010, with the 2008 capacity split across ALICE, ATLAS, CMS and LHCb (SUM 2008):
CPU (kSI2K): 300, 700, 1400 over the ramp-up; offered: 1400 (8% of total)
Disk (TBytes): 60, 70, 200, 700; offered: 700 (8% of total)
Tape (PBytes): 0.06, 0.07, 0.2, 0.7; offered: 0.7 (12% of total)
Tape (MBytes/sec): required 580
WAN (Mbits/sec): 1000, 2000, 5000, 10000
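The percentage figures on slide 10 tie the Nordic offer to the combined requirement of all Tier-1 centres. A minimal arithmetic sketch of that relation; the implied combined totals are inferred from the offered amounts and the quoted shares, they are not stated on the slide:

```python
# Back-of-the-envelope check of the 2008 capacity figures above.
# The implied combined Tier-1 requirement is inferred from the offered
# amount and the quoted share; it is not stated on the slide itself.

offered = {
    "CPU (kSI2K)": (1400, 0.08),   # (offered amount, Nordic share of all Tier-1s)
    "Disk (TBytes)": (700, 0.08),
    "Tape (PBytes)": (0.7, 0.12),
}

for resource, (amount, share) in offered.items():
    implied_total = amount / share
    print(f"{resource}: Nordic Tier-1 offers {amount}, "
          f"implying a combined Tier-1 requirement of about {implied_total:g}")
```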

11 Who: Denmark – Danish Center for Grid Computing (DCGC) and Danish Center for Scientific Computing (DCSC)

12 Who: Denmark; Finland – CSC

13 Who: Denmark; Finland; Norway – NorGrid

14 Who: Denmark; Finland; Norway; Sweden – SweGrid

15 Denmark
Two collaborating Grid projects:
Danish Centre for Scientific Computing Grid
– DCSC-Grid spans the four DCSC sites and thus unifies the resources (PC clusters, IBM Regatta, SGI Enterprise, …) within DCSC
Danish Centre for Grid Computing
– is the national Grid project
– DCSC-Grid is a partner in DCGC

16 Finland
Remains centred on CSC:
– CSC participates in NDGF and NGC
– A Finnish Grid will probably be created
– This Grid will focus more on accessing CSC resources with local machines

17 Norway
NOTUR Emerging Technologies on Grid Computing is the main mover:
– Oslo-Bergen “mini-Grid” in place
– Trondheim and Tromsø should be joining
Currently under reorganization

18 Sweden
SweGrid is a very ambitious project:
– 6 clusters have been created for SweGrid, each equipped with 100 PCs and a large disk system
– A large support and education organization is integrated in the plans

19 Nordic Data Grid Facility
[Organisation chart: the Nordic national research councils and NOS-N above the Nordic project]
Nordic project goals:
1. Create the basis for a common Nordic Data Grid Facility
2. Coordinate Nordic Grid activities
Core Group: Project Director, 4 post-docs
Steering Group: 3 members per country (1 research-council civil servant, 2 scientists)

20 Services provided by the Tier-1 Regional Centres
– acceptance of raw and processed data from the Tier-0 centre, keeping up with data acquisition;
– recording and maintenance of raw and processed data on permanent mass storage;
– provision of managed disk storage providing permanent and temporary data storage for files and databases;
– operation of a data-intensive analysis facility;
– provision of other services according to agreed experiment requirements;
– provision of high-capacity network services for data exchange with the Tier-0 centre, as part of an overall plan agreed between the experiments, Tier-1 and Tier-0 centres;
– provision of network services for data exchange with Tier-1 and selected Tier-2 centres, as part of an overall plan agreed between the experiments, Tier-1 and Tier-2 centres;
– administration of databases required by experiments at Tier-1 centres;
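The first requirement, accepting data from Tier-0 while keeping up with data acquisition, is at heart a bandwidth check against the figures on slide 10. A hedged sketch; the 70% usable-link fraction is an illustrative assumption, not a number from the talk:

```python
# Check whether the planned WAN link can sustain the required ingest rate
# from Tier-0. Rates are taken from slide 10; the usable fraction of the
# link is an illustrative assumption, not a figure from the talk.

required_mbytes_per_s = 580        # required tape/ingest rate (slide 10)
wan_mbits_per_s = 10_000           # planned WAN capacity (slide 10)
usable_fraction = 0.70             # assumed share of the link available for bulk transfer

required_mbits_per_s = required_mbytes_per_s * 8
usable_mbits_per_s = wan_mbits_per_s * usable_fraction

print(f"required: {required_mbits_per_s} Mbit/s, usable: {usable_mbits_per_s:.0f} Mbit/s")
print("link sufficient" if usable_mbits_per_s >= required_mbits_per_s else "link insufficient")
```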

21

22 ARC-connected resources for DC2

   Site                          ~# CPUs   ~% Dedicated
1  atlas.hpc.unimelb.edu.au         28        30%
2  genghis.hpc.unimelb.edu.au       90        20%
3  charm.hpc.unimelb.edu.au         20       100%
4  lheppc10.unibe.ch                12       100%
5  lxsrv9.lrz-muenchen.de          234         5%
6  atlas.fzk.de                    884         5%
7  morpheus.dcgc.dk                 18       100%
8  lscf.nbi.dk                      32        50%
9  benedict.aau.dk                  46        90%
10 fe10.dcsc.sdu.dk                644         1%
11 grid.uio.no                      40       100%
12 fire.ii.uib.no                   58        50%
13 grid.fi.uib.no                    4       100%
14 hypatia.uio.no                  100        60%
15 sigrid.lunarc.lu.se             100        30%
16 sg-access.pdc.kth.se            100        30%
17 hagrid.it.uu.se                 100        30%
18 bluesmoke.nsc.liu.se            100        30%
19 ingrid.hpc2n.umu.se             100        30%
20 farm.hep.lu.se                   60        60%
21 hive.unicc.chalmers.se          100        30%
22 brenta.ijs.si                    50       100%

Totals at peak: 7 countries, 22 sites, ~3000 CPUs (~700 dedicated)
7 Storage Services (in RLS), a few more storage facilities, ~12 TB
~1 FTE (1-3 persons) in charge of production
At most 2 executor instances simultaneously
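The headline totals on slide 22 follow from the per-site figures. A small sketch that recomputes them; the per-site numbers are read from the reconstructed table above, and since both CPU counts and dedicated percentages are marked as approximate, the result only roughly matches the quoted ~3000 / ~700:

```python
# Recompute the peak totals from the per-site table above.
# (CPUs, dedicated fraction) per site, as read from the reconstructed table;
# the slide itself marks both columns as approximate.
sites = {
    "atlas.hpc.unimelb.edu.au": (28, 0.30),
    "genghis.hpc.unimelb.edu.au": (90, 0.20),
    "charm.hpc.unimelb.edu.au": (20, 1.00),
    "lheppc10.unibe.ch": (12, 1.00),
    "lxsrv9.lrz-muenchen.de": (234, 0.05),
    "atlas.fzk.de": (884, 0.05),
    "morpheus.dcgc.dk": (18, 1.00),
    "lscf.nbi.dk": (32, 0.50),
    "benedict.aau.dk": (46, 0.90),
    "fe10.dcsc.sdu.dk": (644, 0.01),
    "grid.uio.no": (40, 1.00),
    "fire.ii.uib.no": (58, 0.50),
    "grid.fi.uib.no": (4, 1.00),
    "hypatia.uio.no": (100, 0.60),
    "sigrid.lunarc.lu.se": (100, 0.30),
    "sg-access.pdc.kth.se": (100, 0.30),
    "hagrid.it.uu.se": (100, 0.30),
    "bluesmoke.nsc.liu.se": (100, 0.30),
    "ingrid.hpc2n.umu.se": (100, 0.30),
    "farm.hep.lu.se": (60, 0.60),
    "hive.unicc.chalmers.se": (100, 0.30),
    "brenta.ijs.si": (50, 1.00),
}

total_cpus = sum(cpus for cpus, _ in sites.values())
dedicated_cpus = sum(cpus * frac for cpus, frac in sites.values())
print(f"{len(sites)} sites, ~{total_cpus} CPUs in total, "
      f"~{dedicated_cpus:.0f} CPU-equivalents dedicated")
```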

23 ARC performance in ATLAS DC2
– Total # of successful jobs: 42202 (as of September 25, 2004)
– Failure rate before ATLAS ProdSys manipulations: 20%; ~1/3 of failed jobs did not waste resources
– Failure rate after: 35%
– Possible reasons: Dulcinea failing to add DQ attributes in RLS; DQ renaming; Windmill re-submitting good jobs
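For context, the quoted failure rates can be turned into rough job counts. A sketch; the assumption that each percentage refers to the failed fraction of all attempted jobs is mine, not stated on the slide:

```python
# Implied job counts behind the quoted failure rates, assuming each rate
# is the failed fraction of all attempted jobs (an interpretation, not
# something the slide states explicitly).

successful = 42_202   # successful jobs as of 25 September 2004 (slide 23)

for label, failure_rate in [("before ATLAS ProdSys manipulations", 0.20),
                            ("after ATLAS ProdSys manipulations", 0.35)]:
    attempted = successful / (1 - failure_rate)
    failed = attempted - successful
    print(f"{label}: ~{attempted:.0f} attempted jobs, ~{failed:.0f} failed "
          f"({failure_rate:.0%} failure rate)")
```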

24 Failure analysis
Dominant problem: hardware accidents

