A Nordic Tier-1 for LHC. Mattias Wadenstein, Systems Integrator, NDGF. Grid Operations Workshop, Stockholm, June 14th, 2007

Overview
– Background: the Nordic Tier-1, organization / governance
– Tier-1 services: computing, storage, ATLAS, ALICE
– Operation: current usage, tools used in daily operations

The Nordic Tier-1

Background The Nordic countries together have about 25 million people, and no single country has more than 10 million, so a Nordic Tier-1 needs to utilize hardware at the bigger Nordic compute sites. There is a strong Nordic grid tradition: NorduGrid / ARC is deployed at all Nordic compute centers and is used heavily also by non-HEP users (>75%). Hence the need for a pan-Nordic organization for the Tier-1 and possibly other large inter-Nordic e-Science projects.

Background Many (and sometimes stalled) discussions on a pan-Nordic grid organization resulted in the present NDGF. It is funded equally by the four Nordic research councils and hosted by the existing pan-Nordic research network organization NORDUnet. NDGF was born on June 1st, 2006, the management team has been operational since September 2006, and staff has been ramped up rapidly.

Organization – CERN related

NDGF Tier-1 Resource Centers The seven biggest Nordic compute centers, the dTier-1 sites, form the NDGF Tier-1. Resources (storage and computing) are scattered across the sites, while services can be centralized. This gives advantages in redundancy, especially for 24x7 data taking.

NDGF Facility

Tier-1 Services: Networking Today NDGF is connected directly to CERN with a 10 Gbit fiber via GEANT. There is an inter-Nordic shared 10 Gbit network from NORDUnet, and a dedicated 10 Gbit LAN covering all dTier-1 centers is planned.

Tier-1 Services: Computing NorduGrid / ARC middleware is used for computing. It has been used routinely since 2002, e.g. for the ATLAS data challenges, and is deployed at all the dTier-1 sites. Interoperability is tackled per VO.
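As an illustration of what job submission through ARC looks like, the sketch below builds a minimal xRSL job description and hands it to the NorduGrid command-line client. The cluster name, the xRSL attribute values and the exact client options are assumptions for the example, not NDGF specifics; check the help text of the ARC client version actually installed.

#!/usr/bin/env python
"""Sketch of submitting a trivial test job through ARC.

Assumptions: the NorduGrid client (ngsub) is installed and a valid grid
proxy exists; the target cluster name and the '-f'/'-c' options used below
are illustrative and may differ between ARC client versions.
"""
import subprocess
import tempfile

# A tiny xRSL job description: run /bin/echo and collect stdout/stderr.
XRSL = """&
(executable = "/bin/echo")
(arguments = "hello from NDGF")
(stdout = "out.txt")
(stderr = "err.txt")
(jobName = "ndgf-test-job")
(cputime = "10")
"""

def submit(cluster="example-cluster.ndgf.org"):
    """Write the job description to a file and submit it with ngsub."""
    with tempfile.NamedTemporaryFile("w", suffix=".xrsl", delete=False) as f:
        f.write(XRSL)
        xrsl_path = f.name
    # '-c <cluster>' restricts brokering to one cluster and '-f <file>' points
    # at the job description; both are assumed here, so treat this as a sketch.
    result = subprocess.run(["ngsub", "-c", cluster, "-f", xrsl_path],
                            capture_output=True, text=True)
    print(result.stdout or result.stderr)

if __name__ == "__main__":
    submit()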

Storage dCache installation: admin and door nodes at the GEANT endpoint, pools at the sites. Very close collaboration with DESY to ensure dCache is suited also for distributed use.
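One way to picture the split between central services and site pools is a connectivity check along the lines of the sketch below: it simply verifies that the central doors and the remote pool hosts answer on their service ports. All hostnames and port numbers are made up for the example; they are not the actual NDGF endpoints.

#!/usr/bin/env python
"""Sketch of a reachability check for a distributed dCache setup.

The hostnames and ports below are placeholders, not real NDGF endpoints;
adjust them to the doors and pools of the installation being checked.
"""
import socket

# Central services at the network endpoint, pools hosted at the sites.
ENDPOINTS = {
    "srm door (central)":     ("srm.example.org", 8443),
    "gridftp door (central)": ("gridftp.example.org", 2811),
    "pool node (site A)":     ("pool1.site-a.example.org", 22),
    "pool node (site B)":     ("pool1.site-b.example.org", 22),
}

def is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for name, (host, port) in ENDPOINTS.items():
        status = "up" if is_reachable(host, port) else "DOWN"
        print(f"{name:25s} {host}:{port:<5d} {status}")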

NDGF Storage

Storage Central installation:
– 7 Dell servers: dual-core 2 GHz Xeon, 4 GB RAM, 2 x 73 GB 15k SAS disks (mirrored); one of the seven is a spare
– 2 x Dell PowerVault MD-1000 direct-attached storage enclosures, each with 7 x 143 GB 15k SAS in RAID-10
Running:
– 2 x PostgreSQL for PNFS in HA mode (master-slave), with the database on the MD-1000
– 1 x PNFS manager and pool manager
– 1 x SRM, location manager, statistics, billing, etc.
– 1 x GridFTP and xrootd door on one machine
– 1 x monitoring and intrusion detection on one machine
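Since the PNFS namespace depends on the PostgreSQL pair being healthy, a small check along the lines of the sketch below can be useful. It is only an illustration: the hostnames, database name and credentials are placeholders, it assumes the psycopg2 driver is available, and it is not the monitoring actually used by NDGF.

#!/usr/bin/env python
"""Sketch of a health check for a master-slave PostgreSQL pair behind PNFS.

Placeholders only: hostnames, database name and credentials below are not
the real NDGF settings. Requires the psycopg2 driver.
"""
import sys
import psycopg2

NODES = ["pnfs-db-master.example.org", "pnfs-db-slave.example.org"]

def check_node(host):
    """Connect and run a trivial query; return True if the node responds."""
    try:
        conn = psycopg2.connect(host=host, dbname="pnfs",
                                user="monitor", password="secret",
                                connect_timeout=5)
        with conn.cursor() as cur:
            cur.execute("SELECT 1")
            cur.fetchone()
        conn.close()
        return True
    except psycopg2.Error as exc:
        print(f"{host}: ERROR {exc}", file=sys.stderr)
        return False

if __name__ == "__main__":
    ok = [check_node(h) for h in NODES]
    print("all database nodes responding" if all(ok)
          else "at least one database node is not responding")
    sys.exit(0 if all(ok) else 1)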

Storage - dCache Operational with two sites:
– Denmark, Univ. of Copenhagen: 70 TB, but currently offline due to SAN issues
– Norway, Univ. of Oslo: 70 TB
Running whatever OS the local admins want – currently 64-bit Ubuntu, 64-bit RHEL4 and 32-bit RHEL4-compatible.

FTS Located in Linköping:
– 1 server for FTS
– 1 server for the Oracle database
Channels are defined from the T1s to NDGF for T1-T1 transfers. FTS is also being used internally for migrating data, in particular from legacy gsiftp sources. There is a web interface to statistics and error messages – screenshots on the next two pages.
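For orientation, the sketch below shows roughly how a single file copy could be handed to an FTS instance of that era with the glite-transfer command-line client, wrapped in a small Python helper. The endpoint URL and SURLs are invented for the example, and the exact client options varied between glite releases, so treat the invocation as an assumption rather than a recipe.

#!/usr/bin/env python
"""Sketch: hand one source -> destination copy to FTS via the glite CLI.

Assumptions: the glite-transfer-submit client is installed and a valid
proxy exists; the FTS endpoint and SURLs below are placeholders, and the
'-s' option for the service endpoint is assumed from clients of that
period -- check the installed client's help text before relying on it.
"""
import subprocess

FTS_ENDPOINT = "https://fts.example.org:8443/glite-data-transfer-fts/services/FileTransfer"

def submit_transfer(source_surl, dest_surl):
    """Submit one transfer job and return the job ID printed by the client."""
    cmd = ["glite-transfer-submit", "-s", FTS_ENDPOINT, source_surl, dest_surl]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()

if __name__ == "__main__":
    job_id = submit_transfer(
        "srm://srm.site-a.example.org/pnfs/example.org/data/file1",
        "srm://srm.ndgf.example.org/pnfs/ndgf.org/data/file1",
    )
    print("submitted FTS job:", job_id)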

FTS The FTS web interface is also deployed at PIC – the screenshot is from that installation, since the NDGF FTS did not have much interesting activity to show at the time.

FTS

3D Initial setup located in Helsinki:
– one dual-CPU dual-core Xeon box with 4 GB of memory (no RAC, just one server)
– high-availability SAN storage
– a bit more than one TB of space allocated for data, with more to come after March
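A quick sanity check of such a 3D database replica can be as small as the sketch below, which connects with the cx_Oracle driver and runs a trivial query. The connection string and account are placeholders for illustration, not the Helsinki setup.

#!/usr/bin/env python
"""Sketch: verify that an Oracle instance (e.g. a 3D replica) answers queries.

Requires the cx_Oracle driver; the DSN and credentials are placeholders.
"""
import cx_Oracle

def check_oracle(dsn="db.example.org:1521/orcl",
                 user="monitor", password="secret"):
    """Return the database time if the instance responds, else raise."""
    conn = cx_Oracle.connect(user, password, dsn)
    try:
        cur = conn.cursor()
        cur.execute("SELECT sysdate FROM dual")
        (db_time,) = cur.fetchone()
        return db_time
    finally:
        conn.close()

if __name__ == "__main__":
    print("database answered, server time:", check_oracle())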

ATLAS VO Services ATLAS VOBox (ARC flavor) services are fully implemented. ARC uses Globus RLS, and the US-ATLAS-LRC MySQL records are almost identical to the Globus RLS ones. An online-synchronized US-ATLAS-LRC and RLS has been in production since November, which enables ATLAS users outside NDGF to subscribe to data stored on old SEs. In operation since November 2006.
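Keeping two replica catalogues online-synchronized essentially means comparing their GUID-to-replica mappings and pushing over whatever is missing. The sketch below shows the idea with two plain dictionaries standing in for the RLS and LRC contents; the fetch and insert helpers are hypothetical placeholders, not actual RLS or LRC client calls.

#!/usr/bin/env python
"""Conceptual sketch of one-way catalogue synchronization (RLS -> LRC).

The two fetch functions and the insert function below are hypothetical
stand-ins for real catalogue queries; only the diff logic is the point.
"""

def fetch_rls_mappings():
    """Hypothetical: return {guid: replica_url} as stored in the RLS."""
    return {"guid-0001": "srm://srm.example.org/data/file1",
            "guid-0002": "srm://srm.example.org/data/file2"}

def fetch_lrc_mappings():
    """Hypothetical: return {guid: replica_url} as stored in the LRC."""
    return {"guid-0001": "srm://srm.example.org/data/file1"}

def insert_into_lrc(guid, url):
    """Hypothetical: register one missing mapping in the LRC."""
    print(f"would insert into LRC: {guid} -> {url}")

def sync_once():
    """Copy every mapping present in the RLS but missing from the LRC."""
    rls = fetch_rls_mappings()
    lrc = fetch_lrc_mappings()
    missing = {g: u for g, u in rls.items() if g not in lrc}
    for guid, url in missing.items():
        insert_into_lrc(guid, url)
    return len(missing)

if __name__ == "__main__":
    print("mappings copied:", sync_once())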

ALICE VO Services ALICE VOBoxes are installed at: Univ. of Copenhagen, Univ. of Bergen, Univ. of Aalborg and the National Supercomputer Centre in Sweden. Integration with ARC and the distributed NDGF storage is ongoing.

Operation

Operation – current usage Current usage (beginning of June 2007): ATLAS data is being read and written from compute jobs on the grid, and ATLAS Tier-0 to Tier-1 transfer tests are running.

Operation – current usage CPU usage depends on the participating clusters being available. Last weekend ATLAS had more than 1200 concurrent jobs running on NorduGrid, a large part of those on NDGF-affiliated resources. The day-to-day computational jobs do not require much participation from NDGF operations, as long as the core services are running (dCache, RLS, GIIS, etc.).

Operations Interfacing with EGEE for operations: reading and responding to GGUS tickets, SAM tests running on NDGF, and cooperation with NEROC.

Tools used in daily ops The NorduGrid monitor is used for computing. For storage – the distributed dCache – there is Ganglia graphing of the involved nodes and some dCache sources, plus Jabber communication with local admins and NDGF staff. The CIC portal and GOCDB are used for downtime announcements etc., and dozens of webpages for service status.

Tools used in daily ops RT or a mailing list is used for user problems, and GGUS for COD tickets – including the occasional EGEE user reading data from srm.ndgf.org. Nagios is being set up, and the probes will be shared on the sysadmin wiki. A 24/7 first line is being set up, with experts on call during weekends.
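As a flavour of what such shared probes can look like, below is a minimal Nagios-style plugin that checks whether an SRM door answers on its TCP port and reports via the standard plugin exit codes. The hostname and port are placeholders and the check is deliberately simplistic; it is a sketch, not one of the probes NDGF actually publishes.

#!/usr/bin/env python
"""Minimal Nagios-style probe: is the SRM door's TCP port reachable?

Placeholders: host and port below are illustrative. Follows the standard
Nagios plugin convention of exit code 0 = OK, 2 = CRITICAL, 3 = UNKNOWN.
"""
import socket
import sys

HOST = "srm.example.org"   # placeholder, not the real NDGF SRM host
PORT = 8443                # common SRM port, adjust as needed

def main():
    try:
        with socket.create_connection((HOST, PORT), timeout=10):
            print(f"OK - {HOST}:{PORT} is accepting connections")
            return 0
    except socket.timeout:
        print(f"CRITICAL - {HOST}:{PORT} timed out")
        return 2
    except OSError as exc:
        print(f"CRITICAL - {HOST}:{PORT} unreachable ({exc})")
        return 2

if __name__ == "__main__":
    try:
        sys.exit(main())
    except Exception as exc:          # anything unexpected -> UNKNOWN
        print(f"UNKNOWN - probe failed: {exc}")
        sys.exit(3)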

Future... Disk space will be ramped up rapidly within the next few months, and large computational resources are planned for the end of 2007. In 2010 we will have more than 4 MSI2k of computing and 4 PB of storage.