"A prototype for INFN TIER-1 Regional Centre"
Luca dell'Agnello, INFN – CNAF, Bologna
Workshop CCR, La Biodola, 8 May 2002

INFN – TIER1 Project
- Computing facility for the INFN HENP community
  - Usage by other countries will be regulated by mutual agreements
- Multi-experiment TIER1
  - LHC experiments (ALICE, ATLAS, CMS, LHCb)
  - VIRGO
  - CDF (in the near future)
- Resources assigned to experiments on a yearly plan
- Location: INFN-CNAF, Bologna (Italy), one of the main nodes of GARR
- TIER2 and TIER3 centres under development at other sites
- INFN-TIER1 is a prototype!
  - 4th quarter 2003: end of project
  - Winter 2004: revision of the experimental phase and new master plan
  - 2004: TIER1 becomes fully operational

Experiments needs
- LHC experiments
  - CERN (Tier 0)
    - Mass storage: 10 PB (10^15 bytes) per year
    - Disk: 2 PB
    - CPU: 2 MSI95 (a PC today ~ 30 SI95)
  - Multi-experiment Tier 1
    - Mass storage: 3 PB/yr
    - Disk: 1.5 PB
    - CPU: 1 MSI95
  - Networking Tier 0 -> Tier 1: 2 Gbps
- Other experiments (VIRGO, CDF): to be defined
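To put these figures in perspective, here is a back-of-the-envelope Python sketch (purely illustrative arithmetic, not part of the slides) converting the Tier-1 CPU requirement into PC-equivalents at the quoted ~30 SI95 per PC, and the 3 PB/yr mass-storage figure into an average sustained data rate:

```python
# Back-of-the-envelope arithmetic using the numbers quoted on this slide.
CPU_REQUIRED_SI95 = 1_000_000       # 1 MSI95 for the multi-experiment Tier 1
PC_POWER_SI95 = 30                  # "PC today ~ 30 SI95"
MASS_STORAGE_PB_PER_YEAR = 3        # 3 PB/yr to mass storage

pcs_needed = CPU_REQUIRED_SI95 / PC_POWER_SI95
seconds_per_year = 365 * 24 * 3600
avg_rate_gbps = MASS_STORAGE_PB_PER_YEAR * 1e15 * 8 / seconds_per_year / 1e9

print(f"~{pcs_needed:,.0f} year-2002 PCs to reach 1 MSI95")       # ~33,333
print(f"~{avg_rate_gbps:.2f} Gb/s sustained to archive 3 PB/yr")  # ~0.76 Gb/s
```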

Services
- Computing servers (CPU farms)
- Access to on-line data (disks)
- Mass storage/tapes
- Broad-band network access and QoS
- System administration
- Database administration
- Experiment-specific library software
- Helpdesk
- Coordination with TIER0, other TIER1s and TIER2s

Issues
- Technical staff
  - Recruiting and training
- Resource management
  - Minimization of manual operations
- Sharing of resources (network, CPU, storage, HR) among experiments
  - Optimization of resource use
- Compatibility between test and production activity
  - Technological tests for the Tier-1
  - Prototype phase (LHC experiments)
  - Production phase (VIRGO)
- Integration with the (Data)Grid framework
  - Interoperation
  - Common tool development and testing

HR Resources – Personnel (staffing table not reproduced in the transcript)

Networking (1)
- New GARR-B backbone with 2.5 Gbps fibre-optic lines already in place
- CNAF-TIER1 access is currently 100 Mbps and will be 622 Mbps in a few weeks
  - The GigaPoP is co-located with INFN-TIER1
- Many TIER2 sites are now at 34 Mbps and will soon migrate to 155 Mbps
- International connectivity via GEANT: 2.5 Gbps access in Milano and 2 x 2.5 Gbps GEANT links to the US (Abilene + commodity) already in place

Networking (2)
- Diagram of the GARR-B backbone: PoPs in MI, BO, RM, PI, TO, CT and PD; 2.5 Gbps and 155 Mbps links; TIER1 at CNAF with 100 Mbps (622 Mbps) access; connection to GEANT

Interconnection to the Internet (near future)
- Diagram: CNAF access to the GARR-B backbone (Bologna 2.5 Gb/s, Milano 1 Gb/s Gigabit Ethernet, Roma 622 Mb/s POS) through a Cisco router; the SSR8600 core switch connects the rack farms (Fast Ethernet), the NAS file server and the STK tape library at 1 Gb/s

LAN
- A large amount of resources to be interconnected at high speed...
  - CPUs connected to the rack Fast Ethernet switch, with a Gigabit uplink to the core switch
  - Disk servers connected via Gigabit Ethernet to the core switch
  - Foreseen upgrade to Gigabit rack switches and a 10 Gigabit core switch (2003?)
- ... and shared among experiments
  - Possibility to reallocate each resource at any moment
  - Avoid recabling (or physically moving) hardware to change the topology
- Level 2 isolation of the farms
  - Helps enforce security measures

Tier1 LAN model layout
- Diagram of the planned LAN layout: rack farms, NAS, SAN and the STK tape library

VLAN Tagging (1)
- Possible solution for complete granularity
  - Each switch port is associated with one VLAN identifier
  - Each rack switch uplink propagates the VLAN information
  - VLAN identifiers are propagated across switches
  - Each farm has its own VLAN
- Independent of switch brand (standard 802.1Q)
- First interoperability tests show the viability of the solution
  - Extreme <-> Extreme: OK
  - Extreme <-> Enterasys: OK
  - Extreme <-> Cisco: tests ongoing
  - Extreme <-> HP: tests ongoing
- (Diagram: two 48-port switches with VLAN 1 and VLAN 10 carried over the uplink)
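As an illustration of how 802.1Q tagging keeps per-experiment traffic separated on a shared uplink, the following Python/Scapy sketch builds a tagged Ethernet frame. Scapy, the VLAN numbers and the addresses are not from the slides; this is only a hedged example of the mechanism.

```python
# Illustrative sketch (not from the slides): build an 802.1Q-tagged frame with
# Scapy to show how a per-experiment VLAN ID travels inside the Ethernet
# header on the rack-switch uplink.
from scapy.all import Ether, Dot1Q, IP

# Hypothetical mapping: one VLAN per experiment farm (numbers are examples).
EXPERIMENT_VLANS = {"ALICE": 10, "ATLAS": 11, "CMS": 12, "LHCb": 13, "VIRGO": 14}

def tagged_frame(experiment: str, dst_ip: str) -> Ether:
    """Return an Ethernet frame tagged with the experiment's VLAN ID."""
    vlan_id = EXPERIMENT_VLANS[experiment]
    return Ether() / Dot1Q(vlan=vlan_id) / IP(dst=dst_ip)

if __name__ == "__main__":
    frame = tagged_frame("CMS", "192.0.2.10")
    frame.show()  # the Dot1Q layer carries vlan=12 across any 802.1Q switch
```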

VLAN Tagging (2)
- Diagram of 802.1Q trunking: one VLAN per experiment (CMS, Virgo, ATLAS, LHCb), one uplink per rack

Computing units
- Basic unit
  - Intel CPU with RedHat Linux (DataGrid framework)
    - Different requirements from the various experiments: RedHat 6.2 (moving to RedHat 7.2) plus experiment-specific libraries
  - 1U rack-mountable dual-processor servers
    - Clock speeds from 800 MHz up to the GHz range
    - 2 Fast Ethernet interfaces
    - 512 MB – 2 GB RAM
- Rack unit (what we buy)
  - 40 1U dual-processor servers
  - 1 Fast Ethernet switch with a Gigabit uplink to the main switch (to be upgraded in the near future)
  - Remote control via a KVM switch (tests with Raritan ongoing)
- A new bid (1 rack) is in progress

Brand choice
- "Certification" of platforms
  - Mechanical and assembly aspects
  - Hardware characteristics (e.g. cooling system, number of PCI slots, maximum RAM, etc.)
  - Hardware benchmarks (I/O, etc.)
  - PXE protocol support
    - Installation tests
  - Compatibility with RedHat Linux
- Tested platforms:
  - ProLiant 360 (COMPAQ): 5
  - Netfinity X330 (IBM): 12
  - PowerEdge 1550 (DELL): 48
  - INTEL (various OEMs)
  - SIEMENS

Computing units issues (1)
- Coexistence of DataGrid/DataTAG test-beds with "traditional" installations
  - Need to develop tools to manage non-grid servers
- Dynamic (re)allocation of server pools as experiment farms
  - Automatic procedure for installation and upgrade (a toy illustration follows this slide)
    - LCFG (developed by DataGrid WP4)
      - Central server for configuration profiles
      - Use of standard services (NFS, HTTP)
      - Only RedHat 6.2 currently supported
      - First boot from floppy
    - LCFG + PXE protocol (only a quick patch!)
      - No floppy needed
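The sketch below is only a toy illustration of the central-profile idea behind LCFG: one central description drives per-node installation profiles. It does not use the real LCFG profile format; the node names, fields and file layout are invented for the example.

```python
# Toy sketch of centrally-managed per-node profiles (the real setup used LCFG
# profiles served over NFS/HTTP; this generator just shows the idea).
from pathlib import Path

CENTRAL_CONFIG = {
    "cms-wn-01":   {"farm": "CMS",   "os": "RedHat 6.2", "vlan": 12},
    "virgo-wn-01": {"farm": "VIRGO", "os": "RedHat 6.2", "vlan": 14},
}

def write_profiles(outdir: str = "profiles") -> None:
    """Emit one plain-text profile per node from the central description."""
    out = Path(outdir)
    out.mkdir(exist_ok=True)
    for node, cfg in CENTRAL_CONFIG.items():
        lines = [f"node: {node}"] + [f"{k}: {v}" for k, v in cfg.items()]
        (out / f"{node}.profile").write_text("\n".join(lines) + "\n")

if __name__ == "__main__":
    write_profiles()
```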

Computing units issues (2)
- Resource database and management interface (a minimal sketch follows this slide)
  - Under development (PHP + MySQL)
  - Hardware characteristics of the servers
  - Software configuration of the servers
  - Server allocation
  - Possibly an interface to configure the switches and prepare LCFG profiles
- Monitoring system for central control
  - Ganglia (+ lm_sensors)
  - Proprietary systems (e.g. DELL) under consideration
  - Generic tool using SNMP under development
- Batch system (non-DataGrid servers)
  - PBS
  - Condor
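A minimal sketch of what such a resource database could hold (hardware characteristics, software configuration, allocation to experiments). The actual interface is PHP + MySQL; sqlite3 is used here only to keep the example self-contained, and the table and column names are invented.

```python
# Minimal sketch of a resource database like the one described in the slide.
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS server (
    hostname   TEXT PRIMARY KEY,
    cpu_mhz    INTEGER,
    ram_mb     INTEGER,
    os         TEXT,          -- software configuration
    experiment TEXT           -- current allocation (NULL = free pool)
);
"""

def demo() -> None:
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.execute("INSERT INTO server VALUES ('cms-wn-01', 1000, 1024, 'RedHat 6.2', 'CMS')")
    db.execute("INSERT INTO server VALUES ('free-wn-02', 800, 512, 'RedHat 6.2', NULL)")
    # Reallocate a free node to VIRGO, as the "server allocation" item implies.
    db.execute("UPDATE server SET experiment = 'VIRGO' WHERE hostname = 'free-wn-02'")
    for row in db.execute("SELECT hostname, experiment FROM server"):
        print(row)

if __name__ == "__main__":
    demo()
```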

Storage (1)
- Comparison between NAS and SAS architectures
  - SAS has Linux limits (e.g. problems with large volumes) but can be customized
  - NAS with a proprietary OS can be optimized
- Choice of Fibre Channel
  - Flexibility
- Access to the disk servers via Gigabit Ethernet
  - Access via NFS v3 (AFS considered, NFS v4 in the near future)
  - Tests for HA (fail-over) ongoing
- Legato NetWorker for user backup (on the L180)

Storage (2)
- Use of a staging software system for the archive
  - CASTOR installed and tested under RedHat Linux 7.2, in a client/server configuration with a single tape drive connected
    - Waiting for the StorageTek CSC Toolkit to start tests with the ACSLS software and the STK L180 library
- Near-future tests:
  - Study of volume manager tools for better space organization and server fail-over (Veritas, GFS, GPFS, ...)
  - Study of a SAN solution (Fibre Channel switches)
  - Integration of the NAS and SAS solutions
  - Tests and comparison of disk solutions (IDE RAID arrays, SCSI and Fibre Channel RAID arrays)
  - Tests with the TAPE DISPATCHER staging software (developed by A. Maslennikov and R. Gurin)

Storage resources (1)
- NAS (Procom)
  - 2 TB raw
  - NFS v3
  - 24 FC disks (72 GB each)
    - Upgrade to 16 TB in 2-3 months
  - FE and GE network interfaces
  - One RAID 5 volume
- SAS (Raidtec)
  - 2 TB raw
  - 12 SCSI disks (180 GB each)
  - RAID controller

Storage resources (2)
- SAN: Dell PowerVault 660F
  - 8 TB raw, Fibre Channel
    - 6 modules (224F, 3U) with 14 disks of 73 GB at 10,000 rpm each
  - 2 FC RAID controllers in partner mode
    - 512 MB cache on each controller (2 x 512 MB)
  - Under test
- Library: STK L180
  - 180 slots, 4 LTO drives, 2 9840 drives
    - 100 LTO tapes (100 GB each)
    - 180 9840 tapes (20 GB each)

Security issues (1)
- Need for centralized control of resource access
  - 1 FTE required
- DataGrid AAA infrastructure based on PKI
  - Authentication via certificates (well-established procedure)
  - Authorization currently evolving (see the sketch after this slide)
    - Presently via the "gridmap-file"
    - CAS (tests ongoing) does not seem to solve all issues
    - LCAS (tests ongoing) to map authorization at the local level
    - Some "glue" needed
- Interim solutions
  - ssh, sftp, bbftp
  - Bastion host at present
  - Kerberos (v5) in a few weeks, if suitable (presently under test)
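To show what the gridmap-file authorization step amounts to, here is a hedged Python sketch that maps a certificate subject DN to a local account. The DNs and account names are invented, and the parsing is simplified with respect to the real Globus file (which can map one DN to several comma-separated accounts).

```python
# Simplified illustration of gridmap-file authorization: a certificate subject
# DN is looked up and mapped to a local account name.
import re
from typing import Optional

EXAMPLE_GRIDMAP = '''
"/C=IT/O=INFN/OU=Personal Certificate/CN=Some User" cms001
"/C=IT/O=INFN/OU=Personal Certificate/CN=Other User" virgo002
'''

LINE_RE = re.compile(r'^"(?P<dn>[^"]+)"\s+(?P<account>\S+)$')

def local_account(dn: str, gridmap_text: str = EXAMPLE_GRIDMAP) -> Optional[str]:
    """Return the local account mapped to a certificate DN, or None."""
    for line in gridmap_text.strip().splitlines():
        m = LINE_RE.match(line.strip())
        if m and m.group("dn") == dn:
            return m.group("account")
    return None

if __name__ == "__main__":
    print(local_account("/C=IT/O=INFN/OU=Personal Certificate/CN=Some User"))  # cms001
```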

Security issues (2)
- LAN access control
  - Packet filters on the border router (a toy example follows this slide)
    - Use of more sophisticated firewalls to be considered
  - Limit traffic to known active services
  - Centralized logging for automatic filtering
  - NIDS under consideration
    - Requires manpower!
- Server configuration
  - Completely under our control
  - Use of the on-board firewall
    - Filter all unnecessary ports
  - Upgrade of vulnerable packages
    - RedHat Network alerts, CERT alerts, etc.
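As a toy example of the "limit traffic to known active services" policy, the sketch below generates host firewall rules that allow a whitelist of ports and drop everything else. The service list and the choice of iptables are illustrative assumptions; the script only prints the commands and does not apply them.

```python
# Toy generator of whitelist firewall rules (prints commands only).
KNOWN_SERVICES = {"ssh": 22, "http": 80, "nfs": 2049}  # invented whitelist

def firewall_rules(services: dict[str, int]) -> list[str]:
    rules = ["iptables -P INPUT DROP",             # default policy: drop everything
             "iptables -A INPUT -i lo -j ACCEPT"]  # keep loopback traffic working
    for name, port in services.items():
        rules.append(f"iptables -A INPUT -p tcp --dport {port} -j ACCEPT  # {name}")
    return rules

if __name__ == "__main__":
    print("\n".join(firewall_rules(KNOWN_SERVICES)))
```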

Hw allocation
  Experiment    Number of servers
  ALICE          8
  ATLAS          4
  CMS           13
  LHCb           5
  VIRGO          7
  GRID          10

Conclusions
- INFN-TIER1 has begun an experimental service...
  - VIRGO, CMS, ATLAS, LHCb
  - DataGrid test-bed
  - DataTAG test-bed
- ... but we are still in a test phase
  - Studying and testing technological solutions
- The main goals of the prototype are:
  - Train people
  - Adopt standard solutions
  - Optimize resource use
  - Integration with the Grid