INFN GRID Production Infrastructure: Status and operation organization. Cristina Vistoli, CNAF. GDB Bologna, 11/10/2005



INFNGRID deployment status
Deployment status summary:
25 sites are registered in the GOCDB:
- 23 already updated to INFNGRID
- 2 upgrade to be defined (ESA-ESRIN and ROMA1-CMS)
14 additional sites are not yet in the GOCDB (but under regular control by the Italian ROC):
- 10 already updated
- 2 under certification (INFN-Parma and ENEA-INFO)
- 1 upgrade in progress (INFN-Genova)
- 1 upgrade to be defined (NAPOLI-VIRGO)
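The counts on this slide can be cross-checked with a short tally. This is a minimal Python sketch; the dictionary keys are illustrative labels chosen here, not GOCDB field names, and the combined totals are derived from the slide's figures rather than stated on it.

```python
# Site counts as stated on the slide; the keys are illustrative labels.
registered = {"updated": 23, "upgrade to be defined": 2}
not_registered = {
    "updated": 10,
    "under certification": 2,
    "upgrade in progress": 1,
    "upgrade to be defined": 1,
}


def total(counts):
    """Sum the per-status site counts."""
    return sum(counts.values())


registered_total = total(registered)          # sites registered in the GOCDB
not_registered_total = total(not_registered)  # sites not yet in the GOCDB
all_sites = registered_total + not_registered_total
updated_sites = registered["updated"] + not_registered["updated"]

print(registered_total, not_registered_total, all_sites, updated_sites)
```

The two subtotals match the 25 and 14 quoted on the slide.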

INFNGRID deployment status: resources

INFNGRID deployment status: services

INFNGRID deployment status: services

INFNGRID deployment status: services

INFNGRID features
It is essentially LCG with some additional features:
- Features/customizations already present in the previous releases:
  - new Network Monitor profile
  - improved support for LSF and MPI
  - support for additional VOs (managed via LDAP VO server): babar, zeus
  - support for additional VOs (managed via VOMS server): infngrid, cdf, gridit, compchem, planck, bio, enea, theophys, ingv, inaf, virgo, argo
  - support for MPI jobs via home synchronisation with scp with host-based authentication
  - DGAS (DataGrid Accounting System)
- New customizations:
  - support for argo VO

Production Infrastructure: Resources [figure; unlabeled values: 844, 438]

DGAS
- Usage records are collected by DGAS and stored in two different Home Location Registers (HLR), one for the resources and one for the VOs.
- Data are collected for jobs submitted via the IT Resource Brokers.
- Accounting data will be provided to APEL.
- A prototype web interface (DGAS web monitor) has been developed to get accounting data from the HLR with various levels of aggregation and 3 views (user, resource and VO). Access to data is controlled by means of certificates and ACLs.
- A DGAS functional test that checks whether DGAS is working on a resource has been developed and is currently under test.
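The three views of the web monitor amount to grouping the same usage records by a different key. The following Python sketch illustrates that idea only; the record fields, host names and values are hypothetical and do not reflect the actual DGAS or HLR schema.

```python
from collections import defaultdict

# Hypothetical usage records; field names are illustrative, not the DGAS schema.
records = [
    {"user": "alice", "resource": "ce01.example.org", "vo": "cdf", "cpu_hours": 2.0},
    {"user": "bob", "resource": "ce01.example.org", "vo": "virgo", "cpu_hours": 1.5},
    {"user": "alice", "resource": "ce02.example.org", "vo": "cdf", "cpu_hours": 0.5},
]

VIEWS = ("user", "resource", "vo")


def aggregate(usage_records, view):
    """Sum CPU hours per value of the chosen view: 'user', 'resource' or 'vo'."""
    if view not in VIEWS:
        raise ValueError(f"unknown view: {view}")
    totals = defaultdict(float)
    for record in usage_records:
        totals[record[view]] += record["cpu_hours"]
    return dict(totals)


for view in VIEWS:
    print(view, aggregate(records, view))
```

Access control through certificates and ACLs, which the slide mentions, is deliberately left out of this sketch.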

DGAS Web monitor (VO view)

DGAS Web monitor (resource view)

Pre-production activities
The CNAF site is already part of the PPS. Two more sites (Bari and Padova) will join the PPS infrastructure soon.

Pre-production activities
In addition to the standard PPS activities we want to test the functionality, stability and performance of the gLite WMS interfaced to a production BDII. If the tests are satisfactory, a gLite WMS could be deployed as a core service in addition to the LCG Resource Brokers.

Certification services
INFN Grid Certification Testbed
- to test and certify the Grid software developed inside the INFN: gLite and LCG.
- to certify the installation of new INFN-GRID releases
- Five sites: INFN-TORINO, INFN-PADOVA, INFN-CNAF, INFN-ROMA1 and INFN-BARI.
- The activity is carried out in close collaboration with the INFN-LCG-EGEE development teams, the EGEE Pre-Production Service, ECGI and the Experiment task forces

CNAF CERTIFICATION / PRE-PRODUCTION [diagram]
Diagram labels: LCG Site; gLite-1.3 Site; gLite-1.2 Site; APT Repository; ALL PPS; Release Creation/Test; Cert Sites; EGEE Production BDII; Services for PBOX TESTS
Hosts:
- cert-mon-it (1.2 R-GMA server with Registry/Schema)
- cert-rb-02 (WMS+LB)
- cert-rls-01 (gLite 1.2 FireMan Cat.)
- glite-rb-00 (1.4 WMS+LB)
- pre-ui-01 (gLite 1.1 UI)
- cert-voms-01 (gLite 1.3 VOMS Server)
- cert-voms-02 (gLite 1.1 VOMS Server)
- cert-ui-01 (gLite 1.2 with bulk UI)
- cert-rb-01 (1.2 WMS+LB)
- cert-mon (gLite 1.2 R-GMA Server)
- devrb (rb)
- devui (ui)
- cert-rb-03 (gLite 1.4 WMS+LB)
- cert-pbox-01 (PBOX server)
- cert-bdii-01 (LCG BDII)
- +3 servers dedicated to STORM tests

PADOVA CERTIFICATION / PRE-PRODUCTION [diagram]
Diagram labels: gLite-1.3 Site (twice); Cert Sites
Hosts:
- cert-mon (1.3 R-GMA server)
- cert-ui-01 (gLite 1.4 UI + Bulk)
- cert-rb-01 (gLite-1.4 WMS+LB)
- pre-ce-01
- pre-wn-01, pre-wn-02, pre-wn-03
- pre-se-01

BARI CERTIFICATION [diagram]
Diagram label: gLite-1.2 Site
Hosts:
- pccms7
- alicegrid1
- alicegrid4
- pccms10 (gLite 1.4 UI)

ROMA1 CERTIFICATION [diagram]
Hosts:
- grid-cert-01 (gLite 1.3 UI)
- grid-cert-02 (gLite 1.3 CE/WN)
- + 3 servers dedicated to storage tests

TORINO CERTIFICATION [diagram]
Hosts:
- grid007 (gLite 1.4 UI/RB)
- grid006 (gLite 1.4 CE/WN/RGMA)

Release and documentation
- Documentation: site installation guide, release notes, ...
- Software repository
- Site management guide
- FRY is a tool developed by the Release and Documentation group of the SA1 Italian ROC to quickly run a set of basic tests on all the grid elements (CE, SE, RB, WN, ...). The idea is to increase the speed and reliability of the release certification phase by running a "standard" set of tests that automatically detect configuration/setup problems (daemons, permissions and ownership of some directories, ...).
- DGAS checklist [new]: the DGAS developers produced this document to check whether the DGAS configuration is OK
- UiPNP
- Installation of LCG 2.6 on IA64
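As an illustration of the kind of basic test described for FRY, the Python sketch below checks the permissions and ownership of a directory. It is not FRY's implementation, which the slides do not show; the function name, the expected modes and the temporary demo directory are assumptions made for this example, and it relies on a Unix-like system (`os.getuid`).

```python
import os
import stat
import tempfile


def check_directory(path, expected_mode, expected_uid=None):
    """Return a list of problems found for one directory (empty list means OK)."""
    if not os.path.isdir(path):
        return [f"{path}: not a directory"]
    problems = []
    info = os.stat(path)
    mode = stat.S_IMODE(info.st_mode)  # permission bits only
    if mode != expected_mode:
        problems.append(f"{path}: mode {mode:o}, expected {expected_mode:o}")
    if expected_uid is not None and info.st_uid != expected_uid:
        problems.append(f"{path}: owner uid {info.st_uid}, expected {expected_uid}")
    return problems


# Self-contained demonstration on a temporary directory.
with tempfile.TemporaryDirectory() as demo_dir:
    os.chmod(demo_dir, 0o700)
    ok = check_directory(demo_dir, 0o700, os.getuid())  # matches: no problems
    bad = check_directory(demo_dir, 0o755)              # wrong mode: one problem

print(ok)
print(bad)
```

Returning a list of problems rather than a boolean lets one run report every misconfiguration found on a grid element instead of stopping at the first.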

Release and documentation

Central Management Team: Site Certification
The CMT is responsible for certification: checking the functionality of a site before it joins the production grid. In particular it checks:
- GIIS information consistency
- Local job submission (LRMS)
- Grid submission with Globus (globus-job-run)
- Grid submission with the Resource Broker
- ReplicaManager functionalities
To certify a site the CMT uses dedicated grid services:
- RB: gridit-cert-rb.cnaf.infn.it
- BDII: gridit-cert-rb.cnaf.infn.it
In this way we avoid having an uncertified site in the production grid. The same grid services should be used for test activities. The procedure is described in the following document: CMT's site certification procedure [PDF]
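The certification checks can be thought of as a checklist that must pass in full before a site is admitted. The Python sketch below shows that structure under stated assumptions: the check names follow the slide, but the `run_check` callable, the site name and the demo result are placeholders, since the real checks need a live site and the CMT's actual procedure is described only in the referenced document.

```python
# Check names follow the slide; the callables that run them are placeholders.
CHECKS = [
    "GIIS information consistency",
    "local job submission (LRMS)",
    "grid submission with Globus",
    "grid submission with the Resource Broker",
    "ReplicaManager functionalities",
]


def certify(site, run_check):
    """Run every check for a site; return (certified, list of failed checks)."""
    failed = [name for name in CHECKS if not run_check(site, name)]
    return (not failed, failed)


def demo_check(site, name):
    """Stand-in result: pretend only the Resource Broker submission fails."""
    return name != "grid submission with the Resource Broker"


certified, failed = certify("EXAMPLE-SITE", demo_check)
print(certified, failed)
```

Running all checks instead of stopping at the first failure is a choice made for this sketch, so that a site manager would see the complete list of problems in one pass.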

Supported VO

VOMS proxy [table]
Columns: VO, Apr, May, June, July, Aug, Sep, 01-09 Oct, total
Rows: argo, bio, cdf, compchem, enea, gridit, inaf, infngrid, ingv, planck, theophys, virgo, total

Job status 10/oct/

Job report 26/9 - 10/10

Support
First level support: Italian ROC shift
- The Italian ROC provides geographically based local front line support to Virtual Organizations, Users and Resource Centres
- Provided through daily shifts
- Check list to be covered during the shift
- Periodic (every 15 days) phone conference between ROC/CIC teams and site managers
- ROC report to GDA
Shift example, weekly based:
Second level support: CIC on Duty
- Weekly shift
- CIC tools

Support system
Problems
Communication:
- ROC on Duty and site managers
- Site managers to Central Management Team and vice versa
- Site certification during installation/upgrade
- GGUS to ROC

Ticket statistics
- starting date: August 2005
- 272 total
- 64 from GGUS (COD and user)
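The GGUS share of the tickets follows directly from the two counts above. A minimal Python sketch of the arithmetic; the variable names are chosen here for illustration, and the slide does not say where the remaining tickets came from.

```python
total_tickets = 272  # all tickets since August 2005 (from the slide)
ggus_tickets = 64    # tickets that came from GGUS (COD and user)

non_ggus_tickets = total_tickets - ggus_tickets
ggus_share = 100 * ggus_tickets / total_tickets

print(f"GGUS: {ggus_share:.1f}% of tickets; not from GGUS: {non_ggus_tickets}")
```

So a little under a quarter of the tickets arrived through GGUS.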

Application Testing

Number of jobs per VO since 18/7/2005 in INFNGrid