Developments in WLCG: Highlights


Claudio Grandi (INFN-Bologna)
Mini-workshop CCR, Legnaro, 17 January 2011

Outline
- Information System
- Workload management
- Multi-User Pilot Jobs
- Multi-core CPUs
- Virtualization
- CERNVM-FS
- Storage
- Amsterdam Jamboree - Demonstrators

Information System

Quoting F. Donno at the February GDB: "The [Glue] attributes used by middleware, experiments, monitoring and accounting tools are mostly discovery or semi-static status information."

Strategy:
- Improve the global reliability of the service: deployment of a well-managed set of top-level BDIIs
- Provide a more static view of the available resources: parallel deployment of "static"/semi-dynamic top-level BDIIs
- Improve the coherence and correctness of the information used: cleaning up and consolidating the IS attributes used by experiments, middleware, monitoring and accounting tools

It is not yet clear what the impact will be on the gLite-WMS match-making process.
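For concreteness, the discovery-style queries this strategy targets are plain LDAP searches against a top-level BDII. A minimal sketch (the endpoint shown is CERN's well-known top-level BDII; the filter and attribute list are illustrative):

    # List published CEs and their status from a top-level BDII
    ldapsearch -x -LLL -H ldap://lcg-bdii.cern.ch:2170 -b o=grid \
        '(objectClass=GlueCE)' GlueCEUniqueID GlueCEStateStatus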

Multi-User Pilot Jobs

- Late binding of the payload to resources: intra-VO scheduling is under VO control
- In use by 3.5 of the 4 LHC experiments
- Security concern: sites lose control over access
  - Sites require an identity change within the pilot -> glexec
  - VOs can delegate part of the control to sites
- Running glexec on the WN requires an authorization server
  - ARGUS is ready for production but not yet in use
  - The next version of CREAM can use ARGUS: consistent authorization
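A minimal sketch of the identity change a pilot performs with glexec (the proxy path and payload script are placeholders; the environment variables are glexec's standard interface):

    # The pilot exports the payload owner's proxy and asks glexec to run
    # the payload under the local account mapped to that proxy.
    export GLEXEC_CLIENT_CERT=/tmp/payload_proxy.pem   # placeholder path
    export GLEXEC_SOURCE_PROXY=/tmp/payload_proxy.pem
    /usr/sbin/glexec /path/to/payload_wrapper.sh       # placeholder payload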

Exploitation of multi-core CPUs

To use multi-core CPUs more efficiently, the LHC experiments have been developing multi-core-aware applications capable of exploiting more than a single core via multi-processing and/or multi-threading. The experiments therefore ask for access to the whole node rather than to a single core.

This has an impact on the whole job management chain:
- experiment software and job management frameworks
- Grid middleware
- site configuration and accounting procedures

A WLCG task force has been created. The first step is to run tests at sites on dedicated queues that provide one job slot per node (a configuration sketch follows).
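As a sketch of such a dedicated queue on a Torque batch system (queue name and core count are assumptions; the qmgr directives are standard):

    # qmgr sketch: a test queue where every slot is a whole node
    create queue wholenode
    set queue wholenode queue_type = Execution
    set queue wholenode resources_default.nodes = 1:ppn=8   # assuming 8-core nodes
    set queue wholenode resources_max.nodes = 1
    set queue wholenode enabled = True
    set queue wholenode started = True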

Grid access to whole nodes

EGEE-III MPI WG recommendations: final version available at http://grid.ie/mpi/wiki/WorkingGroup

Next steps:
- Validate the new attributes in the CREAM-CE:
  - using a temporary JDL attribute (CERequirements), e.g. CERequirements = "WholeNodes = \"True\"";
  - using direct job submission to CREAM;
  - validation starting with some user communities (a JDL sketch follows this list).
- Modify CREAM and BLAH so that the new attributes can be used as first-level JDL attributes, e.g. WholeNodes = True; coming with CREAM 1.7 (EMI 1 release, ~April 2011).
- Support for the new attributes in the WMS will be automatic.

(From I. Bird, Multi-core and Virtualisation Workshop, CERN, June 2010, plus updates.)
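A sketch of a whole-node JDL submitted directly to a CREAM CE (the CE endpoint, queue and script names are placeholders; the CERequirements line uses the temporary syntax quoted above):

    # wholenode.jdl
    [
      Executable    = "run_multicore.sh";         # placeholder payload
      InputSandbox  = { "run_multicore.sh" };
      StdOutput     = "out.txt";
      StdError      = "err.txt";
      OutputSandbox = { "out.txt", "err.txt" };
      CERequirements = "WholeNodes = \"True\"";   # temporary attribute
    ]

submitted directly to CREAM, bypassing the WMS:

    glite-ce-job-submit -a -r cream-ce.example.site:8443/cream-pbs-wholenode wholenode.jdl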

Virtualization

Virtualization may ease:
- the support of multiple VOs at a site
- the inclusion of a site in the VO infrastructure

HEPiX WG approach:
- Policy for the creation of trusted virtual images: VOs prepare images according to policies accepted by sites
- Contextualization: sites have hooks to customize the VO image to the site configuration (see the sketch below)
- Cataloguing and distribution: CVMFS may be used to distribute the images

WNoDeS approach: focus on the site. Sites create images in response to VO requests and keep full responsibility for their maintenance.
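A purely illustrative sketch of a site contextualization hook (every path and value here is an assumption; the working group defines the hook mechanism, not these contents): the trusted image only guarantees that a site-provided script is executed at boot.

    #!/bin/sh
    # Hypothetical site hook run at first boot of the VO image
    echo "http://squid.example.site:3128" > /etc/site/http_proxy     # local proxy
    cp /context/site-ca-bundle.pem /etc/grid-security/certificates/  # site CAs
    /sbin/service ntpd restart                                       # site time source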

CERNVM-FS

CERNVM-FS makes a directory tree stored on a web server look like a local read-only file system. Developed to deliver software distributions onto virtual machines, it is currently being tested for distributing experiment software to sites.

- Better performance than NFS/AFS
- Eases the software distribution process
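On the client side the setup is minimal. A sketch (the repository name is a real example; the proxy URL is a placeholder):

    # /etc/cvmfs/default.local
    CVMFS_REPOSITORIES=cms.cern.ch
    CVMFS_HTTP_PROXY="http://squid.example.site:3128"   # site squid, placeholder

The repository then appears on demand under the mount point, e.g. ls /cvmfs/cms.cern.ch, with files fetched over HTTP and cached locally.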

A change in perspective?

MUPJ + "whole node" + virtualization = Cloud?

Experiments just want a Worker Node service suited to their needs (e.g. running job "attractors"):
- The WNs join the experiment infrastructure
- Experiments take the responsibility of managing their specific services and handling the users' job flow
- Sites and infrastructure services just handle the resource allocation, including inter-VO accounting

This looks very much like a Cloud, but with a few important differences:
- Nodes are allocated for a definite amount of time, so that if the experiment does not use them they can be used by others
- Nodes are allocated without root privileges, so that they can benefit from being inside the site LAN
- Experiments still need to execute user jobs with glexec, so that the site keeps control of the authorization process

In other words: WNoDeS!

Storage Model Evolution

Motivations:
- Non-optimal use (waste) of disk resources for analysis: too many replicas of rarely accessed data
- More efficient use of network resources: the network is cheaper and more available than it used to be
- More controlled access to MSS

Strategies:
- Remote data access: xrootd, NFS 4.1, WebDAV, AFS, ... (see the example below)
- Dynamic data placement: pre-placement, caching
- Global namespaces, evolution of catalogues (p2p, etc.)
- MSS-disk split; the SRM interface may become redundant
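As an illustration of the remote-access strategy (host and file path are placeholders), a job can read a file straight from a remote xrootd server instead of requiring a pre-placed local replica:

    # Stream/copy a file from a remote xrootd server
    xrdcp root://xrootd.example.site//store/data/run123/file.root /tmp/file.root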

Amsterdam Demonstrators

[Status table of the demonstrator proposals: WLCG Workshop, London, July 2010, and WLCG GDB, CERN, January 2011]

What's happening...

Focus is on remote data access: xrootd, NFS 4.1, CHIRP, caching (ARC and more).
- The most popular solution is xrootd (ALICE!): big investment by CERN-IT together with ATLAS, and concrete activities in CMS
- Interesting activities in dynamic data placement: mainly by ATLAS; LHCb did not report recently; CMS is starting to look into it

In my opinion the drawbacks of remote data access are underestimated. In particular, we lack a proper authorization mechanism that prevents abuse. We should invest more in dynamic data placement!

Conclusions

A few important components are being deployed on the WLCG infrastructure and need extensive testing: CREAM, ARGUS, glexec on the WNs.

WLCG attitude towards service support:
- Consolidation and simplification of low-level services, not necessarily through standard tools (e.g. xrootd)
- Reduction of support for high-level services, which are more and more considered a business internal to the experiments, in particular job management and any component functional to its operation

It may be that my personal perspective has changed, but I have the impression that contributions by external (grid middleware and infrastructure) projects have less and less influence on WLCG strategies.