AFS/LSF Phase Out personal experience

AFS/LSF Phase Out personal experience R. De Maria

Use cases for AFS and LSF
OpenAFS is used in ABP for:
- Working directory when running on Lxplus and Scientific Linux distributions.
- Storage space for simulation campaigns with LSF and BOINC (LHC@Home).
- Sharing development and production files (data and applications), anonymously and with access control, within CERN (GPN and TN) and outside CERN.
- Hosting websites.
LSF is used in ABP for:
- Simulation campaigns by teams and individuals.
- Decades of accumulated simulation and analysis submission jobs, which are time-consuming to update.

Example: Typical Workflow
user1:
    ssh lxplus
    cd /afs/cern.ch/user/u/user1/project
    vi code.c
    make
    cp code /afs/cern.ch/user/u/user1/public/project
user2:
    cp code ~/myproject
    cd myproject
    bsub -q 8nm code data
A very lightweight workflow for sharing work and computing resources. How will this change after the phase-out?
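For comparison, here is a minimal sketch of what the batch step (the bsub line) might look like under HTCondor. It is an illustration, not a tested CERN recipe: the file name job.sub is hypothetical, the executable and input names are taken from the example above, and the mapping of the 8nm queue to an HTCondor run-time limit is deliberately left out because it is site-specific. The script only writes the submit description file; the actual submission is left to the user.

```shell
#!/bin/sh
# Sketch only: a possible HTCondor counterpart of `bsub -q 8nm code data`.
# "job.sub" is a hypothetical file name; "code" and "data" come from the example.

write_submit_file() {
    # $1: path of the submit description file to create
    cat > "$1" <<'EOF'
executable              = code
arguments               = data
transfer_input_files    = data
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
output                  = job.$(ClusterId).$(ProcId).out
error                   = job.$(ClusterId).$(ProcId).err
log                     = job.$(ClusterId).log
queue
EOF
}

write_submit_file job.sub

# The job itself would then be submitted by hand on a node with HTCondor:
echo "Next step: condor_submit job.sub"
```

Note that, unlike the AFS workflow, this sketch transfers the executable and input file to the worker node explicitly instead of relying on a shared file system.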

AFS and LSF Phase Out: Process
My understanding is:
- AFS will be discontinued in 2019, but it is already becoming increasingly difficult to work with (client support is being lost, and IT support steers users towards alternatives).
- Alternatives to AFS for some use cases: EOS, CVMFS, DBonDemand.
- No alternative for the remaining OpenAFS use cases for the time being.
- LSF will be discontinued in 18 months.
- Alternative to LSF: HTCondor, meant to replace all use cases.

AFS and LSF Phase Out: Issues
- There is no single product to replace OpenAFS (EOS, CVMFS, DBonDemand for the time being). One has to learn and test multiple solutions. -> Very expensive for the users.
- There is no alternative product for the most used scenarios (working directory, batch jobs, many small files). One has to completely rethink software and frameworks. -> Even more expensive for the users.
- Documentation on the usage, features or known issues of the alternatives (EOS, CVMFS, HTCondor) is poor or missing. It is impossible to know in advance whether application X will work without dedicated testing. -> Again at the cost of the users.
- There is no clear timeline or roadmap ("EOS can evolve as CERN product, but is not clear how", "Feature X will not|might be implemented"), and there are contradictory messages on the future of the services. E.g. will LXPLUS and BATCH nodes have a shared file system (at all, supporting legacy applications, with access control)? -> Too many options to follow, again at the cost of the users.
- Some IT support is alleviating the cost (e.g. Laurence for porting a fraction of SixDesk).
- The phase-out process requires substantial user resources that were not allocated, at least by ABP.

AFS and LSF Phase Out: Technical issues
EOS features discovered in a few months of personal testing:
- Large space for users and projects.
- Fast read access.
- Limited maximum number of files.
- No anonymous read access.
- FUSE-mounted EOS is available on lxplus and SWAN, but has many issues with legacy applications (Vim, Jupyter, make, SQLite, …) that expect a well-behaved file system.
- Very slow at creating files and modifying existing ones.
- The EOS command-line tool cannot be easily compiled (or compiled at all) on Linux distributions. There are problems even on Scientific Linux.
- Today there is no support in the TN (under discussion, as far as I have heard).
Clearly not even close to a replacement for OpenAFS, but very good for serving big data files, and we are already using it for that.
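The kind of dedicated testing mentioned above can be as simple as timing the creation and modification of many small files on the file system under study. The following is a minimal sketch, not the procedure used for the observations in this talk: the function name bench_small_files and the variables TARGET_DIR and N_FILES are hypothetical, and the timing has one-second resolution only (it relies on `date +%s`, which is widely available but not strictly POSIX).

```shell
#!/bin/sh
# Sketch only: time the creation and modification of many small files.
# TARGET_DIR and N_FILES are illustrative parameters, not values from the talk.

bench_small_files() {
    dir=$1   # directory on the file system to test
    n=$2     # number of small files to create
    mkdir -p "$dir" || return 1
    start=$(date +%s)
    i=1
    while [ "$i" -le "$n" ]; do
        echo "payload $i" > "$dir/file_$i.txt"   # create a small file
        echo "update $i" >> "$dir/file_$i.txt"   # modify the existing file
        i=$((i + 1))
    done
    end=$(date +%s)
    echo "$n files created and modified in $((end - start)) s on $dir"
}

# Defaults: a fresh temporary directory and 200 files.
TARGET_DIR=${TARGET_DIR:-$(mktemp -d)}
N_FILES=${N_FILES:-200}
bench_small_files "$TARGET_DIR" "$N_FILES"
```

Running it once with TARGET_DIR pointing at an AFS directory and once at a FUSE-mounted EOS directory gives a rough side-by-side comparison for this specific access pattern.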

AFS and LSF Phase Out: Conclusion
From my personal experience:
- The AFS and LSF phase-outs are not transparent for the users.
- Following up the phase-out process (studying new products, testing them, interacting with IT developers, understanding the implications) requires substantial time.
- The possible outcomes, as of now, imply that presently working software has to be rethought and redeveloped, not simply ported. This will require substantial resources that are not presently available.
- My core activities have already been slowed down by following the phase-out process.