Evolution of storage and data management
Ian Bird
GDB: 12th May 2010

Discussion on evolving WLCG Storage and Data Management
Background:
  – Experiments’ concern with performance and scalability of access to data
      Particularly for analysis
  – Concern over long term sustainability of existing solutions
  – Recognize:
      Evolution of technology (networking, file systems, etc.)
      Huge amounts of disk available to experiments (on aggregate!)
      Other communities exist that have solved similar problems, or will have similar problems soon
  – Short term concerns: performance, scalability, bottlenecks (e.g. SRM)
  – The MONARC model assumed networking was a problem
      It is actually something we should invest more in

Scope
Focus on analysis use cases
  – Production is in hand
Start from viewpoint of user access to data
  – Hide details of back-end storage systems, etc.
Timescale for solutions in production
  – 2013 run
  – But with incremental working prototypes that address some of the short term concerns
Should not be seen as a reason for new development projects
  – Must use available tools as far as possible
  – Must keep long term sustainability in mind
Working model:
  – More network-centric than now
  – Few large archival repositories: long term data curation
  – “Cloud” of storage making optimal use of available capacity
Should be a single effort of the community
  – There may be funding opportunities, but we should make sure that they help us in our goals and in obtaining what we need
Must be driven by the real needs of the experiments...
  – But we must learn the lesson of SRM: too many unrealistic “requirements” and no consensus on implementation

Technical areas
Data archives and storage cloud
  – Simplify so that “tape” is really an archive
  – Allow remote data access when needed
  – Look into peer-to-peer technologies
Data access layer
  – E.g. Xrootd/GFAL + some intelligence to determine when to cache/when to use remote access
Output datasets
  – Still need a service for (asynchronous) movement of datasets to an archive
Global home directory facility
  – Model is a global file system. Industrial solutions available?
“Catalogues”
  – Still need to locate data. Issue of consistency between different storage systems.
Authorization mechanisms
  – For access to files in storage systems (archive + cloud), and quotas etc.
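The data access layer item above asks for "some intelligence" to decide between caching a file and reading it remotely. The slides do not say how that decision would be made. The following is only an illustrative sketch of one possible heuristic: compare the bytes moved over the network under each strategy. The names `AccessRequest` and `choose_access_mode`, and all of the inputs, are hypothetical and are not part of Xrootd, GFAL, or any WLCG tool.

```python
from dataclasses import dataclass


@dataclass
class AccessRequest:
    """Hypothetical description of one analysis job's access to one file."""
    file_size_bytes: int     # total size of the file
    fraction_read: float     # fraction of the file the job is expected to read (0..1)
    expected_reuses: int     # expected further reads of this file at this site
    cache_free_bytes: int    # free space in the local cache


def choose_access_mode(req: AccessRequest) -> str:
    """Return "cache" to copy the whole file locally, or "remote" to read over the network."""
    # A file that does not fit in the cache can only be read remotely.
    if req.file_size_bytes > req.cache_free_bytes:
        return "remote"
    # Remote access moves only the bytes read, but repeats for every later read.
    remote_bytes = req.file_size_bytes * req.fraction_read * (1 + req.expected_reuses)
    # Caching moves the whole file once; later reads are local.
    cache_bytes = req.file_size_bytes
    return "cache" if cache_bytes < remote_bytes else "remote"
```

Under these assumptions, a sparse one-off read stays remote, while a file that is read heavily and repeatedly is worth caching. A real decision would also need to account for latency, cache eviction, and site policy, none of which is modelled here.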

Next steps
Jamboree
  – 3-day workshop to look at existing tools in each of the areas: June in Amsterdam
Elaboration of a more concrete plan and timelines; set up working groups
  – Could be at WLCG workshop in London, July 7-9
Develop demonstrator prototypes in each area
  – Testing of components/technologies
Experiment testing
  – Integration into frameworks

Jamboree agenda – 1
10:00  Welcome & Introduction to the Jamboree - goals of the workshop [Ian]
10:15  Presentation of a strawman for a new model of data access and management - should include background and rationale behind this [Experiment person (Ian F/Kors?)]
11:00  Status of networking (cf. expectation in 2001), prospects including intercontinental and far (e.g. Asia) connectivity [David Foster]
11:45  Back-end storage - to be used as a true archive: implications/simplifications possible. Should discuss the need to not know about or have to access back-end storage (tape). ==> What should be the interface to storage? [Dirk Duellmann?]
12:30  Lunch
14:00  Needs from a data access layer - how analysis use needs to access data, etc. [Experiment person]
14:45  Data transfer use cases: peer-peer (many-many) or point-point. On demand vs scheduled. [Experiment person]
15:30  Coffee
15:45  Namespaces, authorization needs, quotas, catalogues - what is needed? [Experiment person]
16:30  The need for global home directory service - missing so far. What are the use cases? [Experiment person]
17:00  Conclusions - summary of discussion points, points arising. Progress on model? Short term improvements possible?

Agenda – 2
Technology
9:00   File systems. Summary of work on Hadoop, Lustre, GPFS, etc. [HEPiX WG?]
9:45   Xrootd - outlook [Fabrizio Furano]
10:30  Coffee
10:45  FTS, LFC - what is OK, what is not? What is still useful? [??]
11:30  AliEn FC - why is it interesting? [Pablo Saiz]
12:15  P2P technologies - can we simply adapt them? [??]
13:00  Lunch
14:00  ROOT - outlook and developments [Rene Brun]
Site Experiences
14:45  NDGF/ARC [Josva Kleist]
15:15  GridKa - xrootd with tape back-end [??]
15:45  Coffee
16:00  CNAF - GPFS/TSM/StoRM [??]
16:30  Hadoop at a Tier 2 [Brian Bockelman?]
17:00  Summary and conclusions - are there areas for potential early demonstrators?

Agenda – 3
9:00   Conclusions of the discussions: summaries from secretaries of Days 1 and 2
10:00  Next steps, points to address:
         - Draft an outline plan with timelines
         - Create working groups - agree mandates and convenors
         - Set expectations for July workshop
11:00  Summary: draft summary of the workshop and conclusions for presentation to wider community
12:00  Close