Final summary
Ian Bird
Amsterdam, DAaM, 18th June 2010

Potential Demonstrators
- Bockelman: xrootd demonstrator (with filesystems); should collaborate with the IT-DSS xrootd ideas? (Massimo)
- Stewart: + DIRAC/Panda dynamic data placement?
- CHIRP (an AFS-like filesystem): similar to the xrootd proposals?
- Behrman: ARC caching – general use?
- Baud/McCance: use MSG as a weak coupling to address data consistency (storage, catalogues, applications)? Or as a data-management information system?
- Simon Metson: CMS catalogues
- Oscar: catalogues – DHT, Bloom filters, MQ, AliEn FC? (see the sketch below)
- Pablo: LFC vs AliEn FC
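Several of the catalogue proposals above (DHT, Bloom filters, MQ) share one idea: a compact, loosely coupled summary of what a site holds, consulted before the authoritative catalogue. A minimal Python sketch of the Bloom-filter variant follows; it is illustrative only, and the class, sizes, and example LFN are my own assumptions, not anything proposed at the jamboree. A hit means "possibly at this site, ask the real catalogue"; a miss is definitive.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: a compact, lossy set of the LFNs a site claims to hold."""
    def __init__(self, size_bits=1 << 20, num_hashes=4):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, lfn):
        # Derive k bit positions from a single SHA-256 digest of the LFN.
        digest = hashlib.sha256(lfn.encode()).digest()
        for i in range(self.k):
            chunk = digest[i * 4:(i + 1) * 4]
            yield int.from_bytes(chunk, "big") % self.size

    def add(self, lfn):
        for pos in self._positions(lfn):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, lfn):
        # True means "possibly here" (check the real catalogue);
        # False means "definitely not here" (skip the catalogue query).
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(lfn))

# A site publishes its filter; a client pre-screens replica lookups with it.
site_filter = BloomFilter()
site_filter.add("/grid/atlas/data10/AOD.123456._0001.root")
assert site_filter.might_contain("/grid/atlas/data10/AOD.123456._0001.root")
```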

Potential Demonstrators (cont.)
- Dirk/Rene: proxy caches
- J-P/Gerd: NFS
- Jens: file access
- Jeff: CoralCDN

And...
- Need a network planning group (David F.)

Discussion items
- Data transfer:
  - Many suggestions, both here and in the contributed documents, but not all coherent; probably need a group to address this:
    - Can "FTS" plus the suggested fixes do what is needed?
    - Do we need an(other) asynchronous data transfer mechanism (job finishes, here is the output, deliver it to the archive)? (see the sketch below)
- SRM: is it dead?
  - Separate archives from caches: the archive interface is simpler (a subset of SRM?); the cache interface is filesystem(-like)
  - Not dead, but use only limited pieces
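The asynchronous transfer question ("job finishes, here is the output, deliver it to the archive") amounts to a local spooler that owns the retries, so the job slot is freed immediately. A hedged sketch under assumed names; the copy function is a hypothetical stand-in, not a real FTS or gridftp API:

```python
import queue, threading, time

# Sketch of fire-and-forget output delivery: the job hands its file to a
# local spooler and exits; the spooler owns retries in the background.
transfer_queue = queue.Queue()

def submit_output(local_path, archive_url):
    """Called by the job wrapper at job end; returns immediately."""
    transfer_queue.put((local_path, archive_url, 0))

def spooler(copy_fn, max_retries=5):
    """Background daemon: drains the queue, retrying failed transfers."""
    while True:
        local_path, archive_url, attempt = transfer_queue.get()
        try:
            copy_fn(local_path, archive_url)   # e.g. an xrootd/gridftp copy
        except OSError:
            if attempt + 1 < max_retries:
                time.sleep(2 ** attempt)       # crude exponential back-off
                transfer_queue.put((local_path, archive_url, attempt + 1))
        finally:
            transfer_queue.task_done()

# copy_fn is a no-op placeholder here; a real one would raise OSError on failure.
threading.Thread(target=spooler, args=(lambda src, dst: None,), daemon=True).start()
submit_output("/scratch/job42/output.root",
              "srm://archive.example.org/atlas/output.root")
```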

Discussion items
- Access protocol:
  - What is it we are trying to optimise? Not CPU! But that is all we measure today...
  - Can we have a common data access protocol? Is it xrootd?
  - Should better specify the metrics for success (see the sketch below)
- Monitoring, now:
  - We need to measure what we are doing so that we have something to compare against!
  - Real information from the MSS and other data management components is missing. We need it now.
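On "better specify the metrics for success": one hedged illustration of measuring something other than CPU is to aggregate per-job I/O records into a few access-oriented metrics. All field and function names here are hypothetical, not an existing monitoring schema:

```python
from dataclasses import dataclass

@dataclass
class JobRecord:
    cpu_seconds: float
    wall_seconds: float
    bytes_read_local: int
    bytes_read_remote: int

def efficiency_metrics(jobs):
    """Aggregate per-job records into access-oriented success metrics."""
    cpu = sum(j.cpu_seconds for j in jobs)
    wall = sum(j.wall_seconds for j in jobs)
    local = sum(j.bytes_read_local for j in jobs)
    remote = sum(j.bytes_read_remote for j in jobs)
    total = local + remote
    return {
        "cpu_over_wall": cpu / wall if wall else 0.0,          # the usual CPU metric
        "remote_read_fraction": remote / total if total else 0.0,
        "mb_read_per_cpu_second": total / (1e6 * cpu) if cpu else 0.0,
    }
```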

And...
- Can we have a simulation? What are the metrics?
- Security issues:
  - WN access to the WAN
  - Policies
- Demonstrators: follow up in the GDBs

My personal summary
- Storage:
  - Separate the archive (tape) and cache systems:
    - Simplifies the interfaces to both
    - Allows industrial (standard) solutions for archives
    - Never read tapes
- Data access layer:
  - Need a combination of data placement and caching
  - Effective caches can reduce (or optimise existing) space usage (separate the tools from the policies)
  - Several potential caching mechanisms
  - Can't assume that jobs find all of the files they need at a site: get 90% locally and remote-access (cache) the rest (see the sketch below)
  - Can't assume that catalogues are fully up to date
  - The model of access is filesystem(-like)
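The "get 90% locally, remote-access the rest" model can be illustrated with a minimal cache-or-fetch wrapper. This is a sketch under assumed names (CACHE_ROOT and fetch_remote are hypothetical; fetch_remote stands in for an xrootd/HTTP read), not a proposed implementation:

```python
import os, shutil

CACHE_ROOT = "/cache"   # hypothetical site cache mount point

def open_input(lfn, fetch_remote):
    """Open an input file, preferring the site cache and falling back to remote access."""
    cached = os.path.join(CACHE_ROOT, lfn.lstrip("/"))
    if not os.path.exists(cached):              # cache miss: go remote
        os.makedirs(os.path.dirname(cached), exist_ok=True)
        # fetch_remote(lfn) yields a readable file-like object for the remote replica.
        with fetch_remote(lfn) as src, open(cached, "wb") as dst:
            shutil.copyfileobj(src, dst)        # populate the cache for the next job
    return open(cached, "rb")                   # local, filesystem-like access
```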

Summary (2)
- Data transfers:
  - Need a reliable way to move data from a job to an archive (or point to point)
  - Need a data placement mechanism
  - Need a transport for caching
  - Need a remote access mechanism
- Namespaces, catalogues, authz, quotas, etc.:
  - Want dynamic catalogues that reflect the changing contents of storage (see the sketch below)
  - Could be LFC + MQ, DHT, Bloom filters, AliEn FC (?)
  - Computing models should recognise that this information is a best guess (not 100% reliable)
- Grid-wide home directory:
  - Is needed
  - Technologies? How to do this?
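A dynamic catalogue of the LFC + MQ flavour can be sketched as storage elements publishing add/delete events that the catalogue applies asynchronously, which is exactly why its contents are a best guess rather than 100% reliable. A real deployment would use a broker (e.g. the MSG messaging system mentioned earlier); in this minimal sketch a plain deque stands in for the queue and all names are hypothetical:

```python
import json, collections

message_bus = collections.deque()            # stand-in for a real MQ broker

def publish(event_type, site, lfn):
    """Called by a storage element after a file lands or is removed."""
    message_bus.append(json.dumps({"type": event_type, "site": site, "lfn": lfn}))

catalogue = collections.defaultdict(set)     # lfn -> set of sites (best guess)

def drain():
    """Catalogue side: apply whatever events have arrived so far."""
    while message_bus:
        ev = json.loads(message_bus.popleft())
        if ev["type"] == "add":
            catalogue[ev["lfn"]].add(ev["site"])
        else:
            catalogue[ev["lfn"]].discard(ev["site"])

publish("add", "CERN-PROD", "/grid/data/file1")
drain()
assert "CERN-PROD" in catalogue["/grid/data/file1"]
```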