MICE Data Flow Workshop
Henry Nebrensky, Brunel University
Data Flow Workshop, 30 June 2009

MICE and Grid Data Storage
- The Grid provides MICE not only with computing (number-crunching) power, but also with a secure global framework giving users access to data.
  - Good news: storing development data on the Grid keeps it available to the collaboration – not stuck on an old PC in the corner of the lab.
  - Bad news: loss of ownership – who picks up the data curation responsibilities?
- Data can be downloaded from the Grid to a user's "own" PC – it doesn't need to be analysed remotely (a download sketch follows below).
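Purely as an illustration of that last point: a minimal sketch of pulling a file down to a local PC by its logical file name, assuming the LCG data-management client (lcg-cp) and a valid Grid proxy are available; the LFN and destination path are placeholders, not real MICE files.

```python
import subprocess

# Placeholder LFN and destination - not real MICE files or paths.
lfn = "lfn:/grid/mice/MICE/Step1/RAW/run01234.tar"
dest = "file:///home/user/data/run01234.tar"

# lcg-cp resolves the LFN via the LFC and copies one replica to local disk.
subprocess.run(["lcg-cp", "--vo", "mice", lfn, dest], check=True)
```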

MICE Data Flow
- The basic data flow in MICE is thus something like:
  - The raw data files from the experiment are sent to tape using Grid protocols, including registering the files in the LFC.
  - The offline reconstruction can then use Grid/LFC to pull down the raw data, and upload reconstructed ("RECO" or DST) files.
  - Users can use Grid/LFC to access the RECO files they want to play with.
- Combining the above description with the Grid services and the work being done by current users gives the diagram on the next slide. (A registration sketch follows below.)
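A minimal sketch of the first step under stated assumptions: the LCG client tool lcg-cr is used to copy a RAW file to a tape-backed storage element and register it in the LFC in one go. The SRM endpoint, LFN and local path are placeholders, not the agreed MICE configuration.

```python
import subprocess

raw_file = "file:///data/daq/run01234.tar"           # local RAW file (placeholder)
lfn = "lfn:/grid/mice/MICE/Step1/RAW/run01234.tar"   # logical file name (placeholder)
dest_se = "srm-mice.gridpp.rl.ac.uk"                 # tape-backed SE hostname (placeholder)

# lcg-cr copies the file to the storage element and registers the resulting
# replica in the LFC under the chosen logical file name.
subprocess.run(
    ["lcg-cr", "--vo", "mice", "-d", dest_se, "-l", lfn, raw_file],
    check=True,
)
```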

MICE Data Flow Diagram
- Short-dashed lines indicate entities that still need confirmation.
- Question marks indicate even higher levels of uncertainty.
- More details in MICE Note 252.
- The diagram would look pretty much the same if non-Grid tools were used.

MICE Data and the Grid
- Storage, archiving and dissemination of experimental data:
  - Has not been a high priority so far.
  - The overall strategy is not documented anywhere obvious.
  - There has been individual work on parts of this – but do the pieces fit together?
- Grid:
  - Certain Grid services are separately funded to provide a production service to MICE.
  - The Grid provides a ready-made set of building blocks – but "we" have to put them together.
  - MICE needs to know what it wants to do, to make sure that the finished edifice meets all its needs (and that the Grid includes all the necessary bricks).

Decision Time
We need to start putting the pieces together NOW, including requesting sufficient resources from outside bodies.
=> We need an agreed plan in the VERY near future.
There are a number of unresolved issues – see Note 252 and the data flow diagram:
- Data volumes, lifetime and access control are mostly unclear.
- (LFC) File naming scheme – see MICE Note 247.
- File metadata requirements – raised at CM23 and CM24.
- Management and administration.
Hence this workshop.

MICE Data Unknowns
- MICE Note 252 identifies four main flavours of data: RAW, RECO, analysis results, and Monte Carlo simulation.
- For all four, we need to understand the:
  - volume (the total amount of data, the rate at which it will be produced, and the size of the individual files in which it will be stored)
  - lifetime (ephemeral or longer lasting? will it need archiving to tape? replication?)
  - access control (who will create the data? who is allowed to see it? can it be modified or deleted, and if so who has those privileges?)
  - "service level" (desired availability? allowable downtime?)
- We also need to identify use cases I've missed, especially ones that will need more VOMS roles or CASTOR space tokens. (A sketch of this requirements template follows below.)
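To make the template concrete, a minimal sketch of how these per-flavour requirements could be recorded; the field names are illustrative, and the only values filled in are the provisional RAW numbers quoted later in this talk.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DataFlavourRequirements:
    """One instance per data flavour (RAW, RECO, analysis, Monte Carlo)."""
    name: str
    total_volume_tb: Optional[float] = None   # total amount of data
    rate_mb_per_s: Optional[float] = None     # production rate
    file_size_gb: Optional[float] = None      # size of individual files
    archive_to_tape: Optional[bool] = None    # lifetime / archiving
    replicate: Optional[bool] = None
    creator: Optional[str] = None             # access control: who creates it
    readers: Optional[str] = None             # who may see it
    modifiable: Optional[bool] = None
    availability: Optional[str] = None        # "service level"
    allowable_outage_hr: Optional[float] = None

# RAW is the only flavour with (provisional) numbers so far; the other three
# flavours would start with every field left as None, i.e. "???".
raw = DataFlavourRequirements(
    name="RAW", total_volume_tb=27, rate_mb_per_s=30, file_size_gb=2,
    archive_to_tape=True, creator="archiver", readers="all", modifiable=False,
    availability="write 24/7 while ISIS is up", allowable_outage_hr=48,
)
```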

What will users want to do?
Another way of answering the same question comes from looking at what users will want to do: which sorts of data will they want access to, how much of it, and how often?
- RAW or just RECO?
- Selected runs or a whole Step at a time?
- Daily? Monthly?

MICE Data - RAW
For RAW data:
- Volume: total amount of data: 27 TB; production rate: 30 MB/s; individual file size: 1-2 GB.
- Lifetime: permanent; archiving to tape: yes; replication: ?
- Access control: created by the archiver; visible to all; cannot be modified or deleted.
- "Service level": desired availability: write 24/7 if ISIS is up; allowable outage: 48 hrs.
(Tape Storage – see … and …)
(Based on the CM million-events figure. That implies that all MICE Steps add up to less than a fortnight's data taking – is that right? A worked check follows below.)
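A quick sanity check of that fortnight estimate, using only the figures quoted on this slide (27 TB total at a sustained 30 MB/s, ~2 GB files); the elided event count is not needed for this check.

```python
# Sanity check of the RAW data figures above.
total_bytes = 27e12   # 27 TB total RAW data
rate = 30e6           # 30 MB/s sustained write rate
file_size = 2e9       # ~2 GB per file (upper end of 1-2 GB)

days = total_bytes / rate / 86400
n_files = total_bytes / file_size
print(f"continuous data taking: {days:.1f} days")   # ~10.4 days, i.e. under a fortnight
print(f"files to catalogue:     {n_files:,.0f}")    # ~13,500 files at 2 GB each
```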

MICE Data - RECO
For RECO data:
- Volume: total amount of data: ???; production rate: ??? MB/s; individual file size: ??? GB.
- Lifetime: ephemeral or longer lasting: ???; archiving to tape: ???; replication: ???
- Access control: created by: ???; visible to all; can it be modified or deleted: ???, and if so who has those privileges?
- "Service level": desired availability: write ??? if ISIS is up; allowable outage: ??? hrs.
(I've seen a claim of 6 TB for RECO data somewhere.)

MICE Data - Analysis
For analysis output:
- Volume: total amount of data: ???; production rate: ??? MB/s; individual file size: ??? GB.
- Lifetime: ephemeral or longer lasting: ???; archiving to tape: ???; replication: ???
- Access control: created by: ???; visible to all; can it be modified or deleted: ???, and if so who has those privileges?
- "Service level": desired availability: write ??? if ISIS is up; allowable outage: ??? hrs.

MICE Data - Simulation
For simulations:
- Volume: total amount of data: ???; production rate: ??? MB/s; individual file size: ??? GB.
- Lifetime: ephemeral or longer lasting: ???; archiving to tape: ???; replication: ???
- Access control: created by: ???; visible to all; can it be modified or deleted: ???, and if so who has those privileges?
- "Service level": desired availability: write ??? if ISIS is up; allowable outage: ??? hrs.

MICE Data - Other
What other data is to be kept?
- Volume: total amount of data: ???; production rate: ??? MB/s; individual file size: ??? GB.
- Lifetime: ephemeral or longer lasting: ???; archiving to tape: ???; replication: ???
- Access control: created by: ???; visible to all; can it be modified or deleted: ???, and if so who has those privileges?
- "Service level": desired availability: write ??? if ISIS is up; allowable outage: ??? hrs.
I know about the Tracker QA data.

Data Integrity
- (For recent SE releases) a checksum is calculated automatically when a file is uploaded.
- This can be checked when the file is transferred between SEs, or the value can be retrieved to check local copies.
- Should we also compute it ourselves before uploading the file in the first place, or should we use "compression" (integrity can then be checked with gunzip -t …)? A checksum sketch follows below.
- (The default algorithm is Adler32 – lightweight and effective.)
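A minimal sketch of computing the Adler-32 checksum of a file locally before upload, so that the value can later be compared with the one recorded by the storage element; the file name is a placeholder.

```python
import zlib

def adler32_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the Adler-32 checksum of a file as eight hex digits."""
    checksum = 1  # Adler-32 initial value
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            checksum = zlib.adler32(chunk, checksum)
    return f"{checksum & 0xFFFFFFFF:08x}"

print(adler32_of_file("run01234.tar"))  # placeholder file name
```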

Metadata Catalogue
- For many applications – such as analysis – you will want to identify the list of files containing the data that matches some parameters.
- This is done by a "metadata catalogue". For MICE this doesn't yet exist.
- A metadata catalogue can in principle return either the GUID or an LFN – it shouldn't matter which, as long as it's properly integrated with the other Grid services.

MICE Metadata Catalogue
- We need to select a technology to use for this:
  - use the configuration database? (no)
  - gLite AMGA (who else uses it – will it remain supported?)
  - ?
- We then need to implement it – i.e. register metadata against files.
- What metadata will be needed for analysis?
- Should the catalogue include the file format and compression scheme (gzip ≠ PKzip)?

MICE Metadata Catalogue for Humans
Or, in non-Gridspeak:
- We have several databases (configuration DB, EPICS, e-Logbook) where we should be able to find all sorts of information about a run/timestamp.
- But how do we know which runs to be interested in for our analysis?
- We need an "index" to the MICE data, and for this we need to define the set of "index terms" that will be used to search for relevant datasets.

If I wanted to analyse some data…
…I might search for all events with a particular:
- Run, date/time
- Step
- Beam – e⁻, π, p, μ (back or forward)
- Nominal 4-d / transverse normalised emittance
- Diffuser setting
- Nominal momentum
- Configuration:
  - Magnet currents (nominal)
  - Physical geometry
- Absorber material
- Some RF parameter?
- MC Truth?
Anything else? (A sketch using these index terms follows below.)
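Purely to make the discussion concrete: a sketch of a per-run metadata record built from the index terms above, with a trivial in-memory query. The field names and example values are invented for illustration; the real catalogue would be a Grid service (e.g. AMGA), not a Python list.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class RunMetadata:
    run: int
    start_time: str        # date/time (ISO format)
    step: str              # MICE Step
    beam: str              # "e-", "pi", "p", "mu" ...
    emittance_mm: float    # nominal 4-d / transverse normalised emittance
    diffuser: str
    momentum_mev: float    # nominal momentum
    absorber: str

def select_runs(catalogue: List[RunMetadata], beam: str, momentum_mev: float) -> List[int]:
    """Return the run numbers matching a beam species and nominal momentum."""
    return [r.run for r in catalogue if r.beam == beam and r.momentum_mev == momentum_mev]

# Invented example records - not real MICE runs or settings.
catalogue = [
    RunMetadata(1234, "2009-06-30T10:00", "Step1", "mu", 6.0, "setting A", 200.0, "none"),
    RunMetadata(1235, "2009-06-30T12:00", "Step1", "pi", 6.0, "setting A", 300.0, "none"),
]
print(select_runs(catalogue, beam="mu", momentum_mev=200.0))  # -> [1234]
```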

People and Roles
- A "role" is a combination of duties and privileges, with a specific aim. These are distinct from those of the person fulfilling that role.
- The Operations Manager ("MOM") is an example of a continuous role, enacted by different people over time.
- Some roles may be so specialised that only a particular person can do them; others can have many people in them at the same time.
=> Don't equate roles with FTEs!
For the data flow, the privileges associated with roles are enforced by VOMS. They may also require space tokens to be set up (on a site-by-site basis).

Management
Roles identified so far:
- Online reconstruction manager
- Archiver (storage of RAW data to tape)
- Archivist (storage of miscellaneous data to tape)
- Offline reconstruction manager
- Data Manager (moving data around Tier2s, LFC consistency)
- Simulation Production Manager
- Analysis manager?
- VO manager

File Catalogue Namespace (1)
- We also need to agree on a consistent namespace for the file catalogue.
- Proposal (MICE Note 247, Grid talk at CM23):
- We are given /grid/mice/ by the server.
  - Five upper-level directories (the fifth, users/, is on the next slide):
    - Construction/  historical data from detector development and QA
    - Calibration/   needed during analysis (large datasets, c.f. the DB)
    - TestBeam/      test beam data
    - MICE/          DAQ output and corresponding MC simulation

File Catalogue Namespace (2)
- /grid/mice/users/name – for people to use as scratch space for their own purposes, e.g. analysis.
  - Encourage people to do this through the LFC – it helps avoid "dark data".
  - The LFC allows Unix-style access permissions.
- Again, the LFC namespace is something that needs to be finalised before production data can start to be registered. (A namespace sketch follows below.)
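A small sketch of building logical file names under the proposed /grid/mice/ layout. Only the five top-level directories come from the proposal above; the sub-directory convention in the examples is invented for illustration, not the scheme from MICE Note 247.

```python
TOP_LEVEL = ("Construction", "Calibration", "TestBeam", "MICE", "users")

def mice_lfn(area: str, *path_parts: str) -> str:
    """Build a logical file name under the proposed /grid/mice/ namespace."""
    if area not in TOP_LEVEL:
        raise ValueError(f"unknown top-level directory: {area}")
    return "/".join(("/grid/mice", area) + path_parts)

# Illustrative only - these sub-directories are placeholders, not the agreed scheme.
print(mice_lfn("MICE", "Step1", "RAW", "run01234.tar"))
# -> /grid/mice/MICE/Step1/RAW/run01234.tar
print(mice_lfn("users", "nebrensky", "scratch", "test.root"))
# -> /grid/mice/users/nebrensky/scratch/test.root
```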

The VOMS server (1)
- File permissions will be needed, e.g. to ensure that users can't accidentally delete RAW data. These rules will need to last for at least the life of the experiment.
- VOMS is a Grid service that allows us to define specific roles (e.g. DAQ data archiver) which will then be allowed certain privileges (such as writing to tape at the RAL Tier 1).
- The VOMS service then maps humans to those roles, via their Grid certificates.
- Thus the VOMS service provides us with a single portal where we can add/remove/reassign Mice, without needing to negotiate with the operators of every Grid resource worldwide – we actually keep control "in-house".

The VOMS server (2)
- The MICE VOMS server is provided via GridPP at Manchester, UK.
- New Mice are added or assigned to roles by the VO Manager (and Mouse) Paul Hodgson.
- The RAL CASTOR must query the VOMS server to authorise every transaction – should we move it somewhere local?
(A proxy sketch follows below.)
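For context, a sketch of how a Mouse would exercise one of these roles in practice: requesting a VOMS proxy that carries a specific role attribute, which storage elements then use to grant or refuse privileges. The role name "archiver" is a placeholder; the actual MICE role names are still to be agreed.

```python
import subprocess

# Request a short-lived proxy certificate carrying a specific VOMS role.
# "archiver" is a placeholder role name, not yet defined for MICE.
subprocess.run(
    ["voms-proxy-init", "--voms", "mice:/mice/Role=archiver"],
    check=True,
)
```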

Other Questions
- How does one run become the next – what triggers it, who confirms it, and how is it propagated?
- How does "data" come out of the DAQ and get turned into files?
- How do we know a run is complete => that a file is closed?
- Does the GB file size for CASTOR match the online reco sample rate? => Could the data mover trigger the online reco?
- That ol' online buffer round-robin thang.
- Replication of data to other Tier1s.
- Should EPICS monitor the data mover?

Actions arising
- Work out desired CASTOR resources, interface (SRM?) and QoS. Meet with the Tier1 and iterate.
- Draw up a list of VOMS roles and get them created.
- Draw up a list of space tokens by Tier and role.
- Create the LFC namespace, set permissions, upload existing data.
- Get an archiver robot certificate.
- Identify needed Tier2 resources.