Henry Nebrensky – Data Flow Workshop – 30 June 2009 MICE Data Flow Workshop Henry Nebrensky Brunel University 1.


1 Henry Nebrensky – Data Flow Workshop – 30 June 2009 MICE Data Flow Workshop Henry Nebrensky Brunel University 1

2 MICE and Grid Data Storage
- The Grid provides MICE not only with computing (number-crunching) power, but also with a secure global framework allowing users access to data
  - Good news: storing development data on the Grid keeps it available to the collaboration – not stuck on an old PC in the corner of the lab
  - Bad news: loss of ownership – who picks up the data curation responsibilities?
- Data can be downloaded from the Grid to a user’s “own” PC – it doesn’t need to be analysed remotely

3 MICE Data Flow
- The basic data flow in MICE is thus something like:
  - The raw data files from the experiment are sent to tape using Grid protocols, including registering the files in the LFC (LCG File Catalogue).
  - The offline reconstruction can then use Grid/LFC to pull down the raw data, and upload reconstructed (“RECO” or DST) files.
  - Users can use Grid/LFC to access the RECO files they want to play with.
- Combining the above description with the Grid and work being done by current users gives:

4 MICE Data Flow Diagram
- Short-dashed lines indicate entities that still need confirmation
- Question marks indicate even higher levels of uncertainty
- More details in MICE Note 252
- The diagram would look pretty much the same if non-Grid tools were used

5 MICE Data and the Grid
- Storage, archiving and dissemination of experimental data:
  - Not been a high priority so far
  - Overall strategy not documented anywhere obvious
  - Individual work on parts of this – but do the pieces fit together?
- Grid:
  - Certain Grid services are separately funded to provide a production service to MICE
  - Provides a ready-made set of building blocks – but “we” have to put them together
  - MICE need to know what they want to do, to make sure that the finished edifice meets all their needs (and that Grid includes all the necessary bricks)

6 Decision Time
We need to start putting the pieces together NOW, including requesting sufficient resources from outside bodies.
=> need an agreed plan in the VERY near future
There are a number of unresolved issues – see Note 252 and the data flow diagram.
  - Data volumes, lifetime and access control mostly unclear
  - (LFC) File naming scheme – see MICE Note 247
  - File metadata requirements – raised at CM23 and CM24
  - Management and administration
Hence this workshop.

7 MICE Data Unknowns
- MICE Note 252 identifies four main flavours of data: RAW, RECO, analysis results, and Monte Carlo simulation.
- For all four, we need to understand the:
  - volume (the total amount of data, the rate at which it will be produced, and the size of the individual files in which it will be stored)
  - lifetime (ephemeral or longer lasting? will it need archiving to tape? replication?)
  - access control (who will create the data? who is allowed to see it? can it be modified or deleted, and if so who has those privileges?)
  - “service level” (desired availability? allowable downtime?)
- Also need to identify use cases I’ve missed, especially ones that will need more VOMS roles or CASTOR space tokens.

8 What will users want to do?
Another way of answering the same question comes from looking at what users will want to do: which sorts of data will they want access to, how much of it and how often:
- RAW or just RECO?
- Selected runs or a whole step at a time?
- Daily? Monthly?

9 MICE Data - RAW
- For RAW data:
  - volume (total amount of data: 27 TB; rate at which it will be produced: 30 MB/s; size of the individual files in which it will be stored: 1-2 GB)
  - lifetime (ephemeral or longer lasting: permanent; will it need archiving to tape: yes; replication?)
  - access control (who will create the data: archiver; who is allowed to see it: all; can it be modified or deleted: no)
  - “service level” (desired availability: write 24/7 if ISIS is up; allowable outage: 48 hrs)
(Tape storage – see http://www.gridpp.rl.ac.uk/blog/2009/06/10/step09-tape-drive-performance/ and http://www.gridpp.rl.ac.uk/blog/2009/06/12/step09-tape-migration-stream-policies/)
(Based on the CM24 500 million events figure. That implies that all MICE steps add up to less than a fortnight’s data taking – is that right?)
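The "less than a fortnight" remark follows directly from the RAW figures quoted above. The short sketch below redoes that arithmetic; it assumes decimal units (1 TB = 10^12 bytes) and continuous data taking at the full 30 MB/s, neither of which is stated on the slide. The 48-hour buffer figure is an upper bound derived under the same assumptions, not a number from the talk.

```python
# Figures quoted on the RAW data slide (decimal units are an assumption).
TOTAL_BYTES = 27e12          # 27 TB total RAW volume
RATE_BYTES_PER_S = 30e6      # 30 MB/s production rate
N_EVENTS = 500e6             # CM24 figure: 500 million events
FILE_SIZE_RANGE = (1e9, 2e9) # 1-2 GB per file
OUTAGE_HOURS = 48            # allowable outage

# Time needed to produce the whole data set at the full rate.
seconds_of_running = TOTAL_BYTES / RATE_BYTES_PER_S
days_of_running = seconds_of_running / 86400

# Average event size implied by the total volume and the event count.
bytes_per_event = TOTAL_BYTES / N_EVENTS

# Number of files to register, for the largest and smallest file size.
min_files = TOTAL_BYTES / FILE_SIZE_RANGE[1]
max_files = TOTAL_BYTES / FILE_SIZE_RANGE[0]

# Upper bound on local buffering needed to ride out one allowable outage,
# assuming data taking continues at the full rate throughout.
outage_buffer_bytes = RATE_BYTES_PER_S * OUTAGE_HOURS * 3600

if __name__ == "__main__":
    print(f"running time : {days_of_running:.1f} days")
    print(f"event size   : {bytes_per_event / 1e3:.0f} kB")
    print(f"file count   : {min_files:.0f} to {max_files:.0f}")
    print(f"outage buffer: {outage_buffer_bytes / 1e12:.2f} TB")
```

Under these assumptions the whole RAW data set corresponds to roughly ten and a half days of running, which is consistent with the question raised on the slide.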

10 MICE Data - RECO
- For RECO data:
  - volume (total amount of data: ???; rate at which it will be produced: ??? MB/s; size of the individual files in which it will be stored: ??? GB)
  - lifetime (ephemeral or longer lasting: ???; will it need archiving to tape: ???; replication: ???)
  - access control (who will create the data: ???; who is allowed to see it: all; can it be modified or deleted: ???, and if so who has those privileges?)
  - “service level” (desired availability: write ??? if ISIS is up; allowable outage: ??? hrs)
(I’ve seen a claim of 6 TB for RECO data somewhere)

11 MICE Data - Analysis
- For analysis output:
  - volume (total amount of data: ???; rate at which it will be produced: ??? MB/s; size of the individual files in which it will be stored: ??? GB)
  - lifetime (ephemeral or longer lasting: ???; will it need archiving to tape: ???; replication: ???)
  - access control (who will create the data: ???; who is allowed to see it: all; can it be modified or deleted: ???, and if so who has those privileges?)
  - “service level” (desired availability: write ??? if ISIS is up; allowable outage: ??? hrs)

12 MICE Data - Simulation
- For simulations:
  - volume (total amount of data: ???; rate at which it will be produced: ??? MB/s; size of the individual files in which it will be stored: ??? GB)
  - lifetime (ephemeral or longer lasting: ???; will it need archiving to tape: ???; replication: ???)
  - access control (who will create the data: ???; who is allowed to see it: all; can it be modified or deleted: ???, and if so who has those privileges?)
  - “service level” (desired availability: write ??? if ISIS is up; allowable outage: ??? hrs)

13 MICE Data - Other
- What other data is to be kept?
  - volume (total amount of data: ???; rate at which it will be produced: ??? MB/s; size of the individual files in which it will be stored: ??? GB)
  - lifetime (ephemeral or longer lasting: ???; will it need archiving to tape: ???; replication: ???)
  - access control (who will create the data: ???; who is allowed to see it: all; can it be modified or deleted: ???, and if so who has those privileges?)
  - “service level” (desired availability: write ??? if ISIS is up; allowable outage: ??? hrs)
I know about the Tracker QA data.

14 Data Integrity
- (For recent SE releases) a checksum is calculated automatically when a file is uploaded.
- This can be checked when the file is transferred between SEs, or the value retrieved to check local copies.
- Should we also do it ourselves before uploading the file in the first place, or should we use “compression” (can check integrity with gunzip -t …)?
- (Default algorithm is Adler32 – lightweight + effective)
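Both options raised above can be tried locally with the Python standard library. The sketch below computes an Adler32 checksum of a file in chunks (so that multi-GB files need not fit in memory) and performs the same kind of integrity test as gunzip -t by decompressing a gzip file and discarding the output. The function names and the 8-digit hexadecimal output format are illustrative choices, not something specified in the talk; the format a given storage element reports should be checked before comparing values.

```python
import gzip
import zlib


def adler32_of_file(path, chunk_size=1 << 20):
    """Return the Adler32 checksum of a file as 8 lowercase hex digits."""
    checksum = 1  # Adler32 is defined to start from 1
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            checksum = zlib.adler32(chunk, checksum)
    return format(checksum & 0xFFFFFFFF, "08x")


def gzip_is_intact(path, chunk_size=1 << 20):
    """Decompress a gzip file and discard the data, like `gunzip -t`.

    Returns False if the file is truncated, fails its CRC check,
    or is not valid gzip data.
    """
    try:
        with gzip.open(path, "rb") as handle:
            while handle.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        return False
```

A checksum computed this way before upload could then be compared with the value the storage element returns afterwards. The gzip test only detects corruption of the compressed file itself; it says nothing about whether the uncompressed content was correct when it was written.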

15 Metadata Catalogue
- For many applications – such as analysis – you will want to identify the list of files containing the data that matches some parameters
- This is done by a “metadata catalogue”. For MICE this doesn't yet exist
- A metadata catalogue can in principle return either the GUID or an LFN – it shouldn’t matter which as long as it’s properly integrated with the other Grid services.

16 MICE Metadata Catalogue
- We need to select a technology to use for this
  - use the configuration database? (no)
  - gLite AMGA (who else uses it – will it remain supported?)
  - ?
- Need to implement – i.e. register metadata to files
- What metadata will be needed for analysis?
- Should the catalogue include the file format and compression scheme (gzip ≠ PKzip)?

17 MICE Metadata Catalogue for Humans
or, in non-Gridspeak:
- we have several databases (configuration DB, EPICS, e-Logbook) where we should be able to find all sorts of information about a run/timestamp.
- but how do we know which runs to be interested in, for our analysis?
- we need an “index” to the MICE data, and for this we need to define the set of “index terms” that will be used to search for relevant datasets.

18 If I wanted to analyse some data, …
…I might search for all events with a particular:
- Run, date/time
- Step
- Beam – e-, π, p, μ (back or forward)
- Nominal 4-d / transverse normalised emittance
- Diffuser setting
- Nominal momentum
- Configuration:
  - Magnet currents (nominal)
  - Physical geometry
- Absorber material
- Some RF parameter?
- MC Truth?
Anything else?
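The idea of an index that maps search terms to files can be illustrated with a toy in-memory catalogue. This is only a sketch of the concept: it is not AMGA or any other real catalogue, the record fields are a small subset of the index terms listed above, and every value and LFN in the sample data is made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """Metadata attached to one registered file (hypothetical fields)."""
    lfn: str
    run: int
    step: str
    beam: str
    momentum_mev: float
    absorber: str


def find_lfns(records, **criteria):
    """Return the LFNs of all records whose fields equal every criterion."""
    unknown = set(criteria) - set(FileRecord.__dataclass_fields__)
    if unknown:
        raise KeyError(f"not an index term: {sorted(unknown)}")
    return [
        record.lfn
        for record in records
        if all(getattr(record, key) == value for key, value in criteria.items())
    ]


# Made-up sample entries; the paths do not follow any agreed naming scheme.
CATALOGUE = [
    FileRecord("/grid/mice/MICE/example/run0001.dat", 1, "I", "mu", 200.0, "none"),
    FileRecord("/grid/mice/MICE/example/run0002.dat", 2, "I", "pi", 200.0, "none"),
    FileRecord("/grid/mice/MICE/example/run0003.dat", 3, "II", "mu", 140.0, "LH2"),
]

if __name__ == "__main__":
    print(find_lfns(CATALOGUE, beam="mu"))
    print(find_lfns(CATALOGUE, step="I", momentum_mev=200.0))
```

Rejecting unknown search keys makes the point in the slides concrete: a query can only use index terms that were agreed and recorded when the files were registered.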

19 People and Roles
- A “role” is a combination of duties and privileges, with a specific aim. These are distinct from those of the person fulfilling that role.
- The Operations Manager (“MOM”) is an example of a continuous role, enacted by different people over time.
- Some roles may be so specialised that only a particular person can do them; others can have many people in them at the same time.
=> Don’t equate roles with FTEs!
For the data flow, the privileges associated with roles are enforced by VOMS. They may also require space tokens to be set up (on a site-by-site basis).

20 Management
Roles identified so far:
- Online reconstruction manager
- Archiver (storage of RAW data to tape)
- Archivist (storage of miscellaneous data to tape)
- Offline reconstruction manager
- Data Manager (moving data around Tier2s, LFC consistency)
- Simulation Production Manager
- Analysis manager?
- VO manager

21 File Catalogue Namespace (1)
- Also, we need to agree on a consistent namespace for the file catalogue
- Proposal (MICE Note 247, Grid talk at CM23):
  - We get given /grid/mice/ by the server
  - Five upper-level directories:
    - Construction/  historical data from detector development and QA
    - Calibration/  needed during analysis (large datasets, cf. DB)
    - TestBeam/  test beam data
    - MICE/  DAQ output and corresponding MC simulation

22 File Catalogue Namespace (2)
- /grid/mice/users/name  For people to use as scratch space for their own purposes, e.g. analysis
  - Encourage people to do this through LFC – helps avoid “dark data”
  - LFC allows Unix-style access permissions
- Again, the LFC namespace is something that needs to be finalised before production data can start to be registered.
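A consistent namespace is easier to keep consistent if logical file names are built by one small helper rather than typed by hand. The sketch below assumes that the five upper-level directories are the four named on the previous slide plus users/, which is a reading of the slides rather than something they state outright. The function name is hypothetical and the layout below the top level is deliberately left open, since that is defined in MICE Note 247 and not reproduced here.

```python
import posixpath

BASE = "/grid/mice"

# Assumed reading of the proposal: four listed directories plus users/.
TOP_LEVEL_DIRS = ("Construction", "Calibration", "TestBeam", "MICE", "users")


def make_lfn(top_level, *parts):
    """Build a logical file name under /grid/mice/, rejecting stray paths."""
    if top_level not in TOP_LEVEL_DIRS:
        raise ValueError(f"unknown top-level directory: {top_level!r}")
    for part in parts:
        # Each component must be a plain name, so the result cannot
        # escape the chosen top-level directory.
        if not part or "/" in part or part in (".", ".."):
            raise ValueError(f"invalid path component: {part!r}")
    return posixpath.join(BASE, top_level, *parts)


if __name__ == "__main__":
    # Scratch space for a user called "name", as in the slide.
    print(make_lfn("users", "name", "scratch.root"))
```

Registering files only through such a helper would keep typos such as a lower-case testbeam/ out of the catalogue, which is much harder to clean up after production data has been registered.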

23 The VOMS server (1)
- File permissions will be needed, e.g. to ensure that users can’t accidentally delete RAW data. These rules will need to last for at least the life of the experiment.
- VOMS is a Grid service that allows us to define specific roles (e.g. DAQ data archiver) which will then be allowed certain privileges (such as writing to tape at RAL Tier 1).
- The VOMS service then maps humans to those roles, via their Grid certificates.
- Thus the VOMS service provides us with a single portal where we can add/remove/reassign Mice, without needing to negotiate with the operators of every Grid resource worldwide – we actually keep control “in-house.”
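The two-level mapping described above (privileges attach to roles, people attach to roles through their certificates) can be modelled in a few lines. This is a conceptual sketch only, not the VOMS API: the role names, privilege names and certificate subjects are all hypothetical, and in the real system the privilege side is enforced by each Grid resource rather than held in one table.

```python
# Privileges attached to roles (hypothetical names). In practice these are
# configured by the resource operators, once per role.
ROLE_PRIVILEGES = {
    "archiver": {"write_raw_to_tape", "read_raw"},
    "analysis_user": {"read_raw", "read_reco"},
}

# People attached to roles via certificate subjects (hypothetical values).
# This is the part the VO manager controls "in-house".
ROLE_MEMBERS = {
    "archiver": {"CN=daq archiver robot"},
    "analysis_user": {"CN=alice example", "CN=bob example"},
}


def roles_of(subject, members):
    """Return the set of roles held by a certificate subject."""
    return {role for role, people in members.items() if subject in people}


def is_allowed(subject, action, members, privileges=ROLE_PRIVILEGES):
    """True if any role held by the subject grants the requested action."""
    return any(
        action in privileges.get(role, set())
        for role in roles_of(subject, members)
    )


def reassign(role, old_subject, new_subject, members):
    """Return a new membership table with one role handed to someone else."""
    updated = {name: set(people) for name, people in members.items()}
    updated[role].discard(old_subject)
    updated[role].add(new_subject)
    return updated


if __name__ == "__main__":
    print(is_allowed("CN=alice example", "write_raw_to_tape", ROLE_MEMBERS))
    handed_over = reassign(
        "archiver", "CN=daq archiver robot", "CN=new archiver robot", ROLE_MEMBERS
    )
    print(is_allowed("CN=new archiver robot", "write_raw_to_tape", handed_over))
```

Handing a role to someone else changes only the membership table; the privilege table, which stands in for the configuration held by the resource operators, is untouched. That separation is what lets the collaboration add or reassign people without renegotiating with every site.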

24 The VOMS server (2)
- MICE VOMS server is provided via GridPP at Manchester, UK.
- New Mice are added or assigned to roles by the VO Manager (and Mouse) Paul Hodgson.
- The RAL Castor must query the VOMS server to authorise every transaction – should we move it somewhere local?

25 Other Questions
- How does one run become the next – what triggers it, who confirms, how is it propagated?
- How does “data” come out of the DAQ and get turned into files?
- How do we know a run is complete => that a file is closed?
- Does the GB file size for CASTOR match the online reco sample rate? => Could the data mover trigger the online reco?
- That ol’ online buffer round-robin thang
- Replication of data to other Tier1s
- Should EPICS monitor the data mover?

26 Actions arising
- Work out desired CASTOR resources, interface (SRM?) and QoS. Meet with Tier1 and iterate.
- Draw up list of VOMS roles and get them created.
- Draw up list of space tokens by Tier and role.
- Create LFC namespace, set permissions, upload existing data
- Get archiver robot certificate
- Identify needed Tier2 resources

