Non-LHC and Non-US-Collider Experiments' Requirements
Dan Tovey, University of Sheffield
GridPP Collaboration Meeting, 5th November 2001

'Other' Experiments
- Representing non-LHC and non-US-collider experiments, including ANTARES, MINOS and UKDMC.
- In general these experiments have few resources to devote exclusively to Grid activities, although much effort is targeted at e-Science-related issues, e.g. analysis code development.
- At present, analysis of data is carried out predominantly locally or at central facilities; there is as yet no requirement to move to large-scale distributed data processing.
- That said, keen interest exists in testing and making use of Grid tools if they will improve data handling within existing analysis frameworks.
- The situation is likely to change in the next few years, given larger data rates and mass uptake of Grid technology by central facilities.

Application 1: Transfer of Data Between Mass Storage Facilities
- Experiments in general need to transfer large volumes of data quickly and conveniently between physically separated sites.
- Sites may or may not possess high-speed network connections (e.g. RAL and Boulby Mine).
- In the former case a grid-based transfer protocol may be appropriate; in the latter, data needs to be transferred by some means other than the network.
- This problem is common to many HEP experiments and mirrors that faced by the LHC experiments.
- It is hoped that common solutions can be found (e.g. using EDG testbed components such as GridFTP and WP5 protocols).
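For the networked case, a grid-based transfer might look like the minimal sketch below, which simply wraps the Globus toolkit's globus-url-copy client. It assumes the client is installed and a valid grid proxy exists (e.g. from grid-proxy-init); the host names and paths are illustrative only, not real endpoints.

    import subprocess

    def gridftp_copy(source_url, dest_url):
        """Copy a single file between GridFTP-capable storage elements."""
        # globus-url-copy takes a source URL and a destination URL;
        # gsiftp:// denotes a GridFTP endpoint, file:// a local path.
        subprocess.run(["globus-url-copy", source_url, dest_url], check=True)

    if __name__ == "__main__":
        gridftp_copy(
            "gsiftp://datastore.example.ac.uk/data/run01234.dat",  # hypothetical source
            "file:///scratch/run01234.dat",                        # hypothetical destination
        )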

Example Use Case (Data Discovery and Transfer)
1. Large subsets of data are held in data storage facilities at collaborating institutes.
2. A user wishes to transfer large subsets of this data (not necessarily co-located) to a CPU location (datastore) prior to analysis.
3. The user logs onto a local machine.
4. The user accesses a collaboration-wide web page providing a front-end to a generic data discovery and transfer tool.
5. The user logs onto the site (password required; automatic authentication is not required at this stage).
6. The software presents a query form to the user.
7. The user specifies datasets of interest by 'run' properties (e.g. 'I wish to download all calibration data taken between 01/06/02 and 01/07/02 by detector XXX'). Specification by run number or, if necessary, file name is also possible.
8. The software accesses the collaboration metadata catalogue to match the query to file names. The metadata catalogue is probably updated manually in the first instance as part of the data bookkeeping process. Entries (in plain-text format) give e.g. run type, run number, start time and end time for each file.

Example Use Case (Data Discovery and Transfer), continued
9. The software queries the replica catalogue to discover the locations of the required files.
10. The software starts up a transfer protocol (e.g. GridFTP).
11. The software initiates an FTP-like connection between the source site(s) and the destination site (not necessarily local to the user). Source and destination sites must be members of a list of 'approved' collaboration datastores (i.e. it is not possible to transfer data to an arbitrary location, for security reasons).
12. The software 'gets' files efficiently, reliably and securely from the source(s) to the destination.
13. The software notifies the user of the status of the transfer via the front-end (e.g. total data volume, total volume transferred, volume remaining, estimated time required, time taken, estimated time remaining, current mean transfer rate).
14. The software notifies the user if faults occur: it keeps trying until time-out, then returns to the user with a meaningful error message (i.e. the suspected reason for the error) if still failing. It must permit automatic partial transfer if faults occur only for certain files or locations (i.e. fully transferred files remain, partially transferred files are deleted).
15. The software updates the replica catalogue and a transfer log file.
16. The software notifies the user when the transfer is complete.
A sketch of this workflow in code follows below.
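The steps above might be stitched together roughly as in the following sketch. The plain-text metadata format, the toy in-memory replica catalogue and the use of globus-url-copy are assumptions made for illustration; none of the class or function names correspond to existing EDG components.

    import subprocess
    import time

    class ReplicaCatalogue:
        """Toy stand-in for a replica catalogue: logical filename -> site URLs."""
        def __init__(self, entries):
            self.entries = entries                       # {lfn: [gsiftp URLs]}

        def lookup(self, lfn):
            return self.entries[lfn][0]                  # step 9: pick a source replica

        def register(self, lfn, site_url):
            self.entries.setdefault(lfn, []).append(site_url)   # step 15

    def find_files(metadata_path, run_type, start, end):
        """Step 8: match a run-level query against plain-text metadata entries of
        the form 'run_type run_number start_time end_time filename' (one per line).
        Timestamps are assumed to sort correctly as strings (e.g. ISO format)."""
        matches = []
        with open(metadata_path) as catalogue:
            for line in catalogue:
                rtype, run, t0, t1, filename = line.split()
                if rtype == run_type and start <= t0 and t1 <= end:
                    matches.append(filename)
        return matches

    def transfer(source_url, dest_url, retries=3, wait=60):
        """Steps 10-12 and 14: 'get' one file, retrying until a time-out."""
        for attempt in range(retries):
            result = subprocess.run(["globus-url-copy", source_url, dest_url])
            if result.returncode == 0:
                return True
            time.sleep(wait)                             # back off before retrying
        return False                                     # caller reports the failure

    def run_transfer(metadata_path, replica_catalogue, query, dest_site):
        files = find_files(metadata_path, **query)
        done, failed = [], []
        for lfn in files:
            source = replica_catalogue.lookup(lfn)
            ok = transfer(source, dest_site + "/" + lfn)
            (done if ok else failed).append(lfn)
            print("transferred %d of %d files" % (len(done), len(files)))  # step 13
        for lfn in done:                                 # step 15: completed files only
            replica_catalogue.register(lfn, dest_site)
        return done, failed

A query such as {"run_type": "calibration", "start": "2002-06-01", "end": "2002-07-01"} passed to run_transfer would then reproduce the example given in step 7.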

Data Transfer Requirements
1. Mass storage management (MSM) software should be supplied that is capable of transferring certain specified data sets, but not others, onto specific physical tapes. Specification must be possible on the basis of file metadata as well as physical filename.
2. The MSM software should decide which tapes are most suitable for this purpose on the basis of the time taken to prepare the tapes and/or the total number of tapes required (a simple sketch of such a choice follows below).
3. A common translation module for file metadata is required, such that the content, format and status of given transfer tapes can be assessed automatically by any specified system (there may be more than one) and used to position to and read files, or segments of files, from those tapes.
4. A simple-to-use, ftp-like protocol with a web-based front-end is required, suitable for reliable, transparent, efficient and secure transfer of large datasets between multiple specified collaboration sites. The software must discover the names and locations of files from specified run metadata using the metadata and replica catalogues.
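Purely as an illustration of requirement 2, the sketch below makes a first-fit-decreasing choice of tapes, preferring those that are quick to prepare and then those with the most free space. The tape and file representations and the cost model are assumptions, not features of any existing MSM package.

    def choose_tapes(files, tapes):
        """files: list of (filename, size_bytes); tapes: list of dicts with keys
        'id', 'free' (bytes) and 'prep_time' (seconds).
        Returns {tape_id: [filenames]} using a first-fit-decreasing assignment."""
        # Prefer tapes that are quickest to prepare, breaking ties on free space.
        ordered = sorted(tapes, key=lambda t: (t["prep_time"], -t["free"]))
        free = {t["id"]: t["free"] for t in ordered}
        plan = {t["id"]: [] for t in ordered}
        for name, size in sorted(files, key=lambda f: -f[1]):   # largest files first
            for tape in ordered:
                if free[tape["id"]] >= size:
                    plan[tape["id"]].append(name)
                    free[tape["id"]] -= size
                    break
            else:
                raise RuntimeError("not enough tape capacity for %s" % name)
        return {tid: names for tid, names in plan.items() if names}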

Collaborative Project: Generic Data Discovery and Transfer Tool
- The last requirement (a generic data discovery and transfer tool) is common to several experiments (ANTARES, UKDMC and MINOS).
- It is therefore hoped that a common solution can be found.
- The generic nature of the requirement suggests that the solution will also be of interest to other groups, in particular UKQCD.
- 'Mainstream' experiments (e.g. BaBar and the LHC collaborations) have similar data transfer requirements, so the tool may be of further interest there.
- A collaborative project has therefore been proposed between several experiments, including ANTARES, UKDMC and MINOS, and possibly also UKQCD, BaBar and others.
- The project will deliver, on a 1-2 year timescale, a fully functioning web-based data discovery and transfer tool providing an automated interface to appropriate grid applications (metadata and replica cataloguing and file transfer services).

Application 2: Remote Control of Underground Experiments
- This is a novel application, distinct from others proposed within GridPP.
- UKDMC and MINOS have identified a need for remote access to, and control of, underground experiments.
- This involves remote configuring, monitoring and debugging of DAQ code (possibly also a remote high-level trigger for low-background experiments).
- The methodology is similar to that suggested for a Global Accelerator Network for running the next generation of colliders.
- There may also be commonality with grid-based remote control applications specified by AstroGrid and other collaborations.
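Purely as an illustration of the kind of remote configuring and monitoring meant here, the sketch below exposes a toy DAQ-control object over XML-RPC using Python's standard library. The method names, threshold parameter, host and port are hypothetical and do not correspond to any existing UKDMC or MINOS DAQ interface; authentication and security, which a real system would need, are omitted.

    from xmlrpc.server import SimpleXMLRPCServer

    class DAQControl:
        def __init__(self):
            self.threshold = 100        # hypothetical trigger threshold (ADC counts)

        def get_rates(self):
            """Return current per-channel trigger rates (dummy values here)."""
            return {"channel_0": 1.2, "channel_1": 0.8}

        def set_threshold(self, value):
            """Remotely reconfigure the trigger threshold."""
            self.threshold = value
            return self.threshold

    if __name__ == "__main__":
        server = SimpleXMLRPCServer(("0.0.0.0", 8000))   # hypothetical underground-lab DAQ host
        server.register_instance(DAQControl())
        server.serve_forever()

A remote user would then call these methods through xmlrpc.client.ServerProxy("http://<daq-host>:8000") from any machine with suitable network access.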

Application 3: Fast Access to Remote Data Sets
- A simple grid-like application identified by MINOS.
- MINOS would like to perform interactive ROOT analyses in the UK on selected data sets held at location(s) in the US.
- This would involve accessing and merging data from remotely held files; this is already possible in PAW (Manchester), and perhaps also in ROOT.
- There is also a desire to perform batch reduction in the UK on remotely held files, possibly using a grid-open type command (an AFS alternative?).
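As a sketch of what the interactive case might look like, ROOT can open files served remotely (e.g. by a rootd/xrootd server) through root:// URLs and merge them in a TChain, so selected US-held datasets could in principle be analysed from the UK without a full local copy. The host, file paths, tree name "EventTree" and variable names below are hypothetical.

    import ROOT

    # Merge two remotely held files into a single logical dataset.
    chain = ROOT.TChain("EventTree")                         # hypothetical tree name
    chain.Add("root://minos-data.example.gov//data/run100.root")
    chain.Add("root://minos-data.example.gov//data/run101.root")

    # Interactive-style query over the merged remote files.
    chain.Draw("energy", "energy > 0.5")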

Summary
- The 'Other' experiments are keen to make full use, as soon as possible, of tools provided by GridPP and other initiatives in order to simplify existing analysis procedures.
- They are interested in developing full grid-based analyses in the longer term (> 2 years).
- We want to learn to walk before we can run!