Open Science Grid Consortium
Storage on Open Science Grid: Placing, Using and Retrieving Data on OSG Resources
Abhishek Singh Rana
OSG Users Meeting, July 26-27, 2007, Fermi National Accelerator Laboratory

Outline
–Storage local/mounted on Compute Elements: $OSG_APP, $OSG_WN_TMP, $OSG_DATA
–Condor-G based file transfer
–SRM based Storage Elements: SRM-dCache on OSG, DCap in dCache
–Typical clients
–Snapshots
–Squid based caching mechanisms

$OSG_APP
Write access from GridFTP and fork jobs via the GRAM host. Read-only access from all WNs via a mounted filesystem. Intended for a VO to install application software that its users' jobs later access. Size > ~50 GB per VO.
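
A minimal sketch of this pattern, assuming a hypothetical gatekeeper ce.example.edu, a VO named myvo, and a site that exposes its $OSG_APP area at /osg/app over GridFTP (none of these names are real OSG endpoints):

    # Push a software tarball to the site's $OSG_APP area with GridFTP
    globus-url-copy file:///home/me/myapp-1.0.tar.gz \
        gsiftp://ce.example.edu/osg/app/myvo/myapp-1.0.tar.gz

    # Unpack it with a fork job through the GRAM host
    globus-job-run ce.example.edu/jobmanager-fork \
        /bin/tar xzf /osg/app/myvo/myapp-1.0.tar.gz -C /osg/app/myvo

    # Later, inside a batch job on a worker node (read-only access)
    export PATH=$OSG_APP/myvo/myapp-1.0/bin:$PATH
    myapp --help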

$OSG_WN_TMP
Specific to each batch slot. Local filesystem available for the duration of the job's execution. Read/write access from the job. Generally cleaned up at the end of the batch slot lease. Size ~15 GB per batch slot.
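
A minimal job-script sketch of the intended usage (the subdirectory name is illustrative; the explicit cleanup is belt-and-braces, since slot cleanup is only "generally" done):

    #!/bin/sh
    # Work in the per-slot local scratch area
    WORKDIR=$OSG_WN_TMP/myvo_job_$$
    mkdir -p "$WORKDIR"
    cd "$WORKDIR"

    # ... run the application here, writing its output into $WORKDIR ...

    # Clean up before the batch slot is released
    cd /
    rm -rf "$WORKDIR"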

$OSG_DATA
Read/write access from GridFTP and fork jobs at the GRAM host. Read/write access from batch slots. Intended as a stage-in/stage-out area for jobs. Persistent across job boundaries. No quotas or guaranteed space. Its usage is discouraged because it has led to complications in the past.

Condor-G based file transfer
Condor-G JDL using 'transfer_input_files' and 'transfer_output_files'. Transfers are spooled via the CE headnode, which can severely overload it if files are large. Its usage is discouraged for files larger than a few MB. GB-sized files should be stored in the dedicated stage-out spaces and pulled from outside, rather than spooled via the CE headnode by Condor file transfer.
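
A minimal sketch of such a submission, written as a shell snippet that creates the submit description file (the gatekeeper name is illustrative, and the transferred files are kept small as recommended above):

    # Create a Condor-G submit description file and submit it
    cat > myjob.sub <<'EOF'
    universe              = grid
    grid_resource         = gt2 ce.example.edu/jobmanager-condor
    executable            = myjob.sh
    should_transfer_files = YES
    when_to_transfer_output = ON_EXIT
    transfer_input_files  = small_input.dat
    transfer_output_files = small_output.dat
    output = myjob.out
    error  = myjob.err
    log    = myjob.log
    queue
    EOF
    condor_submit myjob.sub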

SRM
Storage Resource Management. SRM is a specification, a 'grid protocol' formulated by agreement among institutions with very large storage needs, such as LBNL, FNAL, CERN, etc. v1.x is in use; v2.x is in implementation/use. Many interoperable implementations! A user needs to become familiar only with the SRM client-side software suite.
–E.g., 'srmcp', which is easy to use.
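
A minimal srmcp sketch, copying a local file into SRM-managed storage and back (hostname, port and namespace path are illustrative; 8443 is the commonly used SRM port):

    # Local file -> SRM-managed storage
    srmcp file:////home/me/results.tar.gz \
        srm://srm.example.edu:8443/pnfs/example.edu/data/myvo/results.tar.gz

    # SRM-managed storage -> local file
    srmcp srm://srm.example.edu:8443/pnfs/example.edu/data/myvo/results.tar.gz \
        file:////tmp/results.tar.gz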

SRM-dCache
[Diagram: example site architecture with pools behind NAT. An SRM server, a GridFTP server and a DCap server front the pools; PNFS maps the logical namespace onto the physical namespace; access is protected by GSI role-based security.]

SRM-dCache on OSG
Packaged ready-to-deploy as part of the VDT. Intended for large-scale grid storage. Scheduling, load balancing, fault tolerance. GSI and role-based secure access. Implicit space reservation available.
Transfer types:
–Localhost ↔ SRM
–SRM ↔ SRM
Widely deployed at production scale on OSG.
–More than 12 sites with ~100 TB each.
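
A sketch of the two transfer types with srmcp (both endpoints are illustrative):

    # Localhost <-> SRM: stage a local file into an SRM-dCache site
    srmcp file:////tmp/job_output.tar.gz \
        srm://srm.site-a.example.edu:8443/pnfs/site-a.example.edu/data/myvo/job_output.tar.gz

    # SRM <-> SRM: third-party copy between two SRM endpoints
    srmcp srm://srm.site-a.example.edu:8443/pnfs/site-a.example.edu/data/myvo/job_output.tar.gz \
        srm://srm.site-b.example.edu:8443/pnfs/site-b.example.edu/data/myvo/job_output.tar.gz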

SRM-dCache on OSG
Usage strategy (short term)
–Your own SRM. Have your own VO SRM server at your home site. (However, this requires deploying and operating SRM-dCache, which may be non-trivial.) Stage in/stage out using SRM client tools. Use 'srmcp', installed on the WNs of all OSG sites, to stage out files from a WN to the home site.
–Opportunistic access on other sites with SRM. Negotiate access for your VO (and users) with sites where SRM servers are already deployed. Use 'srmcp', installed on the WNs of all OSG sites, to stage out files from a WN to remote sites with SRM.

SRM-dCache on OSG
Usage strategy (long term)
–'Leased' storage of many TBs of space for several weeks to months at a time.
–Explicit space reservation.
–Expected to be in OSG

DCap in dCache
Has both server and client components. A user runs 'dccp' to read data already in dCache at the local site. Client libraries and an API are available (libdcap.so) and can be integrated into and used within applications. Provides a set of POSIX-like functions.
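
A minimal dccp sketch for reading data already in the local site's dCache (server name, port and PNFS path are illustrative; 22125 is dCache's usual default DCap port):

    # Copy a file out of the local dCache onto worker-node scratch
    dccp dcap://dcap.example.edu:22125/pnfs/example.edu/data/myvo/input.root \
        $OSG_WN_TMP/input.root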

Typical clients
Command-line read/write client tools.
–Usual arguments are src_URL and dest_URL
–globus-url-copy (gsiftp://GridFTP_server:port)
–srmcp (srm://SRM_server:port)
–srm-ls, srm-rm, srm-mv, srm-mkdir, …
–dccp (dcap://DCap_server:port)
Interactive tool.
–uberftp
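
Illustrative invocations of these clients (all hostnames and paths are hypothetical):

    globus-url-copy file:///tmp/data.tar gsiftp://gridftp.example.edu:2811/data/myvo/data.tar
    srmcp file:////tmp/data.tar srm://srm.example.edu:8443/pnfs/example.edu/data/myvo/data.tar
    srm-ls srm://srm.example.edu:8443/pnfs/example.edu/data/myvo/
    dccp dcap://dcap.example.edu:22125/pnfs/example.edu/data/myvo/data.tar /tmp/
    uberftp gridftp.example.edu      # interactive session: ls, get, put, ...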

Snapshot: Legacy storage
Data is written to OSG_DATA using GridFTP or, if necessary, by fork jobs (unpack tarballs, etc.). The job is staged into the cluster. The job copies its data to the worker node (OSG_WN_TMP) or reads it sequentially from OSG_DATA (if the data is read only once); the latter can be a significant performance issue on typical network file systems. Job output is placed in OSG_WN_TMP. At the end of the job, results from OSG_WN_TMP are packaged, staged to OSG_DATA, and picked up using GridFTP.
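
A sketch of this legacy flow, assuming the site exposes its $OSG_DATA area at /osg/data over GridFTP (all names are illustrative):

    # Before submission: stage input into the site's $OSG_DATA area
    globus-url-copy file:///home/me/input.tar.gz \
        gsiftp://ce.example.edu/osg/data/myvo/input.tar.gz

    # Inside the batch job on the worker node
    cd $OSG_WN_TMP
    cp $OSG_DATA/myvo/input.tar.gz .   # copy once instead of reading repeatedly over the network filesystem
    tar xzf input.tar.gz
    myapp input.dat > output.dat
    tar czf results.tar.gz output.dat
    cp results.tar.gz $OSG_DATA/myvo/

    # After the job: pick up the results
    globus-url-copy gsiftp://ce.example.edu/osg/data/myvo/results.tar.gz \
        file:///home/me/results.tar.gz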

Snapshot: Legacy storage
[Diagrams: worker nodes, each with its own OSG_WN_TMP, read application software from OSG_APP and stage data in and out through OSG_DATA.]

Snapshot: SRM storage
Read access is usually by DCap (or SRM); write access is by SRM. Data is written to SRM-dCache. The job is staged into the cluster. Job execution (open/seek/read using DCap). Job output is placed in OSG_WN_TMP. At the end of the job, results from OSG_WN_TMP are packaged and staged out using SRM.
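
A sketch of the SRM flow from inside the batch job (endpoints and paths are illustrative; the stage-out goes to the VO's home-site SRM):

    cd $OSG_WN_TMP

    # Read input directly from the site's dCache via DCap
    dccp dcap://dcap.example.edu:22125/pnfs/example.edu/data/myvo/input.root .

    myapp input.root > output.dat
    tar czf results.tar.gz output.dat

    # Stage the packaged results out with SRM ($PWD is the $OSG_WN_TMP work area)
    srmcp "file:///$PWD/results.tar.gz" \
        srm://srm.home.example.edu:8443/pnfs/home.example.edu/data/myvo/results.tar.gz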

Snapshot: SRM storage
[Diagrams: worker nodes, each with its own OSG_WN_TMP, read application software from OSG_APP, read data via DCap and stage results out via SRM.]

Squid based caching
Intended to provide read-only HTTP caching mechanisms. Used by CMS and CDF. (Details in Dave Dykstra's talk.)
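
A generic sketch of how a job uses such a cache: point HTTP clients at the site's Squid and fetch read-only data through it (the proxy and data hostnames are illustrative; 3128 is Squid's default port):

    # Route HTTP requests through the site's Squid cache
    export http_proxy=http://squid.example.edu:3128
    wget -q http://conditions.example.org/calib/run12345.db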

Summary
OSG has vigorously supported distributed storage since its early days. Legacy mechanisms with local or mounted access to data have been in common use on OSG; as expected, a few limitations exist. SRM based storage is widely available now.
–SRM deployments with total space ~O(1000 TB). A fraction of this space on OSG is available for opportunistic usage by all interested VOs.
–SRM clients available on the WNs of all OSG sites.
–Easy to use!

Contacts
Please contact us for more information:
–OSG Storage Team
–OSG User Group Team