
1 Data Federation with Xrootd
Wei Yang
US ATLAS Computing Facility meeting
Southern Methodist University, Oct 11-12, 2011

2 What is Federated ATLAS Xrootd (FAX)? (Wei's limited understanding)
- Data access via a single entry point (using Xrootd's redirection technology)
  - Read data directly over the WAN
  - Read data via a local storage cache
  - Works with dCache, Xrootd, and POSIX storage systems
- Started at the Amsterdam WLCG workshop in 2009
  - Involves experts from the US Tier 1, Tier 2s, and many Tier 3s
  - Workshop at Chicago, Sept 12-13
  - Bi-weekly phone meetings
  - Twiki pages for T1/T2/T3 instructions, meeting notes, and a site configuration repository
  - Support from ADC on dq2 tools, and from Dubna on monitoring
- The software is ready and provides the needed functions, almost

3 Use Cases
- Read a dataset directly over the WAN (see the sketch after this list)
  - Figure out the content of a dataset and the file paths using dq2-ls
  - Just read them from the global redirector
- Pull data from FAX
  - Run jobs from laptops and desktops
  - Permanently "diskless" sites, or a "diskless" mode during site maintenance
- Bring data to a local Tier 3 Xrootd disk (cache)
  - Use dq2-get/xprep to bring a dataset to the local disk cache, directly or via Xrootd's FRM, or
  - Just read files against the local Tier 3 Xrootd storage and let Xrootd's FRM fetch the data from FAX
- Users sharing non-DDM data between sites
- Old untouched files are purged when space is needed, removing the need to manage the space
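A minimal sketch of the first use case, assuming a hypothetical global redirector host (glrd.example.org) and a made-up dataset and global-name-space path; dq2-ls lists the dataset's files, and xrdcp reads one of them straight over the WAN:

    # List the files (names, GUIDs, checksums) in a dataset
    dq2-ls -f user.wyang.test.2011/
    # Read one file directly from the federation via the global redirector
    xrdcp root://glrd.example.org//atlas/dq2/user/wyang/test.2011/file-1.root /tmp/file-1.root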

4 How FAX Works
[Diagram: users read either directly over the WAN or via a local disk cache; Tier 1 / Tier 2 disk and Tier 3 disk caches sit behind a Global Redirector; a common/unique file path gives the Global Name Space.]
- Tier 3 federation is functionally easier
- Bringing in Tier 1s and Tier 2s requires more work, but they are the big data sources

5 Tier 3 FAX Components
[Diagram: the user runs "dq2-get --xprep dataset", which hands the local Xrootd a list of files with GUIDs and checksums; when a file is missing, the FRM fetches it from FAX by GUID and checksum. Disk layout: /atlas/dq2 is a disk cache managed by the FRM; /atlas/local is locally managed space.]
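A hedged usage sketch of the flow in the diagram; the dataset name and the Tier 3 redirector host are made up, and the dq2-get option spelling follows the slide (the FAX twiki has the authoritative syntax):

    # Prefetch: hand the file list (names + GUIDs + checksums) to the local cache
    dq2-get --xprep user.wyang.test.2011/
    # Or just read through the cache and let the FRM fetch misses from FAX
    xrdcp root://t3xrd.example.org//atlas/dq2/user/wyang/test.2011/file-1.root /tmp/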

6 Tier 1 and Tier 2 Components
[Diagram: the Global Redirector sends Tier 3 clients to a site's Xrootd; an N2N plug-in, consulting the LFC, translates the Global Name Space path plus GUID into the site-specific file path in front of the site storage.]
- N2N is difficult to implement, and may be a performance bottleneck because of the LFC
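An illustrative sketch of what an N2N (name-to-name) translation does, assuming a made-up heuristic prefix rewrite with an LFC lookup as the fallback; lcg-lr is the standard replica-listing tool of the era (LFC_HOST must be set), and all paths here are hypothetical:

    # Global file name (GFN) as seen in the global name space
    gfn=/atlas/dq2/mc10_7TeV/AOD/some.AOD.pool.root.1
    # Heuristic: rewrite the global prefix into this site's storage prefix
    pfn=/xrootd/atlas${gfn#/atlas}
    # Fallback: resolve through the LFC (the ~0.3 s penalty mentioned on slide 9)
    lfn=/grid/atlas/dq2${gfn#/atlas/dq2}
    lcg-lr "lfn:${lfn}"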

7 [Figure-only slide; no transcript text]

8 Some stage-in test results
- The three failed files do exist in the federation; a later xrdcp from the global redirector fetched them successfully, which suggests a single retry would likely avoid the failures
- The low failure rate indicates a reasonably solid architecture (when not under stress!)
- After dq2-get/xprep, reads of those files are put on hold by Xrootd until the files are ready
- The FRM stage-in script is trimmed to the absolute minimum, effectively just one xrdcp (see the sketch below)
- All files exist in SWT2_CPB's LFC and can be found by the current N2N heuristic algorithms
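A minimal sketch of such a trimmed stage-in script, assuming the FRM passes the local physical path as the first argument and that the cache prefix and redirector host are as made up here; real scripts and their frm.xfr.copycmd wiring vary by site:

    #!/bin/sh
    # Invoked by the FRM on a cache miss; $1 = local physical path (assumption)
    pfn="$1"
    # Derive the global name from the hypothetical local cache prefix
    gfn="/atlas/dq2${pfn#/data/atlas/dq2}"
    # The whole job is effectively one xrdcp from the global redirector
    exec xrdcp -f "root://glrd.example.org/${gfn}" "$pfn"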

9 What is still needed
Global Name Space
- A global storage path schema to locate files at Tier 3s without an LFC
  - Supported by ADC; see DQ2 client tools support below
- Heuristic algorithms to locate files at Tier 1/2s via the LFC; can be extended to search the LFC by GUID
  - The 0.3-second penalty seems acceptable
DQ2 client tools support
- dq2-get can trigger the Tier 3 Xrootd storage to fetch data from FAX
  - Instructions are documented
- dq2-ls can print dataset contents in the global name space
  - Can also be used to check local storage for missing files (see the sketch below)
  - The global path isn't quite right; working with ADC
  - Comes with other extra, unwanted verification features
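A hedged sketch of that missing-file check; the dataset, host, and path are made up, and the probe uses the old xrd admin client's existfile query:

    # Print the dataset's files, then probe one expected global path locally
    dq2-ls -f user.wyang.test.2011/
    xrd t3xrd.example.org existfile /atlas/dq2/user/wyang/test.2011/file-1.root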

10 Xrootd; Documentation and Configuration repository
Xrootd
- The long-awaited version is in rc3 (target release date 10/17)
- Name-to-name (N2N) translation works for the proxy
  - A function needed at Tier 1/2s; the same goes for the next item
- N2N translation works for checksums
  - For Xrootd storage, a chain of checksum events works, from user, to proxy, to back-end server
  - A plug-in for dCache storage is still to be worked out
  - Not an issue for GPFS and Lustre (POSIX)
- Dual-NIC binding in a certain security module
- Informing users about FRM stage-in failures: only non-ideal methods so far
- X509: hopefully by the end of this year?
Documentation and Configuration repository
- Comprehensive documentation for Tier 3s to set up Xrootd, and for Tier 1/2s to set up Xrootd in various modes with N2N
- Site xrootd, proxy, and FRM configurations (an illustrative fragment follows)
- Working on standardizing the FRM script
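A hedged sketch of the kind of Tier 3 configuration fragment kept in the repository, using the cache layout from slide 5; all.export and frm.xfr.copycmd are standard xrootd/FRM directives, but the paths, the script location, and the argument substitution shown are assumptions:

    # Export the FRM-managed cache (stageable) and the locally managed space
    all.export /atlas/dq2 stage
    all.export /atlas/local nostage
    # On a cache miss, the FRM calls the (to-be-standardized) stage-in script
    frm.xfr.copycmd in /opt/fax/frm-stagein.sh $PFN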

11 CMSd for dCache Xrootd door
- A special cmsd to work directly with the dCache Xrootd door
- An "authorization" plug-in for the dCache Xrootd door to convert GFN to LFN
- A caching mechanism to serve as an N2N cache
Work with Proof clusters
Panda Analysis Queue on top of FAX
- Just started looking into this: ANALY_MWT2_FAX-pbs
- No LFC; can Panda brokerage handle this?
- Many possible operational modes, including completely "diskless"
Monitoring
- Had a discussion with the Dubna group on monitoring metrics
- A daemon to collect local Xrootd summary info and feed it to Ganglia (see the sketch below)
- An application that works with the xrootd summary stream; need to decide how to present the fetched data
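A minimal sketch of the Ganglia-feeder idea, assuming the site xrootd's summary reports (the xrd.report directive) are pointed at this host on UDP port 3334; the netcat flags, the XML field parsed, and the metric name are all illustrative:

    # Listen for xrootd UDP summary reports (netcat flags vary by flavor;
    # GNU netcat would be: nc -l -u -p 3334)
    nc -lu 3334 | while read -r report; do
      # Pull the current link (connection) count out of the summary XML
      conn=$(printf '%s' "$report" | sed -n 's|.*<stats id="link"><num>\([0-9]*\)</num>.*|\1|p')
      [ -n "$conn" ] && gmetric -n xrootd_connections -v "$conn" -t uint32
    done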

