Experiences with http/WebDAV protocols for data access in high throughput computing
Gerard.Bernabeu@pic.es
1 Experiences with http/WebDAV protocols for data access in high throughput computing

2 Index
Motivation
The target
Why WebDav/HTTP
How to access WebDav
dCache as WebDav server
Performance tests
Conclusions

3 Motivation Current usage of non-standard protocols for data access at the WLCG Tier1 (at PIC, mainly [gsi]dcap). Hard to justify that new projects use non-standard protocols. Data access inefficiencies with current protocols. Two data access patterns observed at the Tier1:
Bulk data transfer (dccp-like): high throughput
Read as need (random or stream-like, dc_open from the application): usually low throughput

4 The target An efficient data access model should provide:
Fast data transfer
Lightweight protocol (minimize protocol/client overhead)
Local (worker node) caching support
Random access capabilities
Standard protocol
Easy to use (POSIX)
NFSv4.1 might be our solution:
Native kernel support (caching)
dCache supports it!
POSIX-like (read OR write)

5 Why WebDav/HTTP then? NFSv4.1 is still at an experimental stage:
Requires an experimental kernel on the client
But it works already! (tested with dCache & an SLC5 client)
WebDav is standard (HTTP). We do not expect to reach NFSv4.1 performance, especially for remote random data access, but it is efficient in bulk data transfer (high throughput).

6 How to access WebDav A WebDav share can be accessed in many ways.
Bulk data transfer:
Standard HTTP/WebDav clients: wget, curl, etc.
Read as need:
Mounted: davfs, fusedav; POSIX-like (read OR write) with the dCache server
Native support from GNOME, KDE and MS Windows
See the talk by Dr. Gerd Behrmann, NDGF
ROOT client direct HTTP access (not working with dCache)
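The two access modes above can be sketched with standard tools; the endpoint hostname and paths below are hypothetical, and mounting requires davfs2 installed and root privileges:

```shell
# Bulk data transfer: a plain HTTP client pulls the whole file.
wget -q -O local.root http://webdav-door.example.org:2880/pnfs/site/file.root

# The same transfer with curl:
curl -s -o local.root http://webdav-door.example.org:2880/pnfs/site/file.root

# "Read as need": mount the share with davfs2, then use POSIX I/O.
mount -t davfs http://webdav-door.example.org:2880/pnfs/site /mnt/webdav
cat /mnt/webdav/file.root > /dev/null   # reads go through the mounted FS
```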

7 Using dCache as a server. Why?
Already deployed at many Tier1s & Tier2s; well known by the WLCG community.
A single server for many data access protocols ([gsi]dcap, [grid]ftp, xrootd, NFSv4.1, WebDav): smooth protocol transition.
Storage Resource Management (SRMv2)
Distributed & scalable
Easy to administer (~3PB served to 7 projects with 1.5 FTE!)

8 WebDav at dCache WebDav mount utilities for Linux (fusedav and davfs2) need webdav.redirect.on-read=false in dcache.conf. This means that all data transfers flow through the WebDav door, which may become a bottleneck. It might change in the future! The ROOT client is not able to access files over HTTP with dCache's WebDav server (error 203). Tested with ROOT client version: x86_64_linux_26_dbg.tgz. dCache.org is working on it. The following performance tests focus on bulk data transfer (wget).
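In configuration terms, the door setting mentioned above is a single property in dcache.conf; the property name comes from the slide, the comment is an interpretation:

```
# dcache.conf (fragment)
# Disable redirect-on-read so davfs2/fusedav mounts work.
# Side effect: all reads stream through the WebDav door itself.
webdav.redirect.on-read=false
```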

9 Performance test setup Server: dCache
1 dCache door ([gsi]ftp, [gsi]dcap, xrootd, WebDav server)
1 dCache pool (36*2TB disks, RAID60, 2*10GE NIC, 48GB RAM)
1 dCache server (Chimera, PoolManager, SRM, InfoService, etc.)
Client: standard worker node with SLC5.3 x86_64 (1GE, 2*X5355 CPU, 16GB RAM)
Clients tested: dcap, wget, xrootd

10 Performance test – big file (I)

11 Performance test – big file (II)
wget: average 12453ms (114.6MB/s), 1797ms CPU
xrdcp: average 12382ms (115.2MB/s), 2914ms CPU
dcap -B10M: average 14481ms (98.5MB/s), 1970ms CPU
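As a sanity check on the averages above, throughput multiplied by elapsed time should imply the same file size for every client; a quick awk calculation confirms all three figures point at the same ~1.4GB test file:

```shell
# Implied file size (MB) = throughput (MB/s) * elapsed time (s), slide 11 figures.
awk 'BEGIN {
  printf "wget:  %.0f MB\n", 114.6 * 12.453;   # -> 1427 MB
  printf "xrdcp: %.0f MB\n", 115.2 * 12.382;   # -> 1426 MB
  printf "dcap:  %.0f MB\n",  98.5 * 14.481;   # -> 1426 MB
}'
```

The three implied sizes agree to within 1MB, so the reported times and throughputs are internally consistent.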

12 Performance test – small file
wget: average 147ms (11.47MB/s), ms CPU
xrdcp: average 176ms (9.85MB/s), ms CPU
dcap: average 240ms (8.98MB/s), ms CPU
dcap -B10M: average 227ms (10.85MB/s), ms CPU

13 Conclusions
HTTP+wget is fast as a bulk file transfer solution for both big and small files.
HTTP+wget is the most CPU-efficient solution tested.
HTTP+wget is standard: there are other clients besides wget, and HTTP proxies can be used.
More effort is required to use HTTP for "read as need" data access (i.e. using it as a mounted FS). Shouldn't we focus on NFSv4.1 for this use case?

14 Questions? Thanks to dCache.org and to Carlos Osuna from IFAE.

15 About NFSv4.1 in dCache
The dCache NFSv4.1 server is still experimental and in development.
For kernel client and above, at least dCache is required.
Its focus is read data access.
In dCache, NFSv3 & NFSv4.1 services cannot run on the same server.
dCache is the latest version available today.
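For reference, mounting a dCache NFSv4.1 export from a worker node would look roughly like this; the hostname and export path are hypothetical, and an NFSv4.1-capable kernel plus root privileges are assumed:

```shell
# Mount the NFSv4.1 door; caching is then handled by the kernel NFS client.
mount -t nfs4 -o minorversion=1 nfs-door.example.org:/pnfs/site /pnfs/site
ls /pnfs/site        # plain POSIX access from this point on
```

This is what makes NFSv4.1 attractive for the "read as need" pattern: once mounted, applications need no special client library at all.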

16 Performance test methodology
Because two groups of transfer times per protocol were observed, the small-file performance test was repeated twice, yielding the same results. Test commands:
(
  file=/pnfs/pic.es/at3/data10_7TeV _physics_Muons_ _00_D2AODM_TOPQCDMU.root
  for i in `seq 1 100`; do
    echo WGET TEST number $i `date`
    time wget -q -O /dev/null http://gridftp-disk.pic.es:2880$file
    echo DCCP TEST number $i `date`
    time dccp dcap://gridftp-disk.pic.es$file /dev/null
    echo XRDCP TEST number $i `date`
    time /root/ /bin/xrdcp -s -f root://gridftp-disk.pic.es$file /dev/null
    echo DCCPB10M TEST number $i `date`
    time dccp -B10M dcap://gridftp-disk.pic.es$file /dev/null
  done
) > wgetVSdccp2MB.output 2>&1

