Grid Data Access: Proxy Caches and User Views EGI Technical Forum 19 September 2011 Jan Just Keijser Cristian Cirstea Jeff Templon.

Slides:



Advertisements
Similar presentations
ESUP-Portail: a pure WebDAV-based Network attached Storage Pierre Gambarotto Pascal Aubry.
Advertisements

Tableau Software Australia
Prime’ Senior Project. Presentation Outline What is Our Project? Problem Definition What does our system do? How does the system work? Implementation.
Cross Platform Single Sign On using client certificates Emmanuel Ormancey, Alberto Pace Internet Services group CERN, Information Technology department.
Coursework 2: getting started (4) – using PhoneGap to build mobile applications (optional) Chris Greenhalgh G54UBI /
Firefox 2 Feature Proposal: Remote User Profiles TeamOne August 3, 2007 TeamOne August 3, 2007.
Proxy Cache Leonid Romanovsky Olga Fomenko Winter 2003 Instructor: Konstantin Sinyuk.
Proxy Servers Dr. Ronald Bergmann, CIO, ISO. Proxy servers A proxy server is a machine which acts as an intermediary between the computers of a local.
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
General Presentation August Based out of the Netherlands 8 years of development Launched in May Sales offices in Los Angeles, Amsterdam, Hong.
10 May 2007 HTTP - - User data via HTTP(S) Andrew McNab University of Manchester.
Jason G. Caudill Assistant Professor of Business Administration Carson-Newman College.
Implementation - Deployment Methods of deployment –User PC –Network shared (workstation install) –Terminal server –Web Deployment (ActiveX) (Note: this.
A Spring 2005 CS 426 Senior Project By Group 15 John Studebaker, Justin Gerthoffer, David Colborne CSE Dept., University of Nevada, Reno Advisors (CSE.
Installation and Development Tools National Center for Supercomputing Applications University of Illinois at Urbana-Champaign The SEASR project and its.
RAL PPD Computing A tier 2, a tier 3 and a load of other stuff Rob Harper, June 2011.
D C a c h e Michael Ernst Patrick Fuhrmann Tigran Mkrtchyan d C a c h e M. Ernst, P. Fuhrmann, T. Mkrtchyan Chep 2003 Chep2003 UCSD, California.
Introduction to dCache Zhenping (Jane) Liu ATLAS Computing Facility, Physics Department Brookhaven National Lab 09/12 – 09/13, 2005 USATLAS Tier-1 & Tier-2.
ArcGIS Server for Administrators
Kiew-Hong Chua a.k.a Francis Computer Network Presentation 12/5/00.
1 DIRAC Interfaces  APIs  Shells  Command lines  Web interfaces  Portals  DIRAC on a laptop  DIRAC on Windows.
Storage cleaner: deletes files on mass storage systems. It depends on the results of deletion, files can be set in states: deleted or to repeat deletion.
EGEE User Forum Data Management session Development of gLite Web Service Based Security Components for the ATLAS Metadata Interface Thomas Doherty GridPP.
Grid, Web services and Taverna Machiel Jansen Richard Holland.
Future home directories at CERN
Internet2 AdvCollab Apps 1 Access Grid Vision To create virtual spaces where distributed people can work together. Challenges:
Shibboleth & Grid Integration STFC and University of Oxford (and University of Manchester)
Full and Para Virtualization
FP6−2004−Infrastructures−6-SSA E-infrastructure shared between Europe and Latin America Grid2Win: Porting of gLite middleware to.
CERN IT Department CH-1211 Geneva 23 Switzerland GT HTTP solutions for data access, transfer, federation Fabrizio Furano (presenter) on.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Grid2Win : gLite for Microsoft Windows Roberto.
Tier3 monitoring. Initial issues. Danila Oleynik. Artem Petrosyan. JINR.
Software Architecture in Practice Mandatory project in performance engineering.
1 Egrid portal Stefano Cozzini and Angelo Leto. 2 Egrid portal Based on P-GRADE Portal 2.3 –LCG-2 middleware support: broker, CEs, SEs, BDII –MyProxy.
Selenium server By, Kartikeya Rastogi Mayur Sapre Mosheca. R
Tackling I/O Issues 1 David Race 16 March 2010.
CernVM-FS Infrastructure for EGI VOs Catalin Condurache - STFC RAL Tier1 EGI Webinar, 5 September 2013.
The Institute of High Energy of Physics, Chinese Academy of Sciences Sharing LCG files across different platforms Cheng Yaodong, Wang Lu, Liu Aigui, Chen.
DCache/XRootD Dmitry Litvintsev (DMS/DMD) FIFE workshop1Dmitry Litvintsev.
Consorzio COMETA - Progetto PI2S2 UNIONE EUROPEA Grid2Win : gLite for Microsoft Windows Elisa Ingrà - INFN.
SHIWA Desktop Cardiff University David Rogers, Ian Harvey, Ian Taylor, Andrew Jones.
Web and mobile access to digital repositories Mario Torrisi National Institute of Nuclear Physics – Division of
Bringing Federated Identity to Grid Computing Dave Dykstra CISRC16 April 6, 2016.
Federated Access to Storage EGI CF 2012 Luke Howard, Daniel Kouril, Michal Prochazka.
The LGI Pilot job portal EGI Technical Forum 20 September 2011 Jan Just Keijser Willem van Engen Mark Somers.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI EGI solution for high throughput data analysis Peter Solagna EGI.eu Operations.
PaaS services for Computing and Storage
Web and Proxy Server.
Building Azure Mobile Apps
Jean-Philippe Baud, IT-GD, CERN November 2007
Grid2Win: Porting of gLite middleware to Windows platform
Virtualisation for NA49/NA61
StoRM: a SRM solution for disk based storage systems
Vincenzo Spinoso EGI.eu/INFN
Dag Toppe Larsen UiB/CERN CERN,
The PaaS Layer in the INDIGO-DataCloud
Dag Toppe Larsen UiB/CERN CERN,
Data Bridge Solving diverse data access in scientific applications
Consulting Services JobScheduler Architecture Decision Template
Web Debugging Proxy Application
Introduction to Data Management in EGI
Grid2Win: Porting of gLite middleware to Windows XP platform
Virtualisation for NA49/NA61
Grid2Win: Porting of gLite middleware to Windows XP platform
Processes The most important processes used in Web-based systems and their internal organization.
Introduction to Apache
Cloud Web Filtering Platform
Storing and Accessing G-OnRamp’s Assembly Hubs outside of Galaxy
Distributing META-pipe on ELIXIR compute resources
Web Servers (IIS and Apache)
Presentation transcript:

Grid Data Access: Proxy Caches and User Views EGI Technical Forum 19 September 2011 Jan Just Keijser Cristian Cirstea Jeff Templon

2 Who? Why? What? How? Results What's next? Outline

3 By who Developed by Cristian Cirstea –Student at TU Eindhoven in the Netherlands –9 month master thesis project Supervised by Jeff Templon and yours truly For whom Initially for HEP users, but made applicable to all grid users End users that wish to browse datafiles on the grid on their laptop or desktop computers Grid users that have many jobs requiring access to the same set of files By who and for whom?

4 Existing data management tools: Require a specific version of Linux – no other platforms are supported Provide command-line tools only Do not integrate well into the operating system Do not scale well when requesting the same datafile many times... Need I continue? Why was it built?

5 Grid Proxy Cache Offers a standard WebDAV interface to users and to grid jobs Authentication can be done using username+password (via a MyProxy store) and using grid proxies Serves as a proxy to existing LFC and SRM implementations without having to change the LFCs or SRMs themselves Current implementation is readonly (it's a cache!) What is it?

6 Using mostly existing technologies WebDAV server and cache nodes are based on apache+mod_gridsite Cache nodes configured for automatic failover and load balancing Files retrieved from LFC/SRM using GFAL Linux FUSE client 'webdavfs' WebDAV interface done using FastCGI scripts Modifications made Gridsite's 'htproxyput' enhanced to support limited proxies 'webdavfs' enhanced to support grid proxies and HTTP REDIRECTs How does it work?

7 Architecture SRM LF C WebDAV Frontend SRM Cache nodes End User Worker Nodes MyProxy

8 Works with Windows XP/7, Mac OS X, Linux native WebDAV clients Also works with a web browser (and 'curl'/'wget') Browsing of various LFCs and SRMs is transparent Linux FUSE driver offers full POSIX access ('open'/'fopen') to grid files from within grid jobs Issues Unmounting FUSE mounts sometimes fails  Obstacle for production use MyProxy credentials need to be uploaded using Firefox plugin or command-line tool Results (1): Usability

9 Issues When many jobs all request different files a proxy cache can add extra overhead Results (2): Performance

10 Conclusions For end user file browsing current solution is production ready The system scales well under load For grid job access the FUSE issue needs to solved Future work Investigate possible write access Optimize WebDAV throughput Once SRMs provide an HTTPS interface SSLProxyCaching might be possible  further speed improvement Add a translation layer (e.g grab ZIP file from SRM, unpack it, then server the contents via WebDAV) Conclusions and future work