HTTP/HTTPS as Grid data transport (HTTP/HTTPS extensions), 6 March 2003. Andrew McNab, University of Manchester


1 HTTP/HTTPS as Grid data transport
6 March 2003
Andrew McNab, University of Manchester
mcnab@hep.man.ac.uk

2 Overview
- EDG Motivations
- Why use HTTP(S) for data transport
- What needs promoting/agreeing?
- Example: multistream HTTP
- Extensions to HTTP(S)
- Example: delegation over HTTPS
- HTTP(S) vs alternatives

3 Background: EDG Motivations
- EU DataGrid is interested in large High Energy Physics, Earth Observation and Bio/Medical datasets.
- Currently using GridFTP and the HEP-specific RFIO protocol for bulk data transfer
  - EDG has modular “Storage Element” fileservers which can support additional transfer protocol front-ends.
  - Looking at adding support for HTTP(S) to Storage Elements
  - Widespread availability and quality of HTTP clients discussed later.
- Interest in remote filesystems using GSI credentials
  - (cf Kerberos and AFS)
  - need a protocol with low overhead, reuse of connections etc.
- Also interest in delegation extensions for some aspects of information services

4 Why use HTTP(S) for data transport? (1)
- HTTP(S) are interesting and important protocols for several reasons:
  - HTTPS is by far the most widely deployed secure protocol
  - HTTP(S) has a large amount of high quality software that we can leverage
  - has excellent interaction with Firewalls, Network Address Translation and Application Proxies
  - HTTP is the basis for most Web and Grid Services work
- HTTPS consists of HTTP/1.1 over an SSL connection
  - security done by SSL layer, using X509 certificates (including GSI)
- HTTP/1.1 (RFC 2616) and extensions like WebDAV (RFC 2518) have a rich set of methods (GET, PUT, DELETE, COPY, LOCK etc), headers (“Expires:” etc) and errors (“413 Request Entity Too Large”)
  - so a standard way exists already for many data transfer operations

5 Why use HTTP(S) for data transport? (2)
- HTTP includes mechanisms for redirection and for offering multiple versions and letting the client choose.
- HTTP’s Range header allows partial GET and PUT operations
  - this makes it possible to implement multi-stream HTTP, with multiple TCP streams coming from one server, or striped across multiple servers.
- In practice, HTTP can be as fast as other TCP-based protocols
  - eg multistream copying of 300MB files across Europe by HTTP or GridFTP
- A very large amount of effort goes into producing HTTP(S) servers and clients with particular robustness or efficiency properties
  - eg Kernel-based “zero-copy” HTTP servers like tux are very efficient
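The partial GET mentioned on this slide comes down to one request header and one response header. The following is a minimal Python sketch, not code from the talk: the function names are hypothetical, and it only assumes the RFC 2616 conventions that byte ranges are inclusive and that a partial (206) response carries a `Content-Range: bytes first-last/total` header.

```python
import re


def byte_ranges(total_size, block_size):
    """Split a file of total_size bytes into inclusive (first, last) ranges."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [
        (start, min(start + block_size, total_size) - 1)
        for start in range(0, total_size, block_size)
    ]


def range_header(first, last):
    """Build the request header for one partial GET (inclusive byte offsets)."""
    return {"Range": f"bytes={first}-{last}"}


def parse_content_range(value):
    """Parse 'bytes first-last/total' from a 206 response.

    Returns (first, last, total); total is None when the server sends '*'.
    """
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        raise ValueError(f"unrecognised Content-Range: {value!r}")
    first, last, total = match.groups()
    return int(first), int(last), None if total == "*" else int(total)
```

Striping across several servers would use the same ranges, with each range simply sent to a different host.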

6 What needs promoting/agreeing to use this?
- “Informational”
  - eg What can be achieved using HTTP(S)
  - eg performance of HTTP(S) vs other protocols for given context
- “Best Practice”
  - eg How existing standards should be used to achieve particular performance / functionality.
- Standards
  - eg Which part of existing standards should go from “MAY” to “SHOULD” or “MUST” in a Grid data transport context.
  - eg What extensions or new standards do we need to achieve particular functionality or performance.

7 Best practice example: multistream HTTP
- HTTP can support application-level multiple streams and striping by using the standard Range header from RFC 2616 (HTTP/1.1) to set up many partial fetches.
  - This mechanism is supported by almost all modern web servers
  - eg Apache and RedHat’s tux kernel httpd
- Multiple streams implemented by client splitting into threads
  - Each thread requests a block of the file from the server
  - As each request completes, thread finds next unfetched block and requests it
- For this, it is essential that servers support Range header, and yet this is a relatively obscure feature in Web contexts, which many developers are not aware of.
- So best practice statement would be “support the Range header”
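The thread-per-stream scheme on this slide can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation used for the measurements in the talk: `multistream_get` and `http_fetch_block` are hypothetical names, the file size is assumed to be known in advance (for example from a prior HEAD request), and the whole file is assembled in memory for simplicity.

```python
import threading
import urllib.request


def http_fetch_block(url, first, last):
    """Fetch bytes first..last (inclusive) with a single Range request."""
    request = urllib.request.Request(
        url, headers={"Range": f"bytes={first}-{last}"}
    )
    with urllib.request.urlopen(request) as response:
        # 206 means the server honoured the Range header; a plain 200
        # would be the whole file, so treat it as "Range unsupported".
        if response.status != 206:
            raise RuntimeError("server did not return 206 Partial Content")
        return response.read()


def multistream_get(url, total_size, block_size=1 << 20, streams=4,
                    fetch_block=http_fetch_block):
    """Download url using several threads, each fetching blocks in turn."""
    blocks = [
        (start, min(start + block_size, total_size) - 1)
        for start in range(0, total_size, block_size)
    ]
    result = bytearray(total_size)
    lock = threading.Lock()
    next_index = 0
    errors = []

    def worker():
        nonlocal next_index
        while True:
            # Claim the next unfetched block, or stop if none are left.
            with lock:
                if errors or next_index >= len(blocks):
                    return
                first, last = blocks[next_index]
                next_index += 1
            try:
                data = fetch_block(url, first, last)
                if len(data) != last - first + 1:
                    raise RuntimeError("short or oversized block")
                result[first:last + 1] = data  # blocks never overlap
            except Exception as exc:
                with lock:
                    errors.append(exc)
                return

    threads = [threading.Thread(target=worker) for _ in range(streams)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    if errors:
        raise errors[0]
    return bytes(result)
```

The `fetch_block` parameter exists so that the block-scheduling logic can be exercised without a network; a server that ignores the Range header is reported as an error, which is exactly the interoperability gap the best practice statement is meant to close.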

8 Extensions to HTTP(S)
- HTTPS/HTTP already have most of the functionality we need for Grid information/control/data transport
  - some of these come from several sources (eg the WebDAV RFC 2518, not just HTTP/1.1 itself) and can be done different ways
  - frequently “MAY” -> “SHOULD” / “MUST”
  - so want to specify a sufficient subset for interoperability
- However, can identify some extensions that are also valuable:
  - delegation over HTTPS
  - some way of returning access control information along with data
  - may want to specify TCP parameters for bulk data transfer
  - so want to specify new HTTP headers and methods for the above
- (My feeling is that we should retain backwards and “pass through” compatibility with existing HTTP(S) implementations.)

9 Example: delegation over HTTPS
- Client issues GET-PROXY-REQ request.
  - server generates a key and a certificate request, returns this in the response message body.
- Client signs the cert request, and returns it in body of PUT-PROXY-CERT request.
- Need a Delegation-ID header in the above exchanges so we can keep track of the delegation session
  - may want to maintain delegation sessions for the same user at one server, but with different amounts of delegation
  - subsequent GET, PUT etc actions carry on using the Delegation-ID
- Most clients and servers can pass through unknown methods/headers
  - Delegation-unaware server responds with “501 Method not implemented”
- (Demonstration implementation of this in GridSite)
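The client side of this exchange could look roughly like the Python sketch below. Only the method names GET-PROXY-REQ and PUT-PROXY-CERT, the Delegation-ID header and the 501 fallback come from the slide; everything else is an assumption made for illustration: the function name `delegate_proxy`, the request path, a client-chosen Delegation-ID sent on both requests, the success status codes, and the `sign_request` callable that stands in for the actual certificate signing. It is not the GridSite implementation.

```python
def delegate_proxy(conn, path, delegation_id, sign_request):
    """Run the two-step delegation exchange over an open HTTPS connection.

    conn is expected to behave like http.client.HTTPSConnection, which
    accepts arbitrary method names. sign_request is a caller-supplied
    function (hypothetical) that turns the server's certificate request
    into a signed proxy certificate, both as bytes.
    Returns False if the server does not understand delegation.
    """
    headers = {"Delegation-ID": delegation_id}

    # Step 1: ask the server to generate a key pair and certificate request.
    conn.request("GET-PROXY-REQ", path, headers=headers)
    response = conn.getresponse()
    cert_request = response.read()
    if response.status == 501:
        # Delegation-unaware server: "501 Method not implemented".
        return False
    if response.status != 200:  # assumed success code
        raise RuntimeError(f"GET-PROXY-REQ failed: {response.status}")

    # Step 2: sign the request locally and send the certificate back.
    proxy_cert = sign_request(cert_request)
    conn.request("PUT-PROXY-CERT", path, body=proxy_cert, headers=headers)
    response = conn.getresponse()
    response.read()
    if response.status not in (200, 201, 204):  # assumed success codes
        raise RuntimeError(f"PUT-PROXY-CERT failed: {response.status}")
    return True
```

Because the private key is generated on the server and only the certificate request and signed certificate cross the wire, the client's own key never leaves the client. Later GET and PUT requests would carry the same Delegation-ID header to select this delegation session.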

10 (Extended) HTTP(S) vs alternatives
- Could use existing protocols: GridFTP etc
  - HTTP(S) motivated for reasons given at start
  - Some environments (eg NAT) better suited to HTTP(S)
- Could use ad-hoc conventions for some things
  - eg always use “POST /cgi-bin/delegation.cgi” for delegation
  - messy to implement, difficult to agree / standardize
  - difficult to implement transparently (eg for Trusted Caches)
- Could do it all in SOAP, Web Services etc
  - worried about efficiency of encoding, set up time of transfers etc: what if we want to grab a large number of small files, say?
  - only works for SOAP- or WS- applications.
- So HTTP(S) appears to address Grid data transport in some contexts better than other protocols.

11 Summary
- HTTP(S) interesting because
  - widespread adoption, widespread support by multiple languages/platforms
  - coexists well with NAT etc
  - HTTPS naturally interoperates with GSI-based security
  - protocol has many features (eg Range header) which are very useful for data transport
- Scope for doing informational, best practice and standardisation activities
  - how much (other) interest is there in doing this in GGF?

