DataTAG Mission: TransAtlantic Grid
- EU-US Grid network research
  - High performance transport protocols
  - Inter-domain QoS
  - Advance bandwidth reservation
- EU-US Grid interoperability
- Sister project to EU DataGRID
CGW03, Cracow, 28 October 2003
Main DataTAG achievements (EU-US Grid interoperability)
- GLUE interoperability effort with DataGrid, iVDGL & Globus
- GLUE testbed & demos
- VOMS design and implementation in collaboration with DataGrid
- VOMS evaluation within iVDGL underway
- Integration of GLUE compliant components in DataGrid and VDT middleware
Main DataTAG achievements (Advanced networking)
Internet land speed records have been beaten one after the other by DataTAG project members and/or teams closely associated with DataTAG:
- Atlas Canada lightpath experiment (iGRID2002)
- New Internet2 land speed record (I2LSR) by Nikhef/Caltech team (SC2002)
- Scalable TCP, HSTCP, GridDT & FAST experiments (DataTAG partners & Caltech)
- Intel 10GigE tests between CERN (Geneva) and SLAC (Sunnyvale) (Caltech, CERN, Los Alamos NL, SLAC)
- New I2LSR (Feb 27-28, 2003): 2.38 Gb/s sustained rate, single TCP/IPv4 flow, 1 TB in one hour, Caltech-CERN
- Latest IPv4 & IPv6 I2LSR were awarded live from Indianapolis during Telecom World 2003:
  - May 6, 2003: 987 Mb/s, single TCP/IPv6 stream
  - Oct 1, 2003: 5.44 Gb/s sustained rate, single TCP/IPv4 stream, 1.1 TB in 26 minutes, i.e. one 680 MB CD per second
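The record figures above can be cross-checked with simple arithmetic. The sketch below assumes decimal units (1 Gb = 1e9 bits, 1 MB = 1e6 bytes, 1 TB = 1e12 bytes), which the slide does not state explicitly; the function names are illustrative only.

```python
# Cross-check of the I2LSR figures quoted on the slide.
# Assumption: decimal units (1 Gb = 1e9 bits, 1 MB = 1e6 bytes, 1 TB = 1e12 bytes).

def bytes_moved(rate_bps: float, seconds: float) -> float:
    """Bytes transferred at a sustained bit rate over a given duration."""
    return rate_bps / 8 * seconds


def cds_per_second(rate_bps: float, cd_bytes: float = 680e6) -> float:
    """How many CDs of the given size the rate fills each second."""
    return rate_bps / 8 / cd_bytes


# Feb 27-28, 2003 record: 2.38 Gb/s sustained for one hour.
feb_tb = bytes_moved(2.38e9, 3600) / 1e12

# Oct 1, 2003 record: 5.44 Gb/s sustained for 26 minutes.
oct_tb = bytes_moved(5.44e9, 26 * 60) / 1e12

print(f"Feb 2003: {feb_tb:.2f} TB in one hour")
print(f"Oct 2003: {oct_tb:.2f} TB in 26 minutes")
print(f"Oct 2003: {cds_per_second(5.44e9):.2f} CDs (680 MB) per second")
```

Both records come out slightly above 1 TB, consistent with the rounded "1 TB" and "1.1 TB" on the slide, and 5.44 Gb/s corresponds to 680 MB/s, hence one CD per second.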
Significance of I2LSR to the Grid?
Essential to establish the feasibility of multi-Gigabit/second single stream IPv4 & IPv6 data transfers:
- Over dedicated testbeds in a first phase
- Then across academic & research backbones
- Last but not least across campus networks
- Disk to disk rather than memory to memory
- Study impact of high performance TCP over disk servers
Next steps:
- Above 6 Gb/s expected soon between CERN and Los Angeles (Caltech/CENIC PoP) across DataTAG & Abilene
- Goal is to reach 10 Gb/s with new PCI Express buses
- Study alternatives to standard TCP:
  - Non-TCP transport
  - HSTCP, FAST, Grid-DT, etc.
Impact of high performance flows across A&R backbones?
Possible solutions:
- Use of “TCP friendly” non-TCP (i.e. UDP) transport
- Use of Scavenger (i.e. less than best effort) services
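As an illustration of the second option, a sender can mark its own traffic as less than best effort. The sketch below is a minimal example, not part of the DataTAG work: it assumes the DSCP value 8 (binary 001000, CS1) used by Internet2's Scavenger Service, and it assumes the operating system and every network on the path honour the marking. The helper names are hypothetical.

```python
import socket

# Assumption: DSCP 8 (binary 001000, CS1) is the code point for
# scavenger / less-than-best-effort traffic on the networks in question.
SCAVENGER_DSCP = 8


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit DSCP value in the upper bits of the IPv4 TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def make_scavenger_udp_socket() -> socket.socket:
    """Create a UDP socket whose outgoing packets carry the scavenger DSCP.

    Whether the marking survives end to end depends on the OS and on the
    policy of each network crossed.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(SCAVENGER_DSCP))
    return sock
```

Marking alone does not make a UDP flow “TCP friendly”; the application would still need its own rate control.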
Layer1/2/3 networking (1)
Conventional layer 3 technology is no longer fashionable because of:
- High associated costs, e.g. 200/300 KUSD for a 10G router interface
- Implied use of shared backbones
The use of layer 1 or layer 2 technology is very attractive because it helps to solve a number of problems, e.g.:
- 1500 bytes Ethernet frame size (layer 1)
- Protocol transparency (layer 1 & 2)
- Minimum functionality hence, in theory, much lower costs (layer 1 & 2)
Layer1/2/3 networking (2)
So-called “lambda Grids” are becoming very popular.
Pros:
- Circuit oriented model like the telephone network, hence no need for complex transport protocols
- Lower equipment costs (i.e. typically a factor 2 or 3 per layer)
- The concept of a dedicated end to end light path is very elegant
Cons:
- “End to end” still very loosely defined, i.e. site to site, cluster to cluster or really host to host
- High cost, scalability & additional required middleware to deal with circuit set up, etc.
State of 10G deployment and beyond
Still little deployed, because of lack of demand, hence:
- Lack of products
- High costs, e.g. 150 KUSD for a 10GigE port on a Juniper T320 router
- Even switched, layer 2, 10GigE ports are expensive; however, prices should come down to 10 KUSD/port towards the end of 2003
40G deployment, although more or less technologically ready, is unlikely to happen in the near future, i.e. before LHC starts.
10G DataTAG testbed extension to Telecom World 2003 and Abilene/Cenic
Sponsors: Cisco, HP, Intel, OPI (Geneva’s Office for the Promotion of Industries & Technologies), Services Industriels de Geneve, Telehouse Europe, T-Systems
On September 15, 2003, the DataTAG project was the first transatlantic testbed offering direct 10GigE access using Juniper’s VPN layer2/10GigE emulation.
NEC’2003 Conference, Varna (Bulgaria), 19 September 2003
Impediments to high E2E throughput across LAN/WAN infrastructure
- For many years the Wide Area Network has been the bottleneck; this is no longer the case in many countries, thus, in principle, making the deployment of data intensive Grid infrastructure possible!
- Recent I2LSR records show for the first time ever that the network can be truly transparent and that throughputs are limited by the end hosts
- The dream of abundant bandwidth has now become a reality in large, but not all, parts of the world!
- The challenge has shifted from getting adequate bandwidth to deploying adequate LANs and cybersecurity infrastructure, as well as making effective use of it!
- Major transport protocol issues still need to be resolved; however, there are many encouraging signs that practical solutions may now be in sight.
Single TCP stream performance under periodic losses
Bandwidth available = 1 Gbps, loss rate = 0.01%:
- LAN BW utilization = 99%
- WAN BW utilization = 1.2%
- TCP throughput is much more sensitive to packet loss in WANs than in LANs
  - TCP’s congestion control algorithm (AIMD) is not suited to gigabit networks
  - Poor, limited feedback mechanisms
  - The effect of even very small packet loss rates is disastrous
- TCP is inefficient in high bandwidth*delay networks
- The future performance of data intensive grids looks grim if we continue to rely on the widely-deployed TCP RENO stack
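The gap between LAN and WAN utilization can be approximated with the well-known steady-state TCP Reno estimate, throughput ≈ (MSS/RTT) · sqrt(3/2) / sqrt(p). The sketch below is not the model behind the slide's figures, and several inputs are assumptions not given on the slide: an MSS of 1460 bytes, a WAN round-trip time of 120 ms, and a LAN round-trip time of 0.1 ms. It also estimates how long standard AIMD needs to regain full rate after a single loss.

```python
import math

# Values from the slide.
LINK_BPS = 1e9        # 1 Gbps available bandwidth
LOSS_RATE = 1e-4      # 0.01% packet loss

# Assumptions (not stated on the slide).
MSS_BYTES = 1460      # typical Ethernet MSS
WAN_RTT_S = 0.120     # assumed transatlantic round-trip time
LAN_RTT_S = 0.0001    # assumed LAN round-trip time


def reno_throughput_bps(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Steady-state Reno estimate: (MSS / RTT) * sqrt(3/2) / sqrt(p)."""
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5 / loss_rate)


def utilization(link_bps: float, mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Fraction of the link a single flow can use, capped at 1.0."""
    return min(1.0, reno_throughput_bps(mss_bytes, rtt_s, loss_rate) / link_bps)


def aimd_recovery_time_s(link_bps: float, mss_bytes: float, rtt_s: float) -> float:
    """Time to grow back from half the full window at one MSS per RTT."""
    full_window_packets = link_bps * rtt_s / (mss_bytes * 8)
    return (full_window_packets / 2) * rtt_s


lan_util = utilization(LINK_BPS, MSS_BYTES, LAN_RTT_S, LOSS_RATE)
wan_util = utilization(LINK_BPS, MSS_BYTES, WAN_RTT_S, LOSS_RATE)
recovery = aimd_recovery_time_s(LINK_BPS, MSS_BYTES, WAN_RTT_S)

print(f"LAN utilization: {lan_util:.1%}")
print(f"WAN utilization: {wan_util:.1%}")
print(f"WAN recovery after one loss: {recovery / 60:.1f} minutes")
```

Under these assumptions the estimate gives a link-limited LAN flow and a WAN flow using roughly 1.2% of the link, in line with the slide. The recovery time of about ten minutes after a single loss shows why AIMD is poorly suited to high bandwidth*delay paths.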