R. Hughes-Jones Manchester


1 R. Hughes-Jones Manchester
Lessons Learned in Grid Networking, or: How do we get end-2-end performance to Real Users?
Richard Hughes-Jones, Manchester
GNEW2004 CERN March 2004

2 Network Monitoring is Essential
Detect or X-check problem reports
Isolate / determine a performance issue
Capacity planning
Publication of data: network “cost” for middleware
  RBs for optimized matchmaking
  WP2 Replica Manager
SLA verification
Isolate / determine throughput bottleneck – work with real user problems
Test conditions for Protocol/HW investigations
Protocol performance / development
Hardware performance / development
Application analysis
Input to middleware – eg gridftp throughput
Isolate / determine a (user) performance issue
Hardware / protocol investigations
End2End Time Series
  Throughput UDP/TCP
  Rtt
  Packet loss
Passive Monitoring
  Routers, Switches: SNMP, MRTG
  Historical MRTG
Packet/Protocol Dynamics
  tcpdump
  web100
Output from Application tools

3 R. Hughes-Jones Manchester
Multi-Gigabit transfers are possible and stable
10 GigEthernet at SC2003 BW Challenge
Three Server systems with 10 GigEthernet NICs
Used the DataTAG altAIMD stack, 9000 byte MTU
Send mem-mem iperf TCP streams from SLAC/FNAL booth in Phoenix to:
Palo Alto PAIX: rtt 17 ms, window 30 MB
  Shared with Caltech booth
  4.37 Gbit hstcp I=5%
  Then 2.87 Gbit I=16%
  Fall corresponds to 10 Gbit on link
  3.3 Gbit Scalable I=8%
  Tested 2 flows, sum 1.9 Gbit I=39%
Chicago Starlight: rtt 65 ms, window 60 MB
  Phoenix CPU 2.2 GHz
  3.1 Gbit hstcp I=1.6%
Amsterdam SARA: rtt 175 ms, window 200 MB
  4.35 Gbit hstcp I=6.9%
  Very Stable
Both used Abilene to Chicago
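As a rough check on the rtt and window figures quoted above: a single TCP stream can carry at most one window per round trip, so window / rtt is an upper bound on its throughput. The sketch below applies that bound to the three paths. The function name and the reading of "MB" as 10^6 bytes are assumptions for illustration, not taken from the slides.

```python
def window_limited_gbit(window_mbytes: float, rtt_ms: float) -> float:
    """Upper bound on single-stream TCP throughput in Gbit/s (window / rtt)."""
    window_bits = window_mbytes * 1e6 * 8      # assumes 1 MB = 10**6 bytes
    return window_bits / (rtt_ms * 1e-3) / 1e9


# (window in MB, rtt in ms) as quoted for each destination
PATHS = {
    "Palo Alto PAIX": (30, 17),
    "Chicago Starlight": (60, 65),
    "Amsterdam SARA": (200, 175),
}

for name, (window_mb, rtt_ms) in PATHS.items():
    bound = window_limited_gbit(window_mb, rtt_ms)
    print(f"{name}: at most {bound:.2f} Gbit/s per stream")
```

Under these assumptions the bounds come out at roughly 14.1, 7.4 and 9.1 Gbit/s, all above the measured rates, which suggests the configured windows were not what limited those transfers.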

4 R. Hughes-Jones Manchester
The performance of the end host / disks is really important
BaBar Case Study: RAID Throughput & PCI Activity
3Ware RAID5 parallel EIDE
3Ware forces PCI bus to 33 MHz
BaBar Tyan to MB-NG SuperMicro
  Network mem-mem 619 Mbit/s
  Disk – disk throughput bbcp Mbytes/s (320 – 360 Mbit/s)
PCI bus effectively full!
User throughput ~ 250 Mbit/s
User surprised !!
Plots: Read from RAID5 Disks; Write to RAID5 Disks

5 Application design – Throughput + Web100
2 Gbyte file transferred, RAID0 disks
Web100 output every 10 ms
Gridftp: see alternate 600/800 Mbit and zero
Apache web server + curl-based client: see steady 720 Mbit
MB-NG
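To make the 10 ms sampling concrete, here is a minimal sketch of how a throughput trace can be derived from a cumulative byte counter read at fixed intervals. The sample values and function names are hypothetical, and this is not the Web100 API itself, only the arithmetic applied to its output. The second helper estimates how long a file takes at a steady rate, assuming 2 Gbyte means 2 x 10^9 bytes.

```python
def interval_mbit(byte_counts, interval_ms=10):
    """Per-interval throughput in Mbit/s from cumulative byte counters."""
    return [
        # bits per millisecond / 1000 == Mbit/s
        (later - earlier) * 8 / (interval_ms * 1000)
        for earlier, later in zip(byte_counts, byte_counts[1:])
    ]


def transfer_seconds(file_bytes, rate_mbit):
    """Time to move file_bytes at a steady rate given in Mbit/s."""
    return file_bytes * 8 / (rate_mbit * 1e6)


# Hypothetical counter samples: two busy intervals, then a stalled one.
samples = [0, 900_000, 1_800_000, 1_800_000]
print(interval_mbit(samples))                   # [720.0, 720.0, 0.0]

# Assumes 2 Gbyte = 2e9 bytes, at the steady 720 Mbit/s seen with Apache + curl.
print(round(transfer_seconds(2e9, 720), 1))     # 22.2
```

A trace that alternates between a high rate and zero, as reported for Gridftp, would show up here as runs of non-zero values separated by runs of 0.0, which is why fine-grained sampling reveals behaviour that an average over the whole transfer hides.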

6 R. Hughes-Jones Manchester
Network Monitoring is vital
Development of new TCP stacks and non-TCP protocols is required
Multi-Gigabit transfers are possible and stable on current networks
Complementary provision of packet IP & λ-Networks is needed
The performance of the end host / disks is really important
Application design can determine Perceived Network Performance
Helping Real Users is a must – can be harder than herding cats
Cooperation between Network providers, Network Researchers, and Network Users has been impressive
Standards (eg GGF / IETF) are the way forward
Many grid projects just assume the network will work !!!
It takes lots of co-operation to put all the components together


8 Tuning PCI-X: Variation of mmrbc IA32
Intel PRO/10GbE LR Adapter
16080 byte packets every 200 µs
Logic analyser traces for mmrbc = 512, 1024, 2048 and 4096 bytes, showing: CSR Access, PCI-X Sequence, Data Transfer, Interrupt & CSR Update
Plot: PCI-X bus occupancy vs mmrbc (Measured times; Times based on PCI-X times from the logic analyser; Expected throughput)
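The effect of mmrbc (the PCI-X maximum memory read byte count) can be illustrated with a simplified model: if each read burst moves at most mmrbc bytes, a larger setting means fewer bursts per packet and so less per-burst overhead on the bus. The sketch below counts bursts for the four settings shown and computes the offered load for 16080 byte packets sent every 200 µs. It ignores descriptors and protocol overhead, and the function names are illustrative assumptions.

```python
import math

PACKET_BYTES = 16080   # packet size quoted on the slide
INTERVAL_US = 200      # one packet every 200 microseconds


def sequences_per_packet(mmrbc: int, packet_bytes: int = PACKET_BYTES) -> int:
    """Read bursts needed per packet if each burst carries at most mmrbc bytes."""
    return math.ceil(packet_bytes / mmrbc)


def offered_mbit(packet_bytes: int = PACKET_BYTES, interval_us: int = INTERVAL_US) -> float:
    """Offered load in Mbit/s (bits per microsecond equals Mbit/s)."""
    return packet_bytes * 8 / interval_us


for mmrbc in (512, 1024, 2048, 4096):
    print(f"mmrbc {mmrbc:4d} bytes: {sequences_per_packet(mmrbc)} bursts per packet")

print(f"offered load: {offered_mbit():.1f} Mbit/s")
```

In this model the burst count falls from 32 at 512 bytes to 4 at 4096 bytes, which is the direction of the bus-occupancy improvement the plot is described as showing; the actual timings come from the logic analyser measurements, not from this arithmetic.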

9 GGF: Hierarchy Characteristics Document
“A Hierarchy of Network Performance Characteristics for Grid Applications and Services”
Document defines terms & relations:
  Network characteristics
  Measurement methodologies
  Observation
Discusses Nodes & Paths
For each Characteristic:
  Defines the meaning
  Attributes that SHOULD be included
  Issues to consider when making an observation
Status:
  Originally submitted to GFSG as Community Practice Document draft-ggf-nmwg-hierarchy-00.pdf, Jul 2003
  Revised to Proposed Recommendation, Jan 04
  Now in 60 day Public comment from 28 Jan 04 – 18 days to go.

