NASA EOS Active Network Performance Testing Using Web100
Andy Germain
Swales Aerospace
1 August 2002
Andy.Germain@gsfc.nasa.gov
301-902-4352
24 June 2002 Andy Germain

EOS Active Testing Overview
End-to-end user level test
– Active testing, no visibility into network internals
Communities
– EOS Internal Network: 9 sites, 8 sources, 13 sinks
    "Production" flows, dedicated bandwidth
– EOS Science Users: about 50 sites, tested from EOS DAACs
    "QA" and Science flows, often via Abilene
– CEOS: about 20 international sites
    Earth Observation data sharing
Purposes
– Verify that networks as implemented meet SLA and/or requirements
– Assess whether networks can support intended applications
– Resolve user complaints: network problems -- or elsewhere?
– Determine bottlenecks -- seek routing alternatives
– Provide a basis for allocation of additional resources
Results at http://corn.eos.nasa.gov/networks

Test Process
Test script runs hourly to each site:
Traceroute (1 way)
– Number of hops -- route stability (Hops Chart)
Pings
– 100 pings prior to thruput test and/or 100/300 during
– Round Trip Time (RTT Chart)
– Packet Loss (Packet Loss Chart)
TCP Throughput
– Iperf (Thruput Chart)
– Keeps send buffer full for 30 seconds
– Netstat packets retransmitted (if pings blocked)
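
The ping step above reduces to two numbers per site, packet loss and round trip time. A minimal sketch of pulling them from a ping summary follows. It assumes the summary format printed by current Linux and BSD ping programs; the function name and the sample text are hypothetical and do not come from the actual test script.

```python
import re


def parse_ping_summary(text):
    """Return (loss_percent, avg_rtt_ms) from a ping summary.

    avg_rtt_ms is None when no RTT line is present (for example 100% loss).
    Assumes the "N% packet loss" and "min/avg/max = a/b/c" summary format.
    """
    loss_match = re.search(r"([\d.]+)% packet loss", text)
    if loss_match is None:
        raise ValueError("no packet-loss summary found")
    loss = float(loss_match.group(1))

    # The second field of the min/avg/max triple is the average RTT.
    rtt_match = re.search(r"= ([\d.]+)/([\d.]+)/([\d.]+)", text)
    avg = float(rtt_match.group(2)) if rtt_match else None
    return loss, avg


# Hypothetical sample output, not taken from the slides.
SAMPLE = """100 packets transmitted, 98 received, 2% packet loss, time 99123ms
rtt min/avg/max/mdev = 59.812/60.240/61.033/0.301 ms"""

loss, avg_rtt = parse_ping_summary(SAMPLE)
print(f"loss={loss}% avg_rtt={avg_rtt} ms")
```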

EOS Performance Test Sites
[Map of test sites]
Key: EOS DAAC; NASA Nodes; SCFs; QA; Other Nodes
Sites shown: ORST; UCSB; Ariz; LANL; Wisc; Miami; SUNY-SB; BU; GSFC; LaRC; EDC; MSFC, NSSTC; NCAR; Mont; JPL; Toronto; Colo St.; Niagara; ASF; Chicago; SLAC; NSIDC; NMEX; CCRS; UVA; UMD; GPN; NGDC, NOAA; USF; RSS; Texas; UCSD; Wash; Mich; NOAA; Ohio; Penn State; NCDC; MIT

EOS International Test Sites
[Map of international test sites]
Key: EOSDIS; Mission Partner; CEOS; PI: QA/IST
Sites shown: GSFC; CCRS; JPL; NASDA (ADEOS, TRMM, Aura, Aqua); CSIRO; ESRIN; INPE (Aqua), IDN; CONAE; IRE-RAS; Israel; ASF; NSIDC; EDC; LaRC; MITI (Terra); CAO (SAGE III); RAL, OXFORD (Aura); Toronto (Terra); UCL (Terra); JRC; AIT, RFD, GISTDA; KNMI (Aura)

Uses of Web100
One of our sources at GSFC runs Web100
– King = "GSFC MAX"
– Connected to MAX by GigE
Typical use is in problem solving
– DTB, Triage
    Window size (easier to use than tcpdump)
    Vs. circuit limitations vs. packet loss
– Also ANLiperf
    Window size again
Plan: extract packet drops from Web100, not pings or netstat

A recent case
Sending data from LaRC to JPL via a project-dedicated 20 Mbps ATM VC
– Problem surfaced after firewall was installed
    Portus "proxy" firewall
RTT of 60 ms requires 150 KB windows
– To fill pipe with a single TCP stream
Iperf worked well: a single stream typically got over 15 Mbps
But ftp got < 8 Mbps
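
The 150 KB figure is the bandwidth-delay product of the circuit. As a quick check of the arithmetic (the function name is illustrative, and KB is taken here as 1000 bytes):

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bytes that must be in flight to keep a path of this rate and RTT full."""
    return bandwidth_bps * rtt_s / 8


# 20 Mbps ATM VC with a 60 ms round trip time, as on the slide.
window = bdp_bytes(20e6, 0.060)
print(f"required window: {window / 1000:.0f} KB")
```

A single TCP stream with a smaller window than this cannot fill the pipe, whatever the application does.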

A recent case (2)
The problem, of course, was window size
– Looked like it was the ftp application, since iperf performance showed that O/S was OK
– But which end?
Ran ftps from both nodes to Web100 node
– Used DTB to capture window size
– Problem: small disk quota
    FTPs were quick
    FTP data session not established until ftp started
    So had to be quick to capture data with DTB
– DTB showed one site had 64 KB windows
But problem was in O/S (IRIX), not ftp
– tcp_recvspace and tcp_sendspace
– Iperf can exceed O/S defaults!
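
The last point explains why iperf hid the problem: an application can ask for socket buffers larger than the system defaults, while one that does not ask simply inherits them. A minimal sketch of such a request using the standard sockets API follows. The 150,000-byte value is the window computed for this circuit; the helper name is hypothetical, and the size actually granted depends on the operating system and its configured limits.

```python
import socket


def request_buffers(sock, nbytes):
    """Ask the O/S for nbytes of send and receive buffer; return what it granted."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, nbytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, nbytes)
    return (
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
    )


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    # What an application gets if it never asks (the ftp case).
    default_snd = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    default_rcv = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    # What an application gets if it asks before connecting (the iperf case).
    snd, rcv = request_buffers(s, 150_000)

print(f"defaults:  send={default_snd} recv={default_rcv}")
print(f"requested: send={snd} recv={rcv}")
```

The request has to be made before the connection is established for it to influence the window scale negotiated in the handshake.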

Case #2
Another case of limited thruput
– This time iperf was limited
– From one source to several destinations
– Limit inversely proportional to RTT, which points to window size
– But source and dest clearly used large windows
Testing to Web100 box showed source was not using extended windows
TCPdump on source showed it was!
Problem turned out to be PIX firewall
– Nop'd out the WSCALE field!
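
To illustrate why the two ends disagreed, here is a small sketch of TCP option parsing and of what overwriting the window scale option with NOPs does to a SYN. The option kinds (1 = NOP, 2 = MSS, 3 = window scale, 4 = SACK permitted) are the standard values; the sample option bytes and function names are made up for illustration, and nothing here reproduces the firewall's actual behaviour.

```python
EOL, NOP, MSS, WSCALE, SACK_PERMITTED = 0, 1, 2, 3, 4


def _walk(raw):
    """Yield (offset, kind, length) for each option in a TCP options field."""
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == EOL:
            return
        if kind == NOP:
            yield i, NOP, 1
            i += 1
            continue
        if i + 1 >= len(raw):
            raise ValueError("truncated option")
        length = raw[i + 1]
        if length < 2 or i + length > len(raw):
            raise ValueError("bad option length")
        yield i, kind, length
        i += length


def window_scale(raw):
    """Return the window scale shift count, or None if the option is absent."""
    for offset, kind, length in _walk(raw):
        if kind == WSCALE:
            return raw[offset + 2]
    return None


def nop_out(raw, target_kind):
    """Return a copy of the options with every target_kind option replaced by NOPs."""
    out = bytearray(raw)
    for offset, kind, length in _walk(raw):
        if kind == target_kind:
            out[offset:offset + length] = bytes([NOP]) * length
    return bytes(out)


# Hypothetical SYN options: MSS 1460, NOP, WSCALE shift 7, NOP, NOP, SACK permitted.
syn_sent = bytes([2, 4, 0x05, 0xB4, 1, 3, 3, 7, 1, 1, 4, 2])
syn_received = nop_out(syn_sent, WSCALE)

print("seen at source (tcpdump):", window_scale(syn_sent))
print("seen at receiver:        ", window_scale(syn_received))
```

A capture at the source still shows the option, while the receiver sees none and falls back to unscaled windows, which the 16-bit window field caps at 65,535 bytes.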

Case #3
Iperf from GSFC to Tokyo XP
– Via MAX, Abilene, Seattle, TransPac
Thruput appears to ramp up linearly for about 5 minutes (when no loss)
– Then becomes window limited: 1 MB window @ 188 ms RTT = 42.5 Mbps
– Repeatable (more or less)
– Low or no packet loss
Web100 Triage usually reports 100% path limited
– But can't show early part of session (?)
What causes this ramp-up ???
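
The window-limited ceiling is one window per round trip. A short check of the figure on this slide, assuming 1 MB means 10^6 bytes (which is the reading that reproduces 42.5 Mbps); the function name is illustrative:

```python
def window_limited_mbps(window_bytes, rtt_ms):
    """Throughput ceiling in Mbps when at most one window is sent per RTT."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6


# 1 MB window at 188 ms RTT (GSFC to Tokyo XP).
case3 = window_limited_mbps(1_000_000, 188)
# 64 KB window at 60 ms RTT (the earlier LaRC to JPL ftp case), 64 KB taken as 65,536 bytes.
ftp_case = window_limited_mbps(64 * 1024, 60)

print(f"1 MB @ 188 ms: {case3:.1f} Mbps")
print(f"64 KB @ 60 ms: {ftp_case:.1f} Mbps")
```

The same formula puts the 64 KB ftp case at a ceiling of a little under 9 Mbps, in line with the observed rate of under 8 Mbps.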

Traceroute
traceroute to perf.jp.apan.net (22.214.171.124), 30 hops max, 38 byte packets
 1  enpl-rtr1-ge (126.96.36.199)  0.427 ms  0.325 ms  0.396 ms
 2  188.8.131.52 (184.108.40.206)  0.397 ms  0.375 ms  0.275 ms
 3  220.127.116.11 (18.104.22.168)  0.740 ms  1.266 ms  1.225 ms
 4  gsfc-wash.maxgigapop.net (22.214.171.124)  1.093 ms  1.169 ms  0.907 ms
 5  dcne-so3-1-0.maxgigapop.net (126.96.36.199)  1.434 ms  1.621 ms  1.410 ms
 6  abilene-wash-oc48.maxgigapop.net (188.8.131.52)  1.073 ms  1.439 ms  1.352 ms
 7  nycm-wash.abilene.ucaid.edu (184.108.40.206)  5.436 ms  5.570 ms  5.680 ms
 8  clev-nycm.abilene.ucaid.edu (220.127.116.11)  17.747 ms  17.954 ms  17.764 ms
 9  ipls-clev.abilene.ucaid.edu (18.104.22.168)  24.006 ms  24.380 ms  24.072 ms
10  kscy-ipls.abilene.ucaid.edu (22.214.171.124)  33.335 ms  33.263 ms  33.321 ms
11  dnvr-kscy.abilene.ucaid.edu (126.96.36.199)  43.781 ms  43.977 ms  43.756 ms
12  sttl-dnvr.abilene.ucaid.edu (188.8.131.52)  72.129 ms  72.286 ms  72.004 ms
13  TRANSPAC-PWAVE.pnw-gigapop.net (184.108.40.206)  72.204 ms  72.404 ms  72.220 ms
14  220.127.116.11 (18.104.22.168)  188.150 ms  188.216 ms  187.811 ms
15  perf.jp.apan.net (22.214.171.124)  187.786 ms  188.103 ms  188.040 ms
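
Most of the 188 ms comes from a single hop. A small sketch that finds it from the per-hop times above follows; the RTT samples are copied from the traceroute, hops that printed only an address are left unnamed, and taking the minimum of the three samples per hop is a choice made here, not something stated on the slide.

```python
# (hop number, host name or None, three RTT samples in ms) from the traceroute.
HOPS = [
    (1, "enpl-rtr1-ge", (0.427, 0.325, 0.396)),
    (2, None, (0.397, 0.375, 0.275)),
    (3, None, (0.740, 1.266, 1.225)),
    (4, "gsfc-wash.maxgigapop.net", (1.093, 1.169, 0.907)),
    (5, "dcne-so3-1-0.maxgigapop.net", (1.434, 1.621, 1.410)),
    (6, "abilene-wash-oc48.maxgigapop.net", (1.073, 1.439, 1.352)),
    (7, "nycm-wash.abilene.ucaid.edu", (5.436, 5.570, 5.680)),
    (8, "clev-nycm.abilene.ucaid.edu", (17.747, 17.954, 17.764)),
    (9, "ipls-clev.abilene.ucaid.edu", (24.006, 24.380, 24.072)),
    (10, "kscy-ipls.abilene.ucaid.edu", (33.335, 33.263, 33.321)),
    (11, "dnvr-kscy.abilene.ucaid.edu", (43.781, 43.977, 43.756)),
    (12, "sttl-dnvr.abilene.ucaid.edu", (72.129, 72.286, 72.004)),
    (13, "TRANSPAC-PWAVE.pnw-gigapop.net", (72.204, 72.404, 72.220)),
    (14, None, (188.150, 188.216, 187.811)),
    (15, "perf.jp.apan.net", (187.786, 188.103, 188.040)),
]


def largest_rtt_jump(hops):
    """Return (hop_number, delta_ms) for the biggest increase in minimum RTT."""
    best = None
    prev_rtt = None
    for hop, _name, samples in hops:
        rtt = min(samples)
        if prev_rtt is not None:
            delta = rtt - prev_rtt
            if best is None or delta > best[1]:
                best = (hop, delta)
        prev_rtt = rtt
    return best


hop, delta = largest_rtt_jump(HOPS)
print(f"largest jump arrives at hop {hop}: +{delta:.1f} ms")
```

The jump lands on the hop after TRANSPAC-PWAVE, which is presumably the trans-Pacific link.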