The Performance Bottleneck Application, Computer, or Network Richard Carlson eVLBI Workshop – Performance Tuning Tutorial September 17, 2006 Richard Carlson.


1 The Performance Bottleneck Application, Computer, or Network Richard Carlson eVLBI Workshop – Performance Tuning Tutorial September 17, 2006

2 Outline Why there is a problem What can be done to find/fix problems Tools you can use

3 Basic Premise An application’s performance should meet your expectations! If it doesn’t, you should complain! But you have to complain effectively.

4 Questions How many times have you said: What’s wrong with the network? Why is the network so slow? Do you have any way to find out? Tools to check local host Tools to check local network Tools to check end-to-end path

5 Unfortunate Reality Every problem, regardless of cause, exhibits the same symptom: the application performance doesn’t meet the user’s expectations!

6 Possible Bottlenecks Network infrastructure Host computer/appliance Application design

7 Simple Network Picture Bob’s Host Network Infrastructure Carol’s Host

8 Network Infrastructure [Diagram showing Switch 1 through Switch 4 and routers R1 through R9]

9 Network Infrastructure Bottlenecks Links too small Using FastEthernet instead of Gigabit Ethernet Links congested Too many hosts crossing this link Scenic routing End-to-end path is longer than it needs to be Broken equipment Bad NIC, broken wire/cable, cross-talk Administrative restrictions Firewalls, Filters, shapers, restrictors

10 Host Computer Bottlenecks CPU utilization What else is the processor doing? Memory limitations Main memory and network buffers I/O bus speed Getting data into and out of the NIC Disk access speed

11 Application Behavior Bottlenecks Chatty protocol Lots of short messages between peers High reliability protocol Send packet and wait for reply before continuing No run-time tuning options Use only default settings Blaster protocol Ignore congestion control feedback

12 Problems, Problems, Problems Problems can exist at multiple levels Network infrastructure Host computer Application design Multiple problems can exist at the same time All problems must be found and fixed before things get better

13 Transport Protocols 101 Transmission Control Protocol (TCP) Provides applications with a reliable in-order delivery service The most widely used Internet transport protocol Web, File transfers, email, P2P, Remote login User Datagram Protocol (UDP) Provides applications with an unreliable delivery service RTP, DVTS, DNS

14 Outline Why there is a problem What can be done to find/fix problems Tools you can use

15 Remote Image Processing Carol is analyzing astronomical images. Bob needs to send a data file containing digital images (50 MB per file) to Carol every ½ hour. Bob and Carol are 2,000 miles apart. How long should each transfer take? 5 minutes? 1 minute? 5 seconds?

16 What should we expect? Assumptions: 100 Mbps Fast Ethernet is the slowest link 50 msec round trip time Bob & Carol calculate: 50 MB * 8 = 400 Mbits 400 Mb / 100 Mb/sec = 4 seconds
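Bob and Carol's estimate can be reproduced in a few lines. This is a minimal sketch of the best-case arithmetic on the slide; the function name is made up for illustration, and the calculation ignores protocol overhead and TCP ramp-up, so it is a lower bound rather than a prediction.

```python
def ideal_transfer_seconds(file_megabytes: float, link_mbps: float) -> float:
    """Best-case transfer time: file size in bits divided by link rate."""
    file_megabits = file_megabytes * 8  # 8 bits per byte
    return file_megabits / link_mbps


# 50 MB file over a 100 Mbps Fast Ethernet bottleneck link
print(ideal_transfer_seconds(50, 100))  # 4.0
```

The round-trip time does not appear in this estimate; it only starts to matter once TCP window limits are considered.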

17 Initial Test Results

18 18 Minutes!!! This is unacceptable! First look for network infrastructure problem Use NDT tester to examine both hosts

19 Initial NDT testing shows Duplex Mismatch at one end

20 NDT Found Duplex Mismatch Investigation finds that the switch port is configured for 100 Mbps Full-Duplex operation. The network administrator corrects the configuration and asks for a re-test

21 Duplex Mismatch Corrected

22 SCP results after Duplex Mismatch Corrected

23 Intermediate Results Time dropped from 18 minutes to 40 seconds. Is this acceptable??? Remember your calculations said it should take 4 seconds. 400 Mb / 40 sec = 10 Mbps Why are we limited to 10 Mbps? Are you satisfied with 1/10th of the possible performance?
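Turning each measured transfer time into an achieved rate makes the gap to the 100 Mbps link easier to see. The sketch below uses only the times quoted on this slide (18 minutes, 40 seconds, and the 4 second target) for the 50 MB file; the helper name is an assumption for illustration.

```python
def achieved_mbps(file_megabytes: float, seconds: float) -> float:
    """Average throughput in Mbps for a transfer of the given size and duration."""
    return file_megabytes * 8 / seconds


stages = [
    ("initial test (18 minutes)", 18 * 60),
    ("after duplex fix (40 seconds)", 40),
    ("target (4 seconds)", 4),
]
for label, seconds in stages:
    print(f"{label}: {achieved_mbps(50, seconds):.2f} Mbps")
```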

24 Default TCP window size

25 Calculating the Window Size Remember Bob found the round-trip time was 50 msec. Calculate window size limit: 85.3 KB * 1024 B/KB * 8 b/B = 698777 b; 698777 b / 0.050 s = 13.98 Mbps. Stated another way: 698777 b / 100 Mb/s = 6.99 msec to send one window, leaving 43 msec of idle time every RTT
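The window-limit arithmetic above can be checked with a short sketch. It assumes, as the slide's numbers imply, that 1 KB is 1024 bytes and that the sender can have at most one full window in flight per round trip; the function names are made up for illustration.

```python
def window_limited_mbps(window_kb: float, rtt_seconds: float) -> float:
    """Upper bound on TCP throughput when one window is sent per round trip."""
    window_bits = window_kb * 1024 * 8
    return window_bits / rtt_seconds / 1e6


def idle_ms_per_rtt(window_kb: float, rtt_seconds: float, link_mbps: float) -> float:
    """Time per round trip the link sits idle after the window has been sent."""
    window_bits = window_kb * 1024 * 8
    send_ms = window_bits / (link_mbps * 1e6) * 1000
    return rtt_seconds * 1000 - send_ms


print(f"{window_limited_mbps(85.3, 0.050):.2f} Mbps")    # 13.98 Mbps
print(f"{idle_ms_per_rtt(85.3, 0.050, 100):.2f} ms idle")  # 43.01 ms idle
```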

26 Calculating the Window Size Calculate new window size: (100 Mb/s * 0.050 s) / 8 b/B = 625000 B = 610.3 KB. Use 8 MB for testing purposes
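The window needed to keep the path full is the bandwidth-delay product. A minimal sketch of that calculation, again assuming 1 KB is 1024 bytes to match the slide (the function name is illustrative):

```python
def bandwidth_delay_product_kb(link_mbps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to fill the path, expressed in KB (1024 B)."""
    bits_in_flight = link_mbps * 1e6 * rtt_seconds
    return bits_in_flight / 8 / 1024


# 100 Mbps bottleneck, 50 msec round-trip time
print(f"{bandwidth_delay_product_kb(100, 0.050):.1f} KB")  # 610.4 KB
```

The slide truncates the same value (610.35 KB) to 610.3 KB. A window smaller than this leaves the link idle for part of every round trip; a larger one, such as the 8 MB test value, simply removes the window as the limiting factor.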

27 Resetting Window Buffer

28 Intermediate Results Use application specific options to manually reset buffer size Fixes problem for this application Doesn’t fix problem for other applications Need better ‘default behavior’ for all applications
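As one illustration of an application-specific buffer setting, a program that owns its sockets can request larger buffers itself. This is a sketch using the standard `SO_SNDBUF` and `SO_RCVBUF` socket options, not the specific option used in the talk; the helper name is made up, and the 8 MB value is the test size mentioned earlier. The operating system may clamp or adjust the request (Linux, for example, caps it at a system-wide maximum and reports double the granted value).

```python
import socket


def make_tuned_socket(buffer_bytes: int) -> socket.socket:
    """Create a TCP socket and request larger send/receive buffers.

    The options are set before connect() so the larger window can be
    negotiated when the connection is established.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_bytes)
    return sock


sock = make_tuned_socket(8 * 1024 * 1024)
# Report what the OS actually granted, which may differ from the request.
print("send buffer:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
print("recv buffer:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
sock.close()
```

This only helps the one application that makes the call, which is why better system-wide defaults are still needed.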

29 With TCP window size tuned

30 Steps so far Found and fixed Duplex Mismatch Network Infrastructure problem Found and fixed TCP window size values Host configuration problem Are we done yet?

31 SCP results with auto-tuning enabled

32 Intermediate Results SCP still runs slower than expected Hint: SSH uses internal buffers Design choices by application developers limit performance Patch available from PSC

33 SCP Results with tuned SCP

34 Final Results Fixed infrastructure problem Fixed host configuration problem Fixed Application configuration problem Achieved target time of 4 seconds to transfer 50 MB file over 2000 miles

35 Follow-up questions What would have happened if I tried the patched SCP version before fixing the TCP buffer problem? Would not have been able to see improvement. Discard patch because “it didn’t work”?

36 Why is it hard to Find/Fix Problems? Network infrastructure is complex Network infrastructure is shared Network infrastructure consists of multiple components

37 Shared Infrastructure Other applications accessing the network Remote disk access Automatic email checking Heartbeat facilities Other computers are attached to the closet switch Uplink to facility infrastructure Other users on and off site Uplink from facility to gigapop/backbone

38 Other Network Components DHCP (Dynamic Host Configuration Protocol) At least 2 packets exchanged to configure your host DNS (Domain Name System) At least 2 packets exchanged to translate FQDN into IP address Multiple addresses require a sequential search Network Security Devices Intrusion Detection, VPN, Firewall

39 Why is it hard to Find/Fix Problems? Computers have multiple components Each Operating System (OS) has a unique set of tools to tune the network stack Network Interface Cards also have tuning options Application Appliances come with few knobs and limited options

40 Computer Components Main CPU (clock speed) Front & Back side bus Main Memory I/O Bus (ATA, SCSI, SATA) Disk (access speed and size)

41 Computer Issues Lots of internal components with multitasking OS Lots of tunable TCP/IP parameters that need to be ‘right’ for each possible connection

42 Why is it hard to Find/Fix Problems? Applications depend on default system settings Problems scale with distance More access to remote resources 80/20 rule: since the early 1990s, 80% of your traffic leaves your local network

43 Default System Settings For Linux 2.6.13 there are: 11 tunable IP parameters 45 tunable TCP parameters 148 Web100 variables (TCP MIB) Currently no OS ships with default settings that work well over trans-continental distances Some applications allow run-time setting of some options 30 settable/viewable IP parameters 24 settable/viewable TCP parameters There are no standard ways to set run-time option ‘flags’
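On a Linux host the system-wide TCP tunables are exposed as files under `/proc/sys/net/ipv4`. The sketch below lists them with their current values; the helper name is made up for illustration, the count will differ from the Linux 2.6.13 figures on the slide, and on a non-Linux system the directory will not exist and the result is simply empty.

```python
from pathlib import Path


def read_tunables(directory: str, prefix: str) -> dict:
    """Return {name: value} for readable files in `directory` starting with `prefix`."""
    result = {}
    base = Path(directory)
    if not base.is_dir():
        return result
    for entry in sorted(base.iterdir()):
        if entry.is_file() and entry.name.startswith(prefix):
            try:
                result[entry.name] = entry.read_text().strip()
            except OSError:
                continue  # some entries are write-only or need privileges
    return result


tcp = read_tunables("/proc/sys/net/ipv4", "tcp_")
print(len(tcp), "TCP tunables visible")
# tcp_rmem / tcp_wmem hold the min, default, and max buffer sizes in bytes.
for name in ("tcp_rmem", "tcp_wmem"):
    if name in tcp:
        print(name, "=", tcp[name])
```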

44 Application Issues Setting tunable parameters to the ‘right’ value Getting the protocol ‘right’

45 Outline Why there is a problem What can be done to find/fix problems Tools you can use

46 Tools, Tools, Tools Ping Traceroute Iperf Tcpdump Tcptrace BWCTL NDT OWAMP AMP Advisor Thrulay Web100 MonaLisa pathchar NPAD Pathdiag Surveyor Ethereal CoralReef MRTG Skitter Cflowd Cricket Net100

47 Active Measurement Tools Tools that inject packets into the network to measure some value Available Bandwidth Delay/Jitter Loss May require bi-directional traffic or synchronized hosts May require running test program on both hosts
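To make the measured quantities concrete, here is a sketch of how an active tool might summarize one run of probes into loss, delay, and jitter. The probe values are invented sample data, the function name is hypothetical, and jitter is taken here as the mean absolute difference between consecutive round-trip times, which is a simplification of what real tools report.

```python
from statistics import mean


def summarize_probes(rtts_ms: list) -> dict:
    """Summarize probe results; each entry is an RTT in ms, or None if the probe was lost."""
    if not rtts_ms:
        raise ValueError("at least one probe is required")
    received = [r for r in rtts_ms if r is not None]
    lost = len(rtts_ms) - len(received)
    # Differences between consecutive received probes approximate jitter.
    diffs = [abs(b - a) for a, b in zip(received, received[1:])]
    return {
        "loss_fraction": lost / len(rtts_ms),
        "mean_rtt_ms": mean(received) if received else None,
        "jitter_ms": mean(diffs) if diffs else 0.0,
    }


# Invented sample: five probes sent, the third one lost.
print(summarize_probes([50.0, 52.0, None, 51.0, 55.0]))
```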

48 Passive Measurement Tools Tools that monitor existing traffic on the network and extract some information Bandwidth used Jitter Loss rate May generate some privacy and/or security concerns

49 How do you set realistic Expectations? Assume network bandwidth exists or find out what the limits are Local LAN connection Site Access link Monitor the link utilization occasionally Weathermap MRTG graphs Look at your host config/utilization What is the CPU utilization

50 Distance Matters It’s harder to go fast over a long distance TCP congestion control requires numerous round trips to prevent flooding network TCP buffer limits can stop sender from injecting new data into the network Application can exhibit poor behavior when used over long distances
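A rough sketch of why round trips add up: during TCP slow start the congestion window roughly doubles each round trip, so reaching a large window takes several RTTs before the path is full. The model below is idealized (no loss, no delayed ACK effects), and the 1460-byte segment size and 2-segment initial window are assumptions chosen for illustration, not values from the talk.

```python
def slow_start_round_trips(target_window_bytes: int,
                           mss_bytes: int = 1460,
                           initial_segments: int = 2) -> int:
    """Round trips for an idealized slow start to grow the window to the target."""
    target_segments = -(-target_window_bytes // mss_bytes)  # ceiling division
    window = initial_segments
    rounds = 0
    while window < target_segments:
        window *= 2  # window doubles once per round trip
        rounds += 1
    return rounds


# 625,000 bytes is the window needed for 100 Mbps at a 50 msec RTT.
rounds = slow_start_round_trips(625_000)
print(rounds, "round trips,", f"{rounds * 0.050:.1f}", "seconds at 50 msec RTT")
```

On a LAN with a sub-millisecond RTT the same number of round trips is negligible, which is one reason problems only show up over distance.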

51 Ethernet, FastEthernet, Gigabit Ethernet, 10 GE 10/100/1000 auto-sensing NICs are common today Most facilities have installed 10/100 switched infrastructure Access network links are currently the limiting factor in most networks Backbone networks are 10 Gigabit/sec

52 Wireless LANs 802.11b – 11 Mbps (expect 5) 802.11a – 54 Mbps (expect 15) 802.11g – 54 Mbps (expect 25) Expect large variations in speed due to radio signal propagation

53 Focus on 2 tools Existing NDT tool Allows users to test a network path for a limited number of common problems Emerging PerfSonar tool Allows users to retrieve network path data from major national and international REN networks

54 Network Diagnostic Tool (NDT) Measure performance to the user’s desktop Identify real problems for real users Network infrastructure is the problem Host tuning issues are the problem Make tool simple to use and understand Make tool useful for users and network administrators Web-based Java applet allows testing from any browser

55 Installing your own server All Internet2 tools are FREE Visit http://e2epi.internet2.edu/ for details Workshops are available to help your administrator get them up and running ( http://e2epi.internet2.edu/net-perf-wkshp/ ) Encourage your peers to start testing Encourage your vendors to include the client programs

56 NPToolkit Bootable CD Knoppix based Live-CD Contains listed tools Download from Internet2 Ask for a pre-built CD-ROM http://e2epi.internet2.edu/network-performance-toolkit/network-performance-toolkit.iso

57 PerfSonar – Next Steps in Performance Monitoring New Initiative involving multiple partners ESnet (DOE labs) GEANT (European Research and Education network) Internet2 (Abilene and connectors) Sample tool (Joe Metzger ESnet) https://performance.es.net/cgi-bin/perfsonar-trace.cgi

58 Traceroute Visualizer

59 Abilene Weather Map http://loadrunner.uits.iu.edu/weathermaps/abilene/

60 Windows XP Performance

61 Google it! Enter “tuning tcp” into the Google search engine. Top 2 hits are: http://www.psc.edu/networking/perf_tune.html http://www-didc.lbl.gov/TCP-tuning/TCP-tuning.html

62 PSC Tuning Page

63 LBNL Tuning Page

64 Conclusions Applications can fully utilize the network All problems have a single symptom All problems must be found and fixed before things get better Some people stop investigating before finding all problems Tools exist, and more are being developed, to make it easier to find problems

