Performance Troubleshooting across Networks


Performance Troubleshooting across Networks
Joe Breen, University of Utah Center for High Performance Computing

What are User Expectations?
* ESnet Fasterdata, Network Requirements and Expectations: http://fasterdata.es.net/home/requirements-and-expectations/

What are the steps to attain the expectations?
1. Make sure the host specs are adequate. Are you aiming for 1G, 10G, 25G, 40G, or 100G?
2. Tune the host. Most modern operating systems auto-tune, but higher speeds still need attention.
3. Validate that the network is clean between the hosts.
4. Make sure the network stays clean.

Host specs: motherboard
* Higher CPU clock speed matters more than higher core count.
* PCI interrupts are tied to a CPU socket, so try to minimize traffic crossing the bus between CPU processors.
* Storage host bus adapters and Network Interface Cards require the correct generation of PCI Express and the correct number of lanes.
References:
* http://fasterdata.es.net/science-dmz/DTN/hardware-selection/motherboard-and-chassis/
* http://www.tested.com/tech/457440-theoretical-vs-actual-bandwidth-pci-express-and-thunderbolt/
* http://darkness.codefu.org/wordpress/2005/08/pci-vs-pci-x-vs-pci-express-2/
* https://en.wikipedia.org/wiki/PCI_Express#PCI_Express_3.0
* https://fasterdata.es.net/science-dmz/DTN/100g-dtn/
* https://fasterdata.es.net/science-dmz/DTN/reference-implementation/

Host specs: PCI bus
* What generation of PCI Express (PCIe) and how many lanes? 4, 8, and 16 lanes are common.
* The number of lanes supported depends on the motherboard slot and the Network Interface Card (NIC).
* The speed of each lane depends on the PCIe generation:
  * PCIe 2.0: 5 GT/s per lane raw, roughly 4 Gb/s usable after 8b/10b encoding overhead
  * PCIe 3.0: 8 GT/s per lane raw, roughly 7.9 Gb/s usable after 128b/130b encoding overhead
References: same PCI Express links as the previous slide.

Host specs: PCI implications
* For 10G, use PCIe 2.0 with 8 lanes or greater (PCIe 2.0 x8 provides roughly 32 Gb/s usable, comfortably above 10 Gb/s).
References: same PCI Express links as the previous slide.
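As a quick sanity check (a hedged sketch; the PCI address 81:00.0 is only an example), lspci on Linux can show what PCIe generation and lane width the NIC actually negotiated:

    # List Ethernet devices and note the NIC's PCI address (e.g. 81:00.0)
    lspci | grep -i ethernet
    # Show the slot capability (LnkCap) and the negotiated link (LnkSta);
    # e.g. "Speed 8GT/s, Width x8" means PCIe 3.0 x8
    sudo lspci -s 81:00.0 -vv | grep -E 'LnkCap:|LnkSta:'

If LnkSta shows fewer lanes or a lower speed than LnkCap, the card is sitting in the wrong slot or sharing lanes.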

Host specs: storage subsystem factors
Local disk:
* RAID 6, RAID 5, or RAID 1+0
* SATA or SAS
* Spinning disk vs SSD
Network disk:
* High-speed parallel file system vs NFS or SMB mounts
References:
Local storage:
* http://fasterdata.es.net/science-dmz/DTN/hardware-selection/storage/
* https://en.wikipedia.org/wiki/Serial_ATA
* https://en.wikipedia.org/wiki/Serial_Attached_SCSI
* https://en.wikipedia.org/wiki/Solid-state_drive
Network storage, NFS:
* v3: https://tools.ietf.org/html/rfc1813
* v4.1: https://tools.ietf.org/html/rfc5661
CIFS/SMB:
* https://en.wikipedia.org/wiki/Server_Message_Block
* https://msdn.microsoft.com/en-us/library/windows/desktop/aa365233(v=vs.85).aspx
* https://technet.microsoft.com/en-us/library/cc939973.aspx
Storage performance:
* http://www.citi.umich.edu/projects/nfs-perf/results/cel/write-throughput.html
Storage benchmark tools:
* http://beyondtheblocks.reduxio.com/8-incredibly-useful-tools-to-run-storage-benchmarks
* http://www.clustermonkey.net/FileSystems/benchmarking-parallel-file-systems.html
* http://wiki.opensfs.org/Benchmarking_Basics
* https://www.spec.org/sfs2008/press/release.html
Parallel file systems:
* Parallel NFS: http://www.pnfs.com/
* Lustre: http://lustre.org/
* GPFS: https://en.wikipedia.org/wiki/IBM_General_Parallel_File_System and https://www.ibm.com/support/knowledgecenter/en/SSFKCN/gpfs_welcome.html
Multi-tenancy:
* http://whatis.techtarget.com/definition/multi-tenancy
* https://en.wikipedia.org/wiki/Multitenancy
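Before blaming the network, it helps to confirm the storage can sustain the target rate. A minimal sketch using dd on Linux (the path and sizes are examples; purpose-built benchmarks such as those linked above give more realistic numbers):

    # Rough sequential write test: ~10 GB with direct I/O to bypass the page cache
    dd if=/dev/zero of=/data/testfile bs=1M count=10000 oflag=direct
    # Rough sequential read test of the same file
    dd if=/data/testfile of=/dev/null bs=1M iflag=direct
    rm /data/testfile

dd prints the achieved throughput at the end of each run; if it is below your target transfer rate, no amount of network tuning will help.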

Host specs and other factors
* Memory: 32 GB or greater
* Other factors such as multi-tenancy: how busy is your system?
References: same storage and multi-tenancy links as the previous slide.

Host tuning
* The TCP buffer size sets the maximum achievable data rate; a buffer that is too small means TCP cannot fill the pipe.
* Buffer size = Bandwidth * Round Trip Time. Use ping to measure the RTT.
* Most recent operating systems auto-tune TCP buffers, which helps.
* For high-bandwidth NICs (40 Gb/s and above), the admin should double-check the maximum TCP buffer settings (OS dependent).
References:
* http://fasterdata.es.net/host-tuning/background/
* Matt Mathis paper (full link on the BDP slide below)
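To see what the kernel currently allows, you can read the relevant sysctls on Linux (a quick check; the parameter names are standard, the values vary by distribution):

    # Maximum socket buffer sizes the kernel will grant (bytes)
    sysctl net.core.rmem_max net.core.wmem_max
    # TCP autotuning minimum / default / maximum buffer sizes (bytes)
    sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem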

Host tuning needs information about the network
* Determine the Bandwidth-Delay Product (BDP): BDP = Bandwidth * Round Trip Time
* Example: 10 Gb/s * 70 ms = 700,000,000 bits = 87,500,000 bytes
* The BDP determines the proper TCP receive window.
* RFC 1323 defines the TCP extensions (window scaling) needed to use windows that large.
* Long Fat Network (LFN): a network with a large bandwidth-delay product.
References:
* Matt Mathis original paper: http://ccr.sigcomm.org/archive/1997/jul97/ccr-9707-mathis.pdf
* TCP Performance and the Mathis Equation: http://www.netcraftsmen.com/tcp-performance-and-the-mathis-equation/
* Enabling High Performance Data Transfers: https://www.psc.edu/services/networking/68-research/networking/641-tcp-tune
* TCP Large Window extensions (window scale and Long Fat Networks), RFC 1323: https://www.ietf.org/rfc/rfc1323.txt
* A User's Guide to TCP Windows (Von Welch): http://www.vonwelch.com/report/tcp_windows
* Sizing Router Buffers (Appenzeller, Keslassy, McKeown): http://yuba.stanford.edu/techreports/TR04-HPNG-060800.pdf
* Internet Protocol Journal, TCP Performance (Geoff Huston): http://www.cisco.com/c/en/us/about/press/internet-protocol-journal/back-issues/table-contents-5/ipj-archive/article09186a00800c8417.html
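A small worked sketch of the BDP arithmetic for the 10 Gb/s, 70 ms example above (substitute your own bandwidth and measured RTT):

    BW_BITS_PER_SEC=10000000000   # 10 Gb/s
    RTT_SEC=0.070                 # 70 ms round trip time
    # BDP in bits and bytes; this is the TCP window needed to fill the pipe
    awk -v bw="$BW_BITS_PER_SEC" -v rtt="$RTT_SEC" 'BEGIN {
        bdp_bits = bw * rtt
        printf "BDP = %.0f bits = %.0f bytes\n", bdp_bits, bdp_bits / 8
    }'
    # Prints: BDP = 700000000 bits = 87500000 bytes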

Host tuning: Linux and Apple Mac
See the Notes section for links with details and descriptions.

Linux: modify /etc/sysctl.conf with the recommended parameters, for example:

    # allow testing with buffers up to 128MB
    net.core.rmem_max = 134217728
    net.core.wmem_max = 134217728
    # increase Linux autotuning TCP buffer limit to 64MB
    net.ipv4.tcp_rmem = 4096 87380 67108864
    net.ipv4.tcp_wmem = 4096 65536 67108864
    # recommended default congestion control is htcp
    net.ipv4.tcp_congestion_control=htcp
    # recommended for hosts with jumbo frames enabled
    net.ipv4.tcp_mtu_probing=1
    # recommended for CentOS7/Debian8 hosts
    net.core.default_qdisc = fq

Apple Mac: adjust the following sysctl values:

    # OSX default of 3 is not big enough
    net.inet.tcp.win_scale_factor=8
    # increase OSX TCP autotuning maximums
    net.inet.tcp.autorcvbufmax=33554432
    net.inet.tcp.autosndbufmax=33554432

References:
Linux:
* http://fasterdata.es.net/host-tuning/linux/
Apple Mac:
* http://fasterdata.es.net/host-tuning/osx/
* https://rolande.wordpress.com/2010/12/30/performance-tuning-the-network-stack-on-mac-osx-10-6/
MS Windows:
* http://www.thewindowsclub.com/window-auto-tuning-in-windows-10
* https://www.speedguide.net/articles/windows-8-10-2012-server-tcpip-tweaks-5077
* MS Win 10 and Server 2016 PowerShell network cmdlets:
  * https://technet.microsoft.com/en-us/itpro/powershell/windows/netadapter/netadapter
  * https://technet.microsoft.com/itpro/powershell/windows/nettcpip/set-nettcpsetting
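One way to apply the Linux settings without a reboot, assuming they were added to /etc/sysctl.conf as above:

    # Reload /etc/sysctl.conf and print the values that were applied
    sudo sysctl -p
    # Or set a single parameter on the fly while testing
    sudo sysctl -w net.ipv4.tcp_congestion_control=htcp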

Host tuning: MS Windows
See the Notes section for links with details and descriptions.
* Show the autotuning status: "netsh interface tcp show global"
* Use the PowerShell network cmdlets to change parameters in Windows 10 and Windows Server 2016, e.g.:

    Set-NetTCPSetting -SettingName "Custom" -CongestionProvider CTCP -InitialCongestionWindowMss 6

References: same Windows links as the previous slide.

What does the network look like?
* What bandwidth do you expect?
* How far away is the destination? What round trip time does ping report?
* Are you able to support jumbo frames? Send test packets with the "don't fragment" bit set. Note that the IP and ICMP headers add 28 bytes, so to validate a 9000-byte MTU send a payload of 8972 bytes:
  * Linux: ping -s 8972 -M do <destination>
  * Mac: ping -D -s 8972 <destination>
  * Windows: ping -l 8972 -f <destination>
References:
* Matt Mathis original paper (see the BDP slide)
* PSC tuning pages: https://www.psc.edu/services/networking/68-research/networking/641-tcp-tune
* http://www.cisco.com/c/en/us/about/press/internet-protocol-journal/back-issues/table-contents-5/ipj-archive/article09186a00800c8417.html
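If the don't-fragment pings fail, tracepath on Linux can report where along the path the MTU drops (a hedged example; the hostname is a placeholder):

    # Reports the path MTU (pmtu) and the hop at which it shrinks
    tracepath -n dtn.remote-site.example.org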

What does the network look like?
* Do you have asymmetric routing?
* A traceroute from your local machine only shows one direction.
* Are you able to run a traceroute from the remote site? Are the two paths mirrors of each other?
References: same as the previous slide.
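A sketch of checking for asymmetry (the hostnames are placeholders; the reverse run has to be launched from the far end, or from a perfSONAR or looking-glass host near it):

    # Forward path, from the local host toward the remote data transfer node
    traceroute -n dtn.remote-site.example.org
    # Reverse path: run from the remote end back toward the local host,
    # then compare the two hop lists for differences
    traceroute -n dtn.local-site.example.org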

What does the network look like?
* Determine the Bandwidth-Delay Product (BDP): BDP = Bandwidth * Round Trip Time, e.g. 10 Gb/s * 70 ms = 700,000,000 bits = 87,500,000 bytes.
* The BDP determines the proper TCP receive window; RFC 1323 window scaling is required on Long Fat Networks (LFNs).
References: same as the "Host tuning needs information about the network" slide.

How clean does the network really have to be?
* Even a tiny amount of packet loss drastically limits TCP throughput on high-latency paths (the Mathis equation), as the sketch below illustrates.
* ESnet TCP tuning, packet loss explained: http://fasterdata.es.net/network-tuning/tcp-issues-explained/packet-loss/
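A rough, hedged illustration of the Mathis throughput bound, throughput <= (MSS / RTT) * (1 / sqrt(loss)); the inputs below (1460-byte MSS, 70 ms RTT, 0.01% loss) are example values, not measurements:

    awk 'BEGIN {
      mss_bits = 1460 * 8        # maximum segment size in bits
      rtt      = 0.070           # round trip time in seconds
      loss     = 0.0001          # packet loss probability (1 in 10,000)
      bps = (mss_bits / rtt) * (1 / sqrt(loss))
      printf "Mathis bound: %.1f Mb/s\n", bps / 1e6
    }'
    # Prints roughly 16.7 Mb/s -- far below 10 Gb/s, even at 0.01% loss

This is why a long-haul path has to be essentially loss-free before host tuning can deliver the expected rates.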

How do I validate the network? Measurement!
Active measurement:
* perfSONAR: http://www.perfsonar.net
* iperf3: https://github.com/esnet/iperf
* nuttcp: https://www.nuttcp.net/Welcome%20Page.html
Passive measurement:
* Nagios, SolarWinds, Zabbix, Zenoss, Cacti, PRTG, RRDtool
* Trend the drops/discards
An example active test with iperf3 is sketched below.
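A minimal active test with iperf3 (hostnames are placeholders; -P adds parallel TCP streams, -t sets the test length in seconds, -i the reporting interval):

    # On the remote host: run an iperf3 server
    iperf3 -s
    # On the local host: a 30-second test with 4 parallel streams
    iperf3 -c dtn.remote-site.example.org -P 4 -t 30 -i 5
    # Add -R to reverse direction and test the path the other way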

How do I make sure the network stays clean on a continual basis?
* Design the network security zone without performance inhibitors.
* Set up security that works at full bandwidth: Access Control Lists and Remotely Triggered Black Hole (RTBH) routing.
* Set up ongoing monitoring with tools such as perfSONAR.
* Create a MaDDash dashboard.
A quick command-line check of interface drop counters is sketched below.
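Before (or alongside) the dashboards, a spot check of drops and discards on a Linux host is straightforward (the interface name eth0 is an example; ethtool counter names vary by driver):

    # Per-interface RX/TX packet, error, and drop counters
    ip -s link show dev eth0
    # NIC-level counters, useful for spotting ring-buffer overruns
    ethtool -S eth0 | grep -iE 'drop|discard|err'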

Set up a performance/security zone
* The Science DMZ architecture is a dedicated performance/security zone on a campus.
* Science DMZ motivation: http://fasterdata.es.net/science-dmz/motivation/
* The Science DMZ: A Network Design Pattern for Data-Intensive Science: https://www.es.net/assets/pubs_presos/sc13sciDMZ-final.pdf

Use the right tool
* Rclone: https://rclone.org/
* Globus: https://www.globus.org/
* FDT: http://monalisa.cern.ch/FDT/
* bbcp: http://www.slac.stanford.edu/~abh/bbcp/
* UDT: http://udt.sourceforge.net/
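As one example of these tools, a hedged bbcp invocation using multiple TCP streams and an enlarged window (the paths and hostname are placeholders; check the bbcp man page for your version's defaults before relying on these flags):

    # 8 parallel streams, 8 MB TCP window, progress report every 5 seconds
    bbcp -P 5 -s 8 -w 8m /data/bigfile user@dtn.remote-site.example.org:/data/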

Techniques such as packet pacing
[Figure: 100G host, parallel streams, no pacing vs 20G pacing]
* Optimizing Data Transfer Nodes using Packet Pacing: https://www.es.net/assets/pubs_presos/packet-pacing.pdf
* Credit: Brian Tierney, Nathan Hanford, Dipak Ghosal

Techniques such as packet pacing
* ESnet Fasterdata, packet pacing: https://fasterdata.es.net/host-tuning/packet-pacing/
* Credit: Brian Tierney, Nathan Hanford, Dipak Ghosal (https://www.es.net/assets/pubs_presos/packet-pacing.pdf)
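A hedged sketch of host-based pacing with the Linux fq qdisc (eth0 and the 20 Gb/s rate are example values; maxrate caps the pacing rate of each flow, not the aggregate):

    # Install fq as the root qdisc on the interface and pace flows to 20 Gb/s
    sudo tc qdisc replace dev eth0 root fq maxrate 20gbit
    # Verify the qdisc and pacing rate now in place
    tc qdisc show dev eth0

Pacing smooths bursts so a fast sender does not overrun shallow switch buffers along the path, which is why the paced runs in the figure above sustain higher goodput.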

Not just about research
* Troubleshooting transfers to the cloud is similar: high latency, with big pipes.
* Latency is not just to the provider's front door; there is also latency internal to the cloud providers.
* Example: backups to the cloud look a lot like big science flows.

Live example: troubleshooting using bwctl on perfSONAR boxes
* bwctl -s <sender_host> -c <receiver_host>  (-s names the sending side, -c the receiving "catch" side)
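A slightly fuller hedged example of a bwctl throughput test between two perfSONAR hosts (the hostnames are placeholders; -T selects the underlying tool, -t the test length in seconds, -i the reporting interval):

    bwctl -T iperf3 -t 30 -i 5 -s ps1.site-a.example.org -c ps2.site-b.example.org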

References: see the References lists under each slide above, taken from the Notes pages of the original slides.