SHARKFEST '09 | Stanford University | June 15–18, 2009
Protocol Analysis in a Complex Enterprise: The Importance of “The Art of Recognition”

Presentation transcript:

SHARKFEST '09 | Stanford University | June 15–18, 2009
Protocol Analysis in a Complex Enterprise: The Importance of “The Art of Recognition”
June 16, 2009
Hansang Bae, Senior VP | Citi (f.k.a. Citigroup)

Challenges:
- As it turns out, size does matter!
- Citi’s branch network spans 5,000+ locations in the US.
- Citi’s network infrastructure includes 30,000+ devices.
- 300,000 users are located in over 100 countries.
- The number of servers in use is mind-numbingly large!
- Compliance/Security Quagmire
- Doing a full packet capture is difficult.
- Tools in use include NetVCR and Opnet’s ACE.
- Wireshark is the only approved protocol analyzer at Citi. It dislodged past market leaders.

Act I: Much Ado about Nothing!
- Old medical school saying: When you hear hooves beating, think horses and not zebras!
- Server SA reports extreme slowness during file transfers.
- What are the top issues that come to mind?
- Server SA started a ping script and it showed…
- Lessons Learned:
  - Learn to recognize what should and should not change as you go through the trace files.
  - RFC1323 was not in play because they are on the same switch!
  - Take a few minutes to scan the trace files. Learn to trust your brain’s ability to spot differences.
  - Know how protocols work so you can rule out red herrings. This is what separates “techs” from “engineers”.
  - Try not to filter. You might have missed the “arp” frames in this trace. This is different from capturing in “promiscuous” mode.

Act II: Taming the SSH
- Logging into a server via ssh takes over two minutes.
- What are the top issues that come to mind for slow telnet/ssh login?
- Let’s capture and find out. Packet captures are like Shakira’s hips. They don’t lie!
- Lessons Learned:
  - Scroll through the trace to look for patterns. Again, trust your brain.
  - Develop a technique: a list of common filters to run through when troubleshooting, e.g. tcp.flags==02, tcp.analysis.flags
  - Don’t forget UDP. What important function runs on UDP?
  - Do not blindly trust the TCP analysis. Wireshark can only know what you feed it. It too suffers from GIGO (Garbage In, Garbage Out).
  - Use the graphical tools available in Wireshark. A picture *IS* worth a thousand words!
  - Capture placement is important. If I had captured at the client, I would still be wondering why there is a delay!
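To make the tcp.flags==02 filter from the list above concrete, here is a minimal Python sketch of approximately what such a filter selects: segments whose flags byte has only SYN set, i.e. the first packet of a connection attempt. The helper names (`tcp_flags`, `is_initial_syn`, `make_header`) are illustrative assumptions, not part of Wireshark, and the sketch only looks at the single flags byte of a raw TCP header.

```python
import struct

# TCP flag bits as they appear in byte 13 of the TCP header.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def tcp_flags(tcp_header: bytes) -> int:
    """Return the flags byte (offset 13) of a raw TCP header."""
    if len(tcp_header) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    return tcp_header[13]


def is_initial_syn(tcp_header: bytes) -> bool:
    """True when SYN is the only flag set, i.e. a new connection attempt."""
    return tcp_flags(tcp_header) == SYN


def make_header(flags: int, sport: int = 40000, dport: int = 22) -> bytes:
    """Build a minimal 20-byte TCP header for demonstration (hypothetical ports)."""
    # Fields: src port, dst port, seq, ack, data offset, flags,
    # window, checksum, urgent pointer.
    return struct.pack("!HHIIBBHHH", sport, dport, 0, 0, 5 << 4, flags, 65535, 0, 0)


if __name__ == "__main__":
    for name, flags in [("SYN", SYN), ("SYN+ACK", SYN | ACK), ("RST", RST)]:
        print(name, is_initial_syn(make_header(flags)))
```

Counting such packets per client is one quick way to spot repeated connection attempts while scrolling through a trace.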

Act III: To Stream or Not to Stream?
- Application developers report extreme slowness when ftp’ing a file.
- What are the top issues that come to mind for slow ftp sessions?
- Lessons Learned:
  - Scroll through the trace to look for patterns. Again, trust your brain.
  - Develop a technique: a list of common filters to run through when troubleshooting, e.g. tcp.flags==02, tcp.analysis.flags
  - Buffer tearing is pretty common. Applications are constantly trying to do TCP’s job. App bytes can help you identify it. Learn to recognize it! (Oracle, MS SQL, Sybase, they all do it.)
  - Understand what “streaming” really means. TCP *HAS NO* byte boundaries.
  - Use the graphical tools available in Wireshark. A picture *IS* worth a thousand words!
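The point that TCP has no byte boundaries can be illustrated with a small Python sketch. It assumes a hypothetical application protocol that prefixes each message with a 4-byte length; the class name `MessageFramer` and the helper `frame` are made up for this example. The receiver gets the byte stream in arbitrary chunks and has to rebuild the message boundaries itself, because TCP does not preserve them.

```python
import struct


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(payload)) + payload


class MessageFramer:
    """Reassemble length-prefixed messages from arbitrarily split chunks."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Add received bytes and return every message that is now complete."""
        self._buf.extend(chunk)
        messages = []
        while len(self._buf) >= 4:
            (length,) = struct.unpack("!I", self._buf[:4])
            if len(self._buf) < 4 + length:
                break  # the rest of this message has not arrived yet
            messages.append(bytes(self._buf[4:4 + length]))
            del self._buf[:4 + length]
        return messages


if __name__ == "__main__":
    stream = frame(b"alert") + frame(b"data payload")
    framer = MessageFramer()
    # Chunk sizes are chosen arbitrarily; they do not match message sizes.
    for chunk in (stream[:3], stream[3:10], stream[10:]):
        print(framer.feed(chunk))
```

The chunk boundaries here stand in for whatever segmentation the network happens to produce; the application sees the same messages regardless.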

Act IV: Window’s Tale
- Call center servers are not able to keep up with call volume after a data center migration.
- The servers are not getting the data fast enough, causing a backlog. What simple change can increase the throughput?
- The path after the migration is longer by 50 ms.
- Lessons Learned:
  - If latency is causing a problem, look for RFC1323-related problems.
  - Know what affects a transfer’s throughput: buffer tearing, window sizes, or packet loss.
  - Use the graphical plots to zoom in on the problem. So let’s look at the window size. Should we look at the receive or send window?
  - Argue your case. If you’re right, you’re right! But you had better be right. You earn your “cred” over time, but you can blow it in one shot!
  - Use the graphical tools available in Wireshark. A picture *IS* worth a thousand words! See next page.
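Why added latency hurts a window-limited transfer can be shown with simple arithmetic: a sender can have at most one window of unacknowledged data in flight per round trip, so throughput is bounded by window size divided by round-trip time. The sketch below assumes a 65,535-byte window (the largest possible without the RFC1323 window scale option) and treats 50 ms as the full round-trip time purely for illustration; the function names are hypothetical.

```python
def max_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput (bits/s) when one window is sent per RTT."""
    return window_bytes * 8 / rtt_seconds


def window_needed_bytes(target_bps: float, rtt_seconds: float) -> float:
    """Window size (bytes) needed to sustain target_bps at the given RTT."""
    return target_bps * rtt_seconds / 8


if __name__ == "__main__":
    unscaled_window = 65535  # largest window without window scaling
    for rtt_ms in (1, 50):   # illustrative RTTs, not measured values
        mbps = max_throughput_bps(unscaled_window, rtt_ms / 1000) / 1e6
        print(f"RTT {rtt_ms} ms: at most {mbps:.1f} Mbit/s")
    needed = window_needed_bytes(100e6, 0.050)
    print(f"Window for 100 Mbit/s at 50 ms: {needed:.0f} bytes")
```

At a 50 ms round trip the unscaled window caps the transfer at roughly 10.5 Mbit/s no matter how fast the link is, which is why window scaling is worth checking first when a path gets longer.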

Act IV: Window’s Tale
Use STATISTICS, IO GRAPH to bring up this graph. Modify the highlighted items to bring up this view.

Act V: A User’s Complaint
- Smith Barney Financial Consultants are complaining of slow page load times for their home page. The problem is sporadic and random, but it happens often enough that it’s impacting their productivity.
- The problem is widespread and not easily reproducible. Where do you start? What do you do? “Who you gonna call?”
- What’s common in the problem? Home page; use of a load balancer; common backend servers; many users affected.
- What’s the job of a load balancer?
- Where should we take the trace?
- What “bad things” can happen if you are using a load balancer with Source NAT configured?

Act V: A User’s Complaint (cont’d)

Act V: A User’s Complaint (cont’d)
- Lessons Learned:
  - Start by looking at what infrastructure is common to all users experiencing the problem.
  - What constitutes a TCP packet? 2-Tuple? 4-Tuple?
  - Remember that sequence numbers are nothing more than the number of bytes transferred. An acknowledgement is nothing more than an indication of how much of the data you received. If you receive something outside of what’s expected, something went horribly wrong!
  - When you have a 22,000-user base, the ephemeral port range can be exhausted quickly.
  - Sometimes you have to resort to turning off “relative sequence numbers” for analysis. This is especially true when a load balancer, or any device that NATs, is in the data path.

Act V: A User’s Complaint (cont’d)
- Lessons Learned (cont’d): (Turn off relative sequence numbers)
  - Frames 1-8 contain the orderly close of a connection.
  - Frame 9, which occurs approx. 14 seconds later, is an attempt by a ‘new’ client to open a connection to the LB. (Frame 10 is the LB-translated request to the web server.)
  - Frame 11 is an acknowledgement for the prior connection. This occurs because the web server still has this socket in FIN-WAIT. (Frame 12 is the translated request, LB to client.)
  - Frames 13 and 14 are the RST generated by the client and the translated request, respectively.
  - The frames that follow contain a connection creation. This is allowed to occur because of the RST. However, this causes the client to pause for approx. 3 seconds.
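The failure described in these frames can be reasoned about with a small model. With source NAT, every connection from the load balancer to one web server shares the same source IP, destination IP and destination port, so only the source port distinguishes one connection from the next. If the load balancer reuses a source port while the web server still holds the previous socket with that port, the new SYN collides with the old connection. The sketch below is a simplified simulation under stated assumptions: the port range, timings, and function name are hypothetical, ports are handed out round-robin, and the server is assumed to hold each socket for a fixed linger time after it was opened.

```python
from itertools import cycle


def find_port_reuse_collisions(connection_times, port_range, linger_seconds):
    """Return (time, port) pairs where a source port is reused too early.

    connection_times: ascending start times (seconds) of new connections.
    port_range: source ports the NAT device may use, assigned round-robin.
    linger_seconds: how long the server keeps the old socket around.
    """
    ports = cycle(port_range)
    last_used = {}   # port -> time it was last handed out
    collisions = []
    for t in connection_times:
        port = next(ports)
        previous = last_used.get(port)
        if previous is not None and t - previous < linger_seconds:
            collisions.append((t, port))
        last_used[port] = t
    return collisions


if __name__ == "__main__":
    # Hypothetical numbers: 4 usable ports, one new connection every 5 s.
    times = [0, 5, 10, 15, 20, 25]
    print(find_port_reuse_collisions(times, range(1024, 1028), 60))
    print(find_port_reuse_collisions(times, range(1024, 1028), 10))
```

The model suggests why a large user base behind a single NAT address makes the problem more likely: the faster the ports cycle, the sooner one comes around again while the old socket is still alive on the server.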

Act VI: As You Log It
- After a data center migration, an application was no longer able to support the production traffic. The new data center was separated by 11 ms of round-trip latency. Before the move, both servers were located in the same DC.
- Naturally, the first inclination was to blame the network! After all, the problem started after the migration.
- The application generates a 3-byte “alert” message followed by another small packet with the actual data.
- What should be the first problem that comes to your mind?
- What looked like a slam dunk turned out to be quite complicated!
- In the Army, we had a saying: Be, Know, Do. It applies to packet analysis.
- At the end of the day, in-depth knowledge of how TCP should work allowed us to find the problem.

Act VI: As You Log It (cont’d)
- Lessons Learned:
  - The Nagle and Delayed Acknowledgment deadlock is very common when TCP is used to shuttle small amounts of data.
  - This can be a “killer” when trading programs are involved.
  - Turning on application-level logging can help, but don’t forget to turn it off!
  - Know what impact you can have if you decide to log. For us router jockeys, it’s equivalent to doing a “debug ip ospf” on a production backbone router. Hint: not a good idea. It’s a self-correcting error: if you do it once, you’ll never do it again!
  - If you know how TCP really works, you can argue your point with conviction because, deep down inside, you know you’re right.
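The Nagle and delayed-ACK interaction arises when an application writes two small pieces back to back: Nagle holds the second write until the first is acknowledged, while the receiver delays that acknowledgment. The Python sketch below shows two commonly used mitigations, using only the standard `socket` module: setting `TCP_NODELAY`, and combining the small header and the data into a single write. The slides do not say which fix was applied in this case, and the function names and message contents here are illustrative assumptions.

```python
import socket


def disable_nagle(sock: socket.socket) -> None:
    """Ask the TCP stack to send small writes immediately (TCP_NODELAY)."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


def build_message(alert: bytes, data: bytes) -> bytes:
    """Combine the small alert header and its data into one buffer."""
    return alert + data


def send_alert_and_data(sock: socket.socket, alert: bytes, data: bytes) -> None:
    """Hand both pieces to TCP in a single write instead of two small ones."""
    sock.sendall(build_message(alert, data))


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    disable_nagle(s)
    # A non-zero value means the option is set on this (unconnected) socket.
    print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
    s.close()
    print(build_message(b"ALR", b"payload"))
```

Combining the writes is usually the gentler choice because it keeps Nagle's protection against floods of tiny packets, whereas `TCP_NODELAY` trades that protection for lower latency.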

Appendix: IP’s used in the examples
- ACT I: ICMP_BHNew*pcap: … and … are servers on the same switch.
- ACT II: SlowSSHLoging2.pcap: … is the client, … is the ssh server, and … are NIS+ servers.
- ACT III: SlowFtpAnon.pcap: … is the ftp server. The client is pulling the file from the server.
- ACT IV: MQSlow.pcap: … is the MQ server, … is the MQ client. The server is pushing the file to the client.
- ACT V: LBProblemNew.pcap: … and … are users in different branches, and … belong to the load balancer. … is the real web server, … is the end-user-facing IP of the LB, and … is the IP used by the LB for source NAT’ing when talking to the real web server.
- ACT VI: DCMove_*.pcap: … and … are two servers involved in the transfer. Both send data independently of one another.

Please email me at … if you would like “The Tool” Visio.