Internet Measurement 2007. Outline Measurement overview –Why measure? Why model measurements? –What to measure? Where to measure? Internet challenges.

Presentation transcript:

Internet Measurement 2007

Outline Measurement overview –Why measure? Why model measurements? –What to measure? Where to measure? Internet challenges Measurement tools –Active: ping, traceroute, and pathchar –Passive: logs, SNMP, packet, and flow monitoring Operational applications of measurement Discussion

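Performance Evaluation Techniques: Experiment/Measurement, Analysis, Simulation/Emulation Experiment/measurement techniques –Use measurement equipment or measurement programs (software) to directly measure a computer system's performance metrics, or quantities related to them, and then compute the corresponding performance metrics from those values Modeling techniques –Build an appropriate model of the computer system under evaluation, then derive the model's performance metrics in order to evaluate the system –Modeling divides into analytical techniques and simulation techniques Analysis –Uses mathematical methods, simplifying the system and building an analytical model to obtain the system's performance Simulation –Uses software simulation, constructing a simulation model that describes the computer system in realistic detail –As the model runs the way the system itself would, statistics on its dynamic behavior yield the relevant performance metrics Measurement, analysis, and simulation are interrelated, validate one another, and each has its own strengths and weaknesses Emulation = simulation + experiment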

Why Measure? The Internet is a man-made system, so why do we need to measure it? –Because we still don’t really understand it –Because sometimes things go wrong –Analyze/characterize network phenomena Measurement for network operations –Detecting and diagnosing problems –What-if analysis of future changes Measurement for scientific discovery –Characterizing a complex system as organism –Creating accurate models that represent reality –Identifying new features and phenomena –Test new tools, protocols, systems, etc.

Why Build Models of Measurements? Compact summary of measurements –Efficient way to represent a large data set –E.g., exponential distribution with mean 100 sec Expose important properties of measurements –Reveals underlying cause or engineering question –E.g., mean RTT to help explain TCP throughput Generate random but realistic data as input –Generate new data that agree in key properties –E.g., topology models to feed into simulators “All models are wrong, but some models are useful.” – George Box
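
The "compact summary" and "generate random but realistic data" points can be illustrated with a small sketch. It assumes the measurements are well described by an exponential distribution (the slide's example); the function names and the seed are hypothetical choices for illustration, not part of the original lecture.

```python
import random

def fit_exponential_mean(samples):
    """Summarize a data set by one parameter: the mean of an
    assumed exponential distribution."""
    return sum(samples) / len(samples)

def generate_synthetic(mean, n, seed=0):
    """Draw n synthetic values that agree with the data in one key
    property (the mean). The seed is fixed only for repeatability."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / mean) for _ in range(n)]
```

For example, `generate_synthetic(fit_exponential_mean(durations), 1000)` would yield synthetic durations for a simulator, where `durations` is whatever trace was measured. Whether an exponential model is appropriate has to be checked against the data first.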

What Can Be Measured? Traffic –Packet or flow traces –Load statistics Performance of paths –Application performance, e.g., Web download time –Transport performance, e.g., TCP bulk throughput –Network performance, e.g., packet delay and loss Network structure –Topology, and paths on the topology –Dynamics of the routing protocol Performance Metrics –Throughput, Latency, Response time, Loss, Utilization, Arrival rate, Bandwidth, Routing (hop), Reliability

Sample Question: Topology? What is the topology of the network? –At the IP router layer –Without “inside” knowledge or official network maps –Without SNMP or other privileged access Why do we care? –Often need topologies for simulation and evaluation –Intrinsic interest in how the Internet behaves “But we built it! We should understand it” Emergent behavior; organic growth

Where to Measure? Short answer –Anywhere you can! End hosts –Sending active probes to measure performance –Application logs, e.g., Web server logs Individual links/routers –Load statistics, packet traces, flow traces –Configuration state –Routing-protocol messages or table dumps –Alarms

Internet Challenges Make Measurement an Art Stateless routers –Routers do not routinely store packet/flow state –Measurement is an afterthought, adds overhead IP narrow waist –IP measurements cannot see below network layer –E.g., link-layer retransmission, tunnels, etc. Violations of end-to-end argument –E.g., firewalls, address translators, and proxies –Not directly visible, and may block measurements Decentralized control –Autonomous Systems may block measurements –No global notion of time

Active Measurement: Ping Adding traffic for purposes of measurement –Send probe packet(s) into the network and measure a response –Trade-offs between accuracy and overhead –Need careful methods to avoid introducing bias Ping: RTT and connectivity –Host sends an ICMP ECHO packet to a target –… and captures the ICMP ECHO REPLY –Useful for checking connectivity, and RTT –Only requires control of one of the two end-points Problems with ping –Round-trip rather than one-way delays –Some hosts might not respond
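
As a rough sketch of what a ping probe looks like on the wire, the code below builds an ICMP echo request (type 8, code 0) and checks whether a received ICMP message is the matching echo reply (type 0). It only constructs and parses bytes; actually sending the probe needs a raw socket and usually elevated privileges, which is left out here. The function names and the default payload are hypothetical.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by ICMP."""
    if len(data) % 2:
        data += b"\x00"          # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:           # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(ident: int, seq: int, payload: bytes = b"probe") -> bytes:
    """ICMP ECHO: type 8, code 0, checksum, identifier, sequence number."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload

def is_echo_reply_for(reply: bytes, ident: int, seq: int) -> bool:
    """True if `reply` (ICMP message without the IP header) is a valid
    ECHO REPLY matching our identifier and sequence number."""
    if len(reply) < 8:
        return False
    rtype, code, _csum, rid, rseq = struct.unpack("!BBHHH", reply[:8])
    return (rtype == 0 and code == 0 and rid == ident and rseq == seq
            and internet_checksum(reply) == 0)
```

The RTT is then the time between sending the request and receiving the reply that passes `is_echo_reply_for`, measured entirely at the probing host.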

Active Measurement: Traceroute Traceroute: path and RTT –TTL (Time-To-Live) field in IP packet header Source sends a packet with a TTL of n Each router along the path decrements the TTL “TTL exceeded” sent when TTL reaches 0 –Traceroute tool exploits this TTL behavior Send packets with TTL=1, 2, 3, … and record source of “time exceeded” message (Figure: source sends probes with TTL=1, then TTL=2, toward the destination; a router on the path returns “time exceeded” for each.)
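
The TTL mechanism can be sketched with a toy simulation rather than real packets. The path is a hypothetical list of router names ending in the destination, and every router is assumed to answer, which the next slide explains is not always true.

```python
def simulated_traceroute(path, max_ttl=30):
    """Simulate traceroute over `path`, a non-empty list of node names
    whose last element is the destination.
    Returns a list of (ttl, responder, message) tuples."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        remaining = ttl
        for i, node in enumerate(path):
            if i == len(path) - 1:
                result = (node, "reply")          # probe reached the destination
                break
            remaining -= 1                        # each router decrements the TTL
            if remaining == 0:
                result = (node, "time exceeded")  # router drops probe, reports back
                break
        hops.append((ttl, result[0], result[1]))
        if result[1] == "reply":
            break                                 # no need to probe further
    return hops
```

For the path `["r1", "r2", "dst"]`, the probe with TTL=1 expires at `r1`, the probe with TTL=2 expires at `r2`, and the probe with TTL=3 reaches `dst`.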

Problems with Traceroute Round-trip vs. one-way measurements –Paths may have asymmetric properties –Can’t unambiguously identify one-way outages Failure to reach host : failure of reverse path? Returns IP address of interfaces, not routers –Routers have multiple interfaces –IP address of “time exceeded” packet may be the outgoing interface of the return packet Non-participating network elements –Some routers and firewalls don’t reply –ICMP messages “TTL exceeded” may be filtered or rate-limited Inaccurate delay –including processing delays on the router

Famous Traceroute Pitfall Question: What ASes does traffic traverse? Strawman approach –Run traceroute to destination –Collect IP addresses –Use “whois” to map IP addresses to AS numbers Thought Questions –What IP address is used to send “time exceeded” messages from routers? –How are interfaces numbered? –How accurate is whois data?

Less Famous Traceroute Pitfall Measuring multiple paths –Host sends out a sequence of packets –Successive probes may traverse different paths Each has a different destination port Load balancers send probes along different paths Question: Why won’t just setting the same port number work?
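
One way to picture the pitfall is a load balancer that picks the next hop by hashing header fields of each packet. The sketch below is an assumption for illustration: real load balancers differ in which fields they hash and how, and CRC-32 is used here only as a convenient deterministic hash.

```python
import zlib

def pick_next_hop(flow, next_hops):
    """Per-flow load balancing sketch: hash the flow identifier
    (src IP, dst IP, src port, dst port, protocol) onto a next hop.
    Packets with identical fields always take the same next hop;
    changing any field, such as the destination port, may change it."""
    key = "|".join(str(field) for field in flow).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]
```

Under this model, classic traceroute probes that each use a different destination port can be spread over several paths, so consecutive hops in the output need not belong to one real path.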

More Caveats: Topology Measurement Routers have multiple interfaces Measured topology is a function of vantage points Example: Node degree –Must “alias” all interfaces to a single node (PS 2) –Is topology a function of vantage point? Each vantage point forms a tree See Lakhina et al.

Applications of Traceroute Network troubleshooting –Identify forwarding loops and black holes –Identify long and convoluted paths –See how far the probe packets get Network topology inference –Launch traceroute probes from many places –… toward many destinations –Join together to fill in parts of the topology –… though traceroute undersamples the edges

Designing for Measurement What mechanisms should routers incorporate to make traceroutes more useful? –Source IP address to “loopback” interface –AS number in time-exceeded message –?? More general question: How should the network support measurement (and management)?

Active Measurement: Pathchar for Links –Per-hop capacity, latency, loss Three delay components How to infer d, c? –Plot the minimum RTT difference, min rtt(i+1) − rtt(i), against packet size L –The points fall on a line with slope = 1/c and intercept d
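
The inference of d and c can be sketched as a least-squares line fit over the minimum RTT differences. This follows the slide's labels (slope 1/c, intercept d) and assumes that taking the minimum over many probes removes queueing delay; the data layout and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def infer_link(samples):
    """samples maps packet size L (bytes) to a list of observed
    rtt(i+1) - rtt(i) values in seconds for that size.
    Returns (capacity c in bytes/second, latency d in seconds)."""
    sizes = sorted(samples)
    # The minimum per size is assumed to be free of queueing delay.
    min_diffs = [min(samples[size]) for size in sizes]
    slope, intercept = fit_line(sizes, min_diffs)
    return 1.0 / slope, intercept
```

At least two distinct packet sizes are needed for the fit, and in practice many probes per size, since a single queued probe at every size would bias both estimates.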

Passive Measurement –Capture data as it passes by Two Main Approaches –Packet-level Monitoring Keep packet-level statistics Examine (and potentially, log) variety of packet-level statistics. Essentially, anything in the packet Timing –Flow-level Monitoring Monitor packet-by-packet (though sometimes sampled) Keep aggregate statistics on a flow

Passive Measurement: Logs at Hosts Web server logs –Host, time, URL, response code, content length, … –E.g., [15/Oct/1998:00:00: ] "GET /images/wwwtlogo.gif HTTP/1.0" " "Mozilla/2.0 (compatible; MSIE 3.02; Update a; AK; AOL 4.0; Windows 95)" "-" DNS logs –Request, response, time Useful for workload characterization, troubleshooting, etc.

Passive Measurement: SNMP Simple Network Management Protocol (SNMP) –Reports the number of packets crossing an interface per 5 minutes, or similarly coarse statistics –Coarse-grained counters on the router –E.g., byte and packet counts Polling –Management system can poll the counters –E.g., once every five minutes Advantages: ubiquitous Limitations –Extremely coarse-grained statistics –Delivered over UDP!
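
Turning two polled counter values into a rate can be sketched as below. The sketch assumes a fixed-width counter (32 bits by default) that wraps at most once between polls; the function names are hypothetical and no SNMP library is involved.

```python
def counter_rate(previous, current, interval_s, bits=32):
    """Average rate between two polls of a wrapping counter.
    The modulo handles a single wrap-around between polls; more than
    one wrap cannot be detected from the two values alone."""
    delta = (current - previous) % (1 << bits)
    return delta / interval_s

def link_utilization(byte_rate, link_bps):
    """Fraction of link capacity used, given bytes/second and bits/second."""
    return byte_rate * 8 / link_bps
```

With five-minute polling, `counter_rate(prev, curr, 300)` gives only a five-minute average, which is why short bursts are invisible in SNMP data.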

Passive Measurement: Packet Monitoring Tapping a link –Shared media (Ethernet, wireless): monitor attached alongside Host A and Host B –Multicast switch: switch copies traffic of Hosts A, B, and C to the monitor –Splitting a point-to-point link: monitor taps the link between Router A and Router B –Line card that does packet sampling (Router A)

Packet Monitoring: Selecting the Traffic Filter to focus on a subset of the packets –IP addresses/prefixes (e.g., to/from specific Web sites, client machines, DNS servers, mail servers) –Protocol (e.g., TCP, UDP, or ICMP) –Port numbers (e.g., HTTP, DNS, BGP, Napster) Collect first n bytes of packet (snap length) –Medium access control header (if present) –IP header (typically 20 bytes) –IP+UDP header (typically 28 bytes) –IP+TCP header (typically 40 bytes) –Application-layer message (entire packet)
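
The two selection steps, filtering and truncating to a snap length, can be sketched on raw IPv4 packet bytes. This is an illustrative stand-in for what a capture filter does, not how tcpdump or bpf is implemented; it assumes IPv4 with TCP or UDP, and the function names are hypothetical.

```python
import ipaddress
import struct

def matches(packet: bytes, prefix: str, protocol: int, port: int) -> bool:
    """True if an IPv4 packet is to/from `prefix`, carries `protocol`
    (6 = TCP, 17 = UDP), and uses `port` as source or destination port."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return False
    header_len = (packet[0] & 0x0F) * 4        # IP header length in bytes
    if packet[9] != protocol:                  # protocol field
        return False
    network = ipaddress.ip_network(prefix)
    src = ipaddress.IPv4Address(packet[12:16])
    dst = ipaddress.IPv4Address(packet[16:20])
    if src not in network and dst not in network:
        return False
    if len(packet) < header_len + 4:
        return False
    # TCP and UDP both start with source port, then destination port.
    sport, dport = struct.unpack("!HH", packet[header_len:header_len + 4])
    return port in (sport, dport)

def capture(packet: bytes, snaplen: int) -> bytes:
    """Keep only the first `snaplen` bytes of the packet."""
    return packet[:snaplen]
```

A snap length of 40 bytes keeps a typical IP+TCP header pair and drops the payload, which reduces storage and limits exposure of application data.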

Packet Capture: tcpdump/bpf Put interface in promiscuous mode Use bpf (Berkeley packet filter) to extract packets of interest Accuracy Issues –Packets may be dropped by filter –Failure of tcpdump to keep up with filter –Failure of filter to keep up with dump speeds –Question: How to recover lost information from packet drops?

Tcpdump Output (three-way TCP handshake and HTTP request message) Annotated fields: timestamp, client address and port #, Web server (port 80), SYN flag, sequence number, TCP options. Trace: 23:40: eth0 > > lovelace.acm.org.www: S : (0) win (DF) 23:40: eth : S : (0) ack win :40: eth0 > > lovelace.acm.org.www:. 1:1(0) ack 1 win (DF) 23:40: eth0 > > lovelace.acm.org.www: P 1:513(512) ack 1 win (DF) 23:40: eth :. 1:1(0) ack 513 win :40: eth0 > > lovelace.acm.org.www: P 513:676(163) ack 1 win (DF) 23:40: eth : P 1:179(178) ack 676 win

Analysis of Packet Traces IP header –Traffic volume by IP addresses or protocol –Burstiness of the stream of packets –Packet properties (e.g., sizes, out-of-order) TCP header –Traffic breakdown by application (e.g., Web) –TCP congestion and flow control –Number of bytes and packets per session Application header –URLs, HTTP headers (e.g., cacheable response?) –DNS queries and responses, user key strokes, …

Aggregating Packets into IP Flows (Figure: a packet timeline grouped into flow 1, flow 2, flow 3, and flow 4.) Set of packets that “belong together” –Source/destination IP addresses and port # –Same protocol, ToS bits, … –Same input/output interfaces at a router Packets that are “close” together in time –Maximum spacing between packets (e.g., 15 sec, 30 sec) –Example: flows 2 and 4 are different flows due to time
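
The grouping rule can be sketched as follows: packets with the same key belong to one flow until the gap between consecutive packets exceeds a timeout. The record layout and function name are hypothetical, and the input is assumed to be sorted by timestamp.

```python
def aggregate_flows(packets, timeout=30.0):
    """packets: iterable of (timestamp_s, key, size_bytes), sorted by time,
    where key identifies packets that "belong together" (for example
    src IP, dst IP, src port, dst port, protocol).
    Returns a list of flow records (dicts)."""
    active = {}      # key -> flow record currently open for that key
    finished = []
    for timestamp, key, size in packets:
        flow = active.get(key)
        if flow is not None and timestamp - flow["end"] > timeout:
            finished.append(flow)   # gap too long: close the old flow
            flow = None
        if flow is None:
            flow = {"key": key, "start": timestamp, "end": timestamp,
                    "packets": 0, "bytes": 0}
            active[key] = flow
        flow["end"] = timestamp
        flow["packets"] += 1
        flow["bytes"] += size
    finished.extend(active.values())  # flush flows still open at the end
    return finished
```

Two bursts with identical headers but separated by more than the timeout come out as two records, which is the situation the slide describes for flows 2 and 4.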

Traffic Flow Statistics Flow monitoring (e.g., Cisco Netflow) –Statistics about groups of related packets (e.g., same IP/TCP headers and close in time) –Records header information, counts, and time –May be sampled Flow Record Contents –Basic information about the flow…… Source and Destination, IP address and port Packet and byte counts Start and end times ToS, TCP flags –plus, information related to routing Next-hop IP address Source and destination AS Source and destination prefix

Packet vs. Flow Measurement Basic statistics (available from both techniques) –Traffic mix by IP addresses, port numbers, and protocol –Average packet size Traffic over time –Both: traffic volumes on a medium-to-large time scale –Packet: burstiness of the traffic on a small time scale Statistics per TCP connection –Both: number of packets & bytes transferred over the link –Packet: frequency of lost or out-of-order packets, and the number of application-level bytes delivered Per-packet info (available only from packet traces) –TCP seq/ack #s, receiver window, per-packet flags, … –Probability distribution of packet sizes –Application-level header and body (full packet contents)

Why Trust Your Data? Measurement requires a degree of suspicion –Why should I trust your data? Why should you? Resolving that... –Use current best practices e.g., paris-traceroute, CAIDA topologies, etc. –Don't trust the data until forced to Sanity checks and cross-validation Spot checks (when applicable)

Strategy: Examine the Zeroth-Order Paxson calls this “looking at spikes and outliers” More general: Look at the data, not just aggregate statistics –Tempting/dangerous to blindly compute aggregates –Timeseries plots are telling (gaps, spikes, etc.) –Basics Are the raw trace files empty? –Need not be 0-byte files (e.g., BGP update logs have state messages but no updates) Metadata/context: Did weird things happen during collection (machine crash, disk full, etc.)

Strategy: Sanity Checks & Cross-Validation Paxson breaks cross validation into two aspects –Self-consistency checks (and sanity checks) –Independent observations (Looking at same phenomenon in multiple ways) Example Sanity Checks –Exploiting additional properties of the measured phenomenon E.g., TCP: reliability, ACK cumulative (packet drop measurement problem) –Is time moving backwards? Typical cause: clock synchronization issues –Has the speed of light increased? E.g., 10ms cross-country latencies –Do values make sense? IP addresses like indicate bug
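
Two of the example sanity checks can be sketched directly. The speed-of-light check uses the vacuum speed of light as a hard lower bound on delay; the distance passed in is an assumption supplied by whoever runs the check, and the function names are hypothetical.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458

def backwards_timestamps(timestamps):
    """Return the indices where time appears to move backwards."""
    return [i for i in range(1, len(timestamps))
            if timestamps[i] < timestamps[i - 1]]

def faster_than_light(rtt_s, distance_km):
    """True if a round-trip time is below the physical minimum for the
    given one-way distance, which indicates a measurement bug."""
    minimum_rtt = 2 * distance_km / SPEED_OF_LIGHT_KM_PER_S
    return rtt_s < minimum_rtt
```

Assuming roughly 4,000 km for a cross-country path, the minimum round trip is about 27 ms even in a vacuum, so a 10 ms measurement fails the check. Signals in fiber travel slower than this bound, so real minimums are higher still.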

Cross-Validation Example Traceroutes captured in parallel with BGP routing updates Puzzle –Route monitor sees route withdrawal for prefix –Routing table has no route to the prefix –IP addresses within prefix still reachable from within the IP address space (i.e., traceroute goes through) Why? –Collection bugs … or –Broken mental model of routing setup: A default route!
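
The resolution of the puzzle can be sketched with a longest-prefix-match lookup: once the specific route is withdrawn, a default route still matches every address. The table contents and next-hop names below are hypothetical, using documentation address ranges.

```python
import ipaddress

def lookup(table, address):
    """Longest-prefix match. `table` maps prefix strings to next hops.
    Returns the next hop of the most specific matching prefix, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, next_hop in table.items():
        network = ipaddress.ip_network(prefix)
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop)
    return None if best is None else best[1]

# Hypothetical routing table with a default route and one specific prefix.
routes = {"0.0.0.0/0": "upstream", "198.51.100.0/24": "peer"}
before_withdrawal = lookup(routes, "198.51.100.7")   # matches the /24
del routes["198.51.100.0/24"]                         # route withdrawn
after_withdrawal = lookup(routes, "198.51.100.7")    # falls back to 0.0.0.0/0
```

After the withdrawal the lookup still returns a next hop via the default route, so traceroute keeps working even though no route for the specific prefix remains.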

Measurement Challenges for Operators Network-wide view –Crucial for evaluating control actions –Multiple kinds of data from multiple locations Large scale –Large number of high-speed links and routers –Large volume of measurement data Poor state-of-the-art –Working within existing protocols and products –Technology not designed with measurement in mind The “do no harm” principle –Don’t degrade router performance –Don’t require disabling key router features –Don’t overload the network with measurement data

Network Operations Tasks Reporting of network-wide statistics –Generating basic information about usage and reliability Performance/reliability troubleshooting –Detecting and diagnosing anomalous events Security –Detecting, diagnosing, and blocking security problems Traffic engineering –Adjusting network configuration to the prevailing traffic Capacity planning –Deciding where and when to install new equipment

Basic Reporting Producing basic statistics about the network –For business purposes, network planning, ad hoc studies Examples –Proportion of transit vs. customer-customer traffic –Total volume of traffic sent to/from each private peer –Mixture of traffic by application (Web, Napster, etc.) –Mixture of traffic to/from individual customers –Usage, loss, and reliability trends for each link Requirements –Network-wide view of basic traffic and reliability statistics –Ability to “slice and dice” measurements in different ways (e.g., by application, by customer, by peer, by link type)

Troubleshooting Detecting and diagnosing problems –Recognizing and explaining anomalous events Examples –Why a backbone link is suddenly overloaded –Why the route to a destination prefix is flapping –Why DNS queries are failing with high probability –Why a route processor has high CPU utilization –Why a customer cannot reach certain Web sites Requirements –Network-wide view of many protocols and systems –Diverse measurements at different protocol levels –Thresholds for isolating significant phenomena

Security Detecting and diagnosing problems –Recognizing suspicious traffic or disruptions Examples –Denial-of-service attack on a customer or service –Spread of a worm or virus through the network –Route hijack of an address block by adversary Requirements –Detailed measurements from multiple places –Including deep-packet inspection, in some cases –Online analysis of the data –Installing filters to block the offending traffic

Traffic Engineering Adjusting resource allocation policies –Path selection, buffer management, and link scheduling Examples –OSPF weights to divert traffic from congested links –BGP policies to balance load on peering links –Link-scheduling weights to reduce delay for “gold” traffic Requirements –Network-wide view of the traffic carried in backbone –Timely view of the network topology and config –Accurate models to predict impact of control operations (e.g., the impact of RED parameters on TCP throughput)

Capacity Planning Deciding whether to buy/install new equipment –What? Where? When? Examples –Where to put the next backbone router –When to upgrade a link to higher capacity –Whether to add/remove a particular peer –Whether the network can accommodate a new customer –Whether to install a caching proxy for cable modems Requirements –Projections of future traffic patterns from measurement –Cost estimates for buying/deploying new equipment –Model of the potential impact of the change (e.g., latency reduction and bandwidth savings from a caching proxy)

Examples of Public Data Sets Network-wide data –Abilene and GEANT backbones –Netflow, IGP, and BGP traces CAIDA DatCat –Data catalogue maintained by CAIDA Interdomain routing –RouteViews and RIPE-NCC –BGP routing tables and update messages Traceroute and looking glass servers

PlanetLab for Network Measurement Nodes are largely at academic sites –Other alternatives: RON testbed (disadvantage: smaller, less software support) Repeatability of network experiments is tricky –Proportional sharing Minimum guarantees provided by limiting the number of outstanding shares –Work-conserving CPU scheduler means experiment could get more resources if there is less contention

Discussion How important is accuracy of the data? How can we validate measurement studies? How to do controlled experiments with measurement techniques? Can we move measurement to a science rather than an art? Can we identify incentives for making measurement possible and data available? Measurement is meaningless without careful analysis Distributed analysis of measurement data? An architecture for router or line-card support for traffic and performance measurement? Trade-offs between security and privacy?