A victim-centric peer-assisted framework for monitoring and troubleshooting routing problems.



How to Monitor? Four schemes
  Monitor devices such as routers
    Pros: No overhead
    Cons: Device status does not directly translate into user-perceived performance
  Monitor BGP updates
    Pros: No overhead; know what happens in other networks
    Cons: Do not see some data-plane anomalies
  Monitor flow-level traffic
    Pros: No overhead; real traffic; witness direct impact of failures
    Cons: Do not witness failures directly
  Active probing
    Pros: Witness direct impact of failures
    Cons: Extra overhead; may not mimic the real traffic

What Constraint to Monitor?
  Network meets ISP’s goals
    Resource utilization
    Routing goes as specified by policy
    …
  Network meets users’ goals
    Reachability
      Most fundamental end-to-end property
      Easy to define and formulate
    Delay, loss
      Harder to define and formulate
    Application level: bulk transfer, VoIP
      Depends on reachability, delay, loss, etc.

Our Monitoring Scheme
  Monitor reachability using active probing
    Focus on reachability
    Use ping: no need for remote cooperation
    Trade-off between probing efficiency and probing coverage (challenges)
  Disclaimers
    Do not monitor delay or loss
    Do not consider ISP’s goals
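
The ping-based reachability check above can be sketched as follows. This is a minimal illustration, not the authors' monitor: the function names `ping_once` and `probe_round`, the retry count, and the Linux iputils flags (`-c` for count, `-W` for a timeout in seconds) are all assumptions. The probe function is injectable so the retry logic can be exercised without a network.

```python
import subprocess
from typing import Callable, Dict, Iterable


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo; True if answered.

    Assumes a Linux iputils `ping` (-c count, -W timeout in seconds).
    """
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 1,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def probe_round(
    targets: Iterable[str],
    probe: Callable[[str], bool] = ping_once,
    retries: int = 2,
) -> Dict[str, bool]:
    """Probe every target; a target counts as reachable if any attempt succeeds.

    Retrying trades probing overhead for fewer false alarms caused by
    a single lost probe (hypothetical policy, chosen for illustration).
    """
    status = {}
    for target in targets:
        # any() stops at the first successful attempt.
        status[target] = any(probe(target) for _ in range(retries + 1))
    return status
```

Each extra retry raises probing overhead on unreachable targets, which is one concrete face of the efficiency versus coverage trade-off named on the slide.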

Troubleshooting: Next Step to Monitoring
  Goal of troubleshooting
    Localize the root cause
    How local? Depends on the nature of the cause
  Purpose of troubleshooting
    Local root cause
      Pinpoint the problem and fix it
    Remote root cause
      Contact the responsible networks to solve the problem
      Bypass the faulty network

Localize the Root Cause
  [Figure: AS 1, AS 2, AS 3 with links a->b, b->c, c->d, m->n, m->l, n->l, x->y, y->z, z->x]
  Topology dimension
    Link level
    Localize the cause at AS level
  Protocol dimension
    Forwarding paths (those who do forwarding)
    Control plane
    Physical and link layers
    Firewalls (those who prevent forwarding)
    Localize the cause at protocol level
  Both AS and protocol level

Troubleshooting: Three building blocks
  Tool
    traceroute, ping, NetFlow, looking glass, etc.
  Data: generated by the tool
    End-to-end reachability, BGP updates, traffic profile, etc.
  Brain: the intelligent part, usually the network operator
    Digest the data, make inferences, leverage dependencies, draw from past experience
    The key to troubleshooting. Hard problem

What Can We Do to Improve?
  Improve the tool
    Promote cooperation among networks
    Traceroute -> resilient remote traceroute
    BGP feed -> resilient remote BGP feed
  Improve the automation of the brain
    Unify previous work

Automatic Brain
  It’s a challenging problem
    Faults may occur at multiple levels
    Involves machine learning
  Example work: enterprise network services, SIGCOMM’07, by Paramvir Bahl et al.

Dependency Graph Approach
  Decompose a large system into components
  Infer the dependencies among components
    A depends on B: if B fails, A fails
    Leads to a hierarchy of dependencies: a dependency graph (like a Makefile)
  A set of observations on some components
    For example, F, H, and X work but G fails
  Infer the status of other components using the dependencies, and finally locate the root-cause component
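
The inference step above can be sketched as follows, under two assumptions that are not on the slide: a single root cause explains every failed observation, and a working component exonerates everything it transitively depends on. The function names and the example graph (components A to D) are hypothetical; only the observation "F, H, X work but G fails" comes from the slide.

```python
from typing import Dict, Iterable, List, Set


def closure(graph: Dict[str, List[str]], node: str) -> Set[str]:
    """Return `node` plus everything it transitively depends on."""
    seen: Set[str] = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(graph.get(current, ()))
    return seen


def candidate_root_causes(
    graph: Dict[str, List[str]],
    working: Iterable[str],
    failing: Iterable[str],
) -> Set[str]:
    """Components that could explain all failures (single-fault assumption)."""
    # If A works, everything A depends on must work too.
    healthy: Set[str] = set()
    for component in working:
        healthy |= closure(graph, component)

    # A single root cause must lie in the dependency closure of every failure.
    candidates = None
    for component in failing:
        suspects = closure(graph, component) - healthy
        candidates = suspects if candidates is None else candidates & suspects
    return candidates or set()


# Hypothetical graph: each key depends on the components in its list.
example_graph = {"F": ["B"], "G": ["B", "C"], "H": ["A"], "X": ["D"]}
suspects = candidate_root_causes(example_graph, working=["F", "H", "X"], failing=["G"])
```

In this example B is exonerated because F works, so the remaining suspects are G itself and C. A finer-grained graph would narrow the suspects further, at the cost of a larger graph to build and solve.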

Dependency Graph Example 1 Multi-tier dependency graph. Diagnoses multi-level fault but needs automated construction. [ From Paramvir Bahl et al, sigcomm’07 ]

Dependency Graph Example 2 Flat dependency graph. Diagnoses simple fault. [From Ramana Kompella et al, infocom’07 ]

Trade-off in Decomposition
  The granularity of decomposition determines how specific the troubleshooting is
  Fine-grained decomposition
    Advantage: more specific
    Disadvantage: the graph is more complex; constructing and solving it is challenging
  Coarse-grained decomposition
    Advantage: the graph is simple; constructing and solving it is less challenging
    Disadvantage: less specific

Dependency Graph Regarding Internet Routing
  Legend: an arrow from A to B means A depends on B
  Path p->q before failure: IP hops u_0, u_1, …, u_n; AS hops N_0, N_1, …, N_m
  Nodes in the graph
    p can ping q
    p can send packets to q
    q can send packets to p
    …
    Forwarding path p->q is OK
    Physical path p->q is OK
    Control plane info is correctly propagated
    Link u_i->u_{i+1} is up
    AS N_i has correct route
    AS N_i imports routes of prefix p N_{i+1}
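
A sketch of how such a graph might be built for one direction of a path. The node labels follow the slide, but the edge layout is an assumption made for illustration: sending depends on the forwarding path, which depends on both the physical path and control-plane propagation, which in turn depend on per-link and per-AS conditions. The function name `routing_dependency_graph` is hypothetical.

```python
from typing import Dict, List


def routing_dependency_graph(
    p: str, q: str, ip_hops: List[str], as_hops: List[str]
) -> Dict[str, List[str]]:
    """Map each condition to the conditions it depends on (assumed layering)."""
    send = f"{p} can send packets to {q}"
    forwarding = f"forwarding path {p}->{q} is OK"
    physical = f"physical path {p}->{q} is OK"
    control = f"control plane info for {p}->{q} is correctly propagated"

    graph: Dict[str, List[str]] = {
        send: [forwarding],
        forwarding: [physical, control],
    }
    # One leaf per consecutive pair of IP hops u_i -> u_{i+1}.
    graph[physical] = [
        f"link {a}->{b} is up" for a, b in zip(ip_hops, ip_hops[1:])
    ]
    # One leaf per AS hop N_i.
    graph[control] = [f"AS {n} has correct route" for n in as_hops]
    return graph


g = routing_dependency_graph("p", "q", ["u0", "u1", "u2"], ["N0", "N1"])
```

The leaves of this graph are the candidate root causes; "p can ping q" would sit one level higher and depend on both the p->q and the q->p instances.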

Dependency Graph Regarding Internet Routing (cont.)
  Accounts for three common root causes
    Link/router failure
    Router misconfiguration leading to a missing route (i.e., the route is not imported)
    Router misconfiguration or attack leading to prefix hijacking
  Locates the root cause topology-wise, and also distinguishes among the three root causes
    Reasonably specific

Recent Work on Network Troubleshooting
  INFOCOM’07, Detection and Localization of Network Black Holes, by Ramana R. Kompella et al.
    Automate the “brain”. Consider only physical failure. Mainly for intra-domain. Flat dependency graph.
  CoNEXT’07, NetDiagnoser: Troubleshooting network unreachabilities using end-to-end probes and routing data, by Amogh Dhamdhere et al.
    Automate the “brain”. Consider both physical failure and control plane fault. For inter-domain. Flat dependency graph.
  SIGCOMM’07, Automating Cross-layer Diagnosis of Enterprise Wireless Networks, by Cheng et al.
    Improving the “tool”. Measure and infer various delays in a wireless environment.
  SIGCOMM’07, Towards Highly Reliable Enterprise Network Services Via Inference of Multi-level Dependencies, by Paramvir Bahl et al.
    Automate the “brain”. Mainly for enterprise network and services. Deal with multi-level faults. Automatically generate multi-tier dependency graph.

NetDiagnoser: Overview
  Troubleshooting unreachability
  Fault assumption: link failure, or router misconfiguration causing partial link failure (in particular BGP export filter misconfiguration)
  Deals with filtered traceroute
  More comprehensive than previous work
  Infrastructure: sensors, all pair-wise traceroute
  Mechanisms
    Binary tomography
    Per-neighbor logical links modeling the control plane
    Combining BGP withdrawal messages
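
Binary tomography can be illustrated with a generic greedy sketch. It is not NetDiagnoser's exact algorithm; the function name, the path names, and the tie-breaking rule are assumptions. Every link on a working path is treated as good, and the links that appear on the most still-unexplained failed paths are blamed first.

```python
from typing import Dict, List, Set, Tuple

Link = Tuple[str, str]


def binary_tomography(paths: Dict[str, List[Link]], failed: Set[str]) -> List[Link]:
    """Greedy estimate of a small link set that explains all failed paths."""
    # Links on any working path are exonerated.
    good_links: Set[Link] = set()
    for name, links in paths.items():
        if name not in failed:
            good_links.update(links)

    # Remaining suspect links for each failed path.
    unexplained = {name: set(paths[name]) - good_links for name in failed}

    blamed: List[Link] = []
    while unexplained:
        counts: Dict[Link, int] = {}
        for links in unexplained.values():
            for link in links:
                counts[link] = counts.get(link, 0) + 1
        if not counts:
            break  # some failed paths have no suspect link left
        # Most frequent suspect; sorting makes ties deterministic.
        best = max(sorted(counts), key=lambda link: counts[link])
        blamed.append(best)
        unexplained = {
            name: links for name, links in unexplained.items() if best not in links
        }
    return blamed


# Hypothetical traceroute results between three sensors.
example_paths = {
    "s1->s2": [("a", "b"), ("b", "c")],
    "s1->s3": [("a", "b"), ("b", "d")],
    "s2->s3": [("c", "b"), ("b", "d")],
}
blamed_links = binary_tomography(example_paths, failed={"s1->s3", "s2->s3"})
```

Here link a->b is cleared by the working path s1->s2, and b->d is the only link shared by both failed paths, so it is the single link blamed.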

NetDiagnoser: Logical Links

NetDiagnoser: Dependency Assumption
  Legend: an arrow from A to B means A depends on B
  Path p->q: IP hops u_0, u_1, …, u_n; AS hops N_0, N_1, …, N_m
  Nodes in the graph
    p can send packets to q
    Forwarding path p->q is OK
    Physical path p->q is OK
    Control plane info is correctly propagated
    Link u_i->u_{i+1} is up
    AS N_{i+1} exports prefix q to AS N_i