Presentation on theme: "Improving Internet Availability"— Presentation transcript:
1 Improving Internet Availability. Nick Feamster, Georgia Tech.
2 Can the Internet be “Always On”? OK for the Web, but what about: E911 service; air traffic control; … Stanford University, Clean-Slate Design for the Internet: “It is not difficult to create a list of desired characteristics for a new Internet. Deciding how to design and deploy a network that achieves these goals is much harder. … It should be: Robust and available. The network should be as robust, fault-tolerant and available as the wire-line telephone network is today. …”
3 Work to do… Various studies (Paxson, Andersen, etc.) show the Internet is at about 2.5 “nines”. More “critical” (or at least availability-centric) applications are moving onto the Internet. At the same time, the Internet is getting more difficult to debug: scale, complexity, disconnection, etc. “It is not difficult to create a list of desired characteristics for a new Internet. Deciding how to design and deploy a network that achieves these goals is much harder. Over time, our list will evolve. It should be: 1. Robust and available. The network should be as robust, fault-tolerant and available as the wire-line telephone network is today.”
8 Threats to Availability: natural disasters; physical failures (node, link); router software bugs; misconfiguration; mis-coordination; denial-of-service (DoS) attacks; changes in traffic patterns (e.g., flash crowd); …
9 Availability of Other Services. Carrier airlines (2002 FAA Fact Book): 41 accidents, 6.7M departures; % availability. 911 phone service (1993 NRIC report +): 29 minutes per year per line; 99.994% availability. Standard phone service (various sources): 53+ minutes per line per year; 99.99+% availability. Credit: David Andersen job talk.
10 Can the Internet Be “Always On”? Various studies (Paxson, Andersen, etc.) show the Internet is at about 2.5 “nines”. More “critical” (or at least availability-centric) applications are on the Internet. At the same time, the Internet is getting more difficult to debug: increasing scale, complexity, disconnection, etc. Is it possible to get to “5 nines” of availability? If so, how?
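To relate the “nines” on these slides to yearly downtime, the arithmetic can be sketched as follows. This is a minimal sketch, assuming a 365-day year and treating availability as simply one minus the fraction of time down; the function names are illustrative and do not come from the talk.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years are ignored (assumption)


def availability_from_downtime(minutes_per_year: float) -> float:
    """Fraction of the year a service is up, given yearly downtime in minutes."""
    return 1.0 - minutes_per_year / MINUTES_PER_YEAR


def nines(availability: float) -> float:
    """Number of 'nines': 0.999 -> 3, 0.99999 -> 5."""
    return -math.log10(1.0 - availability)


def downtime_minutes(n_nines: float) -> float:
    """Yearly downtime budget, in minutes, for a given number of nines."""
    return MINUTES_PER_YEAR * 10 ** (-n_nines)


if __name__ == "__main__":
    # Downtime figures quoted on the "Availability of Other Services" slide.
    for label, minutes in [("911 phone service", 29), ("standard phone service", 53)]:
        a = availability_from_downtime(minutes)
        print(f"{label}: {a:.4%} available ({nines(a):.2f} nines)")
    # The Internet today (about 2.5 nines) versus the 5-nines target.
    for n in (2.5, 5):
        print(f"{n} nines allows about {downtime_minutes(n):.1f} minutes of downtime per year")
```

Under these assumptions, 2.5 nines corresponds to roughly a day of downtime per year, while 5 nines leaves only a few minutes.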
11Two Philosophies Bandage: Accept the Internet as is. Devise band-aids. Amputation: Redesign Internet routing to guarantee safety, route validity, and path visibility
12 Two Approaches. Proactive: Catch the fault before it happens on the live network. Reactive: Recover from the fault when it occurs, and mask or limit the damage.
13 Tutorial Outline. Bandage, proactive: rcc (routers), FIREMAN (firewalls), OpNet, … Bandage, reactive: IP Fast Reroute, RON. Amputation, proactive: Routing Control Platform, 4D Architecture, CoNMan. Amputation, reactive: Failure-Carrying Packets, Multi-Router Configuration, Path Splicing.
14 Proactive Techniques. Today: router configuration checker (“rcc”): check configuration offline, in advance; reason about protocol dynamics with static analysis. Tomorrow: simplify the configuration (CONMan); simplify the protocol operation (RCP, 4D).
15 What can go wrong? Some downtime is very hard to protect against… But two-thirds of the problems are caused by configuration of the routing protocol.
16 Internet Routing Protocol: BGP. [Figure: Autonomous Systems (ASes) connected by sessions, a route advertisement with Destination, Next-hop, and AS Path fields, and the resulting traffic.] Destination-based routing: tables look like a set of possible routes and a ranking over these routes. Speaker notes: the routing-table diagram is very confusing because it is not pointing to anything; the green arrow (a message) should be shorter and less thick; give more intuition about how the system actually works; don’t say “interdomain”; pop up a simplified table fragment.
17 Two Flavors of BGP. External BGP (eBGP): exchanging routes between ASes. Internal BGP (iBGP): disseminating routes to external destinations among the routers within an AS. Question: What’s the difference between IGP and iBGP?
18 Complex configuration! Flexibility for realizing goals in a complex business landscape: which neighboring networks can send traffic; where traffic enters and leaves the network; how routers within the network learn routes to external destinations. [Figure labels: Traffic, Route, No Route, primary route selection, Flexibility, Complexity.] Speaker notes: talk about customers and competitors and the economic/business landscape; goal: ensure that customers have reachability to the rest of the Internet, etc.
19 What types of problems does configuration cause? Persistent oscillation (last time); forwarding loops; partitions; “blackholes”; route instability; …
20 Real Problems: “AS 7007”. “…a glitch at a small ISP… triggered a major outage in Internet access across the country. The problem started when MAI Network Services...passed bad router information from one of its customers onto Sprint.” (news.com, April 25, 1997) [Figure labels: UUNet, Sprint, Florida Internet Barn.]
21 Real, Recurrent Problems. “…a glitch at a small ISP… triggered a major outage in Internet access across the country. The problem started when MAI Network Services...passed bad router information from one of its customers onto Sprint.” (news.com, April 25, 1997) “Microsoft's websites were offline for up to 23 hours...because of a [router] misconfiguration…it took nearly a day to determine what was wrong and undo the changes.” (wired.com, January 25, 2001) “WorldCom Inc…suffered a widespread outage on its Internet backbone that affected roughly 20 percent of its U.S. customer base. The network problems…affected millions of computer users worldwide. A spokeswoman attributed the outage to "a route table issue."” (cnn.com, October 3, 2002) “A number of Covad customers went out from 5pm today due to, supposedly, a DDOS (distributed denial of service attack) on a key Level3 data center, which later was described as a route leak (misconfiguration).” (dslreports.com, February 23, 2004)
22January 2006: Route Leak, Take 2 Con Ed 'stealing' Panix routes (alexis) Sun Jan 22 12:38:All Panix services are currently unreachable from large portions of the Internet (though not all of it). This is because Con Ed Communications, a competence-challenged ISP in New York, is announcing our routes to the Internet. In English, that means that they are claiming that all our traffic should be passing through them, when of course it should not. Those portions of the net that are "closer" (in network topology terms) to Con Ed will send them our traffic, which makes us unreachable.“Of course, there are measures one can take against this sort of thing; but it's hard to deploy some of them effectively when the party stealing your routes was in fact once authorized to offer them, and its own peers may be explicitly allowing them in filter lists (which, I think, is the case here). “
23 Several “Big” Problems a Week. Informal analysis (collected from a mailing list archive): by no means a complete list. 83 power outages, 1 fire. “Two rats crawled through an underground cable conduit into a cabinet of power switching gear adjacent to the Stanford University cogeneration plant, and caused an explosion that cut off power to the Stanford area.” (October 12, 1996) Speaker notes: be careful, incidents are not decreasing; with larger scale, these incidents could be further reaching. The statistical analysis just shows that there are a lot of these incidents; don’t want to make the statement that 1 fire is not important. Do we need to make the point about “fires”?
24 Why is routing hard to get right? Defining correctness is hard. Interactions cause unintended consequences: each network is independently configured; unintended policy interactions. Operators make mistakes: configuration is difficult; complex policies, distributed configuration. Speaker notes: say that the talk is about the first (upcoming SIGCOMM paper). Note that mistakes are not the only thing. An unintended consequence prevents two endpoints from *being able* to communicate with each other: why is this a real problem in practice? (Maybe a concrete example here, or with the quotes.)
25 Today: Stimulus-Response. What happens if I tweak this policy…? [Flowchart: Configure, Observe, “Desired effect?”; if no, Revert; if yes, Wait for Next Problem.] Problems cause downtime. Problems are often not immediately apparent. Problems may be masked. Speaker notes: workflow! Move away from stimulus-response towards operation with a tool that can be used prior to deployment. Not just mistakes. What kinds of things are people likely to want to change? For example: link provisioning, flash crowd (simple example).
26 Idea: Proactive Checks. [Pipeline: distributed router configurations (single AS) are turned into a normalized representation; the correctness specification is turned into constraints; “rcc” checks one against the other and reports faults.] Challenges: analyzing complex, distributed configuration; defining a correctness specification; mapping the specification to constraints. Speaker notes: somehow include the theory on this slide; say “local correctness specification”. Play up the fact that the normalized representation is *not easy*! Interesting engineering side note: the configuration is difficult to parse. Scientific side note: had to come up with normalized references so that testing constraints is easy. Compiler people try to do this for multiple languages (cf).
27 Correctness Specification. Safety: the protocol converges to a stable path assignment for every possible initial state and message ordering; the protocol does not oscillate. Speaker notes: the explanation of path visibility wasn’t obvious; need a simpler explanation of the property. Give some more informal language here: explicitly call out the problems that result when these things are violated. “If you don’t have path visibility, then…”
28 What about properties of resulting paths, after the protocol has converged? We need additional correctness properties.
29 Correctness Specification. Safety: the protocol converges to a stable path assignment for every possible initial state and message ordering; the protocol does not oscillate. Path Visibility: every destination with a usable path has a route advertisement (if there exists a path, then there exists a route); example violation: network partition. Route Validity: every route advertisement corresponds to a usable path (if there exists a route, then there exists a path); example violation: routing loop. Speaker notes: the explanation of path visibility wasn’t obvious; need a simpler explanation of the property. Give some more informal language here: explicitly call out the problems that result when these things are violated. “If you don’t have path visibility, then…”
30 Configuration Semantics. Filtering: route advertisement (who gets what). Ranking: route selection (what they get). Dissemination: internal route advertisement (how they get it, i.e., the path by which they get it). External filtering vs. internal dissemination. What makes distributed configuration hard? Not much else in terms of complicated side effects; it is verifying a distributed program’s correctness. [Figure labels: Customer, Competitor, Primary, Backup.]
31 Path Visibility: Internal BGP (iBGP). Default: “full mesh” iBGP, which doesn’t scale. Large ASes use “route reflection”. Route reflector: advertises non-client routes over client sessions, and client routes over all sessions. Client: doesn’t re-advertise iBGP routes; if the client receives something from a route reflector, it doesn’t re-advertise it. What problem arises? Speaker notes: put “RR” and “C” in the bubbles to make clear what’s a route reflector.
32 iBGP Signaling: Static Check. Theorem. Suppose the iBGP reflector-client relationship graph contains no cycles. Then, path visibility is satisfied if, and only if, the set of routers that are not route reflector clients forms a clique. The condition is easy to check with static analysis. Speaker notes: Dina’s question about a more complex example.
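The theorem’s condition lends itself to a short static check. The sketch below is illustrative only and is not rcc’s implementation: the input format (a `clients_of` mapping from each route reflector to its clients, and a set of plain iBGP peer sessions stored as `frozenset` pairs) and all function names are assumptions made for the example.

```python
from itertools import combinations


def has_cycle(clients_of):
    """Depth-first search for a cycle in the reflector -> client graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}

    def visit(node):
        color[node] = GREY
        for client in clients_of.get(node, ()):
            state = color.get(client, WHITE)
            if state == GREY:  # back edge: the hierarchy loops on itself
                return True
            if state == WHITE and visit(client):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in clients_of)


def missing_top_level_sessions(routers, clients_of, peer_sessions):
    """Return the iBGP sessions that would complete the clique of non-client routers.

    An empty list means the clique condition of the theorem holds.
    """
    if has_cycle(clients_of):
        raise ValueError("reflector-client graph has a cycle; the theorem does not apply")
    all_clients = {c for clients in clients_of.values() for c in clients}
    top_level = sorted(set(routers) - all_clients)  # routers that are nobody's client
    return [(a, b) for a, b in combinations(top_level, 2)
            if frozenset((a, b)) not in peer_sessions]


if __name__ == "__main__":
    routers = ["RR1", "RR2", "C1", "C2"]
    clients_of = {"RR1": ["C1"], "RR2": ["C2"]}
    print(missing_top_level_sessions(routers, clients_of, {frozenset(("RR1", "RR2"))}))
    print(missing_top_level_sessions(routers, clients_of, set()))
```

In the second call the two route reflectors have no session between them, so the check reports that pair as the missing session.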
33 rcc Implementation. [Pipeline: distributed router configurations (Cisco, Avici, Juniper, Procket, etc.), preprocessor, parser, relational database (mySQL), verifier; the verifier applies constraints and reports faults.] Speaker notes: how many lines of code for each one of these modules? Complexity of the verifier (could also talk about this at the beginning). Advantages of SQL: extensible; a declarative language to run deductive queries. Talk about the *complexity* of the database and verifier operations. Do it quickly!
34 Configuration Checking: Take-home lessons. Static configuration analysis uncovers many errors. Major causes of error: distributed configuration; intra-AS dissemination is too complex; mechanistic expression of policy. Speaker notes: workflow. Say something about routing as a distributed program. The approach is still likely to be useful.
35 Limitations of Static Analysis. Problem: many problems can’t be detected from static configuration analysis of a single AS. Dependencies/interactions among multiple ASes: contract violations; route hijacks; BGP “wedgies” (RFC 4264); filtering. Dependencies on route arrivals: simple network configurations can oscillate, but operators can’t tell until the routes actually arrive.
36 BGP Wedgies. AS 1 implements a backup link by sending AS 2 a “depref me” community. AS 2 sets localpref to smaller than that of routes from its upstream provider (AS 3 routes). [Figure: AS 1 with a primary link and a backup (“depref”) link to AS 2; AS 3 and AS 4 upstream.]
37 Wedgie: Failure and “Recovery”. [Figure: the same topology of AS 1, AS 2, AS 3, and AS 4 with primary and backup (“depref”) links.] Requires manual intervention.
38 Routing Attributes and Route Selection. BGP routes have the following attributes, on which the route selection process is based. Local preference: numerical value assigned by routing policy; higher values are more preferred. AS path length: number of AS-level hops in the path. Multiple exit discriminator (“MED”): allows one AS to specify that one exit point is more preferred than another; lower values are more preferred. eBGP over iBGP. Shortest IGP path cost to next hop: implements “hot potato” routing. Router ID tiebreak: arbitrary tiebreak, since only a single “best” route can be selected.
39 Problems with MED. Sequence: R3 selects A; R1 advertises A to R2; R2 selects C; (R1 withdraws A from R2); R2 selects B; (R2 withdraws C from R1); R1 selects A, advertises to R2. The preference between B and C at R2 depends on the presence or absence of A. [Figure: routers R1, R2, R3 and routes A, B, C, with MED values 10 and 20.]
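The selection steps and the MED dependence can be made concrete with a small sketch. It is a simplified illustration, not a router’s implementation: the `Route` fields, the choice that A and B come from the same neighboring AS with MED 10 and 20 respectively, and the IGP costs are all assumptions made for the example. MED is compared only among routes from the same neighboring AS, which is what makes the ranking of B and C depend on A.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    local_pref: int
    as_path: tuple   # AS numbers; the first entry is the neighboring AS (must be non-empty)
    med: int
    ebgp: bool       # True if learned over eBGP, False if over iBGP
    igp_cost: int    # IGP cost to the route's next hop
    router_id: int


def select_best(routes):
    """Apply the selection steps in order, narrowing the candidate set each time."""
    cands = list(routes)
    # 1. Highest local preference.
    top = max(r.local_pref for r in cands)
    cands = [r for r in cands if r.local_pref == top]
    # 2. Shortest AS path.
    shortest = min(len(r.as_path) for r in cands)
    cands = [r for r in cands if len(r.as_path) == shortest]
    # 3. Lowest MED, compared only among routes from the same neighboring AS.
    best_med = {}
    for r in cands:
        neighbor = r.as_path[0]
        best_med[neighbor] = min(best_med.get(neighbor, r.med), r.med)
    cands = [r for r in cands if r.med == best_med[r.as_path[0]]]
    # 4. Prefer eBGP-learned routes over iBGP-learned routes.
    if any(r.ebgp for r in cands):
        cands = [r for r in cands if r.ebgp]
    # 5. Lowest IGP cost to the next hop ("hot potato").
    cheapest = min(r.igp_cost for r in cands)
    cands = [r for r in cands if r.igp_cost == cheapest]
    # 6. Arbitrary tiebreak: lowest router ID.
    return min(cands, key=lambda r: r.router_id)


if __name__ == "__main__":
    # Hypothetical view from R2: A and B come from AS 1, C from AS 2.
    A = Route("A", 100, (1,), 10, False, 3, 1)
    B = Route("B", 100, (1,), 20, False, 1, 2)
    C = Route("C", 100, (2,), 0, False, 2, 3)
    print(select_best([B, C]).name)     # B wins on IGP cost
    print(select_best([A, B, C]).name)  # A's lower MED eliminates B, then C beats A on IGP cost
```

With only B and C visible, R2 picks B; once A appears, B is eliminated at the MED step and C wins, so the relative order of B and C is not fixed.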
40 Tutorial Outline. Bandage, proactive: rcc (routers), FIREMAN (firewalls), OpNet, … Bandage, reactive: IP Fast Reroute, RON. Amputation, proactive: Routing Control Platform, 4D Architecture, CoNMan. Amputation, reactive: Failure-Carrying Packets, Multi-Router Configuration, Path Splicing.
41 Routing Control Platform. Before: conventional iBGP. After: RCP gets “best” iBGP routes (and IGP topology). [Figure labels: eBGP, iBGP, RCP.] Caesar et al., “Design and Implementation of a Routing Control Platform”, NSDI, 2005.
42 How ISPs Route. There are four parts to Internet routing: provide internal reachability (IGP); learn routes to external destinations (eBGP); distribute externally learned routes internally (iBGP); select closest egress (IGP). [Figure: border routers and internal routers with link weights.] Speaker notes: say “There are four parts to Internet routing” (don’t say “in route selection”).
43 What’s wrong with Internet routing? Full-mesh iBGP doesn’t scale: number of sessions, control traffic, router memory/CPU. Route reflectors help by introducing hierarchy, but introduce configuration complexity and protocol oscillations/loops. Hard to manage: many highly configurable mechanisms; difficult to model the effects of configuration changes; hard to diagnose when things go wrong. Hard to evolve: hard to provide new services or improve upon protocols.
44 Routing Control Platform. What’s causing these problems? Each router has limited visibility of IGP and BGP; there is no central point of control/observation; resource limitations on legacy routers. Approach: compute routes from a central point and remove protocols from routers. [Figure: an RCP above each of three networks, connected by an inter-AS protocol.] Speaker notes: why RCP is drawn above the network boxes: just say orally that it’s not part of the forwarding plane.
45 RCP in a Single ISP. Better scalability: reduces load on routers. Easier management: configuration from a single point. Easier evolvability: freedom from router software. Speaker notes: bring up the 3 points after describing the slide (animate). On this slide, say what it is that you’ve implemented: this is an implementation that emulates full mesh; emphasize per-router decision making, a very important point to bring up! Say why it is not just a route reflector. Put text on the slide to say why the arrows go in both directions: available routes up and assigned routes down. Reduce the number of routers; maybe show only the internal ones sending up IGP, or just say RCP gets IGP and take out the lines to internal routers.
46 Example of a Forwarding Loop. C1 learns its BGP route to the destination from RR1; C2 learns its BGP route to the destination from RR2. C1 sends packets to RR1 via its IGP shortest path, which traverses C2. C2 sends packets to RR2 via its IGP shortest path, which traverses C1. [Figure: RR1, RR2, C1, C2 with IGP link weights.]
47 Avoiding Forwarding Loops with RCP. RCP learns BGP routes and computes consistent router-level paths. Intrinsic loop freedom and faster convergence. No need to stick to the BGP decision process. [Figure: RCP above RR1, RR2, C1, C2; both clients are assigned “RR2”.]
48 RCP architecture. Divide the design into components; replication improves availability; distributed operation, but global state per component. [Figure labels: Routing Control Platform (RCP); Route Control Server (RCS); BGP Engine (BGP updates; available BGP routes; selected BGP routes); IGP Viewer (IGP link-state advertisements; path cost matrix); (NSDI ’04).]
49 Challenges and contributions. Reliability. Problem: single point of failure. Contribution: simple replication of RCP components. Consistency. Problem: inconsistent decisions by replicas. Contribution: guaranteed consistency without an inter-replica protocol. Scalability. Problem: storing all routes increases CPU/memory usage. Contribution: can support a large ISP on one computer.
50 Generalization: 4D Architecture. Separate decision logic from packet forwarding. Decision: makes all decisions re: network control. Dissemination: connects routers with decision elements. Discovery: discovers physical identifiers and assigns logical identifiers. Data: handles packets based on data output by the decision plane.
51 Configuration is too complex: Fix Bottom Up! Problem: MIBDepot.com lists 6200 SNMP MIBs from 142 vendors, a million MIB objects; SNMPLink.com lists more than 1000 management applications; market survey: config errors account for 62% of network downtime. Solution: the CONMan abstraction exploits commonality among all protocols; protocol details are hidden inside protocol implementations; complexity shifts from the network manager to the protocol implementer, who in any event must deal with the complexity.
52 Research: Techniques for Availability. Efficient algorithms for testing correctness offline. Networks: VLANs, IGP, BGP, etc. Security: firewalls. Scalable techniques for enforcing correct behavior in the protocol itself.
53 Tutorial Outline. Bandage, proactive: rcc (routers), FIREMAN (firewalls), OpNet, … Bandage, reactive: IP Fast Reroute, RON. Amputation, proactive: Routing Control Platform, 4D Architecture, CoNMan. Amputation, reactive: Failure-Carrying Packets, Multi-Router Configuration, Path Splicing.
54 Reactive Approach. Failures will happen… what to do? (At least) three options: nothing; diagnosis + semi-manual intervention; automatic masking/recovery. How to detect faults? At the network interface/node (MPLS fast reroute); end-to-end (RON, Path Splicing). How to mask faults? At the network layer (“conventional” routing, FRR, splicing); above the network layer (overlays).
55The Internet Ideal Dynamic routing routes around failures End-user is none the wiser
56 Reality. Routing pathologies: 3.3% of routes had “serious problems”. Slow convergence: BGP can take a long time to converge, up to 30 minutes! 10% of routes available < 95% of the time [Labovitz]. “Invisible” failures: about 50% of prolonged outages not visible in BGP [Feamster].
57What to Protect?LinksShared risk link groupsEnd-to-end paths
58 Fast Reroute. Idea: detect link failure locally, switch to a pre-computed backup path that protects that link. Two deployment scenarios. MPLS Fast Reroute: source-routed path around each link failure; requires MPLS infrastructure. IP Fast Reroute: connectionless alternative; various approaches: ECMP, Not-via.
59 IP Fast Reroute. Interface protection (vs. path protection). Detect interface/node failure locally; reroute either to that node or one hop past. Various mechanisms: equal cost multipath; loop-free alternates; not-via addresses.
60 Equal Cost Multipath. Set up link weights so that several paths have equal cost. Protects only the paths for which such weights exist. [Figure: example topology from S to D through I with link weights; one link is marked “Link not protected”.]
61 ECMP: Strengths and Weaknesses. Strengths: simple; no path stretch upon recovery (at least not nominally). Weaknesses: won’t protect a large number of paths; hard to protect a path from multiple failures; might interfere with other objectives (e.g., TE).
62 Loop-Free Alternates. Precompute an alternate next-hop. Choose the alternate next-hop so as to avoid microloops. More flexibility than ECMP. Tradeoff between loop-freedom and available alternate paths. [Figure: example topology toward D with link weights.]
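One common way to state the loop-freedom requirement is the basic inequality from RFC 5286: a neighbor N of source S is a loop-free alternate for destination D if dist(N, D) < dist(N, S) + dist(S, D), meaning N’s own shortest path to D does not come back through S. The sketch below applies that inequality to a made-up topology; the graph, its weights, and the function names are assumptions for illustration, and the graph is assumed connected with symmetric positive weights.

```python
import heapq


def dijkstra(graph, source):
    """Shortest-path distances from source; graph[u][v] is the cost of link u-v."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def loop_free_alternates(graph, s, d):
    """Return (primary next hop, sorted list of loop-free alternate next hops) at s for d."""
    dist = {n: dijkstra(graph, n) for n in graph}
    primary = min(graph[s], key=lambda n: graph[s][n] + dist[n][d])
    alternates = [n for n in graph[s]
                  if n != primary and dist[n][d] < dist[n][s] + dist[s][d]]
    return primary, sorted(alternates)


if __name__ == "__main__":
    # Hypothetical topology: S reaches D through A (cost 2), B (cost 4), or C (cost 11).
    graph = {
        "S": {"A": 1, "B": 2, "C": 1},
        "A": {"S": 1, "D": 1},
        "B": {"S": 2, "D": 2},
        "C": {"S": 1, "D": 10},
        "D": {"A": 1, "B": 2, "C": 10},
    }
    print(loop_free_alternates(graph, "S", "D"))
```

In this topology B qualifies as an alternate, while C does not: C’s shortest path to D goes back through S, so deflecting traffic to C would create a microloop.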
63 Not-via Addresses. Connectionless version of MPLS Fast Reroute: local detection + tunneling. Avoid the failed component; repair to next-next hop. Create special not-via addresses for “deflection”; 2E addresses needed. [Figure labels: D, S, F, B, f.]
64 Not-via Memory Overhead. Extra FIB entries: number of extra entries = number of unidirectional links unprotected by LFA. [Figure: FIB with dst and nexthop columns (Kansas C., Denver, New York, Los Angeles, …) plus a not-via entry NVD->K; numeric values not recoverable.]
65 Not-via: Strengths and Weaknesses. Strengths: 100% coverage; easy support for multicast traffic (due to repair to next-next hop); easy support for SRLGs. Weaknesses: relies on tunneling (heavy processing, MTU issues); suboptimal backup path lengths (due to repair to next-next hop).
66 Concerns with Recovery: Loops. Topology changes caused by either failures or management operations can cause temporary inconsistent state. Causes: asymmetric link costs; topology changes that affect multiple links. Undesirable effects: wasted bandwidth; failure to recover.
68 Incremental Cost Change. A change in a link cost of x can only cause loops whose “cyclic” cost is <= x. The minimum cycle is 2 (1 in each direction), so a cost change of 1 can never cause a loop. Where the minimum cycle is larger, larger increments can be used. Once the cost reaches the cost of the alternate path, no more loops are possible.
69 Synchronized FIB Installation. Network-synchronized change-over at a predetermined time: signal/determine the time to change; network synchronized time (NTP is there). Either: two FIBs for fast swap (substantial hardware implications); or FIB update “fast enough” from the change-over time (dependent on NTP).
70 Ordered FIB changes. For any isolated link/node change, determine a “safe” ordering for FIB installation. Bad news: update from edge to failure. Good news: update from change to edge. Each router computes its “rank” with respect to the change, and delays for a number of worst-case FIB compute/install times proportional to its rank.
71 Computing the Ordering. Single reverse SPF rooted at the change node. Use the old SPT to determine the relevant nodes. For bad news: count the maximum depth of the sub-tree below you. For good news: count the maximum hops to the change.
72 Delay Proportional to Network Diameter. For good news, rSPF gives the necessary depth. For bad news, rSPF is overly pessimistic for some topologies. Strategies to reduce unnecessary delay: prune rSPF by only considering the branch across the failure (but still too pessimistic); run SPF rooted at edge nodes to correctly prune them (but doesn’t scale); compare rSPFs before and after the failure. Avoids all micro-loops and requires a single FIB install. Delay depends on network diameter; may be unacceptable.
73 Ordered SPF Summary. No forwarding changes required. No signalling required at time of change. Complete prevention of loops for isolated node or link changes. Requires cooperation from all routers. May delay re-convergence for tens of seconds (unless optional signalling is used).
74Packet MarkingMark packets to force forwarding according to a particular topology.Topology can be new or old
75Failure-Carrying Packets When a router detects a failed link, the packet carries the information about the failureRouters recompute shortest paths based on these missing edges
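The forwarding behaviour described on this slide can be sketched as a small simulation. This is a simplified illustration under stated assumptions, not the FCP authors’ implementation: every router is assumed to hold the same network map (`graph`), a router learns that a link is down only when it tries to use it, and the function names and topology are invented for the example.

```python
import heapq


def next_hop(graph, node, dest, failed):
    """First hop on the shortest path from node to dest that avoids the links in `failed`."""
    best = {node: 0}
    heap = [(0, node, node)]  # (cost so far, current node, first hop taken)
    while heap:
        cost, u, first = heapq.heappop(heap)
        if cost > best.get(u, float("inf")):
            continue  # stale heap entry
        if u == dest:
            return first
        for v, w in graph[u].items():
            if frozenset((u, v)) in failed:
                continue  # treat links listed in the packet as missing edges
            nc = cost + w
            if nc < best.get(v, float("inf")):
                best[v] = nc
                heapq.heappush(heap, (nc, v, v if u == node else first))
    return None  # dest is unreachable once the failed links are removed


def forward(graph, src, dest, down_links):
    """Walk a packet from src to dest; return the path taken, or None if undeliverable."""
    carried = set()  # failure information carried in the packet header
    node, path = src, [src]
    while node != dest:
        hop = next_hop(graph, node, dest, carried)
        # A router adjacent to a failed link records it in the packet and recomputes.
        while hop is not None and frozenset((node, hop)) in down_links:
            carried.add(frozenset((node, hop)))
            hop = next_hop(graph, node, dest, carried)
        if hop is None:
            return None
        path.append(hop)
        node = hop
    return path


if __name__ == "__main__":
    graph = {
        "S": {"A": 1, "B": 2},
        "A": {"S": 1, "D": 1},
        "B": {"S": 2, "D": 2},
        "D": {"A": 1, "B": 2},
    }
    print(forward(graph, "S", "D", set()))
    print(forward(graph, "S", "D", {frozenset(("A", "D"))}))
```

In the second run the packet reaches A, discovers that link A-D is down, and is sent back through S and B; the extra hops are an instance of path stretch.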
76 FCP: Strengths and Weaknesses. Strengths: stretch is bounded/enforced for single failures, though still somewhat high (20% of paths have 1.2+); no tunneling required. Weaknesses: overhead (option 1: all nodes must have the same network map and recompute SPF; option 2: packets must carry source routes); multiple failures could cause very high stretch.
77 Alternate Approach: Protect Paths. Idea: compute backup topologies in advance. No dynamic routing, just dynamic forwarding. End systems (routers, hosts, proxies) detect failures and send hints to deflect packets; detection can also happen locally. Various proposals: multihoming/multi-path routing; multi-router configurations; Path Splicing.
78 What is Multihoming? The use of redundant network links for the purposes of external connectivity. Can be achieved at many layers of the protocol stack and in many places in the network: multiple network interfaces in a PC; an ISP with multiple upstream interfaces. Can refer to having multiple connections to the same ISP or to multiple ISPs.
79Why Multihome? Redundancy Availability Performance Cost Interdomain traffic engineering: the process by which a multihomed network configures its network to achieve these goals
80Redundancy Maintain connectivity in the face of: Physical connectivity problems (fiber cut, device failures, etc.)Failures in upstream ISP
81 Performance. Use multiple network links at once to achieve higher throughput than over a single link. Allows incoming traffic to be load-balanced. [Figure: 30% of traffic on one link, 70% of traffic on the other.]
82 Multihoming in IP Networks Today. Stub AS: no transit service for other ASes; no need to use BGP. Multi-homed stub AS: has connectivity to multiple immediate upstream ISPs; needs BGP; no need for a public AS number; no need for IP prefix allocation. Multi-homed transit AS: connectivity to multiple ASes and transit service; needs BGP, a public AS number, and IP prefix allocation.
83 BGP or no? Advantages of static routing: cheaper/smaller routers (less true nowadays); simpler to configure. Advantages of BGP: more control of your destiny (have providers stop announcing you); faster/more intelligent selection of where to send outbound packets; better debugging of net problems (you can see the Internet topology now).
84 Same Provider or Multiple? If your provider is reliable, fast, and affordable, and offers good tech support, you may want to multi-home initially to them via some backup path (slow is better than dead). Eventually you’ll want to multi-home to different providers, to avoid failure modes due to one provider’s architecture decisions.
85 Multihomed Stub: One Link. Multiple links between the same pair of routers. The downstream ISP’s routers configure default (“static”) routes pointing to the border router. The upstream ISP advertises reachability. [Figure: “stub” ISP with default routes to its border router, connected to the upstream ISP.]
86 Multihomed Stub: Multiple Links. Multiple links to different upstream routers. Use BGP to share load at the edge; internal routing for “hot potato”. Use a private AS number (why is this OK?). As before, the upstream ISP advertises the prefix. [Figure: “stub” ISP connected to the upstream ISP.]
87 Multihomed Stub: Multiple ISPs. Many possibilities: load sharing; primary-backup; selective use of different ISPs. Requires BGP, a public AS number, etc. [Figure: “stub” ISP connected to upstream ISP 1 and upstream ISP 2.]
88 Multihomed Transit Network. BGP everywhere. Incoming and outgoing traffic. Challenge: balancing load on intradomain and egress links, given an offered traffic load. [Figure: transit ISP connected to ISP 1, ISP 2, and ISP 3.]
89 Protecting Paths: Multi-Path Routing. Idea: compute multiple paths. If the paths are disjoint, one is likely to survive a failure. Send traffic along multiple paths in parallel. Two functions: dissemination (how nodes discover paths) and selection (how nodes send traffic along paths). Key problem: scaling.
90 Multiple Routing Configurations. Relies on multiple logical topologies. Builds backup configurations so that all components are protected. Recovered traffic is routed in the backup configurations. Detection and recovery are local. Path protection to the egress node.
91 MRC: How It Works. Precomputation of backup: backup paths are computed to protect failed nodes or edges; link weights are set high so that no traffic goes through a particular node. Recovery: when a router detects a failure, it switches topologies. Packets keep track of when they have switched topologies, to avoid loops.
92 Configuration. Each configuration is a set of link weights. A configuration is resistant to failures in node n and link l if no traffic is routed over l or via n; in this configuration, l and n are called isolated. There must exist enough configurations so that every network component is isolated in one of them. A link to an isolated node is called restricted; its weight is set high enough that only packets addressed to that node are sent over the link. Isolated links have their weight set to infinity, and are never used.
93Recovering from Failure The router does not immediately inform the rest of the network.Packets that should be forwarded over the failed interface are marked as belonging to the chosen backup configuration and are forwarded over an alternative interface.The routers along the way will see which configuration to use.
94 MRC: Strengths and Weaknesses. Strengths: 100% coverage; better control over recovery paths (recovered traffic is routed independently). Weaknesses: needs a topology identifier (packet marking, or tunnelling); potentially large number of topologies required; no end-to-end recovery (only one switch).
95Multipath: Promise and Problems Bad: If any link fails on both paths, s is disconnected from tWant: End systems remain connected unless the underlying graph is disconnected
96 Path Splicing. Compute multiple forwarding trees per destination. Allow packets to switch slices midstream. Step 1 (Perturbations): run multiple instances of the routing protocol, each with slightly perturbed versions of the configuration. Step 2 (Parallelization): allow traffic to switch between instances at any node in the protocol. [Figure: paths between s and t.]
97 Perturbations. Goal: each instance provides different paths. Mechanism: each edge is given a weight that is a slightly perturbed version of the original weight. Two schemes: uniform and degree-based. [Figure: “base” graph and perturbed graph between s and t, with example link weights.]
98 Slicing. Goal: allow multiple instances to co-exist. Mechanism: virtual forwarding tables. [Figure: nodes a, b, c, s, t; per-slice forwarding tables (Slice 1, Slice 2) with dst and next-hop columns.]
99 Path Splicing in Practice. Packet has a shim header with routing bits. Routers use lg(k) bits to index forwarding tables, then shift the bits after inspection. Incremental deployment is trivial. Persistent loops cannot occur. To access different (or multiple) paths, end systems simply change the forwarding bits.
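The bit handling described here can be sketched in a few lines. This is an illustration under assumptions, not the Path Splicing implementation: the shim header is modeled as a plain integer, `k` is assumed to be a power of two, the table layout `tables[slice][node][dst]` is invented for the example, and the `max_hops` guard is a safety net added for the sketch.

```python
def splice_forward(tables, k, src, dst, bits, max_hops=32):
    """Forward a packet hop by hop, letting the header bits choose a slice at each router.

    tables[i][node][dst] is the next hop at `node` toward `dst` in slice i.
    """
    width = k.bit_length() - 1  # lg(k) bits select one of k slices
    mask = k - 1
    node, path = src, [src]
    while node != dst and len(path) <= max_hops:
        slice_id = bits & mask  # low-order bits pick the forwarding table at this hop
        bits >>= width          # shift so the next router inspects fresh bits
        node = tables[slice_id][node][dst]
        path.append(node)
    return path


if __name__ == "__main__":
    # Two hypothetical slices over nodes s, a, b, t (destination t).
    tables = [
        {"s": {"t": "a"}, "a": {"t": "t"}, "b": {"t": "t"}},  # slice 0: s -> a -> t
        {"s": {"t": "b"}, "a": {"t": "t"}, "b": {"t": "t"}},  # slice 1: s -> b -> t
    ]
    print(splice_forward(tables, 2, "s", "t", 0b0))
    print(splice_forward(tables, 2, "s", "t", 0b1))
```

Changing a single header bit moves the packet from the path through a to the path through b. Once the bits are exhausted the header reads as zero, so in this sketch the remaining hops fall back to slice 0.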
101 Recovery in the Wide-Area. BGP: scalability. Routing overlays (e.g., RON): performance (convergence speed, etc.).
102 Slow Convergence in BGP. Given a failure, it can take up to 15 minutes for BGP to reflect it. Sometimes, it is not reflected at all.
103 BGP Path Exploration. [Figure: AS0, AS1, AS2, AS3 and destination R, with routing-table entries such as “R via AS3” and “R via AS0, AS3”.]
104Routing Convergence in the Wild Route withdrawn, but stub cycles through backup path…
105Resilient Overlay Networks: Goal Increase reliability of communication for a small (i.e., < 50 nodes) set of connected hostsMain idea: End hosts discover network-level path failure and cooperate to re-route.
106 RON: Resilient Overlay Networks. Premise: use an application overlay network to increase the performance and reliability of routing. [Figure: application-layer routers; a two-hop (application-level) route.]
107 RON Can Outperform IP Routing. IP routing does not adapt to congestion, but RON can reroute when the direct path is congested. IP routing is sometimes slow to converge, but RON can quickly direct traffic through an intermediary. IP routing depends on AS routing policies, but RON may pick paths that circumvent policies. Then again, RON has its own overheads: packets go in and out at intermediate nodes (performance degradation, load on hosts, and financial cost); probing overhead to monitor the virtual links limits RON to deployments with a small number of nodes.
108 RON Architecture. Outage detection: active UDP-based probing; probe interval uniform random in [0, 14]; O(n²) probes; 3-way probe, so both sides get RTT information; latency and loss-rate information stored in a DB. Routing protocol: link-state between nodes. Policy: restrict some paths from hosts, e.g., don’t use Internet2 hosts to improve non-Internet2 paths.
109 Main results. RON can route around failures in ~10 seconds. Often improves latency, loss, and throughput. Single-hop indirection works well enough: motivation for the second paper (SOSR); also raises the question of the benefits of overlays.
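Single-hop indirection and the probing cost can be sketched as follows. This is a minimal sketch, assuming each node already holds a table of probed latencies for every virtual link (`None` marks a link whose probes failed); real RON also weighs loss rate and throughput, which this example ignores, and the function names are illustrative.

```python
def best_route(latency, src, dst):
    """Pick the direct path or the best one-intermediary path by measured latency.

    latency[a][b] is the probed latency of virtual link a -> b, or None if probes failed.
    Returns (path, total latency); path is None if no usable route was found.
    """
    def cost(a, b):
        value = latency[a][b]
        return float("inf") if value is None else value

    best_cost, best_path = cost(src, dst), [src, dst]
    for mid in latency:
        if mid in (src, dst):
            continue
        candidate = cost(src, mid) + cost(mid, dst)
        if candidate < best_cost:
            best_cost, best_path = candidate, [src, mid, dst]
    if best_cost == float("inf"):
        return None, best_cost
    return best_path, best_cost


def probes_per_round(n):
    """Each of n nodes probes every other node: n * (n - 1) directed probes, i.e. O(n^2)."""
    return n * (n - 1)


if __name__ == "__main__":
    latency = {
        "A": {"B": None, "C": 20},  # the direct path A -> B is currently failing
        "B": {"A": None, "C": 30},
        "C": {"A": 20, "B": 30},
    }
    print(best_route(latency, "A", "B"))
    print(probes_per_round(50))
```

The quadratic growth of `probes_per_round` is the reason the slides limit RON to a small number of nodes.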
110 When (and why) does RON work? Location: where do failures appear? A few paths experience many failures, but many paths experience at least a few failures (80% of failures on 20% of links). Duration: how long do failures last? 70% of failures last less than 5 minutes. Correlation: do failures correlate with BGP instability? BGP updates often coincide with failures; failures near end hosts are less likely to coincide with BGP; sometimes, BGP updates precede failures (why?). Feamster et al., Measuring the Effects of Internet Path Faults on Reactive Routing, SIGMETRICS 2003.
111 Location of Failures. Why it matters: failures closer to the edge are more difficult to route around, particularly last-hop failures. RON testbed study (2003): about 60% of failures within two hops of the edge. SOSR study (2004): about half of failures potentially recoverable with one-hop source routing. Harder to route around broadband failures (why?).
112 Benefits of Overlays. Access to multiple paths: provided by BGP multihoming. Fast outage detection: but… requires aggressive probing; doesn’t scale. Question: what benefits does overlay routing provide over traditional multihoming + intelligent routing (e.g., RouteScience)?
113 Drawbacks and Open Questions. Efficiency: requires redundant traffic on access links. Scaling: can a RON be made to scale to > 50 nodes? How to achieve probing efficiency? Interaction of overlays and IP network. Interaction of multiple overlays.
114 Efficiency. Problem: traffic must traverse the bottleneck link both inbound and outbound. Solution: in-network support for overlays. End-hosts establish reflection points in routers; this reduces strain on bottleneck links and reduces packet duplication in application-layer multicast (next lecture). [Figure label: Upstream ISP.]
115 Interaction of Overlays and IP Network. ISPs: “Overlays will interfere with our traffic engineering goals.” This likely would only become a problem if overlays became a significant fraction of all traffic. Control theory: feedback loop between ISPs and overlays. Philosophy: who should have the final say in how traffic flows through the network? [Feedback loop: end-hosts observe conditions and react; the traffic matrix changes; the ISP measures the traffic matrix and changes routing config; end-to-end paths change.]
116 Interaction of Multiple Overlays. End-hosts observe qualities of end-to-end paths. Might multiple overlays see a common “good path”? Could these multiple overlays interact to increase congestion, create oscillations, etc.? “Selfish routing”.
117 Lesson from Routing Overlays. End-hosts are often better informed about performance and reachability problems than routers. End-hosts can measure path performance metrics on the (small number of) paths that matter. Internet routing scales well, but at the cost of performance.
118Recovery: Research Problems Tradeoffs between stretch and reliabilityFast, scalable recovery in the wide areaInteractions between routing at multiple layers
119 Summary. Bandage, proactive: rcc (routers), FIREMAN (firewalls), OpNet, … Bandage, reactive: IP Fast Reroute, RON. Amputation, proactive: Routing Control Platform, 4D Architecture, CoNMan. Amputation, reactive: Failure-Carrying Packets, Multi-Router Configuration, Path Splicing.
120 Looking Forward: Rethinking Availability. What definitions of availability are appropriate? Downtime: fraction of time that a path exists between endpoints; fraction of time that endpoints can communicate on any path. Transfer time: how long must I wait to get content? (Perhaps this makes more sense in delay-tolerant networks, bittorrent-style protocols, etc.) Some applications depend more on availability of content than on uptime/availability of any particular Internet path or host.