1 Would Diversity Really Increase the Robustness of the Routing Infrastructure against Software Defects? February 2008 Juan Caballero, Theocharis Kampouris.

1 Would Diversity Really Increase the Robustness of the Routing Infrastructure against Software Defects?
February 2008
Juan Caballero, Theocharis Kampouris (Carnegie Mellon)
Dawn Song (Carnegie Mellon & UC Berkeley)
Jia Wang (AT&T Labs)
The answer is: Yes

2 Software defects in routers
Defects in router software are not uncommon
Multiple vulnerabilities in routers uncovered:
  – DoS: maliciously crafted packets cause reload [CERT5]
  – DoS: maliciously crafted packets cause excessive resource consumption [CERT2, CERT4]
  – Remote execution of system-level commands [CERT3]
  – Unauthorized privileged access [CERT1]
  – Possible remote shell execution [CERT7]

3 Simultaneous router failure
Routing infrastructure is highly homogeneous
What if a software defect makes it possible to simultaneously take down many routers?
  – Worst-case scenario. Rare.
  – But huge impact: highly damaging to an ISP’s reputation
Diversity
  – Multiple implementations from different code bases
  – Reduces the number of nodes affected by a bug [Zhang01, Junqueira05, O’Donnell04]
But how well would it work on routers?

4 Scope
We focus on the effect on network connectivity
  – Impact on higher layers left as future work
  – Includes: routing convergence, packet loss, delay…
Why?
  – Because no connectivity means no communication
What about fundamental limitations of diversity?
  – Vulnerabilities that are shared among vendors
    » General problem with no good solution
  – Deployment cost
    » Depends on how much diversity is already available

5 Statement
This paper does not claim:
  – that diversity can protect against all software defects
  – that we should redesign all networks to accommodate diversity
Rather, we show:
  – that diversity greatly helps with simultaneous router failures
  – that networks might already have a surprising amount of diversity
But it is not used to increase robustness!

6 Contributions
Answering four fundamental questions:
1. How do we measure the robustness of a network against simultaneous router failures?
2. How do we best use the diversity?
3. How much diversity is needed to guarantee a certain degree of robustness?
4. Is there enough diversity already in the network, or do we need to introduce more?

7 Problem definition
Graph-theoretic approach: G = (V, E)
  – Nodes are routers (V), edges are links (E)
A version of the graph coloring problem where:
  – Colors represent implementations
  – A failure is a color removal
  – Different from the well-known optimal coloring problem
Network robustness = resilience to simultaneous router failure
  – How connected is the network when multiple nodes fail?
The goal is to assign a color to each router from a set of k available colors such that the network robustness (Φ) is maximized
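As a minimal sketch of the model above (not code from the talk), the snippet below stores the network as an adjacency map and treats a failure as the removal of every router that uses a given color. The names `remove_color`, `adj` and `coloring`, and the four-router toy topology, are assumptions made for illustration.

```python
def remove_color(adj, coloring, failed):
    """Return the subgraph left after every router using `failed` goes down."""
    alive = {v for v in adj if coloring[v] != failed}
    # Keep only surviving nodes and the links between them.
    return {v: {u for u in adj[v] if u in alive} for v in alive}


# Hypothetical 4-router chain a - b - c - d with k = 2 colors.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
coloring = {"a": "red", "b": "red", "c": "blue", "d": "blue"}

# A defect in the "red" implementation takes down routers a and b.
survivors = remove_color(adj, coloring, "red")
print(survivors)
```

Any connectivity metric can then be evaluated on the returned subgraph.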

8 Determining the best coloring
[Figure: four colorings of the Abilene network with 2 colors (k = 2), with Φ = 0.18, Φ = 0.42, Φ = 0.05 and Φ = 0.23]
We want to automatically select the best coloring

9 Outline
Introduction
Metrics
  – Connectivity
  – Robustness
Algorithms
Evaluation

10 Metrics
Need metrics to quantify the robustness of the colored graph, i.e., its resilience to failure
We need two types of metrics:
1. Connectivity metrics: given a graph, determine how connected it is
  – Many graph connectivity metrics already proposed
  – We select some existing ones
2. Robustness metrics: given a colored graph, determine how robust it is
  – We propose new ones
  – The robustness metrics are a function of the connectivity metrics

12 Connectivity metrics: NSLC
Given a graph, determine how connected it is
Normalized size of largest component (NSLC) [Albert00]
[Figure: graph A, 1 component, NSLC = 1; graph B, 2 components, NSLC = 0.66]
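A hedged sketch of NSLC, assuming it is the number of nodes in the largest connected component divided by the total number of nodes. The helper names and the two three-node toy graphs are illustrative assumptions, not the graphs from the slide.

```python
def components(adj):
    """Return the connected components of an undirected graph as sets of nodes."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:  # iterative depth-first search
            v = stack.pop()
            comp.add(v)
            for u in adj[v]:
                if u not in seen:
                    seen.add(u)
                    stack.append(u)
        comps.append(comp)
    return comps


def nslc(adj):
    """Normalized size of the largest component, in [0, 1]."""
    if not adj:
        return 0.0
    return max(len(c) for c in components(adj)) / len(adj)


# Toy graph A: three routers in a triangle (one component).
graph_a = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
# Toy graph B: one link plus an isolated router (two components).
graph_b = {1: {2}, 2: {1}, 3: set()}

print(nslc(graph_a), nslc(graph_b))
```

On these toy graphs the metric is 1 for the connected graph and 2/3 for the split one.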

13 Connectivity metrics: PC
Pair Connectivity (PC) [Park03]
[Figure: graph A, 1 component, PC = 1; graph B, 2 components, PC = 0.33]
We have versions of the metrics that support node weights
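A hedged sketch of pair connectivity, assuming PC is the fraction of node pairs that are still joined by some path. It covers the unweighted case only, and the function name and toy inputs are assumptions for illustration.

```python
def pair_connectivity(nodes, edges):
    """Fraction of unordered node pairs that lie in the same component."""
    parent = {v: v for v in nodes}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)

    sizes = {}
    for v in parent:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1

    n = len(parent)
    if n < 2:
        return 0.0
    # A component with s nodes contributes s * (s - 1) / 2 connected pairs.
    connected = sum(s * (s - 1) // 2 for s in sizes.values())
    return connected / (n * (n - 1) // 2)


pc_a = pair_connectivity([1, 2, 3], [(1, 2), (2, 3)])  # one component
pc_b = pair_connectivity([1, 2, 3], [(1, 2)])          # two components
print(pc_a, pc_b)
```

With three nodes split into components of sizes 2 and 1, only one of the three pairs stays connected, so PC drops to 1/3 while NSLC on the same graph is 2/3: PC penalizes fragmentation more heavily.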

15 Robustness metrics
Robustness of a colored graph measures the remaining connectivity when a color is removed
  – Removing a color => disconnecting all nodes using that color
Robustness is a function of the connectivity metric f applied over the different color-removal subgraphs
The probability of failure of each color is unknown
Two metrics: average and minimum (worst case)

16 Minimum and average robustness
Average robustness good, minimum robustness bad
Average robustness can be misleading by itself
Robustness metric (Φ)        G2
Average robustness (NSLC)    0.5
Minimum robustness (NSLC)    0.18
[Figure: graph G2 and its two color-removal subgraphs: G2 red, NSLC = 0.18; G2 blue, NSLC = 0.82]
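The two robustness metrics can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the connectivity metric is NSLC normalized by the size of the original graph, every color is weighted equally in the average, and the six-router chain and both colorings are made up.

```python
def largest_component(adj):
    """Number of nodes in the largest connected component."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            v = stack.pop()
            size += 1
            for u in adj[v]:
                if u not in seen:
                    seen.add(u)
                    stack.append(u)
        best = max(best, size)
    return best


def robustness(adj, coloring):
    """Return (average, minimum) NSLC over all single-color removals."""
    n = len(adj)
    scores = []
    for color in set(coloring.values()):
        alive = {v for v in adj if coloring[v] != color}
        sub = {v: adj[v] & alive for v in alive}
        scores.append(largest_component(sub) / n)  # normalized by original size
    return sum(scores) / len(scores), min(scores)


# Hypothetical chain of six routers: 0 - 1 - 2 - 3 - 4 - 5.
path = {i: {j for j in (i - 1, i + 1) if 0 <= j < 6} for i in range(6)}
balanced = {0: "red", 1: "red", 2: "red", 3: "blue", 4: "blue", 5: "blue"}
skewed = {0: "red", 1: "blue", 2: "blue", 3: "blue", 4: "blue", 5: "blue"}

avg_b, min_b = robustness(path, balanced)
avg_s, min_s = robustness(path, skewed)
print(avg_b, min_b, avg_s, min_s)
```

Both colorings have an average robustness of 0.5, but the skewed one has a minimum of only 1/6, which is why the average alone can be misleading.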

17 Outline
Introduction
Metrics
Algorithms
Evaluation

18 Algorithms
We have devised a total of 9 algorithms, which can be classified into 4 families
Only the Region coloring algorithms are presented in the paper
  – The rest are in the extended version [ColoringTR]
Region coloring algorithms outperform the others in the evaluation

19 Region coloring algorithms
Divide the network into contiguous regions
Regions are found automatically
Includes 2 algorithms: Cluster & Partition
  – Algorithms accept the number of regions (k) as input
Graph partitioning algorithms try to balance the number of nodes in each partition (i.e., region)
[Figure: network divided into Region 1 and Region 2]
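To make the idea of contiguous, roughly balanced regions concrete, here is a simple region-growing sketch. It is a stand-in chosen for brevity and is not the Cluster or Partition algorithm from the talk, which rely on published clustering and partitioning methods. The function name, the seed nodes and the toy chain are assumptions.

```python
from collections import deque


def grow_regions(adj, seeds):
    """Grow one region per seed; each region claims one adjacent node per round."""
    region = {s: i for i, s in enumerate(seeds)}
    frontiers = [deque([s]) for s in seeds]
    while any(frontiers):
        for i, frontier in enumerate(frontiers):
            while frontier:
                v = frontier[0]
                free = [u for u in sorted(adj[v]) if u not in region]
                if not free:
                    frontier.popleft()  # v has no unclaimed neighbours left
                    continue
                region[free[0]] = i     # claim exactly one node, then yield the turn
                frontier.append(free[0])
                break
    return region


# Hypothetical chain of six routers, seeded at both ends (k = 2 regions).
path = {i: {j for j in (i - 1, i + 1) if 0 <= j < 6} for i in range(6)}
region = grow_regions(path, seeds=[0, 5])
print(region)
```

Because a node is only claimed when it neighbors a node already in the region, every region is contiguous by construction; taking turns keeps the sizes close but does not guarantee perfect balance.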

20 Results overview
There is usually a trade-off between perfectly balanced partitions and contiguous partitions
Results will show that:
1. Balanced regions are better
2. Slightly imbalanced but contiguous partitions are better than perfectly balanced but discontiguous partitions
[Figure: a good partition (Region 1, Region 2) and a bad partition (Region 1, Region 2, Region 1)]

21 Roles and replicated nodes
Roles:
  – Not all routers can use all implementations
  – Two roles: Access / Backbone
  – One color-set for each role
  – Nodes have roles and can only use implementations from the color-set of their role
Replicated nodes:
  – ISPs usually replicate important nodes
    » Increases resilience against single-node failures
    » Load sharing
  – In real networks, replicas are colored identically
  – For robustness, replicas need to be colored differently

22 Extended Partition algorithm
1. Color all backbone routers
  – Create the backbone graph by removing all access routers
  – First color replicas with different colors
  – Then color the rest using the Partition algorithm
2. Color the access routers
  – Create the access graph by collapsing all backbone nodes into a single node
  – Two cases, depending on the independence of access / backbone implementations
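The two graph constructions used by these steps can be sketched as below. Only the graph preparation is shown, not the coloring itself. The function names, the `role` map, the `hub` label and the toy topology are assumptions for illustration.

```python
def backbone_graph(adj, role):
    """Step 1 input: drop all access routers and the links that touch them."""
    keep = {v for v in adj if role[v] == "backbone"}
    return {v: adj[v] & keep for v in keep}


def access_graph(adj, role, hub="BACKBONE"):
    """Step 2 input: collapse every backbone router into one node named `hub`."""
    def name(v):
        return hub if role[v] == "backbone" else v

    out = {hub: set()}
    for v, neighbours in adj.items():
        out.setdefault(name(v), set())
        for u in neighbours:
            if name(u) != name(v):  # skip links internal to the backbone
                out[name(v)].add(name(u))
    return out


# Hypothetical topology: two backbone routers, each serving one access router.
adj = {"b1": {"b2", "a1"}, "b2": {"b1", "a2"}, "a1": {"b1"}, "a2": {"b2"}}
role = {"b1": "backbone", "b2": "backbone", "a1": "access", "a2": "access"}

bb = backbone_graph(adj, role)
acc = access_graph(adj, role)
print(bb, acc)
```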

23 Outline
Introduction
Metrics
Algorithms
Evaluation

24 Evaluation setup
Source      Topology         Date       Nodes          Edges
Real        Tier-1 ISP       Oct. 2006  A few hundred  A couple thousand
Real        Cenic            Aug. 2006  51             91
Real        Abilene          Sep. 2006  12             15
Rocketfuel  Exodus           Jan. 2002  201            434
Rocketfuel  Sprint           Jan. 2002  604            2268
Rocketfuel  Verio            Jan. 2002  960            2821
Synthetic   Fully connected  N/A        100            4950
Metrics and algorithms implemented using the JUNG graph library [JUNG]
Graph clustering algorithm from Wu et al. [Wu04]
Graph partition algorithm from Karypis et al. [Karypis00]

25 Coloring algorithms: setup
Same topology (Tier-1 ISP) colored using different algorithms
Random as “lower bound”
Max as “upper bound”
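A sketch of what a random baseline could look like, assuming that "Random" means each router independently receives a color chosen uniformly at random. The talk does not spell this out, so treat both the interpretation and the names below as assumptions.

```python
import random


def random_coloring(nodes, colors, seed=0):
    """Assign each node a uniformly random color; seeded for reproducibility."""
    rng = random.Random(seed)
    return {v: rng.choice(colors) for v in nodes}


# Hypothetical 12-router network colored with k = 2 implementations.
coloring = random_coloring(range(12), ["red", "blue"], seed=1)
print(coloring)
```

Such a coloring ignores topology entirely, which is why it serves as a reference point that a good algorithm should beat.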

26 Coloring algorithms: results
Partition/Cluster best on average
  – Region coloring minimizes impact
Partition best in the worst case
  – More balanced coloring than Cluster
Partition performs close to Max in both the average and worst cases
Non-contiguous partitions are bad (dip at k = 5)

27 Redistributing the existing diversity
Tier-1 ISP contains 8 implementations (2 backbone, 6 access)
  – Due to: legacy routers, vendor change, budget constraints
Two implementations are used by 90% of the nodes
What happens if we redistribute the same diversity using our algorithms?
Metric   Original coloring  Extended Partition
Average  0.713              0.855
Minimum  0.055              0.760
Number of nodes in the largest component goes from 5% to 76%
Requires:
1. Changing the number of nodes that use each implementation
2. Changing the geographical distribution of the implementations

28 Minimal diversity for decent robustness
Two colors are enough for the backbone
  – Most backbone routers are replicated
Decent robustness starts with 3 colors for access routers
More than 5 colors for access routers does not buy much

29 Related work
Diversity as a solution against software defects
  – Diversity in all network layers [Zhang01]
  – Diversity in distributed systems [Junqueira05]
  – Diversity to slow malware propagation [O’Donnell04]
Analysis of Internet robustness [Albert00, Faloutsos99, Li04, Magoni03, Palmer01, Park03, Tangmunarunkit02, Zegura97]
Analysis of failures in networks [Markopoulou04, NIST02]
Router-level topologies [Spring02]
Node importance metrics [Freeman77, Lorrain71, Newman02, Tauro01]
Clustering and partitioning [Karypis00, Wu04, etc.]

30 Conclusions
1. How do we measure the robustness of a network against simultaneous router failures?
  – Proposed robustness metrics
2. How do we best use the diversity?
  – Proposed coloring algorithms that achieve robustness close to that of a fully connected network
3. How much diversity is needed to guarantee a certain degree of robustness?
  – Not much: 2 backbone + 3 access for the Tier-1 ISP
4. Is there enough diversity already in the network, or do we need to introduce more?
  – Amount of diversity is surprisingly high
  – Redistributing the diversity can increase the number of nodes surviving a failure from 5% to 76%

31 Questions?
