Complexity and Degrees of Freedom in Network Design Michael Sinatra University of California, Berkeley 17 July 2007 Internet2/ESCC Joint Techs
Enhanced Gratuitous Logo Slide (EGLS)
Inspirations Terry Gray, Scott Sagan, Charles Perrow, Todd LaPorte, Martin Landau Poorly-designed networks and network disruption devices Greg Bell, Greg Travis and everyone who sent me interesting examples after the 5-9s talk
Redundancy in Systems Single points of “failure” Probabilistic analysis of redundancy –Redundant components can reduce the chances of failure –Pairing a component that has a 10% failure probability with a redundant component that also has a 10% failure probability yields a 1% system-failure probability (0.1 × 0.1 = 0.01) –But there’s a BIG assumption here!
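The arithmetic on this slide can be sketched as follows. This is a minimal illustration, not something from the talk: the function name `parallel_failure_probability` is hypothetical, and the calculation assumes the component failures are statistically independent.

```python
def parallel_failure_probability(component_failure_probs):
    """Probability that ALL redundant components fail at once.

    Assumes the failures are statistically independent, so the
    individual failure probabilities can simply be multiplied.
    """
    result = 1.0
    for p in component_failure_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie in [0, 1]")
        result *= p
    return result


if __name__ == "__main__":
    # Two components, each with a 10% failure probability (the slide's example).
    system_failure = parallel_failure_probability([0.10, 0.10])
    print(f"System-failure probability: {system_failure:.2%}")
```

The product is only valid while the independence assumption holds, which is the question the following slides raise.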
Common-mode failures Components must be fully redundant! Are they? Classic example: aircraft engines Can you think of some networking examples?
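One way to see why common-mode failures matter is a rough sketch in which the redundant components share a single dependency (for example, one switch or one power feed). Everything here is an illustrative assumption rather than part of the talk: the function name, the model (the system fails if the shared element fails, or if every redundant component fails independently), and the 5% figure for the shared element.

```python
def failure_with_shared_dependency(p_component, n_components, p_shared):
    """System-failure probability when redundant components share one dependency.

    Hypothetical model: the system fails if the shared dependency fails,
    OR if the dependency survives and all n redundant components fail
    independently of one another.
    """
    for p in (p_component, p_shared):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    all_redundant_fail = p_component ** n_components
    return p_shared + (1.0 - p_shared) * all_redundant_fail


if __name__ == "__main__":
    # Two 10%-failure components; the 5% shared-dependency figure is assumed.
    fully_independent = failure_with_shared_dependency(0.10, 2, 0.0)
    with_common_mode = failure_with_shared_dependency(0.10, 2, 0.05)
    print(f"No shared dependency:   {fully_independent:.2%}")
    print(f"With shared dependency: {with_common_mode:.2%}")
```

Under this model the system can never be more reliable than the shared element, no matter how many redundant components are added.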
Common-mode failures - example [Diagram; labels: FW, Switch, Outside, Inside]
“Difficult” Failures You must be this tall to really break the network.
The Jordan Baker Phenomenon Nick: You're a rotten driver. Either you ought to be more careful or you oughtn't drive at all. Jordan: I am careful. Nick: No, you're not. Jordan: Well, other people are. Nick: What's that got to do with it? Jordan: They'll keep out of my way. It takes two to make an accident. Nick: Suppose you met somebody just as careless as yourself? Jordan: I hope I never will. I hate careless people. That's why I like you.
The Jordan Baker Phenomenon The problem is, there are too many careless devices on the network! [Diagram; labels: Client, Firewall, Net, LB, Server]
Common-mode failures - example [Diagram; labels: Switch/FW, Hosts, Router, To border]
High-reliability organizations Demanded by high-reliability systems Organizational redundancy Change management Multiple approval/sign-off
High-reliability organizations Organizations can be made redundant in the same way as systems… …with many of the same problems –Common-mode failures –Non-linear complexity –And more…
Social shirking/buck passing Not really an analogous concept in physical systems Change-management difficulties
Overcompensation Has to do with the way physical systems are designed and operated Does anycast DNS encourage bad behavior?
Conclusions How do we deal with all of this? –Points of failure? Really points of freedom (and that’s a bad thing) –We need to reduce degrees of freedom in networks, not necessarily increase redundancy! –Networks need to get simpler, not more complex!
Conclusions Risks exist where we may not expect them –Five-nines mentality –Virtualization –Network disruption devices: duh! –Security Maybe we shouldn’t assume that the system can be made fully reliable (Travis)
Conclusions Need to recognize trade-offs: In complex systems, “win-win scenarios” are very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very RARE!