Conceiving “Availability”

It seems like the basic objective
“All” a network does is make stuff available.
– We view with suspicion networks that transform what they transport.
– We are OK with networks that store it; that just makes it more available.
But our field seems to lack both a theory of availability and a general definition of it.

In the early days…
Pragmatically, we knew things failed.
Links and routers would fail.
– So we did dynamic routing.
Routers would drop packets.
– So we did retransmission.
We addressed specific aspects of availability.
– We had no general theory.

We can generalize this
Assume that a network without failures will be available.
– This begs a definition of “failure”.
– It includes deliberate adverse intervention.
To deal with failures, we:
– Must detect them.
– Must localize them to a region or component.
– Must reconfigure so as not to depend on that region or component.
– Must trigger repair of the failed device.
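
The four steps above can be sketched on a toy topology. This is only an illustration under loud assumptions: the graph, the `healthy` table, and the names `shortest_path` and `handle_failure` are all hypothetical, and localization is trivial here because the code can read node state directly, which a real network cannot do.

```python
from collections import deque

def shortest_path(links, src, dst, avoid=frozenset()):
    """Breadth-first search for a path from src to dst that skips nodes in `avoid`."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in sorted(links[path[-1]]):
            if nxt not in seen and nxt not in avoid:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no usable path

def handle_failure(links, healthy, src, dst, repair_queue):
    """Return a working path from src to dst, or None if there is none."""
    path = shortest_path(links, src, dst)
    if path is None:
        return None
    # 1. Detect: the path works only if every node on it is healthy.
    broken = [node for node in path if not healthy[node]]
    if not broken:
        return path
    # 2. Localize: pick the first failed node (an assumption of this toy model;
    #    a real network has to infer this from probes).
    culprit = broken[0]
    # 3. Reconfigure: route around the failed component.
    new_path = shortest_path(links, src, dst, avoid={culprit})
    # 4. Trigger repair: tell someone who can fix it.
    repair_queue.append(culprit)
    return new_path

# Hypothetical topology: two disjoint two-hop routes from A to D.
links = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A", "D"}, "D": {"B", "C"}}
healthy = {"A": True, "B": False, "C": True, "D": True}
repairs = []
print(handle_failure(links, healthy, "A", "D", repairs), repairs)
```

The sketch handles one failed node per call; a path with several failed nodes would need the steps repeated.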

In practice
We do OK with simple failures.
– Fail-stop: routers talk to each other; if they stop talking, we assume they have failed.
– Localization is implied.
We do much less well with more general or Byzantine failures.
– Devices that succeed at the “I’m OK” protocol but don’t actually perform their function.
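
The gap between the two failure classes can be made concrete with a small model. The `Router` class and both check functions below are hypothetical names for illustration, not any real routing protocol's API: a liveness check catches a fail-stop router, but a router that still answers hellos while dropping traffic is only caught by a probe of the function itself.

```python
class Router:
    """Toy router with separate control-plane and data-plane behaviour."""

    def __init__(self, name, answers_hello=True, forwards=True):
        self.name = name
        self.answers_hello = answers_hello
        self.forwards = forwards

    def hello(self):
        # The "I'm OK" protocol: says nothing about whether packets get through.
        return self.answers_hello

    def forward(self, packet):
        # Returns the packet if forwarded, None if silently dropped.
        return packet if self.forwards else None

def liveness_check(router):
    """What a neighbour sees: did the router answer the hello?"""
    return router.hello()

def function_check(router, probe=b"probe"):
    """What an endpoint sees: did the data actually arrive intact?"""
    return router.forward(probe) == probe

fail_stop = Router("r1", answers_hello=False, forwards=False)
gray = Router("r2", answers_hello=True, forwards=False)

for r in (fail_stop, gray):
    print(r.name, "liveness:", liveness_check(r), "function:", function_check(r))
```

Only `r1` fails the liveness check; `r2` passes it while failing the function check.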

Example: email
If a mail relay agent fails to receive mail, the sender goes back to DNS and finds another agent.
However:
– If the agent receives mail but does not forward it, the failure is not detected.
– There is no way to tell the operator of the agent that it has failed.
We just wait until a human figures it out.
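
A minimal simulation of this behaviour, assuming hypothetical relay hosts and a made-up `Relay` class rather than a real SMTP client: the sender walks the MX list in preference order (lower number first) and stops at the first relay that accepts the message, so a relay that accepts and then silently discards looks like success.

```python
class Relay:
    """Toy mail relay: may refuse connections, or accept and then drop."""

    def __init__(self, up=True, forwards=True):
        self.up = up
        self.forwards = forwards

    def accept(self, message, mailbox):
        if not self.up:
            return False          # visible failure: sender can try another relay
        if self.forwards:
            mailbox.append(message)
        return True               # accepted, whether or not it is ever delivered

def send_mail(mx_records, relays, message, mailbox):
    """Try relays in MX preference order; return the host that accepted, or None."""
    for _preference, host in sorted(mx_records):
        if relays[host].accept(message, mailbox):
            return host
    return None

# Hypothetical MX records: (preference, host).
mx = [(20, "backup.example"), (10, "primary.example")]

# Case 1: primary is down, so the sender falls back to the backup.
mailbox = []
relays = {"primary.example": Relay(up=False), "backup.example": Relay()}
print(send_mail(mx, relays, "hello", mailbox), mailbox)

# Case 2: primary accepts but never forwards; the sender sees success.
mailbox = []
relays = {"primary.example": Relay(forwards=False), "backup.example": Relay()}
print(send_mail(mx, relays, "hello", mailbox), mailbox)
```

In the second case the sender reports `primary.example` as having accepted the message while the mailbox stays empty, and nothing in the exchange signals the loss.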

Security makes it worse
Attacks do not normally manifest as simple failures.
– Comcast attacked BitTorrent by injecting TCP resets.
– Redirection of packets due to rerouting.
Encryption can prevent disclosure.
But encryption turns an attack on integrity into a successful attack on availability: tampered data is rejected rather than delivered.
– Users care about availability, which is one reason they “click through” warnings.
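
The integrity-to-availability point can be shown in a few lines. As an assumption for brevity, this sketch uses an HMAC tag from the Python standard library as a stand-in for the integrity protection that an encrypted channel typically provides; the key and the `protect` and `receive` helpers are hypothetical. The receiver detects tampering, but its only safe response is to discard the data.

```python
import hashlib
import hmac

TAG_LEN = 32  # length of an HMAC-SHA256 tag in bytes

def protect(key, payload):
    """Append an integrity tag to the payload."""
    return payload + hmac.new(key, payload, hashlib.sha256).digest()

def receive(key, blob):
    """Return the payload if the tag verifies, otherwise None (data withheld)."""
    payload, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return payload

key = b"shared-secret-for-illustration"
blob = protect(key, b"pay 10 to alice")

# An on-path attacker flips one bit of the first byte.
tampered = bytes([blob[0] ^ 1]) + blob[1:]

print(receive(key, blob))      # the original payload
print(receive(key, tampered))  # None: integrity held, availability did not
```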

Detecting failure
By the end-to-end principle, the only place that can detect failures in general is the endpoints.
– They understand what the correct function is.
– But how can they localize the problem?
– And how can they avoid the affected region?
– And can they tell anyone about the failure?

Is this logic true in general?
Perhaps for some architecture, this problem is mitigated by design.
– Any failure that affects correct delivery can be detected by the network, so no end-node correction is required.
– Or there is a well-defined end-node response to all classes of failure.
Is there an exhaustive classification of failure modes?
– Encryption may help. It does not prevent failures, but it reduces the kinds of failures.

Quantification
Are there aspects of availability that are amenable to quantification?
– Can we talk in a meaningful way about a system that is “more” available?
Are any such measures useful for comparing different architectural approaches with respect to availability?

Metrics of availability
Is there enough redundancy to allow reconfiguration?
– Cut-sets as a metric.
– But how does this apply if the definition of success is access to a service or to content?
Outage: the opposite of available.
– How much went down for how long?
– To what part of the Internet?
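
One way to read “cut-sets as a metric” is the smallest number of links whose removal disconnects two nodes. The sketch below computes that number with a unit-capacity max-flow (by the max-flow/min-cut theorem the two values are equal). The function name `min_link_cut` and the example topology are assumptions for illustration, and the slide's caveat still applies: this measures connectivity between nodes, not access to a service or to content.

```python
from collections import deque

def min_link_cut(edges, src, dst):
    """Smallest number of links whose removal disconnects src from dst."""
    if src == dst:
        raise ValueError("src and dst must differ")
    cap, adj = {}, {}
    for u, v in edges:
        # Each undirected link carries one unit of flow in either direction.
        cap[(u, v)] = cap.get((u, v), 0) + 1
        cap[(v, u)] = cap.get((v, u), 0) + 1
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if src not in adj or dst not in adj:
        return 0  # an isolated or unknown node is already disconnected
    flow = 0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {src: None}
        queue = deque([src])
        while queue and dst not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if dst not in parent:
            return flow  # no augmenting path left: flow equals the min cut
        # Push one unit of flow back along the path found.
        v = dst
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

# Hypothetical topology: a square A-B-D-C-A with a single spur link D-E.
edges = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D"), ("D", "E")]
print(min_link_cut(edges, "A", "D"))  # 2: two link-disjoint routes
print(min_link_cut(edges, "A", "E"))  # 1: the spur D-E is a single point of failure
```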

Definitions of availability
How do regulators define availability?
– Current discussion at the FCC, etc.
– Builds (perhaps improperly) on definitions from the phone era.
– Is the network “available” if your access ISP is working?
How do availability and censorship relate?
– How much of the Internet must be reachable for it to be “available”?

What can architecture do?
Can architectural features improve the components of availability?
– Detect and localize faults at the network layer.
– Provide means to reconfigure. These must not become a new attack vector.
– Allow reporting of failed components. To which part of the ecosystem?
Is there a relation between architecture and redundancy?

At every level…
Design at every level must build in detection, localization, reconfiguration, and recovery.
A higher level may be designed to recover from unrecovered failures at lower layers.
– Applications can be available even in the face of some lack of lower-level availability.
At what layer should availability be evaluated?
– If you can call 911, is the phone system available?
