When Is Agreement Possible

When Is Agreement Possible? CS 188 Distributed Systems February 24, 2015

Introduction
Basics of agreement protocols
Impossibility of agreement in an asynchronous system with failures
When is agreement possible?

Basics of Agreement Protocols
What is agreement?
What are the necessary conditions for agreement?

What Do We Mean By Agreement?
In the simplest case, can n processors agree that a variable takes on the value 0 or 1?
Only the non-faulty processors need to agree.
More complex agreements can be built from this simple agreement.

Conditions for Agreement Protocols
Consistency: All participants agree on the same value, and decisions are final.
Validity: Participants agree on a value that at least one of them wanted.
Termination: All participants choose a value in a finite number of steps.
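
As a rough illustration, the three conditions can be written as checks over the outcome of a single finished run. The function name `check_agreement` and the dictionary representation are assumptions made for this sketch, not something the lecture defines. It does not model the "decisions are final" part of consistency, which would need the full history of a run.

```python
from typing import Dict, Optional


def check_agreement(inputs: Dict[str, int],
                    decisions: Dict[str, Optional[int]]) -> Dict[str, bool]:
    """Check one finished run against the three conditions.

    inputs:    initial value (0 or 1) of every processor.
    decisions: final decision of each non-faulty processor,
               or None if that processor never decided.
    """
    decided = [v for v in decisions.values() if v is not None]
    return {
        # Consistency: no two non-faulty processors decided differently.
        "consistency": len(set(decided)) <= 1,
        # Validity: every decided value was some processor's input.
        "validity": all(v in inputs.values() for v in decided),
        # Termination: every non-faulty processor reached a decision.
        "termination": all(v is not None for v in decisions.values()),
    }
```

Here a processor that failed is simply left out of `decisions`, since only non-faulty processors need to agree.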

Impossibility of Agreement in Async System With Failures
Assume a reliable, but asynchronous, message-passing system.
Any message may face arbitrary delays.
Can a set of processors reach agreement if one of the processors fails?

Agreement Isn’t Always Possible
Not in the general case, for arbitrary systems.
Adding some special properties to the system may change that result.
But without those properties, agreement is provably impossible.
This result is sometimes abbreviated FLP, for Fischer, Lynch, and Paterson, who proved it.

Model of the System
The system consists of n processors.
The goal is for all non-faulty processors to agree on the value 0 or 1.
Rule out the trivial case of always agreeing on 0 (or 1).
Agreement depends on the protocol, the initial state, and the inputs to each processor.

Bivalent and Univalent States
A bivalent state is a system state that could lead to either value being decided.
A univalent state can lead to only one of the values being decided; it is either 0-valent or 1-valent.
Valency must take allowable failures into account!
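
One way to make valency concrete is to compute it over a small, hand-built tree of configurations. This is only a toy sketch: the representation (an int for a decided state, a tuple for the possible successor configurations) and the names `reachable_decisions` and `valency` are assumptions for illustration. To respect the last point above, the successors listed for a configuration should include those reached through allowable failures.

```python
def reachable_decisions(config):
    """Return the set of decision values reachable from `config`.

    A configuration is modelled either as an int (a state in which that
    value has been decided) or as a tuple of successor configurations.
    """
    if isinstance(config, int):
        return {config}
    reachable = set()
    for successor in config:
        reachable |= reachable_decisions(successor)
    return reachable


def valency(config):
    """Classify a configuration by the decisions reachable from it."""
    values = reachable_decisions(config)
    if values == {0, 1}:
        return "bivalent"
    if values == {0}:
        return "0-valent"
    if values == {1}:
        return "1-valent"
    raise ValueError("no decision is reachable from this configuration")
```

For example, `valency(((0, 0), (0, 1)))` is bivalent, because one branch can still end in a decision of 1.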

System Configuration
Processors have internal state.
The state of the network is the set of messages sent but not yet received.
An event e is the receipt of a message m by a processor, which can lead to that processor sending one or more new messages.
Events are deterministic.
A schedule is a sequence of events.
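
The pieces of this model can be sketched as plain data plus one function that applies an event. Everything named here (`deliver`, `run_schedule`, `toy_transition`, the tuple layout of a configuration) is a hypothetical choice for illustration; the lecture only fixes the concepts.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, int]                      # (destination, payload)
Config = Tuple[Dict[str, int], List[Message]]  # (processor states, in-flight messages)
Transition = Callable[[str, int, int], Tuple[int, List[Message]]]


def deliver(config: Config, message: Message, transition: Transition) -> Config:
    """Apply one event: `message` is received by its destination processor."""
    states, network = config
    if message not in network:
        raise ValueError("message is not in flight")
    dest, payload = message
    # Deterministic step: new local state plus any newly sent messages.
    new_state, outgoing = transition(dest, states[dest], payload)
    new_states = dict(states)
    new_states[dest] = new_state
    new_network = list(network)
    new_network.remove(message)
    new_network.extend(outgoing)
    return new_states, new_network


def run_schedule(config: Config, schedule: List[Message],
                 transition: Transition) -> Config:
    """A schedule is a sequence of events, applied in order."""
    for message in schedule:
        config = deliver(config, message, transition)
    return config


def toy_transition(name: str, state: int, payload: int) -> Tuple[int, List[Message]]:
    """Made-up rule: add the payload to the state; p1 forwards it to p2."""
    outgoing = [("p2", payload)] if name == "p1" else []
    return state + payload, outgoing
```

Starting from `({"p1": 0, "p2": 0}, [("p1", 5)])`, the schedule `[("p1", 5), ("p2", 5)]` first delivers to p1, which forwards the payload, and then delivers the forwarded message to p2.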

Proving the Result
Let’s assume the result is false: that we can reach agreement with one failure in these conditions.
Use an adversarial model: within the rules of behavior, assume the adversary can force any legal event.
Look for contradictions.

What Can the Adversary Do?
Force any processor to perform an event at any moment.
Choose any message to be delivered to any processor when it requests a message.
Delay any message arbitrarily long.
Once, it can kill one processor permanently.

The Necessity of Bivalency
There has to be an initial bivalent configuration for the system.
Why?
If all processors started with value 1, the system would decide 1, by validity.
If all processors started with value 0, the system would decide 0, by validity.

Intermediate Initial States
What if some processors start with value 0 and some with value 1?
Suppose there were no bivalent initial state. Then some initial states lead to result 1, some lead to result 0, and all initial states lead to one or the other.
So there is a 1-valent initial state that differs from a 0-valent initial state by one processor’s initial value.
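
The existence of such a pair follows from walking from the all-0 initial state to the all-1 initial state, flipping one processor’s input at a time: the outcome starts at 0 and ends at 1, so it must change at some single flip. A minimal sketch, assuming (for the sake of the argument) that every initial state is univalent so that an `outcome` function is well defined; `adjacent_pair` and `majority` are hypothetical names, and majority voting is just a stand-in protocol.

```python
from typing import Callable, Tuple


def adjacent_pair(n: int, outcome: Callable[[Tuple[int, ...]], int]):
    """Find two initial states that differ in one input but decide differently.

    Walks from (0, ..., 0) to (1, ..., 1), flipping input i at step i.
    Returns (state_x, state_y, index_of_differing_processor).
    """
    state = [0] * n
    for i in range(n):
        flipped = list(state)
        flipped[i] = 1
        if outcome(tuple(state)) != outcome(tuple(flipped)):
            return tuple(state), tuple(flipped), i
        state = flipped
    raise ValueError("outcome never changes, which validity rules out")


def majority(inputs: Tuple[int, ...]) -> int:
    """Stand-in protocol outcome: decide 1 only on a strict majority of 1s."""
    return 1 if 2 * sum(inputs) > len(inputs) else 0
```

With five processors and `majority`, the walk stops when the third input is flipped: `(1, 1, 0, 0, 0)` decides 0 and `(1, 1, 1, 0, 0)` decides 1.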

A Graphical Representation
What’s in these states?
State x (a 0-valent initial state): Node 1: 0, Node 2: 1, Node 3: 1, ..., Node N: 0
State y (a 1-valent initial state): Node 1: 0, Node 2: 1, Node 3: 1, ..., Node N: 1
They differ in only one value.

Why Does This Imply Bivalence?
What if that one differing processor is the processor that fails?
The system must still reach agreement from the remaining states, which are now identical.
But on what value?

Is This Possible?
State x (a 0-valent initial state): Node 1: 0, Node 2: 1, Node 3: 1, ..., Node N: 0
State y (a 1-valent initial state): Node 1: 0, Node 2: 1, Node 3: 1, ..., Node N: 1
Does the system decide on 1? Then State x wasn’t 0-valent, after all.
Does the system decide on 0? Then State y wasn’t 1-valent, after all.
Looks like at least one of x and y must be bivalent.

So What?
So there has to be at least one bivalent initial state.
Why’s that so bad?
If the system never leaves a bivalent state, it never makes a decision.
For the protocol to work, we must show that our adversary can’t perpetually force bivalency.

The Persistence of Bivalency
Let’s assume bivalency doesn’t persist.
At some point, some bivalent state must transition to a univalent state.
That implies at least two events from that state: one leading to a 0-valent state and one leading to a 1-valent state, with no events leading to bivalent states.

A Graphical Representation
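[Figure: from configuration C, event e leads to D and event e’ leads to D’.]
Remember, these events are each the delivery of a message (e delivers m, e’ delivers m’).
So m and m’ must have been in the message delivery system state simultaneously.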

Looking Closely at Events e and e’
What would happen if we executed e first, then e’?
What would happen if we executed them in the opposite order?
Well, why should I care?
Would executing them in either order lead to the same state?
If so, there’s a contradiction.

Order of Events e and e’
[Figure: from C, e leads to D and e’ leads to D’; then e’ is applied to D and e is applied to D’.]

Why Should They Lead to the Same State?
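What if e and e’ occur on different processors?
Then they’re independent events.
So they should produce the same result if executed in either order.
But one order starts with e (leading to a 0-valent state) and the other starts with e’ (leading to a 1-valent state), which is a contradiction.
So e and e’ could not have occurred on different processors.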
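
The independence claim can be checked on a toy model: deliveries to different processors touch disjoint state, so they commute, while two deliveries to the same processor need not. The update rule below is made up and deliberately order-sensitive, and the helper names `deliver`, `run`, and `commute` are assumptions for this sketch.

```python
def deliver(states, network, message):
    """Apply one event: `message` is received by its destination."""
    dest, payload = message
    new_states = dict(states)
    # Hypothetical, deliberately order-sensitive local update rule.
    new_states[dest] = 2 * states[dest] + payload
    remaining = list(network)
    remaining.remove(message)
    return new_states, remaining


def run(states, network, schedule):
    """Apply the events of a schedule in order."""
    for message in schedule:
        states, network = deliver(states, network, message)
    return states, network


def commute(states, e, e_prime):
    """Do e then e' and e' then e lead to the same configuration?"""
    network = [e, e_prime]  # both messages are in flight at once
    return run(states, network, [e, e_prime]) == run(states, network, [e_prime, e])
```

With `states = {"p1": 1, "p2": 1}`, the events `("p1", 3)` and `("p2", 4)` commute, while `("p1", 3)` and `("p1", 4)` do not.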

Could the Events Occur on the Same Processor P?
If e was first, the state became 0-valent.
If e’ was first, the state became 1-valent.
But what if P then fails?
Since the event happened only at P, only P sees the effects.
So we’re still in a bivalent state.

Recapitulating the Argument
It’s possible to start in a bivalent state.
There must be some point at some processor P at which the bivalent state changes to univalent.
If P fails before anyone else knows the valency, the system is still bivalent, and can never settle to univalency.
Perpetual bivalency implies no agreement.

When Is Agreement Possible?
Didn’t we show in the last class that we can reach agreement if fewer than 1/3 of our processors are faulty?
Yes, but only if the message-passing system is synchronous.
Whether agreement is possible in a system depends on certain parameters.
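
As a quick arithmetic aside, "fewer than 1/3 faulty" means f < n/3, which is the same as n >= 3f + 1. The helper name `max_faulty` is an assumption for this sketch.

```python
def max_faulty(n: int) -> int:
    """Largest number of faulty processors f with f < n / 3 (n >= 3f + 1)."""
    if n < 1:
        raise ValueError("need at least one processor")
    return (n - 1) // 3
```

So three processors tolerate no faulty processor under this bound, four tolerate one, and seven tolerate two.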

Parameters for Agreement In Distributed Systems
Synchronous vs. asynchronous processors
Bounded vs. unbounded communications delay
Ordered vs. unordered messages
Point-to-point vs. broadcast communications

Synchronous vs. Asynchronous Processors
Synchronous processors imply that all processors make progress predictably.
More precisely, there is a constant s such that for every s+1 steps taken by any processor Pi, every other processor Pj takes at least one step.
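
The definition can be tested against a recorded trace of steps: Pi never takes s+1 steps without Pj stepping exactly when the longest run of Pi steps with no Pj step in between is at most s. This sketch only checks one finite trace, whereas the definition quantifies over all executions; the names `max_steps_without` and `is_synchronous` are assumptions.

```python
from typing import Iterable, Sequence


def max_steps_without(trace: Sequence[str], pi: str, pj: str) -> int:
    """Most steps pi takes in a row (in `trace`) with no step by pj in between."""
    longest = current = 0
    for who in trace:
        if who == pi:
            current += 1
            longest = max(longest, current)
        elif who == pj:
            current = 0
    return longest


def is_synchronous(trace: Sequence[str], processors: Iterable[str], s: int) -> bool:
    """True if, in this trace, no processor takes s+1 steps while another takes none."""
    procs = list(processors)
    return all(max_steps_without(trace, pi, pj) <= s
               for pi in procs for pj in procs if pi != pj)
```

For the trace `["a", "b", "a", "a", "b"]`, processor a takes at most two steps between steps of b, so the trace satisfies the bound for s = 2 but not for s = 1.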

Bounded vs. Unbounded Communications Delay
Delay is bounded if and only if all messages arrive at their destination within t steps, for some constant t.
This implies no lost messages.
It doesn’t imply that messages arrive in the order sent.
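
A small check makes the last two points concrete: a lost message violates any bound, and a bounded system can still reorder messages. Messages are modelled here as hypothetical `(sent_step, received_step)` pairs, with `None` for a message that never arrives; the function names are assumptions.

```python
from typing import List, Optional, Tuple

Delivery = Tuple[int, Optional[int]]  # (step sent, step received or None if lost)


def is_bounded(messages: List[Delivery], t: int) -> bool:
    """True if every message arrived within t steps of being sent."""
    return all(recv is not None and recv - sent <= t for sent, recv in messages)


def arrives_in_send_order(messages: List[Delivery]) -> bool:
    """True if delivered messages were received in the order they were sent."""
    delivered = sorted(m for m in messages if m[1] is not None)
    recv_times = [recv for _, recv in delivered]
    return recv_times == sorted(recv_times)
```

For example, `[(0, 5), (1, 2)]` is bounded for t = 5, yet the second message overtakes the first.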

Ordered vs. Unordered Messages
Messages are ordered if they are received in the same real-time order as they were sent, using true real time.
In some cases, merely receiving all messages in the same order at all processors is enough.

Point-to-Point vs. Broadcast Communications
Point-to-point communications means a given message sent by Pi is seen only by its destination Pj.
Broadcast communications means that Pi can send a message to all other processors in a single atomic step, most typically by hardware broadcast.

So, When Can We Reach Agreement?
Case 1: Processors are synchronous and communications delay is bounded.
Case 2: Messages are ordered and the transmission medium is broadcast.
Case 3: Processors are synchronous and messages are ordered.
And that’s it.
(Case 1 covers Byzantine agreement.)
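
The three cases amount to a predicate over the four parameters. A minimal sketch, with the function name and boolean encoding chosen for illustration:

```python
from itertools import product


def agreement_possible(sync_processors: bool, bounded_delay: bool,
                       ordered_messages: bool, broadcast: bool) -> bool:
    """True if the parameter combination falls under one of the three cases."""
    case_1 = sync_processors and bounded_delay
    case_2 = ordered_messages and broadcast
    case_3 = sync_processors and ordered_messages
    return case_1 or case_2 or case_3


# Enumerate all 16 parameter combinations and keep the ones where agreement is possible.
possible = [combo for combo in product([False, True], repeat=4)
            if agreement_possible(*combo)]
```

Under this encoding 8 of the 16 combinations allow agreement. The setting of the impossibility proof (asynchronous processors, unbounded delay, unordered messages, point-to-point communication) is `agreement_possible(False, False, False, False)`, which is false.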

What Does This Result Mean?
For the practical systems we really build:
Not that we can never reach agreement; good systems almost always do.
But that we generally can’t guarantee it.
Which implies that our systems should tolerate disagreements, at some times and under some conditions.

When Is Disagreement OK?
For preference, when it doesn’t matter, e.g., when reasonable results are possible even without agreement.
Or when it eventually works itself out, with possible inconsistencies in the meantime.
Or, at worst, when it is visible to people who can fix it.

When Is Disagreement Not OK?
When the consequences of disagreement are dire.
When it results in unfixable problems.
When its consequences are invisible, but relevant.
Unfortunately, we don’t always get to choose when we can avoid it.

Minimizing Chances of Disagreement
Understand when agreement is most critical.
In those cases, use protocols that are less likely to fail to reach agreement.
Those protocols usually have heavy expenses, so don’t always use them.

A Classification of Faults
More detailed than previously discussed.
Produced by the fault-tolerant computing community.
Divides faults into classes.
A stronger class is a subset of a weaker class.

An Ordered Fault Classification
Byzantine
Authenticated Byzantine
Incorrect Computation
Timing
Omission
Crash
Fail Stop
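
Because the classes nest, the ordering can be captured with an ordered enumeration. This is a sketch only: the numeric values and the helper `includes` are assumptions; the lecture gives just the order of the classes.

```python
from enum import IntEnum


class Fault(IntEnum):
    """Fault classes, from the most restricted to the most general."""
    FAIL_STOP = 1
    CRASH = 2
    OMISSION = 3
    TIMING = 4
    INCORRECT_COMPUTATION = 5
    AUTHENTICATED_BYZANTINE = 6
    BYZANTINE = 7


def includes(general: Fault, specific: Fault) -> bool:
    """True if every fault of class `specific` is also a fault of class `general`."""
    return general >= specific
```

Read this way, a protocol designed to tolerate faults of some class also covers every class that class includes: for example, `includes(Fault.BYZANTINE, Fault.CRASH)` holds, while `includes(Fault.CRASH, Fault.OMISSION)` does not.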

Fail Stop Faults
A processor ceases operation, but informs the other processors in the computation that it has stopped.
Relatively easy to deal with.

Crash Fault
A processor crashes, or loses internal state and halts, without notification to anyone else.
Hard to distinguish from a really slow processor.

Omission Faults
A processor fails to do something in time, like respond to a message.
But otherwise it may still be operating correctly.
Or it may have crashed.

Timing Fault
A processor completes a task before or after the window when it should, or never.
E.g., a late acknowledgement to a message.

Incorrect Computation Fault
A processor fails to produce the correct results for a given set of inputs.
That could be merely not producing the results soon enough, or it could be sending back trash.

Authenticated Byzantine Fault
A processor performs an arbitrary or malicious fault.
But authentication mechanisms detect any alterations made to others’ messages.

Byzantine Fault
Any and every fault, having arbitrarily bad consequences.
Possibly working in combination with other faults to produce really bad results.
In this classification, all other faults are subclasses of Byzantine faults.
