Program Correctness. The designer of a distributed system has the responsibility of certifying the correctness of the system before users start using.

Program Correctness

The designer of a distributed system has the responsibility of certifying the correctness of the system before users start using it. This guarantee must hold as long as every hardware and software component works according to specifications.

History of Distributed System Computation that begins from the initial state A and ends in the final state L. Each arc corresponds to an atomic action that causes a state transition. In each of the states B and G, there are two possible actions: the choice may be either data-dependent, or due to the non- determinism of the scheduler(s). The history can be represented as the set of the following three state sequences {ABCDEFL, ABGHIFL, ABGJKIFL}. If a computation does no terminate, then some of the behaviors can be infinite.

State Transitions

It is tempting to prove correctness by enumerating all possible interleavings of atomic actions and testing or reasoning about each of these behaviors. Because of the explosive growth in the number of such behaviors, this approach soon turns out to be impractical—at least for nontrivial distributed systems. For example, with n processes each executing a sequence of m atomic actions, the total number of possible interleavings is (nm)!/(m!)n Even for modest values of m and n, this is a very large number. Therefore, to exhaustively test even a small system, one can easily exceed the computing capacity available with today’s largest and fastest computers.

Testing vs. Proof Testing: Apply inputs and observe if the outputs satisfy the specifications. Fool proof testing can be painfully slow, even for small systems. Most testing are partial. Proof: Has a mathematical foundation, and is a complete guarantee. Sometimes not scalable.

Testing vs. Proof To test this program, you have to test all possible interleavings. With n processes p 0, p 1, … p n-1, and m steps per process, the number of interleavings is (n.m)! (m!) n The state explosion problem p0p1p2p3 step1 step2 step3

Correctness criteria Safety properties Bad things never happen Example :1. Consider the history shown in Figure and let a safety property be specified by the statement: the value of a certain variable temperature should never exceed 100.Thus, if we find that in state G temperature = 107, then we immediately conclude that the safety property is violated — we need not wait for what will happen to temperature after state G 2. Mutual exclusion Here, a safety property is: at most one process can be inside its critical section. Accordingly, the safety invariant can be written as Ncs ≤1 where Ncs is the number of processes in the critical section at any time.

Example: Mutual Exclusion Process 0Process 1 do true  Entry protocol Critical section Exit protocol od Safety properties (1) There is no deadlock (2) At most one process is in its critical section. Liveness property A process trying to enter the CS must eventually succeed. (This is also called the progress property ) CS

Safety invariants Invariant means: something meaningful should always hold Example: Total no. of processes in CS ≤ 1 (mutual exclusion problem) Another safety property is Partial correctness. It implies that “If the program terminates then the postcondition will hold.” Consider the following : Safety invariant:  (G0  G1  G2  …  Gk)  postcondition It does not say if the program will terminate. (termination is a liveness property) Total correctness = partial correctness + termination. do G0  S1 [] G1  S1 [] … [] Gk  Sk od

Consider a system of four processes P0 through P3 as shown in Figure. Each process has a color c represented by an integer from the set {0,1,2,3}. We will represent the color of a process Pi by the symbol c[i]. The objective is to devise an algorithm, so that regardless of the initial colors of the different processes, the system eventually reaches a configuration where no two adjacent processes have the same color.

An Infinite Behavior of the system

Let N(i) denote the set of neighbors of process Pi. We propose the following program for every process Pi to get the job done: program colorme {for process Pi } do ∃ j:j ∈ N(i) :: (c[i] = c[j]) ➔ c[i] := (c[i] + 2) mod 4 od Is the program partially correct? we conclude that if the program terminates, then the following condition holds: ∀ i, j: j ∈ N(i) :: c[i] != c[j]) By definition, this is the desired postcondition. So the system is partially correct. However, it is easy to find out that the program may not terminate. Consider the initial state A in which the system returns to the starting state A without ever satisfying the desired postcondition. Therefore, the program is partially correct, but not totally correct. Note that it is possible for this program to reach termination if the schedulers choose an alternate sequence of action.

Liveness properties Good things eventually happen. Eventuality is tricky. There is no need to guarantee when the desired thing will happen, as long as it happens.. The criminal will eventually be caught Some examples  The message will eventually reach the receiver.  The process will eventually enter its critical section.  The faulty process will be eventually be diagnosed  Fairness (if an action will eventually be scheduled)  The program will eventually terminate.  The criminal will eventually be caught. Absence of liveness cannot be determined from finite prefix of the computation

Example of liveness properties : Progress: Progress toward the critical section Fairness: Fairness is a liveness property, since it determines whether the scheduler will schedule an action in a finite time. Reachability: Given a distributed system with an initial state S0, does there exist a finite behavior that changes the system state to St. If so, then St is said to be reachable from S0. Reachability is a liveness property Termination. Program termination is a liveness property.

Correctness Proofs The set of possible behaviors of a distributed system can be very large, and testing is not a feasible way of demonstrating the correctness of nontrivial system. Some form of mathematical reasoning or methods like proof by induction or proof by contradiction are widely applicable. The techniques for proving safety properties are thus different from the techniques for proving liveness properties. We particularly focus on the following four topics: Assertional methods of proving safety properties Use of well-founded sets for proving liveness properties Programming logic Predicate transformers

Program Correctness. The designer of a distributed system has the responsibility of certifying the correctness of the system before users start using.

Similar presentations

Presentation on theme: "Program Correctness. The designer of a distributed system has the responsibility of certifying the correctness of the system before users start using."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Program Correctness. The designer of a distributed system has the responsibility of certifying the correctness of the system before users start using.

Similar presentations

Presentation on theme: "Program Correctness. The designer of a distributed system has the responsibility of certifying the correctness of the system before users start using."— Presentation transcript:

Similar presentations

About project

Feedback