
2 20 February 2004 UKC, February 2004 1 mmnet Summer School, Tuesday 20 – Wednesday 21 July, 2004, Canterbury. Speakers: David Bacon, IBM TJ Watson. Emery Berger, UMass. Robert Berry, IBM Hursley. Hans Boehm, HP. Dave Detlefs, Sun Microsystems. Rick Hudson, Intel. Eliot Moss, UMass.

3 Birrell’s Reference Listing Revisited
Richard Jones, University of Kent, R.E.Jones@ukc.ac.uk
Peter Dickman, University of Glasgow
Luc Moreau, University of Southampton

4 Outline
1. Distributed reference counting – benefits & issues.
2. Birrell’s algorithm – example.
3. Weaknesses of Birrell’s description.
4. Our approach: graphical notation; formalisation & proof.
5. Extensions: fault tolerance; optimisation.
6. Conclusion.

5 Problems of a distributed world
Concurrency everywhere: must avoid race conditions, etc.
Communication is costly: changing the reference count of a remote object may cost 10,000 times as much as changing the count of a local object.
Not easy to get complete knowledge of the object graph: synchronisation is expensive.
Faults everywhere: communications, processes.

6 Terminology
Processes partition computational and storage resources.
Messages pass in point-to-point channels between processes.
Channels have properties, such as FIFO or lossy.
A reference is local if it refers to an object allocated in the same process; otherwise, it is remote (or global).
The owner of a reference is the process that initially allocated the object to which the reference refers.

7 Distributed Reference Counting/Listing
Most widely used DGC technique.
Maintain a count of remote references to each global object.
Reference listing is the alternative.
Benefits: scalable solution; easy to implement.
But… cannot reclaim garbage cycles; easy to implement wrong!

8 Race condition (1)
[Figure: processes A and B; owner and surrogate; messages copy, inc, dec; message log: copy, inc, dec.]

9 Race condition (2)
[Figure: processes A and B; messages copy, inc, dec; message logs: "copy, inc, dec" and "copy, dec, inc", the latter marked "@**!!".]
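
The race in the two figures above can be illustrated with a minimal Python sketch. It assumes a scenario the slides only draw: A holds the sole remote reference (count 1), copies it to B, and then drops its own copy, so the owner receives one inc (on behalf of B) and one dec (from A) in either order. The function name `replay` and the message encoding are hypothetical.

```python
def replay(arrivals, count=1):
    """Apply 'inc'/'dec' messages at the owner in arrival order.

    Returns True if the count reaches zero while messages are still
    outstanding, i.e. the object would be reclaimed prematurely.
    """
    for i, msg in enumerate(arrivals):
        count += 1 if msg == "inc" else -1
        if count == 0 and i < len(arrivals) - 1:
            return True
    return False


# inc overtakes dec: the count goes 1 -> 2 -> 1, never zero.
safe_order = replay(["inc", "dec"])
# dec overtakes inc: the count goes 1 -> 0 while B's inc is still in flight.
racy_order = replay(["dec", "inc"])
```

Under these assumptions only the second arrival order is unsafe, which is the ordering a naïve counting scheme has to rule out.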

10 Birrell’s algorithm
Birrell, Evers, Nelson, Owicki, and Wobber. Distributed Garbage Collection for Network Objects. DEC SRC technical report 116, 1993.
Widely used: Modula-3 Network Objects; Java RMI.
Based on reference listing; avoids the race conditions of naïve implementations; fault tolerant.

11 Birrell’s description
[Figure: process P, owner of O, with object table entry w(o), concrete O and o.dirtySet = {Q,…}; process Q, a client of O, with object table entry w(o), weak ref and surrogate for O.]
Concrete and surrogate objects: the client invokes the surrogate, whose methods perform RPC to the owner.
WireRep: unique ID of owner, plus index of object at the owner. Marshalling.
Object table: maps a wirerep w(o) to the local instance of the object.
Client has surrogate for o ⇒ concrete o in object table.
Dirty set: identifiers of processes that have surrogates.
Dirty set = ∅ ⇒ o can be removed from the object table.
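
The owner-side structures described above (wirerep, object table, dirty set) might look roughly like the following Python sketch. The class and method names (`WireRep`, `Concrete`, `Owner`, `export`) are hypothetical, and the sketch simplifies in two labeled ways: it uses an ordinary dictionary rather than weak references, and it removes an entry as soon as its dirty set becomes empty.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WireRep:
    owner: str   # unique ID of the owning process
    index: int   # index of the object at the owner


@dataclass
class Concrete:
    payload: object
    dirty_set: set = field(default_factory=set)  # processes holding surrogates


class Owner:
    def __init__(self, pid):
        self.pid = pid
        self.object_table = {}   # WireRep -> Concrete
        self._next_index = 0

    def export(self, payload):
        """Assign a wirerep to a local object and enter it in the object table."""
        w = WireRep(self.pid, self._next_index)
        self._next_index += 1
        self.object_table[w] = Concrete(payload)
        return w

    def dirty(self, w, client):
        self.object_table[w].dirty_set.add(client)

    def clean(self, w, client):
        entry = self.object_table[w]
        entry.dirty_set.discard(client)
        if not entry.dirty_set:          # dirty set empty: entry may go
            del self.object_table[w]


owner = Owner("P")
w = owner.export("some object")
owner.dirty(w, "Q")
owner.dirty(w, "R")
owner.clean(w, "Q")
still_listed = w in owner.object_table    # R still holds a surrogate
owner.clean(w, "R")
removed = w not in owner.object_table
```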

12 P marshals o to Q
P pushes o onto its stack; sends w(o) to Q.
1. Q looks it up in its object table.
   Present: use the object.
   w(o)=NIL: surrogate being created; suspend.
2. Absent: enter w(o)=NIL in object table; send dirty(o) to owner(o).
3. Owner adds Q to its dirtySet(o) and dirty(o) returns.
4. Q creates surrogate(o) and adds it to its object table.
5. Q deletes surrogate(o) and sends clean(o) to owner(o).
[Figure: processes P, Q and O.]
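
The numbered steps above can be traced in a small single-threaded Python sketch. It is only an illustration under stated assumptions: the RPCs are modelled as direct method calls, the names `OwnerProc`, `ClientProc`, `unmarshal` and `drop` are hypothetical, and the "suspend" case is represented by a return value because nothing here runs concurrently.

```python
PENDING = None  # stands for the w(o)=NIL placeholder in the object table


class OwnerProc:
    def __init__(self):
        self.dirty_sets = {}   # wirerep -> set of client ids

    def dirty(self, w, client):
        # Step 3: record the client, then return (the return acts as the ack).
        self.dirty_sets.setdefault(w, set()).add(client)

    def clean(self, w, client):
        self.dirty_sets.get(w, set()).discard(client)


class ClientProc:
    def __init__(self, pid, owner):
        self.pid = pid
        self.owner = owner
        self.object_table = {}   # wirerep -> surrogate or PENDING

    def unmarshal(self, w):
        # Step 1: look the wirerep up in the object table.
        if w in self.object_table:
            entry = self.object_table[w]
            return "suspend" if entry is PENDING else entry
        # Step 2: absent, so enter the placeholder and call the owner.
        self.object_table[w] = PENDING
        self.owner.dirty(w, self.pid)
        # Step 4: the dirty call has returned; create and register the surrogate.
        surrogate = ("surrogate", w)
        self.object_table[w] = surrogate
        return surrogate

    def drop(self, w):
        # Step 5: delete the surrogate and tell the owner.
        del self.object_table[w]
        self.owner.clean(w, self.pid)


owner = OwnerProc()
q = ClientProc("Q", owner)
first = q.unmarshal("w1")
second = q.unmarshal("w1")          # already present: reuse the surrogate
listed = set(owner.dirty_sets["w1"])
q.drop("w1")
after_drop = set(owner.dirty_sets["w1"])
```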

13 Dirty calls
[Figure: processes A and B; messages copy, dirty, ack; the dirty set grows from {A} to {A,B}; message log: keep ref on stack, copy, dirty, ack, remove from stack.]

14 Strengths
Fairly straightforward to implement.
Reference listing offers robustness.
Idempotent operations tolerate message duplication.
Owners know who holds remote references.
Copes with process failure (ping clients) without leaking (but cannot distinguish network delay from process failure).
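
Why idempotent operations tolerate message duplication can be seen by replaying the same message trace against a counter and against a set. This is a minimal Python sketch; the function names and the trace are hypothetical, and it deliberately ignores message reordering.

```python
def apply_counting(messages):
    """Reference counting: every message changes the count."""
    count = 0
    for kind, _client in messages:
        count += 1 if kind == "dirty" else -1
    return count


def apply_listing(messages):
    """Reference listing: adding or removing a client twice has no extra effect."""
    dirty_set = set()
    for kind, client in messages:
        if kind == "dirty":
            dirty_set.add(client)
        else:
            dirty_set.discard(client)
    return dirty_set


# The network duplicates Q's dirty message; Q later sends a single clean.
trace = [("dirty", "Q"), ("dirty", "Q"), ("clean", "Q")]
leaked_count = apply_counting(trace)   # stays at 1: the object leaks
remaining = apply_listing(trace)       # empty set: the object can be reclaimed
```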

15 Weaknesses
Tightly bound to RPC: acknowledgement mechanism.
Implementation specific: assumes method invocation pushes arguments onto stack; unique surrogate per process (object-listing).
Under-specified: critical sections; race conditions; other scenarios.
Informal proof: depends on hard-to-formalise aspects (e.g. stack).

16 Our contribution
Novel graphical notation.
Formalisation.
Discovered requirement for pivotal new states.
Proof.

17 New graphical notation
Intuitive.
Precise.
Uniformity of ‘direction’ of transitions.
‘Obvious’ where transitions are needed.

18 Lifecycle of references
Obvious where transitions are needed, e.g. receive reference at state ccit.
ccitnil critical for correctness.

19 Slicing

20 Fault tolerance
Slicing: owner is aware we have a reference; owner is not aware we have a reference.

21 Benefits
Intuitive: fault-tolerant version literally encapsulates failure-free version.
Identify precisely when failures can be detected.
Define states reached after failures detected.
Remedial actions.

22 Applicability: Birrell; Lermen-Maurer

23 Formalisation
Abstract machine: processes communicating by asynchronous message passing.
Atomic transitions involve 1 process at a time.
Receipt of a message changes only a process’ internal state: trigger sending of another message? Store some info in a to-do table?

24 Benefits
Inputs and outputs desynchronised.
Size of critical sections explicit and minimised.
Asynchronous outputs (e.g. background daemons process to-do tables).
Suitable for mechanical proof.

25 Formalisation
Rule name: guard → pseudo-statements.
make_copy(p1, p2, r):
  p1 ≠ p2 ∧ receive_T(p1, r) = OK ∧ locallyReachable(p1, r)
  → { id := new Identifier;
      dirty_T(p1, r) := dirty_T(p1, r) ∪ {(p1, p2, id)};
      post(p1, p2, copy(r, id)); }
[Annotations: name, guard, table, message.]
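
As an illustration, the make_copy rule can be read as a guarded state update. The Python sketch below is one possible rendering, not the authors' machine: the `Machine` class, the `reachable` set standing in for locallyReachable, the integer identifiers and the list-based channels are all assumptions made for the example.

```python
import itertools
from collections import defaultdict


class Machine:
    def __init__(self):
        self.receive_T = {}                  # (process, ref) -> state, e.g. "OK"
        self.dirty_T = defaultdict(set)      # (process, ref) -> set of entries
        self.channels = defaultdict(list)    # (sender, receiver) -> messages in transit
        self.reachable = set()               # (process, ref); stand-in for locallyReachable
        self._ids = itertools.count()        # source of fresh identifiers

    def make_copy(self, p1, p2, r):
        """Fire the rule if its guard holds; return whether it fired."""
        guard = (
            p1 != p2
            and self.receive_T.get((p1, r)) == "OK"
            and (p1, r) in self.reachable
        )
        if not guard:
            return False
        ident = next(self._ids)                                # id := new Identifier
        self.dirty_T[(p1, r)].add((p1, p2, ident))             # table update
        self.channels[(p1, p2)].append(("copy", r, ident))     # post(p1, p2, copy(r, id))
        return True


m = Machine()
m.receive_T[("P", "r")] = "OK"
m.reachable.add(("P", "r"))
fired = m.make_copy("P", "Q", "r")
self_copy = m.make_copy("P", "P", "r")       # guard fails: p1 = p2
no_reference = m.make_copy("Q", "P", "r")    # guard fails: Q's receive table is not OK
```

Keeping the guard as a single boolean expression mirrors the rule's shape: the transition either fires atomically or leaves the state untouched.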

26 More formally
Tables are defined as functions whose first argument is a process.
Channels are bags of messages between pairs of processes.
A configuration of the abstract machine is a tuple of all tables and message channels.
Pseudo-statements act as configuration transformers. Given a configuration:
table_T(a0,…,an) := V denotes the configuration in which table_T is replaced by table_T', where
  table_T'(x0,…,xn) = table_T(x0,…,xn) if (x0,…,xn) ≠ (a0,…,an)
  table_T'(a0,…,an) = V
post(p1, p2, m) denotes the configuration in which k is replaced by k', where
  k'(p1, p2) = k(p1, p2) ⊎ {m}
  k'(pi, pj) = k(pi, pj), ∀(pi, pj) ≠ (p1, p2)
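
The two configuration transformers can be sketched as pure functions that return updated copies instead of mutating their inputs. This is a minimal Python illustration; the function names `assign` and `post`, the dictionary encoding of tables and the use of `collections.Counter` as the bag of messages are assumptions for the example.

```python
from collections import Counter


def assign(table, key, value):
    """table_T(a0..an) := V. The result agrees with `table` except at `key`."""
    updated = dict(table)
    updated[key] = value
    return updated


def post(channels, p1, p2, message):
    """Add `message` to the bag for channel (p1, p2); other channels are unchanged."""
    updated = dict(channels)
    bag = Counter(updated.get((p1, p2), Counter()))  # copy so the old bag is untouched
    bag[message] += 1
    updated[(p1, p2)] = bag
    return updated


old_table = {("P", "r"): "nil"}
new_table = assign(old_table, ("P", "r"), "OK")

msg = ("copy", "r", 0)
k0 = {}
k1 = post(k0, "P", "Q", msg)
k2 = post(k1, "P", "Q", msg)   # a bag may hold the same message twice
```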

27 Proof style
Safety & Liveness.
Invariance-based proof: induction on length of transitions; case analysis of transitions.
Termination measure.
Benefits: systematic; less error prone than temporal reasoning.
– E.g. establishing fine details such as mutual exclusivity is complicated in a formalism based on temporal reasoning.

28 Example proof
Lemma: For any processes p1, p2, for any reference r, for any identifier id and for any configuration, the following implication holds:
  If (p1, p2, id) ∈ dirty_T(p1, r) then receive_T(p1, r) = OK.
Proof: In the initial configuration, dirty tables are empty and the implication trivially holds. We consider the four rules that add/remove entries to/from dirty tables and that modify the content of receive tables to/from OK.
make_copy(p1, p2, r): make_copy adds an entry, and its guard ensures that the receive table is in the OK state.
…

29 receive_copy_ack(p2, p1, r, id): receive_copy_ack removes the transient entry from the dirty table, and therefore trivially satisfies the Lemma.
receive_dirty_ack(p1, p2, r): If the Lemma held before transition receive_dirty_ack, it also holds after the transition, since the dirty table is unchanged and the receive table is set to OK.
do_clean_call(p1, r): As the dirty table is a root for the local GC, finalize cannot be fired, and hence receive_T(p1) will not be changed by do_clean_call.

30 Key Lemmas
Safety Lemma 1: Usable Reference. For any processes p1 and p2, for any reference r with p1 = owner(r) and p1 ≠ p2, and for any configuration, the following implication holds:
  If receive_T(p1, r) = OK, then p1 ∈ dirty_T(p2, r).
Safety Lemma 2: Reference in Transit. For any processes p1, p2, for any reference r, for any identifier id and for any configuration, the following implication holds:
  If copy(r, id) ∈ k(p1, p2), then p1 ∈ dirty_T(owner(r), r), if p1 ≠ owner(r), or (p1, p2, id) ∈ dirty_T(owner(r), r), if p1 = owner(r).
Safety Lemma 3: Unusable Reference. For any process p1, for any reference r and for any configuration, the following implication holds:
  If receive_T(p1, r) = nil ∨ receive_T(p1, r) = ccitnil, then there exists p such that p ∈ dirty_T(owner(r), r) or there exist p, id such that (owner(r), p, id) ∈ dirty_T(owner(r), r).
[Annotations on the dirty-table entries: permanent, temporary.]

31 Birrell’s algorithm is Safe
A DGC algorithm is safe if the collector cannot reclaim live objects.
For Birrell's algorithm, there must be an entry in the owner's dirty table for every live object.
The proof follows directly from the 3 safety lemmas.
Birrell's Safety Requirement: For all references r, for all processes p1 and p2 and for all identifiers id,
  If receive_T(p1, r) = OK ∨ receive_T(p1, r) = nil ∨ receive_T(p1, r) = ccitnil ∨ copy(r, id) ∈ k(p1, p2), then there exists p such that p ∈ dirty_T(owner(r), r) or there exist p, id such that (owner(r), p, id) ∈ dirty_T(owner(r), r).

32 Liveness
Liveness guarantees that if all references to an object are deleted, the owner’s dirty table will eventually become empty.
To prove this:
We show that whenever there’s a message in a channel, a transition can be fired to consume it.
We introduce a termination measure on the configurations that shows how far the abstract machine is from completing, and show that DGC transitions cause this measure to decrease.
Hence all transition paths terminate.

33 Termination measures
termination_measure(c) = tab_measure + Σ Σ Σ msg_measure(m) + Σ rt_measure(receive_T(p, r))
The three terms cover, respectively: size of tables; messages between pairs of processes; states of references in processes.
tab_measure = 9|dirty_call_todo_T| + 7|dirty_ack_todo_T| + 2|copy_ack_todo_T| + 2|clean_ack_todo_T| + 2|blocked_T|
msg_measure(copy) = 14, msg_measure(dirty) = 8, msg_measure(dirtyack) = 6, msg_measure(clean) = 3, msg_measure(copyack) = 1, msg_measure(cleanack) = 1
rt_measure(OK) = 5, rt_measure(ccitnil) = 2, rt_measure(ccit) = 1, rt_measure(nil) = 1, rt_measure( ) = 0
Values chosen ‘arbitrarily’.

34 Example: receive_dirty_ack
receive_dirty_ack(p1, p2, r):
  dirtyack(r) ∈ k(p1, p2)
  → { receive(p1, p2, dirtyack(r));                                  // -6
      copyack_todo_T(p2) := copyack_todo_T(p2) ∪ blocked_T(p2, r);   // +X
      // Deserialisation code to be resumed for each entry in blocked_T(p2, r)
      blocked_T(p2, r) := ∅;                                         // -X
      receive_T(p2, r) := OK;                                        // +5
    }
Thus, the termination measure decreases by 1: Δ measure = -6 + X - X + 5 = -1.
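
The arithmetic for receive_dirty_ack can be checked with a small Python sketch that uses the weights listed for the termination measure. Several things are assumptions for the example: the function signature, the choice of three blocked entries, the use of `None` as a stand-in for the receive-table state with measure 0, and the reading of the "+5" annotation as that entry moving from measure 0 to OK.

```python
TABLE_WEIGHTS = {
    "dirty_call_todo_T": 9,
    "dirty_ack_todo_T": 7,
    "copy_ack_todo_T": 2,
    "clean_ack_todo_T": 2,
    "blocked_T": 2,
}
MSG_MEASURE = {"copy": 14, "dirty": 8, "dirtyack": 6,
               "clean": 3, "copyack": 1, "cleanack": 1}
# None stands in for the receive-table state whose measure is 0.
RT_MEASURE = {"OK": 5, "ccitnil": 2, "ccit": 1, "nil": 1, None: 0}


def termination_measure(table_sizes, messages, receive_states):
    """Sum the weighted table sizes, in-transit messages and receive-table states."""
    tab = sum(TABLE_WEIGHTS[name] * size for name, size in table_sizes.items())
    msg = sum(MSG_MEASURE[m] for m in messages)
    rt = sum(RT_MEASURE[s] for s in receive_states)
    return tab + msg + rt


n = 3  # hypothetical number of entries in blocked_T(p2, r)

# Before: one dirtyack in transit, n blocked entries, receive state at measure 0.
before = termination_measure({"blocked_T": n}, ["dirtyack"], [None])
# After: message consumed, entries moved to the copy-ack to-do table, state OK.
after = termination_measure({"copy_ack_todo_T": n}, [], ["OK"])
delta = after - before
```

Because blocked_T and copy_ack_todo_T carry the same weight, moving entries between them leaves the table term unchanged, so the result does not depend on n.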

35 Optimisations
FIFO channels: less synchronisation needed; fewer messages (no clean_ack); fewer tables.
Sender is owner: no need for dirty_call and copy_ack, but need message ordering to avoid races.
Receiver is owner: fewer dirty table entries; again need message ordering.

36 Future work
Convince ourselves of appropriateness of Birrell’s remedial actions.
Correctness proof of fault-free version.
Explore applicability of our techniques: graphical notation; proof techniques; generality.
Auto-generation of code from formalism.

37 Conclusion
Intuitive graphical notation.
Formal, implementation-independent specification and proof of a widely used algorithm.
Discovered weaknesses in original presentation.
A widely applicable technique?

38 Questions?

39 FINIS

