1 1 Read-Copy Update
Paul E. McKenney, Linux Technology Center, IBM Beaverton (pmckenne@us.ibm.com, http://www.rdrop.com/users/paulmck)
Jonathan Appavoo, Department of Electrical and Computer Engineering, University of Toronto (jonathan@eecg.toronto.edu)
Andi Kleen, SuSE Labs (ak@suse.de)
Orran Krieger, IBM T. J. Watson Research Center (okrieg@us.ibm.com, http://www.eecg.toronto.edu/~okrieg)
Rusty Russell, RustCorp (rusty@rustcorp.com.au)
Dipankar Sarma, Linux Technology Center, IBM India Software Lab (dipankar.sarma@in.ibm.com)
Maneesh Soni, Linux Technology Center, IBM India Software Lab (smaneesh@in.ibm.com)
Liao, Hsiao-Win

2 2 Outline
• Introduction
• Toy Example
• Simple Infrastructure to Support RCU
• Application

3 3 Outline
• Introduction
• Toy Example
• Simple Infrastructure to Support RCU
• Application

4 4 Traditional OS locking designs
• Very complex
• Poor concurrency
• Fail to take advantage of the event-driven nature of operating systems

5 5 Race Between Teardown and Use of Service
[Figure; labels: code executed, interrupts taken, memory error-correction events]

6 6 Read-Copy Update Handling Race
[Figure; label: quiescent state]

7 7 Read-copy update works best when
• an update can be divided into two phases
• common-case operations can proceed on stale data (e.g., continuing to handle operations by a module being unloaded)
• destructive updates are very infrequent

8 8 Implementations of Quiescent State
• DYNIX/ptx 2.1 (1993) and Rusty Russell's first wait_for_rcu() patch [Russell01a] simply run on each CPU in turn.
• DYNIX/ptx 4.0 (1994) and Dipankar Sarma's RCU patch for Linux use the following as quiescent states: context switch, execution in the idle loop, execution in user mode, system call entry, trap from user mode, and CPU offline (this last for DYNIX/ptx only).

9 9 Implementations of Quiescent State
• Rusty Russell's second wait_for_rcu() patch [Russell01b] uses voluntary context switch as the sole quiescent state.
• Tornado's and K42's "generation" facility tracks the beginnings and ends of operations.
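One way to picture quiescent-state tracking is a per-CPU counter that is bumped at every quiescent state (for example, a context switch). The sketch below is a single-threaded toy model, not code from any of the patches named above: `NCPUS`, `qs_count`, `struct grace_period`, and all three function names are hypothetical. A grace period is treated as elapsed once every CPU's counter has moved past the snapshot taken when the grace period started.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPUS 4 /* hypothetical CPU count for this model */

/* Per-CPU count of quiescent states (e.g. context switches) seen so far. */
static unsigned long qs_count[NCPUS];

/* Snapshot of every CPU's counter, taken when a grace period starts. */
struct grace_period {
    unsigned long snap[NCPUS];
};

/* The model calls this whenever a CPU passes through a quiescent state. */
void note_quiescent_state(int cpu)
{
    assert(cpu >= 0 && cpu < NCPUS);
    qs_count[cpu]++;
}

/* Record the current counters as the start of a grace period. */
void grace_period_start(struct grace_period *gp)
{
    for (int cpu = 0; cpu < NCPUS; cpu++)
        gp->snap[cpu] = qs_count[cpu];
}

/* True once every CPU has passed at least one quiescent state
 * since grace_period_start() was called on gp. */
bool grace_period_elapsed(const struct grace_period *gp)
{
    for (int cpu = 0; cpu < NCPUS; cpu++)
        if (qs_count[cpu] == gp->snap[cpu])
            return false;
    return true;
}
```

A real kernel implementation would need per-CPU data, memory ordering, and a way to notice idle or offline CPUs; none of that is modeled here.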

10 10

11 11 Outline
• Introduction
• Toy Example
• Simple Infrastructure to Support RCU
• Application

12 12 Reference-count vs. Read-copy
• search() and delete()
• Read-copy functions avoid all cacheline bouncing for reading tasks
• Read-copy functions can return references to deleted elements
• Read-copy functions cannot hold a reference to elements across a voluntary context switch
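The cacheline-bouncing point can be illustrated by comparing two search routines over the same list. This is a minimal sketch with hypothetical names (`struct elem`, `search_refcnt`, `search_read_copy`); it is not the slide's own search() code. The reference-counted variant stores to the element it finds, which is what makes the cache line move between CPUs, while the read-copy variant takes a `const` list and performs no stores at all.

```c
#include <stddef.h>

struct elem {
    int key;
    int refcnt;        /* used only by the reference-counted variant */
    struct elem *next;
};

/* Reference-counted search: a hit writes to the shared element.
 * Real code would also need a lock or an atomic increment here. */
struct elem *search_refcnt(struct elem *head, int key)
{
    for (struct elem *e = head; e != NULL; e = e->next) {
        if (e->key == key) {
            e->refcnt++; /* store to shared data on every successful search */
            return e;
        }
    }
    return NULL;
}

/* Read-copy search: the list is const-qualified, so the reader
 * cannot modify any element while traversing it. */
const struct elem *search_read_copy(const struct elem *head, int key)
{
    for (const struct elem *e = head; e != NULL; e = e->next)
        if (e->key == key)
            return e;
    return NULL;
}
```

The price of the read-only search is the restriction listed above: the caller must not keep the returned pointer across a voluntary context switch.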

13 13 Typical RCU update sequence
• Remove pointers to a data structure.
• Wait for all previous readers to complete their RCU read-side critical sections.
• At this point, no reader can still hold a reference to the data structure, so it may now be safely reclaimed.
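The three steps of the update sequence map directly onto a linked-list deletion. The following is a single-threaded sketch under stated assumptions: `wait_for_grace_period()` is a hypothetical stand-in that only counts calls (there are no concurrent readers in this model), and `list_insert`, `list_delete`, and `list_contains` are illustrative helpers rather than any kernel API.

```c
#include <stdlib.h>

struct elem {
    int key;
    struct elem *next;
};

/* Hypothetical stand-in for a grace-period wait. With no concurrent
 * readers in this model it only counts how often it was called. */
static int grace_periods_waited;
static void wait_for_grace_period(void)
{
    grace_periods_waited++;
}

/* Push a new element at the head; returns the new head
 * (or the old head unchanged if allocation fails). */
struct elem *list_insert(struct elem *head, int key)
{
    struct elem *e = malloc(sizeof(*e));
    if (e == NULL)
        return head;
    e->key = key;
    e->next = head;
    return e;
}

/* Delete the first element with the given key; returns the new head. */
struct elem *list_delete(struct elem *head, int key)
{
    struct elem **pp = &head;
    while (*pp != NULL && (*pp)->key != key)
        pp = &(*pp)->next;

    struct elem *victim = *pp;
    if (victim == NULL)
        return head;

    /* Step 1: remove the pointer. victim->next is left intact so a
     * reader already standing on victim could still walk onward. */
    *pp = victim->next;

    /* Step 2: wait for all pre-existing readers to finish. */
    wait_for_grace_period();

    /* Step 3: no reader can hold a reference any more, so reclaim. */
    free(victim);
    return head;
}

int list_contains(const struct elem *head, int key)
{
    for (; head != NULL; head = head->next)
        if (head->key == key)
            return 1;
    return 0;
}
```

In a concurrent setting the unlink in step 1 would also need the appropriate memory-ordering guarantees, which this sketch omits.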

14 14 Read-Copy Deletion (delete B)

15 15 The first phase of the update [figure]

16 16 Read-Copy Deletion, first phase [figure]

17 17 Read-Copy Search [figure; labels: task, table data]

18 18 Read-Copy Deletion, second phase [figure]

19 19 Read-Copy Deletion [figure]

20 20 Read-Copy Deletion

21 21 Assumptions
• Read intensive: the update fraction f < 1/|CPU|
• Grace period: reading tasks can see stale data
• The modification must be compatible with lock-free access; linked-list insertion, deletion, and replacement are compatible

22 22 Outline
• Introduction
• Toy Example
• Simple Infrastructure to Support RCU
• Application

23 23 Simple Implementation
• wait_for_rcu() waits for a grace period to expire
• kfree_rcu() waits for a grace period before freeing a specified block of memory

24 24 Read-Copy Update Grace Period
[Figure; labels: non-preemptible kernel execution, quiescent-state execution]

25 25 Simple Grace-Period Detection

26 26 Rusty Russell's wait_for_rcu() I

27 27 Rusty Russell's wait_for_rcu() II

28 28 Shortcomings
1. Does not work in a preemptible kernel unless preemption is suppressed in all read-side critical sections
2. Cannot be called from an interrupt handler
3. Cannot be called while holding a spinlock or with interrupts disabled
4. Relatively slow

29 29 Addressing the Shortcomings
• The K42 and Tornado implementations of RCU allow read-side critical sections to block as well as to be preempted (addresses shortcoming 1)
• call_rcu() (addresses shortcomings 2 and 3)
• kfree_rcu() (addresses shortcomings 2 and 3)
• High-performance design for RCU (addresses shortcomings 2, 3, and 4)

30 30 K42 and Tornado implementations of RCU
• Maintain two generation counters: the current generation and the non-current generation
• Operations (next slide)

31 31 Operation
• When an operation begins: increment the current generation counter and store a pointer to that counter in the task
• When the operation ends: decrement the generation counter that the task points to
• Periodically, the non-current generation is checked to see if it is zero; if so, the current and non-current generations are reversed
• A token is handed from one CPU to the next
• When the token returns to a given CPU, all operations that were in progress across the entire system when the token last left that CPU have terminated
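The counter bookkeeping on this slide can be sketched for a single CPU as follows. This is a toy model, not K42 or Tornado source: `struct generations`, `op_begin`, `op_end`, and `try_swap` are hypothetical names, an array index stands in for the "pointer to that counter" stored in the task, and the cross-CPU token passing is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

struct generations {
    long count[2]; /* operations in flight, per generation */
    int current;   /* index of the current generation (0 or 1) */
};

/* Begin an operation: charge it to the current generation and return
 * the generation index the task must remember until it ends. */
int op_begin(struct generations *g)
{
    g->count[g->current]++;
    return g->current;
}

/* End an operation: decrement the counter it was charged to. */
void op_end(struct generations *g, int gen)
{
    assert(g->count[gen] > 0);
    g->count[gen]--;
}

/* Periodic check: if the non-current generation has drained to zero,
 * reverse the roles of the two generations. Returns true on a swap. */
bool try_swap(struct generations *g)
{
    int noncurrent = 1 - g->current;
    if (g->count[noncurrent] != 0)
        return false;
    g->current = noncurrent;
    return true;
}
```

Under this model, an operation that was in flight before a successful swap is guaranteed to have ended by the time a second swap succeeds, because the second swap requires its generation's counter to reach zero.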

32 32 Non-Blocking Grace-Period Detection
• Queues callbacks onto a list
• Invokes all the pending callbacks after forcing a grace period
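The queue-then-invoke pattern can be sketched as a singly linked callback list. This is a single-threaded illustration with hypothetical names (`struct rcu_cb`, `queue_callback`, `invoke_pending_callbacks`, `count_callback`); it does not reproduce the signature of any real call_rcu() implementation, and the grace period itself is assumed to have been detected elsewhere before `invoke_pending_callbacks()` is called.

```c
#include <assert.h>
#include <stddef.h>

/* One deferred callback; the caller supplies the storage. */
struct rcu_cb {
    struct rcu_cb *next;
    void (*func)(void *arg);
    void *arg;
};

static struct rcu_cb *pending_head;
static struct rcu_cb **pending_tail = &pending_head;

/* Queue a callback without blocking: only list manipulation happens here. */
void queue_callback(struct rcu_cb *cb, void (*func)(void *arg), void *arg)
{
    assert(cb != NULL && func != NULL);
    cb->func = func;
    cb->arg = arg;
    cb->next = NULL;
    *pending_tail = cb;
    pending_tail = &cb->next;
}

/* To be called once a grace period has elapsed. Detaches the whole
 * pending list, runs each callback in FIFO order, and returns how
 * many callbacks were invoked. */
int invoke_pending_callbacks(void)
{
    struct rcu_cb *list = pending_head;
    int invoked = 0;

    pending_head = NULL;
    pending_tail = &pending_head;

    while (list != NULL) {
        struct rcu_cb *next = list->next; /* read before func may free cb */
        list->func(list->arg);
        list = next;
        invoked++;
    }
    return invoked;
}

/* Example callback: increments the int that arg points to. */
void count_callback(void *arg)
{
    (*(int *)arg)++;
}
```

Because queuing never waits, a function of this shape can be called from contexts where blocking is not allowed, which is the property the slide's title refers to.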

33 33 High-Performance Design
• Defers frees of kmem_cache_alloc() memory
• Detects and identifies overly long lock-hold durations
• Batches grace-period-measurement requests
• Maintains per-CPU request lists
• Provides a less costly algorithm for measuring grace-period duration

34 34 Simple Deferred Free
• A simple implementation of a deferred-free function named kfree_rcu()
• Low performance: each kfree_rcu() call invokes wait_for_rcu()
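The performance cost of the simple approach shows up when counting grace periods. In the sketch below, `wait_for_grace_period()` is a hypothetical stand-in that only counts calls, and `simple_deferred_free()` and `batched_deferred_free()` are illustrative names, not the kfree_rcu() implementation itself. The simple variant pays one full grace period per block, while a batched variant, in the spirit of the batching listed under the high-performance design, shares one grace period across many blocks.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a grace-period wait; counts calls only. */
static int grace_periods_waited;
static void wait_for_grace_period(void)
{
    grace_periods_waited++;
}

/* Simple deferred free: every block costs one full grace period. */
void simple_deferred_free(void *p)
{
    wait_for_grace_period();
    free(p);
}

/* Batched deferred free: one grace period covers all n blocks. */
void batched_deferred_free(void **blocks, size_t n)
{
    assert(blocks != NULL || n == 0);
    if (n == 0)
        return;
    wait_for_grace_period();
    for (size_t i = 0; i < n; i++)
        free(blocks[i]);
}
```

Freeing three blocks one at a time would wait for three grace periods in this model; handing the same three blocks to the batched variant waits for one.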

35 35 Outline
• Introduction
• Toy Example
• Simple Infrastructure to Support RCU
• Application

36 36 Application
• Distributed lock manager
• TCP/IP
• Storage-area network (SAN)
• Application regions manager (a workload-management subsystem)
• Process management
• LAN drivers

37 37 Thanks for listening

