Presentation on theme: "Presented by Neha Agrawal"— Presentation transcript:

1 User-Level Interprocess Communication for Shared Memory Multiprocessors
Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy
Presented by Neha Agrawal

2 Outline
- Motivation for RPC
- Motivation for URPC
- URPC concept
- Components of URPC
  - Processor reallocation
  - Data transfer
  - Thread management
- Performance
- Conclusion
2019/2/22

3 Motivation for RPC
- Importance of IPC: encourages system decomposition across address space boundaries
  - Failure isolation
  - Extensibility
  - Modularity
- Message passing, a different paradigm
- RPC: an interface above messages

5 Motivation for URPC
- Problems with the kernel-based approach
  - Architectural performance barriers
  - LRPC: the kernel mediates every cross-address-space call
  - Kernel-based communication hampers high-performance user-level threads
- Result: weakly related subsystems are coalesced, trading away the major benefits of decomposition

7 URPC Concept
- URPC (User-level Remote Procedure Call) is designed for shared-memory multiprocessors
- User-level management of cross-address-space communication
- Uses shared memory to send messages directly between address spaces
- Avoids unnecessary processor reallocation

8 URPC Concept
- Division of responsibility: processor reallocation, thread management, data transfer
- The kernel is required ONLY for processor reallocation
- Exploits the RPC definition: separates the operation of communication channels from the processor reallocation mechanisms

10 Processor Reallocation: Why Should It Be Avoided?
- High immediate costs
  - Scheduling costs
  - Cost to update the virtual memory mapping registers
  - Cost to transfer the processor between address spaces
- High long-term costs
  - Locality shifts to the new address space, diminishing cache and TLB performance

11 Processor Reallocation
- Optimistic reallocation policy assumes:
  - The client has more work to do
  - A context switch is inexpensive
  - The server will have a processor available for service
- Exploits parallelism to amortize the cost of a single processor reallocation
- Advantages over handoff scheduling: resources stay logically centralized

12 URPC timeline (figure)

13 Why Won't the Optimistic Approach Always Hold?
- Problem cases:
  - Single-threaded applications
  - Real-time applications
  - High-latency I/O operations
  - Priority invocations
- URPC solution: force processor reallocation

14 The Kernel Handles Processor Reallocation
- URPC provides a Processor.Donate call
- When? When an underpowered address space is discovered
- What parameters? The identity of the new address space
- How? The processor passes through the kernel to the new address space; the identity of the donor is made known to the receiver
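The donation path can be sketched as follows. Processor.Donate is the paper's kernel call, but every type, field, and function name below is an assumption made for illustration, and the "kernel" is simulated in ordinary user code:

```c
#include <stdbool.h>

/* Illustrative model of an address space as the reallocation logic
 * sees it. All names are assumptions for this sketch. */
typedef struct {
    int id;
    int runnable_threads;
    int processors;
    int last_donor;            /* donor identity, made known on receipt */
} space_t;

/* An address space is "underpowered" when it has work but no processor. */
bool underpowered(const space_t *s) {
    return s->runnable_threads > 0 && s->processors == 0;
}

/* Simulated Processor.Donate: move one processor through the "kernel"
 * from the donor to the target space, recording who donated it. */
void processor_donate(space_t *donor, space_t *target) {
    donor->processors--;
    target->processors++;
    target->last_donor = donor->id;
}
```

A client that notices an underpowered server would call processor_donate to hand over a processor, and the server learns the donor's identity so it knows where to return it.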

15 Voluntary Return of Processors: The Policy
- Return the client's processor when all outstanding messages from the client have been processed, or the client has become 'underpowered'
- The policy cannot be ENFORCED
- URPC handles load balancing only for communicating applications
- Preemptive, work-conserving policies are required to avoid STARVATION
- No need for a global processor reallocator
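The voluntary-return test a server processor might apply can be written as a small predicate; the struct fields are assumptions for the sketch, not the paper's data structures:

```c
#include <stdbool.h>

/* Client state as seen by a server processor deciding whether to go
 * back. Field names are illustrative. */
typedef struct {
    int outstanding_msgs;      /* client messages not yet processed */
    int runnable_threads;      /* runnable threads in the client    */
    int processors;            /* processors the client holds       */
} client_state_t;

/* Return the processor when every outstanding client message has been
 * processed, or when the client has become underpowered. The check is
 * voluntary: nothing in URPC can enforce it. */
bool should_return_processor(const client_state_t *c) {
    bool all_processed = (c->outstanding_msgs == 0);
    bool client_underpowered = (c->runnable_threads > 0 && c->processors == 0);
    return all_processed || client_underpowered;
}
```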

17 SAFE Data Transfer Using Shared Memory
- Traditional RPC
  - Clients and servers can overload each other: a higher-level protocol is needed
  - Copying data in and by the kernel is unnecessary, and insufficient for safety
- URPC approach
  - Pair-wise shared memory, mapped during the binding phase
  - Stubs are responsible for safety

18 Controlling Channel Access
- A bidirectional shared-memory queue with a test-and-set lock on either end
- The lock must be NON-SPINNING
- The protocol: if the lock is free, acquire it; else do something else
- The rationale applies to both the sender and the receiver
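The access protocol can be sketched with C11 atomics. The queue layout and names are assumptions, but the discipline is the one described above: test-and-set once, and if the lock is busy, go do something else instead of spinning:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* One direction of the shared-memory message queue, guarded by a
 * test-and-set lock. Layout and names are illustrative. */
#define QSIZE 64
typedef struct {
    atomic_flag lock;
    int head, tail;
    int buf[QSIZE];
} channel_t;

/* Try exactly once; never spin. On failure the caller should run
 * another user-level thread instead of busy-waiting on the lock. */
bool try_enqueue(channel_t *ch, int msg) {
    if (atomic_flag_test_and_set(&ch->lock))
        return false;                      /* lock busy: do something else */
    bool ok = (ch->tail - ch->head) < QSIZE;
    if (ok) ch->buf[ch->tail++ % QSIZE] = msg;
    atomic_flag_clear(&ch->lock);
    return ok;
}

bool try_dequeue(channel_t *ch, int *msg) {
    if (atomic_flag_test_and_set(&ch->lock))
        return false;
    bool ok = ch->head != ch->tail;
    if (ok) *msg = ch->buf[ch->head++ % QSIZE];
    atomic_flag_clear(&ch->lock);
    return ok;
}
```

Because both operations return immediately when the lock is held, a caller can fold them into its user-level scheduling loop rather than wasting a processor on contention.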

20 Thread Management
- Interaction between send/receive of MESSAGES and start/stop of THREADS
- The authors argue that high-performance thread management facilities:
  - Are needed for fine-grained parallel programs
  - Can be provided only at the user level
  - Have a strong interaction with communication
- Hence, implement both together at the user level

21 Concurrent Programming and Thread Management
- Problems with kernel-level threads
  - The kernel deals with thread management: expensive kernel traps
  - A general interface for all applications prevents per-application customization
  - The kernel must ensure its own protection, adding to the expense of traps
- User-level threads attack these costs: two-level scheduling
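A minimal sketch of the user level of two-level scheduling, under stated assumptions: this is a cooperative, run-to-completion toy, not the real thread package. The point it shows is that once the kernel has given the address space a processor, dispatching a thread is an ordinary function call with no kernel trap:

```c
/* Tiny cooperative user-level scheduler (illustrative only). */
#define MAX_READY 16

typedef void (*thread_fn)(void *arg);

typedef struct { thread_fn fn; void *arg; } uthread_t;

static uthread_t ready_q[MAX_READY];
static int rq_head = 0, rq_tail = 0;

/* "Thread creation" is an enqueue: no trap, no kernel object. */
void uthread_spawn(thread_fn fn, void *arg) {
    ready_q[rq_tail++ % MAX_READY] = (uthread_t){ fn, arg };
}

/* One processor's scheduling loop: dispatch until the queue drains. */
void uthread_run(void) {
    while (rq_head != rq_tail) {
        uthread_t t = ready_q[rq_head++ % MAX_READY];
        t.fn(t.arg);               /* "context switch" = function call */
    }
}

/* Example thread body: append its argument to a shared log. */
static int log_buf[4], log_len = 0;
static void log_thread(void *arg) { log_buf[log_len++] = *(int *)arg; }
```

A real package would also save and restore register state so threads can block and resume; the cost argument is the same either way.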

22 Threads and Communication at Two Different Levels!
- Threads at user level, communication at kernel level:
  - Loses the performance benefits of inexpensive scheduling and context switching
  - Synchronization would be implemented and executed at two levels
- Advantage when both functions are implemented at user level

23 Summarizing the Rationale
- Advantages of a user-level approach to communication:
  - Minimal involvement of the kernel
  - Avoids unnecessary reallocation of processors between address spaces
  - Amortizes the cost of processor reallocation
  - Integrates user-level communication with user-level thread management for parallelism and performance

25 URPC Performance with No Processor Reallocation Required
- Work done during a URPC call: Send, Poll, Receive, Dispatch
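The four phases can be simulated sequentially in a few lines. The one-slot mailbox and all names below are assumptions for the sketch; in URPC itself both sides run concurrently over pair-wise shared memory:

```c
#include <stdbool.h>

/* One-slot "shared memory" mailbox between client and server. */
typedef struct {
    bool req_full, rep_full;
    int req, rep;
} mailbox_t;

/* Client stub: Send the request into "shared memory". */
void urpc_send(mailbox_t *m, int arg) { m->req = arg; m->req_full = true; }

/* Server side: Poll the channel; if a request is pending, Receive it
 * and Dispatch a thread to run the procedure, leaving the reply. */
void server_poll(mailbox_t *m, int (*proc)(int)) {
    if (!m->req_full) return;      /* Poll     */
    int arg = m->req;              /* Receive  */
    m->req_full = false;
    m->rep = proc(arg);            /* Dispatch */
    m->rep_full = true;
}

/* Client stub: pick up the reply (a real stub would run another
 * user-level thread while waiting). */
bool urpc_reply(mailbox_t *m, int *out) {
    if (!m->rep_full) return false;
    *out = m->rep;
    m->rep_full = false;
    return true;
}

/* Example server procedure. */
static int square(int x) { return x * x; }
```

Note that no step enters the kernel: the call completes entirely through shared memory, which is why no processor reallocation is needed on this path.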

26 Call Latency and Throughput
- Both are load dependent
- Both depend on:
  - Number of client processors (C)
  - Number of server processors (S)
  - Number of runnable threads in the client's address space (T)
- Call latency: the time from when a thread calls into the stub until control returns from the stub

27 Call Latency
- Latency is proportional to the number of threads per processor; it increases when T > C+S
- Pure latency: 93 μsec with S = T = C = 1

28 Throughput
- Improves until T > C+S
- Does not double: contention

29 LRPC vs. URPC
- LRPC uses pair-wise mapped shared memory, like URPC
- LRPC uses kernel-level threads, so URPC outperforms LRPC except for the case S = 0, where LRPC wins
- Though both LRPC and URPC then require two processor reallocations and two kernel invocations:
  - Null URPC latency: 375 microseconds
  - Null LRPC latency: 157 microseconds

30 Why Does LRPC Win the S=0 Case?
- Processor reallocation in URPC is based on LRPC's
- URPC is integrated with two-level scheduling, so each call must ask: is there an idle processor, and is there an underpowered address space to which it can be reallocated?

32 Conclusion
- Moving the communication layer from the kernel to user space can be advantageous
- Shared-memory multiprocessors need specially designed operating systems so that their processing power can be exploited
- URPC is designed for shared-memory multiprocessors, directly leading to major performance improvements

