User-Level Interprocess Communication for Shared Memory Multiprocessors, by Bershad, B.N., Anderson, T.E., Lazowska, E.D., and Levy, H.M.


1 User-Level Interprocess Communication for Shared Memory Multiprocessors, by Bershad, B.N., Anderson, T.E., Lazowska, E.D., and Levy, H.M.

2 Introduction
- RPC
  - Helps in implementing distributed applications by eliminating the need to implement a communication mechanism.
  - A decomposed system provides the advantages of failure isolation, extensibility, and modularity, so RPC is used even when the call stays on the same machine.

3 Introduction
- RPC costs
  - Stub overhead
  - Message buffer overhead (4 copies)
  - Access validation
  - Message transfer
  - Scheduling
  - Context switch
  - Dispatch

4 Introduction
- LRPC costs
  - Stub overhead
  - Message buffer overhead (1 copy)
  - Only necessary access validation
  - Message transfer
  - Only necessary scheduling
  - Context switch is minimized by using domain caching

5 Introduction
- IPC
  - Main components (all work in the kernel)
    - Processor reallocation (process context switch)
    - Data transfer
    - Thread management
  - Problems
    - Processor reallocation is expensive
    - Parallel applications need user-level thread management

6 URPC
- User-Level Remote Procedure Call
- Shared memory multiprocessors
- Processor reallocation: minimize
- Data transfer: user-level (package called URPC)
- Thread management: user-level (package called FastThreads)

7 User-level components

8 Processor Reallocation
- Limit the frequency of processor reallocation
- Why
  - A process context switch is more expensive than a thread context switch
  - Cost of invoking the kernel
- Call sequence with kernel-mediated reallocation
  - Client makes a procedure call into the server address space
  - Invoke kernel
  - Kernel reallocates the processor to the server address space
  - Server finishes the job
  - Invoke kernel
  - Kernel reallocates the processor to the client address space
  - Client resumes its work

9 Processor Reallocation
- Limit the frequency of processor reallocation
- How
  - Optimistic reallocation policy, which assumes that:
    - The client has other work to do
    - The server has, or will soon have, a processor to do the job
  - A uniprocessor can delay processor reallocation
- Call sequence under optimistic reallocation
  - Client makes a procedure call into the server address space
  - Client does something else
  - Server finishes the job
  - Client resumes its work
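The optimistic call sequence can be sketched in a few lines. This is a minimal illustration, not the URPC implementation: the function names, the use of Python deques as stand-ins for shared-memory queues, and the string thread identifiers are all assumptions made for the example.

```python
from collections import deque

def optimistic_call(requests, ready_queue, caller, request):
    """Queue the request for the server, then pick another client thread to run.

    No kernel trap happens here: the caller is simply parked until a reply
    shows up. Returns None when the client has no other work, which is the
    case where the client might need to force a processor reallocation.
    """
    requests.append((caller, request))
    return ready_queue.popleft() if ready_queue else None

def server_poll(requests, replies, handler):
    """Runs on a processor that is already allocated to the server."""
    while requests:
        caller, request = requests.popleft()
        replies.append((caller, handler(request)))

def client_resume(replies, ready_queue):
    """Make callers runnable again once their replies have arrived."""
    results = {}
    while replies:
        caller, result = replies.popleft()
        results[caller] = result
        ready_queue.append(caller)
    return results
```

In this sketch thread "t1" calls the server, the client processor switches to "t2" in the meantime, and "t1" becomes runnable again only after `client_resume` finds its reply.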

10 Processor Reallocation
- Problems
  - Inappropriate situations
    - Single-threaded clients, real-time applications, and high-latency I/O applications
    - Solution: allow the client to force processor reallocation
  - Underpowered address space
    - No processor is available to handle the pending request from the client
    - Solution: donation. An idle processor donates itself to the underpowered address space
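The donation decision might look like the sketch below. The slide only says that an idle processor donates itself to an underpowered address space; the function name, the dictionary layout, and the tie-breaking rule (prefer the address space with the most pending requests) are assumptions made for illustration.

```python
def pick_donation_target(pending_by_space, processors_by_space):
    """Return the address space an idle processor should donate itself to.

    An address space counts as underpowered here when it has pending
    requests but no processor. Returns None when nobody needs a processor.
    """
    underpowered = [
        space
        for space, pending in pending_by_space.items()
        if pending > 0 and processors_by_space.get(space, 0) == 0
    ]
    if not underpowered:
        return None
    # Assumed policy: help the address space with the longest backlog first.
    return max(underpowered, key=lambda space: pending_by_space[space])
```

A client-side idle loop would call something like this before giving the processor back to the kernel.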

11 Processor Reallocation
- Problems
  - Voluntary return of processor
    - A processor working in the server may never return to the client because it is too busy working on requests from other clients.
    - Solution: enforce processor reallocation when necessary, such as when a high-priority job is waiting while a low-priority job is running or a processor is idling

12 Processor Reallocation
- LRPC vs. URPC
  - LRPC: domain caching looks for an idle processor in the server context
  - URPC: optimistic reallocation assumes that a processor will be available in the server context and queues the request to be done later
  - URPC needs two-level scheduling decisions, including looking for an idle processor and for an underpowered address space, while LRPC does not.

13 Data Transfer
- Use pair-wise shared memory to avoid copying through the kernel.
- Both approaches (kernel copying and shared memory) give the same level of security, since data must pass through the stubs before it can be used
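A pair-wise channel with stub-side validation can be sketched as follows. This is an illustration under stated assumptions: the `Channel` class, the `marshal_add` and `server_stub_add` helpers, and the message layout are hypothetical, and an in-process deque guarded by `threading.Lock` stands in for memory that would really be mapped into exactly one client and one server address space.

```python
import struct
import threading
from collections import deque

class Channel:
    """A pair of message queues shared by one client and one server."""

    def __init__(self):
        self.lock = threading.Lock()
        self.to_server = deque()
        self.to_client = deque()

    def send(self, queue, payload):
        with self.lock:
            queue.append(payload)

    def receive(self, queue):
        """Return the next message, or None when the queue is empty."""
        with self.lock:
            return queue.popleft() if queue else None

def marshal_add(a, b):
    """Client stub: pack the arguments of a hypothetical add(a, b) call."""
    return struct.pack("<ii", a, b)

def server_stub_add(payload):
    """Server stub: validate the raw bytes before the procedure sees them."""
    if len(payload) != struct.calcsize("<ii"):
        raise ValueError("malformed request")
    a, b = struct.unpack("<ii", payload)
    return struct.pack("<i", a + b)
```

The point of the stub check is that the server never trusts the shared buffer directly, which is why skipping the kernel copy does not weaken protection in this model.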

14 Thread Management
- Arguments
  - Fine-grained parallel applications need high-performance thread management, which can only be achieved by implementing it at user level
  - Communication and thread management can achieve very good performance when both are implemented at user level

15 Thread Management
- Kernel features such as time slicing degrade application performance
- Invoking a kernel-level thread management operation requires a kernel trap
- A thread management policy implemented in the kernel is unlikely to be efficient for all parallel applications

16 Thread Management
- Threads block in order to
  - Synchronize their activities within the same address space
  - Wait for external events from a different address space
- Communication implemented at kernel level results in synchronization at both user level and kernel level
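When communication and thread management both live at user level, a single scheduler can handle both kinds of blocking. The sketch below uses Python generators as stand-ins for user-level threads; it is not FastThreads, and the convention that a thread yields a callable describing what it is waiting for is an assumption of this example.

```python
from collections import deque

def run(threads):
    """Round-robin over generator-based threads until all of them finish.

    A thread yields None to give up the processor voluntarily, or yields a
    zero-argument callable that returns True once its wait condition holds.
    Note: this sketch spins forever if a condition never becomes true.
    """
    ready = deque((thread, None) for thread in threads)
    while ready:
        thread, waiting_on = ready.popleft()
        if waiting_on is not None and not waiting_on():
            ready.append((thread, waiting_on))  # still blocked, try the next one
            continue
        try:
            ready.append((thread, next(thread)))
        except StopIteration:
            pass  # thread finished

def demo():
    """One thread waits for a reply while another thread delivers it."""
    mailbox, log = [], []

    def caller():
        yield lambda: bool(mailbox)  # block until a message arrives
        log.append(("got", mailbox.pop()))

    def replier():
        mailbox.append("reply")
        log.append(("sent", "reply"))
        yield  # voluntary yield, not blocked

    run([caller(), replier()])
    return log
```

Here the wait for a message and the switch to another thread go through the same ready queue, with no second layer of kernel-level synchronization.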

17 URPC

18 Performance
- Thread management is faster at user level
- Component breakdown

19 Performance
- Call latency and throughput are at their worst when S = 0 (no processors allocated to the server)

20 Conclusion
- Move as much functionality as possible from the kernel to user level to improve performance
- To achieve high performance on multiprocessors, the system must be designed specifically to support multiprocessor functionality

