Slide 1: Reducing Interprocess Communication Overhead in Concurrent Programs
Erik Stenman and Kostis Sagonas

Slide 2: Motivation
Concurrency as an abstraction is important: systems that need to interact with the outside world are hard to model without concurrency. Unfortunately, concurrency costs. There are two types of runtime overhead:
- "Direct overhead" of the concurrency primitives themselves.
- "Indirect overhead" from hiding the data flow from the optimizing compiler.

Slide 3: Goal of this work
Reduce the overhead of concurrency in concurrent programs.
Idea: optimize the code that implements process communication. We call this interprocess optimization, and we will present three techniques:
1. Rescheduling send.
2. Direct dispatch send.
3. Interprocess inlining.

Slide 4: Rescheduling Send
Typically, when a process sends a message, it is because it wants the receiver to act on that message. It is therefore in the sender's interest to yield to the receiver and allow it to act promptly on the sent message. We call this type of send a rescheduling send.

Slide 5: Rescheduling Send
Implementation: the send operation suspends the currently active (sending) process.
Benefits:
- Lower message-passing latency.
- Better cache behavior: the receiver has access to the message while it is still hot in the cache.
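As an added illustration (not the authors' implementation, which lives inside the runtime's send operation), a rough library-level approximation in plain Erlang can be written with the real BIF erlang:yield/0; resched_send/2 is an assumed name:

%% Minimal sketch: send the message, then voluntarily give up the rest of
%% the current time slice so the scheduler can run the receiver sooner.
%% erlang:yield/0 is only a scheduling hint; the real rescheduling send
%% suspends the sender as part of the send itself.
resched_send(To, Msg) ->
    To ! Msg,          %% ordinary asynchronous send
    erlang:yield(),    %% hint: let another ready process (ideally To) run
    Msg.               %% mimic the return value of !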

Slide 6: Direct Dispatch Send
The sender contributes the remaining part of its time slice to the receiver, hoping this will lead to a faster response. The receiver is then woken up directly, bypassing the ready queue, so the overhead of the scheduler is eliminated. If the receiver also performs a direct dispatch send back to the sender, interprocess communication becomes as fast as a function call.
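As an added illustration (not from the slides), the round trip that direct dispatch targets is the usual synchronous call pattern; call/2 below is an assumed helper, written against the {Request, From} request and {Server, Reply} reply convention used elsewhere in this deck:

%% Hedged sketch: a request immediately followed by a selective receive of
%% the reply. With direct dispatch on both the request and the reply, this
%% whole exchange behaves much like an ordinary function call.
call(Server, Request) ->
    Server ! {Request, self()},
    receive
        {Server, Reply} -> Reply
    end.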

Slide 7: Interprocess Inlining
Merge code from the sender with code from the receiver.

Slide 8: (figure) Process β is the receiver and process α is the sender; a message goes from α to β.

Slide 11: The communication protocol is known and can be optimized. Process β only needs to be ready to receive the communication.

Slide 12: The state of process β has changed, without β really participating in the actual communication.

Slide 13: Interprocess Inlining
Merge code from the sender with code from the receiver. In the merged code, the overhead of the communication protocol can be optimized away. We suggest using profiling to find situations where this is feasible; this requires that the optimized code is guarded by a runtime test.

Slide 14: An example in Erlang…

ref_server(V) ->
    receive
        {'++', From}       -> From ! {self(), V+1},  ref_server(V+1);
        {':=', From, NewV} -> From ! {self(), NewV}, ref_server(NewV);
        {'=', From}        -> From ! {self(), V},    ref_server(V)
    end.

inc(Beta) ->
    Beta ! {'++', self()},
    receive
        {Beta, Answer} -> Answer
    end.

Slide 15: …could be rewritten as

ref_server(V) ->
    receive
        {'++', From}       -> From ! {self(), V+1},  ref_server(V+1);
        {':=', From, NewV} -> From ! {self(), NewV}, ref_server(NewV);
        {'=', From}        -> From ! {self(), V},    ref_server(V)
    end.

inc´(Beta) ->
    if ready(Beta) ->
            B_V = get('V', Beta) + 1,
            save('V', B_V, Beta),
            B_V;
       true ->
            inc(Beta)
    end.
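As an added illustration (not part of the original slides), here is a minimal usage sketch of the unoptimized protocol; inc´/1 is meant to be observably equivalent, falling back to inc(Beta) whenever the fast path does not apply.

%% Hedged sketch: spawn the server with an initial value of 0 and
%% increment it twice from the calling process.
start() ->
    Server = spawn(fun() -> ref_server(0) end),
    1 = inc(Server),
    2 = inc(Server),
    Server.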

Slide 16: Code merging, the general case (figure)
The figure contrasts the original sender α (α-head, create message, send, α-tail) and the receiver g (g-head, receive, g-tail) with the merged sender α''. In α'', the α-head and the message creation are kept, and a ready test guards the fast path: when the receiver is ready, an extracted copy of the g-tail runs between restoring and saving the β-state, followed by a copy of the α-tail; when it is not, the original send and α-tail are executed.

Slide 17: Code merging
Code merging can only be done if:
- Code explosion does not occur.
- The code does not suspend.
- The control flow does not escape the included code.
- The extracted code terminates (otherwise, the sending process might hang after code merging).

Slide 18: Code merging
These instructions are not extracted:
- A call or a return.
- Instructions that lead to the suspension of the process.
- Instructions that throw an exception.
- Some BIFs that are large and uncommon and not worth the effort of adapting for external execution.
Non-terminating code is unacceptable; hence, we do not accept any loops in the control-flow graph (a sketch of such a check follows).
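As an added illustration (not the authors' compiler code), a loop check over a candidate's control-flow graph could be written with the standard digraph and digraph_utils modules; the {Nodes, Edges} representation here is assumed for illustration only.

%% Hedged sketch: return true iff the control-flow graph, given as a list
%% of basic-block labels and a list of {From, To} edges, contains no cycle.
acyclic_cfg(Nodes, Edges) ->
    G = digraph:new(),
    lists:foreach(fun(N) -> digraph:add_vertex(G, N) end, Nodes),
    lists:foreach(fun({A, B}) -> digraph:add_edge(G, A, B) end, Edges),
    Acyclic = digraph_utils:is_acyclic(G),
    digraph:delete(G),
    Acyclic.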

Slide 19: Code merging
The receiver g and the sender f. (On the slide, f's message creation, the send, and f's tail are highlighted.)

g() ->
    receive
        {hi, From} -> From ! {self(), fine};
        _          -> ignore
    end,
    g().

f(Beta) ->
    Beta ! {hi, self()},
    receive
        {Beta, fine}   -> right;
        {Beta, _Other} -> wrong
    end.

Slide 20: Code merging
(g() unchanged.) The send in f is guarded by a ready test; the fall-through branch keeps f's original tail:

f(Beta) ->
    Mess = {hi, self()},
    if ready(Beta) ->
            NEWCODE;
       true ->
            Beta ! Mess,
            receive
                {Beta, fine}    -> right;
                {Beta, _Answer} -> wrong
            end
    end.

Slide 21: Code merging
(g() unchanged.) The receiver's receive is inlined; β-prefixed names denote code and data taken from the receiver, process β.

NEWCODE:
    β-Mbox = [{hi, self()}],
    case β-Mbox of
        [{hi, β-From}] -> β-From ! β-{β-self(), fine};
        _              -> β-ignore
    β-end,
    save_state(Beta),
    COPY-OF-f-TAIL

Slide 22: Code merging
(g() unchanged.) The known contents of β-Mbox are propagated into the case:

NEWCODE:
    β-Mbox = [{hi, self()}],
    case [{hi, self()}] of
        [{hi, β-From}] -> β-From ! β-{β-self(), fine};
        _              -> β-ignore
    β-end,
    save_state(Beta),
    COPY-OF-f-TAIL

Slide 23: Code merging
(g() unchanged.) The case always takes its first branch, so it is folded away and β-From is bound directly:

NEWCODE:
    β-Mbox = [{hi, self()}],
    β-From = self(),
    β-From ! β-{β-self(), fine},
    save_state(Beta),
    COPY-OF-f-TAIL

Slide 24: Code merging
(g() unchanged.) β-From is propagated into the send, which now goes back to the sender itself:

NEWCODE:
    β-Mbox = [{hi, self()}],
    β-From = self(),
    self() ! β-{β-self(), fine},
    save_state(Beta),
    COPY-OF-f-TAIL

Slide 25: Return messages
Return messages are an important special case, handled by applying the same type of rewriting. The ready test is replaced by a test that checks that both mailboxes are empty (a sketch of such a guard follows).
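As an added illustration (not from the slides), the modified guard for the running f/g example might look as follows; mailbox_empty/1, like ready/1 and save_state/1, is assumed runtime support rather than a real BIF.

%% Hedged sketch: take the merged fast path only when neither mailbox
%% contains messages that the inlined round trip could interfere with;
%% otherwise fall back to the original f/1.
f_return(Beta) ->
    case mailbox_empty(self()) andalso mailbox_empty(Beta) of
        true  -> save_state(Beta), right;
        false -> f(Beta)
    end.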

Slide 26: More Code merging
(g() unchanged.) Continuing from the state reached on slide 24:

NEWCODE:
    β-Mbox = [{hi, self()}],
    β-From = self(),
    self() ! β-{β-self(), fine},
    save_state(Beta),
    COPY-OF-f-TAIL

Slide 27: More Code merging
(g() unchanged.) The send to self() is rewritten as placing the reply directly in the sender's own mailbox, α-Mbox:

NEWCODE:
    β-Mbox = [{hi, self()}],
    β-From = self(),
    α-Mbox = {β-self(), fine},
    save_state(Beta),
    COPY-OF-f-TAIL

Slide 28: More Code merging
(g() unchanged.) The now-dead bindings of β-Mbox and β-From are removed:

NEWCODE:
    α-Mbox = {β-self(), fine},
    save_state(Beta),
    COPY-OF-f-TAIL

Slide 29: More Code merging
(g() unchanged.) The copy of f's tail is expanded: its receive becomes a case over the sender's mailbox:

NEWCODE:
    α-Mbox = {β-self(), fine},
    save_state(Beta),
    case α-Mbox of
        {Beta, fine}   -> right;
        {Beta, _Other} -> wrong
    end

Slide 30: More Code merging
(g() unchanged.) The known contents of α-Mbox are propagated into the case:

NEWCODE:
    α-Mbox = {β-self(), fine},
    save_state(Beta),
    case {β-self(), fine} of
        {Beta, fine}   -> right;
        {Beta, _Other} -> wrong
    end

Slide 31: More Code merging
(g() unchanged.) Since β-self() is Beta, the first branch always matches and the case folds to right:

NEWCODE:
    α-Mbox = {β-self(), fine},
    save_state(Beta),
    right

Slide 32: More Code merging
(g() unchanged.) The dead assignment to α-Mbox is removed:

NEWCODE:
    save_state(Beta),
    right

Slide 33: Code merging
The final result, with g() unchanged:

g() ->
    receive
        {hi, From} -> From ! {self(), fine};
        _          -> ignore
    end,
    g().

f´(Beta) ->
    if ready(Beta) ->
            save_state(Beta),
            right;
       true ->
            f(Beta)
    end.
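As an added illustration (not in the original deck), a caller should not be able to tell the two versions apart; f´ is spelled f_opt here, since the primed name is presentation notation rather than legal Erlang.

%% Hedged usage sketch: spawn the receiver, then exercise the original and
%% the merged sender. Both must return right; f_opt may take the inlined
%% fast path when ready(Beta) holds, and falls back to f/1 otherwise.
demo() ->
    Beta = spawn(fun g/0),
    right = f(Beta),
    right = f_opt(Beta),
    ok.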

Slide 34: This optimization gives
- Access to variables in another process with almost no overhead (two reads, one test, and two writes).
- Reduced overhead of the communication protocol (creating a tuple and switching on it).

Slide 35: Profiling
We have profiled parts of Eddie and the AXD code and found that:
- Each send goes to one particular receive.
- The receiving process is typically suspended with an empty mailbox.
The same profiling tool could be used in a dynamic compiler to find pairs of senders and receivers whose interprocess communication can be optimized.
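For illustration only (this is not the authors' profiling tool), sender/receiver pairs can also be counted at the Erlang level with the standard tracing BIF erlang:trace/3; profile_sends/2 and collect/2 are assumed names.

%% Hedged sketch: enable 'send' tracing for Pids, let the calling process
%% (the default tracer) count {Sender, Receiver} pairs, and stop once no
%% send event has arrived for Millis milliseconds.
profile_sends(Pids, Millis) ->
    [erlang:trace(P, true, [send]) || P <- Pids],
    Counts = collect(#{}, Millis),
    [erlang:trace(P, false, [send]) || P <- Pids],
    Counts.

collect(Counts, Millis) ->
    receive
        {trace, From, send, _Msg, To} ->
            collect(maps:update_with({From, To}, fun(N) -> N + 1 end, 1, Counts),
                    Millis)
    after Millis ->
        Counts
    end.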

Slide 36: Conclusions
We presented several methods for cross-process optimization that reduce the overhead of interprocess communication. No modifications to existing code are required. An open issue is their integration into the Erlang development environment.

Slide 37: Praeterea censeo "0xCA" scribere Erlang posse. (Latin: "Furthermore, I maintain that one should be able to write 0xCA in Erlang.")
Happi
That's all, folks!

Slide 38: Problem
How does one find senders and receivers that communicate?
- Static analysis: problematic in a language with dynamic typing, dynamic process creation, communication through mailboxes, and asynchronous communication.
- Profiling and dynamic recompilation.

Slide 39: Interprocess inlining
Do:
- Find two communicating processes.
- Merge code from the sender with code from the receiver.
- Optimize the merged code.
Get:
- Reduced message passing.
- Short-circuited code for switching on messages.

Slide 40: An example in Erlang…

g() ->
    receive
        {hi, From} -> From ! {self(), fine};
        _          -> ignore
    end,
    g().

f(Beta) ->
    Beta ! {hi, self()},
    receive
        {Beta, fine}   -> right;
        {Beta, _Other} -> wrong
    end.

Slide 41: …could be rewritten as

g() ->
    receive
        {hi, From} -> From ! {self(), fine};
        _          -> ignore
    end,
    g().

f´(Beta) ->
    if ready(Beta) ->
            save_state(Beta),
            right;
       true ->
            Beta ! {hi, self()},
            receive
                {Beta, fine}   -> right;
                {Beta, _Other} -> wrong
            end
    end.