
1 Multithreaded and Distributed Programming -- Classes of Problems ECEN5053 Software Engineering of Distributed Systems University of Colorado Foundations of Multithreaded, Parallel, and Distributed Programming, Gregory R. Andrews, Addison-Wesley, 2000

2 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado2 The Essence of Multiple Threads -- review  Two or more processes that work together to perform a task  Each process is a sequential program  One thread of control per process  Communicate using shared variables  Need to synchronize with each other, 1 of 2 ways  Mutual exclusion  Condition synchronization
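As a quick illustration (not from the course materials), a minimal Java sketch of the two forms of synchronization reviewed above: the synchronized keyword gives mutual exclusion, and wait/notifyAll give condition synchronization. The Counter class and its threshold are invented for the example.

    // Sketch only: the two synchronization forms in Java.
    class Counter {
        private int value = 0;

        // Mutual exclusion: at most one thread executes increment() at a time.
        synchronized void increment() {
            value++;
            notifyAll();               // wake threads waiting on a condition
        }

        // Condition synchronization: wait until the state satisfies a predicate.
        synchronized void awaitAtLeast(int n) throws InterruptedException {
            while (value < n) {        // re-check the guard after every wakeup
                wait();
            }
        }
    }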

3 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado3 Opportunities & Challenges  What kinds of processes to use  How many parts or copies  How they should interact  Key to developing a correct program is to ensure the process interaction is properly synchronized

4 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado4 Focus in this course  Imperative programs  Programmer has to specify the actions of each process and how they communicate and synchronize. (Java, Ada)  Declarative programs (not our focus)  Written in languages designed for the purpose of making synchronization and/or concurrency implicit  Require machine to support the languages, for example, “massively parallel machines.”  Asynchronous process execution  Shared memory, distributed memory, networks of workstations (message-passing)

5 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado5 Multiprocessing monkey wrench  The solutions we addressed last semester presumed a single CPU and therefore the concurrent processes share coherent memory  A multiprocessor environment with shared memory introduces cache and memory consistency problems and overhead to manage it.  A distributed-memory multiprocessor/multicomputer/network environment has additional issues of latency, bandwidth, administration, security, etc.

6 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado6 Recall from multiprogram systems  A process is a sequential program that has its own thread of control when executed  A concurrent program contains multiple processes so every concurrent program has multiple threads, one for each process.  Multithreaded usually means a program contains more processes than there are processors to execute them  A multithreaded software system manages multiple independent activities

7 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado7 Why write as multithreaded?  To be cool (wrong reason)  Sometimes, it is easier to organize the code and data as a collection of processes than as a single huge sequential program  Each process can be scheduled and executed independently  Other applications can continue to execute “in the background”

8 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado8 Many applications, 5 basic paradigms  Iterative parallelism  Recursive parallelism  Producers and consumers (pipelines)  Clients and servers  Interacting peers Each of these can be accomplished in a distributed environment. Some can be used in a single CPU environment.

9 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado9 Iterative parallelism  Example?  Several, often identical processes  Each contains one or more loops  Therefore each process is iterative  They work together to solve a single problem  Communicate and synchronize using shared variables  Independent computations – disjoint write sets

10 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado10 Recursive parallelism  One or more independent recursive procedures  Recursion is the dual of iteration  Procedure calls are independent – each works on different parts of the shared data  Often used in imperative languages for  Divide and conquer algorithms  Backtracking algorithms (e.g. tree-traversal)  Used to solve combinatorial problems such as sorting, scheduling, and game playing  If too many recursive procedures, we prune.

11 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado11 Producers and consumers  One-way communication between processes  Often organized into a pipeline through which info flows  Each process is a filter that consumes the output of its predecessor and produces output for its successor  That is, a producer-process computes and outputs a stream of results  Sometimes implemented with a shared bounded buffer as the pipe, e.g. Unix stdin and stdout  Synchronization primitives: flags, semaphores, monitors
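For illustration, a minimal Java sketch of a producer/consumer pipeline built on a shared bounded buffer; java.util.concurrent.ArrayBlockingQueue plays the role of the pipe, and the "double each value" filter stage is an invented example rather than anything from the course.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Sketch of a two-stage pipeline: producer -> bounded buffer -> consumer.
    public class Pipeline {
        public static void main(String[] args) {
            BlockingQueue<Integer> pipe = new ArrayBlockingQueue<>(10);  // bounded buffer

            Thread producer = new Thread(() -> {
                try {
                    for (int i = 0; i < 100; i++) {
                        pipe.put(i);                // blocks if the buffer is full
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            Thread consumer = new Thread(() -> {
                try {
                    for (int i = 0; i < 100; i++) {
                        int x = pipe.take();        // blocks if the buffer is empty
                        System.out.println(2 * x);  // the invented "filter" stage
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            producer.start();
            consumer.start();
        }
    }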

12 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado12 Clients & Servers  Producer/consumer -- one-way flow of information  independent processes with own rates of progress  Client/server relationship is most common pattern  Client process requests a service & waits for reply  Server repeatedly waits for a request; then acts upon it and sends a reply.  Two-way flow of information

13 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado13 Distributed “procedures” and “calls”  Client and server relationship is the concurrent programming analog of the relationship between the caller of a subroutine and the subroutine itself.  Like a subroutine that can be called from many places, the server has many clients.  Each client request must be handled independently  Multiple requests might be handled concurrently

14 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado14 Common example  Common example of client/server interactions in operating systems, OO systems, networks, databases, and others -- reading and writing a data file.  Assume file server module provides 2 ops: read and write; client process calls one or other.  Single CPU or shared-memory system:  File server implemented as set of subroutines and data structures that represent files  Interaction between client process and a file typically implemented by subroutine calls

15 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado15 Client/Server example  If the file is shared  Probably must be written to by at most one client process at a time  Can safely be read concurrently by multiple clients  Example of what is called the readers/writers problem

16 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado16 Readers/Writers -- many facets  Has a classic solution using mutexes (in chapter 2 last semester) when viewed as a mutual exclusion problem  Can also be solved with  a condition synchronization solution  different scheduling policies  Distributed system solutions include  with encapsulated database  with replicated files  just remote procedure calls & local synchronization  just rendezvous
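The chapter-2 mutex solution is not reproduced here, but as a hedged sketch, Java's ReentrantReadWriteLock expresses the same discipline: many concurrent readers, at most one writer. The SharedFile class and its String contents are placeholders invented for the example.

    import java.util.concurrent.locks.ReentrantReadWriteLock;

    // Sketch: readers share the lock, writers hold it exclusively.
    class SharedFile {
        private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
        private String contents = "";

        String read() {
            rw.readLock().lock();          // many readers may hold this at once
            try {
                return contents;
            } finally {
                rw.readLock().unlock();
            }
        }

        void write(String newContents) {
            rw.writeLock().lock();         // excludes readers and other writers
            try {
                contents = newContents;
            } finally {
                rw.writeLock().unlock();
            }
        }
    }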

17 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado17 Consider a query on the WWW  A user opens a new URL within a Web browser  The Web browser is a client process that executes on a user’s machine.  The URL indirectly specifies another machine on which the Web page resides.  The Web page itself is accessed by a server process that executes on the other machine.  May already exist; may be created  Reads the page specified by the URL  Returns it to the client’s machine  Add’l server processes may be visited or created at intermediate machines along the way

18 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado18 Clients/Servers -- on same or separate  Clients are processes regardless of # machines  Server  On a shared-memory machine is a collection of subroutines  With a single CPU, programmed using  mutual exclusion to protect critical sections  condition synchronization to ensure subroutines are executed in appropriate orders  Distributed-memory or network -- processes executing on different machine than clients  Often multithreaded with one thread per client

19 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado19 Communication in client/server app  Shared memory --  servers as subroutines;  use semaphores or monitors for synchronization  Distributed --  servers as processes  communicate with clients using  message passing  remote procedure call (remote method inv.)  rendezvous

20 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado20 Interacting peers  Occurs in distributed programs, not single CPU  Several processes that accomplish a task  executing the copies of same code (hence, “peers”)  exchanging messages  example: distributed matrix multiplication  Used to implement  Distributed parallel programs including distributed versions of iterative parallelism  Decentralized decision making

21 Among the 5 paradigms are certain characteristics common to distributed environments: distributed memory, properties of parallel applications, and concurrent computation.

22 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado22 Distributed memory implications  Each processor can access only its own local memory  Program cannot use global variables  Every variable must be local to some process or procedure and can be accessed only by that process or procedure  Processes have to use message passing to communicate with each other

23 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado23 Example of a parallel application  Remember concurrent matrix multiplication in a shared memory environment -- last semester? Sequential solution first:
for [i = 0 to n-1] {
  for [j = 0 to n-1] {
    # compute inner product of a[i,*] and b[*,j]
    c[i,j] = 0.0;
    for [k = 0 to n-1]
      c[i,j] = c[i,j] + a[i,k] * b[k,j];
  }
}

24 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado24 Properties of parallel applications  Two operations can be executed in parallel if they are independent.  Read set contains variables it reads but does not alter  Write set contains variables it alters (and possibly also reads)  Two operations are independent if the write set of each is disjoint from both the read and write sets of the other.

25 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado25 Concurrent computation Computing rows of the result matrix in parallel:
cobegin [i = 0 to n-1] {
  for [j = 0 to n-1] {
    c[i,j] = 0.0;
    for [k = 0 to n-1]
      c[i,j] = c[i,j] + a[i,k] * b[k,j];
  }
}  # coend
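A hedged Java approximation of cobegin: start one thread per value of i and join them all (the join loop plays the role of coend). The method below assumes a, b, and c are pre-allocated n x n arrays; it is a sketch of the idea, not the textbook's notation.

    // Sketch: one thread per row of c; threads write disjoint rows.
    static void parallelMultiply(double[][] a, double[][] b, double[][] c, int n)
            throws InterruptedException {
        Thread[] workers = new Thread[n];
        for (int i = 0; i < n; i++) {
            final int row = i;
            workers[i] = new Thread(() -> {
                for (int j = 0; j < n; j++) {
                    c[row][j] = 0.0;
                    for (int k = 0; k < n; k++) {
                        c[row][j] += a[row][k] * b[k][j];
                    }
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            w.join();                      // "coend": wait for every row to finish
        }
    }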

26 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado26 Differences: sequential vs. concurrent  Syntactic:  cobegin is used in place of for in the outermost loop  Semantic:  cobegin specifies that its body should be executed concurrently -- at least conceptually -- for each value of index i.

27 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado27  Previous example implemented matrix multiplication using shared variables  Now -- two ways using message passing as means of communication  1. Coordinator process & array of independent worker processes  2. Workers are peer processes that interact by means of a circular pipeline

28 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado28 [Diagram: coordinator/worker arrangement -- the coordinator sends data to workers 0..n-1 and collects results from them; peer arrangement -- workers 0..n-1 exchange messages directly with one another.]

29 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado29  Assume n processors for simplicity  Use an array of n worker processes, one worker on each processor; each worker computes one row of the result matrix
process worker[i = 0 to n-1] {
  double a[n];     # row i of matrix a
  double b[n,n];   # all of matrix b
  double c[n];     # row i of matrix c
  receive initial values for vector a and matrix b;
  for [j = 0 to n-1] {
    c[j] = 0.0;
    for [k = 0 to n-1]
      c[j] = c[j] + a[k] * b[k,j];
  }
  send result vector c to the coordinator;
}

30 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado30  Aside -- if not standalone:  The source matrices might be produced by a prior computation and the result matrix might be input to a subsequent computation.  Example of distributed pipeline.

31 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado31 Role of coordinator  Initiates the computation and gathers and prints the results.  First sends each worker the appropriate row of a and all of b.  Waits to receive row of c from every worker.

32 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado32
process coordinator {
  # source matrices a and b and result matrix c are declared here
  initialize a and b;
  for [i = 0 to n-1] {
    send row i of a to worker[i];
    send all of b to worker[i];
  }
  for [i = 0 to n-1]
    receive row i of c from worker[i];
  print results, which are now in matrix c;
}

33 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado33 Message passing primitives  Send packages up a message and transmits it to another process  Receive waits for a message from another process and stores it in local variables.
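Java provides no built-in send/receive, but a channel can be sketched with a BlockingQueue: put plays the role of send and take the role of receive. The Channel class below is a hypothetical helper introduced for illustration, not a JDK API.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // Sketch: a channel whose operations mirror the slide's primitives.
    class Channel<T> {
        private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();

        void send(T message) throws InterruptedException {
            queue.put(message);            // package up the message and transmit it
        }

        T receive() throws InterruptedException {
            return queue.take();           // block until a message arrives
        }
    }

A coordinator/worker program like the one on the previous slide would then use one such channel per worker for the data plus one shared channel for the results.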

34 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado34 Peer approach Each worker has one row of a & is to compute one row of c Each worker has only one column of b at a time instead of the entire matrix Worker i has column i of matrix b. With this much source data, worker i can compute only the result for c[i, i]. For worker i to compute all of row i of matrix c, it must acquire all columns of matrix b. We circulate the columns of b among the worker processes via the circular pipeline Each worker executes a series of rounds in which it sends its column of b to the next worker and receives a different column of b from the previous worker

35 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado35  See handout  Each worker executes the same algorithm  Communicates with other workers in order to compute its part of the desired result.  In this case, each worker communicates with just two neighbors  In other cases of interacting peers, each worker communicates with all the others.

36 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado36 Worker algorithm
process worker[i = 0 to n-1] {
  double a[n];        # row i of matrix a
  double b[n];        # one column of matrix b
  double c[n];        # row i of matrix c
  double sum = 0.0;   # storage for inner products
  int nextCol = i;    # next column of results

37 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado37 Worker algorithm (cont.)
  receive row i of matrix a and column i of matrix b;
  # compute c[i,i] = a[i,*] x b[*,i]
  for [k = 0 to n-1]
    sum = sum + a[k] * b[k];
  c[nextCol] = sum;
  # circulate columns and compute rest of c[i,*]
  for [j = 1 to n-1] {
    send my column of b to the next worker;
    receive a new column of b from the previous worker;

38 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado38 Worker algorithm (cont. 2)
    sum = 0.0;
    for [k = 0 to n-1]
      sum = sum + a[k] * b[k];
    if (nextCol == 0)
      nextCol = n-1;
    else
      nextCol = nextCol - 1;
    c[nextCol] = sum;
  }
  send result vector c to coordinator process;
}
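As a hedged sketch only (the handout's actual code is not reproduced here), the circulating-pipeline worker might look like this in Java, with one incoming mailbox per worker; the class names and wiring are assumptions made for illustration.

    import java.util.List;
    import java.util.concurrent.BlockingQueue;

    // Sketch of one peer worker: links.get(i) is worker i's incoming mailbox.
    class PeerWorker implements Runnable {
        private final int i, n;
        private final double[] a;     // row i of matrix a
        private double[] col;         // current column of b (starts as column i)
        private final double[] c;     // row i of the result
        private final List<BlockingQueue<double[]>> links;

        PeerWorker(int i, int n, double[] a, double[] col, List<BlockingQueue<double[]>> links) {
            this.i = i; this.n = n; this.a = a; this.col = col;
            this.c = new double[n]; this.links = links;
        }

        public void run() {
            try {
                int nextCol = i;
                c[nextCol] = inner(a, col);               // c[i,i] from the column we start with
                for (int round = 1; round < n; round++) {
                    links.get((i + 1) % n).put(col);      // send my column to the next worker
                    col = links.get(i).take();            // receive a column from the previous worker
                    nextCol = (nextCol == 0) ? n - 1 : nextCol - 1;
                    c[nextCol] = inner(a, col);
                }
                // here: send result vector c to the coordinator (omitted in this sketch)
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        private static double inner(double[] x, double[] y) {
            double sum = 0.0;
            for (int k = 0; k < x.length; k++) sum += x[k] * y[k];
            return sum;
        }
    }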

39 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado39 Comparisons  In the first program, the values of matrix b are replicated in every worker  In the second, each worker has one row of a and one column of b at any point in time  The first requires more memory but executes faster  This is a classic time/space tradeoff.

40 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado40 Summary  Concurrent programming paradigms in a shared-memory environment  Iterative parallelism  Recursive parallelism  Producers and consumers  Concurrent programming paradigms in a distributed-memory environment  Client/server  Interacting peers

41 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado41 Shared-memory programming

42 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado42 Shared-Variable Programming  Frowned on in sequential programs, although convenient (“global variables”)  Absolutely necessary in concurrent programs  Must communicate to work together

43 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado43 Need to communicate  Communication fosters need for synchronization  Mutual exclusion – need to not access shared data at the same time  Condition synchronization – one needs to wait for another  Communicate in distributed environment via messages, remote procedure call, or rendezvous

44 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado44 Some terms  State – values of the program variables at a point in time, both explicit and implicit. Each process in a program executes independently and, as it executes, examines and alters the program state.  Atomic actions -- A process executes sequential statements. Each statement is implemented at the machine level by one or more atomic actions that indivisibly examine or change program state.  Concurrent program execution interleaves sequences of atomic actions. A history is a trace of a particular interleaving.

45 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado45 Terms -- continued  The next atomic action in any ONE of the processes could be the next one in a history. So there are many ways actions can be interleaved and conditional statements allow even this to vary.  The role of synchronization is to constrain the possible histories to those that are desirable.  Mutual exclusion combines atomic actions into sequences of actions called critical sections where the entire section appears to be atomic.

46 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado46 Terms – continued further  Property of a program is an attribute that is true of every possible history.  Safety – never enters a bad state  Liveness – the program eventually enters a good state

47 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado47 How can we verify?  How do we demonstrate a program satisfies a property?  A dynamic execution of a test considers just one possible history  Limited number of tests unlikely to demonstrate the absence of bad histories  Operational reasoning -- exhaustive case analysis  Assertional reasoning – abstract analysis  Atomic actions are predicate transformers

48 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado48 Assertional Reasoning  Use assertions to characterize sets of states  Allows a compact representation of states and their transformations  More on this later in the course

49 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado49 Warning  We must be wary of dynamic testing alone  it can reveal only the presence of errors, not their absence.  Concurrent and distributed programs are difficult to test & debug  Difficult (impossible) to stop all processes at once in order to examine their state!  Each execution in general will produce a different history

50 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado50 Why synchronize?  If processes do not interact, all interleavings are acceptable.  If processes do interact, only some interleavings are acceptable.  Role of synchronization: prevent unacceptable interleavings  Combine fine-grain atomic actions into coarse-grained composite actions (we call this....what?)  Delay process execution until program state satisfies some predicate

51 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado51  Unconditional atomic action  does not contain a delay condition  can execute immediately as long as it executes atomically (not interleaved)  examples:  individual machine instructions  expressions we place in angle brackets  await statements where guard condition is constant true or is omitted

52 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado52  Conditional atomic action - await statement with a guard condition  If condition is false in a given process, it can only become true by the action of other processes.  How long will the process wait if it has a conditional atomic action?

53 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado53 How to implement synchronization  To implement mutual exclusion  Implement atomic actions in software using locks to protect critical sections  Needed in most concurrent programs  To implement conditional synchronization  Implement synchronization point that all processes must reach before any process is allowed to proceed -- barrier  Needed in many parallel programs -- why?
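The barrier mentioned above maps naturally onto java.util.concurrent.CyclicBarrier. A small sketch (the per-phase work is a placeholder) in which no worker starts the next phase until all workers have finished the current one:

    import java.util.concurrent.BrokenBarrierException;
    import java.util.concurrent.CyclicBarrier;

    // Sketch: NUM_WORKERS threads, each must reach the barrier before any proceeds.
    public class BarrierDemo {
        static final int NUM_WORKERS = 4;

        public static void main(String[] args) {
            CyclicBarrier barrier = new CyclicBarrier(NUM_WORKERS);
            for (int id = 0; id < NUM_WORKERS; id++) {
                final int worker = id;
                new Thread(() -> {
                    try {
                        for (int phase = 0; phase < 3; phase++) {
                            // ... compute this worker's share of the current phase ...
                            System.out.println("worker " + worker + " done with phase " + phase);
                            barrier.await();   // wait until all workers reach this point
                        }
                    } catch (InterruptedException | BrokenBarrierException e) {
                        Thread.currentThread().interrupt();
                    }
                }).start();
            }
        }
    }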

54 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado54 Desirable Traits and Bad States  Mutual exclusion -- at most one process at a time is executing its critical section  its bad state is one in which two processes are in their critical section  Absence of Deadlock (livelock) -- If 2 or more processes are trying to enter their critical sections, at least one will succeed.  its bad state is one in which all the processes are waiting to enter but none is able to do so  two more on next slide

55 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado55 Desirable Traits and Bad states (cont.)  Absence of Unnecessary Delay -- If a process is trying to enter its c.s. and the other processes are executing their noncritical sections or have terminated, the first process is not prevented from entering its c.s.  Bad state is one in which the one process that wants to enter cannot do so, even though no other process is in the c.s.  Eventual entry -- process that is attempting to enter its c.s. will eventually succeed.  liveness property, depends on scheduling policy

56 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado56 Logical property of mutual exclusion  When process1 is in its c.s., property1 is true.  Similarly, property2 is true when process2 is in its c.s.  A bad state is one in which property1 and property2 are both true at the same time  Therefore we want every state to satisfy the negation of the bad state --  mutex: NOT(property1 AND property2)  This needs to be a global invariant: true in the initial state and after each event that affects property1 or property2

57 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado57 Coarse-grain solution
bool property1 = false, property2 = false;
# mutex: NOT(property1 AND property2) -- global invariant

process process1 {
  while (true) {
    <await (NOT property2) property1 = true;>   # entry protocol
    critical section;
    property1 = false;                          # exit protocol
    noncritical section;
  }
}

process process2 {
  while (true) {
    <await (NOT property1) property2 = true;>   # entry protocol
    critical section;
    property2 = false;                          # exit protocol
    noncritical section;
  }
}
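Java has no await statement, but the guarded entry protocol can be approximated with a monitor: wait in a loop until the guard holds, set the flag, and notify waiters on exit. This is a sketch of the idea only, not an efficient lock; the class and method names are invented.

    // Sketch: the coarse-grain mutex protocol expressed as a Java monitor.
    // enter(me) plays the role of <await (NOT other) mine = true;>.
    class CoarseGrainMutex {
        private final boolean[] inCs = new boolean[2];   // property1, property2

        synchronized void enter(int me) throws InterruptedException {
            int other = 1 - me;
            while (inCs[other]) {       // guard: wait while the other is in its c.s.
                wait();
            }
            inCs[me] = true;
        }

        synchronized void exit(int me) {
            inCs[me] = false;
            notifyAll();                // let a waiting process re-check its guard
        }
    }

A process would then bracket its critical section with enter(id) and exit(id).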

58 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado58 Does it avoid the problems?  Deadlock: if each process were blocked in its entry protocol, then both property1 and property2 would have to be true. Both are false at this point in the code.  Unnecessary delay: One process blocks only if the other one is not in its c.s.  Liveness -- see next slide

59 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado59 Liveness guaranteed?  Liveness property -- process trying to enter its critical section eventually is able to do so  If process1 trying to enter but cannot, then property2 is true;  therefore process2 is in its c.s. which eventually exits making property2 false; allows process1’s guard to become true  If process1 still not allowed entry, it’s because the scheduler is unfair or because process2 again gains entry -- (happens infinitely often?)  Strongly-fair scheduler required, not likely.

60 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado60 Three “spin lock” solutions  A “spin lock” solution uses busy-waiting  Ensure mutual exclusion, are deadlock free, and avoid unnecessary delay  Require a fairly strong scheduler to ensure eventual entry  Do not control the order in which delayed processes enter their c.s.’s when >= 2 try  Busy-waiting solutions were tolerated on a single CPU when the critical section was bounded.  What about busy-waiting solutions in a distributed environment? Is there such a thing?
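For reference, a minimal test-and-set spin lock in Java using AtomicBoolean; this is exactly the kind of busy-waiting solution the slide describes, shown here as an illustrative sketch rather than a recommended implementation.

    import java.util.concurrent.atomic.AtomicBoolean;

    // Sketch of a busy-waiting ("spin lock") mutex built on test-and-set.
    class SpinLock {
        private final AtomicBoolean locked = new AtomicBoolean(false);

        void lock() {
            // Spin until we atomically change locked from false to true.
            while (!locked.compareAndSet(false, true)) {
                Thread.onSpinWait();    // hint that we are busy-waiting (Java 9+)
            }
        }

        void unlock() {
            locked.set(false);
        }
    }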

61 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado61 Distributed-memory programming

62 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado62 Distributed-memory architecture  Synchronization constructs we examined last semester were based on reading and writing shared variables.  In distributed architectures, processors  have their own private memory  interact using a communication network  without a shared memory, must exchange messages

63 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado63 Necessary first steps to write programs for a dist.-memory arch. 1. Define the interfaces with the communication network  If they were read and write ops like those that operate on shared variables, programs would have to employ busy-waiting synchronization. Why?  Better to define special network operations that include synchronization -- message passing primitives

64 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado64 Necessary first steps to write programs for a dist.-memory arch. -- cont. 2. Message passing can be viewed as extending semaphores to convey data as well as to provide synchronization 3. Processes share channels -- a channel is a communication path

65 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado65 Characteristics  Distributed program may be  distributed across the processors of a distributed-memory architecture  can be run on a shared-memory multiprocessor (Just like a concurrent program can be run on a single, multiplexed processor.)  Channels are the only items that processes share in a distributed program  Each variable is local to one process

66 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado66 Implications of no shared variables  Variables are never subject to concurrent access  No special mechanism for mutual exclusion is required  Processes must communicate in order to interact  Main concern of distributed programming is synchronizing interprocess communication  How this is done depends on the pattern of process interaction

67 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado67 Patterns of process interaction  Vary in way channels are named and used  Vary in way communication is synchronized  We’ll look at asynchronous and synchronous message passing, remote procedure calls, and rendezvous.  Equivalent: a program written using one set of primitives can be rewritten using any of the others  However: message passing is best for programming producers and consumers and interacting peers;  RPC and rendezvous best for programming clients and servers

68 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado68 How related [Diagram: relationships among the synchronization mechanisms -- busy waiting, semaphores, monitors, message passing, RPC, and rendezvous.]

69 revised 9/8/2002 ECEN5053 SW Eng of Distributed Systems, University of Colorado69 Match Examples with Paradigms and Process Interaction categories  ATM  Web-based travel site  Stock transaction processing system  Search service

