
1 Parallel Programming

2 Introduction
The idea has been around since the 1960s
– pseudo-parallel systems on multiprogrammable computers
True parallelism
– many processors connected to run in concert
Multiprocessor system
Distributed system
– stand-alone systems connected
– more complex, with high-speed networks

3 Programming Languages
Used to express algorithms that solve the problems presented by parallel processing systems
Used to write operating systems that implement these solutions
Used to harness the capabilities of multiple processors efficiently
Used to implement and express communication across networks

4 Two kinds of parallelism
Parallelism existing in the underlying hardware
Parallelism as expressed in the programming language
– may not result in actual parallel processing
– could be implemented with pseudo-parallelism
– concurrent programming expresses only the potential for parallelism

5 Some Basics
Process
– an instance of a program or program part that has been scheduled for independent execution
Heavyweight process
– a full-fledged independent entity, with all the memory and other resources ordinarily allocated by the OS
Lightweight process, or thread
– shares resources with the program it came from

6 Process states
Blocked – waiting for a resource
Executing – in possession of a processor
Waiting – waiting to be scheduled

7 Primary requirements for organization
There must be a way for processors to synchronize their activities
– e.g., a 1st processor inputs and sorts the data while a 2nd processor waits to perform computations on the sorted data
There must be a way for processors to communicate data among themselves
– e.g., the 2nd processor needs the sorted data
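As a hedged illustration of both requirements, here is a minimal Java sketch (the class name and data are made up): one thread sorts shared data, and join provides the synchronization the second computation needs before reading it.

import java.util.Arrays;

public class SortThenUse {
    public static void main(String[] args) throws InterruptedException {
        int[] data = {5, 3, 8, 1};                    // shared data: the communication channel
        Thread sorter = new Thread(() -> Arrays.sort(data));
        sorter.start();                               // 1st "processor" sorts the data
        sorter.join();                                // synchronization: the 2nd waits for the sort
        System.out.println("median is " + data[data.length / 2]);  // 2nd computes on sorted data
    }
}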

8 Architectures
SIMD (single-instruction, multiple-data)
– one processor acts as the controller
– all processors execute the same instructions on their respective registers or data sets
– multiprocessing
– synchronous: all processors operate at the same speed
– implicit solution to the synchronization problem
MIMD (multiple-instruction, multiple-data)
– all processors act independently
– multiprocessor or distributed-processor systems
– asynchronous: synchronization becomes a critical problem

9 Memory
Shared memory (one central memory)
– used in multiprocessor systems
Distributed memory
– each processor has its own independent memory
– communication is critical here

10 More terms
Mutual exclusion – synchronizes access to shared memory
Without mutual exclusion → race condition (e.g., interleaved modifications to the same memory)
Deadlock – processes waiting on each other to unblock
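A minimal Java sketch of a race condition (class name and counts are hypothetical): two threads increment a shared counter without mutual exclusion, so interleaved modifications lose updates.

public class RaceDemo {
    static int counter = 0;   // shared memory

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++)
                counter++;    // unsynchronized read-modify-write: the race
        };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join();  t2.join();
        // Typically prints less than 200000: increments from the two
        // threads interleave and overwrite each other. Wrapping the
        // increment in a synchronized method restores mutual exclusion.
        System.out.println(counter);
    }
}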

11 OS requirements for parallelism
A means of creating and destroying processes
A means of managing the number of processors used by processes
A mechanism for ensuring mutual exclusion on shared-memory systems
A mechanism for creating and maintaining communication channels between processors on distributed-memory systems

12 Language requirements
Machine independence
Adherence to language design principles
Some languages use the shared-memory model and provide facilities for mutual exclusion through a library
Some assume the distributed-memory model and provide communication facilities
A few include both

13 Common mechanisms
Threads
Semaphores
Monitors
Message passing

14 Two common sample problems
Bounded buffer problem
– similar to the producer-consumer problem
Parallel matrix multiplication
– an N³ algorithm
– assign a process to compute each element, each process on a separate processor → N steps
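One process per element is rarely practical, so as a hedged sketch the Java version below stripes rows across a fixed pool of threads (SIZE, NUMPROCS, and the class name are assumptions for illustration); the same striping idea appears in the C example on slide 19.

public class ParMatMul {
    static final int SIZE = 100, NUMPROCS = 10;   // illustrative constants
    static double[][] a = new double[SIZE][SIZE],
                      b = new double[SIZE][SIZE],
                      c = new double[SIZE][SIZE];

    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[NUMPROCS];
        for (int p = 0; p < NUMPROCS; p++) {
            final int id = p;
            workers[p] = new Thread(() -> {
                // each thread computes a stripe of rows; rows are
                // independent, so no mutual exclusion is needed on c
                for (int i = id; i < SIZE; i += NUMPROCS)
                    for (int j = 0; j < SIZE; j++)
                        for (int k = 0; k < SIZE; k++)
                            c[i][j] += a[i][k] * b[k][j];
            });
            workers[p].start();
        }
        for (Thread w : workers) w.join();   // wait for all stripes to finish
    }
}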

15 Without explicit language facilities
One approach is not to be explicit at all
– possible in some functional, logical, and object-oriented languages
– certain inherent parallelism is implicit
The language translator uses optimization techniques to automatically call on OS utilities that assign different processors to different parts of the program
This is generally suboptimal

16 Another alternative without explicit language facilities
The translator offers compiler options that let the programmer explicitly indicate areas where parallelism is called for
Most effective in nested loops
Example: Fortran

17

      integer a(100,100), b(100,100), c(100,100)
      integer i, j, k, numprocs, err
      numprocs = 10
C     code to read in a and b goes here
      err = m_set_procs(numprocs)
C$doacross share(a, b, c), local(j, k)
      do 10 i = 1, 100
        do 10 j = 1, 100
          c(i,j) = 0
          do 10 k = 1, 100
            c(i,j) = c(i,j) + a(i,k) * b(k,j)
10    continue
      call m_kill_procs
C     code to write out c goes here
      end

The compiler directive synchronizes the processes: all processes wait for the entire loop to finish, and one process continues after the loop.
local – local to each process
share – accessible by all processes
m_set_procs – sets the number of processes

18 A third way: explicit constructs
Provide a library of functions
This passes the facilities provided by the OS directly to the programmer
(This is effectively the same as providing them in the language)
Example: C with the library parallel.h

19

#include <parallel.h>
#define SIZE 100
#define NUMPROCS 10

shared int a[SIZE][SIZE], b[SIZE][SIZE], c[SIZE][SIZE];

void multiply(void)
{
  int i, j, k;
  for (i = m_get_myid(); i < SIZE; i += NUMPROCS)
    for (j = 0; j < SIZE; j++)
      for (k = 0; k < SIZE; k++)
        c[i][j] += a[i][k] * b[k][j];
}

main()
{
  /* code to read in a and b goes here */
  m_set_procs(NUMPROCS);
  m_fork(multiply);
  m_kill_procs();
  /* code to write out c goes here */
  return 0;
}

m_fork – creates the 10 processes, all instances of multiply (m_set_procs, as on the previous slide, only sets the number of processes)

20 A fourth, final alternative
Simply rely on the OS
Example: pipes in Unix
  ls | grep "java"
– runs ls and grep in parallel
– the output of ls is piped to grep

21 Languages with explicit mechanisms
Two basic ways to create new processes
– SPMD (single program, multiple data): split the current process into two or more that execute copies of the same program
– MPMD (multiple program, multiple data): a segment of code is associated with each new process
The typical MPMD case is the fork-join model: a process creates several child processes, each with its own code (a fork), and then waits for the children to complete their execution (a join)
The last example was similar, but m_kill_procs took the place of the join
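A minimal Java sketch of the fork-join model (the task bodies are placeholders): the parent forks two children, each with its own segment of code, then joins on both before continuing.

public class ForkJoinDemo {
    public static void main(String[] args) throws InterruptedException {
        // MPMD: each child gets its own segment of code
        Thread child1 = new Thread(() -> System.out.println("child 1: input phase"));
        Thread child2 = new Thread(() -> System.out.println("child 2: compute phase"));
        child1.start();   // the fork
        child2.start();
        child1.join();    // the join: parent waits for its children
        child2.join();
        System.out.println("parent resumes after the join");
    }
}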

22 Granularity
The size of the code assignable to separate processes
– fine-grained: statement-level parallelism
– medium-grained: procedure-level parallelism
– large-grained: program-level parallelism
Granularity can be an issue in program efficiency
– fine-grained: process overhead may dominate
– large-grained: may not exploit all opportunities for parallelism

23 Threads
A thread gives fine-grained or medium-grained parallelism without the overhead of full-blown process creation

24 Issues
Does the parent suspend execution while its child processes are executing, or does it continue to execute alongside them?
What memory, if any, does a parent share with its children, or the children share among themselves?

25 Answers in the last example
The parent process suspends execution while the children run
The memory shared by all processes – the global arrays, marked shared – is indicated explicitly

26 Process Termination
Simplest case
– a process executes its code to completion and then ceases to exist
More complex case
– a process may need to continue executing until a certain condition is met, and only then terminate
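As a hedged Java sketch of the complex case (the names are made up): the worker keeps executing until a condition, signalled through a volatile flag, is met, and only then terminates.

public class StopOnCondition {
    static volatile boolean done = false;   // the termination condition

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!done) {
                // keep working until the condition is met
            }
            System.out.println("condition met, terminating");
        });
        worker.start();
        Thread.sleep(100);   // some time later, the condition becomes true
        done = true;
        worker.join();
    }
}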

27 Statement-Level Parallelism (parbegin/parend)

parbegin
  S1;
  S2;
  …
  Sn;
parend;

(a generic parbegin/parend block; this construct is not actual Ada syntax)

28 Statement-Level Parallelism (Fortran95)

FORALL (I = 1:100, J = 1:100)
  C(I,J) = 0
  DO 10 K = 1, 100
    C(I,J) = C(I,J) + A(I,K) * B(K,J)
10 CONTINUE
END FORALL

29 Procedure-Level Parallelism

x = newprocess(p);
…
killprocess(x);

where p is a declared procedure and x is a process designator; this is similar to tasks in Ada

30 Program-Level Parallelism (Unix)
fork creates a process that is an exact copy of the calling process

#include <unistd.h>

if (fork() == 0) {
  /* ... child executes this part */
} else {
  /* ... parent executes this part */
}

A return value of 0 indicates that the process is the child.

31 Java threads
Threads are built into Java: the Thread class is part of the java.lang package
The reserved word synchronized establishes mutual exclusion
To use a thread, create an instance of a Thread object and define the run method that will execute when the thread starts

32 Java threads
There are two ways to create a thread (the second, more versatile way is shown here; a sketch of the first follows below)
– Define a class that implements the Runnable interface (i.e., define its run method)
– Then pass an object of this class to the Thread constructor
Note: every Java program is already executing inside a thread whose run method is main.
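For completeness, a minimal sketch of the first way (the class name is made up): subclass Thread and override run directly. It is less versatile because Java allows only single inheritance.

class MyThread extends Thread {
    @Override
    public void run() {
        System.out.println("running in " + getName());
    }

    public static void main(String[] args) {
        new MyThread().start();   // start() schedules run() in a new thread
    }
}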

33 Java Thread Example

class MyRunner implements Runnable {
  public void run() { … }
}

MyRunner m = new MyRunner();
Thread t = new Thread(m);
t.start(); // t will now execute the run method

34 Destroying threads
Let each thread run to completion, waiting for other threads to finish:

t.start();
// do some other work
t.join();  // wait for t to finish

Or interrupt it:

t.start();
// do some other work
t.interrupt();  // tell t we are waiting
t.join();       // wait for t to finish
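To show the interrupt path end to end, a hedged, runnable sketch (the sleep stands in for real work): interrupting the sleeping thread raises InterruptedException, which lets its run method finish early so the join returns promptly.

public class InterruptDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(60_000);   // stand-in for long-running work
            } catch (InterruptedException e) {
                System.out.println("interrupted, finishing early");
            }
        });
        t.start();
        // do some other work
        t.interrupt();   // tell t we are done waiting
        t.join();        // wait for t to finish
    }
}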

35 Mutual exclusion

class Queue {
  …
  synchronized public Object dequeue() {
    if (empty()) throw …
    …
  }
  synchronized public Object enqueue(Object obj) { … }
  …
}

36 Mutual exclusion

class Remover implements Runnable {
  public Remover(Queue q) { … }
  public void run() { … q.dequeue() … }
}

class Inserter implements Runnable {
  public Inserter(Queue q) { … }
  public void run() { … q.enqueue(…) … }
}

37 Mutual exclusion

Queue myqueue = new Queue(…);
…
Remover r = new Remover(myqueue);
Inserter i = new Inserter(myqueue);
Thread t1 = new Thread(r);
Thread t2 = new Thread(i);
t1.start();
t2.start();

38 Manually stalling a thread and then reawakening it

class Queue {
  …
  synchronized public Object dequeue() {
    try {
      while (empty()) wait();
    } catch (InterruptedException e) {
      // reset interrupt
      …
    }
    …
  }
  synchronized public Object enqueue(Object obj) {
    …
    notifyAll();
  }
  …
}
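Tying slides 35-38 together, a complete, hedged producer-consumer sketch (the LinkedList backing store and class names are assumptions, not the slides' own code): the remover stalls in wait() until the enqueue's notifyAll reawakens it.

import java.util.LinkedList;

public class BoundedBufferDemo {
    static class Queue {
        private final LinkedList<Object> items = new LinkedList<>();

        synchronized public Object dequeue() throws InterruptedException {
            while (items.isEmpty())
                wait();                  // stall until an item arrives
            return items.removeFirst();
        }

        synchronized public void enqueue(Object obj) {
            items.addLast(obj);
            notifyAll();                 // reawaken any stalled consumers
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Queue q = new Queue();
        Thread remover = new Thread(() -> {
            try {
                System.out.println("removed: " + q.dequeue());
            } catch (InterruptedException e) { /* exit */ }
        });
        remover.start();                 // blocks in wait() until the enqueue below
        q.enqueue("item");
        remover.join();
    }
}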

