COSC513 Operating System Research Paper Fundamental Properties of Programming for Parallelism Student: Feng Chen (134192)

1 COSC513 Operating System Research Paper Fundamental Properties of Programming for Parallelism Student: Feng Chen (134192)

2 Conditions of Parallelism
Progress is needed in three key areas: computation models, inter-processor communication, and system integration.
Tradeoffs exist among time, space, performance, and cost factors.

3 Data and resource dependences
Flow dependence: an execution path exists from S1 to S2, and at least one output of S1 feeds in as input to S2.
Antidependence: S2 follows S1 in program order, and the output of S2 overlaps the input to S1.
Output dependence: S1 and S2 produce (write) the same output variable.
I/O dependence: the same file is referenced by more than one I/O statement.
Unknown dependence: the dependence cannot be determined, for example when a subscript is itself subscripted (indirect addressing), when the subscript does not contain the loop index variable, or when the subscript is nonlinear in the loop index.

4 Example of data dependence
S1: Load R1, A     / move mem(A) to R1
S2: Add R2, R1     / R2 = (R1) + (R2)
S3: Move R1, R3    / move (R3) to R1
S4: Store B, R1    / move (R1) to mem(B)
S2 is flow-dependent on S1.
S3 is antidependent on S2.
S3 is output-dependent on S1.
S2 and S4 are totally independent.
S4 is flow-dependent on S1 and S3.
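The relations listed for S1 to S4 can be checked mechanically. The following is a minimal sketch, assuming each statement is summarized by a read set and a write set derived from the comments in the example; the names `STMTS` and `dependences` are hypothetical. It compares statements pairwise only and does not account for an intervening write that overwrites a value, so it is a conservative check rather than a full dependence analysis.

```python
from itertools import combinations

# (reads, writes) for each statement, taken from the example:
# S1 reads mem(A) and writes R1, S2 reads R1 and R2 and writes R2, etc.
STMTS = {
    "S1": ({"A"}, {"R1"}),
    "S2": ({"R1", "R2"}, {"R2"}),
    "S3": ({"R3"}, {"R1"}),
    "S4": ({"R1"}, {"B"}),
}


def dependences(stmts):
    """Return (later, kind, earlier) triples for every ordered pair of statements."""
    found = []
    # combinations() keeps dict (program) order, so `a` always precedes `b`.
    for (a, (reads_a, writes_a)), (b, (reads_b, writes_b)) in combinations(stmts.items(), 2):
        if writes_a & reads_b:
            found.append((b, "flow", a))    # b reads what a wrote
        if reads_a & writes_b:
            found.append((b, "anti", a))    # b overwrites what a read
        if writes_a & writes_b:
            found.append((b, "output", a))  # both write the same location
    return found


for later, kind, earlier in dependences(STMTS):
    print(f"{later} is {kind}-dependent on {earlier}")
```

No triple is reported for the pair S2 and S4, which matches the statement that they are independent.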

5 Example of I/O dependence
S1: Read(4), A(i)     / read array A from tape unit 4
S2: Rewind(4)         / rewind tape unit 4
S3: Write(4), B(i)    / write array B into tape unit 4
S4: Rewind(4)         / rewind tape unit 4
S1 and S3 are I/O-dependent on each other.
This relation should not be violated during execution; otherwise, errors occur.

6 Control dependence
Control dependence is the situation where the order of execution of statements cannot be determined before run time.
Different paths taken after a conditional branch may change data dependences.
Control dependence may exist between operations performed in successive iterations of a loop.
Control dependence often prohibits parallelism from being exploited.

7 Example of control dependence
Successive iterations of this loop are control-independent:
For (I=0; I

8 Example of control dependence
The following loop has control-dependent iterations:
For (I=1; I
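The two loop listings above are cut off in this transcript, so the sketch below is an illustration of the distinction and not a reconstruction of the original code. It assumes two hypothetical loops: one whose branch tests only the current element, and one whose branch in iteration i tests a value that iteration i-1 may have changed. Running the iterations in a different order is used here as a stand-in for parallel execution.

```python
def independent_loop(c, order):
    """Each iteration branches only on its own element, so order does not matter."""
    a = [0] * len(c)
    for i in order:
        a[i] = c[i]
        if a[i] < 0:
            a[i] = 1
    return a


def dependent_loop(a, order):
    """Iteration i branches on a[i-1], which iteration i-1 may have modified."""
    a = list(a)
    for i in order:
        if a[i - 1] == 0:
            a[i] = 0
    return a


c = [3, -2, 5, -7]
forward = list(range(len(c)))
print(independent_loop(c, forward))        # [3, 1, 5, 1]
print(independent_loop(c, forward[::-1]))  # [3, 1, 5, 1], same result in reverse order

a0 = [0, 4, 4, 4]
print(dependent_loop(a0, [1, 2, 3]))  # [0, 0, 0, 0]
print(dependent_loop(a0, [3, 2, 1]))  # [0, 0, 4, 4], the result depends on the order
```

In the second loop the outcome of each branch is only known once the previous iteration has finished, which is why such iterations cannot simply be distributed across processors.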

9 Resource dependence
Resource dependence is concerned with conflicts in using shared resources, such as integer units, floating-point units, registers, and memory areas.
ALU dependence: the ALU is the conflicting resource.
Storage dependence: each task must work on independent storage locations or use protected access to share a writable memory area.
Detection of parallelism requires a check of the various dependence relations.

10 Bernstein’s conditions for parallelism
Define $I_i$ as the input set of a process $P_i$ and $O_i$ as its output set.
$P_1$ and $P_2$ can execute in parallel (denoted $P_1 \parallel P_2$) under the conditions:
$I_1 \cap O_2 = \emptyset$
$I_2 \cap O_1 = \emptyset$
$O_1 \cap O_2 = \emptyset$
Note that $I_1 \cap I_2 \neq \emptyset$ does not prevent parallelism.

11 Bernstein’s conditions for parallelism
Input set: also called the read set or domain of a process.
Output set: also called the write set or range of a process.
A set of processes can execute in parallel if Bernstein’s conditions are satisfied on a pairwise basis; that is, $P_1 \parallel P_2 \parallel \cdots \parallel P_K$ if and only if $P_i \parallel P_j$ for all $i \neq j$.

12 Bernstein’s conditions for parallelism
The parallelism relation is commutative: $P_i \parallel P_j$ implies $P_j \parallel P_i$.
The relation is not transitive: $P_i \parallel P_j$ and $P_j \parallel P_k$ do not necessarily mean $P_i \parallel P_k$.
Associativity: $P_i \parallel P_j \parallel P_k$ implies that $(P_i \parallel P_j) \parallel P_k = P_i \parallel (P_j \parallel P_k)$.
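The conditions and the properties above can be sketched in a few lines. This is a minimal illustration, assuming a process is modeled as a pair of Python sets (input set, output set); the function name `bernstein_parallel` and the sample processes are hypothetical and do not come from the slides.

```python
def bernstein_parallel(p, q):
    """True if processes p and q satisfy all three of Bernstein's conditions."""
    (in_p, out_p), (in_q, out_q) = p, q
    return (
        not (in_p & out_q)       # I_1 and O_2 are disjoint
        and not (in_q & out_p)   # I_2 and O_1 are disjoint
        and not (out_p & out_q)  # O_1 and O_2 are disjoint
    )


# Hypothetical processes: Pi computes a from b, Pj computes c from d,
# and Pk computes b from e.
Pi = ({"b"}, {"a"})
Pj = ({"d"}, {"c"})
Pk = ({"e"}, {"b"})

print(bernstein_parallel(Pi, Pj))  # True
print(bernstein_parallel(Pj, Pi))  # True: the relation is commutative
print(bernstein_parallel(Pj, Pk))  # True
print(bernstein_parallel(Pi, Pk))  # False: Pk writes b, which Pi reads

# A shared input alone does not prevent parallelism.
print(bernstein_parallel(({"x"}, {"y"}), ({"x"}, {"z"})))  # True
```

The three processes show why the relation is not transitive: Pi and Pk are each parallel with Pj, but not with each other.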

13 Bernstein’s conditions for parallelism
For $n$ processes, there are $3n(n-1)/2$ conditions; violation of any of them prohibits parallelism collectively or partially.
Statements or processes that depend on run-time conditions (IF statements or conditional branches) are not transformed into parallel form.
The analysis of dependences can be conducted at the code, subroutine, process, task, and program levels; higher-level dependence can be inferred from that of subordinate levels.
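The count of conditions follows from the pairwise nature of the check: every unordered pair of processes must satisfy three conditions. As a worked instance, with $n = 5$ chosen here only for illustration:

```latex
\[
\binom{n}{2} = \frac{n(n-1)}{2} \text{ pairs}, \qquad
3 \cdot \frac{n(n-1)}{2} = \frac{3n(n-1)}{2} \text{ conditions}
\]
\[
n = 5: \quad \binom{5}{2} = 10 \text{ pairs}, \qquad 3 \cdot 10 = 30 \text{ conditions}
\]
```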

14 Example of parallelism using Bernstein’s conditions
P1: C = D * E
P2: M = G + C
P3: A = B + G
P4: C = L + M
P5: F = G / E
Assuming no pipelining is used, five steps are needed in sequential execution.

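15 Example of parallelism using Bernstein’s conditions
[Figure: dataflow graphs for P1 to P5, one showing sequential execution of P1, P2, P3, P4, P5 along a time axis, the other showing parallel execution with P1 first, then P2, P3, and P5 together, then P4]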

16 Example of parallelism using Bernstein’s conditions
There are 10 pairs of statements to check against Bernstein’s conditions.
Only P2 || P3 || P5 is possible because P2 || P3, P3 || P5, and P2 || P5 are all possible.
If two adders are available simultaneously, the parallel execution requires only three steps.
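The check for the group P2, P3, P5 can be sketched as follows. This is a minimal illustration, assuming the input and output sets are read directly off the five statements of the example (right-hand-side variables as inputs, the assigned variable as output); the names `PROCS`, `parallel`, and `group_parallel` are hypothetical. Resource constraints such as the number of adders are not modeled.

```python
from itertools import combinations

# (input set, output set) for each statement of the example.
PROCS = {
    "P1": ({"D", "E"}, {"C"}),  # C = D * E
    "P2": ({"G", "C"}, {"M"}),  # M = G + C
    "P3": ({"B", "G"}, {"A"}),  # A = B + G
    "P4": ({"L", "M"}, {"C"}),  # C = L + M
    "P5": ({"G", "E"}, {"F"}),  # F = G / E
}


def parallel(p, q):
    """Bernstein's three conditions for the named statements p and q."""
    (in_p, out_p), (in_q, out_q) = PROCS[p], PROCS[q]
    return not (in_p & out_q) and not (in_q & out_p) and not (out_p & out_q)


def group_parallel(names):
    """A group is parallel if every pair in it satisfies the conditions."""
    return all(parallel(p, q) for p, q in combinations(names, 2))


print(len(list(combinations(PROCS, 2))))   # 10 pairs to check
print(group_parallel(["P2", "P3", "P5"]))  # True
print(parallel("P1", "P2"))  # False: P2 reads C, which P1 writes
print(parallel("P2", "P4"))  # False: P4 reads M, which P2 writes
print(parallel("P1", "P4"))  # False: both write C
```

The three failing pairs form the chain P1, P2, P4, which is consistent with a three-step parallel schedule.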

17 Implementation of parallelism
Special hardware and software support is needed to implement parallelism.
There is a distinction between hardware and software parallelism.
Parallelism cannot be achieved for free.

18 Hardware parallelism
Hardware parallelism is often a function of cost and performance tradeoffs.
If a processor issues k instructions per machine cycle, it is called a k-issue processor.
A conventional processor takes one or more machine cycles to issue a single instruction, so it is a one-issue processor.
A multiprocessor system built with n k-issue processors should be able to handle a maximum of nk threads of instructions simultaneously.

19 Software parallelism
Software parallelism is defined by the control and data dependence of programs.
It is a function of algorithm, programming style, and compiler optimization.
The two most cited types of parallel programming are:
Control parallelism: in the form of pipelining and multiple functional units.
Data parallelism: similar operations performed over many data elements by multiple processors; practiced in SIMD and MIMD systems.

20 Hardware vs. Software parallelism
Software parallelism:
There are eight instructions in total: 4 loads (L), 2 multiplications (X), 1 addition (+), and 1 subtraction (-).
Theoretically, the computation can be accomplished in 3 cycles (steps).
[Figure: Step 1: L L L L; Step 2: X X; Step 3: + and -, producing A and B]

21 Hardware vs. Software parallelism
Hardware parallelism (example 1):
The program runs on a 2-issue processor that can execute one memory access and one arithmetic operation simultaneously.
The computation needs 7 cycles (steps).
This shows a mismatch between hardware and software parallelism.
[Figure: seven-step schedule of the L, X, +, and - operations producing A and B]
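The step counts of 3 (software parallelism) and 7 (2-issue processor) can be reproduced with a small greedy scheduler. This is a sketch under stated assumptions: the dependence graph is not spelled out in the transcript, so the code assumes A = (L1 * L2) + (L3 * L4) and B = (L1 * L2) - (L3 * L4), unit latency for every operation, and in-order greedy issue. The names `DEPS`, `KIND`, and `schedule` are hypothetical.

```python
# Assumed dependence graph: each operation maps to the operations it waits for.
DEPS = {
    "L1": [], "L2": [], "L3": [], "L4": [],
    "X1": ["L1", "L2"],
    "X2": ["L3", "L4"],
    "ADD": ["X1", "X2"],  # produces A
    "SUB": ["X1", "X2"],  # produces B
}
# Loads use the memory unit; everything else uses the arithmetic unit.
KIND = {op: ("mem" if op.startswith("L") else "alu") for op in DEPS}


def schedule(deps, kind, limits=None):
    """Greedy list scheduling with unit latency.

    limits maps a unit kind to the number of operations it accepts per cycle;
    None means unlimited resources. Returns (total cycles, op -> cycle).
    """
    done = {}
    cycle = 0
    while len(done) < len(deps):
        cycle += 1
        used = {}
        for op in deps:  # dict order is taken as program order
            if op in done:
                continue
            # An operand is usable only if it was produced in an earlier cycle.
            ready = all(p in done and done[p] < cycle for p in deps[op])
            unit = kind[op]
            if ready and (limits is None or used.get(unit, 0) < limits[unit]):
                done[op] = cycle
                used[unit] = used.get(unit, 0) + 1
    return cycle, done


print(schedule(DEPS, KIND)[0])                        # 3 steps, unlimited resources
print(schedule(DEPS, KIND, {"mem": 1, "alu": 1})[0])  # 7 steps, one memory op and one ALU op per cycle
```

With the limit of one memory access per cycle, the four loads alone occupy four cycles, and the second multiplication cannot start until the last load has completed, which accounts for the gap between the two schedules.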

22 Hardware vs. Software parallelism
Hardware parallelism (example 2):
The program runs on a dual-processor system in which each processor is single-issue.
6 cycles are needed to execute the 12 instructions, where 2 store operations and 2 load operations are inserted for inter-processor communication through the shared memory.
[Figure: six-step schedule on two processors producing A and B; the S operations are the added instructions for inter-processor communication]
