Concurrency Idea

1 Concurrency Idea

2 Concurrency idea
Challenge
– Print primes from 1 to $10^{10}$
Given
– Ten-processor multiprocessor
– One thread per processor
Goal
– Get ten-fold speedup (or close)

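3 Load Balancing
– Split the work evenly
– Each thread tests a range of $10^9$ numbers
[Figure: the interval from 1 to $10^{10}$ cut into ten ranges of $10^9$ numbers each, assigned to threads P0, P1, …, P9.]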

4 Procedure for Thread i
    void primePrint() {
        int i = ThreadID.get(); // IDs in {0..9}
        // Thread i tests its own fixed range of 10^9 numbers.
        for (long j = i * 1_000_000_000L + 1; j < (i + 1) * 1_000_000_000L; j++) {
            if (isPrime(j))
                print(j);
        }
    }

6 Issues
– Higher ranges have fewer primes
– Yet larger numbers are harder to test
– Thread workloads are uneven and hard to predict
– Static split rejected: need dynamic load balancing
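
The uneven workload of a static split can be seen even at toy scale. The following is a minimal sketch, not the deck's code: it shrinks the problem to 1..100 with ten threads of ten numbers each, and the class name StaticPartition, the trial-division isPrime, and the per-thread tally are all assumptions introduced for illustration.

```java
import java.util.Arrays;

// Static load balancing, scaled down so it runs instantly:
// 10 threads, each testing a fixed range of 10 numbers (1..100 overall).
final class StaticPartition {
    static final int THREADS = 10;
    static final long RANGE = 10; // stands in for 10^9 on the slide

    // Simple trial division; isPrime is not defined in the slides.
    static boolean isPrime(long n) {
        if (n < 2) return false;
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Returns how many primes each thread found in its own range.
    static int[] countPerThread() {
        int[] found = new int[THREADS];
        Thread[] workers = new Thread[THREADS];
        for (int t = 0; t < THREADS; t++) {
            final int i = t; // plays the role of ThreadID.get()
            workers[t] = new Thread(() -> {
                for (long j = i * RANGE + 1; j <= (i + 1) * RANGE; j++) {
                    if (isPrime(j)) found[i]++; // each thread writes only its own slot
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // after join, the worker's writes are visible here
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // prints [4, 4, 2, 2, 3, 2, 2, 3, 2, 1]
        System.out.println(Arrays.toString(countPerThread()));
    }
}
```

Even here the first thread finds four primes and the last only one, while the cost of each test grows with the size of the number, so equal-sized ranges do not mean equal work.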

7 Shared Counter
– Each thread takes a number
[Figure: a shared counter handing out successive values 17, 18, 19.]

9 Procedure for Thread i
    Counter counter = new Counter(1); // shared counter object

    void primePrint() {
        long j = 0;
        while (j < 10_000_000_000L) { // 10^10
            j = counter.getAndIncrement();
            if (isPrime(j))
                print(j);
        }
    }

10 Where Things Reside
[Figure: processors, each with its own cache, connected by a bus to shared memory. The shared counter (value 1) resides in shared memory; the primePrint code and its local variables belong to each thread.]

11 Procedure for Thread i
    Counter counter = new Counter(1);

    void primePrint() {
        long j = 0;
        while (j < 10_000_000_000L) { // stop when every value is taken
            j = counter.getAndIncrement();
            if (isPrime(j))
                print(j);
        }
    }

12 Procedure for Thread i
    Counter counter = new Counter(1);

    void primePrint() {
        long j = 0;
        while (j < 10_000_000_000L) {
            j = counter.getAndIncrement(); // increment & return each new value
            if (isPrime(j))
                print(j);
        }
    }
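
The shared-counter procedure can be tried at small scale. This is a hedged sketch rather than the deck's implementation: java.util.concurrent.atomic.AtomicLong stands in for the Counter class (an assumption), the limit is reduced to 100, and the names DynamicPartition and countPrimes are invented for the example. It also checks the bound after taking a number, so no thread tests a value past the limit.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Dynamic load balancing: threads repeatedly take the next untested number.
final class DynamicPartition {
    static final long LIMIT = 100; // stands in for 10^10 on the slide

    // Simple trial division; isPrime is not defined in the slides.
    static boolean isPrime(long n) {
        if (n < 2) return false;
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Returns the total number of primes in 1..LIMIT found by all threads.
    static int countPrimes(int threads) {
        AtomicLong counter = new AtomicLong(1);     // shared: next number to hand out
        AtomicInteger primes = new AtomicInteger(); // shared tally of primes found
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                while (true) {
                    long j = counter.getAndIncrement(); // take a number
                    if (j > LIMIT) break;               // every value has been taken
                    if (isPrime(j)) primes.incrementAndGet();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return primes.get();
    }

    public static void main(String[] args) {
        System.out.println(countPrimes(10)); // prints 25
    }
}
```

A fast thread simply comes back for more numbers, so no thread sits idle while another still has a long range to finish.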

14 Counter Implementation
    public class Counter {
        private long value;

        // OK for a single thread, not for concurrent threads
        public long getAndIncrement() {
            return value++;
        }
    }

16 What It Means
    public class Counter {
        private long value;

        public long getAndIncrement() {
            return value++;
            // which is really three steps:
            //   temp = value;
            //   value = temp + 1;
            //   return temp;
        }
    }

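17 Not so good…
[Figure: timeline of the shared value under interleaved threads. The operations occur in the order read 1, read 1, write 2, read 2, write 3, write 2, so the value goes from 1 to 2 to 3 and then back to 2.]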
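
The bad interleaving can be replayed deterministically on a single thread. The sketch below is one reading of the timeline (thread A calls getAndIncrement once, thread B calls it twice); the class name LostUpdate and the variables tempA and tempB are assumptions made for illustration, and no real threads are involved.

```java
// Replays, step by step, an interleaving of the unsynchronized
// temp = value; value = temp + 1; return temp; sequence.
final class LostUpdate {
    static long value;

    // Returns the final counter value after three "increments".
    static long replay() {
        value = 1;          // counter starts at 1
        long tempA = value; // A reads 1
        long tempB = value; // B reads 1
        value = tempB + 1;  // B writes 2 (B's first call returns 1)
        tempB = value;      // B reads 2
        value = tempB + 1;  // B writes 3 (B's second call returns 2)
        value = tempA + 1;  // A writes 2 (A's call returns 1), overwriting 3
        return value;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints 2
    }
}
```

Three calls completed, so the counter should have reached 4; instead it ends at 2, and the number 1 was handed out twice.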

18 Is this problem inherent?
– If we could only glue reads and writes together…
[Figure: a sequence of separate steps, read write read write, with each read joined to its write.]

20 Challenge
    public class Counter {
        private long value;

        public long getAndIncrement() {
            // Make these steps atomic (indivisible)
            long temp = value;
            value = temp + 1;
            return temp;
        }
    }

21 Hardware Solution
    public class Counter {
        private long value;

        public long getAndIncrement() {
            // ReadModifyWrite() instruction
            long temp = value;
            value = temp + 1;
            return temp;
        }
    }
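
In Java, one way to get an atomic read-modify-write is java.util.concurrent.atomic.AtomicLong, whose getAndIncrement is specified to be atomic and is typically compiled down to a hardware instruction of this kind. The sketch below is an illustration under that assumption, not the deck's Counter; the names AtomicCounter and stress are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// A counter whose read, add, and write happen as one indivisible step.
final class AtomicCounter {
    private final AtomicLong value;

    AtomicCounter(long initial) {
        value = new AtomicLong(initial);
    }

    long getAndIncrement() {
        return value.getAndIncrement(); // atomic read-modify-write
    }

    long get() {
        return value.get();
    }

    // Starts at 1, performs threads * perThread increments concurrently,
    // and returns the final value.
    static long stress(int threads, int perThread) {
        AtomicCounter c = new AtomicCounter(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) c.getAndIncrement();
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return c.get();
    }

    public static void main(String[] args) {
        System.out.println(stress(4, 100_000)); // prints 400001
    }
}
```

No increment is lost, because no other thread can slip in between the read and the write.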

23 An Aside: Java™
    public class Counter {
        private long value;

        public long getAndIncrement() {
            long temp;
            synchronized (this) { // synchronized block
                temp = value;
                value = temp + 1;
            }
            return temp;
        }
    }

24 An Aside: Java™
    public class Counter {
        private long value;

        public long getAndIncrement() {
            long temp;
            synchronized (this) { // mutual exclusion
                temp = value;
                value = temp + 1;
            }
            return temp;
        }
    }
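
Mutual exclusion means every caller should receive a different number. The following sketch checks that property for a synchronized counter; it uses a synchronized method rather than a synchronized block (the two are equivalent here), and the names SyncCounter and allDistinct are assumptions for the example.

```java
// A counter protected by the object's intrinsic lock.
final class SyncCounter {
    private long value;

    SyncCounter(long initial) {
        value = initial;
    }

    // At most one thread at a time executes this method on a given instance.
    synchronized long getAndIncrement() {
        return value++;
    }

    // Hands out threads * perThread numbers concurrently, starting at 1,
    // and reports whether each number 1..total was handed out.
    static boolean allDistinct(int threads, int perThread) {
        int total = threads * perThread;
        SyncCounter c = new SyncCounter(1);
        boolean[] seen = new boolean[total + 1]; // index 0 is unused
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    seen[(int) c.getAndIncrement()] = true;
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        // A duplicate would leave some slot unmarked, since exactly
        // `total` numbers were taken.
        for (int v = 1; v <= total; v++) {
            if (!seen[v]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allDistinct(4, 10_000)); // prints true
    }
}
```

The price is that threads now queue up at the counter: the body of getAndIncrement is a sequential part of the program.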

25 Why do we care?
– We want as much of the code as possible to execute concurrently (in parallel)
– A larger sequential part implies reduced performance
– Amdahl’s law: this relation is not linear…

26 Amdahl’s Law
$\text{Speedup} = \dfrac{1}{(1 - p) + \frac{p}{n}}$
…of computation given $n$ CPUs instead of 1, where $p$ is the parallel fraction of the computation

30 Amdahl’s Law
$\text{Speedup} = \dfrac{1}{(1 - p) + \frac{p}{n}}$
– $p$: parallel fraction
– $n$: number of processors
– $1 - p$: sequential fraction

32 Example
– Ten processors
– 60% concurrent, 40% sequential
– How close to 10-fold speedup?
$\text{Speedup} = \dfrac{1}{(1 - 0.6) + \frac{0.6}{10}} \approx 2.17$

34 Example
– Ten processors
– 80% concurrent, 20% sequential
– How close to 10-fold speedup?
$\text{Speedup} = \dfrac{1}{(1 - 0.8) + \frac{0.8}{10}} \approx 3.57$

36 Example
– Ten processors
– 90% concurrent, 10% sequential
– How close to 10-fold speedup?
$\text{Speedup} = \dfrac{1}{(1 - 0.9) + \frac{0.9}{10}} \approx 5.26$

38 Example
– Ten processors
– 99% concurrent, 1% sequential
– How close to 10-fold speedup?
$\text{Speedup} = \dfrac{1}{(1 - 0.99) + \frac{0.99}{10}} \approx 9.17$
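
The four ten-processor examples can be reproduced with a few lines of code. This is a small sketch of Amdahl's law as speedup = 1 / ((1 - p) + p / n), with p the parallel fraction and n the number of processors; the class and method names are assumptions for illustration.

```java
import java.util.Locale;

final class Amdahl {
    // p: parallel fraction in [0, 1]; n: number of processors (n >= 1).
    static double speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        double[] fractions = {0.60, 0.80, 0.90, 0.99};
        for (double p : fractions) {
            // prints 2.17, 3.57, 5.26 and 9.17 for the four fractions
            System.out.printf(Locale.ROOT, "p = %.2f -> speedup %.2f%n", p, speedup(p, 10));
        }
    }
}
```

Going from 90% to 99% concurrent nearly doubles the speedup on ten processors, which is why the last few percent of sequential code matter so much.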

39 Back to Real-World Multicore Scaling
[Figure: speedup of user code on multicore machines, with bars labeled 1.8x, 2x, 2.9x.]
– We must not be managing to reduce the sequential % of code

40 Back to Real-World Multicore Scaling
[Figure: speedup of user code on multicore machines, with bars labeled 1.8x, 2x, 2.9x.]
– Not reducing sequential % of code

41 Shared Data Structures
[Figure: two panels, Coarse Grained and Fine Grained, each showing cores (c) working on data that is 75% unshared and 25% shared.]
– The reason we get only 2.9 speedup
– Fine grained parallelism has huge performance benefit

42 Diminishing Returns
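
One way to read this title, assuming it refers to Amdahl's law with parallel fraction $p$ and $n$ processors, is that adding processors helps less and less because the sequential fraction puts a ceiling on the speedup:

```latex
\lim_{n \to \infty} \frac{1}{(1 - p) + \frac{p}{n}} = \frac{1}{1 - p}
% Worked case, p = 0.9:
%   n = 10:   1 / (0.1 + 0.09)  \approx 5.26
%   n = 100:  1 / (0.1 + 0.009) \approx 9.17
%   n -> infinity: 1 / 0.1 = 10
```

With 90% of the work parallel, ten times as many processors (100 instead of 10) yields less than twice the speedup, and no number of processors can exceed 10-fold.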

43 Multiprocessor Programming
This is what this course is about…
– The % that is not easy to make concurrent yet may have a large impact on overall speedup

