Multicore Hardware and Software Engineering


Presentation on theme: "Multicore Hardware and Software Engineering"— Presentation transcript:

1 Multicore Hardware and Software Engineering
Bruce K Botcher II

2 Introduction
Definitions
Why Multi-Core?
Why not earlier?
What is being done?

3 Definitions
Multi-Core Hardware
  Multi-core vs. Multiprocessor
Parallel Computing
  Use of multiple processors to solve a single problem
Concurrency
  Parallelism is a form of concurrency
  Concurrency is not necessarily parallel

4 Why Multi-Core?
Moore's Law
  The number of transistors placed on an integrated circuit doubles every 2 years
  Still applicable
  Led to higher clock speeds

5 Why Multi-Core?
Clock speed used to rise along with transistor count under Moore's Law
This no longer holds
  Thermal problems
  Clock speeds stalled at around 4 GHz

6 Why Multi-Core?
Multi-core replaces clock speed as the source of performance gains
Some say the number of cores will double every 2 years
Sandia National Laboratories predicts thousand-core processors by the middle of the next decade
  This may be unlikely

7 Why Not Earlier?
Parallel computing has been around for decades
Parallel programming is difficult
  Thread management
  Memory management

8 Why Not Earlier?
Amdahl's Law
  Observed in 1967 that the achievable speedup is limited by the part of a program that must run serially
  Almost no program can be made entirely parallel
  If 50% of a program can be parallelized, the speedup is at most 2
Much easier for programmers to rely on hardware designers for speedup
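
The 50% figure on this slide can be checked against the usual statement of Amdahl's Law. In the short worked form below, the symbols p (the fraction of the program that can run in parallel) and N (the number of cores) are chosen here for illustration and do not appear on the slide.

```latex
\[
S(N) = \frac{1}{(1 - p) + \dfrac{p}{N}},
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{1 - p}
\]
\[
p = 0.5:\quad S(4) = \frac{1}{0.5 + 0.125} = 1.6,
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{0.5} = 2
\]
```

With half the program serial, even an unlimited number of cores cannot do better than a speedup of 2.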

9 What is Being Done By Software Developers Now?
Herb Sutter wrote "The Free Lunch Is Over", meaning software can no longer rely on hardware for speedup
Platforms
Languages
Tools

10 Platforms

11 Platforms
CUDA
CILK++
.NET

12 CUDA
Designed by NVIDIA for GeForce graphics cards
One of the most widely used platforms for general-purpose computing on GPUs
Built on top of C and C++

13 CUDA
Scalable: can use as many cores as can be put on the card
Automatic thread management
Easy to write

14 CUDA
This example fills the vector y with the result of y = ax + y, where x and y are vectors, a is a scalar, and n is the number of calculations to do.
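
As a rough illustration of the same y = ax + y computation, here is a minimal sketch written in Java rather than CUDA, so that it runs on CPU cores without a GPU. It is not the slide's code: the class name `Saxpy` and the use of a parallel stream are assumptions made for this sketch. What carries over from the CUDA version is the shape of the problem: each index is independent, so the work can be handed to however many cores are available.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical CPU-side analogue of the SAXPY computation (not CUDA code).
final class Saxpy {
    // Computes y[i] = a * x[i] + y[i] for the first n elements.
    // Every index is independent, so the runtime may split the range across cores.
    static void saxpy(int n, float a, float[] x, float[] y) {
        IntStream.range(0, n)
                 .parallel()
                 .forEach(i -> y[i] = a * x[i] + y[i]);
    }

    public static void main(String[] args) {
        float[] x = {1f, 2f, 3f, 4f};
        float[] y = {10f, 20f, 30f, 40f};
        saxpy(x.length, 2f, x, y);
        System.out.println(Arrays.toString(y));
    }
}
```

In CUDA the per-index body would be a kernel run by one GPU thread per element; here the parallel stream plays that role on the CPU.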

15 CILK++
Originated in the Cilk Project at MIT
Cilk Arts created Cilk++ from the Cilk Project
Cilk Arts has since been acquired by Intel

16 CILK++
Built on top of C++
Three new keywords
  cilk_spawn
  cilk_sync
  cilk_for

17 CILK++ Benefits
Can take out the keywords and still have a functioning C++ program
Can parallelize legacy C++ code
Work stealing
Cilkscreen
Integration with Visual Studio

18 CILK++

19 .NET
.NET Framework 4 added parallelism
Applies to all Visual Studio languages: C#, VB, F#, C++
PLINQ
  Parallel Language Integrated Query
  Allows data queries to be parallelized

20 .NET
Task Libraries
  Allow the developer to create new tasks to be run on available cores
  Allow for easy parallelization
Because of the popularity of .NET, this is a major step in the right direction for parallel computing

21 Languages
C/C++
Java
Ruby
Others

22 C/C++
Does not have direct parallelism built in
Many different platforms have been built on top of C++
  CUDA
  Cilk
  Visual C++

23 C/C++
Used very often as a base language because it
  Is already widely used
  Is efficient for high performance computing
  Is operating system neutral

24 Java
Java has supported concurrency for years
Recently began to support parallelism more directly
  Java SE 7 added the Fork/Join framework
  Prior to Java SE 7, parallel work had to be divided among threads by hand

25 Java Progression toward Parallelism
java.util.concurrent
  Java SE 5 and 6 added the java.util.concurrent package
  Provides many different types of locks
  Atomic variables
  Synchronization patterns such as semaphores
Java SE 7 added Fork/Join
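
To make the atomic variables and semaphores mentioned here concrete, the following is a minimal sketch using `java.util.concurrent`. The class name `ConcurrentDemo`, the pool size of 4, and the limit of 2 permits are illustrative assumptions, not anything from the slides. It counts finished jobs with an `AtomicInteger` and uses a `Semaphore` so that no more than two jobs are inside the guarded section at once.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; names and sizes are chosen for illustration only.
final class ConcurrentDemo {
    // Runs `jobs` small jobs on a 4-thread pool.
    // Returns {jobs completed, peak number of jobs holding a permit at once}.
    static int[] run(int jobs) {
        AtomicInteger completed = new AtomicInteger(); // atomic variable: no explicit lock
        AtomicInteger inside = new AtomicInteger();    // jobs currently holding a permit
        AtomicInteger peak = new AtomicInteger();      // highest value `inside` reached
        Semaphore permits = new Semaphore(2);          // at most 2 jobs in the guarded section
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < jobs; i++) {
            pool.execute(() -> {
                permits.acquireUninterruptibly();
                try {
                    peak.accumulateAndGet(inside.incrementAndGet(), Math::max);
                    completed.incrementAndGet();
                } finally {
                    inside.decrementAndGet();
                    permits.release();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { completed.get(), peak.get() };
    }
}
```

Calling `ConcurrentDemo.run(100)` should report 100 completed jobs and a peak of 1 or 2, depending on how the threads happen to interleave.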

26 Java Fork/Join
fork() - launches a child ForkJoinTask to be executed asynchronously
join() - waits for forked tasks to complete and brings their results back together
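
A minimal sketch of fork() and join() in use, assuming a simple array-sum workload. The class name `SumTask` and the cutoff of 1,000 elements are illustrative choices, not taken from the slides. The task splits its range in half, forks one half to run asynchronously, computes the other half itself, and then joins the forked half to combine the results.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example task: sums data[lo..hi) by divide and conquer.
final class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // illustrative cutoff for going sequential
    private final long[] data;
    private final int lo, hi;

    SumTask(long[] data, int lo, int hi) {
        this.data = data;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {          // small enough: sum directly
            long sum = 0;
            for (int i = lo; i < hi; i++) sum += data[i];
            return sum;
        }
        int mid = (lo + hi) >>> 1;
        SumTask left = new SumTask(data, lo, mid);
        SumTask right = new SumTask(data, mid, hi);
        left.fork();                          // schedule the left half asynchronously
        long rightSum = right.compute();      // work on the right half in this thread
        return left.join() + rightSum;        // wait for the left half, then combine
    }

    static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }
}
```

Computing one half directly instead of forking both keeps the current worker thread busy rather than leaving it idle while it waits.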

27 Java Fork/Join cont.
Two types of ForkJoinTask
  RecursiveAction does not return a value
  RecursiveTask returns a value
Used for divide and conquer algorithms
  Traveling Salesman
  Map and Reduce
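
For the variant that returns no value, here is a minimal sketch of a RecursiveAction that modifies an array in place. The class name `DoubleAction`, the doubling workload, and the cutoff of 1,000 elements are assumptions made for illustration. Because there is no result to combine, the two halves are simply handed to `invokeAll`, which runs both and returns when both are done.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Hypothetical example action: doubles every element of data[lo..hi) in place.
final class DoubleAction extends RecursiveAction {
    private static final int THRESHOLD = 1_000; // illustrative cutoff for going sequential
    private final int[] data;
    private final int lo, hi;

    DoubleAction(int[] data, int lo, int hi) {
        this.data = data;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected void compute() {
        if (hi - lo <= THRESHOLD) {           // small enough: do the work directly
            for (int i = lo; i < hi; i++) data[i] *= 2;
            return;
        }
        int mid = (lo + hi) >>> 1;
        // Run both halves and wait for both; there is no value to return.
        invokeAll(new DoubleAction(data, lo, mid), new DoubleAction(data, mid, hi));
    }

    static void doubleAll(int[] data) {
        ForkJoinPool.commonPool().invoke(new DoubleAction(data, 0, data.length));
    }
}
```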

28 Java Benefits
Scalability - can be used on as many cores as are available
Portability - Java is used on nearly all platforms
Speedup

29 Ruby
Much like early Java, supports concurrency but does not support true parallelism
Can spawn threads, but they block one another; in other words, a thread sits in a wait state while another runs
Some "gems" claim to add parallelism, but may be unreliable

30 Ruby
Ruby on Rails
  Parallelism is much easier if Ruby is used with Rails, due to the nature of the web
  Well suited for multiple processes
  Processes can be run on multiple cores

31 Other Languages
Python
  Like many of the others, does not itself implement parallelism, but has modules that allow it
  Parallel Python
    Open source module that supports multiprocessor and multicore execution
Functional Languages
  Some think that this may be the future direction of parallel computing
  Allow for map and reduce, which helps with divide and conquer
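
The map and reduce pattern can be sketched briefly. The example below uses Java's stream API rather than a functional language, and the class name `MapReduceDemo` and the word-length workload are assumptions made for illustration. The map step turns each word into its length, and the reduce step combines the lengths into a total. Because neither step depends on shared mutable state, the runtime is free to split the list across cores and merge the partial sums.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo of map and reduce using Java streams.
final class MapReduceDemo {
    // Map: word -> length. Reduce: add the lengths, starting from 0.
    static int totalLength(List<String> words) {
        return words.parallelStream()
                    .map(String::length)
                    .reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(totalLength(Arrays.asList("multi", "core")));
    }
}
```

Addition is associative and 0 is its identity, which is what makes it safe to reduce the pieces in any grouping.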

32 Tools
There is a clear need for more tools for parallel programming
Debugging can be very difficult
Tools
  Visual Studio 2010/11 Beta
  Intel Parallel Studio

33 Visual Studio
Visual Studio 2010
Parallel Stacks Window
  Shows each active thread
  Shows where each active thread is in the code

34 Visual Studio Parallel Stacks Cont.

35 Visual Studio
Parallel Tasks Window
  Allows for more in-depth debugging
  Will show if any threads are deadlocked or not running
  Will allow you to freeze a certain thread
  Or, better, freeze all but the selected thread

36 Visual Studio

37 Visual Studio
Visual Studio 11 Beta
Concurrency Visualizer
  Shows utilization of your program
    Threads
    Cores
  Analyzes your program while it runs
  Outputs graphs showing the cores used at certain times during the run

38 Visual Studio

39 Visual Studio

40 Visual Studio

41 Intel Parallel Studio
Parallel Studio is a collection of parallel programming tools for debugging, optimization, and updating legacy code
It also integrates into Visual Studio
Parallel Studio Programs
  Parallel Advisor
  Parallel Amplifier
  Parallel Inspector
  Security Analysis

42 Intel Parallel Studio
Parallel Advisor
  A series of steps that analyzes legacy code
  Shows where threading could be added
  Runs experiments on the new threaded code
  Checks for race conditions and deadlock in the new code
Parallel Amplifier
  Used to optimize parallel code
  Shows performance bottlenecks
  Acts like the Concurrency Visualizer, showing utilization of cores

43 Intel Parallel Studio
Parallel Inspector
  Will find memory and threading defects
  Finds the hard-to-find intermittent and non-deterministic errors
  Shows where in the code and call stack your errors are, which decreases debug and development time
Security Analysis
  Finds memory and resource leaks
  Pointer and array errors
  Buffer overflows
  Creates a more secure end product

44 Conclusion
The "Free Lunch" is over, and software engineers will now be responsible for performance increases.
Many tools are now hitting the market that will help make the transition.
Languages are being modified, or new APIs added, to aid this transition to parallel computing.

45 References
Pedretti, K., Kelly, S., & Levenhagen, M. U.S. Department of Energy, Office of Scientific and Technical Information. (2008). Sandia report: Summary of multi-core hardware and programming model investigations. Albuquerque: Sandia National Laboratories.
Sutter, H. (2005). The free lunch is over: A fundamental turn toward concurrency in software. Dr. Dobb's Journal, 30(3), Retrieved from
Kim, H., & Bond, R. (2009). Multicore software technologies. IEEE, 26(6),
Buck, I., Nickolls, J., Garland, M., & Skadron, K. (2008). Scalable parallel programming with CUDA. ACM Queue, 6(2), Retrieved from
Leiserson, C. E., & Mirman, I. B. (2008). How to survive the multicore software revolution (or at least survive the hype). (p. 24). Cilk Arts.
Campbell, C., & Miller, A. (2011). Parallel programming with Microsoft Visual C++: Design patterns for decomposition and coordination on multicore architectures (patterns & practices). (1 ed.). Microsoft Press. Retrieved from
Ponge, J. (2011, July). Retrieved from
Hansson, D. H. (n.d.). Interview by R. Seidner. Is Ruby on Rails a crown jewel?. Intelligence in software, Retrieved from /is_ruby_on_rails_a_crown_jewel/
Moth, D. & Toub, S. (2009, September). Debugging task-based parallel applications in Visual Studio 2010. MSDN Magazine, Retrieved from
The ultimate all-in-one performance toolkit: Intel® Parallel Studio (2011). Retrieved from /Intel_Parallel_Studio_Brief_081610_HighRes.pdf

