
Lecture 21 Parallel Programming Richard Gesick

Parallel Computing Parallel computing is a form of computation in which many operations are carried out simultaneously. Many personal computers and workstations have two or four cores that enable them to execute multiple threads simultaneously, and computers in the near future are expected to have significantly more. To take advantage of today's and tomorrow's hardware, software developers can parallelize their code to distribute work across multiple processors. In the past, parallelization required low-level manipulation of threads and locks.
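To make that contrast concrete, here is a minimal sketch of the older, manual style the slide alludes to: splitting an array between the main thread and an explicitly created worker thread. The squaring workload is purely illustrative.

using System;
using System.Threading;

class ManualThreads
{
    static void Main()
    {
        int[] data = new int[1000];
        int half = data.Length / 2;

        // Before the TPL: create and manage a worker thread by hand.
        Thread worker = new Thread(() =>
        {
            for (int i = half; i < data.Length; i++)
                data[i] = i * i;            // second half on the worker thread
        });
        worker.Start();

        for (int i = 0; i < half; i++)
            data[i] = i * i;                // first half on the main thread

        worker.Join();                      // wait for the worker to finish
        Console.WriteLine("Last value: " + data[data.Length - 1]);
    }
}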

Task Parallel Library The Task Parallel Library (TPL) is a set of public types and APIs in the System.Threading and System.Threading.Tasks namespaces in the .NET Framework. The TPL relies on a task scheduler that is integrated with the .NET ThreadPool. The purpose of the TPL is to make developers more productive by simplifying the process of adding parallelism and concurrency to applications.
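As a small illustration of those APIs, the following sketch starts one task and waits for its result. Task.Factory.StartNew is the .NET 4 entry point; the summing workload is just a placeholder.

using System;
using System.Threading.Tasks;

class TaskBasics
{
    static void Main()
    {
        // StartNew hands the work to the default task scheduler, which
        // runs it on the .NET ThreadPool.
        Task<int> sum = Task.Factory.StartNew(() =>
        {
            int total = 0;
            for (int i = 1; i <= 100; i++)
                total += i;
            return total;
        });

        // Reading Result blocks until the task has completed.
        Console.WriteLine(sum.Result);      // prints 5050
    }
}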

Task Parallel Library The TPL scales the degree of concurrency dynamically to make the most efficient use of all the processors that are available. Parallel code based on the TPL not only works on dual-core and quad-core computers but also scales automatically, without recompilation, to many-core computers. When you use the TPL, writing a multithreaded for loop closely resembles writing a sequential for loop. The following code automatically partitions the work into tasks, based on the number of processors on the computer.

The Parallel for and foreach loops

// Parallel for loop
Parallel.For(startIndex, endIndex, (currentIndex) => DoSomeWork(currentIndex));

// Sequential version
foreach (var item in sourceCollection) { Process(item); }

// Parallel equivalent
Parallel.ForEach(sourceCollection, item => Process(item));
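The fragments above omit the surrounding program. A self-contained sketch, with Math.Sqrt standing in for the hypothetical DoSomeWork and Process, might look like this:

using System;
using System.Threading.Tasks;

class ParallelLoops
{
    static void Main()
    {
        double[] results = new double[10000000];

        // Iterations write to distinct slots, so they are independent
        // and need no locking.
        Parallel.For(0, results.Length, i =>
        {
            results[i] = Math.Sqrt(i);
        });

        // Parallel.ForEach distributes the elements of any IEnumerable.
        string[] names = { "alpha", "beta", "gamma" };   // hypothetical inputs
        Parallel.ForEach(names, name =>
        {
            Console.WriteLine("Processing " + name);
        });
    }
}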

Code segments Use the .NET Parallel Extensions (Task Parallel Library), and then see the sample/tutorial on how to get this working in the Visual Studio IDE. The code sample uses lambda expressions, but you can treat that as "magic syntax" for now. You could also use delegate syntax if you prefer, as shown below.
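For illustration, here is the same parallel loop written three equivalent ways: with a lambda, with an anonymous delegate, and with a named method. The items array and PrintItem method are made up for the example.

using System;
using System.Threading.Tasks;

class SyntaxComparison
{
    static void Main()
    {
        int[] items = { 1, 2, 3, 4 };

        // Lambda expression syntax
        Parallel.ForEach(items, item => Console.WriteLine(item));

        // Equivalent anonymous-delegate syntax
        Parallel.ForEach(items, delegate(int item)
        {
            Console.WriteLine(item);
        });

        // Equivalent named-method syntax
        Parallel.ForEach(items, PrintItem);
    }

    static void PrintItem(int item)
    {
        Console.WriteLine(item);
    }
}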

Code segments You should see increased performance (reduced time to complete) proportional to the number of processors/cores on the machine. The final sample uses a shared "results" collection, so there is significant blocking and reduced performance. How might we solve this? One possible answer is sketched below.
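One common answer, sketched here under the assumption that the results are order-insensitive: replace the locked shared collection with a ConcurrentBag, or give each thread a local list and merge once per thread using the localInit/localFinally overload of Parallel.For.

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

class ReducedContention
{
    static void Main()
    {
        // Option 1: a concurrent collection avoids taking a user-level
        // lock on every Add.
        var bag = new ConcurrentBag<double>();
        Parallel.For(0, 1000000, i => bag.Add(Math.Sqrt(i)));

        // Option 2: per-thread local lists, merged once per thread, so
        // the lock is taken only a handful of times instead of a million.
        object mergeLock = new object();
        var merged = new List<double>();
        Parallel.For(0, 1000000,
            () => new List<double>(),                               // localInit
            (i, state, local) => { local.Add(Math.Sqrt(i)); return local; },
            local => { lock (mergeLock) merged.AddRange(local); }); // localFinally

        Console.WriteLine(bag.Count + " / " + merged.Count);
    }
}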