Parallel Computing with MATLAB


Parallel Computing with MATLAB Jemmy Hu SHARCNET HPC Consultant University of Waterloo SHARCNET Summer School 2011 https://www.sharcnet.ca/~jemmyhu/tutorials/ss2011

Agenda
PART 1
- Parallel Computing Toolbox
- Task Parallelization
- Data Parallelization
- Batch mode
- Interactive with pmode
PART 2
- Configure MATLAB and PCT on PC
- Batch script
- Run PCT on remote clusters (hound, orca)
- More examples and Demos

Parallel Computing Toolbox
Goal
Parallel Computing Toolbox lets you solve computationally and data-intensive problems using MATLAB on multicore and multiprocessor computers.
How it works
Parallel processing constructs such as parallel for-loops and code blocks, distributed arrays, parallel numerical algorithms, and message-passing functions let you implement task- and data-parallel algorithms in MATLAB at a high level, without programming for specific hardware and network architectures.
Results
Converting serial MATLAB applications to parallel MATLAB applications requires few code modifications and no programming in a low-level language. You can run your applications interactively on a local machine, or in batch environments on a remote server.

Parallel Computing Toolbox Key Features
- Support for data-parallel and task-parallel application development
- Ability to annotate code segments with parfor (parallel for-loops) and spmd (single program multiple data) for implementing task- and data-parallel algorithms
- High-level constructs such as distributed arrays, parallel algorithms, and message-passing functions for processing large data sets on multiple processors
- Ability to run 8 workers locally on a multi-core desktop (R2010b; defaults to the number of cores available on the PC)
- Integration with MATLAB Distributed Computing Server for cluster-based applications that use any scheduler or any number of workers
- Interactive and batch execution modes

PCT Architecture (client-server)

Parallel mode on a MATLAB Pool
matlabpool   Open or close pool of MATLAB sessions for parallel computation
parfor       Execute loop code in parallel
spmd         Execute code in parallel on MATLAB pool
batch        Run MATLAB script as batch job
Interactive Functions
help         Help for toolbox functions in Command Window
pmode        Interactive Parallel Command Window

Key Function List
Job Creation
createJob      Create job object in scheduler and client
createTask     Create new task in job
dfeval         Evaluate function using cluster
Interlab Communication Within a Parallel Job
labBarrier     Block execution until all labs reach this call
labBroadcast   Send data to all labs or receive data sent to all labs
labindex       Index of this lab
labReceive     Receive data from another lab
labSend        Send data to another lab
numlabs        Total number of labs operating in parallel on current job
Job Management
cancel         Cancel job or task
destroy        Remove job or task object from parent and memory
getAllOutputArguments   Output arguments from evaluation of all tasks in job object
submit         Queue job in scheduler
wait           Wait for job to finish or change states

Typical Use Cases
Parallel for-Loops (parfor)
- Several MATLAB workers execute individual loop iterations simultaneously.
- The restriction on parallel loops is that no iteration may depend on any other iteration.
Large Data Sets
- You can distribute a large array among multiple MATLAB workers, so that each worker holds only part of the array.
- Each worker operates only on its part of the array, and workers automatically transfer data between themselves when necessary.
Batch Jobs
- Offload work to a MATLAB worker session to run as a batch job.
- The MATLAB worker can run either on the same machine as the client or, when using MATLAB Distributed Computing Server, on a remote cluster machine.

Parallel mode-I: matlabpool
Open or close a pool of MATLAB sessions for parallel computation.
Syntax:
MATLABPOOL
MATLABPOOL OPEN
MATLABPOOL OPEN <poolsize>
MATLABPOOL CLOSE
MATLABPOOL CLOSE FORCE
......
Works on the local client PC. Without an open matlabpool, parallel code still runs, but sequentially.

Task Parallel applications
Parallelize problems by organizing them into independent tasks (units of work), e.g. Monte Carlo simulations.
Parallel for-Loops (parfor)
parfor (i = 1 : n)
    % do something with i
end
- Mix task-parallel and serial code in the same function
- Run loops on a pool of MATLAB resources
- Iterations must be order-independent
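Putting the pieces above together, a minimal runnable sketch of an order-independent parfor loop (the pool size of 2 and the sin-based workload are illustrative assumptions, using the R2010b-era matlabpool syntax from these slides):

```matlab
% Hypothetical sketch: independent iterations distributed over a local pool.
matlabpool open 2            % start two local workers

n = 1024;
A = zeros(1, n);
parfor i = 1:n
    % each iteration depends only on i, not on other iterations
    A(i) = sin(i * 2*pi / n);
end
total = sum(A);              % ordinary serial code can follow the loop

matlabpool close
```

If no pool is open, the same parfor loop simply runs sequentially on the client.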

matlabpool -Demo

Iterations run in parallel in the MATLAB pool (local workers)

Run a .m file -Demo

Data Parallel applications
Single Program Multiple Data (spmd)
spmd (n)
    <statements>
end
Create different sized arrays depending on labindex:
matlabpool open
spmd (2)
    if labindex == 1
        R = rand(4,4);
    else
        R = rand(2,2);
    end
end
matlabpool close
For example, create a random matrix on two labs:
matlabpool open
spmd (2)
    R = rand(4,4);
end
matlabpool close

Demo

Demo

Distributed arrays and operations (matlabpool mode) codistributor()

codistributed()
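The codistributor()/codistributed() demos are not reproduced in this transcript; a minimal sketch of building, inspecting, and gathering a codistributed array in matlabpool mode might look like this (the 8-by-8 magic square and pool size of 2 are illustrative assumptions):

```matlab
matlabpool open 2
spmd
    % Partition a replicated 8-by-8 matrix across the labs
    % (by columns, the default codistributor)
    D = codistributed(magic(8));
    L = getLocalPart(D);   % each lab holds only its own columns
    A = gather(D);         % reassemble the full array on every lab
end
A1 = A{1};                 % retrieve lab 1's copy on the client
matlabpool close
```

Operations such as D + 10*labindex act on each lab's local part, as the pmode demo later in these slides shows.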

Batch mode
Save a .m file named 'mybatch' containing:
for i = 1:1024
    A(i) = sin(i*2*pi/1024);
end
Run it in batch mode:
job = batch('mybatch')
The batch command does not block MATLAB, so you must wait for the job to finish before you can retrieve and view its results:
wait(job)
The load command transfers variables from the workspace of the worker to the workspace of the client, where you can view the results:
load(job, 'A')
plot(A)
When the job is complete, permanently remove its data:
destroy(job)

A batch parallel loop
% mybatch
parfor i = 1:1024
    A(i) = sin(i*2*pi/1024);
end
% run job in batch
job = batch('mybatch', 'Configuration', 'local', 'matlabpool', 2)
% To view the results:
wait(job)
load(job, 'A')
plot(A)
% remove its data:
destroy(job)

Parallel mode-II: pmode
>> pmode start
P>> pmode exit

pmode demo
P>> help magic                        % ask for help on a function
P>> PI = pi                           % set a variable on all the labs
P>> myid = labindex                   % lab ID
P>> all = numlabs                     % total number of labs
P>> segment = [1 2; 3 4; 5 6]         % create a replicated array on all the labs
P>> segment = segment + 10*labindex   % perform on different labs
P>> x = magic(4)                      % replicated on every lab
P>> y = codistributed(x)              % partitioned among the labs
P>> z = y + 10*labindex               % operate on the whole distributed array
P>> combined = gather(y)              % entire array in one workspace
combined is now a 4-by-4 array in the client workspace.
whos combined
To see the array, type its name:
combined

Demo: distributed array operations (repeat)

Parallel pi in pmode
Use the fact that
    ∫_0^1 4/(1+x^2) dx = π
to approximate pi by approximating the integral on the left. Divide the work between the labs by having each lab calculate the integral of the function over a subinterval of [0, 1], as shown in the picture.

Steps
1) All labs/workers compute the same function: F = 4/(1+x^2)
2) Each worker/lab calculates over a subinterval [a, b] of [0, 1]; for 2 labs the subintervals are [0, 0.50] and [0.50, 1.0]:
   a = (labindex-1)/numlabs
   b = labindex/numlabs
3) Use a MATLAB quadrature method to compute the integral:
   myIntegral = quadl(F, a, b)
4) Add the pieces together to form the entire integral over [0, 1]:
   piApprox = gplus(myIntegral)
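The four steps above can be sketched as a single spmd block rather than typed interactively in pmode (the pool size of 2 is an assumption; quadl and gplus are used exactly as in the steps):

```matlab
matlabpool open 2
spmd
    F = @(x) 4 ./ (1 + x.^2);        % the integrand, same on every lab
    a = (labindex - 1) / numlabs;    % left end of this lab's subinterval
    b = labindex / numlabs;          % right end of this lab's subinterval
    myIntegral = quadl(F, a, b);     % quadrature over [a, b]
    piApprox = gplus(myIntegral);    % global sum across all labs
end
approx = piApprox{1};                % same value on every lab; fetch lab 1's
matlabpool close
```

gplus leaves the combined sum on every lab, so indexing the resulting Composite at any lab returns the same approximation of pi.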

Parallel pi in matlabpool-mode

PART-2 - Configure MATLAB and PCT on PC - Batch script - Run PCT on remote clusters (hound, orca) - Examples and Demos

Where is the MATLAB client?
(Architecture diagram.) The client machine runs MATLAB, Simulink, blocksets, toolboxes, and the Distributed Computing Toolbox, and submits jobs via ssh to the cluster login node, where the scheduler dispatches work to the worker CPUs. Two configurations are shown: no shared file system between client and cluster, and a shared file system between them.

Configure MATLAB and PCT on PC
Cluster (server) side
- Set up the MATLAB distributed computing server engine
- Set up the 'matlab' queue
- Command/script for job submission
- Create a data directory (/scratch/userid/matlab)
Client side
- Client configuration
- Create the MATLAB batch job script
- Create a local data directory, C:\temp
Install and configure instructions are in the online document:
https://www.sharcnet.ca/help/index.php/Using_MATLAB

SHARCNET MATLAB Version info
-------------------------------------------------------------------------------------
MATLAB Version 7.11.0.584 (R2010b)
MATLAB License Number: 359672
Operating System: Microsoft Windows XP Version 5.1 (Build 2600: Service Pack 3)
Java VM Version: Java 1.6.0_17-b04 with Sun Microsystems Inc. Java HotSpot(TM) Client VM mixed mode
MATLAB Version 7.11 (R2010b)
Parallel Computing Toolbox Version 5.0 (R2010b)
May upgrade to R2011b at the end of the year.
64 server-side license seats (max. 64 MATLAB workers shared by all users)

Program Development Guidelines
1) Run code normally on your local machine. First verify all your functions, so that as you progress you are not trying to debug the functions and the distribution at the same time. Run your functions in a single instance of MATLAB software on your local computer.
2) Decide whether you need a distributed or parallel job. If your application involves large data sets on which you need simultaneous calculations performed, you might benefit from a parallel job with distributed arrays. If your application involves looped or repetitive calculations that can be performed independently of each other, a distributed job might be appropriate.
3) Modify your code for division. Decide how you want your code divided. For a distributed job, determine how best to divide it into tasks; for example, each iteration of a for-loop might define one task. For a parallel job, determine how best to take advantage of parallel processing; for example, a large array can be distributed across all your labs.

Program Development Guidelines - continued
4) Use interactive parallel mode (pmode) to develop parallel functionality. Use pmode with the local scheduler to develop your functions on several workers (labs) in parallel. As you progress to using pmode on the remote cluster, that might be all you need to complete your work.
5) Run the distributed or parallel job with a local scheduler. Create a parallel or distributed job and run it using the local scheduler with several local workers. This verifies that your code is correctly set up for batch execution and, in the case of a distributed job, that its computations are properly divided into tasks.
6) Run the distributed job on a cluster node with one task. Run your distributed job with one task to verify that remote distribution is working between your client and the cluster, and to verify file and path dependencies.
7) Run the distributed or parallel job on multiple cluster nodes. Scale up your job to include as many tasks as you need for a distributed job, or as many workers (labs) as you need for a parallel job.

First example – the dfeval function
The dfeval function lets you evaluate a function in a cluster of workers without having to define jobs and tasks yourself. When you can divide your job into similar tasks, dfeval might be an appropriate way to run it. The following code uses the local scheduler on your client computer:
results = dfeval(@sum, {[1 1] [2 2] [3 3]}, 'Configuration', 'local')
results =
    [2]
    [4]
    [6]
This example runs the job as three tasks in three separate MATLAB worker sessions, reporting the results back to the session from which you ran dfeval.
DEMO

How to run it as a batch job? – Batch script
Steps taken in the toolbox to execute the dfeval function
results = dfeval(@sum, {[1 1] [2 2] [3 3]}, 'Configuration', 'local')
1) Finds a job manager or scheduler
2) Creates a job
3) Creates tasks in that job
4) Submits the job to the queue in the job manager or scheduler
5) Retrieves the results from the job
6) Destroys the job

Batch script
% Create a job scheduler / manager object:
sched = findResource('scheduler', 'type', 'generic');
% Create a job
j = createJob(sched);
% Create three tasks within the job j
createTask(j, @sum, 1, {[1 1]});
createTask(j, @sum, 1, {[2 2]});
createTask(j, @sum, 1, {[3 3]});
% Submit the job
submit(j);
% Wait for the job to complete, then collect the results
waitForState(j);
results = getAllOutputArguments(j);
% Destroy the job
destroy(j);

Batch script – for a SHARCNET cluster
clusterHost = 'hound.sharcnet.ca';
remoteDataLocation = '/scratch/jemmyhu/matlab';  % change to your own /scratch directory
sched = findResource('scheduler', 'type', 'generic');  % Create a job scheduler
% Use a local directory as the DataLocation
set(sched, 'DataLocation', 'C:\Temp');
set(sched, 'ClusterMatlabRoot', '/opt/sharcnet/matlab/R2010b');
set(sched, 'DestroyJobFcn', @destroyJobFcn);
set(sched, 'GetJobStateFcn', @getJobStateFcn);
set(sched, 'HasSharedFilesystem', false);
set(sched, 'ClusterOsType', 'unix');
% Set the submit function for serial jobs
set(sched, 'SubmitFcn', {@distributedSubmitFcn, clusterHost, remoteDataLocation});
j = createJob(sched)
createTask(j, @sum, 1, {[1 1]});
createTask(j, @sum, 1, {[2 2]});
createTask(j, @sum, 1, {[3 3]});
submit(j)
waitForState(j, 'finished')
r = getAllOutputArguments(j)
destroy(j)

Demo - test the serial example in the online document: stest_hound_10.m (task distribution)

Second Example – a parallel job
In this example, four labs run a parallel job with a 4-by-4 magic square. The first lab broadcasts the matrix with labBroadcast to all the other labs, each of which calculates the sum of one column of the matrix. All of these column sums are combined with the gplus function to calculate the total sum of the elements of the original magic square. The function for this example is shown below.
function total_sum = colsum
if labindex == 1
    % Send magic square to other labs
    A = labBroadcast(1, magic(numlabs))
else
    % Receive broadcast on other labs
    A = labBroadcast(1)
end
% Calculate sum of column identified by labindex for this lab
column_sum = sum(A(:, labindex))
% Calculate total sum by combining column sums from all labs
total_sum = gplus(column_sum)
DEMO: colsum.m, colsum_submit.m

paralleltestfunction.m

ptest_orca_10.m

Compute pi in batch mode (quadpi.m)
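The quadpi.m source is not included in this transcript; based on the pmode steps shown earlier, a hypothetical reconstruction would be a task function along these lines (the actual SHARCNET demo file may differ):

```matlab
function approx = quadpi
% Hypothetical reconstruction of quadpi.m from the earlier pmode steps;
% intended to run inside a parallel job, where labindex/numlabs are defined.
F = @(x) 4 ./ (1 + x.^2);        % integrand whose integral over [0,1] is pi
a = (labindex - 1) / numlabs;    % this lab's subinterval of [0, 1]
b = labindex / numlabs;
myIntegral = quadl(F, a, b);     % quadrature over the subinterval
approx = gplus(myIntegral);      % combine contributions from all labs
```

A submit script like quadpi_submit.m would then create a parallel job, point it at this function, and gather the result, following the batch-script pattern shown above.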

quadpi_submit.m

Monte Carlo Multidimensional Integration (monte3org.m)
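The monte3org.m source is likewise not reproduced in this transcript; a generic serial Monte Carlo estimate of a 3-D integral follows the pattern below (the integrand, sample count, and unit-cube domain are illustrative assumptions; the actual demo file may differ, and montemd.m presumably parallelizes the sampling loop with parfor):

```matlab
% Hypothetical serial sketch of Monte Carlo integration over the unit cube.
f = @(x, y, z) exp(-(x.^2 + y.^2 + z.^2));   % illustrative integrand
N = 1e6;                                      % number of random samples
pts = rand(N, 3);                             % uniform points in [0,1]^3
vals = f(pts(:,1), pts(:,2), pts(:,3));
estimate = mean(vals);                        % cube volume is 1, so mean = integral
```

Because each sample is independent, this loop-free computation also splits naturally into per-worker chunks, which is what makes it a good task-parallel example.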

montemd.m

References
http://www.mathworks.com/products/parallel-computing/?s_cid=HP_FP_ML_parallelcomptbx
http://www.mathworks.com/products/parallel-computing/demos.html