Download presentation

Presentation is loading. Please wait.

Published byJayde Brunswick Modified about 1 year ago

1
C ENTER FOR I NTEGRATED R ESEARCH C OMPUTING MATLAB

2
Outline Part I – Interacting with Matlab Running Matlab interactively Accessing the GUI Using the terminal for command entry Using just the terminal Running Matlab in batch mode Using PBS job arrays to do embarrassingly parallel computations Part II – Speeding up Matlab Computations Symmetric Multi-Processing with Matlab Accelerating Matlab computations with GPUs Running Matlab in distributed memory environments Using the Parallel Computing Toolbox Using the Matlab Distributed Compute Engine Server Using pMatlab Part III – Mixing Matlab and Fortran/C code Compiling MEX code from C/Fortran Turning Matlab routines into C code

3
Running Matlab Interactively To use Matlab's GUI you must connect through suitable environment Why NX? Faster than using X11 forwarding (compresses data) Has clients for all major operating systems Saves your session when you are disconnected You don’t have to restart Matlab if your network connection drops. Instructions for obtaining/installing/connecting through NX can be found at:

4

5

6
Running Matlab Interactively To use GUI you must connect through suitable environment Why NX? Faster than using X11 forwarding (compresses data) Has clients for all major operating systems Saves your session when you are disconnected You don’t have to restart Matlab if your network connection drops. The link to Matlab on the NX desktop menu bar actually launches a script that submits a job to the blue hive cluster. It does not run Matlab locally, but instead uses X11 forwarding between compute nodes and the NX server.

7
Running Matlab Interactively We could also launch a terminal on the NX desktop and submit an interactive job from there.

8

9
Running Matlab Interactively We could also launch a terminal on the NX desktop and submit an interactive job from there. qsub -I -X -q interactive -l walltime=1:00:00,nodes=1:ppn=1,vmem=4gb,pvmem=-1 module load matlab-R2013a-local matlab -singlecompthread

10

11
Running Matlab Interactively We could also launch a terminal on the NX desktop and submit an interactive job from there. Occasionally the Matlab Desktop will respond slowly to commands which can be VERY frustrating. One work around is to use the terminal window as the "desktop" – while still retaining the ability to plot windows / access help etc... qsub -I -X -q interactive -l walltime=1:00:00,nodes=1:ppn=1,vmem=4gb,pvmem=-1 module load matlab matlab -singlecompthread matlab -nodesktop -nosplash

12

13
Running Matlab Interactively We could also launch a terminal on the NX desktop and submit an interactive job from there. Occasionally the Matlab Desktop will respond slowly to commands which can be VERY frustrating. One work around is to use the terminal window as the Desktop – while still retaining the ability to plot windows / access help etc... And finally you may not need to plot anything on the screen – or use any of the GUI features. In that case you can... qsub -I -X -q interactive -l walltime=1:00:00,nodes=1:ppn=1,vmem=4gb,pvmem=-1 module load matlab matlab -singlecompthread matlab -nodesktop -nosplash matlab -nodisplay

14

15
Running Matlab Interactively If you are running Matlab without a connected display you can still make plots directly to a file in Matlab You may also find it useful to enter many commands into a script file and then execute the script – so you can do something else while Matlab creates several figures etc... This is also a good way to develop a script for batch jobs. H=hilb(1000); Z=fft2(H); f=figure('Visible','off'); imagesc(log(abs(Z))); print('-dpdf','-r300', 'fig1.pdf')

16

17
Running Matlab Interactively If you are running a machine that has an X-server – you can bypass NX and just use X11 Forwarding. Though if your connection drops – your Matlab session (and your interactive job) will terminate Also if you do use NX and you finish using Matlab – please terminate your session instead of just disconnecting. This will cleanup any jobs you have running and free up resources for other users. ssh -X qsub -I -X -q interactive –l walltime=1:00:00,nodes=1:ppn=1,vmem=4gb,pvmem=-1

18

19
Outline Part I – Interacting with Matlab Running Matlab interactively Accessing the GUI Using the terminal for command entry Using just the terminal Running Matlab in batch mode Using PBS job arrays to do embarrassingly parallel computations Part II – Speeding up Matlab Computations Symmetric Multi-Processing with Matlab Accelerating Matlab computations with GPUs Running Matlab in distributed memory environments Using the Parallel Computing Toolbox Using the Matlab Distributed Compute Engine Server Using pMatlab Part III – Mixing Matlab and Fortran/C code Compiling MEX code from C/Fortran Turning Matlab routines into C code

20
Running Matlab in Batch Mode To submit a job in batch mode we need to create a batch script And a Matlab script containing the commands to run And we should place both files in a folder on /scratch where we will submit the job from. #PBS -N Matlab #PBS -q standard #PBS -l walltime=1:00:00,nodes=1:ppn=1,vmem=4gb,pvmem=-1. /usr/local/modules/init/bash module load matlab matlab -nodisplay -r "sample_script" H=hilb(1000); Z=fft2(H); imagesc(log(abs(Z))); print('-dpdf','-r300','fig1-batch.pdf'); qsub sample_script.pbs sample_script.pbs sample_script.m

21
Outline Part I – Interacting with Matlab Running Matlab interactively Accessing the GUI Using the terminal for command entry Using just the terminal Running Matlab in batch mode Using PBS job arrays to do embarrassingly parallel computations Part II – Speeding up Matlab Computations Symmetric Multi-Processing with Matlab Accelerating Matlab computations with GPUs Running Matlab in distributed memory environments Using the Parallel Computing Toolbox Using the Matlab Distributed Compute Engine Server Using pMatlab Part III – Mixing Matlab and Fortran/C code Compiling MEX code from C/Fortran Turning Matlab routines into C code

22
Using Job Arrays in Batch Mode To use a job array we need to use the “-t” PBS option And turn our Matlab script into a function that takes arguments. (sample_function.m) #PBS -t 0-3 #PBS -N Matlab #PBS -q standard #PBS -l walltime=1:00:00,nodes=1:ppn=1,vmem=4gb,pvmem=-1. /usr/local/modules/init/bash module load matlab matlab -nodisplay -r "sample_function($PBS_ARRAYID)" function sample_function(n) H=hilb(n); Z=fft2(H); imagesc(log(abs(Z))); print('-dpdf','-r300', sprintf('%s%03d%s','fig1-batch_',n,'.pdf')); sample_function.m sample_script.pbs

23
Outline Part I – Interacting with Matlab Running Matlab interactively Accessing the GUI Using the terminal for command entry Using just the terminal Running Matlab in batch mode Using PBS job arrays to do embarrassingly parallel computations Part II – Speeding up Matlab Computations Symmetric Multi-Processing with Matlab Accelerating Matlab computations with GPUs Running Matlab in distributed memory environments Using the Parallel Computing Toolbox Using the Matlab Distributed Compute Engine Server Using pMatlab Part III – Mixing Matlab and Fortran/C code Compiling MEX code from C/Fortran Turning Matlab routines into C code

24
Symmetric Multi-Processing By default Matlab uses all cores on a given node for operations that can be threaded – regardless of the submission script. Arrays and matrices Basic information: ISFINITE, ISINF, ISNAN, MAX, MIN Operators: +, -,.*,./,.\,.^, *, ^, \ (MLDIVIDE), / (MRDIVIDE) Array operations: PROD, SUM Array manipulation: BSXFUN, SORT Linear algebra Matrix Analysis: DET, RCOND Linear Equations: CHOL, INV, LDL, LINSOLVE, LU, QR Eigenvalues and singular values: EIG, HESS, SCHUR, SVD, QZ Elementary math Trigonometric: ATAN2, COS, CSC, HYPOT, SEC, SIN, TAN, including variants for radians, degrees, hyperbolics, and inverses. Exponential: EXP, POW2, SQRT Complex: ABS Rounding and remainder: CEIL, FIX, FLOOR, MOD, REM, ROUND LOG, LOG2, LOG10, LOG1P, EXPM1, SIGN, BITAND, BITOR, BITXOR Special Functions ERF, ERFC, ERFCINV, ERFCX, ERFINV, GAMMA, GAMMALN Data Analysis CONV2, FILTER, FFT and IFFT of multiple columns or long vectors, FFTN, IFFTN

25
Symmetric Multi-Processing To be sure you only use the resources you request, you should either request an entire node and all of the CPU’s... Or request a single cpu and specify that Matlab should only use a single thread qsub -I -X -q interactive -l walltime=1:00:00,nodes=1:ppn=8,vmem=16gb,pvmem=-1. /usr/local/modules/init/bash module load matlab matlab qsub -I -X -q interactive -l walltime=1:00:00,nodes=1:ppn=1,vmem=4gb,pvmem=-1. /usr/local/modules/init/bash module load matlab matlab -singleCompThread

26
Outline Part I – Interacting with Matlab Running Matlab interactively Accessing the GUI Using the terminal for command entry Using just the terminal Running Matlab in batch mode Using PBS job arrays to do embarrassingly parallel computations Part II – Speeding up Matlab Computations Symmetric Multi-Processing with Matlab Accelerating Matlab computations with GPUs Running Matlab in distributed memory environments Using the Parallel Computing Toolbox Using the Matlab Distributed Compute Engine Server Using pMatlab Part III – Mixing Matlab and Fortran/C code Compiling MEX code from C/Fortran Turning Matlab routines into C code

27
Using GPUs with Matlab Matlab can use GPUs to do calculations, provided a GPU is available on the node Matlab is running on. We can query the connected GPUs from within Matlab using qsub -I -X -q blugpu -l walltime=1:00:00,nodes=1:ppn=1:gpus=1,vmem=16gb,pvmem=-1. /usr/local/modules/init/bash module load matlab module load cuda matlab gpuDeviceCount() gpuDevice()

28

29
Using GPUs with Matlab Matlab can use GPUs to do calculations, provided a GPU is available on the node Matlab is running on. We can query the connected GPUs from within Matlab using And obtain a list of GPU supported functions using qsub -I -X -q blugpu -l walltime=1:00:00,nodes=1:ppn=1:gpus=1,vmem=16gb,pvmem=-1. /usr/local/modules/init/bash module load matlab module load cuda matlab gpuDeviceCount() gpuDevice() methods('gpuArray')

30

31
Using GPUs with Matlab So there is a 2D FFT – but no Hilbert function... We could do the log and abs functions on the GPU as well. H=hilb(1000); H_=gpuArray(H); Z_=fft2(H_); Z=gather(Z_); imagesc(log(abs(Z))); H=hilb(1000); H_=gpuArray(H); Z_=fft2(H_); imagesc(gather(log(abs(Z_))); Distribute data to GPU FFT performed on GPU Gather data from GPU onto CPU

32
Using GPUs with Matlab For our example, doing the FFT on the GPU is 7x faster. (4x if you include moving the data to the GPU and back) >> H=hilb(5000); >> tic; A=gather(gpuArray(H)); toc Elapsed time is seconds. >> tic; A=gather(fft2(gpuArray(H))); toc Elapsed time is seconds. >> tic; A=fft2(H); toc Elapsed time is seconds.

33
Using GPUs with Matlab Matlab has no built in hilb() function that can run on the GPU – but we can write our own function(kernel) in cuda – and save it to hilbert.cu And compile it with nvcc to generate a Parallel Thread eXecution file __global__ void HilbertKernel( double * const out, size_t const numRows, size_t const numCols) { const int rowIdx = blockIdx.x * blockDim.x + threadIdx.x; const int colIdx = blockIdx.y * blockDim.y + threadIdx.y; if ( rowIdx >= numRows ) return; if ( colIdx >= numCols ) return; size_t linearIdx = rowIdx + colIdx*numRows; out[linearIdx] = 1.0 / (double)(1+rowIdx+colIdx) ; } nvcc -ptx hilbert.cu

34
Using GPUs with Matlab We have to initialize the kernel and specify the grid size before executing the kernel. The default for matlab kernel’s is 1 thread per block, but we could create fewer blocks that were each 10 x 10 threads. In any event, our speedup is a factor of 50 compared to 1 CPU. H_=gpuArray.zeros(1000); hilbert_kernel=parallel.gpu.CUDAKernel('hilbert.ptx','hilbert.cu'); hilbert_kernel.GridSize=size(H_); H_=feval(hilbert_kernel, H_, 1000,1000); Z_=fft2(H_); imagesc(gather(log(abs(Z_)))); hilbert_kernel.ThreadBlockSize=[10,10,1]; hilbert_kernel.GridSize=[100,100];

35
Outline Part I – Interacting with Matlab Running Matlab interactively Accessing the GUI Using the terminal for command entry Using just the terminal Running Matlab in batch mode Using PBS job arrays to do embarrassingly parallel computations Part II – Speeding up Matlab Computations Symmetric Multi-Processing with Matlab Accelerating Matlab computations with GPUs Running Matlab in distributed memory environments Using the Parallel Computing Toolbox Using the Matlab Distributed Compute Engine Server Using pMatlab Part III – Mixing Matlab and Fortran/C code Compiling MEX code from C/Fortran Turning Matlab routines into C code

36
Parallel Computing Toolbox As an alternative you can also use the Parallel Computing Toolbox. This supports parallelism via MPI You can enable a pool of matlab workers using matlabpool You should create a pool that is the same size as the number of processors you requested in your job submission. Matlab also sells licenses for using a Distributed Computing Server which allows for matlabpools that use more than just the local node. qsub -I -X -q interactive -l walltime=1:00:00,nodes=1:ppn=8,vmem=16gb,pvmem=-1. /usr/local/modules/init/bash module load matlab matlab -singleCompThread matlabpool(8)

37
Parallel Computing Toolbox You can achieve parallelism in several ways: parfor loops – execute for loops in parallel smpd – execute instructions in parallel (using ‘labindex’ or ‘drange’) pmode – interactive version of smpd distributed arrays – very similar to gpuArrays.

38
Parallel Computing Toolbox You can achieve parallelism in several ways: parfor loops – execute for loops in parallel smpd – execute instructions in parallel (using ‘labindex’ or ‘drange’) pmode – interactive version of smpd distributed arrays – very similar to gpuArrays. matlabpool(4) parfor n=1:100 H=hilb(n); Z=fft2(H); f=figure('Visible','off'); imagesc(log(abs(Z))); print('-dpdf','-r300', sprintf('%s%03d%s','fig1-batch_',n,'.pdf')); end matlabpool close

39
Parallel Computing Toolbox You can achieve parallelism in several ways: parfor loops – execute for loops in parallel smpd – execute instructions in parallel (using ‘labindex’ or ‘drange’) pmode – interactive version of smpd distributed arrays – very similar to gpuArrays. matlabpool(4) spmd for n=drange(1:100) H=hilb(n); Z=fft2(H); f=figure('Visible','off'); imagesc(log(abs(Z))); end matlabpool close matlabpool(4) spmd for n=labindex:numlabs:100 H=hilb(n); Z=fft2(H); f=figure('Visible','off'); imagesc(log(abs(Z))); end matlabpool close

40
Parallel Computing Toolbox You can achieve parallelism in several ways: parfor loops – execute for loops in parallel smpd – execute instructions in parallel (using ‘labindex’ or ‘drange’) pmode – interactive version of smpd distributed arrays – very similar to gpuArrays. pmode start 4 pmode lab2client H 3 H3 H3 pmode close n=labindex; H=hilb(n); Z=fft2(H); f=figure('Visible','off'); imagesc(log(abs(Z))); print('-dpdf','-r300', sprintf('%s%03d%s','fig1-batch_',n,'.pdf'));

41

42
Parallel Computing Toolbox You can achieve parallelism in several ways: parfor loops – execute for loops in parallel smpd – execute instructions in parallel (using ‘labindex’ or ‘drange’) pmode – interactive version of smpd distributed arrays – very similar to gpuArrays H=hilb(1000); H_=gpuArray(H); Z_=fft2(H_); Z=gather(Z_); imagesc(log(abs(Z))); matlabpool(8) H=hilb(1000); H_=distributed(H); Z_=fft(fft(H_,[],1),[],2); Z=gather(Z_); imagesc(log(abs(Z))); matlabpool close Example using gpuArray Example using distributed arrays

43
Parallel Computing Toolbox matlabpool(4) spmd codist=codistributor1d(1,[250,250,250,250],[1000,1000]); [i_lo, i_hi]=codist.globalIndices(1); H_local=zeros(250,1000); for i=i_lo:i_hi for j=1:1000 H_local(i-i_lo+1,j)=1/(i+j-1); end H_ = codistributed.build(H_local, codist); end Z_=fft(fft(H_,[],1),[],2); Z=gather(Z_); imagesc(log(abs(Z))); matlabpool close What about building hilbert matrix in parallel? Define partition Get local indices in x-direction Allocate space for local part Initialize local array with Hilbert values. Assemble codistributed array Now it's distributed like before!

44
Outline Part I – Interacting with Matlab Running Matlab interactively Accessing the GUI Using the terminal for command entry Using just the terminal Running Matlab in batch mode Using PBS job arrays to do embarrassingly parallel computations Part II – Speeding up Matlab Computations Symmetric Multi-Processing with Matlab Accelerating Matlab computations with GPUs Running Matlab in distributed memory environments Using the Parallel Computing Toolbox Using the Matlab Distributed Compute Engine Server Using pMatlab Part III – Mixing Matlab and Fortran/C code Compiling MEX code from C/Fortran Turning Matlab routines into C code

45
Using the Matlab Distributed Compute Engine To get started, first cd into an empty directory and run This will generate 4 files: mdce_job.pbs – pbs submission script mdce_script.m – sample matlab script that uses parallel computing toolbox mdce_profile.m – matlab function that uses your environment variables to locate the matlab compute cluster for your job mdce_cleanup is an epilogue script that cleans up the matlab distributed compute server when your job terminates Then you can submit the sample job with mdce_init qsub mdce_job.pbs

46
Here is the job submission script This script loads the matlab module, starts the cluster with pbs_mdce_start, and runs the matlab script "mdce_script.m" #!/bin/bash #PBS -N Matlab_mdce #PBS -j oe #PBS -l nodes=2:ppn=8,pvmem=2000mb #PBS -l walltime=1:00:00 #PBS -l epilogue=mdce_cleanup #PBS -q standard #PBS -o matlab.log. /usr/local/modules/init/bash module load matlab-R2013a-local cd $PBS_O_WORKDIR pbs_mdce_start matlab -nodisplay -r "mdce_script" mdce_job.pbs Note that other versions of matlab could take hours to start the matlab cluster!!! This epilogue script is important to ensure that the cluster is taken down when your job terminates Using the Matlab Distributed Compute Engine

47
And here is the sample matlab script The mdce_profile() function returns a profile that can be used to connect to the mdce cluster for your job. You can then use matlabpool or pmode, or spmd etc... to startup parallel computations across the matlab cluster. profile=mdce_profile() matlabpool('open', profile) parfor n=1:matlabpool('size') H=hilb(n); Z=fft2(H); imagesc(log(abs(Z))); print('-dpdf','-r300',sprintf('%s%03d%s','fig1-batch',n,'.pdf')); end matlabpool('close') mdce_script.m Using the Matlab Distributed Compute Engine

48
For interactive mode, you can use the qMatlab_mdce script. This script will inherit your matlab path from your environment, so be sure to load the matlab-R2013a-local module to speed up the initilization of the cluster. This will create a matlab cluster which in this case consists of 4 nodes each with 8 workers and 16 GB of memory per. To use the matab cluster, load the profile using the mdce_profile() function and then open the pool of workers with matlabpool – or pmode etc... mkdir /scratch/jcarrol5/matlab_mdce cd /scratch/jcarrol5/matlab_mdce module load matlab-R2013a-local qMatlab_mdce profile=mdce_profile() matlabpool('open', profile) Using the Matlab Distributed Compute Engine

49
Outline Part I – Interacting with Matlab Running Matlab interactively Accessing the GUI Using the terminal for command entry Using just the terminal Running Matlab in batch mode Using PBS job arrays to do embarrassingly parallel computations Part II – Speeding up Matlab Computations Symmetric Multi-Processing with Matlab Accelerating Matlab computations with GPUs Running Matlab in distributed memory environments Using the Parallel Computing Toolbox Using the Matlab Distributed Compute Engine Server Using pMatlab Part III – Mixing Matlab and Fortran/C code Compiling MEX code from C/Fortran Turning Matlab routines into C code

50
Using pMatlab pMatlab is an alternative method to get distributed matlab functionality without relying on Matlab’s Distributed Computing Server. It is built on top of MapMPI (an MPI implementation for matlab – written in matlab - that uses file I/O for communication) It supports various operations on distributed arrays (up to 4D) Remapping, aggregating, finding non-zero entries, transposing, ghosting Elementary math functions (trig, exponential, complex, remainder/rounding) 2D Convolutions, FFTs, Discrete Cosine Transform FFT's need to be properly mapped (cannot be distributed along transform dimension). It does not have as much functionality as the parallel computing toolbox – but it does support ghosting and more flexible partitioning!

51
Using pMatlab Since pMatlab works by launching other Matlab instances – we need them to startup with pMatlab functionality. To do so we need to add a few lines to our startup.m file in our matlab path. addpath('/usr/local/pMatlab/MatlabMPI/src'); addpath('/usr/local/pMatlab/src'); rehash; pMatlabGlobalsInit;

52
Running pMatlab in Batch Mode To submit a job in batch mode we need to create a batch script And a Matlab script to launch the pMatlab script #PBS -N Matlab #PBS -q standard #PBS -l walltime=1:00:00,nodes=2:ppn=8,vmem=32gb,pvmem=-1. /usr/local/modules/init/bash module load matlab matlab -nodisplay -r "pmatlab_launcher" [sreturn, nProcs]=system('cat $PBS_NODEFILE | wc -l'); nProcs=str2num(nProcs); [sreturn, machines]=system('cat $PBS_NODEFILE | uniq'); machines=regexp(machines, '\n', 'split'); machines=machines(1:size(machines,2)-1); eval(pRUN('pmatlab_script',nProcs,machines)); sample_script.pbs pmatlab_launcher.m

53
Running pMatlab in Batch Mode And finally we have our pmatlab script. Xmap=map([Np 1],{},0:Np-1); H_=zeros(1000,1000,Xmap); [I1,I2]=global_block_range(H_); H_local=zeros(I1(2)-I1(1)+1,I2(2)-I2(1)+1); for i=I1(1):I1(2) for j=I2(1):I2(2) H_local(i-I1(1)+1,j-I2(1)+1)=1/(i+j-1); end H_=put_local(H_,H_local); Z_=fft(fft(H_,[],2),[],1); Z=agg(Z_); if (pMATLAB.my_rank == pMATLAB.leader) f=figure('Visible','off'); imagesc(log(abs(Z))); print('-dpdf','-r300', 'fig1.pdf'); end map for distributing array Distributed matrix constructor Indices for local portion of array Allocate and populate local portion of array with Hilbert matrix values Copy local values into distributed array Do y-fft and do x-fft. Z_ has different map Collect resulting matrix onto 'leader' Plot result from 'leader' matlab process pmatlab_script.m X = put_local(X, fft(local(X),[],2)); Z = transpose_grid(X); Z = put_local(Z, fft(local(Z),[],1));

54
Using pMatlab PBS is unaware of matlab sessions launched from 'pRUN' and therefore cannot properly clean up if something goes wrong (job runs out of walltime etc...) To avoid leaving orphaned Matlab processes on other machines, modify your PBS script to run this epilogue script which must have user-execute permissions #!/bin/bash cd $PBS_O_WORKDIR/MatMPIa echo "running prologue" pwd; for i in `ls pid.*`; do machine=`echo $i | awk -F '.' '{print $2}'`; pid=`echo $i | awk -F '.' '{print $3}'`\; ssh $machine "(kill -9 $pid)" && rm -rf $i; done epilogue_script.sh #PBS -l epilogue=epilogue_script.sh

55
Outline Part I – Interacting with Matlab Running Matlab interactively Accessing the GUI Using the terminal for command entry Using just the terminal Running Matlab in batch mode Using PBS job arrays to do embarrassingly parallel computations Part II – Speeding up Matlab Computations Symmetric Multi-Processing with Matlab Accelerating Matlab computations with GPUs Running Matlab in distributed memory environments Using the Parallel Computing Toolbox Using the Matlab Distributed Compute Engine Server Using pMatlab Part III – Mixing Matlab and Fortran/C code Compiling MEX code from C/Fortran Turning Matlab routines into C code

56
Compiling Mex Code There is a configuration file for mex that you can place in your ~/.matlab/R2012b/ folder – or whatever version of matlab you are using. The file can be downloaded from the CIRC wiki

57
Compiling Mex Code C, C++, or Fortran routines can be called from within Matlab. #include "fintrf.h" subroutine mexfunction(nlhs, plhs, nrhs, prhs) mwPointer :: plhs(*), prhs(*) integer :: nlhs, nrhs mwPointer :: mxGetPr mwPointer :: mxCreateDoubleMatrix real(8) :: mxGetScalar mwPointer :: pr_out integer :: n n = nint(mxGetScalar(prhs(1))) plhs(1) = mxCreateDoubleMatrix(n,n, 0) pr_out = mxGetPr(plhs(1)) call compute(%VAL(pr_out),n) end subroutine mexfunction subroutine compute(h, n) integer :: n real(8) :: h(n,n) integer :: i,j do i=1,n do j=1,n h(i,j)=1d0/(i+j-1d0) end do end subroutine compute mex hilbert.f90 >> H=hilbert(10)

58
Outline Part I – Interacting with Matlab Running Matlab interactively Accessing the GUI Using the terminal for command entry Using just the terminal Running Matlab in batch mode Using PBS job arrays to do embarrassingly parallel computations Part II – Speeding up Matlab Computations Symmetric Multi-Processing with Matlab Accelerating Matlab computations with GPUs Running Matlab in distributed memory environments Using the Parallel Computing Toolbox Using the Matlab Distributed Compute Engine Server Using pMatlab Part III – Mixing Matlab and Fortran/C code Compiling MEX code from C/Fortran Turning Matlab routines into C code

59
Turning Matlab code into C First we create a log_abs_fft_hilb.m function And then we run This will produce a mex file that we can test. We could have specified the type of 'n' in our matlab function function result = log_abs_fft_hilb(n) result=log(abs(fft2(hilb(n)))); >> codegen log_abs_fft_hilb.m –args {uint32(0)} >> A=log_abs_fft_hilb_mex(uint32(16)); >> B=log_abs_fft_hilb(16); >> max(max(abs(A-B))) ans = e-16 function result = log_abs_fft_hilb(n) assert(isa(n,'uint32')); result=log(abs(fft2(hilb(n))));

60
Turning Matlab code into C Now we can also export a static library that we can link to: This will create a subdirectory codegen/lib/log_abs_fft_hilb that will have the source files '.c and.h' as well as a compiled object files '.o' and a library 'log_abs_fft_hilb.a' The source files are portable to any platform with a 'C' compiler (ie BlueStreak). We can rebuild the library on BlueStreak by running >> codegen log_abs_fft_hilb.m -config coder.config('lib') -args {'uint32(0)'} mpixlc –c *.c ar rcs log_abs_fft_hilb.a *.o

61
Turning Matlab code into C To use the function, we still need to write a main subroutine that links to it. This requires working with matlab's variable types (which include dynamically resizable arrays) #include "stdio.h" #include "rtwtypes.h" #include "log_abs_fft_hilb_types.h" void main() { uint32_T n=64; emxArray_real_T *result; int32_T i,j; emxInit_real_T(&result, 2); log_abs_fft_hilb(n, result); for(i=0;i size[0];i++) { for(j=0;j size[1];j++) { printf("%f ",result->data[i+result->size[0]*j]); } printf("\n"); } emxFree_real_T(&result); } Matlab type definitions Argument to Matlab function Return value of Matlab function Initialize Matlab array to have rank 2 Call matlab function Free up memory associated with return array Output result in column major order Exported code was 2x slower.

62
Turning Matlab code into C And here is another example of calling 2D fft's on real data void main() { int32_T q0; int32_T i; int32_T n=8; emxArray_creal_T *result; emxArray_real_T *input; emxInit_creal_T(&result, 2); emxInit_real_T(&input, 2); q0 = input->size[0] * input->size[1]; input->size[0]=n; input->size[1]=n; emxEnsureCapacity((emxArray__common *)input, q0, (int32_T)sizeof(real_T)); for(j=0;j size[1];j++ { for(i=0;i size[0];i++) { input->data[i+input->size[0]*j]=1.0 / (real_T)(i+j+1); } my_fft(input, result); for(i=0;i size[0];i++) { for(j=0;j size[1];j++) { printf("[% 10.4f,% 10.4f] ", result->data[i+result->size[0]*j].re, result->data[i+result->size[0]*j].im); } printf("\n"); } emxFree_creal_T(&result); emxFree_real_T(&input); }

63
Turning Matlab code into C Exported FFT's only work on vectors of length 2 N Error checking is disabled in exported C code Mex code will have the same functionality as exported C code, but will also have error checking. It will warn about doing FFT's on arbitrary length vectors, etc... Always test your mex code!

64
Matlab code is not that different from C code #include void main() { int n=4096; int i,j; double complex temp[n][n], input[n][n]; double result[n][n]; fftw_plan p; p=fftw_plan_dft_2d(n, n, &input[0][0], &temp[0][0], FFTW_FORWARD, FFTW_ESTIMATE); for (i=0;i

Similar presentations

© 2016 SlidePlayer.com Inc.

All rights reserved.

Ads by Google