
1 18.337: Image Median Filter
Rafael Palacios
Aeronautics and Astronautics department. Visiting professor (IIT-Institute for Research in Technology, University Pontificia Comillas, Madrid, Spain)

2 MEDIAN FILTER

3 Median Filter

4 Median filter algorithm
The median filter is a nonlinear operation for noise reduction (dust or spikes).
It eliminates noise while preserving edges.
It assigns to each point the median value of its neighborhood.
Cost: n*ns*log(ns)
Matlab function:
– C=medfilt2(cn); % 3x3 neighborhood
– C=medfilt2(cn,[r c]); % rxc neighborhood
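To make the per-pixel work concrete, here is a minimal sketch of a sort-based median filter in plain Matlab. It is not the implementation behind medfilt2; the test image and the names A, P, B, r and c are made up for illustration. It assumes an odd neighborhood size and zero padding at the borders (the documented default of medfilt2), and it reads ns in the cost expression above as the number of samples in one neighborhood.

```matlab
% Naive 2-D median filter: sort each r-by-c neighborhood, keep the middle value.
% Assumptions: r and c are odd; borders are zero-padded.
A = uint8([10 10  10 10 10;
           10 10 255 10 10;
           10 10  10 10 10;
           10  0  10 10 10]);        % test image with one "salt" and one "pepper" pixel
r = 3; c = 3;                        % neighborhood size (rows, columns)
pr = (r-1)/2; pc = (c-1)/2;          % half-widths of the neighborhood
[nr, nc] = size(A);
P = zeros(nr+2*pr, nc+2*pc, 'uint8');    % zero-padded copy of A
P(pr+1:pr+nr, pc+1:pc+nc) = A;
B = zeros(nr, nc, 'uint8');
for i = 1:nr
    for j = 1:nc
        w = P(i:i+r-1, j:j+c-1);     % neighborhood centered on pixel (i,j)
        s = sort(w(:));              % sorting ns values costs about ns*log(ns)
        B(i,j) = s((r*c+1)/2);       % middle element of the sorted list = median
    end
end
disp(B)
```

Both spikes disappear in the interior, while border pixels are pulled toward zero by the padding. Production filters usually avoid a full sort per pixel, which may be one reason measured timings do not follow this cost model closely.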

5 MATRIX PREPARATION

6 Size adjustment
– 1024x1600x3: 5 MB
– 2048x3200x3: 20 MB
[Figure: original image]

7 Noise added
cn=imnoise(c,'salt & pepper');
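For readers without the Image Processing Toolbox, the sketch below shows roughly what salt & pepper noise does to a uint8 image. It is a hand-rolled stand-in, not the code of imnoise: the flat gray test image and the variable names d and u are assumptions, and the density 0.05 is the documented default of imnoise.

```matlab
% Hand-rolled salt & pepper noise on a uint8 image (illustrative stand-in for imnoise).
c  = uint8(128 * ones(64, 64));      % hypothetical flat gray test image
d  = 0.05;                           % noise density (documented default of imnoise)
u  = rand(size(c));                  % one uniform random draw per pixel
cn = c;
cn(u < d/2) = 0;                     % "pepper": roughly d/2 of the pixels become black
cn(u >= d/2 & u < d) = 255;          % "salt": roughly d/2 of the pixels become white
changed = (cn ~= c);
fprintf('fraction of corrupted pixels: %.3f\n', mean(changed(:)));
```

Because the corrupted pixels sit at the extremes of the intensity range, they land at the ends of each sorted neighborhood, which is why a median removes them so effectively.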

8 EXPERIMENTAL RESULTS

9 Sensitivity to image size: ~O(n)

10 Sensitivity to neighborhood size: unexpected!

11 Basic experiments
Original matrix size: 2048x3200x3=20M
Matrix sizes: n=[20M, 80M, 320M, 1280M] (x4 steps)
Neighborhood sizes: nn=[3 5 9 17 33 65]; (2^n + 1 neighborhood)
Partitioning strategies:

12 Computer systems
Dell (Xeon 2.67 GHz, 8M L3, 12 GB DDR3 1066MHz)
– Matlab single core
– Matlab parallel toolbox
– Matlab with pMatlab
Cluster (beagle, beowulf)
– MPI

13 SINGLE-CORE RESULTS

14 Matlab Single-Core

15 PARALLEL COMPUTING TOOLBOX

16 Matlab Multi-Core
Parallel computing toolbox using ‘spmd’
Image size=80MB, neighborhood=65
Worker time matches prediction

17 Matlab Multi-Core
With spmd there is an overhead of 1.5s for the 80MB matrix (transfer rate 200 MB/s).
There is no memory conflict because each lab works on its own copy of the image.
Parallelization by rows or by columns is equivalent.
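Since every lab holds a full copy of the image, each one only needs to know which columns it is responsible for. The slides do not show how that split is done, so the following is a hypothetical sketch: the even split, the halo of (k-1)/2 extra columns read on each side, and the names owned and reads are all assumptions. It runs serially; inside an spmd block the loop variable lab would be replaced by labindex and nlabs by numlabs.

```matlab
% Column ranges each worker would process (serial stand-in for an spmd block).
ncols = 3200;  nlabs = 4;  k = 65;       % image columns, workers, neighborhood size
h = (k-1)/2;                             % halo: extra columns a k-by-k median must read
edges = round(linspace(0, ncols, nlabs+1));
owned = zeros(nlabs, 2);                 % [first last] output column per worker
reads = zeros(nlabs, 2);                 % [first last] input column per worker
for lab = 1:nlabs                        % inside spmd: lab = labindex
    owned(lab,:) = [edges(lab)+1, edges(lab+1)];
    reads(lab,:) = [max(owned(lab,1)-h, 1), min(owned(lab,2)+h, ncols)];
    fprintf('lab %d: owns %d:%d, reads %d:%d\n', lab, owned(lab,:), reads(lab,:));
end
```

Splitting by rows would use the same arithmetic on the row index, which fits the slide's remark that both choices are equivalent when every lab already has the whole matrix.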

18 Matlab Multi-Core
8-core computer, slower memory: 2x Xeon Quad 2.26GHz, 8GB 667MHz
More overhead

19 pMATLAB

20 pMatlab
Allows running Matlab in parallel by launching several Matlab processes that communicate using MPI.
Communications are transparent to the user, since pMatlab uses a distributed matrix approach.

21 How it works
Several Matlab processes are started.
The leader process loads the image into a shared matrix.
Each subprocess receives its corresponding section of the image in X.
Each subprocess applies the median filter and stores its results in Y.
The leader process aggregates the results.

22 Results
Computing time does not decrease significantly using double.
It scales well using uint8, since less data has to be moved.
[Plots: double, uint8]

23 Testing remarks
Initially the pMatlab algorithm was implemented using 2D double matrices.
– Filtering was performed in three steps (R, G, B).
– The conversion to double multiplied the size of the matrices by 8 (affecting communications).
The final implementation used 3D uint8 matrices.

24 CONCLUSION

25 Conclusion
Performance may depend more on the algorithm than on parallelization (5x5 neighborhood).
Matlab’s Parallel Computing Toolbox does not use shared memory.
The Parallel Computing Toolbox uses a lot of memory and communication, because the whole matrix is propagated to all clients.
– The algorithm was implemented with spmd.
– It is possible to use distributed matrices to improve this.
– It is possible to use sliced variables with parfor loops.
pMatlab uses memory efficiently.
The MPI version was not developed.

26 Conclusion Speedup comparison

27 Conclusion pMatlab using double pMatlab using uint8

28 pMatlab (3D uint8) 320MB
This slide shows the effect of data transfer.
For larger sizes, the impact of latencies is reduced (computing time and transmission time are linear with size).
Speedup is almost perfect in pMatlab, but worse in the Toolbox.
The amount of data that must be sent approaches 320MB asymptotically in the case of pMatlab, whereas it grows linearly with the number of processors in the case of the Parallel Computing Toolbox.

320MB image matrix
          pMatlab                  Toolbox
          total time   speedup     total time   speedup
1 core    138.8        1.0         132          1.0
2 core    71.6         1.9         72.1         1.8
4 core    40.5         3.4         46.1         2.9
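The two growth patterns described on this slide can be written as a rough model. This is a sketch under two assumptions that the slides only imply: in pMatlab the leader keeps its own 1/p share and ships the rest (consistent with the 50% and 75% figures quoted on slide 32), and the Toolbox sends one full copy of the matrix to each of the p workers. M is the matrix size and D the data sent; both symbols are introduced here for illustration.

```latex
\[
D_{\text{pMatlab}}(p) \approx M\,\frac{p-1}{p} \;\longrightarrow\; M \quad (p \to \infty),
\qquad
D_{\text{Toolbox}}(p) \approx p\,M
\]
\[
M = 320\ \text{MB}: \quad
D_{\text{pMatlab}}(2) \approx 160\ \text{MB}, \quad
D_{\text{pMatlab}}(4) \approx 240\ \text{MB}, \quad
D_{\text{Toolbox}}(2) \approx 640\ \text{MB}, \quad
D_{\text{Toolbox}}(4) \approx 1280\ \text{MB}
\]
```

Under this model the Toolbox moves several times more data than pMatlab at 4 cores, which is in line with its lower speedup in the table.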

29 BACKUP SLIDES

30 Parallel computing toolbox: memory issues
    %Activate parallel computing
    %matlabpool(4)
    tic
    %Create threads
    spmd
        c = myfilterP(a,labindex,numlabs);
    end
    toc
    %Gather results from threads (inefficient memory allocation)
    result=[];
    for ii=1:length( c )
        result=[result,c{ii}];
    end
    toc
    %Close parallel computing
    %matlabpool close
    …
    spmd(4)
        if labindex==1
            c = myfilterP(a1);
        end
        if labindex==2
            c = myfilterP(a2);
        end
        if labindex==3
            c = myfilterP(a3);
        end
        if labindex==4
            c = myfilterP(a4);
        end
    end
Same result: all 4 matrices are sent to all threads.
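The gathering loop on this slide is flagged as inefficient because result is reallocated each time a block is appended. One possible alternative, sketched below, is to size the output once and copy each block into place. The cell array c here is only a stand-in for the Composite that spmd returns, and the block sizes and the names widths, start and stop are made up for illustration.

```matlab
% Cell array standing in for the per-worker results returned by spmd (assumption).
c = {ones(4,2), 2*ones(4,3), 3*ones(4,2), 4*ones(4,3)};

% Pattern from the slide: result grows, and is reallocated, on every pass.
grown = [];
for ii = 1:length(c)
    grown = [grown, c{ii}];
end

% Alternative: allocate the full output once, then copy each block into its columns.
widths = cellfun(@(x) size(x,2), c);     % number of columns in each block
stop   = cumsum(widths);                 % last output column of each block
start  = stop - widths + 1;              % first output column of each block
result = zeros(size(c{1},1), sum(widths));
for ii = 1:length(c)
    result(:, start(ii):stop(ii)) = c{ii};
end

% For an ordinary cell array the same concatenation is a one-liner.
oneShot = [c{:}];
```

With large image blocks the preallocated version avoids repeated copies of the partial result; whether that matters in practice depends on how many workers there are.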

31 pMatlab: sending initial data to clients
    PARALLEL = 1;
    if (PARALLEL)
        %Create map for XL. The leader process owns all data
        mapL=map([1 1],{},0);
        %Create map for distributed matrices X and Y. Each processor gets a set of columns
        mapM=map([1 Np],{},0:Np-1);
    else
        mapL=1;
        mapM=1;
    end
    %Create matrices XL, X and Y
    XL=zeros(n,m,mapL); %owned by Pid 0
    X=zeros(n,m,mapM);  %distributed input
    Y=zeros(n,m,mapM);  %distributed output
    if Pid==0 %only the main process makes the initialization
        load input_matrix
        XL(:,:)=a; %all data stored in Pid 0
    end
    …
    X(:,:)=XL; %only leader process has a non-empty X,
               % so only leader process writes something to X.
               %Writing to X involves sending data to subprocesses, since
               % different chunks of X belong to different Pids.
    %Get local part in a standard double matrix. It is faster to work with local matrices.
    Xloc=local(X);
    %code
    Y=put_local(Y,res); %After obtaining the resulting matrix res, store it in distributed matrix Y

32 pMatlab (double)
          computing   %        comm   %        total time   speedup
1 core    34.7        93.8%    2.3    6.2%     37           1.0
2 core    18.2        75.8%    5.8    24.2%    24           1.5
4 core    8.4         52.5%    7.6    47.5%    16           2.3
More data transfer occurs with 4 cores (75% of the matrix) than with 2 cores (50% of the matrix is copied back and forth).
Results are consistent.
The conversion from uint8 to double penalizes the pMatlab tests: the 80MB image matrix is in fact 630MB in double format.
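The 630MB figure can be checked from the matrix sizes given on slide 11. The calculation below assumes that the "80M" matrix holds exactly four times as many elements as the original 2048x3200x3 image (the x4 step), that a uint8 element takes 1 byte and a double 8 bytes, and that MB means 10^6 bytes.

```latex
\[
N_{20\text{M}} = 2048 \times 3200 \times 3 = 19\,660\,800,
\qquad
N_{80\text{M}} = 4\,N_{20\text{M}} = 78\,643\,200
\]
\[
\text{uint8: } 78\,643\,200 \times 1\ \text{B} \approx 78.6\ \text{MB},
\qquad
\text{double: } 78\,643\,200 \times 8\ \text{B} = 629\,145\,600\ \text{B} \approx 630\ \text{MB}
\]
```

The same factor of 8 applies to every transfer between processes, which is why communication takes a much larger share of the total time in this table than in the uint8 runs.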

33 pMatlab (3D uint8)
Times are smaller.
Speedup is better because communication delays don’t penalize as much.
          computing   %        comm   %        total time   speedup
1 core    32.6        98.8%    0.4    1.2%     33           1.0
2 core    17          90.9%    1.7    9.1%     18.7         1.8
4 core    9           81.8%    2      18.2%    11           3.0

