Slide 1: HPC 01 Communication Models, Speedup and Scalability (Schoenauer, secs. 8.2, 8.4)
Slide 2: Message Passing Time
- Time to send l bytes:
  t_comm(l) = t_startup + (h - 1) * t_start-hop + (l + l_0) * t_send + t_block
  - t_startup: total time to set up the communication
  - t_start-hop: time to switch each "hop" in wormhole routing
  - h: number of hops
  - l: number of bytes to transfer
  - l_0: extra header bytes moved along with the payload
  - t_send: time to actually transfer one byte
  - t_block: time spent in blocked messages en route
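The model above is easy to evaluate numerically. A minimal sketch; all parameter values below are illustrative assumptions, not measurements of any real network:

```python
def t_comm(l, t_startup, t_hop, h, l0, t_send, t_block=0.0):
    """Message passing time per the slide's model:
    t_comm(l) = t_startup + (h-1)*t_hop + (l + l0)*t_send + t_block."""
    return t_startup + (h - 1) * t_hop + (l + l0) * t_send + t_block

# Illustrative values: times in microseconds, sizes in bytes.
t = t_comm(l=1024, t_startup=50.0, t_hop=0.5, h=4, l0=64, t_send=0.01)
print(t)  # startup dominates: 50 + 1.5 + 10.88 = roughly 62.4 us
```

Note that for a 1 kB message the startup term dominates the per-byte term, which is what motivates the consequences drawn on the next slide.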
Slide 3: Communication Model
- Effective speed = l / t_comm(l). The achieved speed is far below the advertised theoretical hardware limit.
- Consequences:
  - Send messages in blocks; avoid many small single messages.
  - Arrange data distributions to get nearest-neighbor communication, e.g. use a ring shift between direct neighbors.
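The "send in blocks" consequence follows directly from the startup term in the model. A small sketch with a simplified two-term version of the formula (hop and header terms dropped; the numbers are assumptions for illustration):

```python
def t_msg(l, t_startup=50.0, t_send=0.01):
    # Simplified model: startup latency plus per-byte transfer time.
    return t_startup + l * t_send

# Same 100 kB of data, two ways:
many_small = 1000 * t_msg(100)   # 1000 messages of 100 bytes each
one_block = t_msg(100_000)       # one blocked message of 100 kB
print(many_small, one_block)     # the blocked transfer is far cheaper
```

With these assumed parameters the thousand small messages pay the startup cost a thousand times, making them roughly 50x slower than one blocked message.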
Slide 4: Communication Model
- Program with logical processor numbers, so that the mapping onto physical processors can change without rewriting the code.
Slide 5: Communication Model
- Latency hiding: use asynchronous messaging (MPI_ISEND, MPI_IRECV) to overlap communication and computation.
  - Example: domain decomposition in grid problems; compute the boundary values first, then communicate them while computing the interior.
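The payoff of latency hiding can be stated arithmetically: with blocking messages the per-step cost is compute plus communication, while with perfect overlap it is only the larger of the two. A sketch of that accounting (not actual MPI code; times are assumed, and "perfect overlap" is an idealization):

```python
def step_time(t_comp, t_comm, overlap):
    """Per-iteration time for a grid solver step."""
    if overlap:
        # Nonblocking sends/recvs: halo exchange proceeds while the
        # interior of the domain is being computed.
        return max(t_comp, t_comm)
    # Blocking messages: communication adds directly to the step time.
    return t_comp + t_comm

print(step_time(10.0, 4.0, overlap=False))  # 14.0
print(step_time(10.0, 4.0, overlap=True))   # 10.0: communication hidden
```

As long as the interior computation takes longer than the halo exchange, the communication cost disappears entirely from the step time.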
Slide 6: Amdahl's Law
- Consider the execution of a program on p processors, and let the fraction q (0 < q < 1) of all operations be parallelizable. The maximum speedup is
  sp_false = t_1 / t_p = 1 / (q/p + (1 - q))
- This indicates a rapid loss of speedup as p increases if the parallel fraction is not high enough.
- To get 50% efficiency, i.e. speedup 256 on 512 processors, requires q ≈ 0.998.
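The 50% efficiency figure on the slide can be checked directly from the formula. Solving 1/((1-q) + q/512) = 256 for q gives q = 510/511 ≈ 0.998:

```python
def amdahl_speedup(q, p):
    """sp_false = 1 / ((1 - q) + q/p) for parallel fraction q on p processors."""
    return 1.0 / ((1.0 - q) + q / p)

q = 510.0 / 511.0          # the exact value behind "q = 0.998"
print(amdahl_speedup(q, 512))   # approximately 256: 50% efficiency on 512 procs
print(amdahl_speedup(0.95, 512))  # q = 0.95 already caps speedup near 20
```

Even a 95% parallel fraction limits speedup to under 20 regardless of p, which is the "rapid loss" the slide refers to.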
Slide 7: Amdahl's Law (figure: speedup curves)
Slide 8: Why is this speedup "false"?
- It assumes the number of operations is the same for the sequential and parallel programs; usually the algorithms and data structures differ.
- It does not account for the cost of parallelization: communication and synchronization!
- It assumes per-operation performance is the same in sequential and parallel code (e.g. different vector lengths).
Slide 9: Honest Speedup
- sp_hon = (t_1 of the best sequential algorithm) / (t_p of the real parallel algorithm)
- With t_p = t_1/p + h_bas + p * h_p (a complex form, difficult to use in practice),
  where h_bas is overhead independent of p and h_p is the communication overhead that depends on p:
  - as p → ∞, sp_hon → 0.
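The limit sp_hon → 0 is easy to see numerically: the p * h_p overhead term eventually swamps the shrinking compute term. A sketch with assumed values for t_1, h_bas, and h_p:

```python
def sp_hon(t1, p, h_bas, h_p):
    """Honest speedup with parallelization overhead:
    t_p = t1/p + h_bas + p*h_p, so the p*h_p term dominates for large p."""
    return t1 / (t1 / p + h_bas + p * h_p)

# Illustrative parameters: t1 = 1000, h_bas = 1, h_p = 0.01 (same time unit).
for p in (4, 64, 1024, 16384):
    print(p, sp_hon(1000.0, p, 1.0, 0.01))
```

The speedup rises, peaks, and then collapses toward zero as p grows, unlike the monotone Amdahl curve.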
Slide 10: Scalability
- There is an optimal number of processors for each problem.
- Running a fixed problem size on an increasing number of processors is a poor use of a parallel machine.
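The optimal processor count falls out of the overhead model: minimizing t_p = t1/p + h_bas + p*h_p over p balances the shrinking compute term against the growing overhead term (analytically, p* = sqrt(t1/h_p)). A sketch with the same assumed parameters as before:

```python
def t_p(t1, p, h_bas, h_p):
    # Parallel runtime model: ideal compute time plus overheads.
    return t1 / p + h_bas + p * h_p

# Scan processor counts for one fixed problem size (illustrative values).
best_p = min(range(1, 2049), key=lambda p: t_p(1000.0, p, 1.0, 0.01))
print(best_p)  # near sqrt(1000 / 0.01) = sqrt(100000), about 316
```

Beyond this point, adding processors makes the run slower, not faster, which is exactly why fixed-size scaling wastes the machine.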
Slide 11: Scalability
- Increasing the problem size along with the number of processors leads to better use of the parallel machine.
Slide 12: Scalability
- Now let the problem size m → ∞ as p → ∞.
Slide 13: Scalability
- Thus scalability, not speedup, is the desired measure of a parallel algorithm/code!
- Scalability is achieved if the quantity h_p * p / m stays constant or increases only very slowly as p increases.
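The scalability criterion can be checked by comparing fixed-size and scaled-size runs of the overhead ratio. A sketch, with h_p and the scaling constant chosen purely for illustration:

```python
def overhead_ratio(h_p, p, m):
    # Scalability measure from the slide: parallel overhead h_p * p
    # relative to problem size m.
    return h_p * p / m

# Fixed problem size m = 1000: the ratio grows linearly with p (not scalable).
print([overhead_ratio(0.01, p, 1000) for p in (64, 256, 1024)])
# Problem size scaled with p (m = 10*p): the ratio stays constant (scalable).
print([overhead_ratio(0.01, p, 10 * p) for p in (64, 256, 1024)])
```

Growing m in proportion to p is precisely the regime of the previous slides, where the machine is used well.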