Formal Verification Methods Applied to Compute Cluster Software

Ganesh Gopalakrishnan, Mike Kirby, Robert Palmer, Yu Yang, Salman Pervez, Geof Sawaya, Subodh Sharma, Igor Melatti, Sonjong Hwang, Michael DeLisi
School of Computing and SCI Institute, University of Utah, Salt Lake City, UT 84112

Introduction

In the process of writing or optimizing High Performance Computing (HPC) software, mostly written using MPI these days, designers can inadvertently introduce errors due to (i) the richness of MPI, (ii) the associated complexity of MPI semantics, and (iii) the well-known fact that concurrent programming is tricky. Traditional debugging methods, such as software testing and running incomplete execution analyzers, are insufficiently incisive. The Utah Gauss group is researching several pragmatic adaptations of formal methods, and is developing methodological and tool support that enhances designer understanding of MPI and of parallel and distributed program semantics in general, helps designers develop more efficient MPI programs by reducing communication overheads, and helps detect bugs such as deadlocks and other invariant violations early. Preliminary results include: a formal semantics of (a growing number of) MPI primitives; a program analysis framework, based on the Microsoft Phoenix compiler, that can analyze and help debug MPI programs; case studies involving tricky uses of MPI one-sided communication; and an in-situ model checker that can run MPI programs directly, as if inside a model checker.

Materials and methods

We begin with a formal and executable specification of MPI communication semantics expressed in TLA+ (Lamport, MSR). Using Phoenix, we extract a control skeleton of the MPI program, with its constituent MPI calls represented using the descriptions in our formal semantics. We are developing abstraction methods that reduce the fraction of the state space that must be visited to verify properties of these models. Using TLC (a model checker for TLA+ from MSR), we apply finite-state model checking that can detect deadlocks and assertion violations. Many of these assertions are inserted automatically, e.g., is every MPI_Isend followed by an MPI_Wait or MPI_Test? (An example of this bug class appears after this section.) An in-situ model checker (under construction) interrupts control flow before any MPI call is attempted and transfers control to a scheduler; in effect, each MPI process must seek the scheduler's permission before going ahead with the call. The scheduler first permits one interleaved execution to manifest; thereafter, it runs enough "execution interleaving order variants" to exhaust the execution space up to a certain depth bound. Abstraction procedures (under development) are expected to generate simpler programs to verify that contain the same classes of errors. Ways to parallelize in-situ model checking are also under investigation.
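As a concrete (and purely hypothetical) instance of that bug class, the following minimal program posts a nonblocking send whose request never reaches MPI_Wait or MPI_Test; the ranks, tag, and payload are arbitrary.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, buf = 42;
        MPI_Request req;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            /* Nonblocking send: MPI releases the buffer back to the
               user only when the request completes. */
            MPI_Isend(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
            buf = 0;  /* BUG: buffer reused while the send may still be
                         in flight, and req is never completed with
                         MPI_Wait or MPI_Test. */
        } else if (rank == 1) {
            int val;
            MPI_Recv(&val, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("received %d\n", val);
        }

        MPI_Finalize();
        return 0;
    }

An automatically inserted assertion that every MPI_Isend is eventually matched by an MPI_Wait or MPI_Test on its request flags exactly this omission.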
Acknowledgments

Supported in part by Microsoft Corporation under the Microsoft HPC Institutes program, and by NSF CNS 0509379. We also acknowledge our collaboration with Argonne National Laboratory (R. Thakur and W. Gropp).

Conclusions

As the stakes grow higher in HPC (expensive clusters, the shortness of active professional lives), more incisive specification and verification techniques are essential. HPC is a rapidly growing area, tied directly to growth in compute capabilities (e.g., threads and multicores) and to simulation aspirations heading toward Petaflop computing. Informal specification techniques and ad hoc validation techniques cannot serve as the foundation for building debugging tools in such a critical area; on the other hand, non-scalable formal verification methods are equally ineffectual. The Utah Gauss group has expertise in both formal methods and HPC, collaborates with experts (e.g., at Argonne National Laboratory), and is examining how best to make an impact using formal methods in HPC. Our experience has been that in such a fast-moving area, agile de facto standards such as MPI provide ample opportunities to pursue the formal route. We recently verified, by model checking, a tricky byte-range locking algorithm that uses MPI one-sided communication (one of three EuroPVM/MPI '06 distinguished papers). Our formalization of MPI has already revealed several imprecise specifications in the standard. Our long-term goal is the demonstration, and ultimately the incorporation, of these ideas into the Microsoft Compute Cluster software and into open-source MPI releases. Formal methods for handling multithreading in MPI, as well as formal test generation for MPI libraries, are also planned.

Literature cited

Robert Palmer, Ganesh Gopalakrishnan, and Robert M. Kirby, "The Communication Semantics of the Message Passing Interface," Technical Report UUCS-06-012, School of Computing, University of Utah, 2006.

Robert Palmer, Steve Barrus, Yu Yang, Ganesh Gopalakrishnan, and Robert M. Kirby, "Gauss: A framework for verifying scientific computing software," Workshop on Software Model Checking, 2005. Electronic Notes in Theoretical Computer Science (ENTCS), No. 953.

Salman Pervez, Ganesh Gopalakrishnan, Robert M. Kirby, Rajeev Thakur, and William Gropp, "Formal verification of programs that use MPI one-sided communication," Recent Advances in Parallel Virtual Machine and Message Passing Interface (EuroPVM/MPI), LNCS 4192, pages 30-39, 2006. (Outstanding Paper.)
For further information

Please check our project website, http://www.cs.utah.edu/formal_verification, or email {ganesh, kirby} @ cs.utah.edu. School of Computing and SCI Institute, University of Utah.

Tool flow

[Figure: the Gauss tool flow. A model generator (MC client) extracts a program model from the MPI program, while a compiler produces the MPI binary. The model checker, supplied with the MPI library model, an environment model, and an abstractor, analyzes the program model: it either reports OK or emits an error, which the error simulator and result analyzer turn into a refinement.]

An example MPI program (input to the model generator):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char** argv){
      int myid;
      int numprocs;
      MPI_Init(&argc, &argv);
      MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
      MPI_Comm_rank(MPI_COMM_WORLD, &myid);
      if(myid == 0){
        int i;
        for(i = 1; i < numprocs; ++i){
          MPI_Send(&i, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
        }
        printf("%d Value: %d\n", myid, myid);
      } else {
        int val;
        MPI_Status s;
        MPI_Recv(&val, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &s);
        printf("%d Value: %d\n", myid, val);
      }
      MPI_Finalize();
      return 0;
    }

A program model (Promela):

    int y;
    active proctype T1(){
      int x;
      x = 1;
      if
      :: x = 0;
      :: x = 2;
      fi;
      y = x;
    }
    active proctype T2(){
      int x;
      x = 2;
      if
      :: y = x + 1;
      :: y = 0;
      fi;
      assert( y == 0 );
    }

The MPI library model (Promela, excerpt):

    proctype MPI_Send(chan out, int c){ out!c; }
    proctype MPI_Bsend(chan out, int c){ out!c; }
    proctype MPI_Isend(chan out, int c){ out!c; }
    typedef MPI_Status{
      int MPI_SOURCE;
      int MPI_TAG;
      int MPI_ERROR;
    }
    ...

What is Model Checking?

The Navier-Stokes equations are a mathematical model of fluid-flow physics. "V&V" (Validation and Verification): "Validate models, verify codes." Formal models, which translate and abstract algorithms and implementations, can be generated either automatically or by a modeler.

Case Study 1: Byte-Range Locking using MPI One-Sided Communication (EuroPVM/MPI 2006)

Lock acquire:

    lock_acquire (start, end) {
      /* Stage 1 */
      val[0] = 1; /* flag */
      val[1] = start;
      val[2] = end;
      while(1) {
        lock_win
        place val in win
        get values of other processes from win
        unlock_win
        for all i, if (P_i conflicts with my range)
          conflict = 1;
        /* Stage 2 */
        if(conflict) {
          val[0] = 0;
          lock_win
          place val in win
          unlock_win
          MPI_Recv(ANY_SOURCE)
        }
        else {
          /* lock is acquired */
          break;
        }
      } /* end while */
    }

Lock release:

    lock_release (start, end) {
      val[0] = 0; /* flag */
      val[1] = -1;
      val[2] = -1;
      lock_win
      place val in win
      get values of other processes from win
      unlock_win
      for all i, if (P_i conflicts with my range)
        MPI_Send(P_i);
    }

[Figure: the window holds one (flag, start, end) triple of ints per process, initially (0, -1, -1) for each of P0 and P1.]

Abstraction / refinement / model checking uncovered the following deadlock:

Process 0: lock_acquire(3,5); lock_release(); lock_acquire(3,5) again, deduces a conflict (Stage 2), and blocks on receive.
Process 1: lock_acquire(3,5), deduces a conflict (Stage 2), and blocks on receive.
DEADLOCK: each process is blocked in MPI_Recv waiting for a release message that the other, equally blocked, will never send.
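To relate the pseudocode above to the actual one-sided API, here is a minimal C sketch of Stage 1 of lock_acquire. The window layout (one (flag, start, end) triple of ints per process, hosted on rank 0), the helper name stage1, and the byte range 3..5 are illustrative assumptions, not the published code.

    #include <mpi.h>
    #include <stdlib.h>

    #define FIELDS 3   /* flag, start, end */
    #define HOME   0   /* rank hosting the window (our assumption) */

    /* Stage 1: publish our (flag, start, end) triple and read the other
       processes' triples, all inside one exclusive lock epoch. */
    static void stage1(MPI_Win win, int me, int nprocs,
                       int start, int end, int *others)
    {
        int val[FIELDS] = { 1, start, end };  /* flag = 1: we want the lock */

        MPI_Win_lock(MPI_LOCK_EXCLUSIVE, HOME, 0, win);
        MPI_Put(val, FIELDS, MPI_INT, HOME,
                (MPI_Aint)me * FIELDS, FIELDS, MPI_INT, win);
        /* Get only the slots before and after ours: the one-sided rules
           forbid conflicting Put/Get on the same locations in one epoch. */
        if (me > 0)
            MPI_Get(others, me * FIELDS, MPI_INT, HOME,
                    0, me * FIELDS, MPI_INT, win);
        if (me < nprocs - 1)
            MPI_Get(others + (me + 1) * FIELDS, (nprocs - 1 - me) * FIELDS,
                    MPI_INT, HOME, (MPI_Aint)(me + 1) * FIELDS,
                    (nprocs - 1 - me) * FIELDS, MPI_INT, win);
        MPI_Win_unlock(HOME, win);
    }

    int main(int argc, char **argv)
    {
        int me, nprocs, *winbuf = NULL, *others;
        MPI_Win win;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &me);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* Rank HOME exposes one triple per process, initialized to
           (0, -1, -1) as in the window figure above. */
        if (me == HOME) {
            winbuf = malloc(nprocs * FIELDS * sizeof(int));
            for (int i = 0; i < nprocs * FIELDS; i += FIELDS) {
                winbuf[i] = 0; winbuf[i + 1] = -1; winbuf[i + 2] = -1;
            }
        }
        MPI_Win_create(winbuf,
                       me == HOME ? (MPI_Aint)(nprocs * FIELDS * sizeof(int)) : 0,
                       sizeof(int), MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        others = malloc(nprocs * FIELDS * sizeof(int));
        stage1(win, me, nprocs, 3, 5, others);  /* try to lock bytes 3..5 */

        MPI_Win_free(&win);
        free(others);
        free(winbuf);
        MPI_Finalize();
        return 0;
    }

Keeping the Put and the Gets disjoint within the epoch is what lets a single lock epoch both publish our intent and take a consistent snapshot of everyone else's values.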
Case Study 2: Parallel / Distributed Model Checking in Eddy (SPIN 2006)

Eddy performs model checking on distributed-memory clusters, using MPI at the MPI_THREAD_FUNNELED level with two threads per node. The worker thread takes a state off the consumption queue, expands it to obtain a new set of successor states, and makes a decision about that set (hashing states to decide where each goes). The communication thread receives and processes inbound messages, initiates MPI_Isends from the communication queue, and checks the Isends for completion.

[Figure: evolution of line state; results from parallel / distributed model checking.]

In-situ model checking: "execution checking" as "model checking"

The MPI user program is instrumented at its MPI functions (e.g., MPI_Send, MPI_Recv, MPI_Win_lock). Before each call, a process asks the scheduler for permission ("Permesso?"); the scheduler, which receives the processes' requests and permits one interleaving at a time, grants it ("Avanti!"). A sketch of such instrumentation follows.
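The instrumentation itself is not shown on the poster. Below is a minimal sketch of how one MPI call might be intercepted using the standard PMPI profiling interface; request_permission is a hypothetical placeholder for the real request/permit exchange with the scheduler, and the MPI-3 (const-qualified) signature of MPI_Send is assumed.

    #include <mpi.h>
    #include <stdio.h>

    /* Hypothetical stand-in for the "Permesso?" / "Avanti!" exchange:
       the real scheduler is a separate process; this placeholder only
       logs the request and returns at once. */
    static void request_permission(const char *call)
    {
        int rank;
        PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
        fprintf(stderr, "[rank %d] Permesso? (%s)\n", rank, call);
        /* ... block here until the scheduler answers "Avanti!" ... */
    }

    /* Linking this definition ahead of the MPI library makes every
       MPI_Send in the user program ask permission first, then forward
       to the library's real implementation through PMPI_Send. */
    int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
                 int dest, int tag, MPI_Comm comm)
    {
        request_permission("MPI_Send");
        return PMPI_Send(buf, count, datatype, dest, tag, comm);
    }

The same wrapper pattern extends to MPI_Recv, MPI_Win_lock, and every other call the scheduler needs to observe.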

