Motivation (in words)
Large datasets
– geographic information systems
– bioinformatics
– chemistry
– physics
– environmental modeling
A single Dell office computer can’t handle the load, but… what if we use more than one?!
Motivation, cont’d
We have a plethora of computers that sit idle the large majority of the time.
Let’s take advantage of the hardware investment that has already been made and use it to provide the computing power for research on problems traditionally considered computationally intractable.
What’s the catch?
Sounds almost too good to be true
It is and it isn’t
– Easy: providing an environment to connect computers together … BOINC!
– Challenge: creating parallel algorithms to run on the computing environment
– Challenge: making it easy for programmers and scientists to submit work
Concepts
Technical term: non-dedicated cluster
– set of computers whose idle time is harnessed to process jobs
– individual nodes in the cluster function as standalone computers
– in layman’s terms: “let’s hook a bunch of lab computers together”
Software environment: BOINC
– powers many worldwide projects (e.g., SETI@home, World Community Grid, climateprediction.net)
– step-by-step instructions (minus the details) of how to build a campus virtual supercomputer
Can we build a supercomputer?
Yes, with campus-wide buy-in
Let’s start on a smaller scale…
– Here we have five computers
– With 5 computers: Florida export in ~1 day
– Adding computers is easy (<5 min setup)
– With 40 computers: Florida export in ~2 hours
Usage scenarios
Simplest scenario
– Non-parallel application with long runtime
– You don’t want your office or lab computer tied up running the computation
– Solution: submit your non-parallel app to the cluster using an easy-to-use web interface!
Usage scenarios
Build a new parallel application, or convert an existing one
Four components
– Work generator
– The client program
– Result validator
– Result assimilator
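To make the four components concrete, here is a minimal single-process sketch in Java. It is not BOINC code and uses no BOINC API: in a real deployment the work generator, validator, and assimilator run on the server and the client program runs on the volunteer nodes. All class and method names (FourComponentSketch, WorkUnit, generateWork, runClient, validate, assimilate, sumUpTo) are hypothetical, and the toy task (summing a range of integers in chunks) is chosen only for illustration.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class FourComponentSketch {

    /** One unit of work: an inclusive range of integers to sum (hypothetical type). */
    static final class WorkUnit {
        final long start;
        final long end;

        WorkUnit(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    // 1. Work generator: split [1, n] into chunks of at most chunkSize numbers.
    static List<WorkUnit> generateWork(long n, long chunkSize) {
        List<WorkUnit> units = new ArrayList<>();
        for (long start = 1; start <= n; start += chunkSize) {
            units.add(new WorkUnit(start, Math.min(n, start + chunkSize - 1)));
        }
        return units;
    }

    // 2. Client program: what a node would run to compute one partial result.
    static BigInteger runClient(WorkUnit unit) {
        long sum = 0;
        for (long i = unit.start; i <= unit.end; i++) {
            sum = Math.addExact(sum, i); // throws instead of silently overflowing
        }
        return BigInteger.valueOf(sum);
    }

    // 3. Result validator: accept a result only if two independent runs agree.
    static boolean validate(BigInteger first, BigInteger second) {
        return first.equals(second);
    }

    // 4. Result assimilator: fold a validated partial result into the total.
    static BigInteger assimilate(BigInteger total, BigInteger partial) {
        return total.add(partial);
    }

    // Drives the four steps in order; here everything runs in one process.
    static BigInteger sumUpTo(long n, long chunkSize) {
        BigInteger total = BigInteger.ZERO;
        for (WorkUnit unit : generateWork(n, chunkSize)) {
            BigInteger a = runClient(unit);
            BigInteger b = runClient(unit); // stands in for a second node
            if (!validate(a, b)) {
                throw new IllegalStateException("results disagree for " + unit.start);
            }
            total = assimilate(total, a);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumUpTo(1_000_000L, 100_000L)); // prints 500000500000
    }
}
```

The sketch assumes small, positive ranges. Running each work unit twice and comparing the results is one simple validation policy; a real project would choose its own.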
Usage scenarios
Classroom tool
– networking
– databases
– algorithms and data structures
– parallel computing
– hardware
Languages supported
– C/C++ have the best support
– Java
– Any arbitrary executable (with a catch!)
Demo
Simple example (embarrassingly parallel)
– Calculate the sum of all the numbers between 1 and 100,000,000,000
– No modifications are necessary to the original Java program, as long as it already reads its starting and ending numbers from the command line
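The demo program itself is not shown, so the following is only a guess at what such a program might look like: a minimal Java sketch that reads its starting and ending numbers from the command line and prints their sum. The class name RangeSum and the method sumRange are hypothetical. It uses BigInteger because the full answer, 5,000,000,000,050,000,000,000, does not fit in a Java long.

```java
import java.math.BigInteger;

public class RangeSum {

    // Sum the integers from start to end, inclusive.
    // Assumes 0 <= start <= end < Long.MAX_VALUE.
    static BigInteger sumRange(long start, long end) {
        BigInteger total = BigInteger.ZERO;
        long batch = 0; // fast accumulator, flushed before it can overflow
        for (long i = start; i <= end; i++) {
            if (batch > Long.MAX_VALUE - i) {
                total = total.add(BigInteger.valueOf(batch));
                batch = 0;
            }
            batch += i;
        }
        return total.add(BigInteger.valueOf(batch));
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java RangeSum <start> <end>");
            System.exit(1);
        }
        long start = Long.parseLong(args[0]);
        long end = Long.parseLong(args[1]);
        System.out.println(sumRange(start, end));
    }
}
```

Because the range comes from the command line, the same unmodified program can be run on many nodes at once, each with its own slice, for example `java RangeSum 1 1000000000` on one node and `java RangeSum 1000000001 2000000000` on the next. Adding up the printed partial sums gives the final answer.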
Getting Involved
– Become a beta tester for usage scenario 1 (i.e., the web application for uploading an app to run on the cluster)
– Suggest a project for collaboration, and I will assist in the conversion process
Thank you!
More information about the project can be found on my faculty web page: http://faculty.samford.edu/~brtoone