Java as a cycle-stealing computational environment
Acknowledgments
- Ames: Po Chung, Creon Levit, Subash Saini
- UCSC: John Lawton, Rich McClellan, Todd Wipke
Al Globus, MRJ Technology Solutions, Inc. at NASA Ames Research Center
What got done
- Java implementation of a genetic molecular design application
  - No input files (Parameters.java)
  - No graphics or GUI
  - Computationally intensive: O(n^3)
  - Many output files
- Used a cycle-scavenging batch system (Condor) on NAS workstations
- Approximately 150 runs so far
Genetic molecular design
- Randomly generate a set of molecules
- Many times:
  - Select parent molecules at random, with bias towards better performance
  - Randomly rip copies of each parent in two
  - Mate opposite halves
  - Replace random molecules, with bias towards worse performance
- Repeat until satisfied
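The loop above can be sketched in a few lines of Java. This is a toy illustration, not the application's code: the "molecules" are plain strings rather than graphs, the fitness function simply counts matches with a target string, and the bias towards better or worse individuals is done with a two-way tournament. The names `GeneticSketch`, `fitness`, `pick`, `crossover`, and `evolve` are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Toy stand-in for the genetic loop: strings play the role of molecules. */
class GeneticSketch {
    static final String ALPHABET = "CHNO";

    /** Hypothetical fitness: matching positions, minus a penalty for wrong length. */
    static int fitness(String candidate, String target) {
        int shared = Math.min(candidate.length(), target.length());
        int matches = 0;
        for (int i = 0; i < shared; i++) {
            if (candidate.charAt(i) == target.charAt(i)) matches++;
        }
        return matches - Math.abs(candidate.length() - target.length());
    }

    static String randomIndividual(Random rng, int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET.charAt(rng.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    /** Draw two members at random; return the index of the better or the worse one. */
    static int pick(List<String> pop, String target, Random rng, boolean preferBetter) {
        int i = rng.nextInt(pop.size());
        int j = rng.nextInt(pop.size());
        boolean iBetter = fitness(pop.get(i), target) >= fitness(pop.get(j), target);
        return (iBetter == preferBetter) ? i : j;
    }

    /** Rip a copy of each parent in two at a random point and mate opposite halves. */
    static String[] crossover(String a, String b, Random rng) {
        int cutA = rng.nextInt(a.length() + 1);
        int cutB = rng.nextInt(b.length() + 1);
        return new String[] {
            a.substring(0, cutA) + b.substring(cutB),
            b.substring(0, cutB) + a.substring(cutA)
        };
    }

    static String evolve(String target, int populationSize, int iterations, long seed) {
        Random rng = new Random(seed);
        List<String> pop = new ArrayList<>();
        for (int i = 0; i < populationSize; i++) {
            pop.add(randomIndividual(rng, target.length()));
        }
        for (int n = 0; n < iterations; n++) {
            // Parents are chosen with bias towards better performance.
            String mother = pop.get(pick(pop, target, rng, true));
            String father = pop.get(pick(pop, target, rng, true));
            // Children replace members chosen with bias towards worse performance.
            for (String child : crossover(mother, father, rng)) {
                pop.set(pick(pop, target, rng, false), child);
            }
        }
        String best = pop.get(0);
        for (String s : pop) {
            if (fitness(s, target) > fitness(best, target)) best = s;
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(evolve("CHNOCH", 50, 2000, 42L));
    }
}
```

As on the slide, the sketch uses crossover only; nothing guarantees that the target is found.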
Algorithm properties
- Stochastic, embarrassingly parallel
- Robust to failure
- No guaranteed outcome
- Fitness function is crucial and non-trivial
- Performs well as a cycle-scavenger using Condor (University of Wisconsin)
Crossover
Time to find small molecules
Finding larger molecules
Resources
- NAS workstations are idle nights and weekends
- Provides an ideal resource if owner access is not degraded
- University of Wisconsin experience suggests each workstation is idle an average of 17 hours/day
- Thus NAS should have approximately 6,800 workstation hours/day, or 2,448,000 workstation hours/year, available
- This extra processing costs $0 for hardware
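The slide does not state the workstation count or the number of days per year behind these totals. Working backwards from the quoted figures, they appear to correspond to about 400 workstations and a 360-day year; both values are inferred here, not given in the source:

```latex
\begin{align*}
\frac{6{,}800\ \text{workstation hours/day}}{17\ \text{idle hours/day per workstation}} &= 400\ \text{workstations} \\[4pt]
\frac{2{,}448{,}000\ \text{workstation hours/year}}{6{,}800\ \text{workstation hours/day}} &= 360\ \text{days/year}
\end{align*}
```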
Condor
- Cycle-scavenging batch system
- Developed by University of Wisconsin
- In production since 1986
- Unix workstations (NT port in progress)
- Free from
Some Classes
- Graph, vertex, edge
- Molecule, atom, bond
- Breeder, Population
- FitnessFunction
- Parameters, Reporter
- Sample, DataTable
- Predicate, Procedure, ExtendedVector
- IntegerInterval, DoubleInterval
Java advantages
- Cleaner code than C++
- Dynamic loading eliminated input file parsing
  - CLASSPATH: Parameters.class directory, classes.jar
- Standard library
- Garbage collection is priceless
  - Massive data structure manipulation
  - Cyclic data structures make reference counting ineffective
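One way dynamic loading can stand in for an input file is to compile the run parameters as a class and look it up by name at startup. The sketch below is only an assumption about how that might look: `SketchParameters`, its fields and values, and `ParameterLoader` are hypothetical names, and the real Parameters.java is not shown in the slides. Only `Class.forName` and the standard reflection calls are real library API.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical parameter class; each run would supply its own compiled version. */
class SketchParameters {
    public static int populationSize = 100;
    public static double crossoverRate = 0.9;
    public static String runName = "demo";
}

class ParameterLoader {
    /** Load a class by name and collect its public static fields, sorted by name. */
    static Map<String, Object> load(String className) throws ReflectiveOperationException {
        // Whichever matching class comes first on the CLASSPATH is the one loaded.
        Class<?> cls = Class.forName(className);
        Map<String, Object> values = new TreeMap<>();
        for (Field field : cls.getFields()) {
            if (Modifier.isStatic(field.getModifiers())) {
                values.put(field.getName(), field.get(null));
            }
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String name = args.length > 0 ? args[0] : SketchParameters.class.getName();
        System.out.println(load(name));
    }
}
```

Because the class is resolved through the CLASSPATH, putting a different directory containing the parameter class ahead of the application jar would change the run's settings without any parsing code.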
Java advantages continued
- Serialization eases checkpointing
- Virtual machine eases cross-platform development
  - Development on WinTel
  - Execution on SGI workstations
- Standard library (especially Vectors)
- Automatic HTML documentation integrated with code
- Reflection enables automation of Parameters.java toString()
- Automatic bounds checking
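To illustrate why serialization eases checkpointing, here is a minimal sketch that writes a run's state to disk and reads it back. `RunState` and `SerializationCheckpoint` are hypothetical names and the fields are placeholders; the application's actual checkpoint contents are not described in the slides.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;

/** Hypothetical snapshot of a run; everything reachable from it must be Serializable. */
class RunState implements Serializable {
    private static final long serialVersionUID = 1L;
    int generation;
    ArrayList<String> population = new ArrayList<>();
}

class SerializationCheckpoint {
    /** Write to a temporary file first so a job killed mid-write keeps the old checkpoint. */
    static void save(RunState state, File file) throws IOException {
        File tmp = new File(file.getPath() + ".tmp");
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(tmp))) {
            out.writeObject(state);
        }
        Files.move(tmp.toPath(), file.toPath(), StandardCopyOption.REPLACE_EXISTING);
    }

    static RunState load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return (RunState) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("checkpoint", ".ser");
        file.deleteOnExit();
        RunState state = new RunState();
        state.generation = 12;
        state.population.add("CHNO");
        save(state, file);
        RunState restored = load(file);
        System.out.println(restored.generation + " " + restored.population);
    }
}
```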
Java disadvantages
- Java with JIT is 50% slower than C on simple numerical code
- Symantec Visual Café: unacceptably buggy debugger, and it crashed the system very hard
- Supersede is OK, with a few bugs
- Lack of multiple inheritance is sometimes irritating
- Condor cannot perform automatic checkpointing of Java jobs
  - Condor requires a relink to automate checkpointing
Checkpointing
- Condor jobs may be stopped at any arbitrary time
- Virtual machine checkpointing would allow automatic heterogeneous mobility
  - Stack format not defined
  - Heap format undefined, and universal serialization potentially problematic
  - JIT and optimization cause problems
  - Must hack the Java Virtual Machine
- Java threads cannot be truly interrupted
  - suspend(), resume(), stop() deprecated
Checkpointer.java
- class application implements Checkpointable
  - start(String[] arguments);
  - restart();
- Application calls
  - Checkpointer.ok()
  - Checkpointer.checkpoint()
- Condor calls
  - Checkpointer.prepareToDie()
  - Checkpointer.areYouReadyToDie()
  - Checkpointer.cancelDeath()
  - Checkpointer.checkpointWhenPossible()
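The slide lists only method names, so the following is a guess at one cooperative protocol those names could support, not the actual Checkpointer.java. The method names come from the slide; the return types, the flag-based semantics, the `setSaver` hook, and the `CountingJob` example are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

interface Checkpointable {
    void start(String[] arguments);
    void restart();
}

/** Assumed semantics: the scheduler raises flags, the application polls them at safe points. */
class Checkpointer {
    private static final AtomicBoolean deathRequested = new AtomicBoolean(false);
    private static final AtomicBoolean checkpointRequested = new AtomicBoolean(false);
    private static final AtomicBoolean readyToDie = new AtomicBoolean(false);
    private static volatile Runnable saver = () -> { };

    /** Hypothetical hook (not on the slide): how the application saves its state. */
    static void setSaver(Runnable r) { saver = r; }

    // Application side
    static boolean ok() { return !deathRequested.get(); }

    /** Called at a safe point; saves only if the scheduler has asked for it. */
    static void checkpoint() {
        boolean dying = deathRequested.get();
        if (dying || checkpointRequested.getAndSet(false)) {
            saver.run();
            if (dying && deathRequested.get()) readyToDie.set(true);
        }
    }

    // Scheduler side
    static void prepareToDie() { readyToDie.set(false); deathRequested.set(true); }
    static boolean areYouReadyToDie() { return readyToDie.get(); }
    static void cancelDeath() { deathRequested.set(false); readyToDie.set(false); }
    static void checkpointWhenPossible() { checkpointRequested.set(true); }
}

/** Toy job: counts to a limit; a real job would save its state to disk instead. */
class CountingJob implements Checkpointable {
    int step;
    int savedStep;

    public void start(String[] arguments) { step = 0; run(); }

    public void restart() { step = savedStep; run(); }

    private void run() {
        Checkpointer.setSaver(() -> savedStep = step);
        while (Checkpointer.ok() && step < 1000) {
            step++;
            Checkpointer.checkpoint();
        }
        Checkpointer.checkpoint(); // final safe point, acknowledges a pending request
    }

    public static void main(String[] args) {
        CountingJob job = new CountingJob();
        Checkpointer.checkpointWhenPossible();
        job.start(args);
        System.out.println("finished at step " + job.step + ", saved at step " + job.savedStep);
    }
}
```

Polling flags at safe points sidesteps the deprecated suspend(), resume(), and stop() calls, at the cost of requiring the application to call checkpoint() often enough.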
Summary
- It was fun
- It was productive
  - 75 classes, 6389 lines of code
  - one unique algorithm
  - two University collaborations
  - two conference presentations
  - one conference poster
  - one journal submission
  - CPU hours per day added to NAS batch capability (so far)
- Java is my favorite programming language