Oak Ridge National Laboratory, U.S. Department of Energy. Cluster Computing Applications Project: Parallelizing BLAST. Research Alliance of Minorities.


1 Oak Ridge National Laboratory, U.S. Department of Energy. Cluster Computing Applications Project: Parallelizing BLAST. Research Alliance of Minorities (RAM), Computer Science and Mathematics Division. William Burke, York College, City University of New York. John Mugler and Stephen Scott, Oak Ridge National Laboratory.

2 Parallelizing the BLAST Algorithm: Feasible or Not? Bioinformatics research needs faster text string matching algorithms. The purpose of this project is to analyze the BLAST algorithm: define the structure of BLAST, state why it is a valuable Bioinformatics tool, and explore parallelizations of BLAST. BLAST matches query string fragments against a target database, which eliminates the need to run a full text string comparison and speeds up database search time. Several methods of parallelizing BLAST have been explored.

3 Introduction. Cluster infrastructure: Open Source Cluster Application Resources (OSCAR); Cluster, Command and Control (C3); eXtreme TORC (XTORC). Cluster applications: Bioinformatics toolsets; Basic Local Alignment Search Tool (BLAST).

4 Infrastructure Overview
Red Hat Linux 7.2
OSCAR 1.3
C3 - http://www.csm.ornl.gov/torc/C3/
LAM/MPI - http://www.lam-mpi.org/
Maui Scheduler - http://supercluster.org/maui/
MPICH - http://www-unix.mcs.anl.gov/mpi/mpich/
OpenSSH - http://www.openssh.com/
OpenSSL - http://www.openssl.org/
PBS - http://www.openpbs.org/
PVM - http://www.csm.ornl.gov/pvm/
SIS - http://www.sisuite.org/

5 Red Hat Linux 7.2. Installation. Configuration. Administration. Network configuration. Performance monitoring. Creating scripts.

6 OSCAR 1.3 and C3 Tools. OSCAR configures the head node. OSCAR builds and configures compute nodes. C3 reduces the time and effort needed to operate and manage a cluster.

7 eXtreme TORC. eXtreme TORC powered by OSCAR. 65 Pentium IV machines. Peak performance: 129.7 GFLOPS. RAM: 50.152 GB. Disk capacity: 2.68 TB. Dual interconnects: Gigabit and Fast Ethernet.

8 The field of Bioinformatics needs faster string matching algorithms.

9 Applications Overview. BLAST: a Bioinformatics tool. http://www.ncbi.nlm.nih.gov/BLAST/blast_overview.html Parallelize BLAST's algorithm.

10 BLAST: a Bioinformatics Tool. What is BLAST? A heuristic algorithm for matching query strings against a database. How does the BLAST algorithm work? String fragmentation. Statistical means for comparison. How can you parallelize BLAST on a computational cluster?
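
The "string fragmentation" step can be illustrated with a minimal sketch. It assumes fixed-length, overlapping words of W = 3 letters; the function name `query_words` is hypothetical and is not taken from any BLAST implementation. The sample query is the protein string shown on the deck's search-algorithm slide.

```python
def query_words(query, w=3):
    """Return every overlapping w-letter word in the query with its start offset."""
    return [(i, query[i:i + w]) for i in range(len(query) - w + 1)]


# Sample query copied from the search-algorithm slide of this deck.
query = "GSVEDTTGSQSLAALLNKCKTPQGQRLVNQWIKWPLMDKNRIEERLNLVEAFVEDA"
words = query_words(query)

# A query of length n yields n - w + 1 words.
print(len(words), words[:3])
```

Each word, rather than the whole query, is then looked up in the database, which is what lets the search avoid a full string comparison.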

11 The BLAST Search Algorithm (figure)
Query word (W = 3)
QUERY: GSVEDTTGSQSLAALLNKCKTPQGQRLVNQWIKWPLMDKNRIEERLNLVEAFVEDA
Neighborhood words: PQG 18, PEG 15, PRG 14, PKG 14, PMG 13, PSG 13
Neighborhood score threshold (T = 13)
Below threshold: PQN 12, etc.
QUERY STRING     SLAALLNKCKTPQGQWLVNQWIKWPLMDKNRIEERLN 365
                 ----L--++K-P-G--+-----+-------------N
DATABASE STRING  GSWNLAALDKDPMGDKNRIEERLNLVEAIKWPLMDJN 330
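
The neighborhood-word step on this slide can be reproduced with a short sketch. The slide does not name a scoring matrix; the scores it lists (PQG 18, PEG 15, PQN 12) are consistent with BLOSUM62, so the sketch assumes BLOSUM62 and embeds only the three rows needed for the query word PQG. The names `ROWS`, `score`, and `neighborhood` are hypothetical.

```python
from itertools import product

# Column order for the matrix rows below.
AMINO = "ARNDCQEGHILKMFPSTWYV"

# Assumption: BLOSUM62 rows for P, Q and G only (enough to score the word PQG).
ROWS = {
    "P": [-1, -2, -2, -1, -3, -1, -1, -2, -2, -3, -3, -1, -2, -4, 7, -1, -1, -4, -3, -2],
    "Q": [-1, 1, 0, 0, -3, 5, 2, -2, 0, -3, -2, 1, 0, -3, -1, 0, -1, -2, -1, -2],
    "G": [0, -2, 0, -1, -3, -2, -2, 6, -2, -4, -4, -2, -3, -3, -2, 0, -2, -2, -3, -3],
}


def score(query_word, candidate):
    """Sum the per-position substitution scores of candidate against query_word."""
    return sum(ROWS[q][AMINO.index(c)] for q, c in zip(query_word, candidate))


def neighborhood(query_word, threshold):
    """Return (word, score) pairs scoring at least threshold, best first."""
    hits = []
    for letters in product(AMINO, repeat=len(query_word)):
        candidate = "".join(letters)
        s = score(query_word, candidate)
        if s >= threshold:
            hits.append((candidate, s))
    return sorted(hits, key=lambda ws: (-ws[1], ws[0]))


for word, s in neighborhood("PQG", 13):
    print(word, s)
```

Under this assumption, PQN scores 12 and falls just below T = 13, matching the slide. Raising T shrinks the neighborhood and speeds up the search at the cost of sensitivity.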

12 Parallelization of BLAST. NBLAST. SLRI Bioinformatics Toolkit. ParAlign. MOBLAST. www.usenix.org/publications/library/proceedings/als2000/michalickova.html DNA sequence matching processor. PARALIGN™.
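
The slides do not describe how the tools above divide the work. One general strategy, sketched here purely as an assumption, is database segmentation: split the target database into parts, search each part independently, and merge the hits. The toy search below matches exact 3-letter words only, and Python threads stand in for cluster nodes; all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def find_word_hits(query, sequences, w=3):
    """Return (sequence_id, query_pos, db_pos) for every exact w-letter match."""
    positions = {}
    for i in range(len(query) - w + 1):
        positions.setdefault(query[i:i + w], []).append(i)
    hits = []
    for seq_id, seq in sequences:
        for j in range(len(seq) - w + 1):
            for i in positions.get(seq[j:j + w], ()):
                hits.append((seq_id, i, j))
    return hits


def split_database(sequences, n_parts):
    """Deal the sequences round-robin into n_parts segments."""
    return [sequences[k::n_parts] for k in range(n_parts)]


def parallel_search(query, sequences, n_workers=4):
    """Search each database segment in its own worker, then merge the hits."""
    parts = split_database(sequences, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(lambda part: find_word_hits(query, part), parts))
    return sorted(hit for part_hits in results for hit in part_hits)


# Toy database; the first entry is the database string from the deck's figure.
database = [
    ("db1", "GSWNLAALDKDPMGDKNRIEERLNLVEAIKWPLMDJN"),
    ("db2", "AAAAPQGAAAA"),
    ("db3", "CCCCCCCC"),
]
hits = parallel_search("SLAALLNKCKTPQGQWLVNQWIKWPLMDKNRIEERLN", database, n_workers=2)
print(len(hits), hits[:3])
```

Because each segment is searched independently, the merged result equals a serial search over the whole database. On a real cluster the segments would live on separate nodes, and the merge step would also need to reconcile any statistics that depend on total database size.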

13 Conclusion. The BLAST algorithm has a diverse family of programs. Several implementations exist for parallelizing the BLAST algorithm. Future work will include further exploration of the various parallelized BLAST algorithms on clusters.

14 Acknowledgements. I would like to extend my thanks to Stephen L. Scott, John Mugler, Thomas Naughton, and Brian Luethke for their invaluable mentoring; Michaelangelo Salcedo for his guidance; and Debbie McCoy and Cheryl Hamby for their support in the RAM program.

15 Disclaimer. This research was performed under the Research Alliance for Minorities Program administered through the Computer Science and Mathematics Division, Oak Ridge National Laboratory. This Program is sponsored by the Mathematical, Information, and Computational Sciences Division; Office of Advanced Scientific Computing Research; U.S. Department of Energy. Oak Ridge National Laboratory is managed by UT-Battelle, LLC, for the U.S. Department of Energy under contract DE-AC05-00OR22725. This research used resources of the Center for Computational Sciences at Oak Ridge National Laboratory, which is supported by the Office of Science, U.S. Department of Energy. This work has been authored by a contractor of the U.S. Government under contract DE-AC05-00OR22725. Accordingly, the U.S. Government retains a nonexclusive, royalty-free license to publish or reproduce the published form of this contribution, or allow others to do so, for U.S. Government purposes.

