Zach Miller Computer Sciences Department University of Wisconsin-Madison Bioinformatics Applications.

Presentation transcript:

1 Bioinformatics Applications and Workloads
Zach Miller
Computer Sciences Department, University of Wisconsin-Madison
zmiller@cs.wisc.edu
http://www.cs.wisc.edu/condor

2 Collaboration with the BMRB
The BioMagResBank is a repository for data from NMR spectroscopy on proteins. Two main efforts:
- Weekly BLAST run
- Protein Structure Determination

3 BLAST
A framework written in Perl completely automates the process:
- Requires no previous setup
- Downloads and installs BLAST
- Retrieves and formats all DBs
- Retrieves input queries from a URL

4 BLAST
- Input can be in .tar, .zip, .gz, or .Z files
- Automatically splits input
- Creates Condor jobs and a .dag file
- Achieves fault tolerance by using DAGMan to oversee the run
- When all results are complete, it packages the results and log files
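The split-and-DAG step above could look roughly like the following. This is a minimal Python sketch, not the Perl framework from the talk: it assumes the input is FASTA text, and the function names, the node names (`blast_N`, `package`), and the submit-file names are hypothetical. Only the `JOB` and `PARENT ... CHILD` keywords are standard DAGMan input syntax.

```python
def split_fasta(text, seqs_per_chunk):
    """Split FASTA text into chunks holding at most seqs_per_chunk records."""
    records, current = [], []
    for line in text.splitlines():
        # A ">" header starts a new record, so close the previous one first.
        if line.startswith(">") and current:
            records.append("\n".join(current))
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current))
    return [
        "\n".join(records[i:i + seqs_per_chunk]) + "\n"
        for i in range(0, len(records), seqs_per_chunk)
    ]


def build_dag(n_chunks):
    """Return DAGMan input text: one BLAST node per chunk, then a packaging node."""
    if n_chunks < 1:
        raise ValueError("need at least one chunk")
    # Node and submit-file names are hypothetical placeholders.
    nodes = [f"blast_{i}" for i in range(n_chunks)]
    lines = [f"JOB {name} {name}.sub" for name in nodes]
    lines.append("JOB package package.sub")
    # The packaging node runs only after every BLAST node has succeeded.
    lines.append(f"PARENT {' '.join(nodes)} CHILD package")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    fasta = ">q1\nMKT\n>q2\nGAV\n>q3\nLLP\n"
    chunks = split_fasta(fasta, seqs_per_chunk=2)
    print(build_dag(len(chunks)), end="")
```

Making packaging a child of every BLAST node is one way to express "when all results are complete" directly in the workflow, so no separate polling is needed.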

5 BLAST
- Resulting tarballs can be configured to be no larger than a certain size for more reliable transfer
- After tarballs are created, they are automatically sent to an FTP server
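One simple way to cap tarball size is to group result files greedily before archiving. The sketch below is an assumption about how that could be done, not the framework's actual logic. The function names and the `results` prefix are hypothetical, and the cap applies to the summed input sizes, so the size of the compressed archive is only approximated.

```python
import tarfile


def group_by_size(files, max_bytes):
    """Greedily pack (name, size) pairs into groups whose sizes sum to <= max_bytes.

    A single file larger than max_bytes gets a group of its own.
    """
    groups, current, current_size = [], [], 0
    for name, size in files:
        # Start a new group when adding this file would exceed the cap.
        if current and current_size + size > max_bytes:
            groups.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        groups.append(current)
    return groups


def write_tarballs(groups, prefix="results"):
    """Write each group to prefix_NNN.tar.gz and return the archive names."""
    names = []
    for i, group in enumerate(groups):
        archive = f"{prefix}_{i:03d}.tar.gz"
        with tarfile.open(archive, "w:gz") as tar:
            for path in group:
                tar.add(path)
        names.append(archive)
    return names
```

For example, `group_by_size([("a", 40), ("b", 50), ("c", 30)], 100)` puts `a` and `b` in one group and `c` in the next.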

6 BLAST
- We’ve been doing the run every week for about a year with almost no human intervention
- Very easy to add new databases or sets of input sequences!

7 Protein Structure
- Collaboration with Jurgen Doreleijers of the BMRB and Aart Nederveen from Utrecht University in the Netherlands
- Recalculated the structures of over 500 proteins using state-of-the-art techniques
- Applications used were CNS and CYANA

8 Protein Structure
- DAGMan used to manage workflow and to provide fault tolerance.
- Using periodic_remove in the submit file to keep the job from “misbehaving” combines nicely with DAGMan’s RETRY feature.
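A hedged illustration of how the two features fit together: `periodic_remove` takes a job out of the queue when its expression becomes true, DAGMan then sees that node as failed, and `RETRY` resubmits it. The Python sketch below only generates the two text fragments. The runtime threshold, file names, node name, and the exact expression are assumptions, not the settings used in the talk.

```python
def submit_description(executable, max_runtime_s):
    """Return submit-file text that removes a job stuck running too long."""
    return "\n".join([
        "universe = vanilla",
        f"executable = {executable}",
        "log = job.log",
        # JobStatus == 2 means "running"; the job is removed from the queue
        # once it has been in that state longer than max_runtime_s seconds.
        "periodic_remove = (JobStatus == 2) && "
        f"((time() - EnteredCurrentStatus) > {max_runtime_s})",
        "queue",
    ]) + "\n"


def dag_node(name, submit_file, retries):
    """Return DAG text for one node that DAGMan retries up to `retries` times."""
    return f"JOB {name} {submit_file}\nRETRY {name} {retries}\n"


if __name__ == "__main__":
    # Hypothetical values: a 2-hour limit and 3 retries for one protein.
    print(submit_description("run_structure.sh", 7200), end="")
    print(dag_node("protein_1", "protein_1.sub", 3), end="")
```

The appeal of this pairing is that the submit file only needs to define what counts as misbehaving, while the retry policy lives in the DAG.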

9 Protein Structure
- The effort used about 30,000 hours of compute time
- We accomplished the run in about 60 hours of real time
- The framework I created makes it simple to compute the structure of as many proteins as you like: easy, automatic, and repeatable.
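As a rough back-of-the-envelope check on these figures, the short Python calculation below works out the implied average parallelism. Both inputs are the approximate values quoted above, and the per-structure figure uses the "over 500" count from the earlier slide, so it is an upper bound rather than a measured number.

```python
cpu_hours = 30_000   # total compute time quoted above (approximate)
wall_hours = 60      # elapsed real time quoted above (approximate)
structures = 500     # "over 500" proteins, so the per-structure value is an upper bound

# Average number of CPUs busy at once over the whole run.
avg_concurrent_cpus = cpu_hours / wall_hours

# Average compute time spent per recalculated structure.
cpu_hours_per_structure = cpu_hours / structures

print(avg_concurrent_cpus)       # 500.0
print(cpu_hours_per_structure)   # 60.0
```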

10 Protein Structure
- Groups often use different parameters and protocols in structure determination and only calculate a few structures
- Comparing structures from different groups is then difficult

11 Protein Structure
- Our work was significant because it computed not just a few but over 500 structures
- All were computed with the same parameters, making the results very internally consistent (besides being more accurate on their own due to the state-of-the-art techniques)

12 Web Portal
- Currently supports only BLAST
- Being used by a handful of users from the biochem department at the UW
- Interest is growing, so we’ll soon be adding more applications

13 Questions? Thank You!

