BalticGrid-II Project, 2nd BG-II AHM, 13.05.2009, Riga, Latvia. Overview of application CoPS (Comparison of Protein Structures). D. Ludviga, IMCS UL (SigmaNet)


1 Overview of application CoPS (Comparison of Protein Structures)
D. Ludviga, IMCS UL (SigmaNet)

2 Outline
- About CoPS (scientific value)
- What's new?
- Challenges (mentioned during 1AHM)
- Our solution
- Collaboration possibilities

3 About CoPS (scientific value)
- Started at the beginning of BG-II as the pilot application
  - developed by Dr. Natalja Kurbatova and Assoc. Prof. Juris Viksna
- Field: Bioinformatics
“It has taken biologists some 230 years to identify and describe three quarters of a million insects; if there are indeed at least thirty million... then, working as they have in the past, insect taxonomists have ten thousand years of employment ahead of them.” R. Leakey and R. Lewin

4 About CoPS
- Assumption: protein structures have evolved by a stepwise process, each step involving a small change in the structure.
- Protein structures are compared using the Evolutionary Secondary Structures Matching (ESSM) algorithm
  - ESSM was created for pairwise comparison of structures; it makes it possible to identify fold mutations and to estimate the evolutionary relationship between proteins.
- To explore the evolution of protein structures, an all-against-all comparison has to be done.
- Application needs:
  - Protein database (data set description files are stored)
    - PDB (3D), FASTA (.txt), structural elements
    - size ~8 GB (~2.3 GB if compressed)
  - Total number of tasks: 20 451 945, divided into 410 files
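The all-against-all comparison and the split of tasks into files can be sketched as follows. This is a minimal illustration, not the CoPS code: the structure identifiers are made up, and it assumes that each unordered pair of structures is compared once, which the slides do not state. Only the totals (20 451 945 tasks, 410 files) come from the slide.

```python
from itertools import combinations, islice

def all_against_all(ids):
    """Yield every unordered pair of distinct structure IDs exactly once.

    ASSUMPTION: the comparison is treated as symmetric, so (a, b) and
    (b, a) are the same task. The slides do not confirm this.
    """
    return combinations(ids, 2)

def split_into_chunks(pairs, chunk_size):
    """Yield successive lists holding at most chunk_size pairs each."""
    iterator = iter(pairs)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk

# Figures quoted on the slide.
TOTAL_TASKS = 20_451_945
N_FILES = 410

# Ceiling division: the largest number of pairs any single file must hold
# if the tasks are spread as evenly as possible.
pairs_per_file = (TOTAL_TASKS + N_FILES - 1) // N_FILES

# Toy run with hypothetical identifiers: 4 structures give 6 pairs.
toy_ids = ["s1", "s2", "s3", "s4"]
toy_chunks = list(split_into_chunks(all_against_all(toy_ids), 4))
```

With the slide's figures, an even split puts just under 50 000 pair comparisons into each file.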

5 About CoPS
- Application consists of:
  - jdl.essm: JDL file for submitting an ESSM (CoPS) job
  - essm.sh: shell script that is executed on the WN once the job starts
  - database.tar.gz: archive of the protein database with protein descriptions, which is extracted on the WN before anything else starts
  - essm.linux: statically compiled 32-bit executable for ESSM (CoPS) that works on Scientific Linux [CERN] 4
  - pairs.txt: sample calculation file that contains pair comparisons
- At the end of each job, the result file pairs.result is generated
- Afterwards the results are visualized using a self-made tool
  - developed using one of the GRADE components
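The job flow described on this slide (extract database.tar.gz on the worker node, run essm.linux on pairs.txt, collect pairs.result) could look roughly like the Python sketch below. The real wrapper is the shell script essm.sh, whose contents are not shown in the slides. The command-line convention used here for essm.linux (pairs file as the last argument, results written to standard output) is an assumption made purely for illustration, as are the function names.

```python
import subprocess
import tarfile
from pathlib import Path
from typing import Sequence

def prepare_database(archive: Path, workdir: Path) -> None:
    """Unpack the gzip-compressed protein database archive into workdir."""
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(workdir)

def run_essm(command: Sequence[str], pairs_file: Path,
             result_file: Path, workdir: Path) -> int:
    """Run the comparison program on one pairs file and save its output.

    ASSUMPTION: the program takes the pairs file as its last argument and
    prints results to standard output. The actual interface of essm.linux
    is not documented in the slides.
    """
    with open(result_file, "w") as out:
        completed = subprocess.run(
            [*command, str(pairs_file)], cwd=workdir, stdout=out
        )
    return completed.returncode

# Hypothetical usage on a worker node (paths are illustrative):
#   work = Path.cwd()
#   prepare_database(work / "database.tar.gz", work)
#   run_essm([str(work / "essm.linux")], work / "pairs.txt",
#            work / "pairs.result", work)
```

Passing the command as a list keeps the sketch independent of how the executable is actually invoked.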

6 About CoPS

7 What's new?
- Developed (results received)
  - ~2 weeks
- Implemented in Migrating Desktop
- Presented/demonstrated at OGF25/EGEE User Forum in Catania, Italy
- Demo

8 Challenges and our solution
- Challenges:
  - Transport the data
    - 410 x 2.3 GB ≈ 950 GB
  - VOMS proxy
- Solutions:
  - The needed data was installed in the software directories of separate clusters (creating “devoted” protein clusters)
  - MyProxy
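The data-transport figure on this slide can be checked with a few lines of arithmetic. The sketch below compares shipping the compressed database with every job file against installing it once per cluster. The cluster count used in the example is hypothetical; the slides do not say how many clusters received the data.

```python
# Figures from the slides.
N_JOB_FILES = 410   # number of task files, one job per file
ARCHIVE_GB = 2.3    # compressed size of the protein database

# Shipping the archive with every job: 410 x 2.3 GB, about 943 GB,
# which the slide rounds to roughly 950 GB.
per_job_total_gb = N_JOB_FILES * ARCHIVE_GB

def preinstalled_total_gb(n_clusters, archive_gb=ARCHIVE_GB):
    """Data moved when the archive is installed once per cluster."""
    return n_clusters * archive_gb

# ASSUMPTION: 5 clusters is an illustrative number only.
example_total_gb = preinstalled_total_gb(5)
```

Under that assumed cluster count, the one-time installation moves about 11.5 GB, which is why pre-installing the database removes most of the transfer cost.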

9 Results
- The results of the ESSM algorithm were successfully used to explore the CATH fold space, using fold space graphs to represent comparison results and to estimate "evolution distance" on the basis of observed changes.
- The results obtained in the application can be regarded as a few steps toward the creation of a general protein evolution model.

10 Collaboration
“Computer science is no more about computers than astronomy is about telescopes” E. W. Dijkstra
- Continue collaboration with biologists in LU
- Develop a VO or just devoted servers:
  - PDB can be installed in a cluster's VO software directory
    - to speed up execution of jobs and avoid per-job download and extraction of these databases

11 Thank you!

