BalticGrid-II Project, 2nd BG-II AHM, 13.05.2009, Riga, Latvia. Overview of application CoPS (Comparison of Protein Structures). D. Ludviga, IMCS UL (SigmaNet)



Outline
- About CoPS (scientific value)
- What's new?
- Challenges (mentioned during the 1st AHM)
- Our solution
- Collaboration possibilities

About CoPS (scientific value)
- Started at the beginning of BG-II as the pilot application.
- Developed by Dr. Natalja Kurbatova and Assoc. Prof. Juris Viksna.
- Field: bioinformatics.

"It has taken biologists some 230 years to identify and describe three quarters of a million insects; if there are indeed at least thirty million... then, working as they have in the past, insect taxonomists have ten thousand years of employment ahead of them." (R. Leakey and R. Lewin)

About CoPS
- Assumption: protein structures have evolved by a stepwise process, each step involving a small change in the structure.
- Protein structures are compared using the Evolutionary Secondary Structures Matching (ESSM) algorithm.
  - ESSM was created for pairwise comparison of structures; it makes it possible to identify fold mutations and to estimate the evolutionary relationship between proteins.
- To explore the evolution of protein structures, an all-against-all comparison has to be done.
- Application needs:
  - Protein database (data set description files are stored): PDB (3D), FASTA (.txt), structural elements; size ~8 GB (~2.3 GB compressed).
  - Total number of tasks [value lost in the source], divided into 410 files.
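As a rough illustration of the all-against-all setup described above, the following Python sketch enumerates every unordered pair of structure identifiers and splits the pairs into a fixed number of task lists. The identifiers and function names are hypothetical, and the slides do not say how CoPS actually distributes pairs across its 410 files; this only shows one plausible way to do it.

```python
from itertools import combinations


def all_against_all(ids):
    """Return every unordered pair of distinct structure identifiers."""
    return list(combinations(sorted(ids), 2))


def split_into_chunks(pairs, n_chunks):
    """Split pairs into n_chunks lists whose sizes differ by at most one."""
    base, extra = divmod(len(pairs), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        chunks.append(pairs[start:start + size])
        start += size
    return chunks


# Hypothetical identifiers, for illustration only.
structures = ["1abc", "2xyz", "3def", "4ghi"]
pairs = all_against_all(structures)
chunks = split_into_chunks(pairs, 4)
```

For n structures this yields n(n-1)/2 pairs, which is why the number of comparisons grows quickly and the work is worth spreading over many grid jobs.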

About CoPS
The application consists of:
- jdl.essm: JDL file for submitting the ESSM (CoPS) job
- essm.sh: shell script that is executed on the worker node (WN) once the job starts
- database.tar.gz: archive of the protein database with protein descriptions, extracted on the WN before anything else starts
- essm.linux: statically compiled 32-bit ESSM (CoPS) executable that works on Scientific Linux [CERN] 4
- pairs.txt: sample calculation file that contains pair comparisons

At the end of each job the result file pairs.result is generated. The results are afterwards visualized using a self-made tool, developed using one of the GRADE components.
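The contents of jdl.essm are not shown in the slides. As a hedged sketch of what such a job description might look like, the Python helper below renders a minimal JDL text from the file names listed above, using the standard Executable, Arguments, StdOutput, StdError, InputSandbox and OutputSandbox attributes. The helper name, the argument passed to essm.sh, the std.out/std.err names, and the decision to leave database.tar.gz out of the sandbox are all assumptions.

```python
def render_jdl(executable, arguments, input_files, output_files):
    """Render a minimal gLite-style JDL job description as text."""
    def quoted_list(items):
        return "{" + ", ".join('"%s"' % item for item in items) + "}"

    lines = [
        'Executable = "%s";' % executable,
        'Arguments = "%s";' % arguments,
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        "InputSandbox = %s;" % quoted_list(input_files),
        "OutputSandbox = %s;" % quoted_list(["std.out", "std.err"] + output_files),
    ]
    return "\n".join(lines) + "\n"


jdl_text = render_jdl(
    executable="essm.sh",
    arguments="pairs.txt",  # assumed: the script takes the task file as its argument
    input_files=["essm.sh", "essm.linux", "pairs.txt"],
    output_files=["pairs.result"],
)
print(jdl_text)
```

Generating the description from a template like this would make it straightforward to produce one JDL file per task file.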

[Figure: About CoPS]

What's new?
- Developed (results received)
  - ~2 weeks
- Implemented in Migrating Desktop
- Presented and demonstrated at the OGF25/EGEE User Forum in Catania, Italy
- Demo

Challenges and our solution
Challenges:
- Transporting the data: 410 x 2.3 GB ≈ 950 GB
- VOMS proxy

Solutions:
- The required data was installed in the software directories of separate clusters (dedicated "protein" clusters were set up).
- MyProxy
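The data-transport estimate above can be reproduced with a few lines of Python. The sketch compares shipping the compressed archive once per task file with installing it once per cluster; the figure of three clusters is purely hypothetical, since the slides do not say how many clusters were used.

```python
def transfer_volume_gb(n_transfers, archive_gb=2.3):
    """Total data moved, in GB, if the archive is shipped n_transfers times."""
    return n_transfers * archive_gb


per_job_gb = transfer_volume_gb(410)    # one download per task file
per_cluster_gb = transfer_volume_gb(3)  # hypothetical: three clusters, one install each

print("per job: about %.0f GB" % per_job_gb)
print("per cluster: about %.1f GB" % per_cluster_gb)
```

The per-job figure comes to about 943 GB, which the slide rounds to roughly 950 GB.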

Results
- The results of the ESSM algorithm were successfully used to explore the CATH fold space, using fold space graphs to represent the comparison results and estimating "evolution distance" on the basis of observed changes.
- The results obtained with the application can be regarded as a few steps toward the creation of a general protein evolution model.

Collaboration
"Computer science is no more about computers than astronomy is about telescopes" (E. W. Dijkstra)
- Continue collaboration with biologists at LU.
- Develop a VO or just dedicated servers:
  - PDB can be installed in a cluster's VO software directory, to speed up execution of jobs and avoid per-job download and extraction of these databases.
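A minimal Python sketch of the idea in the last point: a job first looks for a database preinstalled in the VO software directory and only falls back to extracting an archive if none is found. The function name, the "cops_db" subdirectory, and the assumption that the worker node exports a VO_<NAME>_SW_DIR environment variable are illustrative; the slides do not describe how essm.sh actually locates the data.

```python
import os
import tarfile


def locate_database(vo_name, archive="database.tar.gz", workdir="."):
    """Return a directory holding the protein database.

    Prefer a copy preinstalled in the VO software directory; otherwise
    fall back to extracting the archive shipped with the job.
    """
    # Assumed convention: the worker node exports VO_<NAME>_SW_DIR.
    # The "cops_db" subdirectory name is a hypothetical choice.
    env_var = "VO_%s_SW_DIR" % vo_name.upper().replace(".", "_").replace("-", "_")
    sw_dir = os.environ.get(env_var)
    if sw_dir:
        candidate = os.path.join(sw_dir, "cops_db")
        if os.path.isdir(candidate):
            return candidate
    # Fallback: per-job extraction, the slow path the slide wants to avoid.
    target = os.path.join(workdir, "database")
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(target)
    return target
```

A job would call, for example, locate_database("balticgrid") and pass the returned path to the executable; the VO name here is only an example.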

Thank you!