Bioinformatics Grid Application for Life Science. COMMUNICATION NETWORK DEVELOPMENT SPECIFIC SUPPORT ACTION BIOINFOGRID Andreas Gisel & Luciano Milanesi.

Project goals
In the BIOINFOGRID Specific Support Action (SSA) we propose combining bioinformatics services and applications for molecular biology users with the Grid infrastructure created by the EGEE project (6th Framework Programme).
In the BIOINFOGRID initiative we plan to evaluate genomics, transcriptomics, proteomics and molecular dynamics application studies based on Grid technology.
BIOINFOGRID COMMUNICATION NETWORK DEVELOPMENT

Project goals
The BIOINFOGRID SSA will establish a common ground for collaboration between the European Grid infrastructure providers and the bioinformatics research user community in various fields of bioinformatics applications (Biology, Computational Chemistry, Medicine and Biotechnology).
This will be achieved through specific studies for each reference application in the bioinformatics domain, in which experts from various disciplines can collaborate on the solution of highly complex problems.
One of the key objectives is to “combine bioinformatics services and applications for molecular biology that have thousands of users, with the Grid Infrastructure created by the EGEE Project”.

Project Timescale
- Full project duration: 24 months
- Starting date: 1st January 2006
Project Budget: Euro

Project Partners
Partic. Role* | Participant name | Short name | Country
CO | Consiglio Nazionale delle Ricerche - Istituto Tecnologie Biomediche | CNR | Italy
CR | Istituto Nazionale di Fisica Nucleare | INFN | Italy
CR | Deutsches Krebsforschungszentrum | DKFZ | Germany
CR | Centre National de la Recherche Scientifique | CNRS | France
CR | The Chancellor, Master and Scholars of the University of Cambridge | UCAM | UK
CR | Consorzio Interuniversitario Lombardo Elaborazione Automatica | CILEA | Italy
CR | Steinbeis GmbH & Co | StC | Germany

Relationships to EGEE
BIOINFOGRID aims to use:
- the EGEE infrastructure
- the gLite middleware
BIOINFOGRID will join the Biomed VO and evaluate the opportunity of setting up a specific BIOINFOGRID VO.
BIOINFOGRID will bring resources to the EGEE infrastructure:
- A farm of 28 CPUs at ITB has recently been added to the INFN GRID/EGEE infrastructure.
- On the order of 600 CPUs will be available in the BIOINFOGRID community during the next year (>10% of which will be accessible from the EGEE infrastructure).

Relationships to EGEE and other projects
BIOINFOGRID partners are also involved in HEALTHGRID, EMBRACE, and EGEE.
Potential collaboration with the BIOSAPIENS, DILIGENT, ICEAGE, EUCHINAGRID, ETICS, EUMEDGRID, EELA, SEEGRID, BalticGrid, ISSeG, eIRGSP, and BELIEF projects is envisaged.

Project applications (first 5 WPs)
The project will support studies on applications for:
- distributed laboratory management systems for microarray technology
- gene expression studies
- gene data mining
- analysis of cDNA data
- phylogenetic analysis
- distributed database access
- protein functional analysis
- molecular dynamics simulations on the Grid

Dissemination of knowledge (WP6 and WP7)
The major objectives of the BIOINFOGRID project are to:
- raise awareness, inside the bioinformatics community, of the potential offered by Grid technology for solving bioinformatics research problems,
- build a solid core of specialists with knowledge of the major aspects of bioinformatics applications on the Grid,
- evaluate and adopt common solutions for porting the BIOINFOGRID applications to the Grid.

Aims and Needs
Most bioinformatics applications need, in one way or another, publicly available databases (DBs).
We are looking for:
- the major public biological DBs deployed on the Grid (~600 GB, flat file and RDBMS),
- multiple copies of those DBs for efficient and seamless access,
- a synchronized updating system, since those DBs are regularly updated,
- a common middleware (API) for unified access.

Aims and Needs
Many applications are data and/or processing intensive.
We need:
- to parallelize the applications (divide them into many independent jobs),
- to have access to CPUs and storage (VO-biomed, VO-bioinfogrid),
- to have access to stable and/or temporary replicas of certain DBs.
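
Dividing an application into many independent jobs can be sketched as below. This is only a minimal illustration: the function name `split_into_jobs` and the idea of chunking a plain list of inputs are assumptions made for the example, not part of BIOINFOGRID or of the gLite middleware.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_jobs(items: Sequence[T], n_jobs: int) -> List[List[T]]:
    """Divide items into contiguous, near-equal chunks, one per independent job.

    Hypothetical helper: each returned chunk could be submitted as a separate
    Grid job, since no chunk depends on the result of another.
    """
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    # Never create more jobs than there are items (but always at least one).
    n_jobs = min(n_jobs, len(items)) or 1
    base, extra = divmod(len(items), n_jobs)
    jobs: List[List[T]] = []
    start = 0
    for i in range(n_jobs):
        # The first `extra` jobs take one additional item each.
        size = base + (1 if i < extra else 0)
        jobs.append(list(items[start:start + size]))
        start += size
    return jobs


if __name__ == "__main__":
    for job_id, chunk in enumerate(split_into_jobs(list(range(10)), 3)):
        print(job_id, chunk)
```

Contiguous chunks keep each job's input easy to describe (a start and an end index), which helps when a job has to be resubmitted.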

Aims and Needs
Bioinformatics applications run either as batch jobs or on demand.
Highly desirable are:
- simple, user-friendly access to the Grid for “simple users”,
- the possibility of accessing CPUs outside VO-biomed / VO-bioinfogrid for “advanced users”,
- the possibility of launching clusters of jobs.

Aims and Needs
Many applications involve several different tools.
An efficient workflow management system and an efficient data management system are highly desirable.
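
What a workflow management system has to decide at minimum is the order in which dependent tools run. The sketch below shows that ordering step using Python's standard `graphlib` module (Python 3.9+). The step names are hypothetical, loosely modelled on the protein surface analysis in this presentation, and the code is not taken from any BIOINFOGRID workflow engine.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical workflow: each key is a step, each value the set of steps
# that must finish before it can start.
workflow: Dict[str, Set[str]] = {
    "surface_description": {"volumetric_description"},
    "parameter_decomposition": {"surface_description"},
    "surface_comparison": {"parameter_decomposition"},
}


def execution_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return one valid execution order; raises CycleError on circular dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(workflow)

if __name__ == "__main__":
    for position, step in enumerate(order, start=1):
        print(position, step)
```

A real workflow system would also track data movement between steps and resubmit failed jobs; this sketch covers the ordering only.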

Applications ready to be deployed: Functional Analogous Finder
Goal: comparing gene products according to their described function instead of by the conventional sequence comparison.
Data source: Gene Ontology (GO) and gene association → GODB: GO-terms, ~1.3M gene products, 7.1M associations
Processing: one gene against all others → 1 CPU hour
Output: text file with 100 best hits → compiled as an additional package of the GODB
Tested solution: temporary distribution of GODB over several worker nodes and parallel processing of a sublist of input gene products
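
The figures on this slide suggest why the work has to be distributed. The estimate below is a rough sketch that assumes the "1 CPU hour" applies to each of the ~1.3M gene products when every one of them is compared against all others. The worker count of 500 is a hypothetical number chosen for illustration, and perfect load balancing is assumed.

```python
HOURS_PER_YEAR = 365.25 * 24  # 8766 hours


def all_vs_all_cpu_hours(n_gene_products: int, cpu_hours_per_query: float) -> float:
    """Total CPU hours if every gene product is queried against all others."""
    return n_gene_products * cpu_hours_per_query


def ideal_wall_clock_days(total_cpu_hours: float, n_workers: int) -> float:
    """Wall-clock days with the load spread evenly over n_workers (no overhead)."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    return total_cpu_hours / n_workers / 24


# Figures from the slide: ~1.3M gene products, 1 CPU hour per query.
total_cpu_hours = all_vs_all_cpu_hours(1_300_000, 1.0)
total_cpu_years = total_cpu_hours / HOURS_PER_YEAR
# 500 worker nodes is an assumed value, not a figure from the presentation.
days_on_500_workers = ideal_wall_clock_days(total_cpu_hours, 500)

if __name__ == "__main__":
    print(f"{total_cpu_hours:,.0f} CPU hours = {total_cpu_years:.1f} CPU years")
    print(f"about {days_on_500_workers:.0f} days on 500 workers")
```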

Applications ready to be deployed: Unfolding and ion binding events in gramicidin
Goal: study the effect of ions on the stability and structure of gramicidin in two different conformations.
Molecular Dynamics program
Input: PDB structure file: 1000 atoms in (1), 2500 atoms in (2); starting parameters → 1M
Process:
1) Run 10 MD trajectories each of 4 ns → 266 CPU h
2) Run 10 MD trajectories each of 2.5 ns → 419 CPU h
Output: MD trajectory, describing the time evolution of the system
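
The two runs can be compared by their cost per simulated nanosecond. The sketch below assumes that the quoted CPU hours are totals for all 10 trajectories of each run; the slide does not say so explicitly. The class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MDCampaign:
    label: str
    trajectories: int
    ns_per_trajectory: float
    total_cpu_hours: float  # assumed to cover all trajectories of the run

    @property
    def simulated_ns(self) -> float:
        """Total simulated time across all trajectories, in nanoseconds."""
        return self.trajectories * self.ns_per_trajectory

    @property
    def cpu_hours_per_ns(self) -> float:
        """Average CPU cost of one simulated nanosecond."""
        return self.total_cpu_hours / self.simulated_ns


campaigns = [
    MDCampaign("conformation 1 (1000 atoms)", 10, 4.0, 266.0),
    MDCampaign("conformation 2 (2500 atoms)", 10, 2.5, 419.0),
]
total_cpu_hours = sum(c.total_cpu_hours for c in campaigns)

if __name__ == "__main__":
    for c in campaigns:
        print(f"{c.label}: {c.simulated_ns:.0f} ns, {c.cpu_hours_per_ns:.2f} CPU h/ns")
    print(f"total: {total_cpu_hours:.0f} CPU h")
```

Under that assumption, the larger system costs roughly two and a half times as much CPU time per simulated nanosecond as the smaller one.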

Applications ready to be deployed: World-wide In Silico Docking On Malaria (WISDOM)
Goal: virtual screening for in silico drug discovery for malaria
First step in 2005: docking data challenge
- Total of about 46 million ligands docked in 6 weeks
- 1 TB of data produced
- Up to 1000 computers in 15 countries used simultaneously, corresponding to about 80 CPU years
BIOINFOGRID goal: extend the WISDOM virtual screening pipeline to Molecular Dynamics
- Starting from EGEE data challenge results
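
The headline numbers of the data challenge can be cross-checked with a little arithmetic. The sketch below uses only the figures quoted on the slide (46 million ligands, 6 weeks, about 80 CPU years); the variable names and the 365.25-day year are choices made for this example.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

ligands_docked = 46_000_000   # "about 46 million ligands"
cpu_years_used = 80           # "about 80 CPU years"
challenge_weeks = 6           # "docked in 6 weeks"

# Average CPU time spent on a single docking.
seconds_per_docking = cpu_years_used * SECONDS_PER_YEAR / ligands_docked

# Average number of CPUs that must have been busy at any time to deliver
# 80 CPU years within 6 weeks.
avg_concurrent_cpus = cpu_years_used * 365.25 / (challenge_weeks * 7)

if __name__ == "__main__":
    print(f"{seconds_per_docking:.1f} CPU seconds per docking on average")
    print(f"{avg_concurrent_cpus:.0f} CPUs busy on average")
```

An average of roughly 700 busy CPUs is consistent with the reported peak of up to 1000 computers in simultaneous use.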

Applications ready to be deployed: Analysis of the protein surface
Goal: define a protein surface model by identifying the exposed residues, and search for protein similarities using such models.
Input: PDB files containing the atomic coordinates of proteins; PDB database → datasets
Process: 3 steps
- volumetric description → 15 min CPU/prot
- protein surface description → 10 min CPU/prot
- decomposition into parameters → 15 min CPU/prot
Output: a database of protein surface parameters describing the surface characteristics at some protein functional sites.
Compare protein surfaces to establish relationships between proteins → 25 min CPU/prot
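
The per-protein timings add up as sketched below. Treating the 25-minute comparison as a per-protein cost added once to the three processing steps is an assumption, and the dataset size of 1000 proteins is a hypothetical value, since the slide does not give the number of datasets.

```python
# Per-protein CPU minutes for the three processing steps, as listed on the slide.
STEP_MINUTES = {
    "volumetric description": 15,
    "protein surface description": 10,
    "decomposition into parameters": 15,
}
COMPARISON_MINUTES = 25  # surface comparison, also quoted per protein


def pipeline_cpu_hours(n_proteins: int, include_comparison: bool = True) -> float:
    """Estimated CPU hours to process n_proteins through the pipeline."""
    per_protein = sum(STEP_MINUTES.values())
    if include_comparison:
        # Assumption: the comparison is charged once per protein.
        per_protein += COMPARISON_MINUTES
    return n_proteins * per_protein / 60


minutes_three_steps = sum(STEP_MINUTES.values())

if __name__ == "__main__":
    print(f"three steps: {minutes_three_steps} min CPU per protein")
    # 1000 proteins is an illustrative dataset size, not a figure from the slide.
    print(f"1000 proteins: {pipeline_cpu_hours(1000):.0f} CPU hours")
```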