EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE
EGEE and gLite are registered trademarks


Web interface for Protein Sequence Analysis on Grid
Blanchet C., Mollon R., Combet C. and Deléage G.
Pôle Bioinformatique de Lyon (PBIL)
Institut de Biologie et de Chimie des Protéines (IBCP), CNRS UMR 5086
Lyon-Gerland, France

Institute of Biology and Chemistry of Proteins
French CNRS institute, associated with Univ. Lyon 1
- Life science
- About 160 people
- Located in Lyon, France
Study of proteins in their biological context
- Approaches used include integrative cellular techniques (cell culture, various types of microscopy) and molecular techniques, both experimental (including biocrystallography and nuclear magnetic resonance) and theoretical (structural bioinformatics)
Three main departments, bringing together 13 groups
- Topics such as cancer, extracellular matrix, tissue engineering, membranes, cell transport and signalling, bioinformatics and structural biology

Protein?
Fructose repressor DNA-binding domain, NMR, minimized structure
- Penin, F., Geourjon, C., Montserret, R., Bockmann, A., Lesage, A., Yang, Y., Bonod-Bidaud, C., Cortay, J.C., Negre, D., Cozzone, A.J., Deleage, G.
Sequence (FASTA):
>1UXC:_|PDBID|CHAIN|SEQUENCE
MKLDEIARLAGVSRTTASYVINGKAKQYRVSDKTVEKVMAVVREH
NYHPNAVAAGLRLQHHHHHH
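The FASTA record on this slide is the kind of input that the sequence analysis tools consume. As a minimal sketch (the function name parse_fasta and the RECORD constant are illustrative, not part of the NPS@ or GPSA code), such a record could be read like this:

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; join them without separators.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# The record shown on the slide (PDB entry 1UXC).
RECORD = """>1UXC:_|PDBID|CHAIN|SEQUENCE
MKLDEIARLAGVSRTTASYVINGKAKQYRVSDKTVEKVMAVVREH
NYHPNAVAAGLRLQHHHHHH
"""

if __name__ == "__main__":
    for header, sequence in parse_fasta(RECORD):
        print(header, len(sequence))
```

The parser joins wrapped sequence lines, so the two lines of residues above become a single string.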

Bioinformatics Web Portal
- Network Protein Sequence Analysis (release 3)
- Online since …
- … integrated methods for protein sequence analysis
- 12 online up-to-date biological databanks
- International pointers: ExPASy (CH), University of California, ...
Ref.: Combet C., Blanchet C., Geourjon C. and Deléage G., "Network Protein Sequence Analysis", TiBS, 2000, 25.

Hits
- More than 8 million analyses since 1998
- More than 5000 analyses/day

Biological Data and Tools
Data
- Numerous: more than 800 (Galperin et al., 2006)
- Heterogeneous
  - Data and metadata: Swiss-Prot is 12% data and 88% metadata; TrEMBL is 19% data and 81% metadata
  - Size: kB to GB
  - Authors and initial location
  - Storage: file, object, image
  - Format: EMBL, GenBank, Pearson-Fasta, PDB, pubmed, …
- Updatable
- In some cases sensitive (patient, industrial, scientific)
Tools
- Numerous
  - BioCatalog: … at end of 1990s
  - EMBOSS toolkit: …
- Heterogeneous
  - Bioinformatics algorithm: sequence similarity, multiple alignment, structural prediction, …
  - Execution: sequential, MPI, OpenMP, …
- Data I/O
  - Text files
  - Specific format
  - Local I/O only
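To illustrate what format heterogeneity means in practice, here is a rough sketch that guesses a flat-file format from its first non-blank line. The function name sniff_format is hypothetical, and the leading markers it checks (">" for FASTA, "LOCUS" for GenBank, "ID" followed by three spaces for EMBL-style entries, "HEADER" or "ATOM" for PDB) are common conventions only; real files may start differently, so this is a heuristic and not a validator.

```python
def sniff_format(text):
    """Guess a sequence/structure file format from its first non-blank line.

    Heuristic only: it checks conventional leading markers and returns
    "unknown" for anything else.
    """
    for line in text.splitlines():
        if not line.strip():
            continue  # skip leading blank lines
        if line.startswith(">"):
            return "fasta"
        if line.startswith("LOCUS"):
            return "genbank"
        if line.startswith("ID   "):
            # EMBL-style flat files (Swiss-Prot entries use the same line code).
            return "embl"
        if line.startswith(("HEADER", "ATOM", "HETATM")):
            return "pdb"
        return "unknown"
    return "unknown"


if __name__ == "__main__":
    print(sniff_format(">1UXC:_|PDBID|CHAIN|SEQUENCE\nMKLDEIARLAG"))
```

A wrapper that must feed many tools needs some step of this kind, or an explicit format declaration, before it can convert or route the data.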

Grid-enabling tools with XML wrapping
Describe to gridify!
- Description of an application
  - Arguments, output, options
  - Introducing attributes about biological semantics
  - XML-based: Bio_methods.dtd
  - Blast.xml, pattinprot.xml, clustalw.xml, …
- Use of application descriptors
  - To build the user interface: bio_cgi for HTML
  - To manage jobs: bio_launcher to launch jobs on a cluster or grid
Example of pattinprot.xml
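The descriptor idea above can be sketched as follows. The slide does not reproduce Bio_methods.dtd or the content of pattinprot.xml, so every element name, attribute name and command-line flag in this example is an assumption made for illustration; build_command is a hypothetical helper, not the actual bio_launcher.

```python
import shlex
import xml.etree.ElementTree as ET

# Hypothetical descriptor: element names, attributes and flags are assumed,
# they do not come from the real Bio_methods.dtd or pattinprot.xml.
DESCRIPTOR = """
<method name="pattinprot" executable="pattinprot">
  <argument name="pattern" flag="-p" type="string" semantic="protein_pattern"/>
  <argument name="databank" flag="-d" type="file" semantic="protein_databank"/>
  <option name="mismatch" flag="-m" type="int" default="0"/>
  <output name="result" flag="-o" type="file"/>
</method>
"""


def build_command(descriptor_xml, values):
    """Turn a descriptor plus user-supplied values into an argument list."""
    root = ET.fromstring(descriptor_xml)
    command = [root.get("executable")]
    for element in root:
        name = element.get("name")
        # Fall back to the declared default when the user gave no value.
        value = values.get(name, element.get("default"))
        if value is None:
            if element.tag == "argument":
                raise ValueError(f"missing required argument: {name}")
            continue  # optional element without a value: leave it out
        command += [element.get("flag"), str(value)]
    return command


if __name__ == "__main__":
    cmd = build_command(
        DESCRIPTOR,
        {"pattern": "C-x(2)-C", "databank": "swissprot.fasta", "result": "out.txt"},
    )
    print(shlex.join(cmd))
```

The same parsed descriptor could drive both sides mentioned on the slide: a form generator would iterate over the elements to emit input fields, while a launcher would build the command line and submit it to a cluster or grid.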

Grid-enabled Biological Resources

Data Virtualization
Perroquet: IBCP's extension to the Parrot tool
- Adding EGEE file namespace URL recognition
- Adding EGEE name resolution
- Querying the file catalog (RLS, LFC)
Based on the Parrot tool
- Collaboration with D. L. Thain (Univ. ND, USA)
- Custom I/O: chirp, ftp, gsi-ftp, …
- Paper accepted at GRID'2006
Conclusion
- Grid-enabling legacy applications on the EGEE grid
- Good performance
Ref.: Christophe Blanchet, Rémi Mollon, Douglas L. Thain and Gilbert Deléage. Grid Deployment of Legacy Bioinformatics Applications with Transparent Data Access. To be presented at the GRID'2006 conference, Barcelona, Sept. 2006.
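The two steps listed for Perroquet, recognizing a grid namespace URL and resolving it through a file catalog, can be illustrated with a toy model. This is not Perroquet's implementation: the catalog is an in-memory dictionary standing in for RLS or LFC, the "lfn:" prefix is assumed as the logical-name marker, and the host names and paths are invented.

```python
# Hypothetical in-memory stand-in for a grid file catalog (RLS or LFC).
# Logical names, hosts and paths are invented for illustration.
CATALOG = {
    "/grid/biomed/example/swissprot.fasta": [
        "gsiftp://se1.example.org/data/swissprot.fasta",
        "gsiftp://se2.example.org/data/swissprot.fasta",
    ],
}

LOGICAL_PREFIX = "lfn:"  # assumed marker for logical file names


def is_logical_name(path):
    """Namespace recognition: does this path designate a grid logical file?"""
    return path.startswith(LOGICAL_PREFIX)


def resolve(path, catalog=CATALOG):
    """Name resolution: map a logical name to one physical replica URL.

    Ordinary local paths are returned unchanged, so a legacy application
    can keep using them alongside grid names.
    """
    if not is_logical_name(path):
        return path
    logical = path[len(LOGICAL_PREFIX):]
    replicas = catalog.get(logical)
    if not replicas:
        raise FileNotFoundError(f"no replica registered for {path}")
    return replicas[0]  # naive choice: first registered replica


if __name__ == "__main__":
    print(resolve("lfn:/grid/biomed/example/swissprot.fasta"))
    print(resolve("/tmp/local.fasta"))
```

In the real tool this interception happens at the level of the application's file I/O, so unmodified legacy programs see grid files as if they were local; the sketch only shows the lookup logic.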

Secure Data Virtualization
Ref.: C. Blanchet, R. Mollon and G. Deléage. Building an Encrypted File System on the EGEE grid: Application to Protein Sequence Analysis. IEEE Proceedings of the First International Conference on Availability, Reliability and Security (ARES'06).

Computation Virtualization: Web portal

Grid Protein Sequence Analysis

Computation Virtualization: Web Services
Joint work with EU-FP6 EMBRACE (LHSG-CT …)

Bioinformatics Workflow: Taverna

Bioinformatics Workflow: Triana

Conclusion
- Grid-enabling biological data and tools
- Deploying biological databases
- Workload management
  - Wrapping tools with a generic XML-based framework
  - Toolkit to remotely run jobs: bio_launcher
- Data management
  - Grid-enabling legacy applications with remote data access
  - Transparent data encryption system (management and access)
- User interface
  - Web portal: at gpsa-pbil.ibcp.fr
  - Web services: details at gbio.ibcp.fr/ws
- Pending issues
  - Short jobs: … deploying this SDJ recommendation on other biomed sites
  - Data management: still some security issues in the gLite data management system; the DLI interface cannot be activated on the biomed LFC => the RB should use the user credential to access the DLI interface on the LFC

Acknowledgement
Science collaborators
- D. L. Thain (Univ. ND, US)
- Y. Denneulin (IMAG, Fr)
- Members of the EU-FP6 EGEE project
Team collaborators
- C. Blanchet
- R. Mollon (EGEE fellow)
- V. Daric (EMBRACE fellow)
- C. Combet
- G. Deléage (Team Leader)
Work supported in part by projects: French ACI GriPPS (FR-GRID-PPL02-05), EU-FP6 EGEE-II (INFSO-RI-031688), EU-FP6 EMBRACE (LHSG-CT …)