EBI is an Outstation of the European Molecular Biology Laboratory. Rodrigo Lopez Head of EMBL-EBI/ES Andrew Lyall ELIXIR PM. ELIXIR and the integration.

Presentation transcript:

EBI is an Outstation of the European Molecular Biology Laboratory.
ELIXIR and the integration of biomolecular data in life sciences information systems
Rodrigo Lopez, Head of EMBL-EBI/ES
Andrew Lyall, ELIXIR PM

Summary: Definitions, Challenges, Technologies, Solutions, Community, Conclusions

Definitions
Biomolecular: computer representation of living molecules and bio-active compounds (e.g. gene, transcript, gene expression, protein structure, function, drugs). Representations have a structure (i.e. computer-readable formats).
“Web Service” vs. web service: see
Architectures: SOAP (Simple Object Access Protocol), REST (Representational State Transfer) and DAS (Distributed Annotation System)
Cloud: cloud computing is the delivery of computing as a service rather than a product (e.g. lease storage space instead of buying physical hard disks).
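To make the SOAP/REST distinction above concrete, here is a minimal Python sketch of how the same lookup could be expressed in each style. The host name, the `fetchEntry` operation, the service namespace and the identifiers are hypothetical placeholders chosen for illustration; they are not the actual EBI interfaces. Nothing is sent over the network, the code only builds the two request forms.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical service location and names, for illustration only.
BASE_URL = "https://service.example.org"
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:fetch"  # made-up service namespace


def rest_request(database: str, entry_id: str, fmt: str = "fasta") -> str:
    """REST style: the resource is named by the URL and fetched with a plain HTTP GET."""
    query = urlencode({"format": fmt})
    return f"{BASE_URL}/{database}/{entry_id}?{query}"


def soap_request(database: str, entry_id: str, fmt: str = "fasta") -> str:
    """SOAP style: the operation and its arguments travel inside an XML
    envelope that would be POSTed to a single service endpoint."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}fetchEntry")
    for name, value in (("database", database), ("id", entry_id), ("format", fmt)):
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    print(rest_request("uniprot", "P12345"))
    print(soap_request("uniprot", "P12345"))
```

In the REST form the URL alone identifies what is wanted, which is why it can be pasted into a browser or a shell script; the SOAP form carries the same three arguments as a structured message.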

ELIXIR – What is it?
An EU Framework 7 Preparatory Phase Project
Coordinated by Prof Janet Thornton, Director EMBL-EBI
To construct a plan for the operation of a sustainable infrastructure for biological information in Europe
€4.5 million grant awarded May 2007, three-year term
32-member consortium engaging many of Europe’s main bioinformatics funding agencies and research institutes
Deliverables are memoranda of understanding to fund the implementation phase, which could cost €500 million
Interested parties should register as stakeholders via the ELIXIR Website:

Challenges: Biomolecular diversity
Databases
Genomes: Ensembl, Ensembl Genomes, EGA
Nucleotide sequence: ENA
Functional genomics: ArrayExpress, Expression Atlas
Protein sequences: UniProt
Protein families + motifs: InterPro
Macromolecular structure: PDBe
Protein expression: PRIDE
Chemical entities: ChEBI
Interactions + pathways: IntAct, Reactome
Literature and ontologies: CiteXplore, UKPMC, (GO)
Chemogenomics: ChEMBL
Systems: BioModels

Challenges: Growth of core biomolecular data
a. Nucleotide sequences in the European Nucleotide Archive
b. Genomes in Ensembl & Ensembl Genomes
c. Gene expression: hybridisations in the ArrayExpress Archive
d. Protein sequences in UniParc
e. Macromolecular structures in PDBe
f. Protein families, motifs and domains from entries in InterPro

Challenges: Disk storage at EMBL-EBI 7 Petabytes Dec 2011.

Challenges: Storing data
1000genomes will produce 1TB of data but will require 100TB of raw storage to get there (before NGS). …9PB at present.
Your NGS (Illumina, 454, etc.) analysis strategy directly affects your data storage needs. Are you doing Whole Genome or Exome sequencing?
An updated diagram for the "Moore's Law": 7/full/475435a/box/1.html
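The 1 TB versus 100 TB figures above imply a raw-to-final storage ratio, which can be written down as a small worked calculation. This is only a sketch: it assumes decimal terabytes and assumes, purely for illustration, that the same ratio would hold for other project sizes, which the slide does not claim.

```python
TB = 10**12  # decimal terabyte in bytes (an assumption; the slide does not say)

final_bytes = 1 * TB    # finished data set, figure from the slide
raw_bytes = 100 * TB    # raw storage needed to get there, figure from the slide


def raw_to_final_ratio(raw: int, final: int) -> float:
    """How many bytes of raw storage are needed per byte of finished data."""
    return raw / final


def raw_storage_needed(final_tb: float, ratio: float) -> float:
    """Raw storage in TB for a finished data set of final_tb, if the ratio held."""
    return final_tb * ratio


if __name__ == "__main__":
    ratio = raw_to_final_ratio(raw_bytes, final_bytes)
    print(f"raw : final = {ratio:.0f} : 1")
    # Hypothetical project five times the size, same ratio assumed.
    print(f"{raw_storage_needed(5, ratio):.0f} TB raw for 5 TB final")
```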

Technologies: Genomics omicmaps.com

Technologies: Data management
ENA Sources &
RNA-Seq, ChIP-Seq, and epigenomic data that are submitted to GEO and ArrayExpress
Genomic and transcriptomic assemblies are submitted to INSDC (EMBL-Bank, GenBank and DDBJ)
16S ribosomal RNA data associated with metagenomics that are submitted to INSDC

Technologies: Proteomics
The PRIDE database currently contains:
21,731 Experiments
8,897,573 Identified Proteins
51,246,134 Identified Peptides
4,896,394 Unique Peptides
292,341,092 Spectra
Proteome Commons Annotations: 22 Tb
gpmDB statistics for Tue May 22 11:51: UTC (#3030)
models = 197,292
proteins = 63,651,028
distinct proteins = 1,476,612
protein redundancy = 43.1 ×
peptides = 539,375,566
distinct peptides = 4,003,692
peptide redundancy = ×
residues = 7,551,257,
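The gpmDB "redundancy" figure above appears to be the total count divided by the distinct count; the sketch below checks that reading against the protein numbers on the slide, which is an interpretation and not something the slide states. The same ratio is then applied to the PRIDE identified and unique peptide counts as an illustration only.

```python
def redundancy(total: int, distinct: int) -> float:
    """Average number of times each distinct item occurs in the collection."""
    return total / distinct


# gpmDB figures from the slide.
gpmdb_proteins = 63_651_028
gpmdb_distinct_proteins = 1_476_612

# PRIDE figures from the slide; applying the same ratio here is our own illustration.
pride_identified_peptides = 51_246_134
pride_unique_peptides = 4_896_394

if __name__ == "__main__":
    r = redundancy(gpmdb_proteins, gpmdb_distinct_proteins)
    print(f"gpmDB protein redundancy = {r:.1f} x")  # the slide reports 43.1 x
    p = redundancy(pride_identified_peptides, pride_unique_peptides)
    print(f"PRIDE identified / unique peptides = {p:.1f}")
```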

Europe 2020: The Grand Societal Challenges
Europe has an ageing population and an unsustainable food supply, and is facing increasing threats from environmental destruction, bioterrorism and emerging pandemics. Business and commerce are challenged by competitive pressures from globalization. The future well-being and prosperity of our citizens will depend absolutely on innovation to tackle these Grand Challenges as well as to create new products and services in life sciences and ICT. Innovation has been placed at the heart of the Europe 2020 Strategy for Growth and Jobs, and indeed the Innovation Union explicitly highlights the biological and medical challenges as providing the opportunity for economic recovery and growth.

Challenges: Access to data
Download data to local facilities from central repositories:
ftp.ebi.ac.uk (306 TB compressed data/year to >1500 worldwide institutes core facilities)
fasp.sra.ebi.ac.uk; fasp.era.ebi.ac.uk (110 TB compressed data/year to 20 core genomic centres)
Management challenge: Data is generated quicker than it can be analysed. Many never finish downloading a data set before a new one is ready.
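The yearly download volumes above can be turned into an average sustained transfer rate, which gives a feel for the scale involved. This is a rough sketch assuming decimal terabytes, a 365-day year and perfectly even traffic; real transfers are bursty, so peak rates would be higher.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 365-day year, an assumption


def average_mbit_per_s(tb_per_year: float) -> float:
    """Average rate in megabits per second for a yearly volume in decimal TB."""
    bits_per_year = tb_per_year * 10**12 * 8
    return bits_per_year / SECONDS_PER_YEAR / 10**6


if __name__ == "__main__":
    # Volumes taken from the slide.
    for label, volume_tb in (("ftp.ebi.ac.uk", 306), ("fasp (SRA/ERA)", 110)):
        print(f"{label}: {average_mbit_per_s(volume_tb):.1f} Mbit/s on average")
```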

Challenges: Disruptive technologies
“A technology becomes disruptive when the rate at which it improves exceeds the rate at which users can adapt to the new performance.”
The Innovator's Dilemma. Clayton M. Christensen. Harvard Business School Press. 1997

Solutions – a timeline
Web Services – Service Oriented Architecture under TEMBLOR, EMBRACE, FELICS, SLING EU funded NoE projects.
Cloud storage (Amazon S3, EMC Atmos, Google Cloud Storage, iCloud, Windows Azure, etc.)
Enterprise Service Bus (Integration via Interoperability)
Focus: Giving scientists the possibility to bring their analysis software closer to the big data (ELIXIR, BioMedBridges, EGI.eu, Helix Nebula, etc.)

Solutions: Community/Hybrid Cloud architectures Source:

Solutions: EBI - SaaS Web Services (SOAP/REST)

Solutions: EBI - IaaS

Solutions: PaaS COMMON PLATFORMS

Community: ELIXIR: An e-Infrastructure for biological and medical research
e-Infrastructure: GÉANT, DANTE, EGI.eu, PRACE, etc.

Community: Visits during consultation phase.

Sites of ELIXIR survey data providers

Community: UXD

Conclusions…so far.
Interoperability is more important than integration.
Measurements, not only of compute variables but of work pattern metrics, are extremely important.
User engagement and outreach are important. Also at the grass-roots level.
The challenge of BIG data is how to get it closer to the users.
The community is [a big] part of the solution.
ELIXIR is already delivering and setting a pace for research and development investment in bioinformatics (e.g. beyond the feasibility phase: Web Services, Identity Federation for EGA and Distributed Search and Retrieval for several trillion biological data objects).

abell nepp TECHNICAL EBI Ground breaking ceremony 13th June 2012

Thanks
TERENA 2012
EMBL
EU, WT, BBSRC, EPO, Data & Scientific Content Providers and many collaborators
Director General of EMBL (Iain Mattaj) and Director of EMBL-EBI (Janet Thornton) will visit Iceland towards the end of May