How to get from a pile of unprocessed data to knowledge: The user’s perspective Guido Jenster, Ph.D. Professor of Experimental Urological Oncology Department.

Presentation transcript:

How to get from a pile of unprocessed data to knowledge: The user’s perspective
Guido Jenster, Ph.D.
Professor of Experimental Urological Oncology
Department of Urology, Erasmus MC
Data analysis and integration

Structure of Cancer Research Projects
-Functional Research
-Prevention Research
-Marker Research
-Therapy Research
-Technology & Protocols
-Datasets
-Bioinformatics & Statistics
-Models & Biobanks
-Organization & Management; Education; Outreach

Prostate Cancer Molecular Medicine
Clinical Research, Biobanking, Experimental Research, Imaging
DATA GENERATION, DATA STORAGE, DATA PROCESSING, DATA INTEGRATION, DATA QUERY & VIEWING
NEW KNOWLEDGE

Prostate Cancer Molecular Medicine
What do we want?
Use case: Identify novel fusion genes from DNA and RNA sequencing data
PUSH TO START
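One way to picture what the "button" in this use case would have to do is to cross-check fusion candidates called from RNA sequencing against structural variants called from DNA sequencing. The sketch below is illustrative only: the class names, the gene names (GENE_A to GENE_D), the coordinates, and the 50 kb matching window are all assumptions made for the example and do not come from the presentation or from any specific fusion caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Breakpoint:
    chrom: str
    pos: int


@dataclass(frozen=True)
class FusionCandidate:
    """Hypothetical fusion call from RNA-seq (5' gene joined to 3' gene)."""
    gene5: str
    gene3: str
    bp5: Breakpoint
    bp3: Breakpoint


@dataclass(frozen=True)
class StructuralVariant:
    """Hypothetical structural variant from DNA-seq with two breakpoints."""
    bp_a: Breakpoint
    bp_b: Breakpoint


def near(a: Breakpoint, b: Breakpoint, window: int) -> bool:
    # Same chromosome and within `window` bases of each other.
    return a.chrom == b.chrom and abs(a.pos - b.pos) <= window


def supported_by_dna(fusion, variants, window=50_000):
    # Genomic breakpoints usually fall in introns, so they rarely coincide
    # exactly with the transcript junction; the window size is an assumption.
    for sv in variants:
        direct = near(fusion.bp5, sv.bp_a, window) and near(fusion.bp3, sv.bp_b, window)
        swapped = near(fusion.bp5, sv.bp_b, window) and near(fusion.bp3, sv.bp_a, window)
        if direct or swapped:
            return True
    return False


# Made-up example data, for illustration only.
rna_candidates = [
    FusionCandidate("GENE_A", "GENE_B",
                    Breakpoint("chr21", 1_000_000), Breakpoint("chr21", 4_000_000)),
    FusionCandidate("GENE_C", "GENE_D",
                    Breakpoint("chr1", 500_000), Breakpoint("chr7", 900_000)),
]
dna_variants = [
    StructuralVariant(Breakpoint("chr21", 4_020_000), Breakpoint("chr21", 1_015_000)),
]

confirmed = [(f.gene5, f.gene3) for f in rna_candidates
             if supported_by_dna(f, dna_variants)]
print(confirmed)
```

In this toy data only the GENE_A/GENE_B candidate has a matching DNA-level rearrangement; the other candidate has RNA evidence alone and would need separate follow-up.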

Where is the Red Button? Why is it so difficult to make?
Data analysis and integration
-Different types of data
-Different platforms and their limitations
-Different data analysis tools
-Limitations in storage and compute power
-Analysis and integration depend on the research question and the needs of the scientist

Markers and therapy targets: An inventory of the differences between normal and cancer cells
Markers and therapy targets for prostate cancer
-Metabolite
-DNA
-RNA
-Protein
-Morphology
-Cellular behavior

DNAseq Data Analysis
DNAseq data:
-B-Allele Frequency
-Active Chromatin
-TF Binding
-Methylation
-Structural Variations
-Chromatin Interactions
-Copy Number Aberrations
-SNVs / InDels
-Read Barcode
-Identify Integration Sites
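As a small illustration of one of the analyses listed above, the B-allele frequency at a variant site is the fraction of reads that carry the B (alternate) allele. The sketch below is minimal and assumes per-site read counts are already available; the function name and the example counts are hypothetical and not taken from the presentation.

```python
from typing import Optional


def b_allele_frequency(a_reads: int, b_reads: int) -> Optional[float]:
    """Fraction of reads supporting the B allele at a single site."""
    total = a_reads + b_reads
    if total == 0:
        return None  # no coverage, so the frequency is undefined
    return b_reads / total


# Made-up (A reads, B reads) counts at three SNP positions.
sites = {"snp1": (25, 25), "snp2": (40, 10), "snp3": (0, 0)}
baf = {name: b_allele_frequency(a, b) for name, (a, b) in sites.items()}
print(baf)
```

At a heterozygous site in a normal diploid sample the value is expected to lie near 0.5; a consistent shift away from 0.5 across a region can point to allelic imbalance, for example from a copy number change.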

RNAseq Data Analysis
RNAseq data:
-Alternative splicing & Promoters
-Differential expression
-SNVs / InDels
-Read-Through & Fusion Transcripts
-Novel Transcripts

DNA and RNA analysis platforms
DNA level:
-Home made array CGH
-1M SNP arrays (Illumina)
-Ion Proton low pass DNAseq
-Ion Proton exome DNAseq
-Complete Genomics whole genome DNAseq
-FAIREseq, ChIPseq, MeDIPseq, Methylation arrays (Illumina)
RNA level:
-Home made cDNA and oligo arrays
-Affymetrix Exon arrays
-Illumina RNAseq (small RNA and mRNA)
-Ion Proton RNAseq

Where is the Red Button? Why is it so difficult to make?
Data analysis and integration
-Different types of data
-Different platforms and their limitations
-Different data analysis tools
-Limitations in storage and compute power
-Analysis and integration depend on the research question and the needs of the scientist

Prostate Cancer Molecular Medicine
Where is the Red Button? How to solve the issues?
Clinical Research, Biobanking, Experimental Research, Imaging
DATA GENERATION, DATA STORAGE, DATA PROCESSING, DATA INTEGRATION, DATA QUERY & VIEWING
NEW KNOWLEDGE

TraIT subdivision into work packages
-Four data generating work packages
-Data integration & analysis across the four platforms
-Shared hardware and professional training & support

The TraIT mansion requires good support
Open Clinica, BMIA, TOP desk, Alfresco, tEPIS, Catalogue, Workflow, Phenotype Database, Chipster, Galaxy, coLIMS, XNAT, Keosys, Logis, tranSMART, Website, Wiki, Jira, Data storage + CPU power, TTP, SurfConext

Where is the Red Button? How to solve the issues?
Adopt, Adapt, Create
Data analysis and integration
DATA STORAGE & COMPUTE
-Own (external hard drives)
-Central CSC, CCBC, GEO, ENA
-Commercial Clouds
DATA PROCESSING
-Own pipelines and tools
-Commercial programs CLCBio, etc.
-Central / Open Source tool platforms
DATA INTEGRATION
-Own (Access)
-Commercial (NextBio)
-Central Oracle TRC, tranSMART
DATA MINING VIEWING

Data Mining: Query & Viewing Tools
cBioPortal
-Between-Study Level
-Study Level
-Patient/Sample Level
-Molecular Level
Platform: Where do I get my data from?
Level: Which level do I want to mine?
Tool: What is the best query & viewing tool?

Prostate Cancer Molecular Medicine
What do we want?
Use case: Identify novel fusion genes from DNA and RNA sequencing data
PUSH TO START

Andrew Stubbs
Harmen van de Werken
Please attend the monthly Bridge Meetings: Lectures)