BiGCaT Bioinformatics Hunting strategy of the bigcat.

Presentation transcript:

BiGCaT Bioinformatics Hunting strategy of the bigcat

BiGCaT, bridge between two universities. Universiteit Maastricht: Patients, Experiments, Arrays and Loads of Data. TU/e: Ideas & Experience in Data Handling.

Major Research Fields: Cardiovascular Research; Nutritional & Environmental Research.

What are we looking for?

Different conditions show different levels of gene expression for specific genes

Differences in gene expression? Between e.g.: healthy and sick; different stages of disease progression; different stages of healing; failed and successful treatment; more and less vulnerable individuals. Shows: important pathways and receptors, which then can be influenced.

The transfer of information from DNA to protein. From: Alberts et al. Molecular Biology of the Cell, 3rd edn.

Eukaryotic genes in somewhat more detail

Gene expression measurement. Functional genomics/transcriptomics: changes in mRNA (gene expression microarrays, suppression subtraction libraries). Proteomics: changes in protein levels (2D gel electrophoresis, antibody arrays). DNA → mRNA → protein

Gene expression arrays. Microarrays: relative fluorescence signals. Identification. Macroarrays: absolute radioactive signal. Validation.

Layout of a microarray experiment: 1) Get the cells 2) Isolate RNA 3) Make fluorescent cDNA 4) Hybridize 5) Laser read out 6) Analyze image

The cat and its prey: the data. Comprises: Known cDNA sequences (not known genes!) on the array = reporters. Data sets typically contain 20,000 image spot intensity values in 2 colors. One experiment often contains multiple data points for every reporter (e.g. times or treatments). Each data point can (should) consist of multiple arrays. Bioinformatics should translate this into useful biological information.

Hunting. Comprises: analyzing reporters; data pretreatment; finding patterns in expression; evaluating biological significance of those patterns.

Reporter analysis. Reporter sequence must be known (can be sequenced using digest electrophoresis). Look up the sequence in genome databases (e.g. Genbank/Embl or Swissprot). Will often find other RNA experiments (ESTs) or just chromosome location.

Blast reporters against what? Nucleotide databases (EMBL/Genbank). Disadvantages: many hits, best hit on clone, we actually want function (i.e. protein). Nucleotide clusters (Unigene). Disadvantage: still no function. Protein databases (Swissprot+trEMBL). Disadvantages: non-coding sequence not found, frameshifts in clones.

Two implemented solutions: 1) Start with Unigene (from Blastn or platform provider), mine using SRS (direct, through PDB, through PIR) -> Swissprot/trEMBL. 2) Use dedicated EMBL-Swissprot X-linked DB (Blast against EMBL subset, get Swissprot/trEMBL).

Scotland - Holland: 1-0? Check Affymetrix reporter sequences. - Each reporter mer probes. - Blast against ENSEMBL genes (takes 1 month on UK grid). - Use for cross-species analysis - Adapt RMA statistical analysis in Bioconductor

Next slide shows data of one single actual microarray. Normalized expression shown for both channels. Each reporter is shown with a single dot. Red dots are controls. Note the GEM barcode (QC). Note the slight error in linear normalization (low expressed genes are higher in Cy5 channel).
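The normalization error noted on this slide can be checked numerically. The sketch below is an illustration, not the procedure used for the slide: it assumes that "linear normalization" means scaling the Cy5 channel by one global factor so both channels have the same total signal, and it then compares the mean log2(Cy5/Cy3) ratio of the low-intensity half of the spots with that of the high-intensity half. The function names and the half/half split are hypothetical choices.

```python
import math

def global_scale_normalize(cy3, cy5):
    """Linear normalization (assumed): scale Cy5 so both channels have equal total signal."""
    factor = sum(cy3) / sum(cy5)
    return list(cy3), [value * factor for value in cy5]

def mean_log_ratio_by_intensity(cy3, cy5):
    """Return mean log2(Cy5/Cy3) for the low- and high-intensity halves of the spots.

    Assumes at least two spots and strictly positive signals.
    """
    # Pair each spot's average log intensity with its log ratio, sorted by intensity.
    points = sorted(
        (0.5 * (math.log2(red) + math.log2(green)), math.log2(red / green))
        for green, red in zip(cy3, cy5)
    )
    half = len(points) // 2
    low = [ratio for _, ratio in points[:half]]
    high = [ratio for _, ratio in points[half:]]
    return sum(low) / len(low), sum(high) / len(high)
```

If the low-intensity mean stays clearly above zero after scaling while the high-intensity mean is close to zero, a single linear factor has not removed the bias, which matches the pattern described on the slide.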

Next slide shows same data after processing: controls removed; bad spots (<40% average area) removed; low signals (<2.5 Signal/Background) removed; all reporters with <1.7 fold change removed (only changing spots shown).
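The four filtering steps can be sketched in a few lines. The thresholds (40% of average area, 2.5 signal/background, 1.7 fold change) come from the slide; everything else is an assumption made for illustration: the `Spot` record and its field names are hypothetical, the average area is taken over all spots on the array, the signal/background test uses the stronger of the two channels, and fold change is treated symmetrically so that up- and down-regulation are both kept. Signals and background are assumed to be strictly positive.

```python
from dataclasses import dataclass

@dataclass
class Spot:
    reporter: str
    is_control: bool
    area: float        # spot area, e.g. in pixels
    signal_cy3: float
    signal_cy5: float
    background: float

def fold_change(spot):
    """Symmetric fold change: always >= 1, whichever channel is higher."""
    ratio = spot.signal_cy5 / spot.signal_cy3
    return ratio if ratio >= 1 else 1 / ratio

def filter_spots(spots, min_area_frac=0.40, min_sb=2.5, min_fold=1.7):
    """Apply the slide's four filters in order and return the surviving spots."""
    if not spots:
        return []
    mean_area = sum(s.area for s in spots) / len(spots)
    kept = []
    for s in spots:
        if s.is_control:
            continue  # controls removed
        if s.area < min_area_frac * mean_area:
            continue  # bad spot: too small compared with the average spot
        if max(s.signal_cy3, s.signal_cy5) / s.background < min_sb:
            continue  # low signal relative to background
        if fold_change(s) < min_fold:
            continue  # not changing enough
        kept.append(s)
    return kept
```

Calling `filter_spots(spots)` on a list of `Spot` records returns only the changing, good-quality, non-control spots; the keyword arguments make it easy to see how sensitive the result is to each threshold.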

Final slide shows information for one single reporter. This signifies one single spot. It is a known gene: a UDP glucuronyltransferase. Raw data and fold change are shown.

Secondary Analyses: gene clustering (find genes that behave equally); cluster evaluation (what do we see in clusters …); physiological evaluation (for arrays, proteomics, clusters); understand the regulation.

Clustering: find genes with same pattern. [Figure: left panel, expression level against time for 2 genes; right panel, T1 signal against T2 signal.] Left hand picture shows expression patterns for 2 genes (these should probably end up in the same cluster). Right hand picture shows the expression vector for one gene for the first 2 dimensions. Can be normalized by amplitude (circle) or relatively (square).
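The two normalizations can be illustrated with a small sketch. The reading of the slide used here is an assumption: "by amplitude (circle)" is taken to mean scaling each expression vector to unit Euclidean length, and "relatively (square)" is taken to mean dividing by the largest absolute component. The function names are hypothetical, and vectors are assumed to be non-zero.

```python
import math

def normalize_amplitude(vector):
    """Scale to unit Euclidean length (in 2 dimensions the points land on a circle)."""
    length = math.sqrt(sum(v * v for v in vector))
    return [v / length for v in vector]

def normalize_relative(vector):
    """Scale so the largest absolute component is 1 (in 2 dimensions, a square)."""
    peak = max(abs(v) for v in vector)
    return [v / peak for v in vector]

def euclidean_distance(a, b):
    """Plain Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two genes with the same pattern (T2 is twice T1) but very different signal levels.
gene_a = [100.0, 200.0]
gene_b = [1000.0, 2000.0]
raw = euclidean_distance(gene_a, gene_b)
scaled = euclidean_distance(normalize_amplitude(gene_a), normalize_amplitude(gene_b))
```

On the raw signals the two genes are far apart, while after either normalization they coincide, so a distance-based clustering would put them in the same cluster, as the left hand picture suggests it should.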

Cluster evaluation. Group genes (function, pathway, regulation etc.). Find groups in patterns using visualization tools and automatic detection. Should lead to results like: “This experiment shows that a large number of apoptosis genes are up-regulated during the early stage after treatment, probably meaning that cells are dying.”
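One common way to make "a large number of apoptosis genes" quantitative is a hypergeometric over-representation test. The slide does not name a statistic, so this is a swapped-in technique shown only as a sketch; the function name and the numbers in the usage line are hypothetical.

```python
from math import comb

def enrichment_p_value(total_genes, group_size, cluster_size, overlap):
    """Probability of seeing at least `overlap` group members in a cluster
    of `cluster_size` genes drawn at random from `total_genes` genes,
    of which `group_size` belong to the functional group."""
    upper = min(group_size, cluster_size)
    favourable = sum(
        comb(group_size, k) * comb(total_genes - group_size, cluster_size - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(total_genes, cluster_size)

# Hypothetical numbers: 20,000 reporters, 200 annotated as apoptosis,
# a cluster of 300 up-regulated reporters containing 15 apoptosis reporters.
p = enrichment_p_value(20000, 200, 300, 15)
```

A small `p` means the group is found in the cluster more often than chance would explain. When many functional groups are tested against many clusters, a multiple-testing correction is needed before drawing conclusions.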

Example of GenMAPP results: Manual lookup on a MAPP

Understanding regulation. The main idea: co-regulated genes could have common regulatory pathways. The basic approach: annotate transcription factor binding sites using Transfac and use for supervised clustering. The problem: each gene has hundreds of transcription factor binding sites. Solution? Use syntenic regions using rVista (work in progress with Rick Dixon).

Understanding QTLs. Get blood pressure QTLs: from ENSEMBL/CFG Wellcome group. Look up functional pathways and GO annotations using GenMAPP: virtual experiment, assume all genes in QTL are changing. Create a new blood pressure MAPP: confront this with real blood pressure/heart failure microarray data. Work in progress (TU/e MDP3 group).

People involved
BiGCaT Maastricht: Rachel van Haaften (IOP), Edwin ter Voert (BMT), Joris Korbeeck (BMT/UM), Willem Ligtenberg (IOP), Stan Gaj (tUL), Chris Evelo.
TU/e: Peter Hilbers, Huub ten Eijkelder, Patrick van Brakel, lots of students.
CARIM: Yigal Pinto, Umesh Sharma, Blanche Schroen, Matthijs Blankesteijn, Jos Smits, Jo de Mey, Danielle Curfs, Kitty Cleutjens, Natasja Kisters, Esther Lutgens, Birgit Faber, Petra Eurlings, Ann-Pascalle Bijnens, Mat Daemen, Frank Stassen, Marc van Bilssen, Marten Hoffker.
NUTRIM: Wim Saris, Freddy Troost, Johan Renes, Simone van Breda.
GROW: Daisy vd Schaft, Chamindie Puyandeera.
IOP Nutrigenomics: Milka Sokolovic, Theo Hackvoort, Meike Bunger, Guido Hooiveld, Michael Müller, Lisa Gilhuis-Pedersen, Antoine van Kampen, Edwin Mariman, Wout Lamers, Nicole Franssen, Jaap Keijer.
CFG Wellcome group: Neil Hanlon (Glasgow), Gontran Zepeda (Edinburgh), Rick Dixon (Leicester), Sheetal Patel (London).
Paris leptin group: Soraya Taleb, Rafaelle Cancello, Nathalie Courtin, Carine Clement.
Organon: Jan Klomp, Rene van Schaik.
BioAsp: Marc Laarhoven.