Microarray Technology and Data Analysis Roy Williams PhD Sanford | Burnham Medical Research Institute.

1 Microarray Technology and Data Analysis Roy Williams PhD Sanford | Burnham Medical Research Institute

2 Microarray Revolution

3 Measuring Gene Expression. Idea: measure the amount of mRNA to see which genes are being expressed in (used by) the cell. Measuring protein would be more direct, but is currently harder.

4 General assumption of microarray technology: mRNA transcript abundance is used as a measure of expression for the corresponding gene, and is taken to be proportional to the degree of gene expression.

5 How to measure RNA abundance: several different approaches with similar themes. Illumina bead array (highly redundant oligo array); Affymetrix GeneChip (highly redundant oligo array); Nimblegen (highly redundant long oligo array); 2-colour array (very long cDNA; low redundancy); SAGE (random Sanger sequencing of cDNA library), reborn as next-generation RNA-seq.

6 The Illumina BeadArray Technology: highly redundant (~50 copies of a bead); 60mer oligos; absolute expression; each array is deconvoluted using a colour-coding tag system; Human, Mouse, Rat, Custom.

7 Affymetrix Technology: highly redundant (~25 short oligos per gene); absolute expression; PM-MM oligo system valuable for cross-hybridisation detection; Human, Mouse, E. coli, Yeast… Affymetrix and Illumina arrays have been systematically compared.

8 Spotted Arrays: low redundancy; cDNA and oligo; two dyes (Cy5/Cy3); relative expression; cost and custom.

9 Single Colour Labelling

10 Microarrays in action off on

11 Areas Being Studied with Microarrays: differential gene expression between two (or more) sample types; similar gene expression across treatments; tumour sub-class identification using gene expression profiles; classification of malignancies into known classes; identification of “marker” genes that characterize different cell types; identification of genes associated with clinical outcomes (e.g. survival).

12 Experimental Design. Design experiment: replicates (3 chips to detect 2x changes; 5 chips for changes <2x). Perform experiment: standardize conditions; dump outliers.

13 Microarray Data Analysis Workflow: quality control; normalize data; set up experimental data; filter for differential expression; advanced analysis techniques (clustering); compare results to biology (Nextbio, GeneGo, IPA).

14 Recommended Software. Free software: GenePattern (powerful, many plug-in packages and pipelines; good video examples/tutorials). GeneSpring GX11; R-Bioconductor (with guidance); Hierarchical Cluster Explorer (easy clustering); Cytoscape, GSEA (for pathway visualisation); Partek; IPA, Nextbio, GeneGo (Burnham subscriptions!).

15 Log Transformed Data
2/2 = 1, log2(1) = 0
4/1 = 4, log2(4) = +2
1/4 = 0.25, log2(0.25) = -2
Transformation is often performed before normalisation.
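The arithmetic on this slide can be reproduced with a minimal Python sketch. The function name `log2_ratio` and the example pairs are illustrative assumptions, not part of the original deck; the point is only that a two-fold increase and a two-fold decrease become symmetric around zero after the log2 transform.

```python
import math

def log2_ratio(treated, control):
    """Return log2(treated / control) for two positive signal intensities.

    Hypothetical helper for illustration; not taken from the slides.
    """
    if treated <= 0 or control <= 0:
        raise ValueError("intensities must be positive before taking logs")
    return math.log2(treated / control)

# The three ratios shown on the slide: 2/2, 4/1 and 1/4.
for treated, control in [(2, 2), (4, 1), (1, 4)]:
    ratio = treated / control
    print(f"{treated}/{control} = {ratio:g}, log2 = {log2_ratio(treated, control):+g}")
```

Up- and down-regulation of the same magnitude (4 versus 0.25) map to +2 and -2, which is why ratios are usually compared on the log scale.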

16 After QC for low confidence genes (P<0.99). Note: ~50 replicate beads per array. [Figure: boxplot representation of data spread; axes: chip number, signal intensity; annotations: median, 25% and 75% quartiles, outliers, one bad chip]
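The quantities a boxplot displays can be computed directly. The sketch below is a rough illustration, not the QC procedure used in the talk: the function names, the 1.5 x IQR outlier rule, and the `max_shift` threshold for calling a chip "bad" are all assumptions made for the example.

```python
import statistics

def boxplot_summary(signals):
    """Median, quartiles and 1.5*IQR outliers for one chip's (log) signals."""
    q1, median, q3 = statistics.quantiles(signals, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [x for x in signals if x < low or x > high]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}

def flag_bad_chips(chips, max_shift=1.0):
    """Flag chips whose median is far from the median of all chip medians.

    max_shift is a hypothetical threshold in log2 units, chosen for illustration.
    """
    medians = {name: statistics.median(values) for name, values in chips.items()}
    centre = statistics.median(medians.values())
    return sorted(name for name, m in medians.items() if abs(m - centre) > max_shift)

# Made-up log2 intensities for three chips; chip3 is shifted downwards.
chips = {
    "chip1": [7.0, 8.0, 9.0],
    "chip2": [7.1, 8.1, 9.1],
    "chip3": [3.0, 4.0, 5.0],
}
print(flag_bad_chips(chips))
```

A chip flagged this way would still need visual inspection before being dropped; the rule only mimics what the eye picks out in the boxplot.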

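17 The effect of quantile normalisation on the filtered 36 data sets. IMPORTANT: use non-linear normalisation.
> library(preprocessCore)  # provides normalize.quantiles() in current Bioconductor
> Qdata <- normalize.quantiles(Rawdata)  # Rawdata: numeric matrix, one column per array
After normalisation all arrays have the same range.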
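To show what quantile normalisation does conceptually, here is a small pure-Python sketch. It is not the implementation behind `normalize.quantiles`; the function name `quantile_normalize` is an assumption, and ties are broken by position here, which production implementations may handle differently.

```python
def quantile_normalize(columns):
    """Quantile-normalise a list of equal-length columns (one per array).

    Each value is replaced by the mean, across arrays, of the values
    that share its rank. Illustrative sketch only.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all arrays must have the same number of probes")

    # Mean of the k-th smallest value across all arrays.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(col[k] for col in sorted_cols) / len(columns) for k in range(n)]

    normalized = []
    for col in columns:
        # Probe indices ordered from lowest to highest signal on this array.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalized.append(out)
    return normalized

# Two made-up arrays with three probes each.
raw = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
for array in quantile_normalize(raw):
    print(array)
```

After the transform every array contains exactly the same set of values, only in a different probe order, which is why the boxplots line up.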

18 Data Analysis Examples: #1 Illumina arrays with GeneSpring GX11; #2 Affymetrix data with a GenePattern module. Steps: import, quality control, normalize; detect differentially expressed genes; pathway analysis.

19 Illumina Analysis Workflow. GenomeStudio application: process binary .idat files to txt; check array hybridisation quality; normalisation here is optional; direct export of the file as “sample probe profile”; import into GeneSpring GX11.

20 GeneSpring GX11 features: guided workflows; pathways; GSEA; IPA integration; ontologies; MySQL; R script API.

21 GeneSpring GX11: create new project; browse to and load data; automated install of GenomeDef from Agilent repository.

22 Illumina Advanced Workflow

23 Grouping Sample Replicates

24 Check Replicates Are Similar

25 Scatterplot of replicates

26 Scatterplot of differently treated samples

27 Filter genes on P-value

28 Significantly different genes in a Volcano plot
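A volcano plot places each gene at (log2 fold change, -log10 p-value), so that genes that are both strongly changed and statistically significant end up in the upper corners. The sketch below only computes those coordinates and applies cut-offs; the function names, the gene names, the p-values, and the thresholds (two-fold change, p <= 0.05) are all hypothetical and not taken from the talk.

```python
import math

def volcano_point(mean_treated, mean_control, p_value):
    """Return (log2 fold change, -log10 p) for one gene."""
    log2_fc = math.log2(mean_treated / mean_control)
    neg_log10_p = -math.log10(p_value)
    return log2_fc, neg_log10_p

def is_significant(log2_fc, neg_log10_p, min_abs_log2_fc=1.0, max_p=0.05):
    """Apply illustrative fold-change and p-value cut-offs."""
    return abs(log2_fc) >= min_abs_log2_fc and neg_log10_p >= -math.log10(max_p)

# Made-up example genes: (mean treated signal, mean control signal, p-value).
genes = {
    "geneA": (400.0, 100.0, 0.001),   # strongly up, small p-value
    "geneB": (110.0, 100.0, 0.0005),  # small p-value but tiny change
    "geneC": (100.0, 400.0, 0.2),     # large change but not significant
}
for name, (treated, control, p) in genes.items():
    x, y = volcano_point(treated, control, p)
    print(name, round(x, 2), round(y, 2), is_significant(x, y))
```

In practice the p-values would come from the statistical test run in the analysis package, ideally with multiple-testing correction applied first.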

29 Significant Pathway Determination

30 Which types of genes are enriched in a cluster? Idea: compare your cluster of genes with lists of genes with common properties (function, expression, location). Find how many genes overlap between your cluster and a gene list, then calculate the probability of obtaining the overlap by chance; this measures whether the enrichment is significant. This analysis provides an unbiased way of detecting connections between expression and function. [Figure: overlap diagram of the GeneOntology "Cell cycle" list and our cluster; labels as extracted: 25 0 7, 15000]
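One common way to calculate the probability of an overlap arising by chance is the hypergeometric test. The slide does not say which test the software uses, so the sketch below is an assumption; the function name and the example numbers are illustrative and are not a reading of the slide's figure.

```python
from math import comb

def enrichment_p_value(total_genes, category_size, cluster_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    X is the number of category genes found when cluster_size genes are
    drawn at random, without replacement, from total_genes genes of which
    category_size belong to the category.
    """
    max_overlap = min(category_size, cluster_size)
    favourable = sum(
        comb(category_size, k) * comb(total_genes - category_size, cluster_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return favourable / comb(total_genes, cluster_size)

# Hypothetical numbers: 15000 genes on the array, 250 annotated to a
# category, a cluster of 100 genes, 7 of which fall in the category.
p = enrichment_p_value(total_genes=15000, category_size=250, cluster_size=100, overlap=7)
print(f"P(overlap >= 7 by chance) = {p:.3g}")
```

When many gene lists are tested against the same cluster, the resulting p-values should be corrected for multiple testing before any list is called enriched.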

31 Send list to IPA for pathway Analysis

32 Significant Pathways sent to Ingenuity Pathway Analysis

33 Completed Analysis: gene lists, data, pathways

34 Affymetrix Workflow: GenePattern

35 Comparative Marker Selection

36 Paste the URLs for Data files

37 Send results to next module Viewer module

38 Outputs a ranked list of genes. The list of marker genes can be filtered and exported.

39 Nextbio: compares your gene lists to the Nextbio database; can reveal unexpected similarities between datasets; has a very good literature database connected to the results; contains data from model organisms.

40 Ingenuity Pathway Analysis: detects networks in your data; allows you to look for connections between genes and drugs/small molecules; focused on human and mouse. GeneGo: high-quality hand-annotated ontologies; has a very good literature database connected to the results; contains data from model organisms.

41 Start a new core analysis

42 Ingenuity Data import

43 IPA determines functions

44 Overlay drug and disease data

45 Data Import to Nextbio

46 The Nextbio Report Page

47 What else does my gene do?

48 THE END Many thanks for coming!

