Microarray technology and analysis of gene expression data Hillevi Lindroos.

1 Microarray technology and analysis of gene expression data Hillevi Lindroos

2 Introduction to microarray technology Technique for studying the expression of thousands of genes simultaneously. Used to study gene regulation, effects of treatments, differences between healthy and diseased cells... Comparative Genomic Hybridization: - gene content in related strains/species - gene dosage in cancer cells Microarray: glass slide with spots, each containing DNA from one gene

3 Two-colour spotted microarrays Spot = PCR-product (~500 bp) from one gene or long oligonucleotide (~50 bp) Differential expression (two samples compared)

4 Experimental procedure: 1. Isolate RNA from 2 samples (experiment and control). 2. Reverse transcribe to cDNA with fluorescently labelled nucleotides, e.g. Cy3-dCTP (control) or Cy5-dCTP (experiment). 3. Mix and hybridize to microarray. 4. Laser scan: measure fluorescent intensities

5 In principle... Red spot: up-regulated gene, ratio >1 Green spot: down-regulated gene, ratio <1 Yellow spot: no differential expression, ratio =1 Red and green images superimposed:

6 Sample (e.g. heat shock), gene A => RT + red dye => mixing equal amounts of cDNA => competitive hybridization on microarray => red dot in image => up-regulation

7 Why differential expression? Fluorescent intensities do not directly correspond to mRNA concentrations, due to: different shapes and densities of spots; different hybridization properties between genes; different amounts of dye incorporation between genes. => Compare intensities (expression) from two samples.

8 Data processing and analysis 1. Image analysis Locate spots in image Quantify fluorescence intensity (spot + background) Mean / median of pixel intensities

9 2. Background correction – local background for each spot, or global for whole array – assuming additive background: Spot intensity = True intensity + Background
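The additive background model above can be sketched in a few lines of Python. This is only an illustration: the pixel values are made up, the function name `background_corrected` is hypothetical, and clamping to a small positive `floor` is an added assumption (not from the slides) so that later log-ratios stay defined.

```python
from statistics import median

def background_corrected(spot_pixels, background_pixels, floor=1.0):
    """Estimate true intensity as median spot signal minus median local background."""
    # Additive model from the slide: spot intensity = true intensity + background.
    signal = median(spot_pixels) - median(background_pixels)
    # Assumption: clamp to a small positive floor so a log-ratio can still be taken.
    return max(signal, floor)

# Hypothetical pixel values; the median is robust to the one bright outlier pixel.
spot = [520, 540, 510, 530, 2000]
local_bg = [100, 110, 90, 105, 95]
corrected = background_corrected(spot, local_bg)

# A weak spot whose background exceeds its signal is clamped to the floor.
weak = background_corrected([90, 95, 100], [100, 110, 120])
print(corrected, weak)
```

Using the median rather than the mean is one of the two options named on the image-analysis slide; it is chosen here because a single saturated pixel does not shift it.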

10 Output Cy5 (R) and Cy3 (G) intensities Ratio = R/G ~ [mRNA_experiment] / [mRNA_control] Up-regulated genes: ratio > 1 Down-regulated genes: ratio between 0 and 1 => Asymmetry!

11 => Use logarithm! M = log2(ratio) is symmetrically distributed around 0 Upregulated 2 times: ratio = 2, M = 1 Downregulated 2 times: ratio = 0.5, M = -1
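The symmetry of the log ratio is easy to check numerically. A minimal sketch, with hypothetical intensity values and a hypothetical helper name `log_ratio`:

```python
import math

def log_ratio(red, green):
    """M value: base-2 logarithm of the Cy5/Cy3 (R/G) intensity ratio."""
    return math.log2(red / green)

# Hypothetical background-corrected intensities (R, G).
m_up = log_ratio(2000, 1000)     # two-fold up-regulation
m_down = log_ratio(1000, 2000)   # two-fold down-regulation
m_same = log_ratio(1500, 1500)   # no differential expression
print(m_up, m_down, m_same)
```

Two-fold changes in either direction land at the same distance from zero, which is what makes thresholds such as |M| > 1 meaningful.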

12 3. Normalization: correction of systematic errors (dye bias) different amounts of control and experiment samples different fluorescent intensities of Cy3 and Cy5 different labelling and detection efficiencies

13 Dye bias: Most genes seem to be upregulated (higher Cy5 than Cy3 intensity). Plot of Cy5 intensity (R) vs Cy3 intensity (G):

14 Corrected for by scaling Cy5 values with total_Cy3/total_Cy5. Assumes most genes unaffected by treatment.
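As a rough sketch of this global scaling, assuming made-up intensities and a hypothetical function name `scale_cy5`:

```python
def scale_cy5(cy5, cy3):
    """Scale Cy5 values by total_Cy3 / total_Cy5 so both channels have equal totals."""
    factor = sum(cy3) / sum(cy5)
    return [r * factor for r in cy5], factor

# Hypothetical array in which the Cy5 channel is uniformly twice as bright.
cy5 = [200, 400, 600, 800]
cy3 = [100, 200, 300, 400]
scaled_cy5, factor = scale_cy5(cy5, cy3)
print(factor, scaled_cy5)
```

The scaling is only valid under the assumption stated on the slide: if a large fraction of genes really changes in one direction, equalizing the totals removes part of the true signal.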

15 Dye bias may depend on total spot intensity A (A = ½(log2 R + log2 G)), position on array, print-tip… Intensity dependent dye bias

16 Correction: M_normalized = M - M_trend(A)
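The slides do not say how the trend M_trend(A) is estimated; a lowess fit is a common choice. The sketch below swaps in something simpler, a running median of M over spots with neighbouring A values, purely to illustrate the subtraction. The function names, the `window` parameter and the data are all hypothetical.

```python
import math
from statistics import median

def ma_values(red, green):
    """Return M = log2(R/G) and A = 0.5 * (log2 R + log2 G) for each spot."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * (math.log2(r) + math.log2(g)) for r, g in zip(red, green)]
    return m, a

def normalize_by_trend(m, a, window=5):
    """Subtract a running-median estimate of M_trend(A) from each M value."""
    order = sorted(range(len(a)), key=lambda i: a[i])  # spot indices sorted by A
    half = window // 2
    normalized = [0.0] * len(m)
    for rank, i in enumerate(order):
        lo = max(0, rank - half)
        hi = min(len(order), rank + half + 1)
        # Trend at this A: median M of the spots closest in intensity rank.
        trend = median(m[j] for j in order[lo:hi])
        normalized[i] = m[i] - trend
    return normalized

# Hypothetical array with a pure dye bias: every Cy5 value is twice its Cy3 value.
green = [2 ** k for k in range(4, 14)]
red = [2 * g for g in green]
m, a = ma_values(red, green)
m_norm = normalize_by_trend(m, a)
print(m_norm)
```

Because the trend is estimated from the data themselves, this correction again relies on most genes being unaffected by the treatment at every intensity level.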

17 Identify differentially expressed genes Simple: cutoff (e.g. |M| > 1) Better: statistical test, e.g. t-test (replicate spots or repeated experiments) => Significance – Unstable mRNAs may have high ratios – and high variation! – Weak spots: small difference in signal may be big relative difference (high ratio).
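To illustrate why a test is preferable to a plain cutoff, the sketch below computes a one-sample t statistic on replicate M values, testing against M = 0. The choice of a one-sample test, the function name and the replicate values are assumptions for illustration; 3.182 is the two-sided 5% critical value of the t distribution with 3 degrees of freedom.

```python
import math
from statistics import mean, stdev

def one_sample_t(m_replicates):
    """t statistic for the null hypothesis that the mean log ratio M is 0."""
    n = len(m_replicates)
    standard_error = stdev(m_replicates) / math.sqrt(n)
    return mean(m_replicates) / standard_error

T_CRITICAL_DF3 = 3.182  # two-sided 5% level, 4 replicates => 3 degrees of freedom

# Hypothetical replicate M values: both genes have mean M = 1.05 (above a |M| > 1 cutoff).
stable_gene = [1.1, 0.9, 1.0, 1.2]
noisy_gene = [3.0, -1.0, 2.5, -0.3]

t_stable = one_sample_t(stable_gene)
t_noisy = one_sample_t(noisy_gene)
print(abs(t_stable) > T_CRITICAL_DF3, abs(t_noisy) > T_CRITICAL_DF3)
```

Both genes pass the simple cutoff on their mean, but only the gene with consistent replicates exceeds the critical value; the high-variation gene does not.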

18 Affymetrix GeneChips Spots = 25 bp oligonucleotides Pairs of perfectly matching probe + probe with 1 mismatch for each gene One sample per array Fluorescent labelling (single dye) Expression level computed from difference in intensity between matching and mismatching probes
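One simple way to turn the probe pairs of a gene into a single expression value is to average the perfect-match minus mismatch differences. The sketch below assumes that summary; the function name and intensities are hypothetical, and actual Affymetrix software uses more elaborate, robust estimators.

```python
from statistics import mean

def average_difference(perfect_match, mismatch):
    """Mean of (PM - MM) over all probe pairs belonging to one gene."""
    return mean(pm - mm for pm, mm in zip(perfect_match, mismatch))

# Hypothetical intensities for four probe pairs of one gene.
pm = [1200, 900, 1500, 1100]
mm = [300, 250, 400, 350]
expression = average_difference(pm, mm)
print(expression)
```

The mismatch probe is meant to capture non-specific hybridization, so subtracting it plays a role similar to background correction on spotted arrays.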

19 Expression profiles Plot expression over a series of experiments (e.g. a time series)

20 Clustering expression profiles Analyze multiple experiments to identify common patterns of gene expression Similar function – similar expression (co-regulation) Goals: Identify regulatory motifs Infer function of unknown genes Distinguish cell types, e.g. tumors (cluster arrays)

21 Hierarchical clustering Expression profile -> vector Compute similarity between expression profiles (e.g. correlation coefficient) Successively join the most similar genes to clusters, and clusters to superclusters
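The procedure on this slide can be sketched in plain Python. The version below uses 1 minus the Pearson correlation as distance and average linkage for joining clusters; the gene names and profiles are invented, and the quadratic search for the closest pair is written for readability, not speed.

```python
import math
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    names = list(profiles)
    # Distance between two genes: 1 - correlation (0 = identical shape, 2 = opposite).
    dist = {(a, b): 1 - pearson(profiles[a], profiles[b])
            for a in names for b in names if a != b}

    def cluster_distance(c1, c2):
        # Average linkage: mean distance over all gene pairs across the two clusters.
        return mean(dist[(a, b)] for a in c1 for b in c2)

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average distance.
        d, i, j = min((cluster_distance(c1, c2), i, j)
                      for i, c1 in enumerate(clusters)
                      for j, c2 in enumerate(clusters) if i < j)
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical time-series profiles: A/B rise then fall, C/D do the opposite.
profiles = {
    "geneA": [0.1, 1.0, 2.0, 1.0],
    "geneB": [0.2, 1.1, 2.1, 0.9],
    "geneC": [2.0, 1.0, 0.1, 1.0],
    "geneD": [1.9, 1.1, 0.2, 1.1],
}
merges = average_linkage(profiles)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

The list of merges, read from first to last, is the dendrogram: genes with similar profiles join early at small distances, and the two anti-correlated groups join last.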

22 Serum stimulation of human fibroblasts, time series. A: cholesterol biosynthesis B: cell cycle C: immediate-early response D: signaling and angiogenesis E: wound healing from: Eisen et al., 1998, PNAS 95(25): 14863-14868 Distance: correlation coefficient Agglomeration: average linkage

23 Clustering of arrays: classification of cancer cells. From Chen et al. (2002). Mol Biol Cell 13(6):1929-39

24 Exercise: Normalization (Excel): R-G plot, M-A plot, most up- and downregulated genes
