Microarray - Leukemia vs. normal GeneChip System.


6 Microarray - Leukemia vs. normal

7 GeneChip System

8 Affymetrix Gene Expression Machine

9 Spotter device

10 Robot arm

11 Robot for printing

12 Print head

13 Print-head of robot arm

14 Hybridization equipment

15 Part 3 Raw Microarray Data

16 Raw data from microarrays
Microarray data comprise images from hybridized arrays, representing hybridization signal intensities for individual spots. These may be generated by single fluorescent, dual fluorescent, radioactive or colorimetric labels, and the recording methods differ in each case.

17 Microarray preparation
Microarrays are miniature devices comprising a large number of DNA sequences immobilized on a substrate such as a glass microscope slide. The sequences, known as features, are arranged as a grid. Arrays are hybridized with a complex probe (a population of labeled DNA or RNA molecules, representing a particular cell type or tissue). The intensity of the hybridization signal for each feature corresponds to the amount of that particular molecule in the probe, and this is directly proportional to the level of gene expression in the cell type or tissue from which the probe was prepared.

18 Raw data consist of images from hybridized arrays
The exact nature of the image depends on the array platform (the type of array used).

19 First generation microarrays
First generation microarrays were made by spotting DNA molecules onto nylon membranes and were hybridized with a radioactive probe. The signals were detected and quantified using a phosphorimager. The spatial resolution of radioactive signals is low, so the features on the array cannot be packed very tightly. Hence, nylon arrays tend to be large (on the order of 10 cm²) and are sometimes called macroarrays for this reason. Feature density can be increased by using a colorimetric label instead of a radioactive label, but the sensitivity is lower.

20 Second generation microarrays
Spotted cDNA microarrays or high-density oligonucleotide chips are used in most array experiments these days. In both cases, the substrate has minimal autofluorescence, so a fluorescent probe can be used. Data are acquired by confocal laser scanning of the hybridized array at the appropriate excitation wavelength and recording at the appropriate emission wavelength (or channel).

21 Part 4 Data Quality

22 Labeling and hybridization
A single label is used for oligonucleotide chips, so differential gene expression is detected by hybridizing different probes to duplicate arrays. However, in the case of spotted arrays, two probes can be labeled with different fluorophores and hybridized simultaneously to the same array, allowing differential gene expression to be monitored directly.
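The dual-label comparison on a spotted array is commonly summarized as a log ratio of the two channel intensities for each spot. The sketch below assumes background-corrected, positive intensities; the gene names, the values and the `log2_ratio` helper are hypothetical and only illustrate the idea.

```python
import math

def log2_ratio(test_intensity, reference_intensity):
    """Return log2(test / reference) for one spot on a dual-label array."""
    if test_intensity <= 0 or reference_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(test_intensity / reference_intensity)

# Hypothetical (test, reference) intensities for three spots.
spots = {
    "gene_a": (2000.0, 500.0),   # higher in the test probe
    "gene_b": (300.0, 300.0),    # unchanged
    "gene_c": (100.0, 800.0),    # lower in the test probe
}

ratios = {name: log2_ratio(t, r) for name, (t, r) in spots.items()}
for name, value in ratios.items():
    print(name, value)
```

A positive value means the spot is brighter in the test channel, a negative value means it is brighter in the reference channel, and zero means no difference.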

23 Data Normalization Issues
Normalization of data from different chips. MGED normalization standards. Natural biological variation is large; technical variation is small (~98% auto-correlation). MIT approach: raw gene expression values. Stanford approach: ratios.

24 Data Preparation
Thresholding: usually min 20, max 16,000. This applies to older Affy chips (new Affy chips do not have negative values). Filtering: remove genes with insufficient variation, e.g. MaxVal - MinVal < 500 and MaxVal/MinVal < 5, both for biological reasons and as feature reduction for algorithmic reasons. For clustering, normalize each gene separately to Mean = 0, Std. Dev = 1.
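The three preparation steps can be sketched in a few lines. This is a minimal illustration, assuming the filter is read literally (a gene is dropped when both conditions hold) and assuming the population standard deviation for the per-gene scaling; the function names and sample values are hypothetical.

```python
from statistics import mean, pstdev

def threshold(values, lo=20.0, hi=16000.0):
    """Clip every value into the range [lo, hi]."""
    return [min(max(v, lo), hi) for v in values]

def has_insufficient_variation(values, min_diff=500.0, min_fold=5.0):
    """True when MaxVal - MinVal < min_diff and MaxVal / MinVal < min_fold."""
    lo, hi = min(values), max(values)
    # Division is safe because thresholding guarantees lo >= 20.
    return (hi - lo) < min_diff and (hi / lo) < min_fold

def standardize(values):
    """Rescale one gene's values to mean 0 and standard deviation 1."""
    m = mean(values)
    s = pstdev(values)  # assumption: population standard deviation
    if s == 0:
        return [0.0] * len(values)
    return [(v - m) / s for v in values]

# Hypothetical raw values for two genes across four samples.
raw = {
    "gene_a": [-50.0, 100.0, 30000.0, 400.0],
    "gene_b": [100.0, 120.0, 150.0, 110.0],
}

prepared = {}
for gene, values in raw.items():
    clipped = threshold(values)
    if has_insufficient_variation(clipped):
        continue  # gene_b is dropped: range 50, fold change 1.5
    prepared[gene] = standardize(clipped)

print(sorted(prepared))
```

Thresholding comes first so that the fold-change test never divides by zero or by a negative value.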

25 Data quality
It is essential to record signal intensities from individual spots accurately, as errors in data recording cannot be detected or corrected at a later stage. Software for reading microarrays is generally provided with the recording equipment (scanner or phosphorimager), but manual adjustment is necessary to compensate for variations in array manufacture. The signal must be corrected for background (nonspecific hybridization, autofluorescence, contamination), and hybridization controls must be used when comparing results across different arrays.

26 Process automation
DNA arrays may contain many thousands of features, and hence data acquisition and analysis must be automated. Software for initial image processing is normally provided with the scanner (or phosphorimager). It allows the boundaries of individual spots to be determined and the total signal intensity to be measured over the whole spot (called the signal volume). Locating spots precisely can be a problem, particularly if there is distortion on the array surface, and hence it is often necessary to align the grid manually. This is essential since signal intensities can vary across individual spots, and the shape and size of different spots may not be uniform.
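The signal volume described above is a sum of pixel intensities inside the spot boundary. The following sketch assumes a circular boundary with a known center and radius, which real scanner software would have to estimate; the image and the `signal_volume` helper are hypothetical.

```python
def signal_volume(image, center_row, center_col, radius):
    """Sum the pixel intensities that fall inside a circular spot boundary."""
    total = 0.0
    for r, row in enumerate(image):
        for c, pixel in enumerate(row):
            inside = (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2
            if inside:
                total += pixel
    return total

# Hypothetical 5 x 5 scan: a bright spot (10) on a dim background (1).
image = [
    [1, 1, 1, 1, 1],
    [1, 1, 10, 1, 1],
    [1, 10, 10, 10, 1],
    [1, 1, 10, 1, 1],
    [1, 1, 1, 1, 1],
]

volume = signal_volume(image, center_row=2, center_col=2, radius=1)
print(volume)
```

If the grid is misaligned by even one pixel, the boundary captures background instead of signal, which is why manual grid alignment matters.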

27 Noise suppression
Signal intensity has to be corrected for background noise, which may arise from nonspecific hybridization, autofluorescence, dust and other contaminants, or poor hybridization technique (e.g. partial dehydration). Noise can vary over the array surface, so signal intensities must be normalized for local background values. Correction for background noise is difficult when the signal intensity for a particular spot is itself very low.
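One simple way to picture local background correction is to subtract a background estimate taken from the pixels surrounding each spot. The sketch below assumes the mean of the spot pixels as the signal and the median of the neighboring pixels as the background; both choices and all values are illustrative assumptions, not a prescribed method.

```python
from statistics import mean, median

def background_corrected(spot_pixels, local_background_pixels):
    """Mean spot intensity minus the median of the surrounding pixels."""
    signal = mean(spot_pixels)
    # The median is less affected by a stray bright pixel such as dust.
    background = median(local_background_pixels)
    return signal - background

# Hypothetical pixels around both spots; 400 stands in for a dust speck.
local_background = [90, 100, 110, 400]

strong_spot = [900, 1000, 1100]
weak_spot = [100, 110, 105]

strong = background_corrected(strong_spot, local_background)
weak = background_corrected(weak_spot, local_background)
print(strong, weak)
```

The weak spot ends up at zero after correction, which illustrates why low-intensity spots are hard to separate from background.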

28 Control features
Control features should be included on the array to measure nonspecific hybridization and variable hybridization across arrays. For instance, Affymetrix GeneChips incorporate a set of mismatching oligonucleotides for each perfect match set to determine nonspecific hybridization. Controls are important where duplicate arrays are being used to study differential gene expression, since variation in array manufacture or experimental protocol can influence signal intensities on different arrays. The bottom line is that errors and artifacts introduced before or during data acquisition cannot be detected or corrected later.
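To illustrate how mismatch controls can be used, the sketch below averages the difference between each perfect match (PM) intensity and its paired mismatch (MM) intensity. This is a simplified illustration under stated assumptions, not the vendor's actual summarization algorithm; the function name and the intensities are hypothetical.

```python
from statistics import mean

def average_difference(perfect_match, mismatch):
    """Average of (PM - MM) over the probe pairs for one gene."""
    if not perfect_match or len(perfect_match) != len(mismatch):
        raise ValueError("need equal, non-empty lists of PM and MM values")
    differences = [pm - mm for pm, mm in zip(perfect_match, mismatch)]
    return mean(differences)

# Hypothetical intensities for four probe pairs of one gene.
pm_values = [1200, 1500, 900, 1100]
mm_values = [200, 300, 250, 150]

specific_signal = average_difference(pm_values, mm_values)
print(specific_signal)
```

The mismatch values estimate nonspecific hybridization, so subtracting them leaves an estimate of the specific signal.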

29 Gene expression matrices
The raw data from microarray experiments are converted into tables known as gene expression matrices. The rows represent genes and the columns represent experimental conditions. The data in the table are signal intensities, representing relative levels of gene expression.
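A gene expression matrix with this layout can be assembled from per-array measurements as in the sketch below. The condition names, gene names and intensities are hypothetical placeholders.

```python
# Hypothetical signal intensities, one dictionary per hybridized array.
measurements = {
    "leukemia_1": {"gene_a": 850.0, "gene_b": 120.0, "gene_c": 4300.0},
    "leukemia_2": {"gene_a": 910.0, "gene_b": 95.0, "gene_c": 3900.0},
    "normal_1": {"gene_a": 140.0, "gene_b": 110.0, "gene_c": 4100.0},
}

conditions = list(measurements)              # column labels
genes = sorted(measurements[conditions[0]])  # row labels

# Rows are genes, columns are experimental conditions.
matrix = [[measurements[cond][gene] for cond in conditions] for gene in genes]

for gene, row in zip(genes, matrix):
    print(gene, row)
```

Each row of `matrix` is the expression profile of one gene across all conditions.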

30 Grouping expression data
Each gene in a gene expression matrix has an expression profile, that is, the expression measurements over a range of conditions. The analysis of microarray data involves grouping these data on the basis of similar expression profiles. If a predefined classification system is used to group the genes, the analysis is described as supervised. If there is no predefined classification, the analysis is described as unsupervised and is known as clustering.

31 Clustering methods
Clustering first involves converting the gene expression matrix into a distance matrix, so that genes with similar expression profiles can be grouped together. This generally involves calculating the Euclidean distance, the correlation measure based distance or the Pearson linear correlation based distance for each pair of expression profiles. Several clustering methods can then be used, including hierarchical clustering, k-means clustering and the derivation of self-organizing maps.
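The conversion from a gene expression matrix to a distance matrix can be sketched as follows. The example assumes the Pearson-based distance is defined as 1 minus the correlation coefficient, which is one common convention; the profiles and function names are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pearson_distance(a, b):
    """1 - Pearson correlation; undefined for a constant profile."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)

def distance_matrix(profiles, metric):
    """Pairwise distances between all rows of a gene expression matrix."""
    return [[metric(p, q) for q in profiles] for p in profiles]

# Hypothetical profiles: rows are genes, columns are conditions.
profiles = [
    [1.0, 2.0, 3.0],  # gene_a
    [2.0, 4.0, 6.0],  # gene_b: same shape as gene_a, larger magnitude
    [3.0, 2.0, 1.0],  # gene_c: opposite trend to gene_a
]

by_euclidean = distance_matrix(profiles, euclidean)
by_pearson = distance_matrix(profiles, pearson_distance)
print(by_euclidean[0])
print(by_pearson[0])
```

The two metrics disagree on purpose here: gene_a and gene_b are far apart in Euclidean terms but have zero Pearson distance, because correlation ignores magnitude and compares only the shape of the profile. The resulting matrix is the input that hierarchical clustering and similar methods operate on.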

