The 2007 Microarray Research Group Project


1 The 2007 Microarray Research Group Project
Comparison of Comparative Genomic Hybridization Technologies Across Microarray Platforms
Susan Hester
April 1, 2007

2 Two goals for this presentation:
1. Present a Comparative Genomic Hybridization study to the ABRF research community
2. Evaluate commercial CGH platforms for their ability to detect known gains or losses

3 Background
Comparative Genomic Hybridization (CGH) measures DNA copy number differences between a reference genome and a test genome.
In early CGH experiments, the DNA targets were hybridized to metaphase chromosome spreads in FISH assays.
The technology later evolved so that DNA targets are hybridized to microarrays containing cDNA fragments or bacterial artificial chromosomes (BACs).
Commercial microarrays are characterized as whole-genome CGH platforms that measure DNA copy number differences across entire genomes.

4 What is CGH and what is it not?
CGH has nothing to do with gene expression. CGH arrays are structural, not functional: they look for DNA that is missing or duplicated, not DNA that is expressed. In array CGH, thousands of probes are printed on a microscope slide. These probes are bacterial artificial chromosomes (BACs), which are sequences of cloned human DNA, kilobases (Kb) long, that have been mapped to specific locations on specific chromosomes. Test DNA and reference (control) DNA, each labeled with a different dye, are added to the array. The essence of the experiment is that the two DNA samples compete for hybridization to the BAC probes on the slide, hence the "comparative" aspect of CGH.

5 The 2007 MARG Study
The research question: How well do BAC and commercial CGH arrays detect known copy number gains or losses in test and reference DNA?
Human leukemic HL60 DNA was compared to reference DNA on 5 platforms: RPCI BAC 19K arrays, Agilent 44K, Illumina HAP 550K, Affymetrix 500K, and the Affymetrix U133 expression array.
DNA samples were analyzed in quadruplicate.
3 different laboratory test sites: Roswell Park Cancer Institute, Memorial Sloan-Kettering Cancer Institute, and Columbus Children’s Hospital.
Quality assessment of each platform was performed by each test site.

6 Known copy number gains and losses on cell line HL-60*
Gains: amplification of the 8q24 locus, trisomy 18
Losses: deletions at loci 5q11.2-q31, 6q12, 9p21.3-p22, 10p12-p15, 14q22-q31, 17p12-p13.3; monosomy X
*Cancer Genet Cytogenet Nov;147(1):28-35; Peiffer et al. (2006)

7 The fundamental theory behind each platform*: nucleotide base-pairing
[Figure: Before hybridization, the DNA sample (varying locus concentrations reflecting inherent gains/losses) is labeled and fragmented. After hybridization to the chromosome #1 probes, regions of gain and loss are marked.]
*In this example, a one-color platform.

8 Example of Array CGH Technology*
*In this example, a 2-color platform. Chari et al., Cancer Informatics, 2006, 2, 48-58

9 Cont.

10 Platform Details
Columns: platform; # probes or probe sets (resolution); annotation; platform resolution; protocol; replicates; primary data
BAC 19K (1): ~19,000 in duplicate; NCBI Build 35 (hg17); 100 kb; 2-color; 4 HL60/lung; Mean Log2 (test/ref)
Agilent (2): ~44,000 (~35 kb); sub-100 kb
Affy U133 (3): 54,675; 1-color; 4 HL60, 4 lung; Log2(RMA)
Affy 500K (2): ~500,000; Build 35 (hg17); 0 lung; reference database
Illumina (2): ~550,000; Log R ratio
The test sites were: (1) Roswell Park Cancer Institute, Buffalo, NY; (2) Sloan-Kettering Institute Genomic Core, Rochester, NY; (3) Columbus Pediatric Hospital, Columbus, Ohio

11 Study Questions:
1. What is the precision of each platform?
2. How accurately does each platform detect known copy number changes?
3. What is the resolution of array CGH?

12 Results: Assessment of precision: coefficient of variation across replicates

13 Coefficient of variation statistics
          BAC     Agilent  Illumina  Affy GE  Affy Nsp  Affy Sty
Mean      2.13    3.16     5.98      9.64     10.69     9.44
Median    1.96    2.66     5.23      8.71     9.91      8.47
Std Dev   1.09    2.43     3.52      4.78     5.24      5.60
Size      18067   42897    555353    54676    524629    476709
Max       16.87   77.89    98.05     59.57    48.97     50.59
Std Err (only three values survive in the transcript): 0.01 0.00 0.02
Min (only three values survive in the transcript): 0.05 0.71 0.15
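As an illustration of the statistic summarized on this slide, the following is a minimal sketch of a per-probe coefficient of variation across replicates. The slides do not state whether the CV was computed on raw or transformed signal, so the use of raw intensities, the function name, and the replicate values are all assumptions made for this example.

```python
import statistics


def coefficient_of_variation(replicates):
    """Return the CV, in percent, of one probe's replicate measurements."""
    mean = statistics.mean(replicates)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    # Sample standard deviation (n - 1 denominator) divided by the mean.
    return 100.0 * statistics.stdev(replicates) / abs(mean)


# Hypothetical intensities for one probe measured in quadruplicate.
probe_replicates = [1020.0, 980.0, 1005.0, 995.0]
cv_percent = coefficient_of_variation(probe_replicates)
print(f"CV = {cv_percent:.2f}%")
```

Computing this value for every probe and then taking the mean, median, and maximum over probes would give summary rows of the kind shown in the table.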

14 Platform Visualizations:
Chromosome 8 gain Chromosome 17 deletion Chromosome 5 deletion

15 RPCI BAC 19K array
[Figure: log2 ratio plots for chr8 (showing +8q24) and chr17 (showing -17p12)]

16 Agilent 44K Chr. 5q deletion

17 Visualization of each platform: Illumina HAP 550K
Chr. 5q deletion Replicate 1 Replicate 2 Replicate 3 Replicate 4 -5q11

18 Affymetrix U133 Gene Expression Array
Chromosome 17

19 Partek: Affy500K Chr 17p deletion Replicate 1 Replicate 2 Replicate 3

20 Results: Assessment of Accuracy: concordance with known gains/losses
Columns (gain/loss): -5q11, -6q12, +8q24, -9p21, -10p12, -14q22, -17p12, Tri 18, monoX
Rows (platform): BAC, Agilent, Affy U133 GE, Affy 500K, Illumina
Cell values: Y (detected) or N (not detected)
Color key: red = amplification, black = loss

21 Results: Assessment of platform resolution
Circular binary segmentation (CBS) analysis*
This approach translates noisy intensity measurements into regions of equal copy number by detecting the points at which the signal changes.
Log ratios of normalized signal intensities are used: log2(signal HL60 / signal reference DNA).
Example: in the normal state, 2 copies / 2 copies = 1, and log2(1) = 0.
*Olshen, A.B. et al. (2004), Circular Binary Segmentation for the Analysis of Array-based DNA Copy Number Data
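The idea of detecting the point at which the signal changes can be sketched with a much simpler procedure than CBS itself. The code below finds a single best split of a log2 ratio series by comparing the means on either side. It is not the circular algorithm of Olshen et al. and has no permutation test; the function name and the simulated values are assumptions for illustration only.

```python
import math


def best_split(log_ratios):
    """Return (index, score) of the split that best separates two segment means."""
    n = len(log_ratios)
    total = sum(log_ratios)
    best_index, best_score = None, 0.0
    left_sum = 0.0
    for i in range(1, n):
        left_sum += log_ratios[i - 1]
        left_mean = left_sum / i
        right_mean = (total - left_sum) / (n - i)
        # Difference in means, weighted so that very short segments are not favored.
        score = abs(left_mean - right_mean) * math.sqrt(i * (n - i) / n)
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


# Simulated log2 ratios: four probes near 0 (normal), then four near -1 (loss).
simulated = [0.0, 0.1, -0.1, 0.05, -1.0, -0.9, -1.1, -1.05]
split, score = best_split(simulated)
left_mean = sum(simulated[:split]) / split
right_mean = sum(simulated[split:]) / (len(simulated) - split)
print(f"change point before probe {split}: "
      f"segment means {left_mean:.3f} and {right_mean:.3f}")
```

A full segmentation would apply a test of this kind recursively and keep only statistically supported change points; the segment mean on each side plays the role of the seg.mean value reported later in the CBS summary.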

22 Adopted;

23 Circular Binary Segmentation
[Figure: CBS calculation on a simulated intensity ratio, showing the point of transition in signal and the resulting segments]

24 Assessment of platform resolution on a known deletion or gain:
RPCI BAC 19K, Agilent 44K, Illumina HAP550, Affymetrix U133 Gene Expression, Affymetrix 500K

25 If 1 copy is present in HL-60 and 2 in the reference,
then intensity ratio = 1/2 = 0.5 and log2(0.5) = -1.
Log2 ratio between 0 and -1: 1-copy loss
Log2 ratio between 0 and 0.5: 1-copy gain
Log2 ratio between 0.5 and 1: 2-copy gain
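The arithmetic on this slide can be reproduced with a short sketch that computes the ideal log2 ratio for a given number of copies in the test sample against a diploid reference. The function name and the default of 2 reference copies are assumptions for this example; measured ratios on real arrays can differ from these ideal figures.

```python
import math


def expected_log2_ratio(test_copies, reference_copies=2):
    """Ideal log2(test/reference) ratio for a given copy number."""
    return math.log2(test_copies / reference_copies)


# Copy numbers from a single-copy loss (1) up to a two-copy gain (4).
for copies in (1, 2, 3, 4):
    print(f"{copies} copies vs 2: log2 ratio = {expected_log2_ratio(copies):.3f}")
```

Under these ideal values a single-copy gain (3 copies) gives log2(1.5), which is close to the 0.5 boundary used on the slide, and a two-copy gain (4 copies) gives exactly 1.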

26 CBS Summary Details for known gains/losses:
Deletion Chr. 5 and Gain Chr. 8
Platform     chromosome  loc.start    loc.end      num.probes*  seg.mean  Est. copy number
Deletion, Chr. 5:
RPCI BAC     5           53,817,423   139,426,041  491          -1
Agilent      5           53,787,209   139,475,226  1003
Illumina     5           53,729,928   113,350,609  10438
Affy GE      5           53,536,915   139,532,618  1212
Affy 500K**  5           53,925,665   73,549,145   13886
Gain, Chr. 8:
RPCI BAC     8           98,584,025   130,781,748  14           1.2942    >2
Agilent      8           126,357,705  130,632,571  26           0.3993    >1
Illumina     8           126,302,381  130,771,462  428          1.1630
Affy GE      8           126,335,997  128,498,243  22           1.612
Affy 500K**  8           126,000,000  131,000,000  1192         1.7650
*used only probes>10
**Data generated using Partek Software to determine gains/losses
Color key: green = deletion, red = gain

27 Did Agilent underestimate the change?
Chr. 8, entire view; a zoomed-in view of the amplification at 120 to 130 Mb follows

28 Results: Agilent data compared to other platforms
CBS values shown: 2.011, 1.1630, 1.612, 0.3993
Results: our Agilent data had the lowest CBS ratio value and a small dynamic range compared to the other platforms; the Illumina data had a very large number of segments

29 CBS analysis finds 3 novel gains or losses**
Chromosome 2: BAC and Affy500K agree
Chromosome 16: Illumina, Agilent, Affy GE, BAC, and Affy500K all agree
Chromosome 19: Affy500K and Affy GE agree
**Agreement means that at least 2 of the 5 platforms found the same gain or loss in the same location (start to end location)
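One way to make the agreement rule concrete is to compare segment calls pairwise. The sketch below treats two calls as the same change when they are on the same chromosome, have the same direction, and have overlapping coordinates. Overlap is an assumption chosen for this example; the slide says only "same location", which may have been applied more strictly. All names and coordinates are hypothetical.

```python
def same_call(a, b):
    """True when two calls share chromosome and direction and their ranges overlap."""
    return (a["chrom"] == b["chrom"] and a["kind"] == b["kind"]
            and a["start"] <= b["end"] and b["start"] <= a["end"])


def agreeing_platforms(calls):
    """Return platforms whose call is matched by at least one other platform."""
    agreed = set()
    names = list(calls)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if same_call(calls[first], calls[second]):
                agreed.update((first, second))
    return sorted(agreed)


# Hypothetical segment calls; coordinates are invented for illustration only.
example_calls = {
    "BAC": {"chrom": "2", "kind": "gain", "start": 1_000_000, "end": 5_000_000},
    "Affy 500K": {"chrom": "2", "kind": "gain", "start": 1_200_000, "end": 4_800_000},
    "Agilent": {"chrom": "2", "kind": "loss", "start": 1_100_000, "end": 4_900_000},
}
agreed = agreeing_platforms(example_calls)
print(agreed)
```

In this made-up input the Agilent call overlaps the others but has the opposite direction, so only the two matching platforms are reported as agreeing.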

30 Table of Novel Gains and Losses
Columns: Chr.2, Chr.16, Chr.19
Rows (platform): BAC, Agilent, Affy U133, Affy 500K, Illumina
Cell values: Y or N
Color key: red = gain, black = loss

31 Conclusions
Good precision across all platforms: mean CVs ranged from 2.1% to 10.7%.
Illumina and Affymetrix 500K identified 9 of 9 (100%) of the known losses and gains in HL60 DNA.
Agilent, Affymetrix U133 gene expression, and RPCI BAC arrays identified 8 of 9 (89%) of the known losses and gains in HL60 DNA.
CBS was used to assess the resolution of each platform; all platforms detected known gains/losses.
3 novel changes were also detected by at least 2 platforms.

32 Acknowledgments
The Microarray Research Group (MARG): Susan Hester, Laura Reid, Agnes Viale, Norma Nowak, Herbert Auer, Kevin Knudtson, Bill Ward, Jay Tiesman, Caprice Rosato, Aldo Massimi, Greg Khitrov, and Nancy Denslow
Laboratory test sites: Roswell Park Cancer Institute, Memorial Sloan-Kettering Cancer Institute, and Columbus Children’s Hospital
Data analysis efforts: US Environmental Protection Agency, Roswell Park Cancer Institute, Columbus Children’s Hospital, Expression Analysis
Funding support for this project: The Executive Board of the Association of Biomolecular Resource Facilities

33 MARG poster number is RG4-S


