Birol et al :: AGBT :: 7 February 2008 A NOVEL APPROACH TO IMPROVE THE NOISE IN DETECTING COPY NUMBER VARIATIONS USING OLIGONUCLEOTIDE MICROARRAYS

Presentation transcript:

A NOVEL APPROACH TO IMPROVE THE NOISE IN DETECTING COPY NUMBER VARIATIONS USING OLIGONUCLEOTIDE MICROARRAYS
12 November 2008
Noushin Farnoud, Marco Marra, Jan Friedman, Stephane Flibotte, Allen Delaney
Canada’s Michael Smith Genome Sciences Centre

Outline
What are Copy Number Variations (CNVs)?
Why is it important to study copy number variations?
How can we study CNVs?
What are the issues associated with studying CNVs?
How can we deal with them?

What is Copy Number Variation (CNV)?
The DNA copy number of a genomic region is the number of copies of that region present in the genome. In humans, the normal copy number is two for most autosomal regions. However, recent discoveries have revealed that many segments of DNA, ranging in size from kilobases to megabases, can vary in copy number. These DNA copy number variations (CNVs) result from genomic events that cause discrete gains and losses of contiguous segments of the genome.

Why is it important to study CNVs?
CNVs are common in cancer and other diseases. For example, a review paper by Charles Lee lists 17 conditions of the nervous system alone, including Parkinson’s disease and Alzheimer’s disease, that can result from copy number variation (Neuron, Oct 06).
CNVs are also common in normal individuals and contribute to our uniqueness. These changes can also influence susceptibility to disease.
Since CNVs often encompass genes, they can play important roles both in characterizing human disease and in discovering drug response targets.
Understanding the mechanisms of CNV formation may also help us better understand human genome evolution.

How can we detect CNVs?
[Figure: two-color arrays and one-color arrays; labels: Patient, Reference]

Main issue of oligonucleotide microarrays
[Figure: log2 ratio of intensity plotted against position (Mb)]
Although high-density microarrays provide genome-wide data on copy number, they often carry a substantial amount of noise that can degrade the performance of the analyses.

How can we reduce this noise?
Hypothesis: Can we reduce the oligonucleotide microarray noise by analyzing individual oligonucleotide probes?
Each SNP probe set has:
# oligonucleotide probes (10K array): 647,080 oligos
# oligonucleotide probes (100K array): 4,648,160 oligos
# oligonucleotide probes (500K array): 12,013,632 oligos

Therefore…
We can conclude that a major source of the noise is that the individual oligonucleotide probes within a SNP probe set behave differently. This indicates that averaging all perfect-match (PM) oligos is not a good approximation of the information content of a SNP.

Novel Algorithm: Oligonucleotide Probe-level Analysis of Signal intensities (OPAS)
Cluster the individual oligos in each SNP probe set.
Apply null-hypothesis testing: estimate the likelihood (p-value) that each cluster of oligos has a log-ratio intensity = 0, > 0 or < 0.
Based on these p-values and ML classification algorithms, identify the “most significant cluster of oligos”.
Treat the other cluster(s) of oligos as noise and exclude them from the analysis.
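The steps above can be illustrated with a small sketch. This is not the OPAS implementation, which the slides do not spell out. It assumes a plain two-cluster split of the per-oligo log2 ratios, a normal-approximation p-value against a mean of 0, and a stand-in selection rule (keep the cluster with the smallest standard error) in place of the ML classification step. All function names are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def two_means_1d(values, iterations=50):
    """Split 1-D values into (at most) two clusters with a plain 2-means loop."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return [list(values)]
    centers = [lo, hi]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        new_centers = [statistics.fmean(g) if g else c
                       for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return [g for g in groups if g]


def summarize(cluster):
    """Return (mean, standard error, p-value for H0: mean log ratio = 0)."""
    mean = statistics.fmean(cluster)
    if len(cluster) < 2:
        return mean, math.inf, 1.0  # a single oligo carries no spread estimate
    sem = statistics.stdev(cluster) / math.sqrt(len(cluster))
    if sem == 0:
        return mean, 0.0, (1.0 if mean == 0 else 0.0)
    z = mean / sem
    # Assumption: normal approximation instead of an exact t-test.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return mean, sem, p_value


def opas_like_summary(log_ratios):
    """Cluster the oligos of one SNP and keep the most consistent cluster."""
    clusters = two_means_1d(log_ratios)
    scored = [(summarize(c), c) for c in clusters]
    # Stand-in for the ML classification step: smallest standard error wins.
    (mean, _sem, p_value), kept = min(scored, key=lambda item: item[0][1])
    return {"mean": mean, "p_value": p_value, "kept": kept}


# Hypothetical probe set: five oligos agree on a loss, three are scattered.
probe_set = [-0.95, -1.05, -1.0, -0.9, -1.1, 0.4, 0.1, 0.9]
result = opas_like_summary(probe_set)
print("plain average of all oligos:", statistics.fmean(probe_set))
print("mean of kept cluster:", result["mean"])
```

In this made-up probe set the plain average of all eight oligos is pulled toward zero by the three scattered values, while the kept cluster stays close to -1.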

Example of Improving the SNP Noise by OPAS
[Figure: SNP signal before and after OPAS]

How does OPAS Affect CN analysis?

What's next?
The next generation of DNA microarray-based technologies will allow equal detection of large and small CNVs.
Also on the horizon are new DNA sequencing technologies enabling rapid (and ultimately inexpensive) 'personalized' genome sequencing projects.
Coupled together, these technologies will capture almost all the variation in a genome.

Acknowledgments
Funding
Contact:
GSC: Marco Marra, Stephane Flibotte, Allen Delaney, Irene Li, Hong Qian, Robert Holt, Sussana Chan
BC Children’s & Women’s Hospital: Jan Friedman, Patrice Eydoux

Advantages of Array CGH

[Flowchart: OPAS processing steps]
Inputs: Test Array (normalized log2 raw intensity); Ref Set (pool of normal parents)
Log2 Ratio
Clustering PM oligos (using a fuzzy clustering approach)
Likelihood Estimation: apply a series of null-hypothesis tests to determine the likelihoods P_Hs(cluster = 0), P_Hs(cluster < -0.5) and P_Hs(cluster > +0.6)
Classification: each SNP is classified as deleted, normal or amplified by comparing the P values of its constituent clusters of PM oligos
Post-processing the results
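The first and last stages of this flow can be sketched in a few lines. The sketch assumes intensities are already on a log2 scale, as the slide states, so the log2 ratio is a simple difference from the reference-pool mean. It reuses the slide's -0.5 and +0.6 values as plain cutoffs, which is a simplification: the slide applies them inside null-hypothesis tests on oligo clusters. The function names are hypothetical.

```python
import statistics

DELETION_CUTOFF = -0.5  # from the slide: P_Hs(cluster < -0.5)
GAIN_CUTOFF = 0.6       # from the slide: P_Hs(cluster > +0.6)


def log2_ratio(test_log2_intensity, reference_log2_intensities):
    """Difference between the test value and the mean of the reference pool."""
    return test_log2_intensity - statistics.fmean(reference_log2_intensities)


def classify(ratio):
    """Simplified three-way call; a cutoff stand-in for the hypothesis tests."""
    if ratio < DELETION_CUTOFF:
        return "deleted"
    if ratio > GAIN_CUTOFF:
        return "amplified"
    return "normal"


# Hypothetical values for one SNP: test sample versus three reference arrays.
ratio = log2_ratio(9.0, [10.0, 10.1, 9.9])
print(round(ratio, 3), classify(ratio))
```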

What is Copy Number
Introduction
 - What is a SNP?
 - What is a SNP array?
Array Design + Target Preparation
Applications of SNP arrays (other than genotyping)
 - Copy number analysis
Genotyping using SNP arrays
 - Generations of methodologies
 - Properties of SNP arrays

Schematic Representation of DNA Copy Number Change
[Figure: normal cell (CN=2); deletion (CN=0, CN=1); amplification (CN=3, CN=4)]
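As a worked illustration of how these copy-number states relate to the log2 ratios used elsewhere in the talk, the idealized expected ratio against a two-copy reference is log2(CN / 2). This is a textbook relationship rather than something stated on the slide, and real array data are noisier than these ideal values. The function name is hypothetical.

```python
import math


def expected_log2_ratio(copy_number, reference_copies=2):
    """Idealized log2 ratio of a region with `copy_number` copies."""
    if copy_number == 0:
        return -math.inf  # no signal expected from a homozygous deletion
    return math.log2(copy_number / reference_copies)


for cn in range(5):
    print(f"CN={cn}: {expected_log2_ratio(cn):.3f}")
```

A single-copy loss gives -1 and a single-copy gain gives about 0.585.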

Background (1): What are SNPs?
Single Nucleotide Polymorphism (SNP)
Definition: SNPs are variations in single base pairs that are randomly dispersed throughout the genome.
[Figure: example DNA sequences illustrating a SNP]

Major conclusions so far* …
There is considerable variation in the numbers and types of candidate CNVs detected by different analysis approaches.
Multiple programs are needed to find all real aberrations in a test set.
The frequency of false positive deletions is substantial, but can be greatly reduced by using the SNP genotype information to confirm loss of heterozygosity.
* Friedman et al., AJHG 2006; Baross et al., BMC Bioinformatics 2007; Delaney et al., in progress
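The last conclusion can be made concrete with a minimal sketch. It assumes genotype calls encoded as "AA", "AB" or "BB" (anything else is treated as a no-call) and flags a candidate deletion as unsupported when heterozygous calls appear inside it, since a region present in only one copy should show no heterozygous SNPs. The function name, the encoding and the tolerance parameter are all hypothetical, not taken from the cited papers.

```python
def genotypes_support_deletion(genotype_calls, max_het_fraction=0.0):
    """True if the calls inside a candidate deletion are consistent with LOH."""
    called = [g for g in genotype_calls if g in ("AA", "AB", "BB")]
    if not called:
        return False  # no usable genotype evidence either way
    het_fraction = sum(1 for g in called if g == "AB") / len(called)
    return het_fraction <= max_het_fraction


# Hypothetical candidate regions.
print(genotypes_support_deletion(["AA", "BB", "AA", "NoCall", "BB"]))
print(genotypes_support_deletion(["AA", "AB", "BB", "AB"]))
```

A small non-zero `max_het_fraction` would tolerate occasional genotyping errors. The right value is a judgment call that the slides do not specify.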

Profile of SNP probe sets
[Figure: probe-set profiles for deleted SNPs and for SNPs in a ‘normal’ region]

Generation of Affymetrix SNP arrays

                                     10K                 100K                    500K
Number of SNPs                       11,555              116,204                 500,568
Number of oligonucleotide probes     647,080             4,648,160               12,013,632
                                     (14 quartets/SNP)   (10 quartets/SNP)       (6 quartets/SNP)
Number of arrays                     1 (Xba I)           2 (Xba I + Hind III)    2 (Nsp I + Sty I)
Number of SNPs per array             --                  ~58,960 (Xba)           ~262,000 (Nsp)
                                                         ~57,244 (Hind)          ~238,000 (Sty)
Median inter-marker distance (kb)
Mean inter-marker distance (kb)
Average heterozygosity
% genome within 10 kb of a SNP       --                  40%                     85%
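The probe totals in this table can be cross-checked with simple arithmetic. The sketch below assumes that a quartet consists of four oligonucleotide probes, so the total is the number of SNPs times the quartets per SNP times four. The constant and function names are hypothetical.

```python
PROBES_PER_QUARTET = 4  # assumption: four oligos per quartet

# generation: (number of SNPs, quartets per SNP, total probes from the table)
ARRAYS = {
    "10K": (11_555, 14, 647_080),
    "100K": (116_204, 10, 4_648_160),
    "500K": (500_568, 6, 12_013_632),
}


def probes_per_snp(quartets):
    """Oligonucleotide probes in one SNP probe set."""
    return quartets * PROBES_PER_QUARTET


def total_probes(snps, quartets):
    """Total oligonucleotide probes across all SNP probe sets."""
    return snps * probes_per_snp(quartets)


for name, (snps, quartets, listed_total) in ARRAYS.items():
    computed = total_probes(snps, quartets)
    print(name, probes_per_snp(quartets), computed, computed == listed_total)
```

Under that assumption the computed totals match the table for all three generations.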

Background: Structure of Affy SNP array
Each SNP probe set has:
56 oligonucleotide probes (10K array): 647,080 oligos
40 oligonucleotide probes (100K array): 4,648,160 oligos
24 oligonucleotide probes (500K array): 12,013,632 oligos