Copy Number Variation Eleanor Feingold University of Pittsburgh March 2012.


Copy Number Variation Eleanor Feingold University of Pittsburgh March 2012

What do we mean by “copy number variation”? [Figure: example sequence GCTCATATATATTTG; scale kb - Mb (gene or gene region)]

Copy number variation in a gene or gene region. [Figure panels: “normal”; duplication of one gene; duplication of several genes; deletion; duplication of part of a gene]

Classical copy number study types
Cancer genetics. What: Find chromosomal segments (usually large ones) that are duplicated and/or deleted in tumor cell lines. Why: Learn something about cancer biology, or implications for treatment and prognosis.
Clinical pediatrics. What: Detect inherited or de novo deletions in individuals. Why: “Diagnose” birth defects.

And now: Genetic association studies for CNVs
1) Collect cases and controls.
2) “Genotype” everyone at a CNV.
3) Test genotype/phenotype association.
[Figure: cases and controls]
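
The association step above can be sketched in a few lines. This is only an illustration, not the method used in the talk: it assumes each person has already been assigned a copy-number class at one CNV, and it tests that class against case/control status with a Pearson chi-square statistic and a permutation p-value. The function names `chi_square_stat` and `cnv_association`, and the `n_perm` and `seed` parameters, are hypothetical.

```python
import random


def chi_square_stat(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip empty rows/columns
                stat += (observed - expected) ** 2 / expected
    return stat


def cnv_association(copy_numbers, is_case, n_perm=2000, seed=0):
    """Permutation test of copy-number class vs. case/control status.

    copy_numbers: one called copy number per person (assumed input).
    is_case: one boolean per person, True for cases.
    Returns (observed statistic, permutation p-value).
    """
    classes = sorted(set(copy_numbers))

    def tabulate(labels):
        # Row 0 = controls, row 1 = cases; one column per copy-number class.
        table = [[0] * len(classes) for _ in range(2)]
        for cn, case in zip(copy_numbers, labels):
            table[1 if case else 0][classes.index(cn)] += 1
        return table

    observed = chi_square_stat(tabulate(is_case))
    rng = random.Random(seed)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any genotype/phenotype link
        if chi_square_stat(tabulate(labels)) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

A permutation p-value is used here so the sketch needs no distribution tables. With reliable copy-number calls and adequate counts, a standard chi-square or trend test would serve the same purpose.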

How do we assay copy number variation?

Generation 1 - Array CGH
What: Microarray of clones (e.g. BACs), usually on a glass slide. Competitive hybridization of test and reference samples. Measure fluorescence ratio clone by clone.
Limitations: Large clones. Sparse coverage. High noise due to spotting process.
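
As a rough illustration of the clone-by-clone ratio, the sketch below converts paired test and reference fluorescence values to log2 ratios. The log2 scale is a common convention that the slide does not state, so treat it as an assumption; the function names are hypothetical, and all intensities are assumed to be positive.

```python
import math


def log2_ratios(test, reference):
    """Per-clone log2(test / reference) fluorescence ratio."""
    return [math.log2(t / r) for t, r in zip(test, reference)]


def expected_log2_ratio(copies, normal_copies=2):
    """Idealized log2 ratio for a given copy number, with no noise
    and no contamination by normal cells."""
    return math.log2(copies / normal_copies)
```

Under these idealized assumptions a single-copy loss gives a log2 ratio of -1 and a single-copy gain gives about 0.58. Real data are compressed toward zero by noise and mixed cell populations.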

Generation 2 - SNP chips
What: High-throughput SNP genotyping platforms (e.g. Affymetrix, Illumina).
Advantage: Hundreds of thousands of points of info.
Disadvantages: Technology was never intended for measuring copy number. SNPs on chip selected to avoid CNV regions by design.

Generation 3 - SNP chips with CNV markers (Affy 6.0, Illumina 1M)
Advantages: SNPs in known CNV regions are now included. Also have “non-polymorphic SNPs” (SNs?).
Affymetrix: 200K probes in 5K known large CNV regions; 700K probes “evenly spaced along the genome”.
Illumina: 1M markers in 10K regions of various types and sizes.

Generation 4 - (Illumina 2.5M, 5M)
Changes: Got rid of the non-polymorphic markers. Special coverage of CNV regions??? Are these better or worse for CNVs than the previous generation?

What data do these technologies give us, and how do we use it?

Standard genotyping. [Figure: allele intensity plot with genotype clusters AA, AB, BB] Genotype information is in the angle (relative intensity of the two alleles). Copy number information is in the distance from the origin (total intensity).
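
The angle/distance description can be made concrete with a small sketch. It assumes each SNP yields two non-negative allele intensities, here called `a` and `b`. The angle is scaled to run from 0 (pure A) to 1 (pure B), and total intensity is taken as `a + b`. Both choices are illustrative assumptions rather than any platform's definition; the Euclidean distance from the origin would be an equally reasonable total.

```python
import math


def angle_and_total(a, b):
    """Convert two allele intensities to (angle, total intensity).

    angle: 0.0 for pure A signal, 1.0 for pure B, about 0.5 for balanced AB.
    total: a + b (assumed definition of total intensity).
    """
    theta = math.atan2(b, a) / (math.pi / 2)
    total = a + b
    return theta, total
```

Scaling both alleles by the same factor leaves the angle unchanged and multiplies the total. In the idealized picture an extra copy therefore moves a point away from the origin without changing its genotype direction.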

In theory. [Figure: separate genotype clusters labeled AAA, AAB, ABB, BBB; AA, AB, BB; A, B; null]

But when you look at the data … [Figure: trisomic (Down Syndrome) and disomic samples; clusters labeled “AAA and AA”, AAB, AB, ABB, “BBB and BB”]

All SNPs on chromosome 21. [Figure: total intensity (disomic) and total intensity (trisomic); groups labeled trisomic and disomic]

In theory. [Figure: separate genotype clusters labeled AAA, AAB, ABB, BBB; AA, AB, BB; A, B; null]

In practice. [Figure: clusters labeled A, B, null]

So how are copy numbers called? Look for runs of SNPs that are high or low in intensity. Many available algorithms, e.g. HMM, CBS, change-point.
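
To show what "runs of SNPs that are high or low in intensity" means in practice, here is a deliberately naive sketch. It is not an HMM, CBS, or change-point method: it simply thresholds each SNP's normalized log intensity ratio and reports stretches of consecutive SNPs in the same non-normal state. The thresholds, the minimum run length, and the name `call_runs` are all assumptions chosen for illustration.

```python
from itertools import groupby


def call_runs(log_ratios, gain=0.3, loss=-0.3, min_snps=5):
    """Return (start_index, end_index, state) for each run of at least
    min_snps consecutive SNPs whose value is >= gain or <= loss."""

    def state(x):
        if x >= gain:
            return "gain"
        if x <= loss:
            return "loss"
        return "normal"

    calls = []
    pos = 0
    for s, group in groupby(log_ratios, key=state):
        n = len(list(group))
        if s != "normal" and n >= min_snps:
            calls.append((pos, pos + n - 1, s))
        pos += n
    return calls
```

A single noisy SNP inside a true deletion would split the run in this sketch. The published algorithms named above exist largely to handle that kind of noise.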

Basic picture

Komura et al. Genome Research 2006

More complex examples (cancer genetics) Peiffer et al. Genome Research, 2006

amplification [Figure: angle (genotype info) and total intensity; genotype clusters AA, AB, BB]

deletion

Extra copy of whole chromosome: total intensity high over whole chromosome; 3 genotype groups.

No copy number change, but a region of homozygosity (LOH). [Figure label: LOH]

Basic picture Wang et al. Genome Research, 2007

Chromosome 9

A few statistical issues to think about … (there’s still a lot to do)

Many run-calling algorithms are oriented towards clinical applications. Many CNV detection algorithms are very conservative - aim for zero false positive rate. Most use normalization methods that assume a large reference population is not available. Many use models that make assumptions about what kinds of variation are likely (e.g. cancer).

Family data should be modeled together. CNV “calls” will be much more accurate if you use the whole family, but the model you use should depend on whether you are expecting de novo mutations or not. For some diseases you’ll expect associations with de novo changes. For others you might expect inherited variants.
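
One simple way to see why family context matters is a Mendelian consistency check on a trio. The sketch below rests on strong, stated assumptions: the locus carries deletions only, each chromosome carries 0 or 1 copies, and copy-number calls are exact. It lists the child copy numbers that inheritance alone can produce and flags anything else as a candidate de novo change (or a calling error). All function names are hypothetical, and this is far simpler than a joint statistical model of the family.

```python
def transmissible(parent_cn):
    """Copies a parent can pass on, given total copy number 0, 1, or 2
    and at most one copy per chromosome (assumed deletion-only locus)."""
    if parent_cn == 0:
        return {0}
    if parent_cn == 1:
        return {0, 1}
    if parent_cn == 2:
        return {1}
    raise ValueError("sketch only handles copy numbers 0, 1, 2")


def inherited_child_cns(mother_cn, father_cn):
    """All child copy numbers explainable by inheritance alone."""
    return {m + f
            for m in transmissible(mother_cn)
            for f in transmissible(father_cn)}


def is_candidate_de_novo(child_cn, mother_cn, father_cn):
    """True if the child's call cannot be explained by transmission."""
    return child_cn not in inherited_child_cns(mother_cn, father_cn)
```

For example, a child with one copy and two parents with two copies each is flagged, while the same child with one single-copy parent is not.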

How do we group CNVs for association testing? [Figure: deletion; duplication]

Separate methods for deletions? Deletions are easier to detect than other changes. Deletions are likely to have simpler biological effects.

The most important one … The technology is still NOT intended for reliably and comparably measuring total intensity! Total intensity numbers are very sensitive to DNA source, sample handling, etc., so extreme measures must be taken to ensure that cases and controls are comparable.