Sequencing an Ashkenazi Jewish Reference Cohort for Medical Genetics and Implications for Ashkenazi History Shai Carmi Department of Computer Science Columbia.

Presentation transcript:

Sequencing an Ashkenazi Jewish Reference Cohort for Medical Genetics and Implications for Ashkenazi History Shai Carmi Department of Computer Science Columbia University Itsik Pe’er’s lab 2015

Population/Statistical Genetics Find disease genes Predict genetic risk Understand molecular genetics and evolution Learn about ancestry and history Henn et al., 2012 [Figure: -log10(P) by chromosome, schizophrenia study in Ashkenazi Jews (Lencz et al., 2013)]

Outline Ashkenazi Jewish Genetics: Background The Ashkenazi Genome Sequencing Project Segment Sharing and the Founder Event Future Directions

Ashkenazi Jewish (AJ) Genetics: Significance Medical genetics Large founder population Mendelian disorders Complex diseases o Breast cancer, Parkinson’s, Crohn’s Population genetics Debated origins Segment sharing Why do we need to sequence genomes? mtDNA: Behar et al., 2004; Behar et al., 2006 Y chr: Behar et al., 2003; Behar et al., 2004 Disease genes: Risch et al., 2003; Slatkin, 2004 SNP arrays: Gusev et al., 2012; Palamara et al., 2012 Review: Ostrer and Skorecki, 2013

Founder Populations: Opportunities Recent successes Crete o Tachmazidou et al., 2013; HDL Finland o Kurki et al. 2014; aneurysm Iceland o Many papers; most recently Steinthorsdottir et al., 2014; T2D Ashkenazi Jews o Hui et al., in preparation; Crohn’s See also: Hatzikotoulas et al., 2014 Zuk et al., 2014 Past Founder population Non-founder population Disease alleles Bottleneck Population size Present

Founder populations
Disease prevalence: 1%
Bottleneck effective size: 330
Number of disease alleles: 1000
Sample size: 5000 cases/controls
Allele frequency: 1/10,000
P-value cutoff: 10^-4

Opportunities: Reduced Haplotypic Diversity Chromosomes in the sample Full sequence Partial sequence (SNP array, low-coverage sequence) Observed data Imputation Inferred sequence Problem: The Ashkenazi population is missing a reference panel of complete sequences

Opportunities: Personal Genomics in AJ Personal clinical genomics is here But genomes are hard to interpret Problem: The Ashkenazi population is missing a reference panel of complete sequences

The Documented Ashkenazi History Ca. 1000: Small communities in Northern France, Rhineland Migration east Expansion Migration to US and Israel

Ashkenazi History: Questions Origin? Founder event? European gene flow: o Where? o When? o How much? Relation to other Jews? Whole genomes?

Outline Ashkenazi Jewish Genetics: Background The Ashkenazi Genome Sequencing Project Segment Sharing and the Founder Event Future Directions

The Ashkenazi Genome Consortium NY area labs interested in specific diseases Quantify utility and use in medical genetics Learn about population history Phase I: 128 whole genomes (completed*) Phase II: ≈500 whole genomes (NYGC; under way) Large cohorts of AJ cases * Carmi et al., Nat Commun, 2014

Technical Details
QC measure: Genome (exome)
Coverage: ≈56x
Fraction called: 96.7±0.3% (98.1%)
Concordance with arrays: 99.67±0.25%
Ti/Tv ratio: 2.14±0.004 (3.05)
Samples: Controls of Parkinson’s, longevity studies o Some phenotypes exist o Ashkenazi ancestry verified
Platform: Complete Genomics o Uniform QC measures
QC: Remove indels, poly-allelic variants, Hardy-Weinberg violations, low call rate
Error rate estimates using a duplicate and runs of homozygosity [Figure labels: hets, roh]
Error rate after QC: ≈1.7∙10^-6 per base pair
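The Ti/Tv ratio in the QC table is the number of transitions (A<->G, C<->T) divided by the number of transversions. A minimal sketch of that calculation follows; the function names and the toy variant list are hypothetical, and this is not the consortium's actual QC pipeline.

```python
# Hypothetical helper names; illustrative only, not the project's QC code.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True when both bases are purines or both are pyrimidines."""
    pair = {ref.upper(), alt.upper()}
    return pair <= PURINES or pair <= PYRIMIDINES


def ti_tv_ratio(snvs):
    """Compute Ti/Tv from an iterable of (ref, alt) single-base changes."""
    ti = tv = 0
    for ref, alt in snvs:
        if ref.upper() == alt.upper():
            continue  # not a variant; skip
        if is_transition(ref, alt):
            ti += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions; ratio is undefined")
    return ti / tv


# Toy input (made up): 4 transitions and 2 transversions.
example = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("T", "G"), ("T", "C")]
print(ti_tv_ratio(example))  # 2.0
```

A genome-wide value close to the 2.14 reported above is commonly read as a sign that the call set is not dominated by random errors, which would pull the ratio toward 0.5.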

Sequencing Statistics (Raw Data)
Statistic: Per genome (exome)
SNVs: 3.4M (22k)
Novel SNVs: 3.8% (4.1%)
Het/hom ratio: 1.65 (1.67)
Insertions: 220k (242)
Deletions: 235k (223)
Multi-nucleotide variants: 83k (374)
Synonymous SNVs: 10,536
Non-synonymous SNVs: 9706
Nonsense SNVs: 72
Other disrupting: 255
CNVs: 302
SVs: 1480
MEIs: 4090

Comparison to Europeans Main comparison panel: 26 Flemish from Belgium (platform-matched) Novel variants per genome (%) (dbSNP) Population-specific variants (25x25 genomes) Carmi et al., Nat Commun, 2014 Variants per genome

Variant Discovery Rate Heterozygosity paradox? Variants per genome Overall discovered variants

AJ Clinical Genomics An Ashkenazi reference panel filters more likely-benign variants from an AJ genome than a European panel Carmi et al., Nat Commun, 2014

Imputation in AJ An Ashkenazi reference panel improves imputation accuracy of AJ SNP arrays compared to the standard European panel Rare variants (≤1%) accuracy: 87% vs 65% Carmi et al., Nat Commun, 2014 [Figure axis: correlation between imputed and real genotypes]
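One common way to score imputation is the correlation between imputed allele dosages and the true genotypes at masked sites. The sketch below illustrates that idea with made-up numbers. The slide does not define how its 87% and 65% figures were computed, so the metric here is an assumption, and all names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined without variation")
    return sxy / sqrt(sxx * syy)


# Toy data (made up): true genotypes coded 0/1/2 copies of the alternate
# allele, and imputed dosages between 0 and 2 for the same individuals.
true_genotypes = [0, 0, 1, 2, 0, 1, 0, 0]
imputed_dosages = [0.1, 0.0, 0.9, 1.8, 0.2, 1.1, 0.0, 0.1]

r = pearson_r(true_genotypes, imputed_dosages)
print(round(r ** 2, 3))  # squared correlation, often reported as r^2
```

Computing this per variant and averaging within allele-frequency bins (for example, variants at or below 1%) is one way to compare two reference panels on the same masked genotypes.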

Improving Carrier Screening Databases of disease-causing mutations (OMIM/ClinVar) Exist in AJ but not common TAGC genomes A new panel with ≈170 mutations Mendelian disorders Predisposition Assessment of clinical validity and utility 1000 Genomes Baskovich et al., submitted

Low Frequency CNV in Tumors Sequence coverage can identify high frequency CNVs o But not low frequency? Resolve parental haplotypes o Using TAGC genomes and a hidden Markov model [Figure panels: 30% deletion; 2% deletion] Backenroth et al., in preparation

Other Medical Genetics Studies Our consortium: o Association studies: schizophrenia, Parkinson’s, Crohn’s, longevity, cancer Other groups: o Data available on EGA o Clinical frequency lookups (retinal degeneration, epilepsy, …) o Population genetics, imputation, …

Mutation Burden in AJ Do AJ have more deleterious mutations than Europeans? Enrichment is ≈0.5-1% (P>0.01) No disease category is significantly enriched Carmi et al., Nat Commun, 2014 [Figure: fraction of variants per genome; bar labels 0.45%, 0.46%, 0.51%, 0.52%, 1.31%, 1.33%, 1.32%]

Principal Component Analysis (PCA) Price et al., 2008; Olshen et al., 2008; Need et al., 2009; Kopelman et al., 2009; Atzmon et al., 2010; Behar et al., 2010; Bray et al., 2010; Guha et al., 2012; Behar et al., 2014; Carmi et al., 2014; O’Connor et al., 2015 [Figure labels: Ashkenazi Jews (TAGC); Sephardi Jews (Italy, Turkey); Middle East: Druze, Palestinians, Bedouins; Europe: Sardinians, Tuscans, Italians, Basque, French, Flemish]

The Documented Ashkenazi History Origin? Founder event? European gene flow: o Where? o When? o How much? Relation to other Jews?

A Model for Ancient History Out-of-Africa Middle East European gene flow into AJ 25x25 genomes Carmi et al., Nat Commun, 2014

The Documented Ashkenazi History Origin? Founder event? European gene flow: o Where? o When? o How much? Relation to other Jews?

Outline Ashkenazi Jewish Genetics: Background The Ashkenazi Genome Sequencing Project Segment Sharing and the Founder Event Future Directions

Genetic Segment Sharing Shared segments Shared segment k Siblings

Genetic Segment Sharing Shared segment Time

Importance Segments are rare but long, hence observable A segment indicates recent co-ancestry
Methods and theory: Shared segment detection o Gusev et al., 2009 o Yang, Carmi, et al., 2015 Disease mapping o Gusev et al., 2011, 2012 Pedigree reconstruction o Henn et al., 2012 Demographic inference o Palamara et al., 2012, 2013 o Carmi et al., 2013, 2014
Population histories: Ashkenazi Jews o Gusev et al., 2012 o Carmi et al., 2014 Other Jews o Atzmon et al., 2010 o Campbell et al., 2012 Druze o Zidan, Ben-Avraham, Carmi, et al., 2014 Netherlands o Francioli et al., 2014
Disease/trait mapping: Cholesterol, Micronesia o Kenny et al., 2009, 2010 Parkinson’s, AJ o Vacic et al., 2014 Schizophrenia, AJ o Mukherjee et al., 2014

Segment Sharing Theory Model: o A population with a constant effective size N o Two chromosomes of length L (Morgans) o A minimal segment length m (Morgans) The number of shared segments, n_m? The fraction of the chromosome in shared segments, f_m? [Figure: a chromosome of length L with shared segments of lengths ℓ1, ℓ2, ℓ3 and minimal length m]
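For the model above, a textbook coalescent approximation gives a closed form for the expected fraction f_m. The sketch below rests on several assumptions that are not stated on the slide: N is a diploid effective size (so two chromosomes coalesce at rate 1/(2N) per generation), the shared segment around a locus whose common ancestor lived t generations ago has an Erlang-2 length with rate 2t (in Morgans), and chromosome-end effects from the finite length L are ignored. Under those assumptions the expected fraction is (1 + 8mN) / (1 + 4mN)^2. This may differ in detail from the results shown in the talk. The code checks the closed form against a direct numerical integral.

```python
from math import exp


def prob_segment_longer_than(m, t):
    """P(segment around a locus exceeds m Morgans | common ancestor t
    generations ago), assuming an Erlang-2 length with rate 2t."""
    return (1 + 2 * t * m) * exp(-2 * t * m)


def expected_fraction_closed_form(N, m):
    """Closed form under the assumptions stated in the text above."""
    return (1 + 8 * m * N) / (1 + 4 * m * N) ** 2


def expected_fraction_numeric(N, m, steps=200_000):
    """Midpoint-rule integral of coalescence density times P(length > m | t)."""
    t_max = 80.0 * N  # far beyond the 2N-generation coalescence time scale
    dt = t_max / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        density = exp(-t / (2 * N)) / (2 * N)
        total += density * prob_segment_longer_than(m, t) * dt
    return total


# Hypothetical values for illustration: N = 1000 diploids, m = 3 cM.
N, m = 1000, 0.03
closed = expected_fraction_closed_form(N, m)
numeric = expected_fraction_numeric(N, m)
print(closed, numeric)
```

The two values agree closely, and the closed form falls roughly as 1/(2mN) for large mN, so smaller populations share more of the genome in long segments.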

Results overview Palamara et al., 2012; Carmi et al., 2013; Carmi et al., Theor Popul Biol, 2014

Demographic Inference Palamara et al., 2012 Method: Record shared segments in each length bin Using Eq. (1), find the history N(t) that fits best Hypothetical example
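The fitting idea can be illustrated with a deliberately simplified version: restrict the history to a constant size N, predict the fraction of the genome shared in each length bin, and pick the N that minimizes the squared error against the observed fractions. The per-bin prediction below uses the constant-size approximation f(m) = (1 + 8mN) / (1 + 4mN)^2 for a diploid N. That formula is an assumption standing in for the slide's Eq. (1), which is not reproduced in this transcript, and every name and number is hypothetical.

```python
def frac_longer(N, m):
    """Assumed expected genome fraction in segments longer than m Morgans."""
    return (1 + 8 * m * N) / (1 + 4 * m * N) ** 2


def expected_bin_fractions(N, edges):
    """Expected fraction of the genome in segments within each [lo, hi) bin."""
    return [frac_longer(N, lo) - frac_longer(N, hi)
            for lo, hi in zip(edges[:-1], edges[1:])]


def observed_bin_fractions(segment_lengths, edges, n_pairs, genome_length):
    """Bin detected segments (Morgans) and convert totals to per-pair fractions."""
    totals = [0.0] * (len(edges) - 1)
    for length in segment_lengths:
        for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
            if lo <= length < hi:
                totals[i] += length
                break
    return [total / (n_pairs * genome_length) for total in totals]


def fit_constant_size(observed, edges, candidates):
    """Grid search: return the candidate N with the smallest squared error."""
    def squared_error(N):
        expected = expected_bin_fractions(N, edges)
        return sum((o - e) ** 2 for o, e in zip(observed, expected))
    return min(candidates, key=squared_error)


edges = [0.03, 0.05, 0.10, 0.20]  # bin edges in Morgans (3, 5, 10, 20 cM)

# Binning toy segments (made up) for 10 pairs and a 35-Morgan genome.
toy_observed = observed_bin_fractions([0.04, 0.06, 0.031, 0.15, 0.25],
                                      edges, n_pairs=10, genome_length=35.0)

# Synthetic check: generate fractions from N = 2000 and recover that value.
synthetic = expected_bin_fractions(2000, edges)
best_N = fit_constant_size(synthetic, edges, range(500, 5001, 100))
print(best_N)  # 2000
```

A realistic analysis would replace the single parameter N with a parametrized history N(t), such as a bottleneck followed by expansion, and would account for sampling noise in each bin.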

Segment Sharing in Ashkenazi Jews Gusev et al., 2012; Atzmon et al., 2010; Bray et al., 2010 [Figure labels: AJ, EU; % Sharing]

Segment Sharing in Ashkenazi Jews Carmi et al., Nat. Commun., 2014 See also: Atzmon et al., 2010, Gusev et al., 2012, Palamara et al., 2012 A pair of AJ individuals shares 1-2% of their genome (≈50cM) in ≈10-15 long segments (>3cM)

Segment Sharing in Ashkenazi Jews Time (years) Carmi et al., Nat. Commun., 2014 See also: Atzmon et al., 2010, Gusev et al., 2012, Palamara et al., 2012

Robustness Potential confounders: o Phasing, sequencing, and segment detection errors o Model specification and assumptions Good resolution only for ≈10-50 generations ago [Table columns: Parameter, 95% confidence interval; rows: Bottleneck size, Bottleneck time (years)] Results consistent with previous studies Time confirmed using lengths of haplotypes around doubletons o Mathieson and McVean, 2014

Media Coverage

Ashkenazi History Origin? Founder event? European gene flow: o Where? o When? o How much? Relation to other Jews?

The Place and Time of European Gene Flow “Most of these theories … are myths or speculation … based on some vague or misunderstood references. … It will probably be impossible to say definitely where the... Jews in Poland … came from.” B. Weinryb, The Jews of Poland, 1972

Approach Johnson et al., 2011; Moreno-Estrada et al., 2013 [Figure: PCA plots (PC1 vs. PC2) of European (EU), Middle-Eastern (ME), and Ashkenazi (AJ) samples, alongside an Ashkenazi genome]

Preliminary Results Used European and Middle-Eastern SNP array reference data Origin in the Levant Gene flow predominantly from South Europe o Some from East Europe o ≈30-40 generations ago Sex-imbalanced history?

Outline Ashkenazi Jewish Genetics: Background The Ashkenazi Genome Sequencing Project Segment Sharing and the Founder Event Future Directions

Coverage by Shared Segments A sequenced reference panel Partly sequenced genome Impute What fraction of the genome can we cover with segments shared with the panel? Full sequence Partial sequence Inferred sequence
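The coverage question can be made concrete as an interval-union calculation: given the segments a new genome shares with any panel member, what fraction of the genome falls inside at least one of them? The sketch below assumes the segments are already detected and given as (start, end) coordinates on a single linear map. The function name and the numbers are hypothetical.

```python
def covered_fraction(segments, genome_length):
    """Fraction of [0, genome_length] covered by the union of the segments.

    segments: iterable of (start, end) pairs in the same units as
    genome_length (for example cM). Overlaps are counted once.
    """
    clipped = sorted(
        (max(0.0, start), min(genome_length, end))
        for start, end in segments
        if end > start
    )
    covered = 0.0
    current_start = current_end = None
    for start, end in clipped:
        if end <= start:
            continue  # segment lies entirely outside the genome
        if current_end is None or start > current_end:
            # Close the previous merged interval and open a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or touching: extend the merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length


# Toy example (made up): two overlapping pairs of segments on a
# 100-unit genome cover [0, 20] and [30, 45].
print(covered_fraction([(0, 10), (5, 20), (30, 40), (38, 45)], 100))  # 0.35
```

Applying this per chromosome and summing the covered lengths gives a genome-wide coverage figure for a given panel size and minimum segment length.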

Coverage by Shared Segments: AJ Phase II Mine public data? Other studies? See Carmi et al., 2013 for a theoretical analysis Segments >3cM

The Era of Near-Complete Coverage Now Phase II Mine public data? Other studies? Segments >3cM Every locus in a new genome has a fully sequenced “relative” Opportunities: o Interpretation of personal genomes o Cost-effectively implementing large-scale association studies o Historical inference Methods to be developed!

Summary Ashkenazi genetics is interesting We sequenced 128 whole-genomes Useful for personal genomics and imputation Segment sharing reveals a founder event and suggests opportunities My research statement

Acknowledgements
Funding:
Itsik Pe’er’s lab: James Xue, Ethan Kochav, Shuo Yang, Pier Palamara, Vladimir Vacic
TAGC consortium members: Todd Lencz, Semanti Mukherjee (LIJMC); Lorraine Clark, Xinmin Liu (CUMC); Gil Atzmon, Harry Ostrer, Carole Oddoux, Brett Baskovich, Danny Ben-Avraham (AECOM); Inga Peter, Judy Cho (ISMMS); Ariel Darvasi (HUJI); Joseph Vijai (MSKCC); Ken Hui (Yale); VIB Ghent, Belgium
Harvard University: Peter Wilton, John Wakeley
Sheba Medical Center: Eitan Friedman
Columbia University Medical Center: Daniel Backenroth, Yufeng Shen
Thank you for your attention!