Biometrical genetics Pak C Sham The University of Hong Kong Manuel AR Ferreira Queensland Institute for Medical Research 23 rd International Workshop on.

Slides:



Advertisements
Similar presentations
Introduction to QTL mapping
Advertisements

Genetic Linkage and Recombination
The next generation Chapters 9, 10, 17 in the course textbook, especially pages , ,
Lecture 39 Prof Duncan Shaw. Meiosis and Recombination Chromosomes pair upDNA replication Chiasmata form Recombination 1st cell division 2nd cell divisionGametes.
Genetic Inheritance & Variation
Qualitative and Quantitative traits
Tutorial #1 by Ma’ayan Fishelson
Review Mendel’s “rules of the game”
Genetics notes For makeup. A gene is a piece of DNA that directs a cell to make a certain protein. –Homozygous describes two alleles that are the same.
1 Statistical Considerations for Population-Based Studies in Cancer I Special Topic: Statistical analyses of twin and family data Kim-Anh Do, Ph.D. Associate.
Practical H:\ferreira\biometric\sgene.exe. Practical Aim Visualize graphically how allele frequencies, genetic effects, dominance, etc, influence trait.
Intro to Quantitative Genetics HGEN502, 2011 Hermine H. Maes.
Heredity and Evolution
Outline: 1) Basics 2) Means and Values (Ch 7): summary 3) Variance (Ch 8): summary 4) Resemblance between relatives 5) Homework (8.3)
Basics of Linkage Analysis
Linkage Analysis: An Introduction Pak Sham Twin Workshop 2001.
Linkage analysis: basic principles Manuel Ferreira & Pak Sham Boulder Advanced Course 2005.
Quantitative Genetics Up until now, we have dealt with characters (actually genotypes) controlled by a single locus, with only two alleles: Discrete Variation.
Quantitative Genetics Theoretical justification Estimation of heritability –Family studies –Response to selection –Inbred strain comparisons Quantitative.
Biometrical genetics Manuel Ferreira Shaun Purcell Pak Sham Boulder Introductory Course 2006.
Biometrical genetics Manuel Ferreira Shaun Purcell Pak Sham Boulder Introductory Course 2006.
Estimating “Heritability” using Genetic Data David Evans University of Queensland.
Genetic Theory Manuel AR Ferreira Egmond, 2007 Massachusetts General Hospital Harvard Medical School Boston.
Quantitative Genetics
Introduction to Linkage Analysis March Stages of Genetic Mapping Are there genes influencing this trait? Epidemiological studies Where are those.
Introduction to Linkage
Reminder - Means, Variances and Covariances. Covariance Algebra.
Biometrical Genetics Pak Sham & Shaun Purcell Twin Workshop, March 2002.
Genetic Theory - Overview Pak Sham International Twin Workshop Boulder, 2005.
Linkage Analysis in Merlin
PBG 650 Advanced Plant Breeding
Linkage and LOD score Egmond, 2006 Manuel AR Ferreira Massachusetts General Hospital Harvard Medical School Boston.
Quantitative Trait Loci, QTL An introduction to quantitative genetics and common methods for mapping of loci underlying continuous traits:
Broad-Sense Heritability Index
A gene is composed of strings of bases (A,G, C, T) held together by a sugar phosphate backbone. Reminder - nucleotides are the building blocks.
Non-Mendelian Genetics
Genetic Theory Manuel AR Ferreira Boulder, 2007 Massachusetts General Hospital Harvard Medical School Boston.
Introduction to Linkage Analysis Pak Sham Twin Workshop 2003.
PBG 650 Advanced Plant Breeding
Whole genome approaches to quantitative genetics Leuven 2008.
1 Population Genetics Basics. 2 Terminology review Allele Locus Diploid SNP.
Regression-Based Linkage Analysis of General Pedigrees Pak Sham, Shaun Purcell, Stacey Cherny, Gonçalo Abecasis.
What is genetics? 01. Genetics is the study of inherited traits.
ABC for the AEA Basic biological concepts for genetic epidemiology Martin Kennedy Department of Pathology Christchurch School of Medicine.
Tutorial #10 by Ma’ayan Fishelson. Classical Method of Linkage Analysis The classical method was parametric linkage analysis  the Lod-score method. This.
Lecture 21: Quantitative Traits I Date: 11/05/02  Review: covariance, regression, etc  Introduction to quantitative genetics.
Genetic Theory - Overview Pak Sham International Twin Workshop Boulder, 2005.
Genetic Theory Pak Sham SGDP, IoP, London, UK. Theory Model Data Inference Experiment Formulation Interpretation.
Epistasis / Multi-locus Modelling Shaun Purcell, Pak Sham SGDP, IoP, London, UK.
Principles of Mendelian Genetics B-4.6. Principles of Mendelian Genetics Genetics is the study of patterns of inheritance and variations in organisms.
Meiosis 15 October, 2004 Text Chapter 13. In asexual reproduction, individuals give rise to genetically identical offspring (clones). All cell division.
Mendelian Genetics AP Biology Unit 3 Mendel’s Experiments Crossbred Pea Plants P, F1, F2 generations.
Powerful Regression-based Quantitative Trait Linkage Analysis of General Pedigrees Pak Sham, Shaun Purcell, Stacey Cherny, Gonçalo Abecasis.
Introduction to Genetic Theory
Genetic principles for linkage and association analyses Manuel Ferreira & Pak Sham Boulder, 2009.
Biometrical Genetics Shaun Purcell Twin Workshop, March 2004.
 Structural genes: genes that contain the information to make a protein.  Regulatory genes: guide the expression of structural genes, without coding.
Biometrical genetics Manuel AR Ferreira Boulder, 2008 Massachusetts General Hospital Harvard Medical School Boston.
Genetic Theory Manuel AR Ferreira Boulder, 2007 Massachusetts General Hospital Harvard Medical School Boston.
13/11/
PLANT BIOTECHNOLOGY & GENETIC ENGINEERING (3 CREDIT HOURS)
Biometrical model and introduction to genetic analysis
Pak Sham & Shaun Purcell Twin Workshop, March 2002
Meiosis and Sexual Life Cycles
Mendelian Inheritance
Genetics: Inheritance
Unit 5: Heredity Review Lessons 1, 3, 4 & 5.
Introduction to Biometrical Genetics {in the classical twin design}
Power Calculation for QTL Association
Presentation transcript:

Biometrical genetics Pak C Sham The University of Hong Kong Manuel AR Ferreira Queensland Institute for Medical Research 23 rd International Workshop on Methodology of Twin and Family Studies 2 nd March, 2010

Y1Y1 A1A1 D1D1 E1E1 Y2Y2 A2A2 D2D2 E2E2 1/0.25 1/0.5 eaceca ADE Model for twin data MZ/DZ

Biometrical Genetics

Outline 1. Basic molecular genetics 2. Components of genetic model 3. Biometrical properties for single locus 4. Introduction to linkage analysis

1. Basic molecular genetics

A DNA molecule is a linear backbone of alternating sugar residues and phosphate groups Attached to carbon atom 1’ of each sugar is a nitrogenous base: A, C, G or T DNA A gene is a segment of DNA which is translated to a peptide chain Complementarity: A always pairs with T, likewise C with G nucleotide

23 chromosome pairs 22 autosomes, X,Y ~ 33,000,000,000 base pairs Human genome Other functional sequences non-translated RNA binding sites for regulatory molecules ~ 25,000 translated “genes”

DNA sequence variation (polymorphisms) A B Microsatellites >100,000 Many alleles, eg. (CA) n repeats, very informative, easily automated Single nucleotide polymorphims (SNPs) 11,883,685 (build 128, 03 Mar ‘08) Most with 2 alleles (up to 4), not very informative, easily automated Copy Number polymorphisms Large-scale insertions / deletions A

Haploid gametes ♁ ♂ ♂ ♁ G 1 phase chr1 S phase Diploid zygote 1 cell M phase Diploid somatic cells ♁ ♂ ♁ A - B - A - B - A - B - A - B - A - B - A - B - ♂ ♁ A - B - A - B - ♂ ♁ - A - B - A - B - A - B - A - B Fertilization and mitotic cell division Mitosis (22 + 1)

Diploid gamete precursor cell (♂)(♁) (♂) (♁) Haploid gamete precursors Hap. gametes NR R R ♁ A - B - - A - B A - B - - A - B A - B - - A - B A - B - - A - B ♂ ♁ A - B - A - B - - A - B - A - B Meiosis / Gamete formation Meiosis 2 (22 + 1) chr1

2. Components of genetic model

Segregation (Meiosis) Mendel’s law of segregation A3 (½)A3 (½)A4 (½)A4 (½) A1 (½)A1 (½) A2 (½)A2 (½) Mother (A 3 A 4 ) A1A3 (¼)A1A3 (¼) A2A3 (¼)A2A3 (¼) A1A4 (¼)A1A4 (¼) A2A4 (¼)A2A4 (¼) Gametes Father (A 1 A 2 ) A. Transmission model Offspring Note: 50:50 segregation can be distorted (“meiotic drive”)

Segregation (Meiosis) B1 (½)B1 (½) B2 (½)B2 (½) A1 (½)A1 (½) A2 (½)A2 (½) Locus B (B 1 B 2 ) A 1 B 1 (1/4) A 2 B 1 (1/4) A 1 B 2 (1/4) A 2 B 2 (1/4) Locus A (A 1 A 2 ) A. Transmission model: two unlinked loci Gametes Phase: A 1 B 1 / A 2 B 2

Segregation (Meiosis) B1 (½)B1 (½) B2 (½)B2 (½) A1 (½)A1 (½) A2 (½)A2 (½) Locus B (B 1 B 2 ) A 1 B 1 ((1-  )/2) A 2 B 1 (  /2) A 1 B 2 (  /2) A 2 B 2 ((1-  )/2) Locus A (A 1 A 2 ) A. Transmission model: two linked loci Gametes Phase: A 1 B 1 / A 2 B 2  : Recombination fraction, between 0 (complete linkage) and 1/2 (free recombination)

B: Population model Frequencies aaAA Aa aa Genotype frequencies: AA: P Aa: Q aa: R Allele frequencies: A: P+Q/2 a: R+Q/2

B: Population model AAAaaa AA AA:Aa 0.5:0.5 Aa AA:Aa 0.5:0.5 AA:Aa:aa 0.25:0.5:0.25 Aa:aa 0.5:0.5 aaAaAa:aa 0.5:0.5 aa AAAaaa AAP2P2 PQPR AaPQQ2Q2 QR aaPRQRR2R2 Hardy-Weinberg Equilibrium (Hardy GH, 1908; Weinberg W, 1908) Random mating Offspring genotypic distribution

B: Population model Hardy-Weinberg Equilibrium (Hardy GH, 1908; Weinberg W, 1908) GenotypeFrequency AAP 2 +PQ+Q 2 /4 = (P+Q/2) 2 Aa2PR+PQ+QR+Q 2 /2 = 2(P+Q/2)(R+Q/2) aaR2+QR+Q 2 /4 = (R+Q/2) 2 Offspring genotype frequencies AlleleFrequency A(P+Q/2) 2 + (P+Q/2)(R+Q/2) = P+Q/2 a(R+Q/2) 2 + (P+Q/2)(R+Q/2) = R+Q/2 Offspring allele frequencies

B. Population model Panmixia (Random union of gametes) A (p)a (q) A (p) a (q) Maternal allele Paternal allele AA (p 2 ) aA (qp) Aa (pq) aa (q 2 ) P (AA) = p 2 P (Aa) = 2pq P (aa) = q 2 Deviations from HWE Assortative mating Imbreeding Population stratification Selection

C. Phenotype model Classical Mendelian (Single-gene) traits Dominant trait - AA, Aa1 - aa0 Recessive trait - AA1 - aa, Aa0 Cystic fibrosis 3 bp deletion exon 10 CFTR gene Huntington’s disease (CAG)n repeat, huntingtin gene

C. Phenotype model 1 Gene  3 Genotypes  3 Phenotypes 2 Genes  9 Genotypes  5 Phenotypes 3 Genes  27 Genotypes  7 Phenotypes 4 Genes  81 Genotypes  9 Phenotypes Central Limit Theorem  Normal Distribution Polygenic model

Quantitative traits AA Aa aa C. Phenotype model e.g. cholesterol levels

d +a P(X)P(X) X AA Aa aa – a Fisher’s model for single quantitative trait locus (QTL) Genotypic effects D. Phenotype model 2pq p2p2 q2q2 Genotype frequencies Assumption: Effect of allele independent of parental origin (Aa = aA) Violated in genomic imprinting

3. Biometrical properties of single locus

Biometrical model for single biallelic QTL 1. Contribution of the QTL to the Mean (X) aa Aa AA Genotypes Frequencies, f(x) Effect, x p2p2 2pqq2q2 a d-a = m = a(p 2 ) + d(2pq) – a(q 2 )Mean (X)= (p-q)a + 2pqd Note: If everyone in population has genotype aa then population mean = -a  change in mean due to A = ( (p-q)a + 2pqd) – (-a) = 2p(a+qd)

Biometrical model for single biallelic QTL 2. Contribution of the QTL to the Variance (X) aa Aa AA Genotypes Frequencies, f(x) Effect, x p2p2 2pqq2q2 a d-a = (a-m) 2 p 2 + (d-m) 2 2pq + (-a-m) 2 q 2 Var (X) = 2 pq ( a +( q - p ) d ) 2 + ( 2pqd )2 = V QTL Broad-sense heritability of X at this locus = V QTL / V Total

Biometrical model for single biallelic QTL 2. Partitioning of QTL variance: additive component A (p)a (q) A (p) a (q) Maternal allele Paternal allele a d d -a Average pa+qdpa+qd pd-qapd-qa pa+qdpa+qd pd-qapd-qa (p-q)a + 2pqd Variance due to a single allele = p(q(d+a)-2pqd) 2 +q(p(d-a)-2pqd) 2 = pq(a+(q-p)d) 2 For both alleles, additive variance = 2 pq ( a +( q - p ) d ) 2

Biometrical model for single biallelic QTL 2. Partitioning of QTL variance: dominance component Effect Additive effect AA (p 2 ) Aa (2pq) a d 2(pa+qd) (pa+qd)+(pd-qa)(pa+qd)+(pd-qa) -a 2(pd-qa) Dominance variance due to QTL = p 2 (a- 2(pa+qd)) 2 + 2pq ( d -(pa+qd+pd-qa)) 2 + q 2 (- a -2(pd-qa) = (2 pqd ) 2 Genotype aa (q2)

Biometrical model for single biallelic QTL Var (X) = Regression Variance + Residual Variance -a 0 a aa Aa AAaa Aa AAaa Aa AA AdditiveDominant Recessive Genotypic effects = Additive Variance + Dominance Variance = V A QTL + V D QTL

aa Aa AA aa Aa AA log (x) No departure from additivity Significant departure from additivity Statistical definition of dominance is scale dependent

Practical H:\ferreira\biometric\sgene.exe

Practical Aim Visualize graphically how allele frequencies, genetic effects, dominance, etc, influence trait mean and variance Ex1 a=0, d=0, p=0.4, Residual Variance = 0.04, Scale = 2. Vary a from 0 to 1. Ex2 a=1, d=0, p=0.4, Residual Variance = 0.04, Scale = 2. Vary d from -1 to 1. Ex3 a=1, d=0, p=0.4, Residual Variance = 0.04, Scale = 2. Vary p from 0 to 1. Look at scatter-plot, histogram and variance components.

Some conclusions 1.Additive genetic variance depends on allele frequency p & additive genetic value a as well as dominance deviation d 2.Additive genetic variance typically greater than dominance variance

Biometrical model for single biallelic QTL 3. Contribution of the QTL to the Covariance (X,Y) 2. Contribution of the QTL to the Variance (X) 1. Contribution of the QTL to the Mean (X)

Biometrical model for single biallelic QTL AA Aa aa AA Aaaa (a-m)(a-m) (d-m)(d-m) (-a-m) (a-m)(a-m) (d-m)(d-m) (a-m)2(a-m)2 (a-m)(a-m) (d-m)(d-m) (a-m)(a-m) (d-m)2(d-m)2 (d-m)(d-m) (-a-m) 2 3. Contribution of the QTL to the Cov (X,Y)

Biometrical model for single biallelic QTL AA Aa aa AA Aaaa (a-m)(a-m) (d-m)(d-m) (-a-m) (a-m)(a-m) (d-m)(d-m) (a-m)2(a-m)2 (a-m)(a-m) (d-m)(d-m) (a-m)(a-m) (d-m)2(d-m)2 (d-m)(d-m) (-a-m) 2 p2p pq 0 q2q2 3A. Contribution of the QTL to the Cov (X,Y) – MZ twins = (a-m) 2 p 2 + (d-m) 2 2pq + (-a-m) 2 q 2 Cov(X,Y) = V A QTL + V D QTL = 2pq[a+(q-p)d] 2 + (2pqd) 2

Biometrical model for single biallelic QTL AA Aa aa AA Aaaa (a-m)(a-m) (d-m)(d-m) (-a-m) (a-m)(a-m) (d-m)(d-m) (a-m)2(a-m)2 (a-m)(a-m) (d-m)(d-m) (a-m)(a-m) (d-m)2(d-m)2 (d-m)(d-m) (-a-m) 2 p3p3 p2qp2q 0 pq pq 2 q3q3 3B. Contribution of the QTL to the Cov (X,Y) – Parent-Offspring

e.g. given an AA father, an AA offspring can come from either AA x AA or AA x Aa parental mating types AA x AA will occur p 2 × p 2 = p 4 and have AA offspring Prob()=1 AA x Aa will occur p 2 × 2pq = 2p 3 q and have AA offspring Prob()=0.5 and have Aa offspring Prob()=0.5 Therefore, P(AA father & AA offspring) = p 4 + p 3 q = p 3 (p+q) = p 3

Biometrical model for single biallelic QTL AA Aa aa AA Aaaa (a-m)(a-m) (d-m)(d-m) (-a-m) (a-m)(a-m) (d-m)(d-m) (a-m)2(a-m)2 (a-m)(a-m) (d-m)(d-m) (a-m)(a-m) (d-m)2(d-m)2 (d-m)(d-m) (-a-m) 2 p3p3 p2qp2q 0 pq pq 2 q3q3 = (a-m) 2 p 3 + … + (-a-m) 2 q 3 Cov (X,Y) = ½V A QTL = pq[a+(q-p)d] 2 3B. Contribution of the QTL to the Cov (X,Y) – Parent-Offspring

Biometrical model for single biallelic QTL AA Aa aa AA Aaaa (a-m)(a-m) (d-m)(d-m) (-a-m) (a-m)(a-m) (d-m)(d-m) (a-m)2(a-m)2 (a-m)(a-m) (d-m)(d-m) (a-m)(a-m) (d-m)2(d-m)2 (d-m)(d-m) (-a-m) 2 p4p4 2p 3 q p2q2p2q2 4p 2 q 2 2pq 3 q4q4 = (a-m) 2 p 4 + … + (-a-m) 2 q 4 Cov (X,Y) = 0 3C. Contribution of the QTL to the Cov (X,Y) – Unrelated individuals

Biometrical model for single biallelic QTL Cov (X,Y) 3D. Contribution of the QTL to the Cov (X,Y) – DZ twins and full sibs ¼ genome ¼ (2 alleles) + ½ (1 allele) + ¼ (0 alleles) MZ twins P-O Unrelateds ¼ genome # identical alleles inherited from parents 0 1 (mother) 1 (father) 2 = ¼ Cov(MZ) + ½ Cov(P-O) + ¼ Cov(Unrel) = ¼(V A QTL +V D QTL ) + ½ (½ V A QTL ) + ¼ (0) = ½ V A QTL + ¼V D QTL

Summary so far…

Biometrical model predicts contribution of a QTL to the mean, variance and covariances of a trait Var (X) = V A QTL + V D QTL Cov (MZ) = V A QTL + V D QTL Cov (DZ) = ½V A QTL + ¼V D QTL On average! 0 or 10, 1/2 or 1 IBD estimation / Linkage Mean (X) = a(p-q) + 2pqd Association analysis Linkage analysis For a sib-pair, do the two sibs have 0, 1 or 2 alleles in common?

4. Introduction to Linkage Analysis

For a heritable trait... localize region of the genome where a QTL that regulates the trait is likely to be harboured identify a QTL that regulates the trait Linkage: Association: Family-specific phenomenon: Affected individuals in a family share the same ancestral predisposing DNA segment at a given QTL Population-specific phenomenon: Affected individuals in a population share the same ancestral predisposing DNA segment at a given QTL

Linkage Analysis: Parametric vs. Nonparametric QM Phe A D C E Genetic factors Environmental factors Mode of inheritance Recombination Correlation Chromosome Gene Adapted from Weiss & Terwilliger 2000 Dominant trait 1 - AA, Aa 0 - aa

Approach Parametric: genotypes marker locus & genotypes trait locus (latter inferred from phenotype according to a specific disease model) Parameter of interest: θ between marker and trait loci Nonparametric: genotypes marker locus & phenotype If a trait locus truly regulates the expression of a phenotype, then two relatives with similar phenotypes should have similar genotypes at a marker in the vicinity of the trait locus, and vice-versa. Interest: correlation between phenotypic similarity and marker genotypic similarity No need to specify mode of inheritance, allele frequencies, etc...

Phenotypic similarity between relatives Squared trait differences Squared trait sums Trait cross-product Trait variance-covariance matrix Affection concordance T2T2 T1T1

Genotypic similarity between relatives IBS Alleles shared Identical By State “look the same”, may have the same DNA sequence but they are not necessarily derived from a known common ancestor IBD Alleles shared Identical By Descent are a copy of the same ancestor allele M1M1 Q1Q1 M2M2 Q2Q2 M3M3 Q3Q3 M3M3 Q4Q4 M1M1 Q1Q1 M3M3 Q3Q3 M1M1 Q1Q1 M3M3 Q4Q4 M1M1 Q1Q1 M2M2 Q2Q2 M3M3 Q3Q3 M3M3 Q4Q4 IBS IBD 2 1 Inheritance vector (M)

Genotypic similarity between relatives - M1M1 Q1Q1 M3M3 Q3Q3 M2M2 Q2Q2 M3M3 Q4Q4 Number of alleles shared IBD 0 M1M1 Q1Q1 M3M3 Q3Q3 M1M1 Q1Q1 M3M3 Q4Q4 1 M1M1 Q1Q1 M3M3 Q3Q3 M1M1 Q1Q1 M3M3 Q3Q3 2 Proportion of alleles shared IBD Inheritance vector (M)

Genotypic similarity between relatives - 2 2n ABC D

Var (X) = V A QTL + V D QTL Cov (MZ) = V A QTL + V D QTL Cov (DZ) = ½V A QTL + ¼V D QTL On average! Cov (DZ) = V A QTL + V D QTL For a given twin pair

Cov (DZ) = V A QTL Cov (DZ) = V A QTL Genotypic similarity ( ) Phenotypic similarity Slope ~ V A QTL

Statistics that incorporate both phenotypic and genotypic similarities to test V QTL Regression-based methods Variance components methods Mx, MERLIN, SOLAR, GENEHUNTER Haseman-Elston, MERLIN-regress