Linkage and Linkage Disequilibrium


Linkage equilibrium: AB = 25%, Ab = 25%, aB = 25%, ab = 25%, so f(Ab) = f(A) x f(b).
Linkage disequilibrium: AB = 50%, Ab = 0%, aB = 0%, ab = 50%, so f(Ab) ≠ f(A) x f(b). The A locus and B locus are in linkage disequilibrium.
D = f(AB) x f(ab) - f(Ab) x f(aB)
D is maximal with no recombination; D = 0 with free recombination (linkage equilibrium).

Linkage equilibrium: are alleles at separate loci paired at random?
D = x11 − p1q1
D = x22 − p2q2

The A locus and B locus are in linkage disequilibrium: D = f(AB) x f(ab) - f(Ab) x f(aB). D is maximal with no recombination; D -> 0 with free recombination (linkage equilibrium).
When allele frequencies are intermediate, f(A) = f(a) = f(B) = f(b) = 0.5, and LD is maximal so that no recombinants are present, f(AB) = f(ab) = 0.5, so D = 0.5 x 0.5 - 0.0 x 0.0 = 0.25.
When allele frequencies are skewed, f(A) = 0.9, f(a) = 0.1, f(B) = 0.9, f(b) = 0.1, and LD is maximal so that no recombinants are present, D is less than 0.25: f(AB) = 0.9 and f(ab) = 0.1, so D = 0.9 x 0.1 - 0.0 x 0.0 = 0.09.
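The D calculation can be checked numerically. The following is a minimal sketch, not part of the original slides; the function names `ld_coefficient` and `ld_from_allele_freqs` are illustrative choices. The second function uses the equivalent form D = f(AB) - f(A) x f(B).

```python
def ld_coefficient(f_AB, f_Ab, f_aB, f_ab):
    """Return D = f(AB) x f(ab) - f(Ab) x f(aB) from four haplotype frequencies."""
    total = f_AB + f_Ab + f_aB + f_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    return f_AB * f_ab - f_Ab * f_aB


def ld_from_allele_freqs(f_AB, f_Ab, f_aB, f_ab):
    """Equivalent form: D = f(AB) - f(A) x f(B)."""
    f_A = f_AB + f_Ab  # frequency of allele A
    f_B = f_AB + f_aB  # frequency of allele B
    return f_AB - f_A * f_B


# Linkage equilibrium: all four haplotypes at 25%
d_equilibrium = ld_coefficient(0.25, 0.25, 0.25, 0.25)
# Intermediate allele frequencies, no recombinant haplotypes
d_intermediate = ld_coefficient(0.5, 0.0, 0.0, 0.5)
# Skewed allele frequencies, no recombinant haplotypes
d_skewed = ld_coefficient(0.9, 0.0, 0.0, 0.1)

print(d_equilibrium, d_intermediate, round(d_skewed, 4))
```

Both forms give the same value for any valid set of haplotype frequencies, which is why D can be written either as a difference of haplotype products or as a deviation from the product of allele frequencies.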

LD as a two-locus Hardy-Weinberg problem

Linkage disequilibrium (LD) decays with distance and time.
Gametes produced by an AB/ab double heterozygote: AB = (1-r)/2, Ab = r/2, aB = r/2, ab = (1-r)/2, where r = rate of recombination.
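The decay over time can be sketched with the standard result for a large random-mating population, D(t) = D(0) x (1 - r)^t. This formula is not written on the slide and is brought in here as an assumption. The function name and the starting value D(0) = 0.25 are illustrative.

```python
def ld_after_generations(d0, r, generations):
    """D after a number of generations of random mating: D0 * (1 - r)**t."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination rate r must lie between 0 and 0.5")
    return d0 * (1.0 - r) ** generations


# Compare free recombination (r = 0.5) with tighter linkage.
for r in (0.5, 0.1, 0.01):
    trajectory = [round(ld_after_generations(0.25, r, t), 4) for t in (0, 1, 5, 10)]
    print(f"r = {r}: {trajectory}")
```

With r = 0.5 the disequilibrium halves every generation, whereas closely linked loci (small r, short map distance) retain LD for many generations.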

Empirical demonstration of the Decay of LD over time

Epistasis

QTL for flower traits in Mimulus (monkey flowers)
Different pollinators
Cross: M. lewisii x M. cardinalis -> F1 -> F2s

Genetic map of monkey flower http://www.genetics.org/cgi/content/full/159/4/1701/F1

Quantitative trait locus (QTL) mapping: screen for marker-trait associations in F2s or RILs
Cross scheme: Parentals -> F1 -> F2 -> inbreed to make recombinant inbred lines (RILs)
Scan the genome for association between molecular marker and phenotype (small vs. large).
Association between molecular marker (M) and QTL (Q): M, Q vs. m, q

QTL mapping: detecting an association between a genetic marker (M) and a gene affecting a quantitative trait (Q).
Figure: marker and QTL positions on a chromosome (http://isotope.bti.cornell.edu/img/intro/qtl_fig_2.gif)
QTL mapping works because there is linkage disequilibrium (LD) between the marker (M) and the QTL (Q):
mm marker genotypes are correlated with small size
MM marker genotypes are correlated with large size
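A marker-trait association of this kind can be illustrated by comparing mean phenotype between marker genotype classes. This is a minimal sketch under stated assumptions: the genotypes and sizes below are made-up illustration data, and the Welch-style t statistic is one simple choice of test, not necessarily the method used in any particular QTL study.

```python
from math import sqrt
from statistics import mean, variance


def marker_effect(genotypes, phenotypes):
    """Return (difference in mean phenotype MM - mm, Welch t statistic)."""
    big = [p for g, p in zip(genotypes, phenotypes) if g == "MM"]
    small = [p for g, p in zip(genotypes, phenotypes) if g == "mm"]
    diff = mean(big) - mean(small)
    # Standard error of the difference, allowing unequal variances
    se = sqrt(variance(big) / len(big) + variance(small) / len(small))
    return diff, diff / se


# Hypothetical F2 individuals: marker genotype and measured size
genotypes = ["MM", "MM", "MM", "MM", "mm", "mm", "mm", "mm"]
sizes = [10.2, 11.1, 9.8, 10.9, 7.9, 8.4, 7.5, 8.2]

diff, t_stat = marker_effect(genotypes, sizes)
print(f"mean difference = {diff:.2f}, t = {t_stat:.2f}")
```

A genome scan repeats this comparison at every marker; markers in LD with the QTL show a large difference, unlinked markers show none.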

Most traits in organisms show continuous variation.
How do we find the genes that affect these “quantitative” traits?
Scan the genome for nucleotide sites that co-vary with the phenotype.

Genome-wide association studies (GWAS)
Mutation “causing” variation in height: tall individuals carry A, short individuals carry G.
Adjacent SNPs are linked.
Distant sites show no genotype-phenotype association.
Problem: how do we find the causal SNPs? Needle in a haystack.

What is better: more recombination or more markers?
(Same QTL mapping scheme: Parentals -> F1 -> F2 -> recombinant inbred lines; scan the genome for association between molecular marker and phenotype.)

How does DNA evolve? Human (upper case) vs. chimp (lower case) alignment:
Human   1 ATGCCCCAACTAAATACTACCGTATGGCCCACCATAATTACCCCCATACT 50
          |||||||||||||||||  ||||||| |||||||||||||||||||||||
Chimp   1 atgccccaactaaataccgccgtatgacccaccataattacccccatact 50
Human  51 CCTTACACTATTCCTCATCACCCAACTAAAAATATTAAACACAAACTACC 100
          ||| |||||||| ||| ||||||||||||||||||||||  |||| ||||
Chimp  51 cctgacactatttctcgtcacccaactaaaaatattaaattcaaattacc 100
Human 101 ACCTACCTCCCTCACCAAAGCCCATAAAAATAAAAAATTATAACAAACCC 150
          | ||||| ||||||||||| ||||||||||||||||| || || ||||||
Chimp 101 atctacccccctcaccaaaacccataaaaataaaaaactacaataaaccc 150
Human 151 TGAGAACCAAAATGAACGAAAATCTGTTCGCTTCATTCATTGCCCCCACA 200
          ||||||||||||||||||||||||| ||||||||||||  ||||||||||
Chimp 151 tgagaaccaaaatgaacgaaaatctattcgcttcattcgctgcccccaca 200
Human 201 ATCC 204
          ||||
Chimp 201 atcc 204

Measuring DNA Evolution
Align sequences between species.
Determine length of sequences, L.
Count number of differences.
Divergence = proportion of differences: D = p-distance = (number of differences) / (length of sequence)
Rate of divergence = (sequence divergence) / (age of common ancestor) = D / time
Rate of substitution = D / (2 x time)
Example: 5 differences in 100 sites, so D = 0.05; t = 6 million years.
Rate of divergence = 0.05 / (6 x 10^6) = 8.3 x 10^-9 per site per year
Rate of substitution = 0.05 / (2 x 6 x 10^6) = 4.2 x 10^-9 per site per year
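The steps above can be sketched in a few lines. This is an illustrative example rather than a definitive implementation: the function names are assumptions, and the two test sequences are the first 50 aligned positions of the human and chimp sequences shown earlier in the transcript.

```python
def p_distance(seq1, seq2):
    """Proportion of aligned sites that differ (case-insensitive)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq1.upper(), seq2.upper()) if a != b)
    return diffs / len(seq1)


def divergence_rate(d, years):
    """Rate of divergence = D / time."""
    return d / years


def substitution_rate(d, years):
    """Rate of substitution per lineage = D / (2 x time)."""
    return d / (2 * years)


human = "ATGCCCCAACTAAATACTACCGTATGGCCCACCATAATTACCCCCATACT"
chimp = "atgccccaactaaataccgccgtatgacccaccataattacccccatact"
d_first_50 = p_distance(human, chimp)  # 3 differences in 50 sites

# Slide example: 5 differences in 100 sites, 6 million years
d_example = 5 / 100
rate_div = divergence_rate(d_example, 6e6)
rate_sub = substitution_rate(d_example, 6e6)
print(d_first_50, f"{rate_div:.2e}", f"{rate_sub:.2e}")
```

The factor of 2 in the substitution rate reflects that differences accumulate along both lineages since the common ancestor.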

Jukes-Cantor
One-parameter model: a = rate of substitution
PAA(t) = 1/4 + 3/4 e^(-4at) = probability that A remains A at time t
PNN = 1/4 + 3/4 e^(-8at) = probability that two sequences have the same nucleotide at a site
D = proportion of different nucleotides = 1 - PNN
Dhat = 3/4 (1 - e^(-8at))
K = -3/4 ln(1 - 4/3 p), where p = proportion of nucleotide differences (# diffs. / total bp)
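The Jukes-Cantor correction can be sketched as follows. The function names are illustrative, and the round-trip check relies on the fact that under this model the expected number of substitutions per site between two sequences is K = 6at (three possible changes at rate a, along two lineages).

```python
from math import exp, log


def jukes_cantor_distance(p):
    """K = -3/4 ln(1 - 4/3 p), defined only for 0 <= p < 0.75."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * log(1.0 - 4.0 * p / 3.0)


def expected_p(a, t):
    """Expected proportion of differing sites: 3/4 (1 - e^(-8at))."""
    return 0.75 * (1.0 - exp(-8.0 * a * t))


k_example = jukes_cantor_distance(0.05)  # slightly larger than p itself
k_round_trip = jukes_cantor_distance(expected_p(1e-9, 6e6))
print(round(k_example, 5), round(k_round_trip, 5))
```

The corrected distance K always exceeds the observed p because some sites have been hit more than once; the gap grows as p approaches the saturation value of 0.75.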

Kimura two-parameter model
a = rate of transition substitution
b = rate of transversion substitution
PAA(t) = 1/4 + 1/4 e^(-4bt) + 1/2 e^(-2(a+b)t) = probability that A remains A at time t
K = 1/2 ln(1/[1 - 2P - Q]) + 1/4 ln(1/[1 - 2Q]), where P = proportion of transitional differences and Q = proportion of transversional differences
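A minimal sketch of the Kimura two-parameter distance, counting transitions (purine to purine or pyrimidine to pyrimidine) and transversions separately. The function names and the short test sequences are made up for illustration, and the code assumes ungapped sequences containing only A, C, G and T.

```python
from math import log

PURINES = {"A", "G"}


def transition_transversion_proportions(seq1, seq2):
    """Return (P, Q): proportions of transitional and transversional differences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x == y:
            continue
        # Same chemical class (both purines or both pyrimidines) = transition
        if (x in PURINES) == (y in PURINES):
            transitions += 1
        else:
            transversions += 1
    n = len(seq1)
    return transitions / n, transversions / n


def kimura_2p_distance(P, Q):
    """K = 1/2 ln(1/(1 - 2P - Q)) + 1/4 ln(1/(1 - 2Q))."""
    w1 = 1.0 - 2.0 * P - Q
    w2 = 1.0 - 2.0 * Q
    if w1 <= 0.0 or w2 <= 0.0:
        raise ValueError("sequences too divergent for the K2P correction")
    return 0.5 * log(1.0 / w1) + 0.25 * log(1.0 / w2)


# Hypothetical 10-bp sequences: 2 transitions and 1 transversion
P, Q = transition_transversion_proportions("AAAACCCCGG", "GAAATCCCGT")
K = kimura_2p_distance(P, Q)
print(P, Q, round(K, 4))
```

Separating P and Q matters because transitions usually accumulate faster than transversions, so a single-rate correction can misestimate the distance.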

Comparison of models: p-distance, Jukes-Cantor, Kimura 2-parameter, Tamura-Nei, etc.

Molecular clocks
Approximately constant rate of divergence of proteins
K = (mutation rate) x f0: rate of substitution = mutation rate x proportion of neutral mutations
“Saturation” due to multiple hits in DNA evolution

Anatomy of a phylogenetic tree
Terminal (external) nodes: taxa = OTUs = operational taxonomic units (Taxon1 to Taxon6)
Polytomy: non-dichotomous splitting
External branch
Internal branch
Internal nodes
Root

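Relative rate test
Species A and B share common ancestor O; species C is the reference.
KAC = KBC (KOC is shared)
Tajima test: (m1 - m2)^2 / (m1 + m2), chi-square with df = 1, where m1 and m2 are the changes on the branches from O to species A and from O to species B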
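The Tajima test statistic can be sketched as below. The counts m1 = 20 and m2 = 10 are made-up illustration values, and the function name is an assumption. The p-value uses the identity that, for one degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).

```python
from math import erfc, sqrt


def tajima_relative_rate(m1, m2):
    """Return (chi-square statistic, p-value) for (m1 - m2)^2 / (m1 + m2), df = 1."""
    if m1 < 0 or m2 < 0 or m1 + m2 == 0:
        raise ValueError("need non-negative counts with m1 + m2 > 0")
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts of lineage-specific changes in species A and species B
chi2, p_value = tajima_relative_rate(20, 10)
print(round(chi2, 3), round(p_value, 3))
```

A small p-value would suggest that the two lineages have accumulated changes at different rates, which argues against a strict molecular clock for that pair.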

DNA test of neutrality
Neutral prediction: amino acid (nonsynonymous) substitution rate (dN) should be lower than silent (synonymous) substitution rate (dS).
True for most genes; follows from the functional constraint argument.
Different for Major Histocompatibility Complex (MHC) loci:
Antigen recognition sequence shows dN > dS.
Rest of molecule shows dN < dS, as expected.
Amino acid mutations are favored in the antigen recognition region: promotes diversity, better recognition of foreign peptides.
Figure (http://depts.washington.edu/rhwlab/dq/3structure.html): antigen binding sites, dN/dS > 1, “positive” selection; rest of molecule, dN/dS < 1, negative (purifying) selection.

Maximum likelihood
Likelihood of observing the data set, assuming a given tree and a given model of DNA evolution: L = P(data|tree)
Consider 4-taxon cases within a tree (terminal nodes are OTUs, internal nodes are HTUs).
For each site, identify the nucleotides at each of the four taxa.
Assume all 16 pairs of nucleotides at the internal nodes.
Likelihood of the observed 4 terminal nucleotides = sum of the 16 probabilities.
Repeat for each position in the alignment.
Likelihood of tree = product of the individual site likelihoods: L = Π Li for i = 1 to n positions in the alignment (or sum of log likelihoods)
Calculate the likelihood for other trees; choose the tree with maximum likelihood.
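The 4-taxon calculation can be sketched as follows. Several things here are assumptions for illustration and do not come from the slides: the Jukes-Cantor model is used as the model of DNA evolution, all branch lengths are fixed at 0.1 expected substitutions per site rather than optimised, and the four short sequences are made up.

```python
from itertools import product
from math import exp, log

BASES = "ACGT"


def jc_prob(i, j, d):
    """Jukes-Cantor probability of base i becoming j over branch length d."""
    e = exp(-4.0 * d / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e


def site_likelihood(tips, branches):
    """Likelihood of one alignment column for the tree ((t1,t2),(t3,t4)).

    Sums over all 16 assignments (x, y) of nucleotides at the two internal nodes.
    """
    t1, t2, t3, t4 = tips
    b1, b2, b3, b4, b_internal = branches
    total = 0.0
    for x, y in product(BASES, repeat=2):
        total += (0.25 * jc_prob(x, t1, b1) * jc_prob(x, t2, b2)
                  * jc_prob(x, y, b_internal)
                  * jc_prob(y, t3, b3) * jc_prob(y, t4, b4))
    return total


def log_likelihood(sequences, branches):
    """Sum of log site likelihoods over all alignment columns."""
    return sum(log(site_likelihood(col, branches)) for col in zip(*sequences))


# Hypothetical alignment: sequences 1 and 2 are identical, as are 3 and 4
s1, s2, s3, s4 = "AACGT", "AACGT", "GACTT", "GACTT"
branches = (0.1, 0.1, 0.1, 0.1, 0.1)

ll_tree_a = log_likelihood([s1, s2, s3, s4], branches)  # ((1,2),(3,4))
ll_tree_b = log_likelihood([s1, s3, s2, s4], branches)  # ((1,3),(2,4))
print(round(ll_tree_a, 3), round(ll_tree_b, 3))
```

Reordering the sequences is a compact way to evaluate a different topology with the same function. A full analysis would also optimise the branch lengths for each tree before comparing likelihoods.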