By Mireya Diaz, Department of Epidemiology and Biostatistics, for EECS 458.


Agenda
- Basic concepts of population genetics
- The coalescent theory
- Coalescent process of two sequences
- Coalescent time
- Statistical inference
- Applications: reconstruction of human evolutionary history
- Future venues

Basic Concepts in Population Genetics
[Figure: allele frequencies f1, f2, ..., fk, acted on by random genetic drift, mutation, and selection]

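Basic Concepts in Population Genetics
- Mutation: plays a limited role in evolution because its effect is slow; however, it contributes to the maintenance of alleles in the population
- Locus with 2 alleles: $A_1$ with frequency $p(n)$ and $A_2$ with frequency $q(n) = 1 - p(n)$ in generation $n$
- Non-overlapping generations
- $A_1 \to A_2$ at rate $u$ and $A_2 \to A_1$ at rate $v$ ($u, v \sim 10^{-5}$)
- An allele can mutate at most once per generation
- If the initial gene frequency of $A_1$ is $p(0)$: $p(n) = \frac{v}{u+v} + \left[p(0) - \frac{v}{u+v}\right](1 - u - v)^n$
- As $n \to \infty$: $p(n) \to \frac{v}{u+v}$ ("equilibrium")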
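
The two-way mutation model above can be checked numerically. The following is a minimal sketch, assuming the standard recursion p(n+1) = (1 - u) p(n) + v (1 - p(n)); the function names and the example rates are illustrative choices, not taken from the slides.

```python
def mutate(p, u, v):
    """One generation of two-way mutation: A1 -> A2 at rate u, A2 -> A1 at rate v."""
    return (1 - u) * p + v * (1 - p)


def equilibrium(u, v):
    """Limiting frequency of A1 as the number of generations grows."""
    return v / (u + v)


def frequency_after(p0, u, v, n):
    """Closed-form frequency of A1 after n generations, starting from p0."""
    p_eq = equilibrium(u, v)
    return p_eq + (p0 - p_eq) * (1 - u - v) ** n


if __name__ == "__main__":
    u, v = 2e-5, 1e-5          # illustrative rates of the order quoted on the slide
    p = 0.9
    for _ in range(1000):
        p = mutate(p, u, v)
    print(p, frequency_after(0.9, u, v, 1000), equilibrium(u, v))
```

With rates of order 10^-5 the approach to equilibrium takes on the order of 1/(u + v) generations, which is the sense in which the slide calls the effect of mutation slow.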

Basic Concepts in Population Genetics
- Random genetic drift: change in gene frequency due to random sampling of gametes from a finite population
- Important for small populations
- Each generation, 2N gametes are sampled at random from the parent generation
- $y(n)$: number of gametes of type $A_1$ in generation $n$, in the absence of mutation and selection
- Wright-Fisher model: $y(n+1) \mid y(n) \sim \mathrm{Binomial}\left(2N,\ y(n)/2N\right)$
- Eventually one allele will be lost
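
As a rough illustration of drift, the sketch below simulates the Wright-Fisher sampling step with the standard library only. It assumes binomial sampling of 2N gametes with no mutation or selection; the function name, the seed, and the population size are hypothetical choices made for the example.

```python
import random


def wright_fisher(n_diploid, y0, rng, max_generations=100_000):
    """Track the number of A1 gametes (out of 2N) until A1 is lost or fixed.

    Each generation draws 2N gametes independently, each being A1 with
    probability y / 2N, where y is the current A1 count.
    """
    total = 2 * n_diploid
    y = y0
    trajectory = [y]
    while 0 < y < total and len(trajectory) <= max_generations:
        p = y / total
        y = sum(1 for _ in range(total) if rng.random() < p)
        trajectory.append(y)
    return trajectory


if __name__ == "__main__":
    rng = random.Random(1)
    traj = wright_fisher(n_diploid=10, y0=10, rng=rng)
    print(len(traj) - 1, "generations; final count of A1:", traj[-1])
```

Rerunning with larger `n_diploid` should show longer trajectories on average, consistent with drift mattering most in small populations.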

Basic Concepts in Population Genetics
- Selection: can act at different stages of the life of an organism (e.g. differential fecundity, viability)
- Locus with 2 alleles $A_1$, $A_2$
- Three genotypes: $A_1A_1$ ($w_{11}$), $A_1A_2$ ($w_{12}$), $A_2A_2$ ($w_{22}$), with fitness $w_{ij}$ the relative survival chance of zygotes of genotype $A_iA_j$
- Under Hardy-Weinberg equilibrium:
  - $w_{11} > w_{12} > w_{22}$: $A_1$ becomes fixed
  - $w_{11} < w_{12} < w_{22}$: $A_2$ becomes fixed
  - $w_{11}, w_{22} < w_{12}$: overdominance, stable polymorphism
  - $w_{12} < w_{11}, w_{22}$: underdominance, unstable polymorphism; $A_1$ or $A_2$ becomes fixed, depending on $f(0)$
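
The fitness regimes above can be explored with the usual one-locus viability selection recursion. This is a sketch under the assumption of random mating, with p' = p (p w11 + q w12) / w_bar and w_bar = p^2 w11 + 2pq w12 + q^2 w22; the fitness values used are made up for illustration.

```python
def next_freq(p, w11, w12, w22):
    """Frequency of A1 after one generation of viability selection."""
    q = 1 - p
    mean_w = p * p * w11 + 2 * p * q * w12 + q * q * w22
    return p * (p * w11 + q * w12) / mean_w


def iterate(p0, w11, w12, w22, generations):
    """Apply the selection recursion repeatedly, starting from p0."""
    p = p0
    for _ in range(generations):
        p = next_freq(p, w11, w12, w22)
    return p


if __name__ == "__main__":
    print("directional:   ", iterate(0.1, 1.0, 0.9, 0.8, 2000))
    print("overdominance: ", iterate(0.1, 0.9, 1.0, 0.8, 2000))
    print("underdominance:", iterate(0.6, 1.0, 0.8, 1.0, 2000),
          iterate(0.4, 1.0, 0.8, 1.0, 2000))
```

In the overdominant case the recursion settles at (w12 - w22) / (2 w12 - w11 - w22), while in the underdominant case the outcome depends on which side of the unstable equilibrium the starting frequency lies.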

The Coalescent Theory
- Stochastic process: continuous-time Markov process
- Large-population approximation of the Wright-Fisher model and other neutral models
- Probability model for the genealogical tree of a random sample of n genes from a large population
- Most significant progress in theoretical population genetics (past 2 decades)
- Cornerstone for rigorous statistical analysis of molecular data from populations
- Need: inferring the past from samples taken from the present population
- Seminal work: Kingman, J Appl Prob 19A:27, 1982

The Coalescent Theory – Key Idea
- Start with a sample and trace backwards in time to identify events in the past since the most recent common ancestor (MRCA) of the sample
- Consider a sample of n sequences of a DNA region from a population
- Assume no recombination between sequences
- The n sequences are connected by a single phylogenetic tree (genealogy) whose root is the MRCA
[Figure: genealogy labeled Diverge, Coalesce, MRCA]

The Coalescent Theory: Usefulness
- Sample-based theory
- By-product: development of highly efficient algorithms for simulating samples under various population genetics models
- Particularly suitable for molecular data
- Estimates parameters of evolutionary models (vs. the history of a specific locus, as in phylogenetics)

The Coalescent Process of Two Sequences
- Consider diploid organisms
- Wright-Fisher model:
  - Sequences in a population at a given generation are a random sample, with replacement, from those in the previous generation
  - Mutations at the locus of interest are selectively neutral (they do not affect reproductive success; all individuals are equally likely to reproduce, and all lineages are equally likely to coalesce)
- P(coalescence at previous generation) = $1/2N$, where N is the effective population size
- P(coalescence t+1 generations ago) = $\left(1 - \frac{1}{2N}\right)^{t} \frac{1}{2N}$
- For haploid structures, use N rather than 2N
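
The waiting time for two sequences to coalesce is geometric with success probability 1/2N per generation, so its distribution and mean can be checked directly. The sketch below is illustrative only; the function names and the value N = 50 are assumptions made for the example.

```python
import random


def p_coalesce_at(generations_ago, n_diploid):
    """P(two lineages first share a parent exactly `generations_ago` generations back)."""
    p = 1 / (2 * n_diploid)
    return (1 - p) ** (generations_ago - 1) * p


def simulate_pair(n_diploid, rng):
    """Trace two lineages back until both pick the same parent among 2N sequences."""
    total = 2 * n_diploid
    generations = 0
    while True:
        generations += 1
        if rng.randrange(total) == rng.randrange(total):
            return generations


if __name__ == "__main__":
    n = 50
    mean = sum(g * p_coalesce_at(g, n) for g in range(1, 20_001))
    print("mean waiting time (generations):", mean)   # close to 2N
    print("one simulated waiting time:", simulate_pair(n, random.Random(0)))
```

The mean of this geometric distribution is 2N generations, which is why coalescent times are often reported in units of 2N generations.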

The Coalescent Tree
- Topology is independent of branch lengths
- Branch lengths are independent, exponential random variables (waiting times between coalescent events)
- Topology is generated by randomly picking lineages to coalesce -> "all topologies are equally likely"
[Figure: genealogical relationship of a sample of genes, from the MRCA through the coalescent intervals T2, T3, T4, T5]
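
The random-joining rule for the topology can be sketched in a few lines. This is one possible illustration, assuming that at each coalescent event a pair of the current lineages is chosen uniformly at random; the nested-tuple representation and the helper names are choices made for this example.

```python
import random


def random_topology(n, rng):
    """Build a genealogy for samples 1..n by repeatedly merging a random pair of lineages."""
    lineages = list(range(1, n + 1))
    while len(lineages) > 1:
        i, j = sorted(rng.sample(range(len(lineages)), 2))
        right = lineages.pop(j)      # pop the larger index first so i stays valid
        left = lineages.pop(i)
        lineages.append((left, right))
    return lineages[0]


def leaves(tree):
    """List the sample labels found at the tips of a nested-tuple tree."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]


if __name__ == "__main__":
    print(random_topology(5, random.Random(0)))
```

Branch lengths are deliberately left out here, since the slide notes that the topology is independent of them.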

The Coalescent Time
- Assume: number of mutations in a given period ~ Poisson
- Mean coalescence time of two sequences: 2N generations
- Mean number of mutations between two sequences: $\theta = 4N\mu$ ($\mu$: mutation rate per sequence per generation)
- Underlying assumption: random mating (~ organisms with high mobility)
- Coalescent time: time between two successive coalescent events
- Exponential variable with mean $\frac{2}{k(k-1)}$, in units of 2N generations
- k: number of ancestral sequences between the two events
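
To make the waiting-time statement concrete, the sketch below draws the coalescent times for a sample of n sequences. It assumes time is measured in units of 2N generations, so that the interval with k lineages is exponential with rate k(k-1)/2 (mean 2/(k(k-1))); the function name and sample size are illustrative.

```python
import random


def coalescent_times(n, rng):
    """Return {k: T_k}: the time spent with k ancestral lineages, for k = n, ..., 2.

    Times are in units of 2N generations; T_k ~ Exponential(rate = k(k-1)/2).
    """
    return {k: rng.expovariate(k * (k - 1) / 2) for k in range(n, 1, -1)}


if __name__ == "__main__":
    rng = random.Random(0)
    times = coalescent_times(5, rng)
    print(times)
    print("time to MRCA:", sum(times.values()))
```

Because the rate grows roughly like k^2, the intervals near the tips of the tree are short and the final interval with two lineages tends to dominate the tree height.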

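Coalescent Tree Parameters
- P(2 lineages pick the same parent and coalesce) = $\frac{1}{2N}$
- P(2 lineages remain distinct) = $1 - \frac{1}{2N}$
- Expected time to MRCA (height of the tree), in units of 2N generations: $E[T_{MRCA}] = \sum_{k=2}^{n} \frac{2}{k(k-1)} = 2\left(1 - \frac{1}{n}\right)$
- Expected total branch length of the tree, in units of 2N generations: $E[T_{total}] = \sum_{k=2}^{n} k \cdot \frac{2}{k(k-1)} = 2\sum_{i=1}^{n-1} \frac{1}{i}$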
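
The expected tree height and total branch length follow from summing the mean interval lengths 2/(k(k-1)) over k = 2, ..., n. The sketch below does that sum explicitly, assuming time in units of 2N generations; the function names are hypothetical.

```python
def expected_tmrca(n):
    """Expected tree height: sum of mean interval lengths 2/(k(k-1)) for k = 2..n."""
    return sum(2 / (k * (k - 1)) for k in range(2, n + 1))


def expected_total_length(n):
    """Expected total branch length: each interval counts once per lineage present (k of them)."""
    return sum(k * 2 / (k * (k - 1)) for k in range(2, n + 1))


if __name__ == "__main__":
    for n in (2, 10, 100):
        print(n, expected_tmrca(n), expected_total_length(n))
```

The height approaches 2 as n grows, whereas the total length keeps growing like a harmonic sum, so adding samples mostly adds short branches near the tips.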

The Coalescent Theory & Statistical Inference
- Mutation rate
- Age of MRCA
- Recombination rate
- Ancestral population size
- Migration rate

Reconstruction of Human Evolutionary History
- Goal: estimate times of evolutionary events (major migrations) and demographic history (population bottlenecks, expansions)
- Haploid sequences: mtDNA, Y chromosome
- Case study: recent common ancestry of the human Y chromosome
- Source: Thomson et al. PNAS 2000; 97:
- Estimations: expected time to MRCA and ages of certain mutations
- Data: chromosomes, sequence variation at three genes (SMCY, DBY, DFFRY) on the Y chromosome

Recent common ancestry of Y chromosome
Summary of gene characteristics from sample (Source: Table 1 from article)
Columns: Gene, Seq length, Sample size, No. polym., No. substitutions, Mutation rate
SMCY   39, (41) x10^-9
DBY    8, (12) x10^-9
DFFRY  15, (15) x10^-9
All    64, (56) x10^-9
- (#) in No. polym.: count after removal of length variants, repeat sequences, indels
- For ages of major events: need a mutation rate estimate (single-nucleotide substitution)
- Substitutions between chimpanzee and human sequences
- Mutation rate per site per year = No. subst. / ($2 \cdot T_{split} \cdot L$)
- $T_{split}$: time since the chimp and human split (~5M years ago)
- Assumptions: selective neutrality of all changes on Y since divergence
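
The divergence-based mutation rate formula is simple arithmetic, sketched below. It assumes L denotes the length of the compared sequence in sites and that the factor 2 accounts for the two lineages since the split; the numbers in the example are hypothetical placeholders, not values from the article.

```python
def mutation_rate_per_site_per_year(n_substitutions, t_split_years, seq_length):
    """Substitutions divided by (2 * time since split * sequence length).

    The factor 2 counts both lineages (human and chimpanzee) since divergence.
    """
    return n_substitutions / (2 * t_split_years * seq_length)


if __name__ == "__main__":
    # Hypothetical inputs: 100 substitutions over 10,000 sites, split 5 million years ago.
    rate = mutation_rate_per_site_per_year(100, 5_000_000, 10_000)
    print(rate)
```

The result is only as good as the assumed split time and the neutrality assumption stated on the slide.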

GENETREE Analysis
- Software:
- Estimates the mean number of mutations: $\theta = 2N_e\mu$
  - $N_e$: effective number of Y chromosomes in the population
  - $\mu$: mutation rate per gene per generation
- Also: expected ages of mutations, time since MRCA
- Assumptions: coalescent process, infinitely-many-sites mutation (mutation rate low enough -> each mutation occurs at a new site)
- Four insertions, three deletions, two repeat mutations (different rates from single-nucleotide substitutions)
- Only one segregating site in SMCY appeared to have mutated more than once -> data fit the infinitely-many-sites model

Recent common ancestry of Y chromosome

MRCA distribution under constant population
Gene    T_MRCA (1)   95% CI          T_MRCA (2)   95% CI
SMCY    0.56         (0.40, 0.82)    85,000       (61,000, 125,000)
DBY     0.83         (0.60, 1.10)    154,000      (112,000, 206,000)
DFFRY   0.96         (0.55, 1.21)    120,000      (69,000, 152,000)
All     0.55         (0.36, 0.98)    84,000       (55,000, 149,000)

MRCA distribution under exponential population growth
Gene    T_MRCA       95% CI          T_MRCA       95% CI
SMCY    0.0731       (0.0618, )      48,000       (41,000, 68,000)
DBY     0.0538       (0.0382, )      55,000       (39,000, 100,000)
DFFRY   0.0582       (0.0440, )      53,000       (40,000, 65,000)
All     0.0853       (0.0580, )      59,000       (40,000, 140,000)

(1) Expected age in N_e generations.
(2) Value in years = N_e * 25
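
The two T_MRCA columns are related by a unit conversion. The sketch below reads the footnotes as: age in years = (age in N_e generations) x N_e x 25 years per generation. That reading is an assumption, as are the function names; the check uses the SMCY row of the constant-population table.

```python
YEARS_PER_GENERATION = 25  # generation time implied by the footnote


def tmrca_years(t_in_ne_generations, n_e, years_per_generation=YEARS_PER_GENERATION):
    """Convert an age expressed in units of N_e generations into years."""
    return t_in_ne_generations * n_e * years_per_generation


def implied_ne(t_in_ne_generations, years, years_per_generation=YEARS_PER_GENERATION):
    """Back out the N_e that links the scaled age to the age in years."""
    return years / (t_in_ne_generations * years_per_generation)


if __name__ == "__main__":
    n_e = implied_ne(0.56, 85_000)      # SMCY, constant population
    print(round(n_e))                   # about 6,071 under this reading
    print(tmrca_years(0.56, n_e))
```

Applying the same back-calculation to the other rows gives a different implied N_e per gene and per demographic model, which is expected if N_e is estimated separately in each analysis.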

GENETREE Analysis
Expected ages of mutations in tree:
- Mutation 1: 47,000 (35,000; 89,000) – male movement out of Africa
- Mutation 2: 40,000 (31,000; 79,000) – beginning of global expansion
[Figure labels: Africa, Asia, Oceania]

Future Venues
- Population genetics models: incorporation of migration, population growth, recombination, natural selection
- Longitudinal analysis
- Evolutionary analysis of quantitative trait loci (QTL)
- Properties of coalescent theory (CT):
  - Accuracy of the coalescent approximation under combinations of population size, sample size, mutation rate
  - Properties of estimators under MCMC
