Modeling the Origin and Migration of Human Populations Michael E. Palmer.

Slides:



Advertisements
Similar presentations
Population Genetics 3 We can learn a lot about the origins and movements of populations from genetics Did all modern humans come from Africa? Are we derived.
Advertisements

Evolution of Populations
Chapter 19 Evolutionary Genetics 18 and 20 April, 2004
Molecular Evolution. Morphology You can classify the evolutionary relationships between species by examining their features Much of the Tree of Life was.
Sampling distributions of alleles under models of neutral evolution.
MIGRATION  Movement of individuals from one subpopulation to another followed by random mating.  Movement of gametes from one subpopulation to another.
Plant of the day! Pebble plants, Lithops, dwarf xerophytes Aizoaceae
Microevolution Chapter 18 contined. Microevolution  Generation to generation  Changes in allele frequencies within a population  Causes: Nonrandom.
Signatures of Selection
Population Genetics I. Evolution: process of change in allele
14 Molecular Evolution and Population Genetics
BIOE 109 Summer 2009 Lecture 6- Part I Microevolution – Random genetic drift.
Genetica per Scienze Naturali a.a prof S. Presciuttini Human and chimpanzee genomes The human and chimpanzee genomes—with their 5-million-year history.
2: Population genetics break.
Introduction to Linkage Analysis March Stages of Genetic Mapping Are there genes influencing this trait? Epidemiological studies Where are those.
The origins & evolution of genome complexity Seth Donoughe Lynch & Conery (2003)
Population Genetics What is population genetics?
Dispersal models Continuous populations Isolation-by-distance Discrete populations Stepping-stone Island model.
Inbreeding. inbreeding coefficient F – probability that given alleles are identical by descent - note: homozygotes may arise in population from unrelated.
Human Migrations Saeed Hassanpour Spring Introduction Population Genetics Co-evolution of genes with language and cultural. Human evolution: genetics,
Mendelian Genetics in Populations – 1
Population Genetics direct extension of Mendel’s laws, molecular genetics, and the ideas of Darwin Instead of genetic transmission between individuals,
Population Genetics. Population genetics is concerned with the question of whether a particular allele or genotype will become more common or less common.
Genetic variation, detection, concepts, sources, and forces
Out-of-Africa Theory: The Origin Of Modern Humans
Introduction Basic Genetic Mechanisms Eukaryotic Gene Regulation The Human Genome Project Test 1 Genome I - Genes Genome II – Repetitive DNA Genome III.
Population Genetics 101 CSE280Vineet Bafna. Personalized genomics April’08Bafna.
Chapter 3 -- Genetics Diversity Importance of Genetic Diversity Importance of Genetic Diversity -- Maintenance of genetic diversity is a major focus of.
1 Genetic Variability. 2 A population is monomorphic at a locus if there exists only one allele at the locus. A population is polymorphic at a locus if.
- any detectable change in DNA sequence eg. errors in DNA replication/repair - inherited ones of interest in evolutionary studies Deleterious - will be.
Evolution of Populations
Section 4 Evolution in Large Populations: Mutation, Migration & Selection Genetic diversity lost by chance and selection regenerates through mutation.
MIGRATION  Movement of individuals from one subpopulation to another followed by random mating.  Movement of gametes from one subpopulation to another.
1 1 Population Genetics. 2 2 The Gene Pool Members of a species can interbreed & produce fertile offspring Species have a shared gene pool Gene pool –
14 Population Genetics and Evolution. Population Genetics Population genetics involves the application of genetic principles to entire populations of.
Chapter 14 Molecular Evolution and Population Genetics
Deviations from HWE I. Mutation II. Migration III. Non-Random Mating IV. Genetic Drift A. Sampling Error.
Great Dane x Mexican Chihuahua F1 Big (Great Danes) 3 Big : 1 Small.
Models of Molecular Evolution III Level 3 Molecular Evolution and Bioinformatics Jim Provan Page and Holmes: Sections 7.5 – 7.8.
1 Population Genetics Basics. 2 Terminology review Allele Locus Diploid SNP.
The plant of the day Bristlecone pine - Two species Pinus aristata (CO, NM, AZ), Pinus longaeva (UT, NV, CA) Thought to reach an age far greater than any.
Bottlenecks reduce genetic variation – Genetic Drift Northern Elephant Seals were reduced to ~30 individuals in the 1800s.
Introduction to History of Life. Biological evolution consists of change in the hereditary characteristics of groups of organisms over the course of generations.
Selectionist view: allele substitution and polymorphism
NEW TOPIC: MOLECULAR EVOLUTION.
By Mireya Diaz Department of Epidemiology and Biostatistics for EECS 458.
The plant of the day Pinus longaevaPinus aristata.
Evolutionary Genome Biology Gabor T. Marth, D.Sc. Department of Biology, Boston College
In populations of finite size, sampling of gametes from the gene pool can cause evolution. Incorporating Genetic Drift.
8 and 11 April, 2005 Chapter 17 Population Genetics Genes in natural populations.
Evolution.  Darwin:  HMS Beagle  Galapagos Islands  Artificial Selection -breeding to produce offspring with desired traits-He inferred that if humans.
Random Change. In terms of genetics, it is any change in allele frequencies within a population. The H-W, provided conditions that evolution would not.
LECTURE 9. Genetic drift In population genetics, genetic drift (or more precisely allelic drift) is the evolutionary process of change in the allele frequencies.
Robert Page Doctoral Student in Dr. Voss’ Lab Population Genetics.
Lecture 6 Genetic drift & Mutation Sonja Kujala
Evolution and Population Genetics
Topics How to track evolution – allele frequencies
Bottlenecks reduce genetic variation – Genetic Drift
MIGRATION Movement of individuals from one subpopulation to another followed by random mating. Movement of gametes from one subpopulation to another followed.
Why study population genetic structure?
The neutral theory of molecular evolution
Chapter 19: Population genetics
Deviations from HWE I. Mutation II. Migration III. Non-Random Mating
PLANT BIOTECHNOLOGY & GENETIC ENGINEERING (3 CREDIT HOURS)
MIGRATION Movement of individuals from one subpopulation to another followed by random mating. Movement of gametes from one subpopulation to another followed.
The Evolution of Populations
Genetic drift in finite populations
Evolution by Genetic Drift : Main Points (p. 231)
Evolution by Genetic Drift : Main Points (p. 231)
Presentation transcript:

Modeling the Origin and Migration of Human Populations Michael E. Palmer

Overview History Some Population Genetics  origins of genetic variation  evolutionary timescales  selection and drift  neutral theory Detection of Selection in Humans with SNPs Some more Population Genetics  Migration  Wright’s F ST Inference of Human Phylogenetic Tree Time to Most Recent Common Ancestor (TMRCA) Unique Origin vs. Multiregional Evolution Models Geographic Origin of Humans

History of Study of Human Variation Blood proteins (ABO gene, 1919) Radioisotopes to study DNA Polymerase Chain Reaction (PCR), 1986  method to “amplify” (copy) a piece of DNA  led to an explosion of DNA sequence data Almost every protein has genetic variants These variants are useful markers for population studies

Origins of Genetic Variation How often does this happen per generation? (germ line matters, not soma) Rate of Genetic Events (avg) in Mammals Point substitution (nuc)~0.5 x per bp Microdeletion (1-10bp)about 1/20 of point Microinsertion (1-10bp)about half of  del Recombination~ Mobile element ins’n~ Inversion?? much rarer Exceptions Hypermutable sites C->T = 10x avg point rate Simple Sequence Repeats x indel rate (some !) mitochondrial DNA x nuclear point rate 1 generation Number of cell divisions from one generation to next ~23 ~20 Female ~400 ~40 Male HumanMouse Source: A. Sidow, BIOSCI 203

Mammals:10 8 years Apes: 10 7 years Humans: 10 5 –10 6 yrs Human Family 100 years … Eukaryotes: 10 9 years Accumulation of Variation over Time Source: A. Sidow, BIOSCI 203

Drift and Selection Drift  Change in allele frequencies due to sampling  a ‘stochastic’ process  Neutral variation is subject to drift Selection  Change in allele frequencies due to function  ‘deterministic’  Functional variation may be subject to selection (more later) The two forces that determine the fate of alleles in a population Source: A. Sidow, BIOSCI 203

Genetic Drift 1 From Li (1997) Molecular Evolution, Sinauer Press, via A. Sidow BIOSCI 203

4 populations 2 at N=25 2 at N=250 Genetic Drift 2: Population Size Matters

Genetic Drift over time – expected values Principles of Population Genetics, Hartl and Clark

Genetic Drift over time – expected values Principles of Population Genetics, Hartl and Clark

Selection 1: Fitness viability = chance of survival to reproductive age  one measure of fitness If fitness depends on genotype, then we have selection  if organisms live/die independent of genotype, that’s drift Source: A. Sidow, BIOSCI 203

Selection 2: Measuring Fitness allele# at g 0 # at g 1 ∆ws = 1-w A a normalized to this allele → Normalized relative fitness: w Selection coefficient: s = 1-w Sometimes normalized to “most fit” (then s ≤ 0) Sometimes normalized to average fitness

Effective population size N e Sewall Wright (1931, 1938) “The number of breeding individuals in an idealized population that would show the same amount of dispersion of allele frequencies under random genetic drift or the same amount of inbreeding as the population under consideration". Usually, N e < N (absolute population size) N e != N can be due to:  fluctuations in population size  unequal numbers of males/females  skewed distributions in family size  age structure in population

Selection vs Drift 1: |s| and Pop Size If |s| < 1/N e, then selection is ineffective and the alleles are solely subject to drift: the alleles are “effectively neutral” What is the probability of fixation? If |s| < 1/N e, then P(fix) = q N e = effective pop size s = selection coefficient q = allele frequency If |s| > 1/N e, then P(fix) = 1 - e -4 N sq 1 - e -4 N s e e Source: A. Sidow, BIOSCI 203

Selection vs Drift 2: |s| and Pop Size Around the diagonal, where inverse of pop size is close to |s|, selection and drift are in a tug-of-war. Source: A. Sidow, BIOSCI 203

Evolutionary Change (fixation) p = 0.6q = 0.4 Let’s look at a single nucleotide site in the genome TTATTA TCATCA p = 1.0 TTATTA TCATCA Allele arises but fades away (by selection and/or drift) time Allele arises and moves to fixation (by selection and/or drift) Source: A. Sidow, BIOSCI 203

Neutral theory (Kimura) How do mutation & drift interact, in absence of selection? Probability of eventual fixation (of a neutral allele at frequency p 0 )  p 0  E.g., for a new mutation in diploid pop: p 0 =1/2N e Average time to fixation of a neutral allele  4N e generations Rate at which neutral mutations are fixed (mutation rate is  )   (does not involve N e ) Average time between consecutive neutral substitutions  1/  Average homozygosity at equilibrium, using infinite alleles model  1/(4N e  +1)

Neutral theory Diagram showing trajectory of neutral alleles in a population. New allelles enter the population by mutation and have an initial allele frequency of 1/2N. Most alleles are lost, but those that go to fixation take an average of 4N generations. The time between successive fixations of neutral alleles is 1/  generations. (A)A moderate size population. (B)The same population size; a higher mutation rate gives the same time to fixation, but less time between fixations. (C)A smaller population has alleles that go to fixation more rapidly, but the time between fixations is still 1/ . (After Kimura 1980.) Principles of Population Genetics, Hartl and Clark

Definition: SNPs SNP = Single Nucleotide Polymorphism  polymorphism = the presence of two or more alleles of a gene or other DNA sequence in a population.  for example: 42% of individuals have...CATTACTAGTG... 58% of individuals have...CATTGCTAGTG... SNPs typically arise from point mutations

Detection of Selection in Humans with SNPs Data from Cargill et al, Nature Genetics 1999 vol 22:231 Large-scale SNP-survey looked at: 106 Genes in an average of 57 human individuals 60,410 base pairs of noncoding sequence (UTRs, introns, some promoters) 135,823 base pairs of coding sequence Some salient points: We will discuss only polymorphisms in coding sequence (cSNPs) Because survey is snapshot of current frequencies, evidence for selection or drift is indirect This is about bulk properties, not about individual genes

The Degenerate Genetic Code

Null Hypothesis for SNP Survey In the average coding region, about 30% of possible point muts are silent Silent substitutions – don’t change the aa Replacement substitutions – do change the aa conservative substitutions – a functionally similar aa nonconservative substitutions – a functionally different aa But consider: 1. Silent changes usually produce no phenotype and are therefore unlikely to be subject to selection -- neutral assumption holds 2. Replacement changes can produce a phenotype, if only subtle or in synthetic combination -- neutral assumption may not hold 3. Far more replacement changes are deleterious than advantageous If there had been no selection in population history, we would expect 70% of coding region polymorphisms to be replacement and 30% to be silent Source: A. Sidow, BIOSCI 203

Results of SNP Survey 1. Silent polymorphisms outnumber replacement polymorphisms Silent Replacement Total 392 Observed Expected if no selection 2. Conservative replacements outnumber nonconservative replacements Total 185 Conservative 119 ~92 Nonconservative 66 ~93 Observed Expected if no selection Source: A. Sidow, BIOSCI 203 Implication: selection against deleterious mutations penalizes replacements especially penalizes nonconservative replacements

Overview History Some Population Genetics  origins of genetic variation  evolutionary timescales  selection and drift  neutral theory Detection of Selection in Humans with SNPs Some more Population Genetics  Migration  Wright’s F ST Inference of Human Phylogenetic Tree Time to Most Recent Common Ancestor (TMRCA) Unique Origin vs. Multiregional Evolution Models Geographic Origin of Humans

Migration: another source of allele frequency change In a subdivided population, drift and varied selection result in diversity among subpopulations Migration limits genetic divergence  Lack of migration can allow speciation to occur Only 1 migrant per generation is enough to keep drift partially in check (prevent complete fixation of alleles) !

Allele frequencies and population history What are the allele frequencies vs. heterozygosities? T/C Pop1 T/T T/C C/CCCC/C Pop2 T/T C/C CCC/CCCC/C Pop3 T/T T/C Pop4 Overdominant (balancing) selection (Heterozygote advantage) HW Expectation Population Subdivision Just a rare minor allele Source: A. Sidow, BIOSCI 203

Pop3a (Oahu)Pop3b (Kauai) Maybe: Population Subdivision T/T T/C C/CCCC/C Pop2 T/T C/C CCC/CCCC/C Pop3 Wright’s F-statistics (F ST, etc) are measures of genetic diversity Indicates population subdivision F ST measures a reduction of average heterozygosity between the subpopulations and the total population. Source: A. Sidow, BIOSCI 203

Wright’s F ST : a measure of genetic diversity among populations

“Decrease of heterozygosity” F ST = (H T -H S )/H T ( – )/ = (indicates high overall diversity of subpopulations) 0 – 0.05: little genetic differentiation : moderate : great > 0.25: very great F SR = (H R -H S )/H R ( – )/ = Variation among subpops within each region F RT = (H T -H R )/H T ( – )/ = Variation among regions within total pop (greater than variation within regions – regions capture population structure)

Inference of Human Phylogenetic Tree

Time to Most Recent Common Ancestor (TMRCA) Archeological evidence  origin in Africa kya  spread to rest of world, 50-60kya What does genetic evidence say? What about the location?

Mitochondrial DNA An organelle of the animal cell Kreb’s Cycle (aerobic respiration) takes place here Transmitted only along female lineage Haploid genome, independent from human “host” High mutation rate

Mitochondrial “Eve” Most recent matrilineal common ancestor of all living humans All our mitochondria are descended from hers Does not mean she was the only human female alive at the time  Consider the set S of all humans alive today  Take the set S’ = mothers-of(S). (now all female)  Size(S’) ≤ Size (S) ...continue until you have one member: that’s Eve Members of S have other female ancestors, but Eve is the only one with an unbroken matrilineal line to all of S She lived ~230kya She was not Eve during her own lifetime  Title of Eve depends on current set of people alive  as matrilineal lines die out, you get a more recent Eve Difficult to determine if she was Homo sapiens

Y-chromosome “Adam” Part of the Y chromosome does not recombine Hence we can do a similar trick  However, only men (XY) carry the Y chromosome  So we can only identify the most recent patrilineal common ancestor of all men living today: Estimated to live ~100kya  never met “Eve”! Why are mtDNA and Y chromosome TMRCA dates so different?  lower N E for males than for females? polygyny more frequent than polyandry? higher male mortality rates? higher male variability in reproductive success?  patrilocal marriage more common than matrilocal?  mtDNA mutation rates variable, causing error?

Tracking Human Migrations Current consensus: ~1,000 individuals (a tribe) left Africa 100kya...

Human microsatellite data individuals; 52 populations; 377 autosomal microsatellite markers “microsatellite” or Short Tandem Repeat (STR) = 2-6 bases repeated several times e.g., TCTA TCTA TCTA TCTA TCTA TCTA TCTA TCTA - “indigenous populations” only; all individuals’ grandparents lived in same place Rosenberg et al., Science 298:

Unique Origin (UO) vs. Multiregion Evolution (ME) Models for Humans Genome Research 15: UO model had long been assumed, but never formally tested. Question is: where did most change between Homo erectus and Homo sapiens accumulate?  time

10 models tested Tested 9 Multiregion Evolution (ME) models, and one Unique Origin (UO) model

Simulations to test UO model

Validating their methods... Used 5,000 runs as “pseudo-observed” data (half UO-generated, half ME-generated) Used 5,000 runs as test data Counted up how many times they could correctly distinguish UO- or ME-generated pseudo-observed data

Real data favors Unique Origin in North Africa

F ST versus distance in Humans Ramachandran et al.,PNAS 102(44)

Geographic Origin of Humans Ramachandran et al.,PNAS 102(44) - Pairwise distances used to imply an ordering - Assumes a single origin - Assumes no intermingling after a colony founded - A central African origin has the most explanatory power