Populations: defining and identifying

Two major paradigms for defining populations. Ecological paradigm: a group of individuals of the same species that co-occur in space and time and have an opportunity to interact with each other. Evolutionary paradigm: a group of individuals of the same species living in close enough proximity that any member of the group can potentially mate with any other member.

Cocoa from 32 abandoned estates in Trinidad. 88 Imperial College Selection (ICS) clones conserved in the International Cocoa Genebank, Trinidad, assayed for 35 microsatellite loci. The unweighted pair group method was used to construct a dendrogram of relatedness between individuals. The different colored groups can be identified by eye or with the computer program STRUCTURE (as was done here).

Yellow perch. The yellow perch plays a significant role in the survival and success of the double-crested cormorant and other birds, predatory fish, commercial fishermen, and sport fishermen in the Great Lakes region. This fish must be properly managed to prevent the trophic structure and economy of the Great Lakes region from collapsing. The yellow perch (Perca flavescens) is found in the United States and Canada and looks similar to the European perch but is paler. It is in the same family as the walleye, but in a different family from the white perch.

mtDNA control region haplotype frequency patterning for Yellow Perch spawning-site groups across North America

Relationships among mtDNA haplotypes of Yellow Perch

Allele distribution for six representative Yellow Perch microsatellite loci among selected regions. Rings represent loci, colors within a ring represent alleles.

Bayesian assignment of Yellow Perch genetic structure using STRUCTURE. Vertical bars represent individuals; colors within a bar represent the probability of assignment to each cluster. 8 microsatellite loci, 25 collection sites, N = 495 fish, K = 10.

Inference of population structure using multi-locus genotype data: Pritchard, Stephens, and Donnelly (2000); Falush, Stephens, and Pritchard (2003). STRUCTURE V2.1: Pritchard, J.K., and Wen, W. (2004).

Main objective of STRUCTURE: assign individuals to populations on the basis of their genotypes, while simultaneously estimating population allele frequencies. Infer the number of populations, K, in the process.

Other objectives: begin with a set of predefined populations and classify individuals of unknown origin; identify the extent of admixture of individuals; infer the origin of particular loci in the sampled individuals.

STRUCTURE is a Bayesian, model-based method of clustering. It makes many assumptions about parameters and distributions.

Four basic models. 1. Model without admixture: each individual is assumed to originate in one (and only one) of K populations. 2. Model with admixture: each individual is assumed to have inherited some proportion of its ancestry from each of K populations.

Four basic models. 3. Linkage model: "chunks" of chromosomes are derived as intact units from one or another of the K populations, and all allele copies on the same chunk derive from the same population.

Four basic models. 4. F model: the populations all diverged from a common ancestral population at the same time, but the model allows that they may have experienced different amounts of drift since the divergence event.

Assumptions: the main modeling assumptions are Hardy-Weinberg equilibrium (HW) within populations and complete linkage equilibrium between loci within populations. The model accounts for the presence of Hardy-Weinberg disequilibrium or linkage disequilibrium (LD) by introducing population structure, and attempts to find population groupings that (as far as possible) are not in disequilibrium.
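To make the two assumptions concrete, here is a minimal sketch (not STRUCTURE's actual code) of how they let a multi-locus genotype likelihood factorize: HW gives the genotype probability at each locus, and linkage equilibrium lets the loci be multiplied together. The function names, the biallelic coding (0, 1, or 2 copies of allele A), the example frequencies, and the uniform prior are all assumptions made for illustration.

```python
import math

def genotype_log_likelihood(genotype, freqs):
    """Log-probability of a diploid multi-locus genotype in one population.

    genotype: copies of allele A (0, 1, or 2) at each locus.
    freqs: frequency of allele A at each locus, strictly between 0 and 1.
    """
    total = 0.0
    for copies, p in zip(genotype, freqs):
        q = 1.0 - p
        # Hardy-Weinberg genotype probabilities within the population.
        prob = (q * q, 2.0 * p * q, p * p)[copies]
        # Linkage equilibrium: loci are independent, so log-probabilities add.
        total += math.log(prob)
    return total

def assignment_posterior(genotype, pop_freqs, prior=None):
    """Posterior probability of origin in each population (uniform prior by default)."""
    k = len(pop_freqs)
    if prior is None:
        prior = [1.0 / k] * k
    logs = [math.log(pr) + genotype_log_likelihood(genotype, f)
            for pr, f in zip(prior, pop_freqs)]
    top = max(logs)  # subtract the maximum before exponentiating, for stability
    weights = [math.exp(value - top) for value in logs]
    norm = sum(weights)
    return [w / norm for w in weights]

# Hypothetical allele frequencies at three loci in two predefined populations.
pop_freqs = [[0.9, 0.8, 0.7], [0.1, 0.2, 0.3]]
posterior = assignment_posterior([2, 2, 1], pop_freqs)
print(posterior)
```

With frequencies this different, an individual carrying mostly A alleles is assigned almost entirely to the first population.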

Hardy-Weinberg: gives the relationship between gene (allele) frequencies and genotypic frequencies, assuming random mating: $F(AA) = p^2$, $F(Aa) = 2pq$, $F(aa) = q^2$, where $p$ and $q = 1 - p$ are the frequencies of alleles $A$ and $a$. STRUCTURE uses HW predictions to infer the extent of a randomly mating population.
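As a small worked example of the Hardy-Weinberg relationship, the sketch below computes expected genotype frequencies from one allele frequency. The function name and the example value p = 0.7 are illustrative assumptions.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies at a biallelic locus under random mating."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}

expected = hardy_weinberg(0.7)
print(expected)
```

The three frequencies always sum to 1, because $p^2 + 2pq + q^2 = (p + q)^2$.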

Two-locus structure: linkage disequilibrium. [Figure: two loci, A with alleles A1 and A2, and B with alleles B1 and B2]

Relationship between allele frequency and haplotype frequency.
Haplotype frequencies:
$x_{11}$ = frequency of $A_1B_1$
$x_{12}$ = frequency of $A_1B_2$
$x_{21}$ = frequency of $A_2B_1$
$x_{22}$ = frequency of $A_2B_2$
Allele frequencies are the "marginal" totals:
$p_1 = x_{11} + x_{12}$, $q_1 = x_{21} + x_{22}$
$p_2 = x_{11} + x_{21}$, $q_2 = x_{12} + x_{22}$

Non-random associations of alleles between loci. The expected value of a product of two random variables does not equal the product of the two expectations: $x_{11} \neq p_1 p_2$, $x_{12} \neq p_1 q_2$, $x_{21} \neq q_1 p_2$, $x_{22} \neq q_1 q_2$.

$D$ = covariance of alleles between loci:
$x_{11} = p_1 p_2 + D$
$x_{12} = p_1 q_2 - D$
$x_{21} = q_1 p_2 - D$
$x_{22} = q_1 q_2 + D$
$D = x_{11} - p_1 p_2$ or $D = x_{11} x_{22} - x_{12} x_{21}$
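The two expressions for D can be checked numerically. The sketch below takes four haplotype frequencies, computes the marginal allele frequencies, and evaluates D both ways. The function name and the example frequencies are assumptions chosen for illustration.

```python
def linkage_disequilibrium(x11, x12, x21, x22):
    """Return allele frequencies and D from two-locus haplotype frequencies."""
    if abs(x11 + x12 + x21 + x22 - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p1 = x11 + x12  # frequency of A1 (marginal total)
    p2 = x11 + x21  # frequency of B1 (marginal total)
    return {
        "p1": p1,
        "p2": p2,
        "D": x11 - p1 * p2,            # covariance form
        "D_alt": x11 * x22 - x12 * x21,  # cross-product form
    }

# Hypothetical haplotype frequencies with an excess of A1B1 and A2B2.
ld = linkage_disequilibrium(0.5, 0.1, 0.1, 0.3)
print(ld)
```

Here both forms give D = 0.14 up to floating-point rounding: $0.5 - 0.6 \times 0.6$ and $0.5 \times 0.3 - 0.1 \times 0.1$.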

Evolution of LD. Establishment: LD = 1 with new mutations. LD declines with time. It increases with "hitchhiking," when SNPs are associated with selected variants. It increases by chance in small populations. It declines slowly in regions of low recombination. It decreases with physical distance between SNPs (recombination).
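The decline of LD with time and recombination can be illustrated with the standard textbook result $D_t = D_0 (1 - r)^t$, where $r$ is the recombination fraction between the two loci per generation. This formula is not on the slides; it is added here as an assumption and holds for a large, randomly mating population with no selection. The function name and the example values are illustrative.

```python
def ld_after(d0, r, generations):
    """LD remaining after some generations of random mating.

    d0: initial disequilibrium D.
    r: recombination fraction between the loci (0 to 0.5).
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    return d0 * (1.0 - r) ** generations

# Tightly linked loci keep their LD far longer than loosely linked ones.
for r in (0.001, 0.01, 0.1, 0.5):
    print(r, ld_after(0.25, r, 10))
```

Unlinked loci (r = 0.5) lose half of their LD every generation, while loci in a low-recombination region retain it for many generations.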

Pairwise comparison of LD along chromosomes: high LD is red, low LD is green.

Bayesian procedure employed by STRUCTURE. Step 1: estimate the allele frequencies for each population, assuming that the population of origin of each individual is known. Step 2: estimate the population of origin of each individual, assuming that the population allele frequencies are known. Iterate the two steps many times using a Markov chain Monte Carlo (MCMC) procedure.
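The two alternating steps can be sketched as a toy Gibbs sampler for the model without admixture. This is a simplified illustration, not STRUCTURE's implementation: it assumes biallelic loci coded as 0, 1, or 2 copies of allele A, a uniform Beta(1, 1) prior on allele frequencies, equal prior probability for each population, and a fixed K. The function names and the example data are hypothetical.

```python
import math
import random

def hw_log_prob(copies, p):
    """Log HW probability of carrying `copies` of allele A at frequency p."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
    q = 1.0 - p
    return math.log((q * q, 2.0 * p * q, p * p)[copies])

def gibbs_no_admixture(genotypes, k, sweeps=200, seed=0):
    """Return one sampled population label per individual after `sweeps` iterations."""
    rng = random.Random(seed)
    n, n_loci = len(genotypes), len(genotypes[0])
    labels = [rng.randrange(k) for _ in range(n)]  # random starting assignment
    for _ in range(sweeps):
        # Step 1: sample allele frequencies, treating the labels as known.
        freqs = []
        for pop in range(k):
            members = [genotypes[i] for i in range(n) if labels[i] == pop]
            row = []
            for locus in range(n_loci):
                a = sum(g[locus] for g in members)  # copies of allele A
                b = 2 * len(members) - a            # copies of the other allele
                row.append(rng.betavariate(1 + a, 1 + b))
            freqs.append(row)
        # Step 2: sample each label, treating the frequencies as known.
        for i in range(n):
            logs = [sum(hw_log_prob(g, p) for g, p in zip(genotypes[i], freqs[pop]))
                    for pop in range(k)]
            top = max(logs)
            weights = [math.exp(value - top) for value in logs]
            labels[i] = rng.choices(range(k), weights=weights)[0]
    return labels

# Hypothetical data: 10 individuals fixed for allele A, 10 fixed for the other allele.
data = [[2] * 12 for _ in range(10)] + [[0] * 12 for _ in range(10)]
labels = gibbs_no_admixture(data, k=2)
print(labels)
```

Cluster labels are arbitrary, so only the grouping is meaningful, not which group is called 0 or 1. A real analysis would also discard early sweeps as burn-in and summarize many sampled assignments rather than reading off the last one.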

Good and bad things about STRUCTURE. Good: when populations are real, it is the most efficient way to estimate the number of populations K and the membership of individuals in populations. Bad: when populations are more continuous (for example, a continuous cline), it can impose incorrect structure on the data and create an arbitrary number of artificial groups.

Human variation and differentiation. Hundreds of microsatellites are now available, as are Alu markers. Can evolutionary history be reconstructed? Are there distinct "races"? Are certain populations less diverse?

K is set to 3. We place individuals in three groups, without prior knowledge of group membership. The more loci, the better the identification of groups.

Human Genome Diversity Panel: 55 indigenous populations from 5 continents (Africa, Americas, Asia, Europe, Oceania), a total of 1,056 people; 377 microsatellite markers assayed. Noah Rosenberg et al., Science, 2002.

Structure within structure

Jun Li et al., Science, 2008. Human Genome Diversity Panel: 938 individuals from 51 populations, 5 continents; 650,000 SNP markers.

Bayesian prior for population assignment