Biostatistics-Lecture 19 Linkage Disequilibrium and SNP detection


1 Biostatistics-Lecture 19 Linkage Disequilibrium and SNP detection
Ruibin Xi Peking University School of Mathematical Sciences

2 Haplotype Frequencies

3 Linkage Equilibrium

4 Linkage Disequilibrium

5 Disequilibrium Coefficient D_AB

6 D_AB is hard to interpret
Sign is arbitrary … A common convention is to set A, B to be the common alleles and a, b to be the rare alleles
Range depends on allele frequencies
Hard to compare between markers
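
The dependence of the range on allele frequencies can be sketched as follows. The slide's formula did not survive in this transcript, so the snippet assumes the standard definition D_AB = p_AB - p_A * p_B; the function names are illustrative only.

```python
def disequilibrium(p_AB, p_A, p_B):
    """D_AB = p_AB - p_A * p_B (standard definition, assumed here)."""
    return p_AB - p_A * p_B


def d_bounds(p_A, p_B):
    """Smallest and largest D_AB compatible with the allele frequencies.

    The four haplotype frequencies p_AB, p_Ab, p_aB, p_ab must all be
    non-negative, which yields the limits below.
    """
    lower = max(-p_A * p_B, -(1 - p_A) * (1 - p_B))
    upper = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    return lower, upper


# The attainable range shrinks as the allele frequencies become extreme.
print(d_bounds(0.5, 0.5))
print(d_bounds(0.9, 0.9))
```

Because the bounds change with p_A and p_B, the same value of D_AB can mean complete association at one marker pair and weak association at another.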

7 r² (also called Δ²)
Ranges between 0 and 1
1 when the two markers provide identical information
0 when they are in perfect equilibrium
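
A minimal sketch of the two extremes, assuming the usual definition r² = D² / (p_A p_a p_B p_b) and haplotype counts as input. The function name and the count-based interface are illustrative assumptions, not taken from the lecture.

```python
def r_squared(n_AB, n_Ab, n_aB, n_ab):
    """r^2 between two biallelic markers from the four haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n          # frequency of allele A at marker 1
    p_B = (n_AB + n_aB) / n          # frequency of allele B at marker 2
    d = n_AB / n - p_A * p_B         # disequilibrium coefficient
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a marker is monomorphic")
    return d * d / denom


# Only AB and ab haplotypes occur: the markers carry identical information.
print(r_squared(60, 0, 0, 40))
# Haplotype frequencies equal the product of allele frequencies: equilibrium.
print(r_squared(25, 25, 25, 25))
```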

8 Raw r² data from chr22

9 Comparing Populations
CEPH: Utah residents with ancestry from northern and western Europe (CEU)

10 Use LD for SNP imputation and detection
fastPhase

11 Use LD for SNP imputation and detection
fastPhase

12 Model for haplotypes
Observed n haplotypes, each with M markers: b_ij = 0, 1
Assume each haplotype originates from one of K clusters
z_i: unknown cluster of origin of b_i
Since the clusters of origin are unknown

13 Local clustering of haplotypes
Assume z_i = (z_i1, …, z_iM) forms a Markov chain on {1, …, K}
z_im denotes the cluster of origin for b_im
Initial probabilities
Transition probabilities
Conditional on the cluster of origin
Marginal
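
The marginal probability of a haplotype under such a hidden Markov chain can be computed with the forward algorithm. The sketch below is generic: the slide's formulas are not preserved here, so the parameter names (`init`, `trans`, `theta`) and the use of a single transition matrix for all markers are assumptions made for illustration, not the fastPhase parameterization.

```python
from itertools import product


def haplotype_likelihood(b, init, trans, theta):
    """P(b) for one haplotype, summing over all hidden cluster paths.

    b     : list of 0/1 alleles, one per marker
    init  : init[k] = P(z_1 = k)
    trans : trans[j][k] = P(z_m = k | z_{m-1} = j), same at every marker
    theta : theta[k][m] = P(b_m = 1 | z_m = k)
    """
    K = len(init)

    def emit(k, m):
        return theta[k][m] if b[m] == 1 else 1.0 - theta[k][m]

    # fwd[k] = P(b_1..b_m, z_m = k)
    fwd = [init[k] * emit(k, 0) for k in range(K)]
    for m in range(1, len(b)):
        fwd = [
            sum(fwd[j] * trans[j][k] for j in range(K)) * emit(k, m)
            for k in range(K)
        ]
    return sum(fwd)


# Hypothetical parameters for K = 2 clusters and M = 3 markers.
init = [0.3, 0.7]
trans = [[0.9, 0.1], [0.2, 0.8]]
theta = [[0.1, 0.8, 0.5], [0.9, 0.3, 0.6]]

# Sanity check: the probabilities of all 2^3 haplotypes sum to one.
total = sum(haplotype_likelihood(list(h), init, trans, theta)
            for h in product([0, 1], repeat=3))
print(total)
```

The forward recursion costs O(M K²) per haplotype, whereas summing over every cluster path directly would cost O(K^M).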

14 Local clustering of genotype data
We have genotype data
g_im: genotype at marker m of individual i, taking values 0, 1, 2
Initial probabilities (unordered clusters of origin)
Transition probabilities

15 Local clustering of genotype data
Genotype probabilities conditional on clusters of origin
Joint likelihood
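
One common way to write the genotype probabilities conditional on an unordered pair of clusters is to treat the two haplotypes as independent draws, each carrying allele 1 with its cluster's frequency. The sketch below assumes that form; `theta_1` and `theta_2` are hypothetical names for the allele-1 frequencies of the two clusters at a given marker.

```python
def genotype_probs(theta_1, theta_2):
    """[P(g=0), P(g=1), P(g=2)] given the two clusters of origin.

    g counts copies of allele 1; the two haplotypes are assumed to be
    independent given their clusters.
    """
    p0 = (1 - theta_1) * (1 - theta_2)                      # neither carries allele 1
    p1 = theta_1 * (1 - theta_2) + (1 - theta_1) * theta_2  # exactly one does
    p2 = theta_1 * theta_2                                  # both do
    return [p0, p1, p2]


probs = genotype_probs(0.2, 0.5)
print(probs, sum(probs))
```

The result does not depend on the order of the two arguments, which is why only the unordered pair of clusters matters.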

16 Algorithms for genotype imputation
fastPhase, BEAGLE, IMPUTE, PLINK, MaCH

17 Algorithms for genotype imputation
fastPhase, BEAGLE, IMPUTE, PLINK, MaCH
Picture taken from IMPUTE v2

18 SNP detection with LD information
MaCH: (G: genotype, S: cluster)

19 SNP detection with LD information
For sequencing data, G is not observed
The coverages of bases A and B are observed, so we have the HMM
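
With read counts in place of observed genotypes, the HMM needs an emission term linking the hidden genotype to the coverage of each base. The sketch below uses a simple binomial model with a symmetric per-read error rate; this model, the `error` default, and the function name are illustrative assumptions and may differ from the one used in the lecture.

```python
from math import comb


def read_count_likelihoods(n_A, n_B, error=0.01):
    """[P(counts | G=0), P(counts | G=1), P(counts | G=2)].

    G is the number of B alleles in the genotype. Each read is assumed
    to show base B independently with probability `error`, 0.5, or
    1 - `error` for G = 0, 1, 2 respectively.
    """
    n = n_A + n_B
    prob_read_is_B = {0: error, 1: 0.5, 2: 1.0 - error}
    likelihoods = []
    for g in (0, 1, 2):
        p = prob_read_is_B[g]
        likelihoods.append(comb(n, n_B) * p ** n_B * (1 - p) ** n_A)
    return likelihoods


# Ten reads of A and none of B favour the AA genotype (G = 0);
# an even split favours the heterozygote (G = 1).
print(read_count_likelihoods(10, 0))
print(read_count_likelihoods(5, 5))
```

At low coverage these likelihoods are often flat across genotypes, which is where LD information from neighbouring markers helps.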

20 SNP detection with LD information
Nielsen et al., Nature Reviews Genetics

