IBD Estimation in Pedigrees

1 IBD Estimation in Pedigrees
Gonçalo Abecasis, University of Oxford

2 3 Stages of Genetic Mapping
Are there genes influencing this trait? Epidemiological studies
Where are those genes? Linkage analysis
What are those genes? Association analysis

3

4 Relationship Checking

5 Where are those genes?

6 Tracing Chromosomes

7 Sometimes it is easy…

8 Sharing, or Not?

9 Data, Task
Data
  Polymorphic markers, e.g. microsatellite repeats, SNPs
  Allele frequency
  Location
Task
  Phase markers
  Place recombinants

10 Complexity of the Problem
For each meiosis: in a pedigree with n non-founders, there are 2n meioses, each with 2 possible outcomes
For each location: one for each of m markers
Up to $4^{nm}$ distinct outcomes
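The count on this slide can be checked with a little arithmetic. The following is a minimal sketch; the function name and its arguments are illustrative and not taken from any package mentioned in the talk.

```python
def count_gene_flow_outcomes(n_nonfounders: int, n_markers: int) -> int:
    """Upper bound on distinct gene-flow outcomes across all markers."""
    meioses = 2 * n_nonfounders          # one paternal and one maternal meiosis per non-founder
    per_location = 2 ** meioses          # each meiosis has 2 outcomes, so 4**n per location
    return per_location ** n_markers     # independent choice at each of m markers: 4**(n*m)


if __name__ == "__main__":
    # A sib pair (n = 2 non-founders) typed at m = 3 markers.
    print(count_gene_flow_outcomes(2, 3))
```

Even for this tiny pedigree the bound grows quickly with the number of markers, which is why the algorithms on the following slides factorize the likelihood rather than enumerate every outcome.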

11 Elston-Stewart Algorithm
Factorize likelihood by individual
Each step assigns phase for all markers for one individual
Complexity $\propto n \cdot 4^m$
Small number of markers
Large pedigrees, with little inbreeding

12 Lander-Green Algorithm
Factorize likelihood by marker
Each step assigns phase for one marker, for all individuals in the pedigree
Complexity $\propto m \cdot 4^n$
Large number of markers
Assumes no interference
Relatively small pedigrees
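To see why the two algorithms suit different data sets, the scaling rules from slides 11 and 12 can be compared directly. This is only a sketch: it treats the proportionality constants as 1, which is an assumption, and the function names are made up for illustration.

```python
def elston_stewart_cost(n_nonfounders: int, n_markers: int) -> int:
    """Relative cost when factorizing by individual: n * 4**m."""
    return n_nonfounders * 4 ** n_markers


def lander_green_cost(n_nonfounders: int, n_markers: int) -> int:
    """Relative cost when factorizing by marker: m * 4**n."""
    return n_markers * 4 ** n_nonfounders


def cheaper(n_nonfounders: int, n_markers: int) -> str:
    """Name the algorithm with the smaller relative cost (constants ignored)."""
    es = elston_stewart_cost(n_nonfounders, n_markers)
    lg = lander_green_cost(n_nonfounders, n_markers)
    return "Elston-Stewart" if es < lg else "Lander-Green"


if __name__ == "__main__":
    print(cheaper(1000, 5))   # large pedigree, few markers
    print(cheaper(10, 500))   # small pedigree, dense map
```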

13 Markov-Chain Monte-Carlo
Approximate solutions
  Explore only most likely outcomes
Remove restrictions
  Pedigree size
  Number of markers
  Inbreeding
Assuming no interference
Computationally intensive

14 Popular Packages
Elston-Stewart Algorithm
  LINKAGE / FASTLINK (Lathrop et al, 1985)
  VITESSE (O’Connell and Weeks, 1995)
Lander-Green Algorithm
  Genehunter (Kruglyak et al, 1995)
  Allegro (Gudbjartsson et al, 2000)
MCMC
  Simwalk2 (Sobel et al, 1996)
  LOKI (Heath, 1998)

15 1. Enumerate Possibilities
Enumerate gene-flow patterns
Gene-flow pattern:
  Sets transmitted allele for each meiosis
  Implies founder allele for each individual
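The enumeration step can be sketched for a small pedigree. The pedigree below (two founders and a sib pair), the bit encoding of a pattern, and all names are assumptions chosen for illustration; non-founders must be listed after their parents.

```python
from itertools import product

# Hypothetical pedigree: two founders and two children.
FOUNDERS = ["father", "mother"]
NONFOUNDERS = {"sib1": ("father", "mother"), "sib2": ("father", "mother")}


def gene_flow_patterns():
    """Yield every gene-flow pattern: one bit per meiosis.

    Bit 0 means the parent transmitted its first (paternal) allele,
    bit 1 means it transmitted its second (maternal) allele.
    """
    return product((0, 1), repeat=2 * len(NONFOUNDERS))


def founder_alleles(pattern):
    """Map each individual to the two founder-allele labels it carries."""
    # Founder i carries the distinct labels 2i and 2i + 1.
    carried = {name: (2 * i, 2 * i + 1) for i, name in enumerate(FOUNDERS)}
    for j, (child, (dad, mom)) in enumerate(NONFOUNDERS.items()):
        from_dad = carried[dad][pattern[2 * j]]       # paternal meiosis of child j
        from_mom = carried[mom][pattern[2 * j + 1]]   # maternal meiosis of child j
        carried[child] = (from_dad, from_mom)
    return carried


if __name__ == "__main__":
    for pattern in gene_flow_patterns():
        print(pattern, founder_alleles(pattern))
```

With two non-founders there are four meioses, so the loop visits 16 patterns, each implying which founder alleles every sib carries.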

16 2. Founder Allele Sets
For each gene flow pattern v, enumerate the set A(G,v)
  All allele states $a = [a_1, \ldots, a_{2f}]$
  Compatible with both: gene flow v, genotypes G
The likelihood is $L(v \mid G) = 2^{-2n} \sum_{a \in A(G,v)} \prod_i f(a_i)$
$f(a_i)$ is the frequency of allele $a_i$

17 For example ...
Figure panels: Genotypes, Gene Flow, Founder Alleles
Four meioses. Three "1" alleles required.
Likelihood = $(1/2)^4 \, f(a_1)^3$
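A brute-force version of the founder-allele calculation reproduces a likelihood of this form. This is a sketch under stated assumptions: the sib-pair pedigree, the genotypes, the allele frequencies, and every function name are hypothetical, and the slide's actual figure is not available in this transcript. The code sums the product of founder allele frequencies over all allele states compatible with the genotypes and the gene flow, then multiplies by 1/2 per meiosis.

```python
from itertools import product


def carried_alleles(pattern, founders, nonfounders):
    """Founder-allele slots carried by each person under one gene-flow pattern."""
    carried = {name: (2 * i, 2 * i + 1) for i, name in enumerate(founders)}
    for j, (child, (dad, mom)) in enumerate(nonfounders.items()):
        carried[child] = (carried[dad][pattern[2 * j]],
                          carried[mom][pattern[2 * j + 1]])
    return carried


def pattern_likelihood(pattern, founders, nonfounders, genotypes, freqs):
    """Sum allele-frequency products over compatible founder allele states."""
    carried = carried_alleles(pattern, founders, nonfounders)
    total = 0.0
    for state in product(list(freqs), repeat=2 * len(founders)):
        # A state is compatible if every typed person shows the observed genotype.
        compatible = all(
            sorted((state[s1], state[s2])) == sorted(genotypes[person])
            for person, (s1, s2) in carried.items()
            if person in genotypes
        )
        if compatible:
            weight = 1.0
            for allele in state:
                weight *= freqs[allele]
            total += weight
    # Each of the 2n meioses has prior probability 1/2.
    return 0.5 ** (2 * len(nonfounders)) * total


if __name__ == "__main__":
    founders = ["father", "mother"]
    nonfounders = {"sib1": ("father", "mother"), "sib2": ("father", "mother")}
    genotypes = {"sib1": (1, 1), "sib2": (1, 1)}   # parents untyped
    freqs = {1: 0.3, 2: 0.7}
    # Sibs share the father's first allele but receive different maternal alleles,
    # so three founder alleles must be "1" and the fourth is unconstrained.
    print(pattern_likelihood((0, 0, 0, 1), founders, nonfounders, genotypes, freqs))
```

In this hypothetical case the unconstrained founder allele sums out to 1, leaving (1/2)^4 times f(1)^3.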

18 Single Marker Probabilities
We now have ...
Likelihood for each gene flow pattern
  Conditional on genotypes
  Conditional on allele frequencies
  Conditional on a single marker
Probability for each gene-flow pattern: $P(v) = L(v) / \sum_{v'} L(v')$
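Turning single-marker likelihoods into probabilities is a plain normalization. A minimal sketch, with an illustrative function name:

```python
def pattern_probabilities(likelihoods):
    """Divide each gene-flow pattern likelihood by the sum over all patterns."""
    total = sum(likelihoods)
    if total == 0:
        # No pattern is compatible with the genotypes (e.g. a Mendelian inconsistency).
        raise ValueError("all gene-flow patterns have zero likelihood")
    return [value / total for value in likelihoods]


if __name__ == "__main__":
    print(pattern_probabilities([1.0, 1.0, 2.0]))
```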

19 3. Allowing for Recombination
Transition Probability: $T(v_a \to v_b, \theta) = (1-\theta)^{\,n - r(v_a, v_b)} \, \theta^{\,r(v_a, v_b)}$
Transition Matrix (figure): Location A, Location B
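The transition probability between two gene-flow patterns depends only on how many meioses change outcome. The sketch below assumes patterns are encoded as integers with one bit per meiosis, that r counts the differing bits, and that the exponent n in the formula is the total number of meioses; the encoding and names are illustrative.

```python
def transition_probability(va: int, vb: int, n_meioses: int, theta: float) -> float:
    """Probability of moving from pattern va to pattern vb between two locations."""
    r = bin(va ^ vb).count("1")          # meioses whose outcome differs (recombinants)
    return (1 - theta) ** (n_meioses - r) * theta ** r


def transition_matrix(n_meioses: int, theta: float):
    """Full k x k matrix, k = 2**n_meioses, with rows indexed by the pattern at location A."""
    k = 2 ** n_meioses
    return [[transition_probability(a, b, n_meioses, theta) for b in range(k)]
            for a in range(k)]


if __name__ == "__main__":
    for row in transition_matrix(2, 0.1):
        print(row)
```

Each row sums to 1, since summing over all target patterns gives ((1 - theta) + theta) raised to the number of meioses.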

20 Moving along chromosome
Input
  Vector v of likelihoods at location A
  Matrix T of transition probabilities from A to B
Output
  Vector v' of likelihoods at location B, conditional on likelihoods at A
For k vectors, requires $k^2$ operations
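The step described here is a vector-matrix product. A minimal sketch, assuming T is given as a list of rows indexed by the pattern at location A; the function name is illustrative.

```python
def advance(v, T):
    """Carry likelihoods from location A to location B: v'[b] = sum_a v[a] * T[a][b]."""
    k = len(v)
    # Two nested loops over k patterns: k * k multiply-adds.
    return [sum(v[a] * T[a][b] for a in range(k)) for b in range(k)]


if __name__ == "__main__":
    # Toy case with a single meiosis (k = 2) and recombination fraction 0.1.
    T = [[0.9, 0.1],
         [0.1, 0.9]]
    print(advance([1.0, 0.0], T))
```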

21 Elston and Idury Algorithm
Requires $k \log_2 k$ operations
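The saving comes from the fact that meioses recombine independently, so the full transition can be applied one meiosis at a time. The sketch below illustrates that idea with a butterfly-style update and checks it against the dense product; it is an illustration of the divide-and-conquer principle under the bit encoding assumed here, not a transcription of the published algorithm.

```python
def dense_step(v, n_meioses, theta):
    """Reference k*k computation using the full transition matrix."""
    k = 2 ** n_meioses
    out = []
    for b in range(k):
        total = 0.0
        for a in range(k):
            r = bin(a ^ b).count("1")
            total += v[a] * (1 - theta) ** (n_meioses - r) * theta ** r
        out.append(total)
    return out


def fast_step(v, n_meioses, theta):
    """Same result using one pass per meiosis: about k * log2(k) operations."""
    out = list(v)
    for bit in range(n_meioses):
        mask = 1 << bit
        for i in range(len(out)):
            if not i & mask:                      # visit each pair (i, i | mask) once
                a, b = out[i], out[i | mask]
                out[i] = (1 - theta) * a + theta * b
                out[i | mask] = theta * a + (1 - theta) * b
    return out


if __name__ == "__main__":
    v = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    print(dense_step(v, 3, 0.2))
    print(fast_step(v, 3, 0.2))
```

For k = 2^n patterns the fast version does n passes over k/2 pairs, which matters once n reaches the teens.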

22 Moving Along Chromosome

23 Markov-Chains
Figure panels: Single Marker, Left Conditional, Right Conditional, Full Likelihood
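Only the panel titles of this slide survive, so the following is a sketch of how such quantities are usually combined in a hidden Markov chain, not a reconstruction of the figure. It assumes the full likelihood at a marker is the elementwise product of the left conditional, the single-marker likelihood, and the right conditional, with conditionals propagated across each interval one meiosis at a time. All names and the toy numbers are hypothetical.

```python
def step(vec, theta, n_meioses):
    """Apply the recombination transition for one interval, one meiosis at a time."""
    out = list(vec)
    for bit in range(n_meioses):
        mask = 1 << bit
        for i in range(len(out)):
            if not i & mask:
                a, b = out[i], out[i | mask]
                out[i] = (1 - theta) * a + theta * b
                out[i | mask] = theta * a + (1 - theta) * b
    return out


def full_likelihoods(single, thetas, n_meioses):
    """single[j]: per-pattern likelihoods at marker j; thetas[j]: between markers j and j+1."""
    m, k = len(single), 2 ** n_meioses
    left = [[1.0] * k for _ in range(m)]
    right = [[1.0] * k for _ in range(m)]
    for j in range(1, m):                          # sweep left to right
        prev = [l * s for l, s in zip(left[j - 1], single[j - 1])]
        left[j] = step(prev, thetas[j - 1], n_meioses)
    for j in range(m - 2, -1, -1):                 # sweep right to left
        nxt = [r * s for r, s in zip(right[j + 1], single[j + 1])]
        right[j] = step(nxt, thetas[j], n_meioses)
    return [[l * s * r for l, s, r in zip(left[j], single[j], right[j])]
            for j in range(m)]


if __name__ == "__main__":
    single = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0], [1.0, 0.0, 0.0, 1.0]]
    for row in full_likelihoods(single, [0.1, 0.2], n_meioses=2):
        print(row, sum(row))
```

A useful sanity check is that the sum over patterns is the same at every marker, because each sum is the likelihood of all the data.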

24 MERLIN
Fast multipoint calculations
Non-parametric linkage analyses
Error detection
  e.g., unlikely obligate recombinants
Haplotyping
  most likely, exhaustive lists, sampling

25 Sparse Gene Flow Trees

26 Dense maps
Computational challenge
  Require more memory
  Require Lander-Green algorithm
  Limited pedigree size
Computational advantages
  Reduced recombination between markers
  Approximate solutions possible if steps with many recombinants are ignored

27 MERLIN: Example Pedigrees

28 MERLIN: Timings

29 MERLIN: Memory Usage

30 Command Line Options

31 Effect of Genotyping Error
Modest levels are likely
  Up to 1% may be typical
Mendelian inheritance checks
  Detect up to 30% of errors for SNPs
Effect on power
  Linkage vs. Association
  SNPs vs. Microsatellites

32 Affected Sib Pair Sample

33 Unselected Sample

34 Association Analysis

35 Error Detection
Genotype errors can introduce unlikely recombinants
Change likelihood
  Replace $(1-\theta)$ with $\theta$
Test sensitivity of likelihood to each genotype
Detects errors that have largest effect on linkage

36 Practical Exercise
Lon Cardon, Stacey Cherny

