Basics of Linkage Analysis


Basics of Linkage Analysis. Outline: Idea of Linkage Analysis; Types of Linkage Analysis; Parametric Linkage Analysis; Nonparametric Linkage Analysis; Conclusions.

Gene mapping problem. Source: Morgan Genetics Tutorial. http://morgan.rutgers.edu/morganwebframes/level1/page2/karyotype.html

Linkage analysis: one of the two main approaches in gene mapping. It uses pedigree data.

Genetic linkage and linkage analysis: Two loci are linked if they lie close to each other on the same chromosome. The task of linkage analysis is to find markers that are linked to the hypothetical disease locus. Complex diseases are in focus, so one usually needs to search for one gene at a time. This requires mathematical modelling of meiosis.

Meiosis and crossover: The number of crossover sites is thought to follow a Poisson distribution. Their locations are generally random and independent of each other.
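Under these assumptions (Poisson-distributed crossover counts at independent locations, i.e. no interference), a recombination between two loci is observed when an odd number of crossovers falls between them. This leads to Haldane's map function, which the slides do not name. The sketch below is illustrative only; the function names are hypothetical.

```python
import math

def haldane_recombination_fraction(distance_morgans):
    """Recombination fraction for a map distance d (in Morgans).

    With a Poisson number of crossovers of mean d between the loci,
    P(odd number of crossovers) = 0.5 * (1 - exp(-2 d)).
    """
    if distance_morgans < 0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgans))

def haldane_map_distance(theta):
    """Inverse map function: map distance (Morgans) for 0 <= theta < 0.5."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must lie in [0, 0.5)")
    return -0.5 * math.log(1.0 - 2.0 * theta)
```

The recombination fraction grows almost linearly for short distances and saturates at 0.5 for distant loci, which is why 0.5 represents "no linkage".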

  L( |data ) DIS Marker The simple idea Recombination fraction Find that maximises Obtain measure for degree of evidence in favour of linkage (LOD score)

Markers and inheritance: Markers are polymorphic loci whose locations are known. They are point mutations (SNPs) or lengths of repetitive sequences, and they are inherited together with the chromosomal segments. [Figure: marker alleles along the chromosomes of father, mother and child.]

Markers and information: If two individuals share the same allele label, they share the allele IBS (identical by state). If two individuals share an allele with the same grandparental origin, they share the allele IBD (identical by descent). IBS sharing can easily be deduced from genotypes. IBD sharing provides more information. One can try to deduce IBD sharing based on family structure and inheritance.

Markers and information (example 1): Parents 1,2 and 2,3; children 1,2 and 1,3. The children share allele 1 IBS. They also share it IBD.

Markers and information (example 2): Parents 1,2 and 1,3; children 1,2 and 1,3. The children share allele 1 IBS. They do not share alleles IBD.

Markers and information (example 3): Parents 1,1 and 2,3; children 1,2 and 1,3. The children share allele 1 IBS. They either share or do not share it IBD.
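The three examples can be checked mechanically. The sketch below is a minimal illustration, not a method from the slides. It counts IBS sharing directly from two genotypes, and it enumerates every parental transmission consistent with the children's genotypes to list the possible IBD counts. It assumes that both parents are genotyped, that they are unrelated, and that a parent's two allele copies are distinct even when they carry the same label. The function names and the father/mother labelling are hypothetical.

```python
from itertools import product

def ibs_count(genotype_a, genotype_b):
    """Number of alleles (0, 1 or 2) two genotypes share identical by state."""
    remaining = list(genotype_b)
    shared = 0
    for allele in genotype_a:
        if allele in remaining:
            remaining.remove(allele)
            shared += 1
    return shared

def transmissions(child, father, mother):
    """All (paternal copy, maternal copy) index pairs that yield the child's genotype."""
    return [(i, j) for i, j in product(range(2), repeat=2)
            if sorted((father[i], mother[j])) == sorted(child)]

def possible_ibd_counts(child1, child2, father, mother):
    """Set of IBD counts compatible with the observed genotypes."""
    counts = set()
    for (i1, j1), (i2, j2) in product(transmissions(child1, father, mother),
                                      transmissions(child2, father, mother)):
        # One shared allele IBD for each parent who passed on the same copy.
        counts.add(int(i1 == i2) + int(j1 == j2))
    return counts

# The three slide examples (children 1,2 and 1,3 in each case).
example_1 = possible_ibd_counts((1, 2), (1, 3), father=(1, 2), mother=(2, 3))
example_2 = possible_ibd_counts((1, 2), (1, 3), father=(1, 2), mother=(1, 3))
example_3 = possible_ibd_counts((1, 2), (1, 3), father=(1, 1), mother=(2, 3))
```

A result with more than one element, as in the third example, means the genotypes alone do not determine the IBD status.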

Building blocks of linkage analysis: marker maps (Chr. 1, Chr. 2, up to Chr. 22), pedigree structures, genotypes and phenotypes. [Figure: example marker maps and a pedigree annotated with genotypes.]

Building blocks of linkage analysis: Information about the disease model (in parametric analysis): P(affected | aa), the probability of a homozygote being affected; P(affected | Aa), the probability of a heterozygote being affected; P(affected | AA), the probability of a non-carrier being affected (phenocopy rate). Information about environmental variables.

13 0 0 5
0 0.0 0.0 0
1 2 3 4 5 6 7 8 9 10 11 12 13
1 2
0.99 0.01
1
0.001000 0.999000 0.999000
3 5 # M0
0.172 0.036 0.176 0.283 0.333
3 5 # M1
0.100 0.345 0.310 0.164 0.081
(---clip---)
3 5 # M10
0.169 0.432 0.147 0.130 0.122
3 5 # M11
0.397 0.204 0.151 0.043 0.205
0 0
0.10 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
1 0.1 0.45

1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 2 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 3 1 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 4 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 5 1 2 1 1 3 1 2 1 4 5 1 6 3 2 3 3 5 4 2 5 4 3 2 5 3 2 4 5
1 6 0 0 2 2 4 4 3 4 2 5 9 7 1 7 1 3 2 5 5 3 4 6 2 5 2 1 2 1
1 7 0 0 1 1 5 1 2 2 5 2 6 6 5 1 1 3 3 2 5 5 3 4 5 5 4 4 2 1
1 8 1 2 2 1 3 3 2 2 4 5 6 6 3 5 1 4 5 4 1 5 4 3 2 5 3 2 4 5
1 9 3 4 1 1 3 4 2 2 5 2 6 6 5 5 4 1 4 5 5 3 3 6 2 5 2 3 2 1
1 10 3 4 2 1 3 4 2 5 4 3 1 6 3 2 1 3 5 5 2 3 4 2 4 5 4 3 5 1
1 11 5 6 1 1 1 4 1 4 5 5 6 7 2 7 3 3 5 5 2 3 4 6 2 5 3 1 4 1
1 12 5 6 2 2 3 4 2 4 4 5 1 7 3 7 3 3 5 5 2 3 4 6 5 5 2 1 5 1
1 13 5 6 1 2 3 4 1 4 5 5 6 7 2 7 3 3 5 5 2 3 4 6 2 5 3 1 5 1
1 14 7 8 1 1 5 3 2 2 2 5 6 6 5 5 1 4 3 4 5 5 3 3 5 5 4 2 2 5
1 15 7 8 2 1 1 3 2 2 2 5 6 6 5 5 1 4 3 4 5 5 3 3 5 2 4 3 2 4
1 16 0 0 1 1 5 5 2 4 5 6 3 1 1 1 3 4 4 4 3 7 1 3 5 4 3 2 1 4
1 17 16 12 1 2 5 4 2 4 5 5 3 7 1 7 3 3 4 5 7 3 3 6 4 5 2 1 4 5
1 18 16 12 2 1 5 3 4 2 6 4 1 1 1 3 3 3 4 5 3 2 1 6 4 5 2 1 4 1
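The pedigree records in the listing above appear to follow a LINKAGE-style layout: pedigree ID, individual ID, father ID, mother ID, sex, affection status, then two alleles per marker, with 0 meaning unknown. The slide does not state the column meanings, so they are an assumption inferred from the data. Under that assumption, a record could be parsed roughly as follows; `PedigreeRecord` and `parse_pedigree_line` are hypothetical names.

```python
from typing import NamedTuple

class PedigreeRecord(NamedTuple):
    pedigree: str
    individual: str
    father: str      # "0" for founders
    mother: str      # "0" for founders
    sex: int
    affection: int   # coding assumed: 0 unknown, 1 unaffected, 2 affected
    genotypes: list  # one (allele, allele) tuple per marker; 0 = untyped

def parse_pedigree_line(line):
    """Split one whitespace-separated pedigree record into named fields."""
    fields = line.split()
    if len(fields) < 6 or (len(fields) - 6) % 2 != 0:
        raise ValueError("expected 6 fixed fields plus allele pairs")
    alleles = [int(value) for value in fields[6:]]
    genotypes = list(zip(alleles[0::2], alleles[1::2]))
    return PedigreeRecord(fields[0], fields[1], fields[2], fields[3],
                          int(fields[4]), int(fields[5]), genotypes)

# Individual 5 from the listing.
record = parse_pedigree_line(
    "1 5 1 2 1 1 3 1 2 1 4 5 1 6 3 2 3 3 5 4 2 5 4 3 2 5 3 2 4 5")
```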

Example of linkage analysis results for one chromosome

Types of linkage analysis Parametric vs. non-parametric Dichotomous vs. continuous phenotypes Elston-Stewart vs. Lander-Green vs. heuristic Two-point vs. multipoint Genome scan vs. candidate gene

Maximum likelihood estimation: a common approach in statistical estimation. Steps: (1) define hypotheses; (2) generate the likelihood function; (3) estimate; (4) test hypotheses; (5) draw statistical conclusions.

Hypotheses in linkage analysis: $H_0$: $\theta = 0.5$, the disease locus is not linked to the marker(s). $H_A$: $\theta < 0.5$, the disease locus is linked to the marker(s).

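Likelihood function for a single nuclear family: $L_j = \sum_{g_F} P(g_F)\,P(y_F \mid g_F) \sum_{g_M} P(g_M)\,P(y_M \mid g_M) \prod_i \sum_{g_{O_i}} P(g_{O_i} \mid g_F, g_M)\,P(y_{O_i} \mid g_{O_i})$, where $g$ denotes the genotypes and $y$ the phenotypes of the father ($F$), the mother ($M$) and the offspring ($O_i$). The parameter $\theta$ is incorporated in the transmission probabilities $P(g_{O_i} \mid g_F, g_M)$.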
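To make the nested sum-product structure concrete, here is a deliberately simplified sketch for a single biallelic disease locus with no marker. Because there is no marker, $\theta$ does not appear and the transmission term reduces to plain Mendelian segregation. Genotypes are coded as the number of disease alleles (0, 1, 2) and founder genotype frequencies assume Hardy-Weinberg proportions. All names and numeric values are illustrative assumptions, not taken from the slides.

```python
def phenotype_prob(phenotype, genotype, penetrance):
    """P(y | g) for phenotype 'affected', 'unaffected' or 'unknown'."""
    if phenotype == "affected":
        return penetrance[genotype]
    if phenotype == "unaffected":
        return 1.0 - penetrance[genotype]
    return 1.0  # unknown phenotype carries no information

def transmission_prob(g_child, g_father, g_mother):
    """Mendelian P(g_child | g_father, g_mother), genotypes as disease-allele counts."""
    t_f = g_father / 2.0  # probability the father transmits the disease allele
    t_m = g_mother / 2.0
    return [(1 - t_f) * (1 - t_m),
            t_f * (1 - t_m) + (1 - t_f) * t_m,
            t_f * t_m][g_child]

def nuclear_family_likelihood(y_father, y_mother, y_children, q, penetrance):
    """Sum over parental genotypes; product over children of sums over their genotypes."""
    prior = [(1 - q) ** 2, 2 * q * (1 - q), q ** 2]  # Hardy-Weinberg, assumed
    total = 0.0
    for g_f in range(3):
        for g_m in range(3):
            term = (prior[g_f] * phenotype_prob(y_father, g_f, penetrance)
                    * prior[g_m] * phenotype_prob(y_mother, g_m, penetrance))
            for y_child in y_children:
                term *= sum(transmission_prob(g_o, g_f, g_m)
                            * phenotype_prob(y_child, g_o, penetrance)
                            for g_o in range(3))
            total += term
    return total

# Illustrative dominant model: penetrances indexed by number of disease alleles.
example_penetrance = [0.001, 0.999, 0.999]
example_likelihood = nuclear_family_likelihood(
    "affected", "unaffected", ["affected", "unaffected"], 0.01, example_penetrance)
```

In a real linkage likelihood the genotypes would be two-locus (disease plus marker) and the transmission term would depend on $\theta$; the loop structure stays the same.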

Several independent families: The likelihood functions of multiple independent families are combined: $L = \prod_j L_j$ or $\log L = \sum_j \log L_j$.

Testing of hypotheses: Compute the values of the likelihood function under the null and alternative hypotheses. Their relationship is expressed by the LOD score (essentially derived from the likelihood ratio test statistic).
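As a hedged illustration, the LOD score at a given $\theta$ is the base-10 logarithm of the likelihood ratio $L(\theta)/L(0.5)$, and for independent families the per-family contributions add. The sketch assumes the simplest setting of phase-known, fully informative meioses, where a family with $k$ recombinants among $n$ meioses has likelihood $\theta^k (1-\theta)^{n-k}$. The grid search and the family counts are hypothetical choices made for the example.

```python
import math

def log10_likelihood(theta, n_meioses, n_recombinants):
    """log10 of theta^k * (1 - theta)^(n - k) for phase-known meioses."""
    k, n = n_recombinants, n_meioses
    if theta == 0.0:
        # theta = 0 is only compatible with zero recombinants.
        return 0.0 if k == 0 else -math.inf
    return k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)

def lod_score(theta, families):
    """Sum of per-family LOD contributions; families is a list of (n, k) pairs."""
    return sum(log10_likelihood(theta, n, k) - log10_likelihood(0.5, n, k)
               for n, k in families)

def maximise_lod(families, grid_size=501):
    """Grid search for the theta in [0, 0.5] with the highest LOD score."""
    thetas = [0.5 * i / (grid_size - 1) for i in range(grid_size)]
    best_theta = max(thetas, key=lambda theta: lod_score(theta, families))
    return best_theta, lod_score(best_theta, families)

# Hypothetical data: two families with (meioses, recombinants).
example_families = [(10, 1), (8, 0)]
theta_hat, max_lod = maximise_lod(example_families)
```

By construction the LOD score is zero at $\theta = 0.5$, so a positive maximum indicates that some degree of linkage explains the data better than free recombination.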

On significance levels: The significance level is the probability that a null hypothesis is rejected even though it is true. A LOD-score threshold of 3 corresponds to a single-test p-value of approximately 0.0001. In a genome-wide gene mapping study, one conducts several (partially dependent) statistical tests. Applying the aforementioned threshold, the global p-value of 400 mutually independent tests would be $1 - (1 - 0.0001)^{400} = 0.039 < 0.05$. What if one focuses on individual candidate regions?
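The arithmetic can be reproduced in a few lines. The conversion from LOD score to p-value below uses the usual large-sample approximation, which the slides do not spell out and is therefore an assumption here: $2 \ln(10) \cdot \text{LOD}$ is treated as a $\chi^2$ variable with 1 degree of freedom, and the tail probability is halved because the alternative $\theta < 0.5$ is one-sided. Function names are hypothetical.

```python
import math

def lod_to_pvalue(lod):
    """Approximate one-sided p-value for a LOD score (asymptotic assumption)."""
    chi_square = 2.0 * math.log(10.0) * lod
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(chi_square / 2.0))

def global_pvalue(single_test_p, n_tests):
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - single_test_p) ** n_tests

single_p = lod_to_pvalue(3.0)
genome_wide_p = global_pvalue(0.0001, 400)
```

Real genome scans involve correlated tests along each chromosome, so the independence assumption is only a rough guide.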

An example of ML estimation: single marker, dominant disease, all genotypes known. [Figure: pedigree with marker genotypes 1,3 2,4 2,3 1,4 1,2 1,4 1,2 3,4 1,2.]

Paternal haplotype combinations Haplotype combinations of children, assuming unlinked loci

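[Figure: the example pedigree (marker genotypes 1,3 2,4 2,3 1,4 1,2 3,4) with paternal haplotypes over marker alleles 1 and 3 and disease-locus alleles D and +; labels shown: $1-\theta$, $\theta$, ½.]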

[Figure: LOD score plotted against recombination fraction (0.0 to 0.5); values marked on the plot: 0.56 and 0.14.] LOD > 3 is taken as evidence of linkage.

Idea of nonparametric linkage analysis: No assumption is made about the disease model. The tests measure IBD sharing of alleles among affected relatives. ASP (the affected-sib-pair test) is the simplest form of NPL. It requires nuclear families with two affected children. It is extendable to arbitrary pedigrees, missing data, and arbitrary groups of affected relatives.

Example analysis for one marker. [Figure: pedigrees with marker alleles 3 4 3 4 1 2 1 2 2 4 1 3 1 3 2 3 1 3 2 3 3 4 3 4.]

Idea of ASP test: Collect a large number of families with two affected offspring and deduce the IBD status for each pair of offspring. Denote the number of sib pairs with IBD status zero by $n_0$. Correspondingly, $n_1$ and $n_2$ are the observed counts of sib pairs that share 1 or 2 alleles IBD. Compare the counts against the expected distribution by computing the value of the $\chi^2$ test statistic.

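Test statistic for ASP: $\chi^2 = \sum_{i=0}^{2} (n_i - e_i)^2 / e_i$, where $e_0 = 0.25n$, $e_1 = 0.5n$ and $e_2 = 0.25n$, and $n = n_0 + n_1 + n_2$ is the total number of sib pairs. This is a $\chi^2$ test with 2 degrees of freedom. Homozygous parents are a problem. There are lots of variants and implementations.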
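A minimal sketch of this goodness-of-fit computation follows. It assumes the IBD status of every sib pair is known unambiguously. The function name and the counts in the example are hypothetical. For 2 degrees of freedom the $\chi^2$ tail probability has the closed form $e^{-x/2}$, so no statistics library is needed.

```python
import math

def asp_chi_square(n0, n1, n2):
    """Chi-square statistic and p-value for affected-sib-pair IBD counts.

    n0, n1, n2: numbers of sib pairs sharing 0, 1 and 2 alleles IBD.
    """
    n = n0 + n1 + n2
    if n == 0:
        raise ValueError("at least one sib pair is required")
    observed = (n0, n1, n2)
    expected = (0.25 * n, 0.5 * n, 0.25 * n)  # no-linkage expectation
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.exp(-statistic / 2.0)  # chi-square tail, 2 degrees of freedom
    return statistic, p_value

# Hypothetical counts with excess sharing among 100 affected sib pairs.
example_statistic, example_p = asp_chi_square(10, 50, 40)
```

The asymptotic p-value is only trustworthy when the expected counts are reasonably large.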

Idea of nonparametric linkage analysis: Compare to the $\chi^2$ cumulative distribution function (with 2 degrees of freedom): P = 0.0012. The sample is too small for the $\chi^2$ test to be reliable.

Conclusions: Linkage analysis is a pedigree-based approach to gene mapping. Parametric vs. nonparametric methods. Hypothesis-driven vs. exploratory analysis. Meta-analysis is becoming increasingly popular.