European Academy Bozen/Bolzano EURAC Dec 16, 2005


1 European Academy Bozen/Bolzano EURAC Dec 16, 2005
The genetic structure of human populations and the search for complex disease genes. Silvano Presciuttini, European Academy Bozen/Bolzano (EURAC), Dec 16, 2005

2 Overview
In order to locate genes with moderate phenotypic effect, we must use methods based on linkage disequilibrium (LD). LD is a property of the population (unlike linkage, which is a property of the species); therefore, we need detailed knowledge of a population’s genetic structure to design efficient LD mapping studies.

3 Outline
The classical approach to locating genes with respect to one another in experimental organisms was based on linkage analysis.
The same approach obviously applies to human genes; however, detecting linkage in humans is not easy, and the statistical treatment is complicated.
For assessing linkage of Mendelian diseases, classical linkage analysis is a robust method; however, fine mapping is impractical.
At genetic distances where linkage analysis becomes unfeasible, LD mapping starts being useful.
For complex diseases, we may still apply linkage analysis, but we need a good genetic model; in addition, the power to detect linkage decreases.
LD may also be efficient in detecting genes that increase disease risk.
LD depends on population history, and varies across different populations.
Isolated populations founded by a small number of individuals should be preferred when planning LD mapping studies.

4 Linkage analysis in experimental organisms
In the early 1900s, Bateson and Punnett were studying inheritance in the sweet pea. They crossed pure lines P/P · L/L (purple flower, long pollen) × p/p · l/l (red, round), and selfed the F1 heterozygotes. The proportions of the four phenotypes in the F2 plants gave the first evidence of linkage (deviation from the Mendelian principle of independent assortment).
The first genetic map in Drosophila: in Sturtevant's own words, "In the latter part of 1911, in conversation with Morgan, I suddenly realized that the variations in strength of linkage, already attributed by Morgan to differences in the spatial separation of genes, offered the possibility of determining sequences in the linear dimension of a chromosome. I went home and spent most of the night (to the neglect of my undergraduate homework) in producing the first chromosome map."

5 Linkage analysis in humans
In principle, detecting linkage in humans is exactly the same as detecting linkage in any other sexually reproducing diploid organism. The aim is to discover whether two loci segregate independently and, if not, to measure the recombination rate. In practice, detecting linkage in humans was precluded for most of the last century, because suitable polymorphic markers were simply not available. For much of this period, human geneticists were envious spectators, because the idea of constructing a human linkage map was generally considered unattainable.
(Figure caption: in this family, only these chromosomes can be scored for crossing over.)

6 Some peculiarities of linkage analysis in humans
Unlike the maps of experimental organisms, the human linkage map was never going to be based on genes, because the frequency of mating between two individuals suffering from different genetic disorders is extremely small. The only way forward for a human linkage map was to base it on neutral polymorphic markers. These were found with the discovery of RFLPs in the 1970s, but the true advance came only with the discovery of microsatellites in the 1980s.
Matings cannot be controlled in humans; therefore, geneticists try to collect as many families as possible, in the hope that there will be enough meioses informative for linkage.
Given the vagaries of family sampling (i.e., variable pedigree structures and different mating types), calculation of recombination fractions would be a nightmare without the help of computer programs.
As linkage is a property of the genome (i.e., of the species), families with rare conditions can be collected from all over the world, irrespective of their ethnic background.

7 Before the 1980s, linkage was detected only occasionally in humans
Before 1980, only a very few human genes had been identified as genetic risk factors for hereditary disorders. Such early successes were very largely the result of exceptional characteristics: the biochemical basis of the disease had previously been established, and purification of the gene product could be achieved without too much difficulty. Such advantages do not apply, however, to the great majority of diseases resulting from mutation in human genes. In the 1980s, the application of recombinant DNA technology offered new approaches to mapping and identifying the genes underlying inherited single-gene disorders, and the number of disease genes identified started to increase rapidly.

8 Linkage analysis of Mendelian traits
Over the past two decades, rapid progress has been made by using genetics to identify the molecular cause of human disease. Most of these diseases are rare, highly penetrant traits that are found to follow Mendelian rules of inheritance in families, and are therefore often referred to as "Mendelian diseases." Linkage methods for mapping Mendelian traits are well established and have resulted in the identification of the molecular causes of hundreds of diseases.
(Figure: pace of disease gene discovery, 1981-2000. The number of disease genes included in the chart is 1112. Numbers in parentheses indicate disease-related genes that are polymorphisms ("susceptibility genes"). Source: Peltonen and McKusick (2001).)

9 For Mendelian traits, linkage analysis is a powerful technique
The remarkable success of positional cloning rests not simply on the advances observed in molecular technology. It also reflects the enormous power of linkage analysis when applied to Mendelian phenotypes, that is, those characterized by a (near) one-to-one correspondence between genotypes at a single locus and the observed phenotype.
(Figure: a pedigree showing evidence of linkage of disease to marker. In this family, the disease co-segregates with a marker allele.)

10 Genetic heterogeneity
One of the most common deviations from the one-to-one correspondence between a mutated gene and a disease is genetic heterogeneity (similar phenotypes caused by mutations in more than one gene). This happens when a disease is caused by mutations in different loci (belonging to different complementation groups) whose gene products participate in the same cellular processes. In these cases, we have a many-to-one relationship between genes and phenotypes. From the point of view of linkage analysis, genetic heterogeneity represents only a minor disturbance. With the advent of dense marker maps, locating Mendelian (or nearly Mendelian) traits to chromosomes is virtually certain.

11 The limitation of linkage analysis
Once linkage of a gene to a particular trait has been confirmed, the next step is to narrow the region through the analysis of recombinants. The standard procedure is to re-examine the families with markers spaced more closely in the region of interest. However, even with an unlimited supply of closely linked STRs or SNPs, the limit of resolution remains the number of meioses in which crossovers might have occurred. Even when large extended families are available, only a few hundred informative meiotic events can be observed, limiting the resolution of linkage mapping in the most favorable cases to about 1 cM (roughly 1% recombination, or ~1 Mb of DNA, still a large amount). In less favorable cases, the candidate region may contain as many as a few hundred predicted genes, any of which might be the relevant disease gene. Thus, linkage mapping is appropriate for low-resolution mapping, localizing disease loci to broad chromosome regions of a few cM (<10 cM), which could contain tens or hundreds of genes.
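The resolution limit described above can be illustrated with a small calculation. The sketch below assumes that meioses are independent and that, over short distances, the recombination fraction is roughly equal to the map distance (1 cM ≈ 0.01); the function names and the figure of 300 meioses are illustrative choices, not values from the slides.

```python
def expected_recombinants(theta: float, meioses: int) -> float:
    """Expected number of recombinants between two loci.

    theta   -- recombination fraction between the loci (0.01 is about 1 cM)
    meioses -- number of informative meioses that can be scored
    """
    return theta * meioses


def prob_at_least_one(theta: float, meioses: int) -> float:
    """Probability of observing at least one recombinant,
    assuming independent meioses."""
    return 1.0 - (1.0 - theta) ** meioses


if __name__ == "__main__":
    n = 300  # hypothetical number of informative meioses
    for theta in (0.10, 0.01, 0.001):
        print(
            f"theta={theta:<6} expected={expected_recombinants(theta, n):6.2f} "
            f"P(>=1)={prob_at_least_one(theta, n):.3f}"
        )
```

With a few hundred meioses, a handful of recombinants is expected at 1 cM, but fewer than one at 0.1 cM, which is why adding more markers alone cannot narrow the interval further.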

12 A random example from the literature
The locus (RP1) for one form of autosomal dominant retinitis pigmentosa (adRP) was mapped on chromosome 8q11-q22 by linkage analysis in an extended family ascertained in the USA. Analysis of a second, multigeneration Australian family with adRP narrowed the critical region to about 4 cM, corresponding to approximately 4 Mb.
Reference: "Linkage mapping in two families with the RP1 form of retinal degeneration places the disease locus in a 4 mb interval between D8S601 and D8S285". Xu S-Y et al. Hum Genet: 98, (1996)

13 Genetic analysis of HPT-JT
Eighteen families with “Hyperparathyroidism with jaw tumors” (HPT-JT) were submitted to fine mapping to narrow down the region of the locus (HRPT2); affected haplotypes were constructed and a recombination map was obtained. As a result, the genetic interval was reduced to 12 cM, corresponding to a chromosome segment of 14 Mb. This region contained 67 candidate genes.
(Figure: a region of 12 cM in chromosome 1 identified by recombination analysis in 18 families with HPT-JT.)
(Figure: a partial transcript map of the critical region. Genes highlighted in blue were initially prioritized for mutational analysis. C1orf28 is labeled in red as the gene of interest.)

14 Linkage disequilibrium analysis as a fine mapping tool
If the region of interest is smaller than a few Mb, then there will be very few recombinations in this region. Therefore, linkage analysis becomes useless in small regions. One way to perform fine mapping and confirm linkage of a susceptibility locus is to test for allelic association due to linkage (i.e., "linkage disequilibrium") between particular genetic markers and the disease. In fact, and contrary to linkage analysis, association analysis is highly efficient for fine mapping, as appreciable linkage disequilibrium exists in humans between loci with recombination fractions of less than 1-2%. Linkage disequilibrium (LD) analysis has often been instrumental in the final phases of gene localization. These successes have fueled hopes that similar approaches will be effective in localizing genes underlying susceptibility to common, complex diseases.

15 What is linkage disequilibrium?
Suppose that we have typed a population sample for two diallelic loci and tabulated the results. We can easily estimate the allele frequencies at the two loci by direct count. From these frequencies, we may calculate the expected frequencies of the four haplotypes (this situation corresponds to linkage equilibrium). However, if we estimate the four haplotype frequencies from the genotype data (this is not obvious), we see that they are out of equilibrium. In this particular case, D = and D’ = 1.0, as one haplotype (1-4) has zero frequency.
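As a worked illustration of these quantities, the sketch below computes D and Lewontin's standardized D’ from four haplotype frequencies. The frequencies used in the example are hypothetical (they are not the values from the slide's table), chosen so that one haplotype is absent; the function name and argument layout are likewise assumptions made for illustration.

```python
def ld_measures(h11: float, h12: float, h21: float, h22: float):
    """Return (D, |D'|) for two diallelic loci.

    h_ij is the frequency of the haplotype carrying allele i at the
    first locus and allele j at the second locus; the four values
    must sum to 1.
    """
    if abs(h11 + h12 + h21 + h22 - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")

    p1 = h11 + h12  # frequency of allele 1 at the first locus
    q1 = h11 + h21  # frequency of allele 1 at the second locus

    # D: observed haplotype frequency minus its linkage-equilibrium expectation
    d = h11 - p1 * q1

    # Largest |D| attainable given the allele frequencies
    if d >= 0:
        d_max = min(p1 * (1 - q1), (1 - p1) * q1)
    else:
        d_max = min(p1 * q1, (1 - p1) * (1 - q1))

    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return d, d_prime


if __name__ == "__main__":
    # Hypothetical sample in which the second haplotype is never observed
    d, d_prime = ld_measures(0.3, 0.0, 0.2, 0.5)
    print(f"D = {d:.3f}, D' = {d_prime:.3f}")
```

Whenever one of the four haplotypes has zero frequency, |D| equals its maximum possible value and D’ is 1, which is the situation described on the slide.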

16 The effects of LD
Linkage disequilibrium is a phenomenon whereby particular alleles of different loci are associated: people who have one tend to have a second as well. Linkage disequilibrium of a particular marker allele will occur when the disease locus and the marker locus are so closely positioned that recombination events between them are very rare and a certain marker allele is associated with the disease gene.

17 Design of a modified L.D. mapping test applied to HPT-JT
The main advantage of our method derives from selecting trios in which both the child and one of the parents are affected. This allows us to determine which of the two chromosomes in the child is the "case", so that we do not have to use all the four founding chromosomes to look for transmission disequilibrium.

18 A peak of haplotype sharing among the 18 HPT-JT HRPT2-carrying chromosomes
The peak points to the interval between D1S384 and D1S412, the location where HRPT2 was actually identified

19 Complex diseases
What is a “complex disease”? Some definitions include:
The term complex trait/disease refers to any phenotype that does not exhibit classic Mendelian inheritance attributable to a single gene, although it may exhibit familial tendencies (familial clustering, concordance among relatives). Other hallmarks of complex diseases include known or suspected environmental risk factors; seasonal, birth order, and cohort effects; late or variable age of onset; and variable disease progression (M.T. Dorak).
Complex diseases are characterized by risk to relatives of an affected individual which is greater than the incidence of the disorder in the population. Complex traits may involve the interaction of two or more genes to produce a phenotype, or may involve gene-environment interactions (PhRMA Genomics).
Complex diseases are those that do not show perfect cosegregation with any single locus owing to such problems as incomplete penetrance, phenocopy, genetic heterogeneity, and polygenic inheritance (Lander and Schork).

20 Complex diseases
In short, from a genetic point of view, complex diseases are those which show many-to-many relationships between genes and phenotypes. This means that if we focus our attention on a particular gene associated with a disease, we can still reason in terms of the Mendelian paradigm only if we allow for two kinds of exceptions to the one-to-one rule:
Not all individuals who carry the gene are affected with the disease (incomplete penetrance).
Some individuals who do not carry the gene are affected with the trait (phenocopies).
Both these conditions can be treated by defining appropriate penetrance functions for the carriers and the non-carriers of the gene. These penetrance functions, coupled with the specification of the gene frequency, constitute what we call a “genetic model”. Thus, a complex disease can be viewed as a collection of genetic models, each specifying the contribution of a particular gene to the development of the disease.
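A genetic model in this sense can be written down in a few lines. The sketch below is only an illustration: it assumes a single diallelic locus in Hardy-Weinberg proportions (an assumption not stated on the slide), and the class name, field names, and numerical values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneticModel:
    """Risk-allele frequency plus one penetrance per genotype."""

    risk_allele_freq: float
    # P(affected | genotype) for 0, 1 and 2 copies of the risk allele
    penetrance: tuple

    def genotype_freqs(self):
        """Genotype frequencies, assuming Hardy-Weinberg proportions."""
        q = self.risk_allele_freq
        p = 1.0 - q
        return (p * p, 2.0 * p * q, q * q)

    def prevalence(self) -> float:
        """Population frequency of the disease implied by the model."""
        return sum(g * f for g, f in zip(self.genotype_freqs(), self.penetrance))

    def phenocopy_fraction(self) -> float:
        """Share of affected individuals who carry no risk allele."""
        return self.genotype_freqs()[0] * self.penetrance[0] / self.prevalence()


if __name__ == "__main__":
    # Hypothetical dominant model: incomplete penetrance (0.5) in carriers,
    # and a 1% phenocopy rate in non-carriers
    model = GeneticModel(risk_allele_freq=0.01, penetrance=(0.01, 0.5, 0.5))
    print(f"prevalence         = {model.prevalence():.4f}")
    print(f"phenocopy fraction = {model.phenocopy_fraction():.3f}")
```

Setting the penetrances to (0, 1, 1) recovers a fully penetrant dominant Mendelian trait with no phenocopies; anything in between expresses the two exceptions listed above.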

21 Model-based linkage analysis
Model-based linkage analysis is also called parametric linkage analysis, as the input must include parameters defining how we think the genotypes at the locus influence the phenotype, i.e., the mode of inheritance.
(Figure labels: "This is a phenocopy"; "This is a non-penetrant case".)
Contrary to what may appear at first sight, this pedigree may support linkage under an appropriate genetic model.

22 Linkage analysis of complex diseases
Thus, classical linkage analysis can be used to map genes involved in the etiology of complex diseases; however:
The genetic model must be specified correctly, otherwise spurious results (false positives) could occur;
The power to detect linkage decreases rapidly as the effect of the gene on the phenotype becomes smaller.
Whereas the genetics community has achieved great success in finding the genes that are responsible for a wide range of Mendelian diseases, the search for complex disease genes has been relatively frustrating, despite intense research effort in both the academic and commercial sectors. Linkage mapping, which is a powerful tool for finding Mendelian disease genes, often produces weak, and sometimes inconsistent, signals in complex disease studies. To date, only a few variants that contribute to complex diseases have been conclusively identified.

23 LD mapping of complex diseases
The rationale underlying LD mapping of complex disease genes is straightforward and similar to the justification for LD mapping of Mendelian disease genes. With both types of disease genes, the primary advantage of LD analysis remains its ability to use the effects of dozens or hundreds of past generations of recombination to achieve fine-scale gene localization. A major difference, of course, is that weak associations complicate the analysis of complex diseases and may be more extensive for these diseases than for most Mendelian diseases. Despite these challenges, LD mapping holds considerable appeal, and there is great demand to resolve the genetics of complex diseases. Consequently, many new techniques have been recently devised to carry out LD analysis, often with a view toward mapping complex disease loci.

24 How is LD generated?
LD is the consequence of the genetic-demographic history of a population.

25 LD is always in a dynamic state
The actual extent of LD is determined by a balance between the opposing forces of mutation, selection and drift on one side and recombination on the other. A mutation arises on a particular genetic background. If the mutation increases in frequency by drift (or selection), the associated haplotype will also increase in frequency. Over time, the association between the new mutation and linked mutations will decay by recombination.

26 Decay of LD over generations
A mutation has occurred in the gene of the ancestral chromosome, and this has spread in the population. Over a series of generations (G), recombinations occur between the disease allele and the surrounding marker (M) alleles, gradually dissipating the disequilibrium (gray color). Marker alleles located in the close vicinity of the disease allele show stronger linkage disequilibrium than marker alleles located farther away.
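The decay pictured above is usually summarized by the textbook relation D_t = D_0 (1 - θ)^t, where θ is the recombination fraction between the two loci and t the number of generations. The slides do not state this formula, so the sketch below is an illustration under the standard assumptions of random mating and a large population with no drift, selection, or new mutation; the function names and numerical values are hypothetical.

```python
import math


def ld_after(d0: float, theta: float, generations: int) -> float:
    """Expected disequilibrium after a number of generations,
    using D_t = D_0 * (1 - theta) ** t."""
    return d0 * (1.0 - theta) ** generations


def half_life(theta: float) -> float:
    """Number of generations needed for D to fall to half its initial value."""
    return math.log(0.5) / math.log(1.0 - theta)


if __name__ == "__main__":
    # Fraction of the initial LD left after 100 generations
    for theta in (0.05, 0.01, 0.001):
        remaining = ld_after(1.0, theta, 100)
        print(
            f"theta={theta:<6} remaining={remaining:.3f} "
            f"half-life={half_life(theta):7.1f} generations"
        )
```

Markers close to the disease allele (small θ) keep most of their disequilibrium over many generations, while distant markers lose it quickly, which is the pattern the slide describes.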

27 Genetic-demographic history of the human species
Analysis of data on genetic variation suggests for the human species an ancestral population size of approximately 10,000 during the period when the current pattern of genetic variation was largely established, approximately 100, ,000 years ago. Thus, while we know that the human population has grown enormously since the development of agriculture approximately 15,000 years ago, most human genetic variation arose and became established in the human population much earlier than this, when the human population was still small. This means that the number of generations elapsed since the origin of the species has been insufficient to cancel the effect of the founding chromosomes on LD, so that significant LD exists at the interpopulation level at small genetic distances (~1 Mb).

28 Local patterns of LD
Past human demography included population founding events, expansion, and migration, and each of these factors plays a complex role in determining local patterns of LD. In particular, there seems to be less LD within African populations than in populations outside of Africa. The primary reason is that human migrations out of Africa probably only sampled a subset of the total diversity that was within Africa, and the resulting founder effect could have inflated LD.
(Figure: plot of the decay of average LD versus the physical distance between SNPs. Red: Asians; blue: European Americans; black: African Americans. The observed pattern varies widely across different regions of the genome.)

29 Two-locus LD in the X chromosome of Europeans
Standardized linkage disequilibrium (D') between markers of the X chromosome as a function of the intermarker distance. Large symbols: D' values with nominal P ≤ 0.01 (blue: adjacent markers; red: LD computed at 5-marker intervals). Dots: D' values with P > 0.01. Marker pairs with distance < 1 kb have been omitted.

30 Multilocus LD of the X chromosome
Bars represent sliding windows of 5 markers each, whose D* value is plotted. The line under the chart shows the marker location; a large gap centered at 60 Mb may be noted.

31 The interest of isolated populations
Because LD reflects the history of recombination, populations with different demographic histories will often display different LD patterns. In recently founded groups, such as the Finnish or Mennonite populations, LD may be seen for loci separated by several cM or more. These patterns have led to the suggestion that younger populations may be most useful for the initial detection of a disease locus via LD at large distances. Subsequently, older populations, in which more recombinants have accumulated, may be more useful for the fine-scale LD mapping of the disease locus. Encouraged by the singular successes of LD-based mapping of Mendelian disorders in isolated populations, many investigators are now turning to these populations in the search for loci underlying complex diseases. The reasoning is simple: isolated populations typically have a simpler population history, with fewer founders and less population admixture. In effect, the ideal isolated population is a large pedigree with many, many generations. Therefore, it is expected that allelic and locus heterogeneity should be more limited, permitting easier detection of allelic associations.

32 Conclusions
When making inferences of association between genes and complex diseases, it is critically important to understand population subdivision. For example, if one does a case-control study, and the samples under study are a mix of two somewhat isolated population groups (one with high disease incidence and one with low incidence), then there will be a spurious association between this disease and any genetic marker that shows allele frequency differences between the population strata.
When embarking on linkage disequilibrium mapping, it is important to collect detailed information on the population structure and genealogical history to make maximal use of founder populations for novel gene discoveries.
A major challenge in human genetics is to learn to recognize those relatively few genetic variants that are functionally important against the large background of neutral variation that distinguishes the genome.
Knowing the population genetic structure is the necessary prerequisite to investigate the genetics of complex diseases.
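The spurious association caused by population stratification can be reproduced with expected counts alone. In the sketch below, two hypothetical populations differ in both disease incidence and marker allele frequency, but within each population the marker is independent of the disease. All numbers and function names are illustrative assumptions, and chromosomes (rather than individuals) are used as the sampling unit for simplicity.

```python
def expected_counts(n: float, incidence: float, allele_freq: float):
    """Expected 2x2 table for one population in which the marker allele
    is independent of disease status.

    Returns (case_with, case_without, control_with, control_without).
    """
    cases = n * incidence
    controls = n - cases
    return (
        cases * allele_freq,
        cases * (1.0 - allele_freq),
        controls * allele_freq,
        controls * (1.0 - allele_freq),
    )


def odds_ratio(table) -> float:
    """Odds ratio of carrying the marker allele in cases versus controls."""
    case_with, case_without, control_with, control_without = table
    return (case_with * control_without) / (case_without * control_with)


def pool(*tables):
    """Merge several populations into one sample, ignoring their origin."""
    return tuple(sum(cells) for cells in zip(*tables))


if __name__ == "__main__":
    # Hypothetical strata: high incidence with a common allele,
    # low incidence with a rare allele
    pop_a = expected_counts(10_000, incidence=0.10, allele_freq=0.8)
    pop_b = expected_counts(10_000, incidence=0.01, allele_freq=0.2)

    print(f"odds ratio within A: {odds_ratio(pop_a):.2f}")
    print(f"odds ratio within B: {odds_ratio(pop_b):.2f}")
    print(f"odds ratio, pooled : {odds_ratio(pool(pop_a, pop_b)):.2f}")
```

Within each stratum the odds ratio is 1 (no association), yet the pooled sample shows a clear excess of the marker allele among cases, purely because cases come mostly from the population where the allele is common.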

