Maria Eugenia D’Amato Slide 1

DNA variation in Ecology and Evolution II- Technical approach and concepts
Maria Eugenia D’Amato Slide 1 In the second class of this course we will see the most general methodological approaches to detect genetic variation. BCB 703: Scientific Methodology

Methodological approaches to the study of genetic diversity
Molecular genetics techniques Types and properties of molecular makers Factors that determine the patterns of genetic variation Slide 2 Organization of the presentation In this second class we will focus in three main topics: the technical approach to detect genetic variation, the different types of molecular markers and their properties, and the most important conceptual principles and external factors that mould the genetic variation in biological species.

Molecular techniques Southern blot PCR DNA sequencing Slide 3
We will see three basic techniques in this course: the complete process of Southern Blott, and spin-offs and derivatives of it; the principles of the Polymerase Chain Reaction and DNA sequencing.

Southern blot (1977) Fragmentation of genomic
DNA in a reproducible way 2. Separation of the fragments in an electric field 3. Transfer of the fragments from gel to a membrane 4. Probing of the membrane with known DNA 5. Detection of the probe Sir Edwin Southern 1938- Nobel Price Slide 4 Southern blot This technique was developed by Sir Edwin Southern in 1977, to detect single genes in complex genomes. This technique was developed further by Sir Alec Jeffreys, who utilized the variation at minisatellite loci to identify individuals. The method consist of the following steps: Fragmentation of genomic DNA in a reproducible way Separation of the fragments in an electric field Transfer of the fragments from to a membrane Probing of the membrane with known DNA Detection of the probe

Southern blot Restriction enzymes Southern blot steps
molecular scissors Southern blot steps Slide 5 The first step consists of a controlled fragmentation of genomic DNA. For this we use RESTRICTION ENZYMES. These are protein whose enzymatic activity consists of recognition of a 4 – 6 base pairs palindrome and cutting the DNA double chain. The products of the digestion are loaded onto an agarose gel, which is subjected to an electric field. The “mesh” formed by the agarose will difficult the mobility of the DNA fragments according to their size: small fragments will migrate faster to the positive electrode, and the bigger fragments will migrate slower. When repetitive DNA is recognised by restriction enzymes, the product of the digestion are distinctive bands. The interspersed repetitive human sequences Alu were named after the restriction enzyme (Alu I) that recognised a site within these repeat units. Genomic single copy DNA is as a semar in the gels. After the DNA hast been separated in the gel according to size, this DNA is transferred to a nylon membrane by capillarity. The membrane is placed on top of the gel, and a pile of dry paper on top will “suction” a solution that will pass onto the paper while the DNA keeps stopped and attached to the membrane. This membrane is then incubated along with a probe: a fragment of DNA of known origin that is labelled with a radioactive nucleotide (with P32). The radioactivity is detected by exposure to an X- film, indicating the position in the gel where the probe found homologous DNA. Although this method was developed for locating single genes within a gemome, its great potential allowed the application of “sub-techniques “ and the development of derivatives of it. Following, we will see a few examples. Southern blot image:

DNA fingerprinting Multilocus Trout DNA digested with Hinf I Unilocus
(GATA)4 (GGAT)4 Slide 6 DNA fingerprinting (See Slide 19 Class I) DNA fingerprinting is a thechnique that derived from the genral Southern blot method. By contrast to the original Southern Blot technique, instead of using a probe that is himilogous to a single copy gene, DNA fingerprinting use probes that come from the repetitive regions of the genome. Sometimes these repreat motifs are scattered throughout the genome and are located in the same or in different chromosomes. These are called multilocus fingerprinting. Unilocus fingerprinting is produced by probes that detect repeat unit that occur at ONLY ONE locus. These are more important in the studies of individual identification. In the graph on top of the slide, we see a diagram of a single locus minisatellite in an heterozygote individual: the two chromosomes present different number of repeat units, two different alleles. The arrows are indicating the recognition site for a restriction enzyme. The diagram on the left shows the radioautography of the DNA fingerprinting profile of 5 individuals of rainbow trout, digested with the restriction enzyme Hinf I. Two probes were used to detect multilocus minisatellites with the repeats motifs GATA and GGAT. Multilocus DNA fingerprinting was extensively used in the ‘80s to detect genetic variation in natural populations. The radioautography on the right shows a single locus DNA fingerprinting profile. An homozygote and an heterozygote individuals are indicated. DNA fingerprinting of trout image: Genet. Mol. Biol. vol. 21 n. 2 São Paulo June 1998 Unilocus DNA fingerprinting image: homozygote heterozygote

RFLPs Restriction Fragment Length Polymorphism.
mtDNA PCR 500 250 500 bp Slide 7 RFLP For PCR see Slide 9, this presentation Restriction Fragments Length Polymorphism is a technique that allows detecting changes (point mutations) at a very low cost, with the use of restriction enzymes. The main objective is to subject a fragment of DNA, usually obtained by PCR, to digestion with multiple restriction enzymes to detect as many polymorphic sites as possible. This technique was also applied in the initial studies of polymorphism in mtDNA. The whole mtDNA was extracted from tissues, digested with restriction enzymes and the fragments were separated by electrophoresis in agarose gels. Although relatively cheap and fast, some information is lost in the process as it is impossible to know which of the bases at the recognition site has changed, unless we obtain the sequence information for the fragment of interest. The RFLP in the slide shows a restriction analysis of a 500 bp of the mitochondrial gene COI in the oyster Crassostrea ariakensis. Each lane is representing DNA of one individual digested with a restriction enzyme. We can say, then that this population or species shows two haplotypes with, e.g., the restriction enzyme Hae III. More accurate information about the level of genetic diversity in this species would be obtain with the use of multiple restriction anzymes. Image : Restriction site

PCR (1981) Polymerase Chain Reaction In vitro replication of DNA
Slide 8 PCR Kary Mullis Polymerase Chain Reaction I a technique that revolutionized the study of DNA, and replaced the utilization of the Southern Blot in studies of single genes. The technique was visualized by Kary Mullis in 1981, and received the Nobel Price in 1993. Image; Kary Mullis 1938- Nobel Price 1993

PCR DNA Copies = 2n , n = number of cycles
After 30 cycles: 107 million copies Slide 9 PCR Conceptually, PCR is the replication of DNA in a tube, in a controlled manner. Unlike the DNA replication before cell division, these replication only copies a fragment of DNA framed by small complementary DNAs called primers. Primers are usually base pairs in length. The separation of the strands for DNA replication is performed by the action of enzymes within the cell. However, in the reaction tube we use temperature (93 to 95˚ C) to disrupt the Hydrogen bond between complementary bases. In high temperature conditions, the DNA polymerases or organisms living in standard conditions of temperature would be denatured. Therefore, the DNA polymerase utilized in PCR was isolated from thermofilic bacteria that live in temperatures above 100 C). The DNA polymerisation is carried out from the starting point of the primers. This reaction is repeated for about cycles, and then increase in the number of chains present in the reaction tube can be expressed as 2n. Thus, at the end of the 30th cycle, the reaction tube would contain 107 x106 copies of the original template DNA. PCR machines

Applications of PCR: microsatellite genotyping
priming site x ♂  ♀  Slide 10 Applications of PCR: microsatellite genotyping. See Slide 20, Class I. In this slide we show an example of the application of the PCR. The upper scheme shows an heteroztgote individual for a microsatellite locus, and the flanking regions that are used as a priming site for the PCR. Microsatellites are co-dominant multiallelic markers: both alleles can be detected and multiple alleles per loci are frequent. For instance, an average number of alleles per loci in fish species ranges from 10-25; some herring loci have 60 alleles. The PCR products are resolved in a gel. Each individual is expected to show 1 or 2 bands for each microsatellite locus. In the example below, there is a controlled crossing between two heterozygote fish and the offspring shows all the possible expected combinations of genotypes. See Slide 20, Class I. Pedigree analysis

Applications of PCR microsatellites for mating strategies
Polyembryony in bryozoans? Slide 11 Applications of PCR: microsatellites for study of reproductive strategies. See Slide 20, Class I. In this slide we show an example of the application of microsatellite markers to elucidate mating strategies in bryozoan organisms. The marine bryozoans form the order Cyclostomata, have incubating chamber for their embryos, that are occupied by many embryos at different stages of development before they are released into the environment. These embryos were proposed to have originated from a single fertilization event, followed by subdivision and multiplication of embryos within the chamber. This reproductive strategy is called polyembryony. However, the hypothesis of polyembryony was only based on observation of embryos in the chamber; multiple fertilization events or parthenogenesis (the development of an egg without contribution from father) could not therefore be excluded till confirming the genotypic composition of the offspring and mother. For this, in many colonies, many embryos within a chamber were genotyped along with their mother. The results are shon in the Table of the slide. The numbers indicate the allele sizes in base pairs, just as there are read after resolving the PCR products in an automatic sequencer. The table in the slide shows the genotyping of two colonies (A and BA) at 5 microsatellite loci. For the first, 8 chambers were genotyped, and 3 chambers were genotyped for the second colony. N is indicating the number of embryos that were genotyped per chamber. As these animals have external fertilization, it was not possible to genotype the father. The most important results can be summarized as follows There is a father. Paternal alleles are indicated in red. This observation discards parthenogenesis all embryos within the same chamber show the same genotype, therefore confirming the hypothesis of polyembryony. The genotypes for loci Cd17-3 at the first chamber show unexpected genotypes: some of the chambers shows alleles that are not present in the mother. The mother’s and the embryo’s genotypes are all apparently homozygotes. However, these are not the true genotypes and this example is reflecting one problem not rarely encountered in microsatellite loci and other PCR-based genotyping techniques: the problem of null alleles. Sometimes, the mutations occur at the priming sites, preventing the primer from binding to the template DNA, and, therefore, the PCR reaction from proceeding. The result is that in heterozygote individuals only one allele will be amplified. In our case, the genotype of the mother in 233/ null allele. The genotypes 229/229 and 237/237 in the embryos are showing inheritance of the mother’s null alleles and the paternal (either 229 or 2370 allele. The problem of null alleles induces then to underestimation of genetic diversity in natural populations. Reference Roger N. Hughes1, M. Eugenia D’Amato, John D. D. Bishop, Gary R. Carvalho, Sean F. Craig, Lars J. Hansson, Margaret A. Harley and Andrew J. Pemberton (2005).Paradoxical polyembryony? Embryonic cloning in an ancient order of marine bryozoans. Biol. Lett. 1, 178–180. Incubating chamber

Applications of PCR. Anonymous loci
AFLPs (Amplified Random Length Polymorphism) RAPDs (Random Amplified Polymorphic DNA) Slide 12 Anonymous loci Sometimes, the generation of genetic profiles is difficult for some organisms of relatively unknown genomes: no “universal primers” are available, if they are available they may not match well enough as to allow the PCR reaction to proceed, ‘universal probes” for multilocus DNA fingerprinting do not detect variation. In these cases, a PCR-based technique may solve the problem. In relaxed PCR conditions, using very short primers, only one primer per reaction (oligomers 10 bases long) and low annealing temperatures (e.g. 37 ˚C) to allow these primers to match the genome, a relatively large amount of PCR products are obtained. These products, however, are ‘anonymous” as we their content is unknown. However, the structure of the amplified region, an intervening stretch of DNA situated between inverted repeats, resembles the structure of transposon-like elements; therefore, the RAPD fragments are more likely to contain repetitive elements. Although it is a very straightforward, easy and cheap technique, the RAPD profiles are sometimes difficult to reproduce. The scoring of the diversity in these RAPD loci is as follows: each band is considered a biallelic locus, one allele being “band present” and another allele being “band absent”. The associated problem with this, is that a heterozygote individual BAND PRESENT/ BAND ABSENT cannot be distinguished from an homozygote individual BAND PRESENT/ BAND PRESENT. For this reason RAPDs are called DOMINANT MARKERS. In contrast: microsatellites are co-dominant multiallelic markers. AFLPs (Amplified Fragment Length Polymorphism) are conceptually similar to RAPDs: dominant biallelic systems. However, are more reproducible across labs after a “molecular trick” that adds adaptor, fragments of DNA of known sequence, that are used as priming sites for PCR. On the left: RAPD profiles of different strains of Bradyrhizobium japonicum, a nitrogen-fixing microorganism, frequently used for seed inoculation of soybean to increase the crop yield. Dominant multilocus biallelic markers

DNA sequencing The old days…. Automatic sequencing A C G T Slide 13
Slide 13 DNA sequencing There are several methods available to determine the actual sequence of nucleotides in a segment of DNA. One procedure uses specially altered nucleotides called dideoxynucloetides that have been made either radioactive or fluorescent. When enough of the targeted DNA fragment has been amplified by PCR, the mix is put through a series of DNA sequencing reactions that are a variation of PCR. The amplified product from the PCR is added to a reaction tube containing the same Taq DNA polymerase used in the PCR, one primer that can hybridize at the desired location (as opposed to both strands in PCR), and all four of the nucleotide bases (A, T, C, G.) In addition, small amounts of fluorescence labelled dideoxynucleotides (A, T, C, G) are added to the mixture. Dideoxynucleotides are human-made nucleotides whose sugar component is slightly different from that of the nucleotides that make up DNA. The sugar lacks of the hydroxyl group that covalently links to the phosphate of the next nucleotide in the growing chain. Therefore, the DNA polymerisation process stops at a dideoxynucleotide as no more nucleotides can be added to that chain. Each dideoxynucleotide is labelled with a different fluorescent compound so that it will give off an identifying colour in a laser beam. Prior to the automatic sequencing, the ddNTPS were labelled with radioactivity and these products were resolved in gels (see Figure on the right) and exposed to X-ray films. We can read the DNA sequence from bottom to top. After cycles of the PCR heating and cooling, the resulting mixture will contain a series of fragments of different lengths depending on how many bases had been added to the chain before one of the dideoxynucleotides sneaked in and blocked further growth. In evolutionary studies we often want to see the pattern of genetic variation at a certain loci. For this, we obtain DNA sequence information for many individuals and we compare the variation through aligning the different sequences, an example of a DNA alignment is given in Slide 25, in the previous class. See graphic explanation at Images: CTCCGGCTGTAACCTTCAC…

Molecular Markers Individual identification Genic variation
Physical location in a genome whose inheritance can be monitored polymorphic Parentage, relatedness, mating systems Individual identification Genic variation Gene genealogies Gene flow, drift Slide 14 Molecular markers Molecular Marker: is an identifiable physical location or LOCUS in the genome (e.g., restriction enzyme cutting site, gene) whose inheritance can be monitored. Markers can be expressed regions of DNA (genes) or some segment of DNA with no known coding function but whose pattern of inheritance can be determined. In general, we use markers of biparental, Mendelian inheritance, or markers of only-maternal or only-paternal inheritance. The molecular markers should detect genetic variation or polymorphism. The development of the PCR, automated sequencing and advent of microsatellites made a profound impact in Molecular Ecology and population genetics. Also important are the latest developments of statistical methods and user-friendly computer packages. In general, different molecular markers can suit better the type of question we want to answer. In terms of molecular change we can distinguish three different levels of population biology at which we can access information: 1. individual identification 2. genic variation 3. gene genealogies As for individual identification and analysis of relatedness you can see examples in Slides 10 and 11. The changes in allele frequencies in population can be originated by many different processes. First, we will see how mendelian principles apply to genetics in populations. Phylogeography, speciation, deeper phylogenies

Genes in populations Aa AA pq p2 aa Aa pq q2 A a p = 0.6 p = 0.4 A
Slide 15 Population genetics 1 The genetic information contained in a group of organisms in the gene pool. More practically, we can think of the set of all copies of all alleles than can be potentially contributed to the next generation. One important, central question in the understanding of how the gene pool is transmitted to the next generation is if we can predict the genetic composition of the next generation. The answer is YES, applying the principles of mendelian inheritance (Slide 5, Class I). We can think of the following example in a biallelic system: If there are N GAMETES carrying allele A and N gametes carrying the allele a, we can call their respective frequencies in the population p and q. In a graphical example similar to that of the Punnet square (Slide 5, Class I)., in which the frequencies are represented by a fraction of the sides of the square, we can predict the genotypic and allelic frequencies of the next generation. As the sum of all alleles frequencies for a given locus in a population should sum up 1, the side of the square is 1, and the area of the square is also 1. In the present example, the frequency of ovules with allele A is 0.6, and the frequency of ovules with allele a is 0.4; the frequency for spermatozoid carrying allele A is 0.6 and the frequency of spermatozoids carrying the allele a is 0.4. In this example the allele frequencies in sperm and eggs are equal. From the Mendelian laws, we can predict all the possible genotypic combinations that can result from a fertilization event. Now, imagine what happens to a population with several hundreds of individuals: we can still predict all the possible combinations and their proportions, the frequency in which they will appear in the next generation, and the allele frequencies in the next generation. We obtain the frequency of a specific combination simply by multiplying the frequency in the population of the gametes that gave them origin. For instance, the frequency of individuals of genotype AA is the frequency of ovules carrying allele A multiplied by the frequency of spermatozoids carrying allele A. In our case, this frequency is 0.36. Again, the genotypic proportions 1:2:1 for homozygotes for the dominant allele, heterozygotes and homozygote for the 2nd or recessive allele and their respective frequencies in the population are p2, 2pq and q2 respectively. All these frequencies also sum 1. 0.36 0.24 aa a p = 0.4 Aa pq q2 0.24 0.16

Genes in populations: equilibrium of Hardy Weinberg
(p + q) 2 = p2 + 2pq + q2 p = freq A q = freq a the organism is diploid with sexual reproduction generations are non overlapping loci are biallelic allele frequencies are identical in males and females random mating population size is infinite no migration, no mutation, no selection Slide 16 HWE We can summarize the combination of ovules and spermatozoids by the expression (p + q) 2 and the resulting combinations as the expansion of the above expression, which results in p2 + 2pq + q2. In other words, the chance of two gametes with allele A coming together is, p x p = p2 , the chance of gametes A and a coming together (or the occureence of heterozygotes has a probability of. ) 2pq, and finally, the chance of two gametes a coming together is q2. This is the fundamental concept of the equilibrium of Hardy-Weinberg. This is no more than a model that can help us making predictions about the fate of alleles in populations. As all models, it is based on assumptions: · the organism is diploid · with sexual reproduction · generations are non overlapping · loci are biallelic · allele frequencies are identical in males and females · random mating · population size is infinite · no migration · no mutation · no selection Assumptions

Hardy Weinberg Equilibrium
Consequences of the model Allele frequencies remain constant, generation after generation Genotype frequencies can be determined from allele frequencies Slide 17 Implications of HWE The Hardy-Weinberg model is only a reference model in which no evolutionary forces are acting. The main basic implications of this model are that (1) Allele frequencies in a population will no change, generation after generation (2) If allele frequencies in a population are given by p and q, the genotype frequencies can be predicted from allele frequencies by p2 ,2pq and q2. A similar principle can be applied to multiallelic loci. If populations are out of equilibrium, and allele frequencies are the same in males and females, and generations do not overlap, the equilibrium of allele frequencies are attained after only one genration. If genberations are overlapping the HWE is attained gradually after a few generations of random mating.

HWE- Mathematical example of deviation from equilibrium
Expected genotype freqs In pop I: ( )2 = x 0.6 x = Slide 18 Example HWE In the examples of the table, we show the genotypic observed frequencies of 4 hypothetical populations. Although all show the same allelic frequencies, the genotypic frequencies differ. For the first population, we estimated the expected genotypc frequencies from the expansion of the expression (p + q)2 using the allele frequencies. After obtaining the expected genotypic frequencies we are interested in knowing how different they are from the observed frequencies. For that we use a X square non-parametric test, at the 0.5 level of significance for 2 degrees of freedom. Degrees of freedom are calculated as the product of (columns –1) multiplied by (rows –1) of the contingency with observed and expected values. The chi-square table for 0.5 significance level and 2 degrees of freedom show a value of 5.99, which is much lower than our obtained chi-square value of Therefore, we reject the null hypothesis of no difference between observed and expected values. In biological terms, the population is out of HW equilibrium. Follow a similar approach, we will see that populations 3 and 4 are also out of HWE. If generations are non overlapping and female and male allelic frequencies are the same, all these populations will attain equilibrium after one generation of random mating. 2 = ∑ (O – E)2 2 = 44.4 d.f. = (R-1) x (C-1) = 2 2 d.f =2 = highly significant

Departures from HWE: Selection
Charles Darwin Differential survival and reproductive success of genotypes Balancing selection Frequency dependent selection Directional selection Slide 19 Selection In the following slides we will show several examples of factors that determine departures from equilibrium. This happens when the assumptions of the model are violated. We should start with “mutation”, an assumption that is clearly violated in diploid organisms as new combinations can arise by recombination (see e.g. unequal crossing over in microsatellites an other repeated elements). Point mutations and other types of mutation along with mutation rates are reviewed in Class 1, Slides 25, 26, 27, 28. Selection is differential survival and reproductive success of genotypes. Evolution by natural selection was the central idea in Darwin’s “On the Origin of Species by Means of Natural Selection”, published in A famous slogan “survival of the fittest” transcended the barriers of biology and was unfortunately taken to justify political agendas during the XIX and early XX century. Selection can change allele frequencies generation after generation, and therefore is not possible to calculate genotype frequencies from allelic frequencies, thus violating the consequences of HWE. Sometimes the adaptive value of heterozygotes is higher than in homozygotes. This type of heterozygote superiority is called heterosis. In human ß-hemoglobin, a point mutation changes the glutamic acid (allele A) for valine (allele S), which is obtained by a transversion at second codon of aminoacid 6. This mutation in the hemoglobine causes the erythrocytes to change their shape dramatically: they are more fragile than normal cells and they can block blood flow in capillaried, depriving the tissues from oxygen and causing anemia. This condition is called sickling anemia. However, the individuals that carry a normal and a mutated allele are more resistant to malaria, as parasited cells are selectively destroyed. The selective pressure of malaria parasites, then, favours the occurrence of allele S at a higher frequency in populations exposed to malaria. The situation in which selection maintains two or more alleles at a locus is called balancing selection. Sickle anemia is a case of balancing selection in which heterozygotes have a higher selective advantage than homozygotes. Another type of selection favours only one allele instead of many, and therefore the frequency of this allele shifts continuously in one direction. This principle also applies to phenotypes. This type of selection is called directional selection and favours individuals with extreme traits in a population. We present several examples: Mosquitos Culex pipiens exposed to the insecticide chlorpyrifos blocks the ACE gene (acetyl choline esterase) and produces death for malfunction of the nervous system. However, some mosquitos carry a resistant gene ACER. The study of the allele frequencies of several genes along a transect of 54 km in the Mediterranean coast in France, taking in a few sites where the insecticide was previously used, showed higher frequency of the resistance gene only in this sites. All other studied genes showed allele frequencies equally distributed among populations (Chevillon et al., 1995). This shows the effect of selection on the resistance allele in areas where it conferred higher adaptive value. In the figure, the first 4 sites were affected by utilization of insecticide. Other examples of directional selection are breeding programs of domestic animals and plants, where extreme phenotypes are artificially selected. Resistance to antibiotics and pesticides function as selective agents for the resistant phenotypes. Other type of selection depends on the frequency of certain variants or phenotypes in the population. The central American butterflies Heliconius erato have many morphs, all poisonous for birds. When a morph is common, birds learn to avoid them, therefore favouring the rare morphs to become more frequent. The same process starts again for the formerly rare but at present frequent morph. This is frequency-dependent selection. Nowadays we know other mechanisms than selection that are responsible for changes generation after generation. Selection is a non-random process. There are other non-selective processes of evolution. For example, the random fluctuation of allelic or genotypic frequencies when they do not confer any adaptive advantage. This is genetic drift. Sickle cells image: Mosquito image: Recommended read: Interview to Douglas Futuyma Reference; C Chevillon, N Pasteur, M Marquine, D Heyse, M Raymond (1995) Population Structure and Dynamics of Selected Genes in the Mosquito Culex pipiens. Evolution, 49,  0.5 f ACER Normal and sickling forms of erythrocytes Heliconius erato sites

Deviations from HWE: Genetic drift
Random variation of allele frequencies generation after generation Generated by the random sampling process of drawing gametes to form the next generation dq = q1 – q0 Slide 20 Genetic drift Drift results from the vagaries of sampling of the gametes that will unite to form the next generation. This process by which allele frequencies do not change in a predetermined way in called random genetic drift. The change in allelic frequencies from one generation to another can be expressed as dq = q1 – q0, where q1 is the allele frequency in generation 1 and q 0 the allele frequency in generation 1. The magnitude of variation of allelic frequencies because of the random sampling is given by s 2 pq = (p0 q0 ) / 2N where N is the effective population size (number of individuals that are actually reproducing in the population). It is obvious then, that smaller populations will display a higher variance. In a simulation process in which several biallelic genes with initial frequency 0.5. we see the effect of genetic drift in a population of size 10 (above) and a population of size 100. We can see that the frequencies fluctuate longer (for more generations) in the population of bigger size than in the other. Another interesting observation is that the alleles have either of two finals fates: they become fixed (frequency = 1) or they are lost (frequency = 0). The direct consequence of fixation is the decrease in diversity: loci loose heterozygosity and become monomorphic. If alleles are lost in multiallelic loci, the loose in diversity is seen in the reduction of allelic number (Na). It is important to emphasize that the effect of genetic drift is more pronounced in small populations. Alleles become fixed (freq = 1) or lost (freq = 0) 2 dq p0 q N = The effect is more pronounced in small populations Genetic diversity decreases Variance in 1 generation

Genetic drift: Bottlenecks
  Original population                                    Population crash    recovery Slide 21 Bottlenecks A special case of random sampling is the bottleneck effect. Bottlenecks are a extraordinary reduction in the population size due to outbreak of diseases, natural disasters, habitat reduction of predation. The direct consequence of bottlenecks is that the natural diversity is lost: rare alleles tend to disappear, heterozygosis is reduced and inbreeding increases. An example of severe bottleneck is that of the cheetah, species that suffered an important reduction in population size by the end of the last glacial age, years ago (Menotti-Raymond and O’Brien 1993), when massive extinction of large mammals took place on several continents. Another example of bottleneck effects is that of overhunted species, such as the American bison, with estimated population sizes of over 60 million before 1492, that were reduced to only 750 animals in The present population size recovered to animals. References Menotti-Raymond M and O'Brien S J (1993). Dating the genetic bottleneck of the African cheetah. Proc Natl Acad Sci U S A. 15; 90: 3172–3176. Cheetah image: Bison image: Slide 22 Founder effect Founder effect is another special case of random sampling in populations. It is a special case of bottleneck in which a new population is established by a small number of individuals from the original population that are not representative of the total genetic variation. The occurrence of porphyria in South Africa at a higher frequency than in other populations is a case of founder effect. Variegate porphyria is an autosomal dominant trait that consists of a mutated (and therefore deficient) protoporphyrinogen oxidase, the last enzyme in the chain to synthesize the hoem group. This causes foto sensibility in the skin and neurovisceral crisis. The origin of all south Africans suffering from prophyria can be traced back to Ariaantje and Gerrit Jansz, who emigrated in the 1680s from Holland to South Africa, one of them bringing along an allele for the mild metabolic disease porphyria. They had 8 children, 4 of them suffered from porphyria. Today more than South Africans carry this allele. Porphyiria Hift RJ, Meissner PN, Corrigall AV, Ziman MR, Petersen LA, Meissner DM, Davidson BP, Sutherland J, Dailey HA, Kirsch RE.(1997) Variegate porphyria in South Africa, new developments in an old disease. S Afr Med J. 87: image: Cheetah: Late Pleistocene bottleneck American bison: Over hunting bottleneck

Genetic drift: Founder effect
                                  Slide 22 Founder effect Founder effect is another special case of random sampling in populations. It is a special case of bottleneck in which a new population is established by a small number of individuals from the original population that are not representative of the total genetic variation. The occurrence of porphyria in South Africa at a higher frequency than in other populations is a case of founder effect. Variegate porphyria is an autosomal dominant trait that consists of a mutated (and therefore deficient) protoporphyrinogen oxidase, the last enzyme in the chain to synthesize the hoem group. This causes foto sensibility in the skin and neurovisceral crisis. The origin of all south Africans suffering from prophyria can be traced back to Ariaantje and Gerrit Jansz, who emigrated in the 1680s from Holland to South Africa, one of them bringing along an allele for the mild metabolic disease porphyria. They had 8 children, 4 of them suffered from porphyria. Today more than South Africans carry this allele. Porphyiria Hift RJ, Meissner PN, Corrigall AV, Ziman MR, Petersen LA, Meissner DM, Davidson BP, Sutherland J, Dailey HA, Kirsch RE.(1997) Variegate porphyria in South Africa, new developments in an old disease. S Afr Med J. 87: image: 1 couple carrying the allele immigrated SA in 1688 Today: descendant South Africans are affected Skin photo-sensitivity in a porphyria patient

HWE departure/restoration Migration
Migration = Gene flow transfer of alleles from one gene pool to another After m, 80% of the island is A1A1 and 20% A2A2 Slide 23 Migration Migration in the evolutionary sense implies movement of alleles from one population to another population. In other words, there is gene flow between populations. Mechanisms of gene flow are dispersal of juveniles or larvae, transport of pollen, sores, etc. by wind, occasional long distance dispersal, etc. One of the most widely utilized models of migration is the continent-island model, in which the direction of the migration in one direction is more important than the almost negligible migration from the island to the continent. In the slide we see the picture of the continent-island migration model. Imagine that allel A1 is fixed in the island and allele A2 is fixed in the continent [ in other words genotypic frequencies for the locus A are, in the island: A1A1=1; A1A2 = 0 and A2A2 = 0; and in the continent A2A2=1; A1A2 = 0 and A1A1 = 0]. After a major migration event from the continent, 80 % of the island population is A2A2 (q2 = 0.8) and 20% A1A1 (p2 = 0.8). These genotypes are out of HWE, but the eqiolibrium is attained after 1 generation of random mating. Migration can thus act as an homogenizing factor. m Genotypes out of HWE A1A1 = 1 After 1 generation genotypes are in HWE A2A2 = 1

non random mating- drift – no gene flow Population structure
Differential allele frequencies between subpopulations inbreeding coefficients : measure of H deficiency at different hierarchical levels Wahlund effect: H deficiency due to subdivision, drift and inbreeding FIS = (Hs – Ho) / Ho within a subpopulation Slide 24 Population structure Population structure is almost universal among organisms. Many occur as flocks, herds, colonies, and display patchy distribution. Besides, the natural habitat can become fragmented because of the onset of barriers and subpopulations consequently undergo isolation. In this scenario, genetic differentiation occurs because different subpopulations would acquire different allelic frequencies through the processes of selection and drift in the presence of non-random mating (inbreeding) and restricted migration (gene flow). The opposite of this situation is panmixia or random mating. The differences in allele frequencies among subpopulations caused by local inbreeding and fixation of alleles by random drift, determine that the overall heterozygosity is lower than the expected under HWE in panmixia. This effect was described by sewen statistitian Sten Wahlund, and therefore was so called “Wahlund effect” after him. The breeding structure of a population can be described by a hierarchical grouping of inclusive levels: within a subpopulation, between subpopulations, overall subpopulations. The inbreeding effect at each of these levels is described by a specific fixation index F for multiple loci (Nei 1977): FIS = (Hs – Ho) / Ho Where Hs is the average expected H within a certain subpopulation. Ho is the average expected H within subpopulations overall loci Fis measures the correlation of alleles with individuals, the correlation between 2 uniting gametes within the subpopulation. It indicates the degree of inbreeding of a subpopulation; it is a measure of deviation of HWE within a subpopulation. A positive value indicates deficiency of heterozygotes. FIT = (HT – H0) / HT Where HT is the average expected H in the total population overall loci. FIT is the correlation between 2 uniting gametes relative to the total population. It is the overall inbreeding coefficient (F) of an individual relative to the total population (Individual within the Total population). Fit is rarely used FST = (HS – HT) / HT Fst is the correlation between two gametes drawn at random from each subpopulation and measures the degree of differentiation between subpopulations. It measures the reduction in heterozygosity in a subpopulation due to genetic drift. In other words, it is the proportion of the total heterozygosity that is explained by the subpopulation heterozygosity. When Fst = 0 subpopulations are not differentiated. See example in Slide 11, Class III. Population genetics webpages and links FIT = (HT – H0) / HT among individuals overall populations FST = (HS – HT) / HT between subpopulations Ho = aver. observed H within a subpopulation over loci Hs = aver. expected H within subpopulation over loci Ht = aver. expected H overall

Examples of population structure
1 Out of HWE 2 Slide 25 Examples of population subdivision In this slide we see two numerical examples of a study of genetic differentiation between 2 populations. Imagine we want to investigate the level of gene flow between two sites on the geographical distribution of a species. In the first example we have two population with same allele frequencies, but one of them is out of HWE, with deficiency of heterozygotes, which is indicated by the deviation of Fis from 0. The deficit of heterozygote in the second population deviates Fit from 0. However, because the allelic frequencies between these populations are similar, the Fst value is 0, indicating gene flow and no genetic differentiation. The second example shows two population in HWE, but because of the variation in allele frequencies between them, Fit and Fst are different from 0. Fst ≠ 0 indicates no gene flow, population structure, population subdivision. In HWE

Gene genealogies: a historical perspective
Lineage: individuals or taxa related by a common ancestor Slide 26 Gene lineages The examples we saw in the previous slides show a snapshot of the present situation of a population or group of populations. If we add history to the distribution of genetic diversity we have a historical perspective: lineages are represented by phylogenetic trees or networks, that implicit time. An evolutionary lineage (also called a clade) is composed of species, taxa, or individuals that are related by descent from a common ancestor. Lineages are often determined by the techniques of molecular systematics. A lineage can be distinguished from a mere collection of species by the fact that it contains only and all individuals that share a common ancestor. In phylogenetic reconstructions above the species level, the lineages are represented by dicotomic trees, where the nodes represent an extinct common ancestor and the legth of the branches represent the amount of change accumulated since the divergence. Below the species level, the lineages can also be represented by phylogenetic trees, but are better represented by networks, as the “ancestors” (represented by individuals at the nodes) are alive, and networks allow polytomies. In the slide we present an example of a phylogenetic tree of several Diptera, and a network of all detected mtDNA haplotypes for the microfrog Microbatrachella capensis. Within species, the most common genetic markers to reconstruct genealogies are those of uniparental inheritance: DNA sequencies in mtDNA and markers (DNA sequencies or microsatellites) in the Y chromosome in mammals. When the lineages are superimposed to their biogeographic distribution, the analysis of population structure takes a biogeographic and historical perspective. Fly phylogeny image : Definition in: Phylogenetic tree

Diversity with uniparental markers
n haplotypes h Haplotype diversity = Total n individuals Slide 27 Diversity in uniparental markers The most utilized measure of diversity for diploid markers is H (heterozygosity). Equivalent measures of diversity were also designed for uniparental markers like mtDNA sequences. The simplest one is the “haplotype diversity” h, which is simply the relative number of haplotypes present in a population. It is obtained by dividing the different haplotype counts by the total number of individuals. The other well-extended measure is the nucleotide diversity, which is equivalent to the heterozygosity at the nucleotide level. It measure the average number of nucleotide differences between haplotypes at the population. It is given by the equation p = n / (n-1) Σ xixjp ij where p ij is the proportion of different nucleotide between sequence i and sequence j. n is the total number of sequences in the sample. Xi and xj are the frequencies of sequences xi and xj in the population. For instance, a sample with high haplotype diversity means that the proportion of different haplotypes in the population is high. A value of h = 0.9 indicates that 90% of the individuals carry a different haplotype. A value of p indicates that, on average, the sequences in this sample differ in 0.1 %, or that, in pairwise comparisons, 2 any sequences taken from this sample would differ in 1 out of 1000 nucleotides. This is a very low nucleotide diversity value. Now, assuming that these two measure were taking in the same population we can say that this sample is composed by a high number of haplotypes, and that all are very similar to each other. A possible evolutionary scenario for this case is that of a recent population expansion, where a considerable number of new haplotypes have recently originated from one or a few haplotypes. When this is portrayed as a network, it is represented as a star-shaped phylogeny. n Σ xixjpij Nucleotide diversity p = n -1

Phylogeography Population bottlenecks, expansions Gene flow
Study of geographic distribution of lineages Slide 28 Phylogeography (See example in Slides 1-3, Class III) Phylogeography is the study of the processes controlling the geographic distributions of lineages by constructing the genealogies of populations and genes (Avise 2000). Phylogeographical inferences are usually made by studying the reconstructed genealogical histories of individual genes (gene trees) sampled from different populations. The processes that can be inferred from a tree or network are historical demographical factors like population bottlenecks, population expansions, and levels of gene flow (e.g. vicariance). The network in the slide was reconstructed from variation at the mtDNA control region of 300 haplochromine cichlids from Lake Victoria region. Cichlids underwent an explosive radiation (rapid diversification and speciation) during the Pleistocene. The size of the circles represent the frequency in the sample, the colours indicate the geographical origin of the sample. The star-shaped phylogenies represent a population expansion: the diversification process starts from a frequent haplotype. The presence of the same haplotype in different regions indicates gene flow (e.g. the biggest circle on the left hand side represent a haplotype that is present in Lake Kivu and Uganda lakes. References Avise, J. C. (2000). Phylogeography: the history and formation of species, Harvard University Press. The figure was taken from Verheyen E, Salzburger W, Snoeks J, Meyer A (2003). Origin of the Superflock of Cichlid Fishes from Lake Victoria, East Africa. Science 300:

ESUs Waples 1991: populations that are reproductively separate from other populations and have unique or different adaptations. Moritz 1994: populations that are reciprocally monophyletic for mtDNA alleles and show significant divergence of allele frequencies at nuclear loci. Crandall et al ecological exchangeability genetic exchangeability Reciprocal monophyly Slide 29 ESU Several factors have to be taken into account to decide whether or which populations are of value to maintain the evolutionary potential of the species. The principles of population genetics and the utilization of a variety of molecular markers facilitated in the last 2 decades the identification of population or species that can be particularly vulnerable to extinction due to, e.g., reduction in diversity, or populations that represent unique or rare lineages. Most of the phylogenetic reconstruction and identification of lineages utilized neutral variation. Thus, the concept of ESU gave major importance to identification of lineages and phylogenetic reconstruction. Nowadays, the adaptive value of a population, although sometimes difficult to measure, it proposed to be taken into account. Following, we review the concepts of ESU that have been utilized in the last 15 years: Ryder 1986: populations that actually represent significant adaptive variation based on concordance between sets of data derived by different techniques. Waples 1991: populations that are reproductively separate from other populations and have unique or different adaptations. Moritz 1994: populations that are reciprocally monophyletic for mtDNA alleles and show significant divergence of allele frequencies at nuclear loci. More recently, Crandall (Crandall 2000) incorporated the concepts of “genetic” and “ecological exchangeability” to define ESUs Ecological exchangeability implies that individuals can be moved to the area occupied by other individuals of the same species and occupy the same niche or selective regime. Characters that confer ecological exchangeability should be heritable. Ecological exchangeability is rejected when populations underwent differentiation due to drift or selection, if they display different loci under selection or different life history traits. Genetic exchangeability is directly related to the level of gene flow. Unique alleles, low gene flow estimates and phylogenetic divergence concordant with geographic barriers are criteria for rejecting genetic exchangeability. References Crandall KA, Bininda-Emonds ORP, Mace GM, Wayne RK (2000) Considering evolutionary processes in conservation biology. Trends Ecol. Evol., 15, Moritz, C. (1994) Defining ‘evolutionary significant units’ for conservation. Trends Ecol. Evol. 9, 373–375. Ryder, O.A. (1986) Species conservation and systematics: the dilemma of subspecies. Trends Ecol. Evol. 1, 9–10 Waples, R.S. (1991) Pacific salmon, Oncorhynchus spp., and the definition of ‘species’ under the endangered species act. Mar. Fish. Rev. 53, 11–22

Maria Eugenia D’Amato Slide 1

Similar presentations

Presentation on theme: "Maria Eugenia D’Amato Slide 1"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Maria Eugenia D’Amato Slide 1

Similar presentations

Presentation on theme: "Maria Eugenia D’Amato Slide 1"— Presentation transcript:

Similar presentations

About project

Feedback