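Chloroplast DNA
- Small, 120-220 kb
- Circular, usually with an inverted repeat
- No recombination
- Inheritance usually maternal in angiosperms, often paternal in gymnosperms (e.g. conifers)
- Gene order highly conserved across green plants
[Figure: circular plastid genome map showing the large single-copy region, the small single-copy region (scr) and the two inverted-repeat copies (each with rpl2, 16S, 23S); rbcL, atpB, atpE, matK, psbA and trnH are also marked.]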
Mitochondrial DNA
- Size: animal 14-26 kb; plant 150-2500 kb
- Form in animals: circular, usually homogeneous among cells
- Form in plants: set of different-sized circles, which arise from processes that interconvert between mother circle & subgenomic circles
- No recombination, inheritance maternal
- Animals: mutation rates high at sequence level (substitutions)
- Plants: rapid evolution in gene order but slower at sequence level (ca. 100x slower than in animals)
Main sources of DNA evidence
- Control centres: turn genes on & off
- Genes (single-copy or multi-copy): code for proteins
- Inter-genic spacers: non-coding sequences between genes
- Introns: non-coding sequences within genes
- Transposons & retroviruses
Gene structure
- Exons are composed of start, amino acid & stop codons. Highly conserved regions. Useful at higher taxonomic levels, e.g. genus & above.
- Introns are non-coding regions within a gene. Spacers are non-coding regions between genes. Both are potentially highly variable regions. Useful at genus level and below, sometimes down to population level.
[Figure: gene structure, showing upstream enhancer, promoter with TATA box, 5' UTR, exon 1, intron 1, exon 2, intron 2, exon 3, 3' UTR and spacer.]
Multi-copy genes: rDNA
- Tandem repeats: 100s to 1000s of copies.
- Nuclear genome: biparental inheritance.
- Concerted evolution is sometimes a problem.
- Coding regions (the nS genes: 18S, 5.8S, 25S) highly conserved: the 18S gene of soyabean shares 75% nucleotide identity with that of yeast.
- ITS & IGS regions highly variable.
[Figure: rDNA tandem repeat, in the order IGS, 18S, ITS1, 5.8S, ITS2, 25S, IGS, 18S ...]
Making inferences from the data
- Gene trees vs species or organism trees
  - often only two genes (or regions) studied [out of ca. 25,000 genes present]
- Data from the different genomes may or may not be congruent
  - each genome tells its own story, which may not be that of the whole organism
Sequencing
Dideoxynucleotide method
[Figure: structure of deoxythymidine triphosphate (three phosphates, base T, 3' OH). With H in place of the 3' OH it is a dideoxynucleotide, which stops the sequencing reaction.]
Sequencing reaction
- Single-stranded target DNA template: T A G C A G
- Divide the template into four samples. To each sample add:
  - all 4 deoxynucleotides (G, C, A, T), one of which is dye- or radio-labelled
  - one of the 4 dideoxynucleotides, i.e. ddTTP, ddATP, ddCTP or ddGTP
  - DNA polymerase
- Start reaction. In each strand of new DNA the last base is a ddNTP, because it terminates the chain.
- Products (* = terminating ddNTP):
  - ddTTP: A T* and A T C G T*
  - ddATP: A*
  - ddCTP: A T C* and A T C G T C*
  - ddGTP: A T C G*
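The four reactions can be sketched in a few lines of Python. This is only an illustration of the chain-termination logic on the slide's template (T A G C A G); the function name `sanger_fragments` is made up for this sketch, and strand orientation is ignored, as it is in the slide's diagram.

```python
# Minimal sketch of the dideoxy reactions (illustrative, not a lab protocol).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def sanger_fragments(template):
    """Return, for each ddNTP reaction, the terminated new strands."""
    # New strand written in the same left-to-right order as the slide.
    new_strand = "".join(COMPLEMENT[base] for base in template)
    lanes = {base: [] for base in "TACG"}
    for i, base in enumerate(new_strand, start=1):
        # A ddNTP of this base can stop the chain at position i.
        lanes[base].append(new_strand[:i])
    return lanes

lanes = sanger_fragments("TAGCAG")
for base, fragments in lanes.items():
    print(f"dd{base}TP:", fragments)
```

Each reaction yields only fragments ending in its own base, which is what lets the four lanes be read as a sequence.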
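Pattern of gel fragments
bases  ddTTP  ddATP  ddCTP  ddGTP  base
6                    ----          C
5      ----                        T
4                           ----   G
3                    ----          C
2      ----                        T
1             ----                 A
0
- Newly synthesised DNA is isolated & run out on a gel.
- Radio-labelling -> visualised.
- Reading from the shortest fragment (1) to the longest (6) gives the new strand: A T C G T C.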
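Reading a gel can be written as a small lookup. The sketch below assumes the lane contents for the example template T A G C A G (fragment lengths per ddNTP lane); `read_gel` is a hypothetical helper name.

```python
# Sketch: read a sequence off a four-lane gel (lane -> fragment lengths).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def read_gel(lanes):
    """Order bands from shortest to longest and read off the lane of each."""
    band_to_base = {}
    for base, lengths in lanes.items():
        for length in lengths:
            band_to_base[length] = base
    return "".join(band_to_base[length] for length in sorted(band_to_base))

# Band positions for the example template T A G C A G.
gel = {"T": [2, 5], "A": [1], "C": [3, 6], "G": [4]}
new_strand = read_gel(gel)
template = "".join(COMPLEMENT[base] for base in new_strand)
print(new_strand, template)
```

The gel gives the newly synthesised strand; the template is its complement.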
Phylogenetic systematics
- Parsimony. Identifies the tree with the minimum number of mutations (character-state changes).
- Maximum likelihood. Identifies the tree that has the highest probability of producing the observed data, given a particular model of evolution.
- Bayesian inference. Like maximum likelihood, but combines the likelihood with prior probabilities to give posterior probabilities for trees. Hurts the brain!
- ALL TREES CAN BE TESTED STATISTICALLY!!!
  - bootstrap
  - jackknife
  - decay index
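To make the parsimony criterion concrete, here is a minimal sketch that counts character-state changes on a given tree using Fitch's algorithm, one common way of scoring a tree under parsimony. The four-taxon alignment and the two candidate trees are invented for illustration; a real analysis would search many trees with dedicated software.

```python
# Sketch: tree length (number of changes) under Fitch parsimony.
def fitch(tree, states):
    """Return (possible ancestral states, changes) for one character."""
    if isinstance(tree, str):              # a tip: its observed state, no changes
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:                             # children agree: no extra change
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1

def tree_length(tree, alignment):
    """Sum the Fitch changes over every column of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        states = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch(tree, states)[1]
    return total

# Hypothetical aligned sequences for four taxa.
alignment = {"A": "AAGT", "B": "AAGC", "C": "GTGC", "D": "GTAC"}
tree1 = (("A", "B"), ("C", "D"))
tree2 = (("A", "C"), ("B", "D"))
lengths = {"tree1": tree_length(tree1, alignment),
           "tree2": tree_length(tree2, alignment)}
print(lengths)  # the shorter tree is preferred under parsimony
```

With these made-up data the tree grouping A with B needs fewer changes, so parsimony prefers it.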
Phylogenetic definitions
[Figure: four trees of the taxa A, B and C, one for each case below.]
- AB monophyletic: defined by a synapomorphy
- BC paraphyletic: defined by a symplesiomorphy
- BC paraphyletic: defined by a false synapomorphy
- BC polyphyletic: defined by a false synapomorphy
Sequencing: pros & cons
- large amounts of easily scored, robust data
- inter-taxon comparisons easy
- universal primers mean prior sequence knowledge unnecessary
- screen one 'locus' at a time (time-consuming but now automated)
- technical problems: alignments, secondary structure, etc.
- heterozygosity requires cloning to resolve
Genepool & population phenomena
[Figure: values labelled "st" as printed on the figure: 0.447, 0.555, 0.468, 0.390, 0.287, 0.289.]
RFLPs
Restriction Fragment Length Polymorphisms
- Use restriction enzymes to cut DNA at recognition sites (usually 6 bp long).
- Separate fragments on an agarose gel.
- Stain fragments with ethidium bromide & view under UV.
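A digest is easy to simulate. The sketch below cuts a linear sequence at every EcoRI site (G^AATTC, cutting after the first base) and reports fragment lengths. The two "alleles" are invented sequences differing by one substitution that destroys a site, and `digest` is a hypothetical helper.

```python
# Sketch: in-silico restriction digest of a linear DNA sequence.
def digest(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting at every occurrence of site."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)    # EcoRI cuts between G and AATTC
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Hypothetical alleles: allele_b has lost the first EcoRI site (A -> G).
allele_a = "ATGC" * 5 + "GAATTC" + "ATGC" * 10 + "GAATTC" + "ATGC" * 3
allele_b = allele_a.replace("GAATTC", "GAGTTC", 1)

print(digest(allele_a))  # three fragments
print(digest(allele_b))  # two fragments: one site lost
```

Losing a site merges two fragments into one longer fragment, which shows up as a different banding pattern on the gel.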
Fragment patterns in hybrids
- Different patterns are the result of gains/losses of restriction sites or inversions.
- Co-dominant in nuclear DNA: good for detecting hybrids.
[Figure: restriction map of nuclear DNA with the probe position; enzyme 1 sites give fragments of 7 and 12, enzyme 2 sites give fragments of 4, 5 and 9. Gel diagrams for genotypes AA, AB and BB: enzyme 1 fragments at 19, 12 and 7; enzyme 2 fragments at 14, 9, 5 and 4.]
Mapping
- When divergence is great, RFLP fragment patterns are too complex to analyse directly. Need to map.
- Cloned fragments from one genome are used to probe Southern blots of different enzyme digests of that (or another) genome.
Mapping example
- Use single and double digests.
- Data scored as site mutations; ordered. Can be used for phylogenetic purposes.
[Figure: restriction map with sites b, b, a, c and intervals of 1, 2, 4 and 10. Gel diagrams: single digests (a, b, c) with bands at 17, 10 and 7; double digests (a+b, b+c, a+c) with bands at 15, 10, 6, 4, 3, 2 and 1.]
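The logic of single versus double digests can be sketched as follows. The site positions here are hypothetical (they are not the positions in the slide's figure) and the molecule is assumed to be a linear 17 kb fragment. In practice the problem runs the other way: the fragment sizes are observed and the map is inferred from them.

```python
# Sketch: fragment sizes expected from single and double digests.
from itertools import combinations

def fragments(length, cut_positions):
    """Fragment sizes (largest first) for a linear molecule cut at the given positions."""
    bounds = [0] + sorted(cut_positions) + [length]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)

LENGTH = 17                                    # kb, hypothetical
SITES = {"a": [7], "b": [2, 13], "c": [11]}    # hypothetical site positions

single = {enzyme: fragments(LENGTH, pos) for enzyme, pos in SITES.items()}
double = {f"{e1}+{e2}": fragments(LENGTH, SITES[e1] + SITES[e2])
          for e1, e2 in combinations(sorted(SITES), 2)}

print(single)
print(double)
```

Comparing a double digest with the two single digests shows which single-digest fragment each extra site falls in, and that is what fixes the order of sites on the map.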
RFLP properties
- uniparental inheritance in plastid DNA
- biparental, co-dominant inheritance in nuclear DNA
- when divergence is great, RFLP fragments are too complex to analyse: need to use mapping approach
- applications: detecting hybridisation, analysis of genepool structure (phylogeography)
RFLP pros & cons
- Robust, repeatable data
- PCR-based
- Capable of detecting much variation if enough enzyme combinations used
- Logarithmic migration of fragments means small changes in large fragments are hard to detect
- Some restriction enzymes are sensitive to methylation
RAPD
Randomly Amplified Polymorphic DNA
- arbitrary 10 bp primers
- target sequences flanked by inverted repeat primer sites
- permits multiple annealing throughout all three genomes
- coding & non-coding regions; single- & multi-copy DNA
- inherited as a dominant marker (cannot distinguish heterozygotes from homozygotes)
[Figure: gel comparing individuals A and B, with bands scored as present or absent (-).]
RAPD properties
- each priming site treated as +/- (diallelic)
- inheritance dominant (primer site present in one homozygote + the heterozygote, but not in the other homozygote)
- presence/absence of primer sites due to many possible causes (substitutions, indels, secondary structure between priming sites)
- identifies multi-locus genotypes
- applications: gene diversity, clonality, population structure
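Because heterozygotes cannot be seen directly with a dominant marker, allele frequencies have to be inferred. One common approach, sketched below, assumes Hardy-Weinberg proportions (an assumption not stated on the slide): the frequency of band-absent individuals is taken as q^2. The band scores are invented and the function names are illustrative.

```python
# Sketch: gene diversity at one dominant (+/-) locus, assuming Hardy-Weinberg.
from math import sqrt

def null_allele_frequency(bands):
    """q = sqrt(frequency of band-absent individuals); valid only under HWE."""
    absent = bands.count(0)
    return sqrt(absent / len(bands))

def expected_heterozygosity(bands):
    """Expected heterozygosity 2pq for a diallelic locus."""
    q = null_allele_frequency(bands)
    p = 1 - q
    return 2 * p * q

# Hypothetical scores for 25 individuals: 1 = band present, 0 = band absent.
scores = [1] * 21 + [0] * 4
q = null_allele_frequency(scores)
he = expected_heterozygosity(scores)
print(round(q, 3), round(he, 3))
```

If the population is inbred or clonal the Hardy-Weinberg assumption fails and the estimate is biased, which is one reason dominant markers need cautious interpretation.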
RAPD: pros & cons
- simple
- PCR-based
- no prior sequence info needed
- non-destructive
- can screen large number of loci
- cheap
- problems of reproducibility
- product competition
- product homology
- genome sampling
- non-independence of loci
- estimates of population differentiation may be inflated
AFLPs
Amplified Fragment Length Polymorphisms
- cut DNA with a pair of enzymes: one rare cutter & one common cutter
- attach known DNA sequences to the products
- amplify products using the known sequences as priming sites
- rather like RAPDs but much more reproducible
- dominant inheritance
AFLPs: steps
- DNA
- restriction digestion x2 (rare cutter & common cutter)
- double-stranded adaptor ligation
- PCR1: preselective amplification [primers complementary to adaptor, plus one extra base]
- PCR2: selective amplification [primers as in PCR1, plus up to 3 extra bases, labelled]
[Figure: primers drawn as adaptor primer (pr) plus selective bases: A+pr and pr+G in PCR1; A+pr(N)3 and pr+G(N)3 with a label (*) in PCR2.]
AFLP properties
- number of fragments determined by the number of selective bases in the selective primer (1 base gives more fragments than 2 or 3 bases)
- scored as diallelic loci
- usually dominant inheritance
- applications: gene diversity, clonality, population structure, hybridisation
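A rough calculation shows why more selective bases mean fewer bands. The sketch assumes equal base frequencies and independence between positions, so each selective base keeps about one fragment in four. These are simplifying assumptions, and the starting number of restriction fragments is invented.

```python
# Sketch: expected number of amplified fragments vs number of selective bases.
def expected_fragments(n_restriction_fragments, bases_primer1, bases_primer2):
    """Each selective base is assumed to retain about 1/4 of the fragments."""
    total_selective = bases_primer1 + bases_primer2
    return n_restriction_fragments / 4 ** total_selective

START = 100_000  # hypothetical number of adaptor-ligated fragments
for n in (1, 2, 3):
    # n selective bases on each of the two primers
    print(n, expected_fragments(START, n, n))
```

Real genomes depart from these assumptions (base composition is uneven), so in practice the number of selective bases is tuned empirically to give a scorable number of bands.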
AFLPs: pros & cons
- reproducible (long primers used)
- PCR-based
- no prior sequence info needed
- non-destructive
- can screen large number of loci: 50-100 per run
- technically demanding
- product homology?
- genome sampling
- non-independence of loci
- estimates of population differentiation may be inflated
Microsatellites (SSRs: Simple Sequence Repeats)
- Short (1-6 bp) tandem repeats (10-50 copies)
- Mono- to tetra-nucleotides, e.g. (AT)n
- Random distribution assumed
- Primers designed for conserved flanking regions
- Variation in repeat number -> polymorphism
- Co-dominant inheritance
[Figure: two alleles of a GA repeat, each lying between flanking regions that carry the primer (pri.) sites:
  GAGAGAGAGAGAGAGAGAGAGAGAGAGA (GA)14
  GAGAGAGAGAGAGAGAGAGA (GA)10]
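Scoring a microsatellite amounts to counting repeat units between the flanking primers. The sketch below uses invented flanking sequences and two invented alleles; `longest_repeat` is a hypothetical helper built on Python's standard `re` module.

```python
# Sketch: count repeat units and compare PCR product lengths of two alleles.
import re

def longest_repeat(seq, motif="GA"):
    """Number of motif copies in the longest uninterrupted tandem run."""
    runs = re.findall(f"(?:{re.escape(motif)})+", seq)
    if not runs:
        return 0
    return max(len(run) for run in runs) // len(motif)

LEFT_FLANK = "CCTAGCTT"    # hypothetical conserved flank (primer site)
RIGHT_FLANK = "TCCGATGG"   # hypothetical conserved flank (primer site)
allele_1 = LEFT_FLANK + "GA" * 12 + RIGHT_FLANK
allele_2 = LEFT_FLANK + "GA" * 9 + RIGHT_FLANK

print(longest_repeat(allele_1), len(allele_1))
print(longest_repeat(allele_2), len(allele_2))
```

The two products differ in length by the repeat-number difference times the motif length, so alleles can be told apart by fragment size alone.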
Microsatellite properties
- Homologous chromosomes may have different repeat lengths, hence inheritance is co-dominant.
- SSRs abundant across genome (but commoner in animals than in plants)
- applications: population-level studies, esp. gene flow
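Co-dominance means both alleles of each individual are observed, so allele frequencies and heterozygosity can be counted directly. A minimal sketch with invented genotypes at one locus (alleles named by repeat number) follows.

```python
# Sketch: allele frequencies and heterozygosity from co-dominant genotypes.
from collections import Counter

def allele_frequencies(genotypes):
    """Frequency of each allele, counting both alleles of every individual."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def observed_heterozygosity(genotypes):
    """Proportion of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)

def expected_heterozygosity(genotypes):
    """1 minus the sum of squared allele frequencies."""
    return 1 - sum(f ** 2 for f in allele_frequencies(genotypes).values())

# Hypothetical genotypes for eight individuals.
genotypes = [(10, 10), (10, 14), (14, 14), (10, 14),
             (10, 12), (12, 14), (10, 10), (14, 14)]
freqs = allele_frequencies(genotypes)
ho = observed_heterozygosity(genotypes)
he = expected_heterozygosity(genotypes)
print(freqs, ho, round(he, 4))
```

Unlike the dominant-marker case, no Hardy-Weinberg assumption is needed to get the allele frequencies, and observed and expected heterozygosity can be compared.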
Microsatellite pros & cons
- co-dominant inheritance allows full genetic analysis
- abundant
- uniformly distributed through genome
- mutation rates high
- large no. alleles/locus
- time-consuming to develop primers
- primer-pairs often species-specific
- stutter bands make interpretation hard
- homoplasy between alleles may be high
- few loci sampled