Molecular Epidemiology
This is the principal technique of scientific inquiry: by changing the scale of description, we move from unpredictable, unrepeatable individual cases to collections of cases whose behavior is regular enough to allow generalizations to be made. (S. Levin, 1947)
Epidemiology. Originally: the study (-ology) upon (epi-) populations of people (demes). Now much broader: inquiry into events that take place over very different temporal scales, from the identification of organisms that diverged millions of years ago to the tracing of contacts.
Identity of an infectious agent in an outbreak On a Large Scale:
Ribosomal RNA. Coding regions: highly conserved across widely divergent species. Transcribed spacer regions: less conserved, different between closely related species. Non-transcribed spacer regions: vary within and among species.
Where is it?
What is it? Microscopy: Figure 1. Oocysts of a Cyclospora species (Panel A), Cryptosporidium muris (Panel B), and C. parvum (Panel C) (modified acid-fast stain). Molecular methods: DNA sequencing. Use “moderately variable” regions, such as the transcribed spacer.
Cyclospora ITS-1. 1. Use conserved primers from the flanking (coding) regions to amplify and sequence the ITS. 2. Design primers common to all Cyclospora isolates. 3. Test sensitivity and specificity of primers. Amplified 36 C. cayetanensis isolates from around the world. Did NOT amplify 20 species with similar pathology, among them Cryptosporidia. Faint band from Babesia gibsoni.
Assumptions. False positives: less stringent PCR conditions. False negatives: overly stringent conditions, combined with unforeseen mutations in primer regions.
Zooming in: the way to study events on a large scale may not be the way to study events on a small scale (think physics). What is TRUE on one scale may not be true on another scale.
On a Smaller Scale: Strains: Transmission cycles
GIARDIA Giardia has Two Heads!
Mycobacterium tuberculosis. According to the WHO: 2 billion infected. 1/10 will become sick. 2.7 million die each year. TB is the largest single-agent killer of women and of the young.
Mycobacterium tuberculosis What is the frequency of exogenous re-infection? With MDR-TB? What are the transmission dynamics in endemic countries?
Isoenzymes. Isoenzymes/allozymes: electrophoresis to determine differences in enzymes. Allozymes detect differences between alleles of a given enzyme. Very weak: detects 60% of change, and only at enzyme loci. Giardia divided into 2 clades; evidence for zoonosis.
RFLP: restriction fragment length polymorphism. Usually a true sequence surrogate: a difference in RFLP pattern is ideally due to a change in the nucleotide sequence at one or many restriction sites. RFLPs are highly dependent on experimental conditions.
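The idea of RFLP as a sequence surrogate can be sketched with an in-silico digest. This is a minimal illustration, not a lab protocol: the toy sequences are hypothetical, the helper name `fragment_lengths` is invented here, and the recognition site GAATTC (EcoRI, cutting after the first G) is chosen only as a familiar example.

```python
def fragment_lengths(seq, site="GAATTC", cut_offset=1):
    """Sorted fragment lengths after cutting a linear sequence at every site.

    cut_offset is the position of the cut inside the recognition site
    (1 means the cut falls after the first base of the site).
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]))


# Hypothetical toy sequence carrying two recognition sites.
original = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
# A single substitution (A -> G) destroys the second site.
mutant = original[:18] + "GAGTTC" + original[24:]

print(fragment_lengths(original))  # three fragments
print(fragment_lengths(mutant))    # two fragments: one band pattern change
```

One point mutation inside a site merges two fragments into one, which is the kind of band-pattern difference an RFLP gel reports.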
GIARDIA: RFLP of the intergenic rRNA spacer (IGS). RFLP of the IGS locus differentiates four strains, compared to 2 identified by isoenzyme analysis.
TB: RFLP with Insertion Sequences. IS6110 fingerprinting: use alu to digest the genome. Little variation in RFLP. The question is: in which fragments is the insertion element present? IS6110 is a transposon that jumps around the genome. IS6110 is not purely a “sequence surrogate”; it is also a “transposon surrogate.”
IS6110: The ruler is ALIVE. It is dynamic, and reaches equilibrium more slowly than TB in an outbreak.
IS6110: the number of IS6110 copies in TB genomes varies from 0 to 25. When copy number is low (k < 5), there is less change in fingerprints, so contact investigation is very hard.
RAPD or AP-PCR. RAPD/AP-PCR: amplify with random primers. Sequence surrogate: tests whether there is a change in the template regions only. Analysis is the same as for RFLP. Low-stringency cycles lead to amplification of contaminants. Highly dependent on reaction conditions. Groupings correspond to isozymes.
AFLPs: digest DNA, ligate to adaptors, PCR. Don’t need low-stringency steps, so less non-specific amplification. Same analysis as RFLPs. Need 0.2 to 1 mg of DNA. No good for Giardia and other parasites: need too much DNA.
Microsatellites: simple sequence repeats. Repeating motifs of 2-5 bp, scattered throughout the genome. Amenable to PCR and cloning due to small allele size.
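As a rough illustration of what "repeating motifs of 2-5 bp" means in sequence terms, the sketch below scans a string for tandem repeats. The function name `find_ssrs`, the `min_repeats` threshold, and the example sequence are all assumptions made for this example; real microsatellite discovery tools apply more careful rules.

```python
import re


def find_ssrs(seq, min_repeats=4):
    """Find tandem repeats of a 2-5 bp motif with at least min_repeats copies.

    Returns (start_index, motif, copy_number) tuples. Runs of a single
    base (homopolymers) are skipped, since they are not 2-5 bp motifs.
    """
    # Lazy {2,5}? tries the shortest motif first, so CACACACA is
    # reported as (CA)4 rather than (CACA)2.
    pattern = re.compile(r"([ACGT]{2,5}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Hypothetical sequence: a (CA)5 repeat flanked by unique sequence.
example = "TTG" + "CA" * 5 + "GGT"
print(find_ssrs(example))
```

Alleles at such a locus differ in copy number, so the third element of each tuple is the quantity a genotyping assay would compare between isolates.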
Minisatellites Repeating motifs 10-100 bp Analysed with DNA probes specific for a single locus.
TB Spoligotyping (Spacer Oligotyping). Direct repeat (DR) locus: 36 bp repeats, frequency varies. Use primers somewhere in the DR to amplify the non-repetitive spacer sequences (34-41 bp). Identify the spacers by hybridization to oligonucleotides of known sequence. Need sequence to generate the oligos.
Depends on: dynamics of DR regions; change in sequence in non-repetitive regions. DR regions: are they at equilibrium? How often do they repeat? Not yet known.
Spoligotypes vs. IS6110
# IS6110    # IS6110 types    # Spoligotypes
1           1                 10
2-5         7                 8
>5          80                52
Spoligotyping can identify M. bovis (BCG vaccine). Detection and strain differentiation can be done simultaneously, without culture.
DNA sequence of small subunit (SSU) ribosomal RNA (highly conserved) suggests four groups of Giardia. Groups 3 and 4 are only in dogs.
293 bp
1 -------GCG------_G---------T-------C-------------------
2 -------ATC-------AC---------G------G-------------------
3 -------ATC-------AC---------A------G---------T--------
4 -------ATC-------AC---------A------A----------T----A-
1 and 2 are mainly in humans, though some dogs have 3. Groups 2, 3 and 4 are nearly identical. Is this good evidence against zoonosis?
Models of Nucleotide Substitution. On a large scale, we can calculate the rate of substitution, then estimate the likelihood of any given substitution and control for confounders (transition-transversion, codon bias, etc.). On a small scale we do not know the rate, the process is nearly random, and confounders may be irrelevant.
Distributions. BINOMIAL: $\Pr(Y=y) = \frac{n!}{y!\,(n-y)!} P^{y} (1-P)^{n-y}$; mean $= nP$, variance $= nP(1-P)$. POISSON: $\Pr(Y=y) = \frac{\mu^{y} e^{-\mu}}{y!}$; mean and variance $= \mu$. Central Limit Theorem: a large number of events gives a normal distribution. Binomial: coin toss. Poisson: rare events, like tossing a 100,000-sided die.
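The "100,000-sided die" remark can be checked numerically: with many trials and a tiny success probability, the binomial probabilities are almost the same as Poisson probabilities with the same mean. The sketch below uses only the two formulas on this slide; the function names and the choice of n and P are illustrative assumptions.

```python
from math import comb, exp, factorial


def binomial_pmf(y, n, p):
    """Pr(Y = y) for n independent trials with success probability p."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


def poisson_pmf(y, mu):
    """Pr(Y = y) for a Poisson variable with mean (and variance) mu."""
    return mu**y * exp(-mu) / factorial(y)


# Toss a 100,000-sided die 100,000 times and count one chosen face.
n, p = 100_000, 1 / 100_000
mu = n * p  # mean of both distributions

for y in range(5):
    print(y, binomial_pmf(y, n, p), poisson_pmf(y, mu))
```

The two columns agree closely, which is why rare events such as mutations or transpositions are usually treated as Poisson.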
Kimura’s 2 parameter. For instance, as the rates of transition and transversion become small, Kimura’s 2-parameter model reduces to a one-parameter model: $K = -\frac{1}{2}\ln\left[(1-2P-Q)\sqrt{1-2Q}\right]$ becomes $K \approx P + Q$, where K is the distance per site and P and Q are the fractions of sites with transition and transversion changes, respectively.
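A small numeric sketch of this reduction follows. It assumes two aligned sequences of equal length containing only A, C, G and T; the function names and the toy sequences are invented for the example.

```python
from math import log, sqrt

PURINES = {"A", "G"}


def transition_transversion_fractions(s1, s2):
    """Return (P, Q): fractions of sites differing by transition / transversion."""
    transitions = transversions = 0
    for a, b in zip(s1, s2):
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    return transitions / len(s1), transversions / len(s1)


def k2p_distance(p, q):
    """Kimura 2-parameter distance per site."""
    return -0.5 * log((1 - 2 * p - q) * sqrt(1 - 2 * q))


# Hypothetical sequences: one transition (A/G) and one transversion (A/T).
P, Q = transition_transversion_fractions("AACCGGTTAA", "GACCGGTTAT")
print(P, Q, k2p_distance(P, Q))

# With few changes the correction is negligible: K is close to P + Q.
print(k2p_distance(0.01, 0.005), 0.01 + 0.005)
```

For closely related isolates the observed fraction of differences is already a good distance; the logarithmic correction only matters once multiple hits per site become likely.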
How to Analyze RFLP and other sequence surrogates. Two sources of information: number of bands, and size of each fragment. In practice, it can be difficult to score changes in fragment size. Most studies look only at the presence or absence of a certain pattern.
Nei and Li’s model for RFLP. The expected frequency of restriction sites with r nucleotide pairs depends on the G+C content of the genome and on the G+C content of the restriction site sequence: $a = (g/2)^{r_1} [(1-g)/2]^{r_2}$, where g is the G+C content of the genome, $r_1$ and $r_2$ are the numbers of G+C and A+T pairs in the restriction site, and $r_1 + r_2 = r$.
$m_T$ = number of nucleotide pairs in the genome. $m_T a = n$, the expected number of restriction sites. What is the probability that n changes over time t?
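The site-frequency formula can be turned into a short calculation. This is a sketch under stated assumptions: the recognition site GAATTC, the G+C contents, and the genome size are example values chosen for round numbers, and the function names are hypothetical.

```python
def site_frequency(site, g):
    """Expected frequency a of a restriction site in a genome with G+C content g.

    r1 counts G or C positions in the site, r2 counts A or T positions.
    """
    r1 = sum(base in "GC" for base in site)
    r2 = len(site) - r1
    return (g / 2) ** r1 * ((1 - g) / 2) ** r2


def expected_sites(m_t, site, g):
    """Expected number of restriction sites n = m_T * a."""
    return m_t * site_frequency(site, g)


# Illustrative values only: a 6 bp site in a 4,096,000 bp genome.
print(site_frequency("GAATTC", 0.5))             # equals (1/4)**6
print(expected_sites(4_096_000, "GAATTC", 0.5))  # about 1000 sites
print(site_frequency("GAATTC", 0.65))            # AT-rich site is rarer in a GC-rich genome
```

The last line shows why the choice of enzyme matters: an A+T-rich recognition site yields fewer, larger fragments in a G+C-rich genome.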
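Mutations are a Poisson process. The probability that a restriction site remains unchanged is $P = e^{-r\lambda t}$, where $\lambda$ = mutation rate per nucleotide, r = length of the restriction sequence, t = time.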
Nei and Li continued.
$n(t)$ = number of bands at time t $= n_1(t) + n_2(t)$, where $n_1(t)$ = number of sites that do not change and $n_2(t)$ = number of new sites.
$E(n) = E(n_1) + E(n_2) = n_0 P + m_T a (1-P)$
Variance: $n_1(t)$ and $n_2(t)$ are independent, so $\mathrm{Var}[n(t)] = \mathrm{Var}[n_1(t)] + \mathrm{Var}[n_2(t)]$.
$n_1(t)$ is binomial, $n_2(t)$ is Poisson: $\mathrm{Var}[n(t)] = n_0 P(1-P) + m_T a (1-P)$
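The mean and variance above are easy to evaluate for concrete numbers. In this sketch P is passed in directly as the probability that a site is unchanged; the helper `unchanged_prob` assumes the form exp(-r * lam * t), with `lam` as a name chosen here for the per-nucleotide mutation rate. All numeric values are illustrative.

```python
from math import exp


def unchanged_prob(r, lam, t):
    """Probability that an r-bp site is unchanged after time t (assumed form)."""
    return exp(-r * lam * t)


def band_count_moments(n0, m_t, a, p):
    """Mean and variance of n(t) = n1(t) + n2(t).

    n1 (surviving sites) is binomial(n0, p); n2 (new sites) is Poisson
    with mean m_t * a * (1 - p). The two are treated as independent.
    """
    mean = n0 * p + m_t * a * (1 - p)
    var = n0 * p * (1 - p) + m_t * a * (1 - p)
    return mean, var


# Illustrative case: the genome starts at its expected site count,
# n0 = m_t * a = 1000, and 90% of sites are unchanged.
mean, var = band_count_moments(n0=1000, m_t=4_096_000, a=0.25**6, p=0.9)
print(mean, var)
```

When n0 already equals m_T a, the expected band count stays at n0 for any P: losses and gains balance, and only the variance reflects elapsed time.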
IS6110 is modelled similarly. Transposition is rare, so it is modelled as a Poisson process: probability of at least 1 change $= 1 - e^{-kqt}$, where k is the number of copies of the transposon in the genome and q is the rate of transposition when k = 1.
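The dependence on copy number can be made concrete with a few lines. The rate q and the time span below are hypothetical values picked only to show the shape of the relationship; the function name is also an assumption.

```python
from math import exp


def prob_fingerprint_change(k, q, t):
    """Probability of at least one transposition event in time t.

    k: number of IS6110 copies, q: transposition rate per copy,
    t: elapsed time (in the same time unit as q).
    """
    return 1 - exp(-k * q * t)


# Hypothetical rate: 0.01 events per copy per year, over 5 years.
for copies in (1, 2, 5, 10, 20):
    print(copies, round(prob_fingerprint_change(copies, 0.01, 5), 3))
```

Low-copy strains rarely change their fingerprint over the time span of an outbreak, which matches the earlier point that contact investigation is hard when k < 5.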
Really Small: New Technology. Genetic marking of drug resistance, or virulence. -Representational Difference Analysis (RDA) -High-throughput genotyping -Microarrays
Representational Difference Analysis. “Cloning the Differences Between Two Complex Genomes,” Lisitsyn, Science, Feb. 1993. Uses subtractive and kinetic enrichment to purify fragments present in one population but absent in another. Basically, differential amplification of polymorphic fragments.
High-Throughput Genotyping. Fluorescent labels incorporated into RAPDs, microsatellites and AFLPs. Can run in ONE electrophoresis lane. Result: complicated fingerprints that take into account variation at different levels.
Conclusions 1. The strongest analyses will be those that consider variation on multiple temporal levels. 2. Everyone says their technique is economically feasible for use in endemic countries; no one says how much their technique costs. 3. Stay away from Guatemalan raspberries.