Outbreak of E. coli O104:H4 heralds a new paradigm in responding to disease threats Nicola J. Holden Leighton Pritchard.

1 Outbreak of E. coli O104:H4 heralds a new paradigm in responding to disease threats Nicola J. Holden Leighton Pritchard

3 EHEC O104:H4 outbreak, Europe 2011
Unprecedented:
- scale of outbreak (3950 affected, 53 deaths; multiple import restrictions)
- emerging pathogen (one previous case in S. Korea)
- rapid production of sequence data
- crowd-sourcing of assembly and annotation via a collaborative revision control site, GitHub: https://github.com/ehec-outbreak-crowdsourced/BGI-data-analysis/wiki

4 EHEC O104:H4 outbreak – timeline
- 1st May: onset of outbreak
- 26th May: strain characteristics (Scheutz et al., 2012 Eurosurveill)
- 30th May: diagnostic laboratory information released (Muenster)
- 2nd June: first draft assembly available (GitHub)
- 9th to 21st June: additional sequences announced
- 22nd June: microbiological characteristics published (Bielaszewska et al., 2011 LID)
- 26th July: official end of the outbreak (RKI)
refs: https://github.com/ehec-outbreak-crowdsourced/BGI-data-analysis/wiki; RKI; Institute of Hygiene, Muenster

7 EHEC O104:H4 outbreak – timeline
- 27th July: publication of open-source genomic analysis

8 A changing paradigm? Kwan et al. (2011)

10 Meanwhile: diagnostics
27th June – 6th July
1. Outbreak isolate-specific, sub-serotype diagnostics
2. Exploit rapid sequencing: work directly from incomplete and unordered draft genome sequences
3. Rapidly generated (perhaps ahead of the biology?)
4. Validated (good estimates of error rates)
5. Easy to use and distribute
6. Cheap(er than sequencing everything)
Alignment-free PCR primer design: no need to identify conserved signature sequences prior to primer design

11 Alignment-free primer design: strategy
- ‘Positive’ genome set: 11 genome assemblies of 9 EHEC O104:H4 outbreak isolates (GitHub crowdsourcing)
- ‘Negative’ genome set: 31 genomes of E. coli and E. fergusonii (GenBank)
- Design many (>1000) primers to positive genome set: target CDS; optimise for qRT; 20-mers; 100 bp amplicons; Ta = 58 °C
- Filter primers in silico: exclude sets with predicted productive amplification in negative genomes
- Screen primers to exclude sets with strong sequence similarity to any of a larger set of off-target genomes (GenBank Enterobacteriaceae)

13 Automation https://github.com/widdowquinn/find_differential_primers

14 Alignment-free primer design
(Diagram labels: Positive, Negative; I–V)
1. Process configuration files: locations and classes of input sequence files.
2. Convert to single (pseudo)chromosomes: concatenate draft genome sequence.
3. Genome feature locations: from GBK file or predicted with Prodigal.
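The concatenation in step 2 can be pictured with a minimal Python sketch. It is an illustration only, not the code of the find_differential_primers pipeline: the function name `concatenate_contigs` and the run of N characters used as a spacer are assumptions made for this example.

```python
def concatenate_contigs(contigs, spacer="N" * 20):
    """Join draft contigs into one pseudochromosome.

    The spacer (an assumption here: a run of Ns) keeps features and primers
    from spanning two unrelated contigs. Returns the joined sequence and the
    (start, end) offsets of each contig within it.
    """
    parts, offsets, pos = [], [], 0
    for i, seq in enumerate(contigs):
        if i > 0:
            parts.append(spacer)
            pos += len(spacer)
        seq = seq.upper()
        offsets.append((pos, pos + len(seq)))
        parts.append(seq)
        pos += len(seq)
    return "".join(parts), offsets


if __name__ == "__main__":
    joined, offsets = concatenate_contigs(["acgt", "GGCC"], spacer="NN")
    print(joined, offsets)
```

Keeping the offsets makes it possible to map any predicted feature or primer back to the contig it came from.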

15 Primer prediction (on positive set)
(Diagram labels: Positive, Negative; I–V)
4. Predict primer locations: >1000 thermodynamically plausible primer sets on each (pseudo)chromosome, using Primer3.
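To give a feel for what "predicting primer sets" on a pseudochromosome involves, here is a deliberately crude Python sketch. It does not call Primer3 and does not reproduce its thermodynamic model; it only enumerates fixed-size windows and applies a GC-content filter. The function names and the GC range are assumptions for illustration; the 20-mer and 100 bp defaults echo the design targets on the strategy slide.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_pairs(seq, primer_len=20, amplicon_len=100, gc_range=(0.4, 0.6)):
    """Enumerate (start, forward, reverse) primer candidates.

    Crude stand-in for a real designer: every window of amplicon_len bases
    gives one candidate pair, kept only if both priming sites are
    unambiguous DNA with GC content inside gc_range (an assumed filter).
    """
    seq = seq.upper()
    low, high = gc_range
    pairs = []
    for start in range(len(seq) - amplicon_len + 1):
        fwd = seq[start:start + primer_len]
        rev_site = seq[start + amplicon_len - primer_len:start + amplicon_len]
        if all(set(p) <= set("ACGT") and low <= gc_fraction(p) <= high
               for p in (fwd, rev_site)):
            pairs.append((start, fwd, reverse_complement(rev_site)))
    return pairs


if __name__ == "__main__":
    print(candidate_pairs("ACGTACGTAC", primer_len=4, amplicon_len=10))
```

Windows containing N (for example, contig spacers) are rejected by the alphabet check, so no candidate spans a contig join.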

16 Test cross-amplification in silico
(Diagram labels: Positive, Negative; I–V)
5. Check cross-amplification: all primer sets tested against other organisms, using PrimerSearch.
6. BLAST screen: all primers screened for off-target sequences with BLAST: 7 possible primer sets
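The logic of the cross-amplification filter can be sketched in Python as follows. This is a simplified stand-in, not PrimerSearch: it uses exact string matching, checks only one strand orientation, and allows no mismatches. The function names and the `max_size` cut-off are assumptions for this example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon_sizes(genome, fwd, rev, max_size=500):
    """Sizes of products predicted by exact matching on the given strand.

    A product is reported when fwd occurs in the genome and the binding
    site of rev (its reverse complement) occurs downstream within max_size.
    """
    genome = genome.upper()
    rev_site = reverse_complement(rev)
    sizes = []
    start = genome.find(fwd)
    while start != -1:
        end = genome.find(rev_site, start + len(fwd))
        while end != -1 and end + len(rev_site) - start <= max_size:
            sizes.append(end + len(rev_site) - start)
            end = genome.find(rev_site, end + 1)
        start = genome.find(fwd, start + 1)
    return sizes


def is_differential(fwd, rev, positives, negatives):
    """True if the pair amplifies every positive genome and no negative one."""
    return (all(amplicon_sizes(g, fwd, rev) for g in positives)
            and not any(amplicon_sizes(g, fwd, rev) for g in negatives))


if __name__ == "__main__":
    outbreak = "TTACGTAAAAGGCATT"   # toy 'positive' sequence
    other = "TTACGTAAAATTTTTT"      # toy 'negative' sequence
    print(is_differential("ACGT", "TGCC", [outbreak], [other]))
```

A real screen would also tolerate mismatches and search both strands, which is part of why this step is the slow one at genome scale.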

17 Classify primers and validation
(Diagram labels: I–V; +ve, -ve)
7. Classify primers: primer sets classified according to their ability to amplify specific classes of input sequence.
8. Validate primers: primer set validated on positive and negative targets in vitro.
5 target sequences: prophage gp20 (2), hypothetical CDS (2), impB (1)
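Step 7 amounts to asking, for each primer set, whether the genomes it amplifies coincide exactly with one class of input sequence. A minimal Python sketch of that idea follows; the function name, the data layout and the genome and class names are hypothetical and chosen only for this example.

```python
def classify_primers(hits, genome_classes):
    """Assign each primer set to the class it amplifies exclusively.

    hits: primer name -> set of genome names with predicted amplification.
    genome_classes: genome name -> class label.
    A primer set is diagnostic for a class when it amplifies every member
    of that class and nothing else; otherwise it is mapped to None.
    """
    members = {}
    for genome, cls in genome_classes.items():
        members.setdefault(cls, set()).add(genome)
    return {
        primer: next((cls for cls, m in members.items()
                      if set(amplified) == m), None)
        for primer, amplified in hits.items()
    }


if __name__ == "__main__":
    classes = {"out1": "outbreak", "out2": "outbreak", "k12": "other"}
    hits = {
        "p1": {"out1", "out2"},          # outbreak-specific
        "p2": {"out1", "out2", "k12"},   # amplifies everything
        "p3": {"out1"},                  # misses one outbreak genome
    }
    print(classify_primers(hits, classes))
```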

18 Validation
- In silico, diagnostic primers are just another classifier
- Validation on unseen data is critical (avoid overfitting, estimation of performance)
- Direct experimental validation of primer candidates (Münster):
  ‘Positive’ set = 21 clinical outbreak isolates
  ‘Negative’ set = 32 HUSEC / EPEC isolates
  Positive control = LB

19 Primer design: validated in vitro
(Figure panels: positive, negative)

20 Alignment-free primer design: summary
- Individual primer sets: 100% sensitivity; 82–94% specificity; 9% < FDR < 22%
- Combining primers: 100% sensitivity and specificity
- A minimal combination of two primer sets discriminated absolutely between outbreak O104:H4 isolates and non-outbreak E. coli isolates, including HUSEC 041
- Flexibility in strategy allows for targeted design, e.g. multiplex PCR, different organisms, large gene families, etc.
- Same approach used for:
  - Resolving Dickeya plant pathogens
  - Discriminating between RxLR effectors in Phytophthora infestans
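The performance figures above are standard classifier metrics, and the benefit of combining primer sets can be shown with a short Python sketch. The per-isolate calls below are invented for illustration: only the set sizes (21 positive, 32 negative isolates) come from the validation slide, and the false-positive patterns are hypothetical. The AND rule used to combine two primer sets is also an assumption about how a combination might be scored.

```python
def confusion(calls, truth):
    """Count (tp, fp, fn, tn) from parallel lists of booleans."""
    tp = sum(1 for c, t in zip(calls, truth) if c and t)
    fp = sum(1 for c, t in zip(calls, truth) if c and not t)
    fn = sum(1 for c, t in zip(calls, truth) if not c and t)
    tn = sum(1 for c, t in zip(calls, truth) if not c and not t)
    return tp, fp, fn, tn


def rates(calls, truth):
    """Sensitivity, specificity and false discovery rate.

    Assumes at least one positive call and at least one isolate of each kind.
    """
    tp, fp, fn, tn = confusion(calls, truth)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "fdr": fp / (tp + fp),
    }


def combine_and(calls_a, calls_b):
    """Call an isolate positive only if both primer sets amplify it."""
    return [a and b for a, b in zip(calls_a, calls_b)]


if __name__ == "__main__":
    truth = [True] * 21 + [False] * 32
    # Hypothetical primer sets with different false positives.
    primer_a = [True] * 21 + [True] * 2 + [False] * 30
    primer_b = [True] * 21 + [False] * 26 + [True] * 6
    for name, calls in [("A", primer_a), ("B", primer_b),
                        ("A and B", combine_and(primer_a, primer_b))]:
        print(name, rates(calls, truth))
```

Because the two hypothetical primer sets misfire on different negative isolates, requiring both to amplify removes every false positive while keeping all true positives.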

21 Alignment-free primer design: summary
Bypass the need for:
- multiple genomic alignments
- biological justification for primer choice (maybe even reveal biology…)
Produce diagnostic primers for any subgroup of organisms (possibly…)
Limitations:
- Scaling issue: PrimerSearch is slow (modular pipeline allows use of alternative programs)
- Low specificity of primers -> use qPCR
- Very similar organisms may not be distinguished
- Time from genomes to primer sets: 90 hours
Possibility for improvements as collaborative bioinformatics projects (speed up off-target primer mapping, make into user-friendly tool…)

22 Acknowledgements Thanks to Nadine Brandt, Kath Wright and Sean Chapman

23 Sprouted seeds as a source of infections

24 ‘Sproutbreak’ - Jimmy John's restaurant

25 Colonisation of spinach by VTEC O157:H7 Sakai (vt-)

26 References:

