Poxviruses, Biodefense and Bioinformatics: Working towards a better understanding of viral pathogenesis and evolution


1 Poxviruses, Biodefense and Bioinformatics: Working towards a better understanding of viral pathogenesis and evolution

2 PBR Bioinformatics  Managing Complexity –Technology development  Enhancing Understanding –Research

3 PBR Managing Complexity  Data –Acquisition –Storage –Manipulation –Retrieval

4 PBR Managing Complexity…  Data Analysis –Development and Utilization of Analytical tools Visualization tools

5 PBR Enhancing Understanding What distinguishes one organism from another?  Sequence  Molecular Biology  Physiology  Pathogenesis  Epidemiology  Evolution Will the genomic sequence provide an explanation for the differences?

6 PBR What is Bioinformatics?  Computer-aided analysis of biological information  Discerning the characteristic (repeatable) patterns in biological information that help to explain the properties and interactions of biological systems.  Caveat: –In the end, bioinformatics (a.k.a. computers) can only help in making inferences concerning biological processes. –These inferences (or hypotheses) have to be tested in the laboratory

7 The Poxvirus Bioinformatic Resource www.poxvirus.org PBR

8 PBR Collaborators  UAB –Elliot Lefkowitz  St. Louis University –Mark Buller  University of Victoria –Chris Upton  ATCC –Charles Buck  Medical College of Wisconsin –Paula Traktman

9 PBR The UAB MGBF Contingent Molecular and Genetic Bioinformatics Facility  Programmers –Jim Moon –Don Dempsey –Uma Dave –Bei Hu  Students –Chunlin Wang  Fellows –Shankar Changayil –Xiaosi Han

10 PBR Poxviruses  Large dsDNA genome –150,000 – 300,000 base pairs –150 – 260 genes  Complex virion morphology  Cytoplasmic replication  Array of immunoevasion strategies.  Human pathogens –Molluscum contagiosum –Variola –Monkeypox

11 PBR The PBR is Designed to Support Basic and applied research on Poxviruses including the development of new:  Environmental Detectors  Diagnostic Reagents  Animal Models  Vaccines  Antiviral Compounds

12 PBR PBR Design Philosophy  Useful and Used  Supporting all poxvirus investigators –UAB PBR Web-based application requirements Web Browser Java plugin  In-depth analyses –UVic analytical tools

13

14

15

16

17

18

19

20 PBR BLAST  Search a sequence database for primary sequence similarities to some query sequence  Provides a measure of the significance of the similarity  Does not necessarily imply common evolutionary origin  Developed at NCBI –Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. (1990) "Basic local alignment search tool." J. Mol. Biol. 215:403-410.
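
The "measure of significance" that BLAST reports is an expect value (E-value). The following is a minimal sketch of the Karlin-Altschul calculation behind it, not the NCBI implementation. The parameter values (lambda of about 0.3176 and K of about 0.134, commonly quoted for ungapped BLOSUM62 scoring), the function names, and the example lengths are assumptions made for illustration.

```python
import math

# Assumed Karlin-Altschul parameters (commonly quoted for ungapped BLOSUM62).
LAMBDA = 0.3176
K = 0.134


def bit_score(raw_score, lam=LAMBDA, k=K):
    """Convert a raw alignment score into a normalized bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(raw_score, query_len, db_len, lam=LAMBDA, k=K):
    """Expected number of chance alignments scoring at least raw_score.

    query_len and db_len are residue counts; real BLAST uses slightly
    shorter "effective" lengths, which this sketch ignores.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


if __name__ == "__main__":
    # Hypothetical search: a 300-residue query against 1,000,000 residues.
    for score in (40, 80, 120):
        print(score, round(bit_score(score), 1), expect_value(score, 300, 1_000_000))
```

A small E-value says only that the match is unlikely to arise by chance; as the slide notes, it does not by itself establish common evolutionary origin.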

21

22

23

24

25

26

27

28

29

30 18 Genomes; 563 genes = Avg. 31 genes/genome

31

32

33

34

35 PBR PBR Knowledge Database  Mini review of available structure-function information –Human-curated database based on the literature  Bibliographic information  Available scientific resources clones, mutants, and antibodies  Empirically-derived properties –MW, pI... –Post-translational modifications –Expression  Functional Assignments –Gene Ontology controlled vocabulary Molecular function Biological Process Cellular component –Virulence Ontology

36 PBR

37 Molecular Evolution and Genomic Analyses of Poxviruses

38 PBR Objectives  To better understand the role that individual genes and groups of genes (or other genetic elements) play in poxvirus (especially smallpox) host range and virulence  To describe and understand poxvirus diversity by reconstructing the family's evolutionary history

39 Orthopoxvirus Phylogeny [Figure: two gene trees, DNA Polymerase and Nucleoside triphosphatase, over the taxa CMPV-M96, VARV-BSH, VMNV-GAR, ECTV-MOS, CPXV-BR, VACV-COP and MPXV-ZAI. Scale bar: 10 nucleotide changes. Node labels: 100, 59 (DNA Polymerase); 100, 78, 94 (Nucleoside triphosphatase).]

40 Orthopoxvirus Phylogeny 132 gene tree possible

41 65 gene tree possible for Chordopoxviruses

42 PBR Horizontal Gene Transfer  The acquisition of genetic material from another organism that becomes a “permanent” addition to the recipient’s genome  Many poxvirus genes involved in immune evasion may have been acquired through HGT  Detection of HGT –Alternative base composition –Alternative codon usage pattern –Alternative evolutionary inheritance pattern
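
The first two detection signals listed on this slide, base composition and codon usage, can be sketched as follows. This is an illustrative example under stated assumptions, not the PBR's method: the function names are hypothetical, and the distance measure (sum of absolute frequency differences) is one simple choice among many.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def codon_counts(cds):
    """Count in-frame codons of a coding sequence; a trailing partial codon is ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def codon_frequencies(counts):
    """Turn codon counts into relative frequencies."""
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}


def usage_distance(gene_freqs, reference_freqs):
    """Sum of absolute frequency differences: 0 means identical usage, 2 means disjoint."""
    codons = set(gene_freqs) | set(reference_freqs)
    return sum(abs(gene_freqs.get(c, 0.0) - reference_freqs.get(c, 0.0)) for c in codons)
```

In use, the reference frequencies would be pooled over all genes of a genome, and genes with an unusually large distance or an atypical GC content would be candidates for closer inspection rather than confirmed transfers.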

43 Detecting horizontally transferred genes (HTGs) by plotting codon usage

44 GC distribution of Molluscum Contagiosum [Figure: GC distribution across the Molluscum contagiosum genome, smoothed with a wavelet technique. Blue numbers give the position in the genome. Green bars mark significant deviations, each labeled with the putative gene at that position: MOCV-SB1_011, MOCV-SB1_055, MOCV-SB1_132.]
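
A GC profile of this kind can be approximated with a sliding window. Note that the sketch below substitutes a plain fixed-window average and a z-score cutoff for the wavelet smoothing named on the slide; the window size, step, and cutoff are illustrative assumptions.

```python
from statistics import mean, pstdev


def windowed_gc(genome, window=1000, step=500):
    """Return (start, GC fraction) for each full window along the genome."""
    genome = genome.upper()
    profile = []
    for start in range(0, len(genome) - window + 1, step):
        chunk = genome[start:start + window]
        gc = (chunk.count("G") + chunk.count("C")) / window
        profile.append((start, gc))
    return profile


def deviant_windows(profile, z_cutoff=2.0):
    """Windows whose GC fraction is more than z_cutoff standard deviations from the mean."""
    values = [gc for _, gc in profile]
    if len(values) < 2:
        return []
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [(start, gc) for start, gc in profile if abs(gc - mu) / sigma > z_cutoff]
```

Windows returned by `deviant_windows` correspond loosely to the marked deviations in the figure; mapping them to annotated genes would be a separate step.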

45 VARV Proteins with Similarity to Human Proteins  3-beta-hydroxysteroid dehydrogenase  Ankyrin  CD47 antigen  Carbonic Anhydrase  Casein kinase 1  Complement control protein  DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide  DNA ligase  Glutaredoxin  Hypothetical protein  JNK-stimulating phosphatase  Kelch-like protein  Lymphocyte activation-associated protein  Makorin zinc-finger protein  Myosin heavy chain  Plasminogen activator inhibitor  Profilin  RNA polymerase  Ribonucleotide reductase M2  SNF2 transcription activator  Serine proteinase inhibitor  Squamous cell carcinoma antigen  Superoxide dismutase  Thymidine kinase  Tumor necrosis factor receptor

46 Ribonucleotide Reductase Homolog Evolution

47

48 TNF Receptor Homolog Evolution

49

50 TNF Receptor GenBank nr Hits

51

52

53

54

55

56 VARV B22R BLASTN Results

57 Genome Comparison: Variola major vs. minor

58 Genome vs. Gene Phylogeny

59 Molecular Evolution and Genomic Analyses of Poxviruses We have a problem…

60

61

62 PBR

63

64 Poxvirus Gene Prediction  Little consistency from one genome to another  Methods employed –Minimum ORF size –Similarity with previously described proteins
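
The "minimum ORF size" method mentioned on this slide can be sketched as a six-frame scan. This is a simplified illustration: the function names and the default threshold of 60 codons are assumptions, and only ATG starts and the standard stop codons are considered.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=60):
    """Yield (strand, frame, start, end) for ATG-to-stop ORFs of at least min_codons.

    Coordinates are 0-based and end-exclusive on the scanned strand;
    min_codons counts the codons before the stop codon.
    """
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        yield (strand, frame, start, i + 3)
                    start = None
```

Changing `min_codons` changes which short ORFs are reported, which is one way different annotation groups can end up with different gene sets for similar genomes.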

65 PBR Consistently predict and annotate the gene set for all Poxvirus genomes  Development of a comprehensive gene prediction tool –Discovery of new or “missed” genes –Removal of “pseudo” genes  As an added bonus: –Computational annotation of each predicted gene

66 PBR What is a gene?  Does it look like a gene? –Open Reading Frame –Base composition –Codon usage  Is it expressed? –Regulatory signals –Transcription –Translation  Has it been previously recognized? –Similarity searching

67 PBR Proposed gene-finding tool  Combination of a series of complementary gene prediction algorithms  DNA Signals –ORF detection –Base composition –Codon preference –HMM gene models  Similarity searching –BLAST similarity searches –Similarity to identified poxvirus protein domains using an HMM-based domain database  Promoter detection –Neural Network promoter detection tool  Patterns of amino acid sequence conservation –Biodictionary-based analysis  Knowledge-based integration of all predictive methods –Computational conclusions –Visualization tool for human inspection

68 Using High Performance Computing to Speed Up Bioinformatic Applications

69

70 PBR Features to consider in porting an application to a cluster environment  Balancing the processing workload among nodes is critical to successful implementation  A computational method with a lower percentage load imbalance (PLIB) is more efficient than one with a higher PLIB. The workload is perfectly balanced if PLIB is equal to zero.  Similarity searching workload can be difficult to estimate –Dependent on the nature of both the database and query sequences sequence length number of sequences complexity of the sequences
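
The slide does not define PLIB. The sketch below assumes one plausible definition, PLIB = (T_longest - T_average) / T_longest x 100, computed from per-node run times; it satisfies the stated property that a perfectly balanced workload gives zero. The function name and the example timings are hypothetical.

```python
def plib(node_times):
    """Percentage load imbalance under the ASSUMED definition
    (longest - average) / longest * 100 over per-node run times."""
    if not node_times:
        raise ValueError("need at least one node time")
    longest = max(node_times)
    if longest == 0:
        return 0.0
    average = sum(node_times) / len(node_times)
    return (longest - average) / longest * 100.0


if __name__ == "__main__":
    # Hypothetical wall-clock seconds for four nodes.
    print(plib([120.0, 118.0, 121.0, 119.0]))  # nearly balanced
    print(plib([200.0, 60.0, 70.0, 50.0]))     # one node dominates
```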

71 PBR Data Segmentation  Database Sequences –Utilize when the database size is larger than physical memory of each computational node –Results need to be combined and statistics recalculated –Not possible with some applications (PSI-BLAST)  Query Sequences –Flexible and allows for better balancing of the workload –Statistics remain valid –Database remains intact –Best performance when the database can be fully loaded into available memory

72 PBR Work Flow for Database segmentation  Database is split evenly and formatted  Database fragments are sent to each node  Query file is distributed to all nodes  The search is initiated  Output is collected for merging and formatting
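
The splitting step of this workflow, and the statistics recalculation it makes necessary, might look like the following sketch. It assumes "split evenly" means balancing total residues per fragment, and that an E-value scales linearly with database length (ignoring edge corrections). All function names are hypothetical and no real BLAST tooling is invoked.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_database(records, n_fragments):
    """Assign records, longest first, to whichever fragment is currently smallest."""
    fragments = [[] for _ in range(n_fragments)]
    sizes = [0] * n_fragments
    for record in sorted(records, key=lambda r: len(r[1]), reverse=True):
        idx = sizes.index(min(sizes))
        fragments[idx].append(record)
        sizes[idx] += len(record[1])
    return fragments


def rescale_evalue(fragment_evalue, fragment_residues, total_residues):
    """Rescale an E-value computed against one fragment to the full database size."""
    return fragment_evalue * total_residues / fragment_residues
```

Rescaling is needed because each node sees only a fraction of the database, so a hit looks more significant per fragment than it would against the whole.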

73 PBR Work Flow for Query Segmentation  Database is distributed to all nodes  90% of the query sequences are split into bins and distributed among the available nodes –Balanced for sequence length and number  The remaining 10% of the query sequences are delivered to nodes as they finish the initial search  Individual results are merged and reported
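
The two-phase scheduling described on this slide can be simulated as below. This is a sketch under assumptions the slide does not state: the shortest queries are the ones held in reserve, sequence length stands in for search cost, and bins are filled greedily (longest first, into the least-loaded bin). Function names are hypothetical and no MPI calls are made.

```python
import heapq


def plan_query_segmentation(query_lengths, n_nodes, initial_fraction=0.9):
    """Return (bins, reserve): balanced initial bins plus the held-back queries.

    query_lengths maps query id -> sequence length.
    """
    ordered = sorted(query_lengths, key=query_lengths.get, reverse=True)
    n_initial = round(len(ordered) * initial_fraction)
    bins = [[] for _ in range(n_nodes)]
    # Heap entries are (total length, query count, node), so ties on length
    # fall back to the bin holding fewer sequences.
    heap = [(0, 0, node) for node in range(n_nodes)]
    heapq.heapify(heap)
    for qid in ordered[:n_initial]:
        load, count, node = heapq.heappop(heap)
        bins[node].append(qid)
        heapq.heappush(heap, (load + query_lengths[qid], count + 1, node))
    return bins, ordered[n_initial:]


def dispatch_reserve(bins, reserve, query_lengths):
    """Simulate handing each reserve query to the node that finishes first."""
    heap = [(sum(query_lengths[q] for q in b), node) for node, b in enumerate(bins)]
    heapq.heapify(heap)
    assignment = [list(b) for b in bins]
    for qid in reserve:
        busy_until, node = heapq.heappop(heap)
        assignment[node].append(qid)
        heapq.heappush(heap, (busy_until + query_lengths[qid], node))
    return assignment
```

Holding a small reserve lets nodes that finish early pick up extra work, which evens out the errors that come from estimating search cost by sequence length alone.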

74 PBR Implementation  Utilizes the LAM/MPI Message Passing Interface package from Indiana University  The application executables are not altered –The implementation wraps the executable and data and sends them to each node –Easily accommodates application updates –Easily extends to similar applications  Two wrappers are currently implemented –BLAST –HMMPFAM (Sean Eddy, Washington University School of Medicine, St. Louis, Missouri)  Benchmarks performed on the UAB School of Engineering Linux cluster –2 storage servers (IBM x345) –1 compile node and 64 compute nodes (IBM x335) 2 x 2.4 GHz Xeon processors per node 2-4 GB of RAM per node 18 GB SCSI hard drive connected via Gigabit Ethernet to a Cisco 4006 switch

75

76

77

78 Comparison of gene finding methods

| Method | Pros | Cons |
| --- | --- | --- |
| DNA signal sensor | Based on empirically derived, statistical evidence distinguishing biological signals. | Difficult to distinguish background noise from real signals. Frequently not sensitive enough. |
| Content sensor (Glimmer) | | Dependent on having a reasonable gene model. Short genes and genes present due to HGT are more difficult to detect. |
| Similarity searching (BLAST, HMM) | Relies on accumulated pre-existing biological data. Clearly detects highly relevant matches. | Limited to pre-existing biological data. Sensitive to database errors. Difficult to detect more distant relationships. |
| Promoter detection | Reflects actual poxvirus biology (gene expression). | Weak signals difficult to detect. |
| Bio-dictionaries | Useful for detecting novel genes. | Difficult to implement; no biological evidence. |

79 PBR Gene prediction: Putting it all together [Figure: genome region from position 32000 to 40000 with evidence tracks for similarity searching, ORFs, Glimmer, promoter detector, G/C plotting and Bio-Dictionary.]

80 PBR Now the real work can begin:  More rigorous comparative analysis –Shared and unique sets of gene composition –SNP analysis of gene differences  Whole genome phylogenetic prediction  Individual gene phylogenetic prediction  Unique patterns of evolutionary inheritance  “Clustering” of evolutionary inheritance with pathogenesis

