Completed genomes: viruses

1 Completed genomes: viruses
Chapter 16: Completed genomes: viruses Jonathan Pevsner, Ph.D. Bioinformatics and Functional Genomics (Wiley-Liss, 3rd edition, 2015) You may use this PowerPoint for teaching purposes

2 Learning objectives After studying this chapter you should be able to:
define viruses; explain the basis of the classification of viruses; describe the genomes of HIV, influenza, measles, Ebola, and herpesviruses; describe bioinformatics approaches to determining the function of viral genes and proteins; describe key bioinformatics resources for studying viruses; and compare and contrast DNA and RNA viruses.

3 Outline Introduction Classification of viruses
Bioinformatics approaches to virology Human immunodeficiency virus (HIV) Influenza virus Measles virus Ebola virus Herpesvirus: from phylogeny to gene expression Giant viruses Perspectives

4 Introduction to viruses
Viruses are small, infectious, obligate intracellular parasites. They depend on host cells to replicate. Because they lack the resources for independent existence, they exist on the borderline of the definition of life. The virion (virus particle) consists of a nucleic acid genome surrounded by coat proteins (capsid) that may be enveloped in a host-derived lipid bilayer. Viral genomes consist of either RNA or DNA. They may be single-stranded, double-stranded, or partially double-stranded. The genomes may be circular, linear, or segmented.

5 Outline Introduction Classification of viruses
Bioinformatics approaches to virology Human immunodeficiency virus (HIV) Influenza virus Measles virus Ebola virus Herpesvirus: from phylogeny to gene expression Giant viruses Perspectives

6 Introduction to viruses
Viruses have been classified by several criteria: -- based on morphology (e.g. by electron microscopy) -- by type of nucleic acid in the genome -- by size (rubella is about 2 kb; HIV-1 about 9 kb; poxviruses are several hundred kb). Mimivirus (for Mimicking microbe) has a double-stranded circular genome of 1.2 megabases (Mb). -- based on human disease

7 The International Committee on Taxonomy of Viruses
(ICTV) offers a resource for classification

8 Virus taxonomy from the ICTV website
B&FG 3e Fig. 16-1 Page 757

9 Classification of viruses
B&FG 3e Fig. 16-2 Page 759

10 Classification of viruses
B&FG 3e Fig. 16-2 Page 759

11 Classification of viruses based on nucleic acid composition
B&FG 3e Table 16.1 Page 760

12 Mutation rate as a function of genome size
hammerhead viroid CChMVd tobacco mosaic virus, human rhinovirus, poliovirus, vesicular stomatitis virus, bacteriophage Φ6, measles virus C. elegans, Mus, Drosophila, human E. coli bacteriophages λ, T2, T4; herpes simplex virus B&FG 3e Fig. 16-3 Page 761

13 Human disease relevance of viruses
Vaccine-preventable viral diseases include: hepatitis A, hepatitis B, influenza, measles, mumps, poliomyelitis, rubella, and smallpox. Source: Centers for Disease Control website

14 Human disease relevance of viruses
Disease | Virus
Hepatitis A | Hepatitis A virus
Hepatitis B | Hepatitis B virus
Influenza | Influenza type A or B
Measles | Measles virus
Mumps | Rubulavirus
Poliomyelitis | Poliovirus (three serotypes)
Rotavirus | Rotavirus
Rubella | Genus Rubivirus
Smallpox | Variola virus
Varicella | Varicella-zoster virus
Source: Centers for Disease Control website

15 Vaccine-preventable viral diseases
B&FG 3e Table 16.2 Page 762

16 Seven viruses that cause cancer in humans
B&FG 3e Table 16.3 Page 762

17 Viral genomes page (NCBI)
B&FG 3e Fig. 16.4 Page 763

18 Viral metagenomics Viral metagenomics refers to the sampling of representative viral genomes from the environment. A typical viral genome is ~50 kilobases (in comparison, a typical microbial genome is ~2.5 megabases). A sample is collected (e.g. seawater, fecal material, or soil). Cellular material is excluded. Viral DNA is extracted, cloned, and sequenced. B&FG 3e Page 764

19 Henle-Koch postulates: criteria needed to establish a causal relationship between a microbe and a disease [1] The parasite occurs in every case of the disease in question and under circumstances which can account for the pathological changes and clinical course of the disease. [2] It occurs in no other disease as a fortuitous and nonpathogenic parasite. [3] After being fully isolated from the body and repeatedly grown in pure culture, it can induce the disease anew. These postulates (especially [3]) are very hard to prove with viruses. Huebner suggested alternate postulates. B&FG 3e Page 765

20 Huebner’s postulates: criteria needed to establish a causal relationship between a virus and a disease [1] The virus must be a “real” entity established by passage in culture. [2] The virus must originate from human specimens. [3] Active infection should produce an antibody response. [4] A new virus should be fully characterized and compared with other agents. [5] The virus must be constantly associated with a specific illness. [6] Human volunteers inoculated with the newly recognized agent in double‐blind studies should reproduce the clinical syndrome. [7] Epidemiological studies should identify patterns of infection and disease. [8] A specific vaccine should prevent the disease, establishing an agent as the cause. B&FG 3e Page 765

21 Outline Introduction Classification of viruses
Bioinformatics approaches to virology Human immunodeficiency virus (HIV) Influenza virus Measles virus Ebola virus Herpesvirus: from phylogeny to gene expression Giant viruses Perspectives

22 Bioinformatic approaches to viruses
Some of the outstanding problems in virology include: -- Why does a virus such as HIV-1 infect one species (human) selectively? -- Why do some viruses change their natural host? In 1997 a chicken influenza virus killed six people. -- Why are some viral strains particularly deadly? -- What are the mechanisms of viral evasion of the host immune system? -- Where did viruses originate?

23 Diversity and evolution of viruses
The unique nature of viruses presents special challenges to studies of their evolution: -- viruses tend not to survive in historical samples -- viral polymerases of RNA genomes typically lack proofreading activity -- viruses undergo an extremely high rate of replication -- many viral genomes are segmented; shuffling may occur -- viruses may be subjected to intense selective pressures (host immune responses, antiviral therapy) -- viruses invade diverse species -- the diversity of viral genomes precludes us from making comprehensive phylogenetic trees of viruses

24 Outline Introduction Classification of viruses
Bioinformatics approaches to virology Human immunodeficiency virus (HIV) Influenza virus Measles virus Ebola virus Herpesvirus: from phylogeny to gene expression Giant viruses Perspectives

25 Bioinformatic approaches to HIV
Human immunodeficiency virus (HIV) is the cause of AIDS. Some have estimated that 70 million people had been infected with HIV as of 2017 and that 35 million had died. HIV-1 and HIV-2 are primate lentiviruses. The HIV-1 genome is 9181 bases in length. Note that as of 2017 there are 700,000 Entrez nucleotide records for this genome (but only one RefSeq entry). Phylogenetic analyses suggest that HIV-2 appeared as a cross-species contamination from a simian virus, SIVsm (sooty mangabey). Similarly, HIV-1 appeared from simian immunodeficiency virus of the chimpanzee (SIVcpz).

26 HIV phylogeny based on pol suggests five clades
1. Simian immunodeficiency virus from the chimpanzee Pan troglodytes (SIVcpz) with HIV-1 Hahn et al., 2000

27 HIV phylogeny based on pol suggests five clades
2. SIV from the sooty mangabey Cercocebus atys (SIVsm), with HIV-2 and SIV from the macaque (genus Macaca; SIVmac) Hahn et al., 2000

28 HIV phylogeny based on pol suggests five clades
3. SIV from African green monkeys (genus Chlorocebus)(SIVagm) Hahn et al., 2000

29 HIV phylogeny based on pol suggests five clades
4. SIV from Sykes’ monkeys, Cercopithecus albogularis (SIVsyk) Hahn et al., 2000

30 HIV phylogeny based on pol suggests five clades
5. SIV from l’Hoest monkeys (Cercopithecus lhoesti); from sun-tailed monkeys (Cercopithecus solatus); and from mandrill (Mandrillus sphinx) Hahn et al., 2000

31 Evolutionary relationships of primate lentiviruses
Full-length Pol protein sequences were aligned and a tree was created using the maximum‐likelihood method. There are five major lineages (arrows 1–5). B&FG 3e Fig. 16.5 Page 767

32 Evolutionary relationships of primate lentiviruses
The HIV‐1/SIVcpz lineage is displayed based on a maximum‐likelihood tree using Env protein sequences. Three major HIV‐1 groups (M, N, O; arrows 6–8) are distinguished. B&FG 3e Fig. 16.5 Page 767

33 Bioinformatic approaches to HIV: NCBI
NCBI offers a retrovirus resource with reference genomes and protein sets, and several tools (alignment, genotyping).

34 Retrovirus resources (NCBI)
B&FG 3e Fig. 16.6 Page 769

35 Bioinformatic approaches to HIV: LANL
Los Alamos National Laboratory (LANL) databases provide a major HIV resource. LANL offers: -- an HIV BLAST server -- a synonymous/non-synonymous analysis program -- a multiple alignment program -- a PCA-like tool -- a geography tool
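To illustrate the idea behind a synonymous/non-synonymous analysis, here is a minimal Python sketch. It is not the LANL program: it simply compares two codon-aligned DNA sequences codon by codon and counts how many differing codons keep the amino acid (synonymous) versus change it (non-synonymous). The function name `count_codon_changes` is hypothetical, and the naive per-codon count is an assumption made for brevity; published methods also normalize by the number of synonymous and non-synonymous sites.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_changes(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Both inputs must be codon-aligned DNA strings of equal length.
    Codons containing gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be codon-aligned and equal in length")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguous base: not counted
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Toy example (made-up sequences): AAA->AAG keeps lysine, CTG->CCG changes Leu to Pro.
print(count_codon_changes("ATGAAACTG", "ATGAAGCCG"))
```

A high ratio of non-synonymous to synonymous changes in a gene is commonly read as a hint of positive selection, which is one reason such tools are applied to rapidly evolving viruses like HIV.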

36 Map of HIV infection subtypes (worldwide)
LANL geography tool B&FG 3e Fig. 16.7 Page 770

37 LANL map of HIV infection subtypes (Europe)
LANL geography tool B&FG 3e Fig. 16.7 Page 770

38 Outline Introduction Classification of viruses
Bioinformatics approaches to virology Human immunodeficiency virus (HIV) Influenza virus Measles virus Ebola virus Herpesvirus: from phylogeny to gene expression Giant viruses Perspectives

39 Influenza virus Influenza virus leads to 200,000 hospitalizations and ~36,000 deaths in the U.S. each year. Influenza viruses belong to the family Orthomyxoviridae. The viral particles are about nm in diameter and can be spherical or pleiomorphic. They have a lipid membrane envelope that contains the two glycoproteins: hemagglutinin (H) and neuraminidase (N). These two proteins determine the subtypes of Influenza A virus. Influenza A

40 Influenza virus Since 1976, the H5N1 avian influenza virus has infected at least 232 people (mostly in Asia), of whom 134 have died. A major concern is that if a human influenza virus and the H5N1 avian influenza strain were to combine, a new lethal virus could emerge, causing a human pandemic. In a pandemic, 20% to 40% of the population is infected per year. ► The 1918 Spanish influenza virus (H1N1 subtype) killed tens of millions of people. ► 1957 (H2N2) ► 1968 (H3N2) ► Asia (H5N1) ► 2009 (H1N1, “swine flu”)

41 Influenza virus There are three types: A, B, C
► A and B cause flu epidemics ► Influenza A: 20 subtypes; occurs in humans and other animals. For example, in birds there are nine subtypes based on the type of neuraminidase expressed (group 1: N1, N4, N5, N8; group 2: N2, N3, N6, N7, N9). The structure of H5N1 avian influenza neuraminidase has been reported (Russell RJ et al., Nature 443:45, 2006). ► The influenza A genome consists of eight single negative-strand RNA segments (from 890 to 2340 nucleotides). Each RNA segment encodes one to two proteins.

42 Influenza viruses: examples of family Orthomyxoviridae complete genomes
B&FG 3e Table 16.4 Page 771

43 Genes in a representative Influenza A virus complete genome (A/Puerto Rico/8/34(H1N1))
B&FG 3e Table 16.5 Page 772
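Strain names such as A/Puerto Rico/8/34(H1N1) pack the virus type, isolate details, and the hemagglutinin/neuraminidase subtype into one string. The following Python sketch shows one way to pull those parts out. The function name `parse_influenza_strain` and the regular expression are illustrative assumptions, and the pattern only covers names that end in a parenthesized HxNy subtype.

```python
import re

# Type letter, slash-separated isolate fields, then a parenthesized HxNy subtype.
STRAIN_PATTERN = re.compile(
    r"^(?P<type>[ABC])/(?P<fields>.+?)\((?P<subtype>H\d+N\d+)\)$",
    re.IGNORECASE,
)


def parse_influenza_strain(name):
    """Split a strain name into type, isolate fields, and H/N subtype numbers."""
    match = STRAIN_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized strain name: {name!r}")
    subtype = match.group("subtype").upper()
    h_number, n_number = re.fullmatch(r"H(\d+)N(\d+)", subtype).groups()
    return {
        "type": match.group("type").upper(),
        "fields": match.group("fields").split("/"),
        "subtype": subtype,
        "hemagglutinin": int(h_number),
        "neuraminidase": int(n_number),
    }


info = parse_influenza_strain("A/Puerto Rico/8/34(H1N1)")
print(info["type"], info["fields"], info["subtype"])
```

Because matching is case-insensitive and the subtype is upper-cased, inconsistently typed records (for example a lowercase "h1N1") are normalized to the same subtype label.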

44 Schematic of the eight segments from a typical Influenza A virus
B&FG 3e Fig. 16.7 Page 770

45 Chronological summary of influenza A strains
B&FG 3e Fig. 16.9 Page 773

46 Outline Introduction Classification of viruses
Bioinformatics approaches to virology Human immunodeficiency virus (HIV) Influenza virus Measles virus Ebola virus Herpesvirus: from phylogeny to gene expression Giant viruses Perspectives

47 Measles virus Measles virus is one of the deadliest viruses in human history. -- Leading cause of death in children in many countries -- Morbillivirus of the Paramyxoviridae family (includes mumps and respiratory syncytial virus) -- Reference genome (see NCBI): accession NC_ -- Nonsegmented, negative‐sense RNA genome protected by nucleocapsids and an envelope -- Genome has 15,894 bases and six genes that encode eight proteins -- The six genes are designated N (nucleocapsid), P (phosphoprotein), M (matrix), F (fusion), H (hemagglutinin), and L (large polymerase). B&FG 3e Page 774

48 Eight proteins encoded by six genes of the measles virus genome
B&FG 3e Fig. 16.7 Page 770

49 Outline Introduction Classification of viruses
Bioinformatics approaches to virology Human immunodeficiency virus (HIV) Influenza virus Measles virus Ebola virus Herpesvirus: from phylogeny to gene expression Giant viruses Perspectives

50 Ebola virus Ebola is a filovirus that is transmitted between people by contact with body fluids. The first reported outbreak occurred in 1976. The largest outbreak began in 2014, centered initially in West Africa. The virus causes hemorrhagic fever that is often fatal. Ebola virus is an enveloped, single‐stranded RNA negative‐strand virus of the family Filoviridae. The Zaire Ebola virus reference genome is 18,959 bases in length (accession NC_ ), with seven genes encoding nine proteins. Bioinformatics resources include a UCSC Ebola virus browser. B&FG 3e Page 775

51 Outline Introduction Classification of viruses
Bioinformatics approaches to virology Human immunodeficiency virus (HIV) Influenza virus Measles virus Ebola virus Herpesvirus: from phylogeny to gene expression Giant viruses Perspectives

52 Herpesvirus
Herpesviruses are double-stranded DNA viruses that include herpes simplex virus, cytomegalovirus, and Epstein-Barr virus. The genomic DNA is packed inside an icosahedral capsid; with a lipid bilayer the diameter is ~200 nanometers.

53 Herpesvirus Phylogenetic analysis suggests three major groups
that originated about MYA. Mammalian herpesviruses are in all three subfamilies. Avian and reptilian herpesviruses are all in the Alphaherpesvirinae.

54 Herpesvirus: three main groups
Millions of years before present

55 Herpesvirus taxonomy McGeoch et al. (Virus Res. 117:90-104, 2006) describe a new herpesvirus taxonomy. Family Herpesviridae Subfamilies Alpha-, Beta-, Gammaherpesvirinae New family Alloherpesviridae (piscine, amphibian herpesviruses)

56 Phylogeny of the herpesviruses:
comparison to the evolution of host genomes B&FG 3e Fig Page 777

57 Phylogeny of the herpesviruses:
comparison to the evolution of host genomes B&FG 3e Fig Page 777

58 Out‐of‐Africa hypothesis: herpesviruses infected marine invertebrates such as oyster ~500 MYA, and various members of the Herpesviridae (listed to the right) established host species specificity. B&FG 3e Fig Page 778

59 Herpesvirus taxonomy Genome sizes range from 124 kb (simian varicella virus from Alphaherpesvirinae) to 241 kb (chimpanzee cytomegalovirus from Betaherpesvirinae). ► GC content ranges from 32% to 75%. ► Protein-coding regions occur at a density of one gene per 1.5 to 2 kb of herpesvirus DNA. ► There are immediate-early genes, early genes (nucleotide metabolism, DNA replication), and late genes (encoding proteins comprising the virion). ► Introns occur in some herpesvirus genes. ► Noncoding RNAs have been described (e.g. latency-associated transcripts in HSV-1).
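The gene-density figure on this slide can be turned into a quick back-of-the-envelope estimate. The Python sketch below assumes only the numbers quoted above (one gene per 1.5 to 2 kb; genome sizes of 124 kb and 241 kb) plus the ~140 kb HHV-8 genome mentioned later in the deck; the function name `expected_gene_range` is hypothetical.

```python
def expected_gene_range(genome_kb, kb_per_gene_min=1.5, kb_per_gene_max=2.0):
    """Return (fewest, most) genes expected for a genome of the given size.

    The sparsest packing (one gene per kb_per_gene_max) gives the lower
    bound; the densest packing (one gene per kb_per_gene_min) the upper.
    """
    fewest = genome_kb / kb_per_gene_max
    most = genome_kb / kb_per_gene_min
    return fewest, most


for label, size_kb in (
    ("simian varicella virus", 124),
    ("human herpesvirus 8", 140),
    ("chimpanzee cytomegalovirus", 241),
):
    fewest, most = expected_gene_range(size_kb)
    print(f"{label} ({size_kb} kb): between {fewest:.1f} and {most:.1f} genes")
```

For the ~140 kb HHV-8 genome this gives roughly 70 to 93 genes, which is in line with the ~80 proteins reported for that virus.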

60 Bioinformatic approaches to herpesvirus
Consider human herpesvirus 8 (HHV-8) (family Herpesviridae; subfamily Gammaherpesvirinae). Its genome is ~140,000 base pairs and encodes ~80 proteins. Its RefSeq accession number is NC_ We can explore this virus at the NCBI website. Try NCBI -> Genomes -> viruses -> dsDNA

61 HHV8 genome overview (NCBI)
B&FG 3e Fig Page 779

62 HHV8 proteins (graphic and tabular summaries)
B&FG 3e Fig Page 779

63 Viruses can acquire host genes
HHV-8 proteins include structural and metabolic proteins. There are also viral homologs of human host proteins such as the apoptosis inhibitor Bcl-2, an interleukin receptor, and a neural cell adhesion-related adhesin. Mechanisms by which viruses may acquire host proteins include recombination, transposition, and splicing. A DELTA BLAST search using the HHV-8 interleukin IL-8 receptor as a query reveals several other viral IL-8 receptor molecules.

64 A viral protein (HHV‐8 ORF74) is a G‐protein coupled receptor that is homologous to a superfamily of mammalian G‐protein coupled receptors, including a high‐affinity interleukin 8 (IL‐8) receptor. B&FG 3e Fig Page 780 Results of a DELTA BLAST search (query YP_ ) are shown.

65 Bioinformatic approaches to herpesvirus
Functional genomics approaches have been applied to human herpesvirus 8 (HHV-8). For example, changes in viral gene expression have been measured at different stages of infection. Conversely, gene expression changes have been measured in human cells following viral infection.

66 Outline Introduction Classification of viruses
Bioinformatics approaches to virology Human immunodeficiency virus (HIV) Influenza virus Measles virus Ebola virus Herpesvirus: from phylogeny to gene expression Giant viruses Perspectives

67 Mimivirus: mimicking microbe
Mimivirus is a member of the Mimiviridae family of nucleocytoplasmic large DNA viruses (NCLDVs). It was isolated from amoebae growing in England. The mature particle has a diameter of ~400 nanometers, comparable to a small bacterium (e.g. a mycoplasma). Thus, mimivirus is by far the largest virus identified to date.

68 Mimivirus: mimicking microbe
The mimivirus genome is 1.2 Mb (1,181,404 base pairs). It is a double-stranded DNA virus. ► Two inverted repeats of 900 base pairs at the ends (thus it may circularize) ► 72% AT content (~28% GC content) ► 1262 putative open-reading frames (ORFs) of length >100 amino acids. 911 of these are predicted to be protein-coding genes ► Unique features include genes predicted to encode proteins that function in protein translation. The inability to perform protein synthesis has been considered a prime feature of viruses, in contrast to most life forms. See Raoult D et al. (2004) Science 306:1344.
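Base-composition figures such as the ~28% GC content quoted above are easy to recompute from a downloaded genome. The Python sketch below assumes the sequence is available as FASTA-formatted text; the helper names `fasta_to_sequence` and `gc_content` are hypothetical, and the short record in the example is made up for illustration, not mimivirus data.

```python
def fasta_to_sequence(fasta_text):
    """Concatenate the sequence lines of a single-record FASTA string."""
    lines = fasta_text.strip().splitlines()
    return "".join(line.strip() for line in lines if not line.startswith(">"))


def gc_content(sequence):
    """Return the GC fraction, counting only unambiguous A, C, G, T, and U."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / (gc + at)


# Made-up toy record: 2 of 8 bases are G or C.
toy_record = ">toy_sequence\nATGC\nATAT\n"
print(gc_content(fasta_to_sequence(toy_record)))
```

Applied to the full mimivirus sequence, the same calculation should return a value near 0.28 if the 72% AT figure holds.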

69 Largest virus genomes All are double-stranded DNA (no RNA stage). The order Megavirales has been proposed to reflect the genome size of at least 1 megabase. B&FG 3e Table 16.6 Page 782

70 MUMmer: alignment of two genomes
Use MUMmer on the command line. Obtain viral genome sequences in the FASTA format (e.g. from NCBI). B&FG 3e Page 784

71 Comparison of megabase‐size viral genome sequences using MUMmer software
The Acanthamoeba polyphaga Mimivirus (reference, x axis) and Acanthamoeba castellanii mamavirus (y axis) genomes are largely collinear. B&FG 3e Fig Page 785

72 Comparison of megabase‐size viral genome sequences using MUMmer software
Comparison of Acanthamoeba polyphaga Mimivirus versus Acanthamoeba polyphaga moumouvirus. Forward MUMs are indicated in red, while reverse MUMs are colored blue. A prominent inversion is evident and a translocation (arrow 1). B&FG 3e Fig Page 785
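The distinction between forward and reverse matches in these plots can be illustrated with a small Python sketch. This is not MUMmer's algorithm (MUMmer finds maximal unique matches with suffix-based data structures); as a simplified stand-in, it reports fixed-length k-mers that occur exactly once in the reference and exactly once on a given strand of the query. All function names are hypothetical.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_positions(seq, k):
    """Map each k-mer to the list of start positions where it occurs."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return positions


def unique_kmer_matches(reference, query, k):
    """Return sorted (ref_start, query_start, strand) tuples.

    A k-mer is reported only if it is unique in the reference and unique
    on the query strand being scanned. For reverse matches, query_start
    is given in forward-strand coordinates of the query.
    """
    ref_index = kmer_positions(reference, k)
    matches = []
    strands = (("forward", query), ("reverse", reverse_complement(query)))
    for strand, strand_seq in strands:
        for kmer, query_hits in kmer_positions(strand_seq, k).items():
            ref_hits = ref_index.get(kmer)
            if ref_hits and len(ref_hits) == 1 and len(query_hits) == 1:
                query_start = query_hits[0]
                if strand == "reverse":
                    query_start = len(query) - k - query_start
                matches.append((ref_hits[0], query_start, strand))
    return sorted(matches)


# Toy sequences: the second query is the reference inverted.
reference = "GATTACAGGC"
print(unique_kmer_matches(reference, reference, 5))
print(unique_kmer_matches(reference, reverse_complement(reference), 5))
```

Plotting reference position against query position, forward matches between collinear genomes fall on a rising diagonal, while an inverted region shows up as reverse-strand matches on a falling diagonal, which is the pattern described for the moumouvirus comparison.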

73 Virus resources available on web
B&FG 3e Table 16.7 Page 787

74 Outline Introduction Classification of viruses
Bioinformatics approaches to virology Human immunodeficiency virus (HIV) Influenza virus Measles virus Ebola virus Herpesvirus: from phylogeny to gene expression Giant viruses Perspectives

75 Perspectives There are relatively few species of viruses because of their specialized requirements for replication in host cells. Thousands of viral genomes have been sequenced, and metagenomics projects are helping to define the diversity of viral genomes. It remains a challenge to use bioinformatics knowledge to successfully create vaccines. Many viruses (such as measles) remain deadly to humans, and some such as influenza pose potentially catastrophic threats. B&FG 3e Page 785

