What Have We Learned From Unicellular Genomes? Propionibacterium acnes Bacteroides thetaiotaomicron Mycoplasma genitalium Mimivirus Cyanobacteria Plasmodium Yeast
Why do I get so many pimples? The genome of Propionibacterium acnes was sequenced in July of 2004. P. acnes lives in sebaceous cysts and sometimes stimulates an immune response. A group in Paris, along with two groups in Germany, sequenced P. acnes. They found 2,333 genes in its 2.6 Mb genome. 68% of these had orthologs in other species, 20% had none, and 12% encoded only RNA.
Anatomy of a pimple
Genome-wide evaluations A first step following bacterial genome sequencing is finding the ori and terminus for replication. GC skew (a non-uniform distribution of G’s & C’s along a strand) is used for this: oris tend to have the lowest cumulative skew, while termini have the highest. Genes that have originated by horizontal transfer (HT) are identified using a sliding window to find segments with abnormal GC content. Codon bias is also used to detect HT. Immunogenic and metabolic genes were detected.
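The two scans described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline the sequencing groups used: the function names, window sizes, and the 0.10 tolerance are all assumptions chosen for the example. Skew is computed per window as (G - C) / (G + C) and summed along the genome, so the minimum of the running total marks a candidate ori and the maximum a candidate terminus.

```python
def gc_skew(window):
    """GC skew of one window: (G - C) / (G + C), or 0.0 if it has no G or C."""
    g = window.count("G")
    c = window.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def cumulative_skew(seq, window=1000):
    """Running sum of per-window GC skew, as (window_start, running_total) pairs."""
    total = 0.0
    curve = []
    for start in range(0, len(seq) - window + 1, window):
        total += gc_skew(seq[start:start + window])
        curve.append((start, total))
    return curve


def gc_content(window):
    """Fraction of G and C in a window."""
    if not window:
        return 0.0
    return (window.count("G") + window.count("C")) / len(window)


def atypical_windows(seq, window=1000, step=500, tolerance=0.10):
    """Sliding-window scan: windows whose GC content differs from the
    whole-sequence mean by more than `tolerance` (an arbitrary cutoff)."""
    mean = gc_content(seq)
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc = gc_content(seq[start:start + window])
        if abs(gc - mean) > tolerance:
            hits.append((start, gc))
    return hits
```

On a real genome, the windows flagged by `atypical_windows` are only candidates for horizontal transfer; a codon-bias check would normally be applied as well.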
Transcriptional Phase Variation During finishing, it was found that P. acnes had a variable # of G’s associated with some genes. It is hypothesized that the initiation of transcription depends on the # of consecutive G’s. As rows of G’s are replicated, the # will change. This leads to a mixed population of bacteria with varying degrees of protein production. This diverse population is optimized to respond differentially to various skin treatments.
Digesting Our Cells For Food P. acnes was found to be able to grow anaerobically as well as aerobically. Cells produce many enzymes that are able to degrade lipids, esters, and amino acids. Some of these degradation products increase adhesion to our cells. Many of the digestive enzymes contain a motif (LPXTG) that targets them to the cell wall. Hyaluronate lyase is also found on the surface of the bacteria; it destroys the extracellular matrix that binds our cells together.
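A motif such as LPXTG, where X stands for any amino acid, is easy to search for with a regular expression. The sketch below is only an illustration of that idea; the function name is hypothetical, and any sequence passed to it in testing would be a made-up peptide, not a real P. acnes protein.

```python
import re

# L, P, any single residue, T, G (one-letter amino acid codes).
LPXTG = re.compile(r"LP[A-Z]TG")


def find_lpxtg(protein):
    """Return (position, matched_text) for each LPXTG motif in a protein sequence."""
    return [(m.start(), m.group()) for m in LPXTG.finditer(protein)]
```

Running this over every predicted protein in a genome would give a first list of candidate cell-wall-anchored proteins to inspect by hand.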
Stimulating the Immune Response P. acnes produces 5 CAMP factors (secreted proteins that bind antibodies) that can form pores in the cell membrane. A dipeptide motif (PT) is present in certain proteins; this motif is also found in M. tuberculosis. The bacterium also has at least 7 heat shock protein genes. Porphyrin is also secreted, which produces toxic forms of oxygen, further stimulating the immune response.
Withstanding the Environment P. acnes can signal nearby cells that something has changed in the environment. Sensors called two-component systems (1 to sense & 1 to signal) exist in some bacteria; P. acnes has 10 pairs. Quorum sensing is the ability to detect conditions of overcrowding. The LuxS gene is expressed in these instances, which produces a universal signal for interspecies communication among bacteria. Cells meshed together in a biofilm protect one another.
Are all bacteria living in us bad for us? An average human body is composed of about 10^13 cells. Our intestines have about 10^10 microbes/ml and contain at least 1,000 ml, or roughly 10^13 microbes in total. A majority of the cells in our bodies may be bacteria (500 - 1,000 different species)! This accounts for 2-4 million non-human genes. Bacteroides thetaiotaomicron constitutes a substantial portion of our intestinal flora. A group from Wash. U. in St. Louis sequenced its genome.
Overview of the Genome B. thetaiotaomicron’s genome contains 6.3 Mb, as well as 4,779 genes (and a 33 kb plasmid). 58% of ORFs have known function, 18% have orthologs of no known function, and 24% have no homology with known proteins. COGs (functional categories of genes) are determined following sequencing to create an overview of a given genome. Many of the genes specialize in sugar uptake, cell wall synthesis, environmental sensing and signaling, as well as transposition.
Major COGs Sugar metabolism- 170 genes fit into this category; most bacteria have a set of 23. 61% of these appear to be secreted, which benefits not only other bacteria but us as well. 163 paralogs of 2 genes (SusC & SusD) import sugars into the cytoplasm of the microbe. Many two-component genes are present for signaling; some of these interact with σ factors. 63 transposons are present, which may help spread antibiotic resistance.
Does Size Matter? The coding capacity for this genome is very high (89% coding DNA) but it has a lower ratio of gene # to genome size than expected. This was a paradox until it was determined that the ORFs of this microbe are unusually large. It is unclear why this is the case. Summary Gut symbionts provide us with predigested sugars, stimulate blood vessel formation, crowd out pathogens, sequester limited resources, and stimulate our mucosal layer.
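The size question can be put in rough numbers. The sketch below is not from the slides: it estimates mean ORF length as coding DNA divided by gene count, using the figures quoted for B. thetaiotaomicron (6.3 Mb, 89% coding, 4,779 genes). It assumes all coding DNA is shared among the annotated genes and it ignores the plasmid; the function name is illustrative.

```python
def mean_orf_length(genome_bp, coding_fraction, n_genes):
    """Rough mean ORF length in bp: total coding DNA divided by gene count."""
    return genome_bp * coding_fraction / n_genes


# Figures quoted in the slides for B. thetaiotaomicron.
b_theta = mean_orf_length(6.3e6, 0.89, 4779)
print(f"B. thetaiotaomicron: about {b_theta:.0f} bp per ORF")
```

With these inputs the estimate comes out near 1,170 bp per ORF. The same function can be applied to any other genome for which size, coding fraction, and gene count are given.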
Can Microbial Genomes Become Dependent Upon Us? In the microbial world, if you don’t use it- you lose it. Mycoplasma genitalium has one of the most reduced microbial genomes and, at 580 kb, the 2nd smallest sequenced (the smallest, at 490 kb, belongs to the archaeon N. equitans). TIGR sequenced its genome in 1995. 470 ORFs were found, 96 of which have no known orthologs. M. genitalium has an 88% coding capacity.
Genes that have been lost: M. genitalium has presumably lost many genes involved in the synthesis of amino acids, cofactors, cell envelope, and regulatory factors. It has only 1 σ factor. The microbe has retained genes for energy metabolism, fatty acid and phospholipid metabolism, nucleotide production, replication, transcription, and protein transport. The only category overrepresented is translation, namely rRNA and tRNA genes.
What is the Minimum # of Genes? Craig Venter, along with Hamilton O. Smith, is trying to construct an organism with the fewest possible genes. A new field called synthetic biology seeks to synthesize a functioning genome de novo. A better understanding of evolutionary principles and genome circuitry is sought. Japanese & European scientists have tried to identify the essential genes of B. subtilis. They have found that only 192 genes are indispensable to life.
Do all Viruses have Small Genomes? Most viral genomes are much smaller than bacterial ones: HIV- 9,200 nt; WNV- 10,962 nt; SARS- 29,727 nt; T7- 39,900 nt; λ- 48,502 nt. In 2003, a new virus that infects amoebae was isolated that has a 1.2 Mb genome! A group in Marseille, France sequenced Mimivirus, as it is called.
Mimivirus Genome 1,262 ORFs were identified; the coding capacity is 90.5%. Like most viruses, the genome is linear, but it has inverted repeats at both ends by which it may circularize, perhaps during replication. Isoleucine is used twice as often as usual, and there is a strong codon bias for codons lacking G or C. The genome is 28% GC. Genes for translation, posttranslational modification, and amino acid transport and metabolism are overrepresented in Mimivirus.
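A codon bias like the one described here is measured by counting in-frame codons. The sketch below is a minimal illustration with hypothetical function names, to be tried on short invented coding sequences rather than Mimivirus data: it tallies codons and reports what fraction contain no G or C.

```python
from collections import Counter


def codon_counts(cds):
    """Count in-frame codons; any trailing 1 or 2 bases are ignored."""
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def at_only_fraction(cds):
    """Fraction of codons made up only of A and T (no G or C)."""
    counts = codon_counts(cds)
    total = sum(counts.values())
    at_only = sum(n for codon, n in counts.items() if set(codon) <= {"A", "T"})
    return at_only / total if total else 0.0
```

Comparing `codon_counts` for one gene against the genome-wide tally is also the basis of the codon-bias test for horizontal transfer mentioned earlier in the talk.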
Is Mimivirus Alive? The genome of Mimivirus resembles a bacterial genome, and Mimivirus even stains Gram +. Is it a virus? In 1957, the definition of a virus was proposed: 1) smaller than 0.2 microns 2) possesses DNA or RNA, not both 3) not able to synthesize its own proteins 4) cannot generate energy from substrates 5) cannot grow by binary fission. Mimivirus only satisfies the 4th criterion; we are not sure about the 5th.
What is it then? Mimivirus has blurred the distinction between prokaryotes and viruses. It is hypothesized that, like M. genitalium, Mimivirus has lost genes over time. We will learn of more obligate intracellular parasites later in class. Mimivirus may resemble some of the earliest forms of life, which were able to replicate independently until they became parasites.
Genomes Reflect an Organism’s Ecological Niche Cyanobacteria are the most productive phytoplankton in the world. The two most abundant genera of cyanobacteria are Prochlorococcus and Synechococcus. 3 genomes in the former group and 1 in the latter were sequenced in 2003. Individual cells from both genera are referred to using a numbering system to indicate different ecotypes. Species designations are still difficult to assign; Prochlorococcus was discovered in the 1990s.
Dot Plot Alignment
Prochlorococcus MED4 vs. MIT9313 These ecotypes share 1,352 orthologs. Short diagonal segments indicate synteny. A negative slope indicates that the segment is inverted in one ecotype relative to the other. Segments with positive slope but located off the diagonal indicate chromosome recombinations. Genes plotted along an axis are missing from the other ecotype: MED4 has 364 genes not found in MIT9313, which has 923 genes not found in MED4.
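The reading rules for a dot plot can be seen in a toy version. The sketch below is an assumption-laden illustration: it matches short DNA words rather than the orthologous genes used in the MED4 vs. MIT9313 plot, and the function names and word length are made up for the example. A dot is recorded wherever a word from sequence a occurs in sequence b, with strand "-" marking a match to the reverse complement.

```python
def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def dot_plot(a, b, k=4):
    """Return (i, j, strand) for each k-mer at position i in a that occurs
    at position j in b, either directly ('+') or as its reverse complement ('-')."""
    index = {}
    for j in range(len(b) - k + 1):
        index.setdefault(b[j:j + k], []).append(j)
    dots = []
    for i in range(len(a) - k + 1):
        word = a[i:i + k]
        for j in index.get(word, []):
            dots.append((i, j, "+"))
        for j in index.get(revcomp(word), []):
            dots.append((i, j, "-"))
    return dots
```

Identical sequences give dots with i equal to j (a positive-slope diagonal), while a sequence compared with its own reverse complement gives dots where j falls as i rises (a negative slope), which is how an inverted segment shows up.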
pcb gene family A major difference between the ecotypes is in the pcb gene family, which encodes chlorophyll-binding, light-harvesting antenna complex proteins that help capture a wider spectrum of light. MED4 (high light) has only 1 pcb gene; MIT9313 (medium light) has 2 (A & B); SS120 (low light) has 8 (A-H). MED4’s gene does not respond to changes in Fe3+, but MIT9313’s is induced 7-fold and SS120’s is induced 23-fold.
MED4’s Small Genome MED4’s genome is the smallest known for a photoautotroph and may represent the minimum for a photosynthetic organism. MED4 appears to have lost genes over time. A more streamlined genome means that the organism is adapted to a narrower ecological range. Synechococcus has the largest genome of this group and the largest ecological range as well. People have proposed seeding the ocean with Fe3+ to help stimulate CO2 consumption.
Gene deletions in Cyanobacteria
Malaria Malaria, although it rarely makes news headlines, is a daily threat to the 3 billion people who live in tropical climates. In 2002, about 500 million people were infected. About 2.7 million people die each year (about 90% of these are < 5 years old). The cause of malaria has been known for 100 years but we still can’t stop its spread. The most lethal form of malaria is caused by Plasmodium falciparum.
Lifecycle of Plasmodium
RBC Infection The most vulnerable time for Plasmodium is during the RBC infection stage. The parasite must force its way into an RBC without rupturing any plasma membranes. Three structures are important during infection: 1) extracellular coating to make cells sticky 2) apical end of cell must be oriented downward 3) apicoplast is an internalized algal symbiont
Plasmodium Genomes Plasmodium actually has three genomes: nuclear, mitochondrial, and apicoplastic. Pulsed-field gel electrophoresis was used to separate the chromosomes, followed by shotgun genome sequencing. This proved to be the most AT-rich genome sequenced so far (19.4% GC). The 22.9 Mb genome has 52.6% coding capacity and 5,268 ORFs (60% of which have no known function, the largest fraction of any genome).
Tricking the Immune System The genes of Plasmodium that are responsible for binding to RBCs and for avoiding the immune system are located near the telomeres of this eukaryote. Genes located near Plasmodium telomeres are replicated many times; all three gene families in these categories (var, rif, & stevor) are polymorphic. There are 59 var paralogs, 149 rif, and 28 stevor. This may account for our immune system’s lack of ability to deal with this parasite.
The Plasmodium Proteome 1% of proteins are used for host cell invasion; 4% help evade the immune response; 31% are integral to the membrane; 14% are enzymes (about 4x < most proteomes); 10% are transported to the apicoplast; 60% have unknown function. The Krebs cycle is present, but the organism grows anaerobically and only uses this cycle for heme biosynthesis (which it could get from us).
Apicoplast Proteome Similar to a chloroplast in origin but used for a different purpose now. Only two photosynthetic orthologs remain. This organelle synthesizes fatty acids, isoprenoids, and heme groups. Nuclear proteins sent here assist in DNA replication & repair, transcription, translation, posttranslational glycosylation, protein import, and protein degradation.
Comparing Plasmodia The Plasmodium sequencing project took 45 people 6 years to complete. At the same time, other groups were working on P. yoelii, which infects rats and is used as a model organism for malaria research. Unfortunately, this latter genome was never finished, making comparisons difficult. P. yoelii has 600 additional ORFs, and the two have 3,310 genes in common (56%). Is this similar enough to make a good model organism?
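One way to read the 56% figure is as the share of P. yoelii’s ORFs that are common to both species. The sketch below checks that reading under a stated assumption: that "600 additional ORFs" means 600 more than the 5,268 ORFs reported for P. falciparum. The slide does not say which genome the percentage is measured against, so this is an interpretation, and the function name is illustrative.

```python
def shared_fraction(n_shared, n_total):
    """Fraction of one genome's ORFs that are also found in the other."""
    return n_shared / n_total


falciparum_orfs = 5268
yoelii_orfs = falciparum_orfs + 600  # assumption: 600 more than P. falciparum
shared = 3310

print(f"of P. yoelii ORFs:     {shared_fraction(shared, yoelii_orfs):.0%}")
print(f"of P. falciparum ORFs: {shared_fraction(shared, falciparum_orfs):.0%}")
```

Under that assumption the shared genes are about 56% of the P. yoelii set but closer to 63% of the P. falciparum set, so the answer to "similar enough?" depends partly on which genome is taken as the reference.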
Malaria Treatment Options? Recently, a German & American team used reverse genetics (starting with a gene sequence and deducing its function) to target a gene in the production of a knockout strain. This strain is expected to be less pathogenic than wild type. Mice injected with this strain were protected for 30 days. Even if a better drug were produced, funding and health care infrastructure are lacking in many problem areas. Very little $ is spent on malaria research.
Yeast Genome The S. cerevisiae genome was sequenced in 1996. It took over 600 scientists in Europe, North America, and Japan working together to sequence the 12 Mb genome. Yeast has a 70.3% coding capacity, higher than Plasmodium but lower than all bacteria. There is a gene every 2 kb in yeast, one every 6 kb in C. elegans, and one every 30 kb in humans. Eukaryotes have more junk DNA than prokaryotes, and enhancers, promoters, and introns add substantially to the size of eukaryotic genes.
Chromosome Structure in Yeast The 4 smallest chromosomes in yeast have a unique structure. It was known from using YACs that chromosomes smaller than 150 kb were not stable in yeast. These chromosomes are relatively gene-poor and undergo recombination at high frequencies, perhaps to protect the larger ones from the same fate. Transcriptionally silent genes are found in the sub-telomeric regions of many chromosomes; this may help identify the right and left sides of a chromosome.
Evolutionary History of Yeast There were a substantial number of genes found in duplicate copies in yeast. It was proposed that yeast had undergone “duplication events” at some point in time. Many regions of chromosomes are syntenic with regions on other chromosomes. Such paralogs are seen as evolutionary experiments where one gene can drift to provide new specialized functions. Some genes were initially thought to be extra copies, but experiments showed that they differ.
Predictions for the Future The authors of the landmark 1996 yeast sequencing publication made the following predictions: 1) They described plans to produce a collection of single, double, and even triple KO mutations. 2) They addressed the value of making all genome sequences publicly available. 3) They felt WGS sequencing of large genomes was not feasible. 4) They looked forward to comparing yeast with S. pombe as well as the human genome.
Better Annotation A number of yeast genomes have been sequenced since 1996. With these, the need to annotate genes based on GO, Gene Ontology, became clear. Improvements in computers, search algorithms, and the increased volume of genes in the databases led to better annotation. The original 5,885 annotated ORFs have been increased to 6,672, many of them below the original cutoff of 100 codons.
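The effect of a length cutoff on annotation is easy to see in a toy ORF finder. The sketch below is a deliberately simplified illustration, not the method used by the yeast annotators: it scans only the forward strand, treats the first ATG in a frame as the start, ignores introns, and uses hypothetical names. The `min_codons` parameter plays the role of the 100-codon cutoff.

```python
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=100):
    """Return (start, end, n_codons) for ATG-to-stop ORFs in the three
    forward frames; n_codons excludes the stop, end is exclusive."""
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, n_codons))
                start = None
    return orfs
```

Lowering `min_codons` reports more, shorter ORFs from the same sequence, which is why relaxing the cutoff requires other evidence, such as conservation in related yeast genomes, to separate real genes from chance open frames.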