Comparative Genomics Virulence in E. coli Diversity of Genomes How Many Genomes are There? Different Genome Perspectives.


1 Comparative Genomics Virulence in E. coli Diversity of Genomes How Many Genomes are There? Different Genome Perspectives

2 Virulence in E. coli
1997: Fred Blattner's lab at the University of Wisconsin sequenced E. coli strain K-12.
2001: the pathogenic strain O157:H7 was sequenced.
This strain causes hemorrhagic colitis, which affects 75,000 people each year.
Its genome is 5.5 Mb, compared with 4.6 Mb for K-12.
O157:H7 has 1.3 Mb of “O-islands” not found in K-12, and K-12 has 0.5 Mb of “K-islands” not found in O157:H7 (1,387 and 528 genes, respectively).

3 Island Genes
Many of the genes unique to O157:H7 are predicted to be virulence genes, including genes for toxins, metabolic pathways, transporters, and adhesion molecules.
K-12, however, also has genes in these categories, yet the strain is not virulent.
A striking feature of both O-islands and K-islands is their base composition, which differs from that of the backbone.
Many of the island genes have orthologs in other species and viruses and may have resulted from horizontal transfer.
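One simple way to look for regions whose base composition departs from the backbone is a sliding-window GC calculation. The sketch below is illustrative only: the function names, window size, and toy sequence are assumptions, not part of the original analysis.

```python
def gc_fraction(seq):
    """Return the fraction of G and C bases in a DNA string (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window, step):
    """Yield (start, gc_fraction) for each full window along the sequence."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_fraction(seq[start:start + window])


# Hypothetical toy sequence: a GC-rich "backbone" with an AT-rich insert.
toy = "GCGCGCGCGC" + "ATATATATAT" + "GCGCGCGCGC"
profile = list(windowed_gc(toy, window=10, step=10))
for start, gc in profile:
    print(start, gc)
```

On a real genome the windows would be thousands of bases long, and a window whose GC fraction sits far from the genome-wide average would be a candidate island worth testing more formally.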

4 Chi-square Analysis
How can we tell whether base compositions, such as those associated with O- and K-islands, really are different from the norm?

Base     Seq 1    Seq 2    Total
A        1,000      600    1,600
C        1,000      800    1,800
G        1,000      700    1,700
T        1,000      900    1,900
Total    4,000    3,000    7,000

5 Hypothesis: the base composition is equal
χ² = 38.35

Observed    Expected    (O - E)²    (O - E)²/E
1,000         914.3      7346.9        8.04
1,000        1028.6       816.3        0.79
1,000         971.4       816.3        0.84
1,000        1085.7      7346.9        6.77
  600         685.7      7346.9       10.71
  800         771.4       816.3        1.06
  700         728.6       816.3        1.12
  900         814.3      7346.9        9.02
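The calculation can be checked with a short script. This is a minimal sketch in plain Python (the function name is an assumption): each expected count is row total x column total / grand total, and each (O - E)²/E term is divided by its own expected count. With 4 bases and 2 sequences there are (4 - 1)(2 - 1) = 3 degrees of freedom, for which the 5% critical value is 7.81.

```python
def chi_square_homogeneity(table):
    """Chi-square statistic and degrees of freedom for a table of counts.

    `table` is a list of rows; here each row is a base and each column a sequence.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if both sequences shared one base composition.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


counts = [
    [1000, 600],  # A
    [1000, 800],  # C
    [1000, 700],  # G
    [1000, 900],  # T
]
statistic, dof = chi_square_homogeneity(counts)
print(f"chi-square = {statistic:.2f}, degrees of freedom = {dof}")
```

Run on the counts from the table, the statistic comes out at about 38.35, well above 7.81, so the hypothesis of equal base composition would be rejected.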

6 Differences Between Two Strains
Virulence may be due to genes on the “O-islands” or to differences between shared genes.
Although the strains share 75% of their DNA, only 25% of their genes are identical.
The rest have at least 1 base difference.
While this amount of difference is small, it can mean the difference between healthy individuals and those with sickle-cell anemia or cystic fibrosis.

7 460 Genomes, and counting…
The more genomes we sequence, the more evident their wide diversity becomes.
These genomes range in size from 0.5 to 10 Mb and in GC content from 25% to 75%. The two seem to correlate, since GTP and CTP take more energy to make.
One trend is that stable niches tend to accommodate small genomes while volatile environments do not.
One thing that remains fairly constant is coding density: prokaryotes all have about 1 gene/kb.
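The 1 gene/kb rule of thumb can be checked against the island figures given earlier for E. coli (1,387 genes in 1.3 Mb of O-islands, 528 genes in 0.5 Mb of K-islands). The helper names below are assumptions made for this sketch.

```python
def genes_per_kb(gene_count, size_mb):
    """Coding density in genes per kilobase for a region of `size_mb` megabases."""
    return gene_count / (size_mb * 1000)


def expected_gene_count(size_mb, density=1.0):
    """Rough gene count for a genome, assuming `density` genes per kilobase."""
    return round(size_mb * 1000 * density)


o_island_density = genes_per_kb(1387, 1.3)   # O-islands: 1,387 genes in 1.3 Mb
k_island_density = genes_per_kb(528, 0.5)    # K-islands: 528 genes in 0.5 Mb
k12_estimate = expected_gene_count(4.6)      # 4.6 Mb genome at 1 gene/kb
print(f"{o_island_density:.2f} {k_island_density:.2f} {k12_estimate}")
```

Both island sets come out just above 1 gene/kb, and the same rule predicts roughly 4,600 genes for a 4.6 Mb genome.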

8

9 Circular Prokaryotic Chromosomes
Another thing we have learned is that not all prokaryotic chromosomes are circular.
Three distantly related groups of bacteria have linear chromosomes that seem to have evolved independently.
With regard to chromosome number, there is some confusion over whether particular pieces of DNA are chromosomes or plasmids.
Two criteria are used to define a chromosome:
1) Does it contain essential genes?
2) Does it contain ribosomal genes?

10 Genomes are Constantly Changing
The size of a genome may change rapidly due to horizontal transfer or fusing of genomes.
The cost of replicating additional DNA must be balanced with the benefit of having genes that may lend a selective advantage.
If the cell evolves to fill a new niche, losing unused genes may be advantageous.
Most bacteria in similar niches have similar-sized genomes. Gut bacteria, for instance, have genomes in the 4-5 Mb range.

11 How Many Genomes are There?

12

13 Experimental Procedures
1,500 liters of surface water were collected 7 times from 4 different sites around the sea.
This was passed through filters that trapped particles between 0.1 and 3 µm.
Collected cells were lysed and their DNA cut into <1 kb pieces, which were then cloned.
Genomic DNA was extracted from the filters and subjected to shotgun sequencing.

14 Results:
About 1 million separate sequences were obtained, totaling 1.6 billion base pairs of DNA.
At least 1,412 different rRNA genes are represented in this sample, including 148 that are new to the database.
Using 6 other genes for comparison, a range of 341-569 phylotypes (i.e., species) was sampled (including 12 complete genomes).
As the cost of sequencing DNA continues to drop, this approach may become the “next wave” of research into biodiversity.

15

16 Sampling Problems
One problem with this method is that it favors more abundant species. Coverage of any particular gene is better in an abundant species, so more genes per species are recovered.
53% of all DNA from sample #1 was from two genera: Shewanella and Burkholderia. This is a mystery, since the former prefers nutrient-rich water and the latter is usually terrestrial.
Calculations to correct for species missed by sampling estimate that 1,800 different species may have been present.
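The slides do not say which correction was used to arrive at the 1,800 figure. As an illustration of how such corrections work in general, the sketch below uses the Chao1 estimator, a common choice swapped in here as an assumption: it adds to the observed species count a term based on how many species were seen exactly once (singletons) and exactly twice (doubletons). The read labels are hypothetical toy data.

```python
from collections import Counter


def chao1(abundances):
    """Chao1 lower-bound estimate of total species richness.

    `abundances` maps each observed species to the number of times it was seen.
    """
    counts = [n for n in abundances.values() if n > 0]
    observed = len(counts)
    singletons = sum(1 for n in counts if n == 1)
    doubletons = sum(1 for n in counts if n == 2)
    if doubletons == 0:
        # Bias-corrected form, which stays defined without doubletons.
        return observed + singletons * (singletons - 1) / 2
    return observed + singletons ** 2 / (2 * doubletons)


# Hypothetical sample: two dominant species and a tail of rare ones.
reads = (["sp_a"] * 50 + ["sp_b"] * 30
         + ["sp_c", "sp_c", "sp_d", "sp_d", "sp_e", "sp_f", "sp_g", "sp_h"])
estimate = chao1(Counter(reads))
print(estimate)
```

In this toy sample 8 species are observed, 4 of them once and 2 of them twice, so the estimate rises to 12. The many rare species imply that others were missed entirely, which is the same logic behind correcting a sample dominated by a few abundant genera.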

17 New Genes Discovered
A total of 1.2 million genes were characterized in this study, including 70,000 novel ones.
Bacteriorhodopsin was one popular gene family: previous sampling using PCR had uncovered 67 homologs, but this study found 782 new ones.
13 families of bacteriorhodopsin were characterized, from a wider range of bacteria than previously thought.
One must keep in mind that these data were collected from 1.5 x 10^3 L of water, while the ocean’s estimated volume is 1.37 x 10^21 L.

18 Families of Bacteriorhodopsin

19 Different Genome Perspectives
What you see using comparative genomics depends on what perspective you take.
Zooming out, from small to large, we get:
1) amino acids
2) genes
3) gene families
4) segments of chromosomes
5) whole chromosomes

20 Out with the Old, In with the New
One group decided to look at proteomes at the amino acid level. Instead of worrying about the proteins encoded, the researchers identified amino acids that were identical in 2 distantly related species but different in 2 closely related species. This focuses on evolutionary drift.
One pattern was seen: amino acids predicted to be among the first incorporated into the genetic code are decreasing, while those predicted to be newer are increasing in frequency. This is true across all 3 domains of life.
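One way to read this comparison is as an outgroup test: at an aligned site where a distant species agrees with one of two close relatives, the other relative is assumed to carry the newer residue. The sketch below tallies gains and losses under that assumption. The function name, the three-sequence setup, and the toy sequences are all hypothetical, and the researchers' actual procedure may differ.

```python
from collections import Counter


def gain_loss_counts(sister_a, sister_b, outgroup):
    """Tally amino acid gains and losses from three aligned protein sequences.

    A site is informative when the outgroup matches exactly one sister; the
    other sister is then assumed to carry the derived residue.
    """
    if not (len(sister_a) == len(sister_b) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    gains, losses = Counter(), Counter()
    for a, b, out in zip(sister_a, sister_b, outgroup):
        if "-" in (a, b, out) or a == b:
            continue  # skip gaps and sites where the sisters agree
        if out == a:
            ancestral, derived = a, b
        elif out == b:
            ancestral, derived = b, a
        else:
            continue  # all three differ: direction is ambiguous
        losses[ancestral] += 1
        gains[derived] += 1
    return gains, losses


# Hypothetical aligned fragments: two close relatives and one distant species.
gains, losses = gain_loss_counts("MAPLE", "MCPLF", "MAPLF")
print(dict(gains), dict(losses))
```

Summed over whole proteomes, a residue with many more gains than losses would be one that is increasing in frequency, and the reverse for one that is decreasing.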

21 Figure 3.4

22 Gene Family Level
A German group led by Svante Pääbo studied the evolution of olfactory receptor (OR) genes in 19 primates plus mouse.
They plotted the number of OR pseudogenes in each species studied.
New World monkeys clustered around 18% pseudogenes, while Old World monkeys had around 30%. Humans had >50% pseudogenes.
The one exception is the howler monkey, which seems out of place. Interestingly, New World monkeys see in 2 colors, with the exception of the howler monkey, which sees in 3 colors like Old World monkeys.

23

24 Whole Chromosome Level
Evan Eichler at Case Western Reserve examined human chromosome 7, looking for recombination hot spots. There were a total of 27: 12 on the short arm (p) and 15 on the long arm (q).
A team of researchers mapped the recombination events that have produced syntenic regions in human, mouse, rat, and dog.
CTVM is a genetic disease in dogs that leads to thickened heart valves; it has been mapped to canine chromosome 9. This region is syntenic with chromosome 17 in humans.

25 Dot Plots of Recombination

26 Comparing 4 Chromosomes
When all 4 chromosomes (dog, human, mouse, and rat) are compared simultaneously, colored lines are used to highlight the recombinational hotspots, with shaded regions showing the 2 large human recombined areas.
Crossing lines show inversions, while bent lines that do not cross show translocations.
The site of recombination, as well as gene loss, is often conserved across species. Highly repetitive DNA is often involved in recombination.

27

28 Most Recent Common Ancestor Chromosomes can be Constructed using recombination data.

