
Genomic island analysis: Improved web-based software and insights into an apparent gene pool associated with genomic islands William Hsiao Brinkman Laboratory.


1 Genomic island analysis: Improved web-based software and insights into an apparent gene pool associated with genomic islands
William Hsiao, Brinkman Laboratory, Simon Fraser University, Burnaby, BC, Canada

2 Prokaryotic Genomic Islands (GIs)
Definition: Genomic DNA segments with particular characteristics that indicate horizontal origins
[Figure labels: GI; a bacterium]

3 Genomic Island Characteristics
- Often contain genes encoding adaptive functions of medical and environmental importance
  - Pathogenicity islands: virulence factors (genes that contribute to disease)
  - Resistance islands: antibiotic resistance
  - Metabolic islands: secondary metabolism (e.g. sucrose)
- Exhibit sequence and annotation features
[Figure: a genomic island (e.g. PAI) on the chromosome, labelled with a tRNA gene, direct repeats, mob (mobility genes), VF, and %G+C / sequence composition bias]

4 IslandPath: Aiding identification of GIs (Hsiao et al 2003 Bioinformatics p418-20)
Legend:
- Yellow circle: %G+C above high cutoff
- Green circle: %G+C between cutoffs
- Pink circle: %G+C below low cutoff
- Black bar: transfer RNA
- Purple bar: ribosomal RNA
- Deep blue bar: both tRNA and rRNA
- Black square: transposase
- Black triangle: integrase
- Strike-line: regions with dinucleotide bias
Example shown: Vibrio cholerae N16961 Chr1, TCP island (TCP = toxin co-regulated pili)
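The three circle colours in the legend amount to a three-way classification of each gene's %G+C against a low and a high cutoff. The slide does not say how the cutoffs are chosen, so the minimal sketch below assumes they are the mean plus or minus one standard deviation of per-gene %G+C. That choice, and the function names `gc_percent` and `classify_genes`, are illustrative assumptions rather than IslandPath's actual code.

```python
from statistics import mean, pstdev


def gc_percent(seq: str) -> float:
    """Return the %G+C of a DNA sequence, counting only A, C, G and T."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def classify_genes(gene_seqs, n_sd=1.0):
    """Label each gene 'high', 'mid' or 'low' by its %G+C.

    Assumption (not stated on the slide): the cutoffs are the mean of the
    per-gene %G+C values plus or minus n_sd standard deviations.
    """
    values = [gc_percent(s) for s in gene_seqs]
    mu, sd = mean(values), pstdev(values)
    low_cutoff, high_cutoff = mu - n_sd * sd, mu + n_sd * sd
    labels = []
    for v in values:
        if v > high_cutoff:
            labels.append("high")   # yellow circle in the legend
        elif v < low_cutoff:
            labels.append("low")    # pink circle
        else:
            labels.append("mid")    # green circle
    return labels


if __name__ == "__main__":
    toy_genes = ["GGGGCCCC", "ATGCATGC", "ATGCATGC", "ATGCATGC", "AAAATTTT"]
    print(classify_genes(toy_genes))
```

The toy sequences are far shorter than real genes and serve only to exercise the three branches.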

5 IslandPath V.2

6 Which Features Best Identify GIs
Examined prevalence of features in 95 published islands:
- 85% of islands have >25% dinucleotide bias coverage (62% have >50% dinucleotide bias coverage)
- Mobility genes identified in >75% of the islands
- tRNA genes observed in <50% of known islands
- Only 20% of the islands show atypical %G+C
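"Dinucleotide bias coverage" is not defined on the slide. One plausible reading is the fraction of an island's length, in base pairs, that is overlapped by regions flagged as dinucleotide-biased. The sketch below computes that fraction under this assumption; the half-open `(start, end)` coordinate convention and the name `bias_coverage` are hypothetical.

```python
def bias_coverage(island, biased_regions):
    """Fraction of an island covered by the union of biased regions.

    island and each biased region are half-open (start, end) intervals in
    base pairs. Overlapping biased regions are counted once.
    """
    i_start, i_end = island
    if i_end <= i_start:
        raise ValueError("island must have positive length")
    # Keep only regions that overlap the island, clipped to its bounds.
    clipped = sorted(
        (max(s, i_start), min(e, i_end))
        for s, e in biased_regions
        if e > i_start and s < i_end
    )
    covered = 0
    cursor = i_start  # rightmost position already counted
    for s, e in clipped:
        s = max(s, cursor)
        if e > s:
            covered += e - s
            cursor = e
    return covered / (i_end - i_start)


if __name__ == "__main__":
    island = (1000, 2000)
    regions = [(900, 1200), (1100, 1300), (1800, 2500)]
    print(bias_coverage(island, regions))
```

Under this reading, an island would count toward the ">25% coverage" group when the returned fraction exceeds 0.25.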

7 Properties of genes in GIs?
Defined a "putative island" as:
- 8 or more genes in a row with dinucleotide bias
- 8 or more genes in a row with dinucleotide bias + an associated mobility gene
Any difference between genes inside and outside of islands in terms of their protein functional categories?
- 63 genomes (67 chromosomes) analyzed
- COG: Clusters of Orthologous Groups of proteins
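The "putative island" definition above can be sketched as a scan for runs of consecutive flagged genes. This is a minimal illustration, not the authors' implementation. The per-gene flags `dinuc_bias` and `mobility` are hypothetical field names, "associated" is assumed to mean that a mobility gene lies within the run, and the chromosome is treated as linear, so runs that wrap around the origin are not joined.

```python
def putative_islands(genes, min_run=8, require_mobility=False):
    """Return (first_index, last_index) pairs for runs of biased genes.

    genes: list of dicts in chromosomal order, each with boolean keys
    'dinuc_bias' and 'mobility' (hypothetical field names).
    A run qualifies when it has at least min_run consecutive biased genes
    and, if require_mobility is set, contains at least one mobility gene.
    """
    islands = []
    run_start = None
    # The trailing None acts as a sentinel that closes a run at the end.
    for i, gene in enumerate(list(genes) + [None]):
        biased = gene is not None and gene["dinuc_bias"]
        if biased and run_start is None:
            run_start = i
        elif not biased and run_start is not None:
            run_end = i - 1
            if run_end - run_start + 1 >= min_run:
                has_mob = any(g["mobility"] for g in genes[run_start:run_end + 1])
                if has_mob or not require_mobility:
                    islands.append((run_start, run_end))
            run_start = None
    return islands


if __name__ == "__main__":
    def gene(bias, mob=False):
        return {"dinuc_bias": bias, "mobility": mob}

    toy = (
        [gene(False)] * 3
        + [gene(True), gene(True), gene(True, mob=True)] + [gene(True)] * 5
        + [gene(False)] * 2
        + [gene(True)] * 9
    )
    print(putative_islands(toy))
    print(putative_islands(toy, require_mobility=True))
```

Calling the function twice, once per definition, gives the two island datasets the slide describes: dinucleotide bias alone, and dinucleotide bias plus a mobility gene.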

8 More novel genes inside of islands
Paired t-test P value: 1.27E-18
Hsiao et al. PLOS Genetics e62, Nov. 2005

9 Control for Analysis Biases
- Control for mis-prediction of genes in sequence composition biased regions
  - Excluded genes < 300 bp
- Control for bias of COG protein classification
  - Used SUPERFAMILY classification, which is better at detecting distant homologs
- Control for compositional bias due to other factors
  - Used the dinucleotide bias plus mobility gene dataset

10 More novel genes in islands in all experiments
Island dataset           Classification method   Paired t-test p-value
DINUC (all genes)        COG                     1.27E-18
DINUC+MOB (all genes)    COG                     1.20E-18
DINUC (all genes)        SUPERFAMILY             1.13E-18
DINUC+MOB (all genes)    SUPERFAMILY             4.43E-14
DINUC (>300 bp)          COG                     1.05E-17
DINUC+MOB (>300 bp)      COG                     7.65E-16
DINUC (>300 bp)          SUPERFAMILY             3.01E-16
DINUC+MOB (>300 bp)      SUPERFAMILY             2.04E-10
Hsiao et al. PLOS Genetics e62, Nov
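For readers unfamiliar with the test named in the table, the sketch below computes a paired t statistic with the standard library only. It assumes the pairing is per genome: each genome contributes one proportion of novel genes inside islands and one outside. The slides do not spell out this pairing, and the numbers in the example are made-up placeholders, not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(inside, outside):
    """Return (t statistic, degrees of freedom) for a paired t-test.

    inside[i] and outside[i] are the two measurements for the same unit
    (assumed here to be one genome).
    """
    if len(inside) != len(outside):
        raise ValueError("samples must be paired (equal length)")
    if len(inside) < 2:
        raise ValueError("need at least two pairs")
    diffs = [a - b for a, b in zip(inside, outside)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    n = len(diffs)
    t = mean(diffs) / (sd / sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Placeholder proportions of novel genes per genome (not real data).
    inside_islands = [0.5, 0.6, 0.7, 0.4]
    outside_islands = [0.3, 0.3, 0.5, 0.3]
    t_stat, df = paired_t(inside_islands, outside_islands)
    print(f"t = {t_stat:.3f}, df = {df}")
```

The p-value then follows from the t distribution with the returned degrees of freedom. If SciPy is available, `scipy.stats.ttest_rel` performs the same test and reports the p-value directly.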

11 Phage may be the predominant donors of GIs
- Some GIs are clearly of bacteriophage origin, but more may be from phage as well
- Predicted subcellular localizations of proteins encoded in our GIs are similar to those of phage genomes (lower proportion of cytoplasmic membrane proteins) (Hsiao et al. PLOS Genetics e62, Nov)
- Many GI-encoded genes have sequence characteristics similar to phage genes (A+T rich and short) (Daubin et al. Genome Biol. 4(9): R57)

12 Higher proportions of genes in islands are VFs
P value: < 2.2E-16
Fedynak, Hsiao, and Brinkman (unpublished)

13 Certain classes of VFs over-represented in GIs
Most of these are "offensive" virulence factors
Fedynak, Hsiao, and Brinkman (unpublished)

14 Conclusions
- Genomic islands contain a disproportionately high number of novel genes, suggesting a large and understudied gene pool contributing to horizontal gene transfer
- These novel genes appear to be drawn from a large pool of phage genes; metagenomics studies would be useful here
- These novel genes may contribute to microbial adaptation and may play a role in pathogenesis and in antibiotic resistance

15 Acknowledgements
- Fiona Brinkman
- Amber Fedynak: VF studies
- Brian Coombes, Michael Lowden, and Brett Finlay (UBC): microarray data
- Jenny Bryan (UBC): stats analysis
- Brinkman Laboratory

16 Other categories more common in islands
Category: Cell motility
  In putative islands, paired t-test p-value: 7.73E
  In putative islands + mobility genes, paired t-test p-value: (may be a sampling size issue)
Category: Intracellular trafficking, secretion, and vesicular transport
  In putative islands, paired t-test p-value: 8.124E
  In putative islands + mobility genes, paired t-test p-value: (may be a sampling size issue)
* Novel genes not included in analysis due to potential skew of other category results
Several metabolism-associated categories are under-represented in islands

17 P value 3.0E-16

18 IslandPath V.2

19 Experiment: S. typhimurium LT2 ssrB gene KO
Track 1: IslandPath
Track 2: Microarray expression (overexpressed and underexpressed)

