
1 Scalable metabolic reconstruction for metagenomic data and the human microbiome
Sahar Abubucker, Nicola Segata, Johannes Goll, Alyxandria M. Schubert, Jacques Izard, Brandi L. Cantarel, Beltran Rodriguez-Mueller, Jeremy Zucker, Mathangi Thiagarajan, Bernard Henrissat, Owen White, Scott T. Kelley, Barbara Methé, Patrick D. Schloss, Dirk Gevers, Makedonka Mitreva, Curtis Huttenhower
Harvard School of Public Health, Department of Biostatistics

2 What’s metagenomics?
Total collection of microorganisms within a community (also microbial community or microbiota)
Total genomic potential of a microbial community
Study of uncultured microorganisms from the environment, which can include humans or other living hosts
Total biomolecular repertoire of a microbial community

3 Valm et al, PNAS 2011

4 What to do with your metagenome?
Reservoir of gene and protein functional information
Comprehensive snapshot of microbial ecology and evolution
Who’s there? What are they doing?
Who’s there varies: your microbiota is plastic and personalized.
What they’re doing is adapting to their environment: you, your body, and your environment.
Public health tool monitoring population health and interactions
Diagnostic or prognostic biomarker for host disease

5 The Human Microbiome Project for a normal population
300 people / 15 (18) body sites
Multifaceted data: >6,000 samples; >50M 16S seqs.; 4Tbp unique metagenomic sequence; >1,900 reference genomes; full clinical metadata
Multifaceted analyses: human population, microbial population, novel organisms, biotypes, viruses, metabolism
2 clin. centers, 4 seq. centers, data generation, technology development, computational tools, ethics…

6 Metabolic/Functional Reconstruction: The Goal
Healthy/IBD, BMI, diet
SNP genotypes, taxon abundances, enzyme family abundances, pathway abundances, gene expression
LEfSe: LDA Effect Size, metagenomic biomarker discovery (Nicola Segata)

7 HMP: Metabolic reconstruction
HUMAnN: HMP Unified Metabolic Analysis Network
Data: 100 subjects, 1-3 visits/subject, ~7 body sites/visit, 10-200M reads/sample, 100bp reads
Functional seq.: KEGG + MetaCyc; CAZy, TCDB, VFDB, MEROPS…
BLAST → Genes: WGS reads → BLAST → genes (KOs)
Genes → Pathways: MinPath (Ye 2009), giving pathways (KEGGs) and pathways/modules
Taxonomic limitation: rem. paths in taxa < ave. ?
Smoothing: Witten-Bell
Xipe: distinguish zero/low (Rodriguez-Mueller in review)
Gap filling: c(g) = max( c(g), median )
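The gap-filling rule on this slide, c(g) = max( c(g), median ), can be sketched in a few lines. This is only an illustration under stated assumptions: the slide does not say which set of values the median is taken over, so the sketch assumes it is the median abundance of the genes within a single pathway. The function name `gap_fill` and the gene identifiers are hypothetical placeholders, not HUMAnN's actual code or data.

```python
from statistics import median

def gap_fill(gene_abundances):
    """Raise each gene's abundance in one pathway to at least the pathway median.

    gene_abundances: dict mapping gene ID -> abundance c(g).
    Assumption: the median is taken over the genes of this one pathway.
    """
    if not gene_abundances:
        return {}
    med = median(gene_abundances.values())
    # c(g) = max(c(g), median): low or missing genes are lifted to the median,
    # genes already at or above the median are left unchanged.
    return {gene: max(count, med) for gene, count in gene_abundances.items()}

# Hypothetical pathway with two genes that look like gaps (0.0 and 1.0).
pathway = {"K00001": 12.0, "K00002": 0.0, "K00003": 9.0, "K00004": 15.0, "K00005": 1.0}
filled = gap_fill(pathway)
print(filled)
```

In this toy pathway the median is 9.0, so the two low genes are raised to 9.0 while the others keep their values. The effect is that one poorly detected gene no longer drags down an otherwise well-supported pathway.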

8 HUMAnN: Metabolic reconstruction
[Figure: two heatmaps, pathway coverage and pathway abundance, each pathways by samples, with samples grouped by body site: Vaginal, Skin, Nares, Gut, Oral (SupP), Oral (BM), Oral (TD)]

Specifically, HUMAnN produces two outputs for any metagenomic sample. It first makes pathway presence/absence or coverage calls, indicating in a binary way which microbial pathways are present in the community. It then also assigns relative abundances to the pathways that are there, resulting in that matrix I described of communities by pathways, with each cell representing one pathway’s abundance in one community. The pathway coverage matrix looks similar, with each pathway marked as 0 if it’s absent or 1 if it’s present, independent of abundance.

What you’re looking at here is an overview of the real HUMAnN results for about 700 HMP samples spanning these seven body sites, but it’s not the most informative view of the data.
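To make the shape of these two outputs concrete, here is a toy sketch of a samples-by-pathways abundance matrix and the matching 0/1 coverage matrix. It is not HUMAnN's calling procedure: the sketch simply assumes a pathway is "present" when its count is above zero, and the sample names, module IDs, and function names are all hypothetical.

```python
def relative_abundance(counts):
    """Normalize one sample's pathway counts so that they sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {pathway: 0.0 for pathway in counts}
    return {pathway: count / total for pathway, count in counts.items()}

def coverage_calls(counts):
    """Binary presence/absence per pathway (assumption: present if count > 0)."""
    return {pathway: 1 if count > 0 else 0 for pathway, count in counts.items()}

# Hypothetical per-sample pathway counts.
samples = {
    "gut_01": {"M00001": 30.0, "M00002": 10.0, "M00003": 0.0},
    "skin_01": {"M00001": 5.0, "M00002": 0.0, "M00003": 15.0},
}

# One row per sample, one column per pathway, for each of the two outputs.
abundance = {name: relative_abundance(counts) for name, counts in samples.items()}
coverage = {name: coverage_calls(counts) for name, counts in samples.items()}

for name in samples:
    print(name, abundance[name], coverage[name])
```

The coverage row says nothing about how abundant a pathway is, which is why the two matrices can tell different stories about the same samples.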

9 HUMAnN: Validating gene and pathway abundances on synthetic data
Validated on individual gene families, module coverage, and abundance
4 synthetic communities: low (20 org.) and high (100 org.) complexity, each with even and lognormal abundances
Best-BLAST-hit overshoots on false positives and undershoots real pathways as a result
HUMAnN FNs: short genes (<100bp), taxonomically rare pathways
HUMAnN FPs: large and multicopy (not many in bacteria)
Individual gene families: ρ=0.91
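A validation of this kind compares known abundances in a synthetic community with the abundances a pipeline recovers. The slide reports ρ=0.91 without naming the statistic; the sketch below assumes a Spearman rank correlation purely for illustration, and every function name and number in it is hypothetical. It will divide by zero if either input is constant.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical gene-family abundances: known (synthetic) vs. recovered.
expected = [0.10, 0.20, 0.30, 0.40]
observed = [0.15, 0.18, 0.35, 0.50]
rho = spearman(expected, observed)
print(round(rho, 3))
```

Here the recovered values preserve the ordering of the known ones, so the rank correlation is 1 up to floating-point error, even though the individual values differ.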

10 A portrait of the healthy human microbiome: Who’s there vs. what they’re doing
[Figure: relative abundance of phylotypes and relative abundance of pathways across samples, grouped by body site: Nares, Oral (BM), Vaginal, Skin, Gut, Oral (SupP), Oral (TD)]

Instead, consider this view that Dirk showed earlier of the taxa in a collection of the HMP samples, specifically the gut communities. Each color here represents a taxon, and as he described, there’s huge variation among individuals, with Bacteroides ranging from tremendously dominant to a minority organism. We can use this same view for the metabolic reconstruction data, where now each color represents a specific metabolic pathway. There’s much less variation in community function among individuals than there is in community composition: roughly the same pathways are present at the same abundances in each individual’s gut, regardless of variation in membership. This is true not only in the gut, but in all of the HMP’s body sites. The body sites themselves differ somewhat in metabolism, as I’ll discuss later, but again not nearly as dramatically as they differ in composition.

This is now an overview of most of the HMP’s 16S data and all 700 of our metabolic reconstructions, and it captures the important functional patterns in the human microbiome perhaps better than the previous slide. From this high-level perspective, though, I’d like to zoom in to a specific pathway or set of enzymes and talk about…

11 Niche specialization in human microbiome function
Metabolic modules in the KEGG functional catalog enriched at one or more body habitats Linking this back to the HMP’s metabolic reconstruction data, we used LEfSe to summarize pathways with significant differences in abundance between body sites, resulting in this set of pretty colors plotted on top of the KEGG functional hierarchy. So here, you’re looking at a tree of pathways and processes drawn from KEGG, in which each outermost leaf node represents an individual small metabolic module. Portions of the hierarchy are colored based on the body site where they’re most significantly enriched, so for example this yellow dot corresponds to an overabundance of spermidine biosynthesis in the GI tract, which is also fairly abundant throughout the oral cavity. You’ll notice that there’s a tremendous amount of niche specialization in the form of over- and under-abundant processes, which is one way of seeing metabolic function in the human microbiome adapting specifically to each niche throughout the body. This regulation of metabolic abundances is somewhat in contrast to the complete absence of pathways, however, in that relatively few processes tend to be completely absent from any body site. This outer ring, for example, shows just 24 pathways that show such a presence/absence pattern, such as type six secretion in the oral cavity, which is completely absent from other body sites. Such absences seem to be rare, though, with most functionality following an everything-is-everywhere pattern, and the environment selects for how abundant particular metabolic processes are based on their utility to the organisms in each habitat. Nicola Segata

12 Proteoglycan degradation by the gut microbiota
Glycosaminoglycans (Polysaccharide chains) AA core

13 Proteoglycan degradation: From pathways to enzymes
[Figure: enzyme relative abundance, scale from 10^-3 to 10^-8]
Heparan sulfate degradation missing due to the absence of heparanase, a eukaryotic enzyme
Other pathways not bottlenecked by individual genes
HUMAnN links microbiome-wide pathway reconstructions → site-specific pathways → individual gene families

14 Patterns of variation in human microbiome function by niche
Three main axes of variation
Eukaryotic + oxidative metabolism on exterior surfaces (skin/airways)
Low-complexity vaginal community
Unique carbohydrate metabolism in the gut
Oral habitats covary consistently, with significant differences on the non-mucosal tooth surface
These are only broad patterns: as seen above, every human-associated microbial habitat is functionally distinct!

15 Patterns of variation in human microbiome function by niche
Three main axes of variation
Eukaryotic exterior
Low-diversity vaginal
Gut metabolism
Oral vs. tooth hard surface
Only broad patterns: every human-associated habitat is functionally distinct!

16 How do microbes and function vary within each body site across the population?

17 How do body sites compare between individuals across the population?

18 HMP: Prevalence of species (OTUs) across the population
Cumulative prevalence

19 HMP: Prevalence of pathways across the population
16 (of 251) modules strongly “core” at 90%+ coverage in 90%+ individuals at 7 body sites
24 modules at 33%+ coverage
71 modules (28%) weakly “core” at 33%+ coverage in 66%+ individuals at 6+ body sites
Contrast zero phylotypes or OTUs meeting this threshold!
Only 24 modules (<10%) differentially covered by body site
Compare with 168 modules (>66%) differentially abundant by body site
[Figure: cumulative prevalence]
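The "core" definitions above are threshold rules, and a small sketch shows how such a rule could be applied. It is deliberately simplified: it handles a single body site only, whereas the slide also requires the criterion to hold at 7 (or 6+) body sites. The function name, module labels, and coverage values are hypothetical.

```python
def core_modules(coverage, min_coverage=0.9, min_prevalence=0.9):
    """Modules covered at >= min_coverage in >= min_prevalence of individuals.

    coverage: dict mapping individual -> {module: coverage in [0, 1]}.
    Simplification: one body site only; a module missing for an
    individual counts as coverage 0.
    """
    n = len(coverage)
    if n == 0:
        return []
    modules = set()
    for per_individual in coverage.values():
        modules.update(per_individual)
    core = []
    for module in sorted(modules):
        hits = sum(
            1 for per_individual in coverage.values()
            if per_individual.get(module, 0.0) >= min_coverage
        )
        if hits / n >= min_prevalence:
            core.append(module)
    return core

# Hypothetical coverage for 10 individuals at one body site.
toy = {}
for i in range(10):
    toy[f"subject_{i}"] = {"A": 1.0, "B": 0.95 if i < 9 else 0.2, "C": 0.5}

strong = core_modules(toy)  # 90%+ coverage in 90%+ of individuals
weak = core_modules(toy, min_coverage=0.33, min_prevalence=0.66)
print(strong, weak)
```

In this toy data, modules A and B pass the strict rule, while C passes only the looser 33% coverage, 66% prevalence rule. Loosening the thresholds enlarges the core set in the same way the slide's count grows from 16 to 71 modules.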

20 Linking function to community composition
[Figure: taxa and correlated metabolic pathways across 52 posterior fornix microbiomes. Labels, in extracted order: Lactobacillus crispatus; Phosphate and peptide transport; Lactobacillus jensenii; Sugar transport; Embden-Meyerhof glycolysis, phosphotransferases; Lactobacillus gasseri; Lactobacillus iners; F-type ATPase, THF; Gardnerella/Atopobium; AA and small molecule biosynthesis; Candida/Bifidobacterium; Eukaryotic pathways]
Plus ubiquitous pathways: transcription, translation, cell wall, portions of central carbon metabolism…

As Dirk mentioned, the vaginal community is particularly low-diversity, and it’s been observed by Jacques Ravel and Larry Forney in their HMP Demonstration Project to assemble into one of about five different states within each individual. Each state is dominated by just one or two specific organisms, and you can see that here in a view of these Lactobacilli and other organisms in the HMP’s 52 posterior fornix metagenomes. However, we’ve co-clustered these organisms’ abundances with that of microbial metabolism that’s also differentially abundant and specific to these community types: the vaginal community tends to adopt one of just a few possible states within each individual, dominated by one or two organisms that bring a specific set of unique metabolism to the community. These examples are, of course, complemented by processes that are present in every community type, including basic housekeeping and metabolism.

21 Linking communities to host phenotype
[Figure: normalized relative abundance of the top correlates with vaginal pH (posterior fornix), alongside the top correlates with Body Mass Index in stool]
Vaginal pH, community metabolism, and community composition represent a strong, direct link between phenotype and function in these data.

Finally, it’s important to emphasize that in this case, community structure and function also correlate with host phenotype in the form of vaginal pH. If you compare host vaginal pH in these 50 subjects to the abundance of specific organisms and metabolism, there’s a relatively large set of processes that are depleted in high-pH communities, plus the organism Lactobacillus crispatus. These are instead replaced by a collection of other organisms and processes that come to dominate high-pH communities. This is one of the strongest examples we’ve observed in the HMP of a direct correlation all the way from community membership to function to host phenotype, and I should emphasize the strength of this effect in comparison to other phenotypes of interest like BMI in the gut, where this is the same view of the best correlations we observe there.

22 Microbial biomolecular function and metabolism in the human microbiome: the story so far?
HUMAnN
Accurate metagenomic metabolic reconstruction
Sequences → genes → pathways → phenotypes
Validated on 4x synthetic communities
Who’s there varies even in health
What they’re doing doesn’t (as much)
There are patterns in this variation
Communities in related environments adapt using related functions
Function correlates with membership and phenotype
~1/3 to 2/3 of human metagenome characterized
Job security!

23 Ask both what you can do for your microbiome and what your microbiome can do for you
This, along with an egregious misquote of JFK, is the message I’d like to conclude with today. It’s important to remember that you and your microbiome interact constantly, and that its biomolecular functionality is an important complement to your own. The human genome is a fixed structure – we all carry the same genes for our entire lives. To draw a parallel with Waddington’s energy landscape, though, our microbiome and its functionality is fluid – it’s plastic, constantly responding to selection from its environment. That selection includes you, the landscape of microbial environments throughout the human body, and each of our unique diets and environments that shape and personalize the functionality of our microbiomes over the course of our lives. Waddington

24 Interested? We’re recruiting students and postdocs!
Thanks!
Human Microbiome Project
Dirk Gevers, George Weinstock, Owen White, Rob Knight, Makedonka Mitreva, Erica Sodergren, Mihai Pop, Vivien Bonazzi, Jane Peterson, Lita Proctor, Nicola Segata, Levi Waldron, Fah Sathira, Johannes Goll, Yuzhen Ye, Beltran Rodriguez-Mueller, Jeremy Zucker, Mathangi Thiagarajan, Brandi Cantarel, Qiandong Zeng, Maria Rivera, Barbara Methe, Bill Klimke, Daniel Haft, Sahar Abubucker, Jacques Izard
HMP Metabolic Reconstruction
Bruce Birren, Ramnik Xavier, Doyle Ward, Eric Alm, Ashlee Earl, Lisa Cosimi, Alyx Schubert, Pat Schloss, Ben Ganzfried, Vagheesh Narasimhan, Larisa Miropolsky

25

26 HMP: Prevalence of genera (phylotypes) across the population
Cumulative prevalence

