Functional profiling with HUMAnN2

1 Functional profiling with HUMAnN2
Eric Franzosa, Jason Lloyd-Price, Curtis Huttenhower, Galeb Abu-Ali, Ali Rahnavard
STAMPS 2017
Harvard T.H. Chan School of Public Health, Department of Biostatistics

2 The two big questions of microbial community profiling:
Who is there? (taxonomic profiling)
What are they doing? (functional profiling)
Like many great bioinformatics problems, answering these questions begins with sequence search!

3 HUMAnN2 for taxon-specific metagenome and metatranscriptome functional profiling
The relative abundance of gene family i in a metagenome is the number of reads j that map to a gene sequence in the family, weighted by the inverse p-value of each mapping and normalized by the average length of all gene sequences in the orthologous family.
Eric Franzosa, Lauren McIver
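The definition above can be sketched in a few lines of Python. This is an illustrative reading of the slide's wording, not HUMAnN2's actual code: the function name `family_abundance`, the input layout, and the final rescaling so that values sum to 1 are all assumptions made for the example.

```python
from collections import defaultdict

def family_abundance(hits, mean_length):
    """Toy version of the weighting described on the slide.

    hits        -- iterable of (family, p_value) pairs, one per read mapping
    mean_length -- dict: family -> average gene length in the family
    Both input layouts are hypothetical, chosen only for this sketch.
    """
    weighted = defaultdict(float)
    for family, p_value in hits:
        if p_value <= 0:
            raise ValueError("p-value must be positive to take its inverse")
        # Each mapping contributes the inverse of its p-value.
        weighted[family] += 1.0 / p_value
    # Normalize by the average gene length of the family.
    raw = {fam: w / mean_length[fam] for fam, w in weighted.items()}
    # Assumption: rescale so abundances sum to 1 ("relative" abundance).
    total = sum(raw.values())
    return {fam: value / total for fam, value in raw.items()}

example = family_abundance(
    hits=[("famA", 0.5), ("famA", 0.5), ("famB", 0.25)],
    mean_length={"famA": 1000, "famB": 2000},
)
```

In this toy input both families collect the same total weight, but famB's genes are twice as long on average, so it ends up with half the abundance of famA.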

4 HUMAnN2: stratified output
Gene families: total gene abundance (RPK)

UniRef gene cluster: Gene name                                 Abundance (RPK)
UniRef90_R6K3Z5: IMP dehydrogenase                             600.95
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_caccae          234.76
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_dorei           107.38
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_ovatus          92.18
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_stercoris       83.95
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_vulgatus        57.27
UniRef90_R6K3Z5: IMP dehydrogenase|unclassified                25.41

The total (~HUMAnN1) is the sum (Σ) of the per-species and unclassified stratifications.

Pathways: pathway abundance & coverage

MetaCyc pathway                                    Abundance   Coverage
PWY-7221: GTP biosynthesis                         200.35      1
PWY-7221: GTP biosynthesis|Bacteroides_caccae      120.23
PWY-7221: GTP biosynthesis|Bacteroides_dorei       11.12
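As a quick check of the stratified layout, the sketch below splits rows on the `|` separator and confirms that the per-species and unclassified values add up to the unstratified total. The rows are the values from the slide; the tab-separated layout and the helper name `split_stratified` are assumptions made for this example.

```python
def split_stratified(lines):
    """Separate unstratified totals from per-taxon stratifications.

    Each line is assumed to be 'feature<TAB>value'; a '|' in the feature
    marks a stratified row ('feature|taxon').
    """
    totals, strata = {}, {}
    for line in lines:
        feature, value = line.rsplit("\t", 1)
        if "|" in feature:
            name, taxon = feature.split("|", 1)
            strata.setdefault(name, {})[taxon] = float(value)
        else:
            totals[feature] = float(value)
    return totals, strata

rows = [
    "UniRef90_R6K3Z5: IMP dehydrogenase\t600.95",
    "UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_caccae\t234.76",
    "UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_dorei\t107.38",
    "UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_ovatus\t92.18",
    "UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_stercoris\t83.95",
    "UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_vulgatus\t57.27",
    "UniRef90_R6K3Z5: IMP dehydrogenase|unclassified\t25.41",
]

totals, strata = split_stratified(rows)
family = "UniRef90_R6K3Z5: IMP dehydrogenase"
stratified_sum = sum(strata[family].values())
# The stratified rows should account for the whole total (up to rounding).
consistent = abs(stratified_sum - totals[family]) < 0.01
```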

5 HUMAnN2 real-world performance
Applied HUMAnN2’s tiered search to profile >2K human metagenomes (HMP1-II, six major body sites).
~60% of reads align before translated search.
~15% more reads align during translated search (total ~75%).
The pangenome search tier (bowtie2 w/ sample-specific pangenome db) is 1-2 orders of magnitude faster than comprehensive translated search (DIAMOND w/ comprehensive protein db).
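The control flow behind a tiered search can be sketched as follows: reads are first tried against a fast nucleotide-level search, and only the reads left unaligned go on to the slower translated search. This is a conceptual toy, not HUMAnN2's implementation; `tiered_search` and the two lookup callables are hypothetical stand-ins for the real aligners.

```python
def tiered_search(reads, nucleotide_hit, translated_hit):
    """Assign reads to gene families in two tiers.

    reads          -- dict: read id -> sequence
    nucleotide_hit -- hypothetical callable: sequence -> family or None
    translated_hit -- hypothetical callable: sequence -> family or None
    """
    assignments = {}
    leftover = []
    # Tier 1: fast nucleotide-level search.
    for read_id, seq in reads.items():
        family = nucleotide_hit(seq)
        if family is None:
            leftover.append(read_id)
        else:
            assignments[read_id] = (family, "nucleotide")
    # Tier 2: slower translated search, only on reads that tier 1 missed.
    unaligned = []
    for read_id in leftover:
        family = translated_hit(reads[read_id])
        if family is None:
            unaligned.append(read_id)
        else:
            assignments[read_id] = (family, "translated")
    return assignments, unaligned

# Toy "databases": plain dictionaries standing in for real indexes.
pangenome_db = {"ACGT": "famA"}
protein_db = {"ACGT": "famA", "GGCC": "famB"}
toy_reads = {"r1": "ACGT", "r2": "GGCC", "r3": "TTTT"}

assignments, unaligned = tiered_search(
    toy_reads, pangenome_db.get, protein_db.get
)
```

Because the second tier only sees what the first tier could not place, the expensive search runs on a fraction of the input.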

6 And it works on non-human meta’omes, too
Luke Thompson

7 Quantifying the diversity of species contributing a function within and across subjects
A pathway’s contributional alpha-diversity is calculated from the distribution of taxa providing it (DNA or RNA) within a community; contributional beta-diversity is the corresponding comparison between communities.
[Figure: within-subject diversity (low to high) vs. between-subject diversity (low to high). Quadrants: simple, consistent (low within, low between); simple, variable (low within, high between); complex, consistent (high within, low between); complex, variable (high within, high between).]
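The slide does not name the diversity indices, so the sketch below assumes two common choices: the Gini-Simpson index for within-community (alpha) diversity and Bray-Curtis dissimilarity for between-community (beta) diversity. The function names are hypothetical, the first subject's values are the stratified PWY-7221 abundances shown earlier in the deck, and the second subject is invented for illustration.

```python
def gini_simpson(contributions):
    """Alpha diversity of one pathway's contributors in one community.

    contributions -- dict: taxon -> abundance of the pathway from that taxon
    Returns 0 when a single taxon contributes everything.
    """
    total = sum(contributions.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((v / total) ** 2 for v in contributions.values())

def bray_curtis(a, b):
    """Beta diversity between two communities' contributor profiles.

    Both profiles are rescaled to proportions first; each must have a
    positive total. Returns 0 for identical profiles, 1 for disjoint ones.
    """
    total_a, total_b = sum(a.values()), sum(b.values())
    taxa = set(a) | set(b)
    shared = sum(
        min(a.get(t, 0.0) / total_a, b.get(t, 0.0) / total_b) for t in taxa
    )
    return 1.0 - shared

subject_1 = {"Bacteroides_caccae": 120.23, "Bacteroides_dorei": 11.12}
subject_2 = {"Bacteroides_vulgatus": 50.0}  # invented for the example

alpha_1 = gini_simpson(subject_1)          # dominated by one species: low
alpha_2 = gini_simpson(subject_2)          # single contributor: zero
beta_12 = bray_curtis(subject_1, subject_2)  # no shared contributors
```

A pathway with low alpha in every subject but high beta between subjects would fall in the "simple, variable" quadrant of the figure.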

8 HUMAnN2 reveals unusual “relative expression” in paired metatranscriptomes & metagenomes
Sucrose degradation follows a complex attribution pattern across ~200 human gut metagenomes…
…but its expression can be dominated by a single species in paired gut metatranscriptomes!
In collaboration with the STARR Consortium & HPFS cohort

9 The “HMP2” IBD Multi’omics Data resource
With Ramnik Xavier

10 The IBD Multi’omics DataBase
Cesar Arze

11 The IBD metatranscriptome in the HMP2 IBDMDB
117 subjects: 59 Crohn’s disease, 34 ulcerative colitis, 24 non-IBD controls
Gender: 57 female, 59 male, 1 unknown
Cohorts: 32 MGH adult new onset, 30 Cedars-Sinai adult establ., 31 Cincinnati peds new onset, 11 Emory peds new onset, 13 MGH peds new onset
Melanie Schirmer

12 Different microbes can transcribe shared pathways
HISDEG-PWY: L-histidine degradation I
Histidine is an α-amino acid that is used in the biosynthesis of proteins.
A. putredinis has been implicated in IBD and is a major contributor to transcription in subsets of IBD patients.

13 PWY-7094: fatty acid salvage
Pathways can be contributed by different microbes over time.
Example: PWY-7094: fatty acid salvage, Faecalibacterium prausnitzii
Time-courses for individual patients: CD Patient 1, CD Patient 2

14 https://bitbucket.org/biobakery/biobakery/wiki/humann2
HUMAnN2 tutorial

16 HUMAnN2 synthetic evaluation (genes)
Synthetic human gut metagenome (top 20 species), staggered abundance (~0.1x to 100x coverage). Compare exp. vs. obs. gene abundance.
HUMAnN2 tiered search is more accurate (comprehensive search suffers from spurious hits)…
…and is ~3x faster: ~0.7 hours vs. ~2.1 hours (10M reads, 8 cores)…
...and provides accurate per-species quantification!

17 HUMAnN2 real-world performance

18 Considerations for paired metatranscriptomes & metagenomes
$ humann2_rna_dna_norm --input_dna <DNA genefamilies file> --input_rna <RNA genefamilies file> --output_basename <basename of the 3 output files>

Calculates RNA/DNA abundance ratios
Smooths the RNA and DNA abundances prior to taking the ratio
Also outputs smoothed RNA and DNA files

Example output:

UniRef90_R6K3Z5: IMP dehydrogenase                             2.02
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_caccae          5.96
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_dorei           3.82
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_ovatus          1.80
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_stercoris       0.87
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_vulgatus        0.34
UniRef90_R6K3Z5: IMP dehydrogenase|unclassified                1.96
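To see why smoothing matters before taking a ratio, here is a minimal sketch that adds a pseudocount to both abundances so that features missing from the DNA or RNA table do not produce division by zero. The additive pseudocount and the function name `rna_dna_ratio` are assumptions for illustration; the slide does not say which smoothing method the script uses, and this sketch ignores any sequencing-depth normalization.

```python
def rna_dna_ratio(rna, dna, pseudocount=1.0):
    """Smoothed RNA/DNA ratio per feature.

    rna, dna    -- dicts: feature -> abundance
    pseudocount -- assumed additive smoothing term (not the tool's default)
    Features present in only one table are treated as 0 in the other.
    """
    features = set(rna) | set(dna)
    return {
        f: (rna.get(f, 0.0) + pseudocount) / (dna.get(f, 0.0) + pseudocount)
        for f in features
    }

# Toy abundances, invented for the example.
toy_rna = {"geneA": 9.0, "geneB": 0.0, "geneC": 3.0}
toy_dna = {"geneA": 4.0, "geneB": 1.0}

ratios = rna_dna_ratio(toy_rna, toy_dna)
```

A ratio above 1 suggests a feature is transcribed more than its gene copy number alone would predict, and a ratio below 1 suggests the opposite.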

