S1 Supporting information Bioinformatic workflow and quality of the metrics Number of slides: 10.


1 S1 Supporting information Bioinformatic workflow and quality of the metrics
Number of slides: 10

2 RNA-seq: BIOINFORMATIC PIPELINE
cDNA library construction: PolyA, single strand, 50 nucleotides
Sequencing: HiSeq2000 (Illumina), SBS technique

QUALITY CHECK
FastQ groomer (v1.0.4*): Fastqsanger format check
FASTX-Toolkit (v1.0.0*): (1) quality statistics, (2) quality score boxplot, (3) nucleotide distribution chart

MAPPED READS
Tophat (v2.0.9) with the aligner Bowtie (v ): mapping on S. mansoni genome version 5.2, giving a BAM file
BAM-to-SAM (v0.1.18): conversion of the BAM file to SAM
Filter SAM (v1.0.0*): deletion of unmapped reads

SAMPLING
Pick Random (v1.0.0*): standardization of read number to analyse

DE NOVO TRANSCRIPTOME ASSEMBLY
Cufflinks (v2.1.1): exon-intron structures
Cuffmerge (v.1.0.0): merger of the 21 exon-intron structures
Result: new S. mansoni transcriptome, used as reference

IDENTIFICATION OF SEX-BIASED GENES
Htseq (v0.6.1p1): quantification of read abundance
DEseq (v1.12.1): assessment of differences in gene expression
Result: sex-biased genes per developmental stage

FUNCTIONAL ANALYSES
Blast2GO (v2.6.4): male- and female-specific biological pathways through the S. mansoni life cycle
Blastx (v2.2.30) / AmiGO (v1.8) / GeneDB: de novo annotation of the 100 best sex-biased genes per stage

Cluster analysis
Cluster 3.0 / JavaTreeView (v1.1.6r4): construction of functional clusters; expression variation of candidate genes through the S. mansoni life cycle

STRUCTURAL ANALYSES
IGV (v2.3.16), Blat (v34), Cuffcompare (v2.2.1): exon/intron structure, genome v5.2 vs de novo transcriptome

* Galaxy tool version
S3 – 2/10
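The "deletion of unmapped reads" and "standardization of read number" steps of the pipeline can be illustrated with a minimal Python sketch. It is not the Galaxy Filter SAM or Pick Random implementation used on the slide: it only assumes that unmapped reads are the SAM records whose FLAG field has the 0x4 bit set (as defined in the SAM specification) and that standardization means drawing the same number of reads at random from each sample. The function names, the seed and the toy records are hypothetical.

```python
import random

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def drop_unmapped(sam_lines):
    """Keep header lines (starting with '@') and records that are mapped."""
    kept = []
    for line in sam_lines:
        if line.startswith("@"):
            kept.append(line)
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if not flag & UNMAPPED_FLAG:
            kept.append(line)
    return kept


def subsample(records, n, seed=0):
    """Draw n records at random without replacement (all of them if n is too large)."""
    if n >= len(records):
        return list(records)
    return random.Random(seed).sample(records, n)


# Toy SAM content (invented): one header line, two mapped reads, one unmapped read.
toy_sam = [
    "@HD\tVN:1.0\tSO:unsorted",
    "r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "r3\t16\tchr1\t200\t60\t50M\t*\t0\t0\t*\t*",
]

mapped = drop_unmapped(toy_sam)
mapped_records = [line for line in mapped if not line.startswith("@")]
sampled = subsample(mapped_records, 1, seed=1)
```

Fixing the seed makes the random draw repeatable, which matters when the same read number has to be reproduced across samples.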

3 RNA-seq: QUALITY OF THE METRICS

4 RNA-seq: REPLICATE CLUSTERING
Figure: clustering of replicates, one panel per stage (Cercariae, Schistosomula s#1, Schistosomula s#2, Schistosomula s#3, Adult worms). Samples in each panel: ♀1 and ♀2 (female duplicates), ♂1 and ♂2 (male duplicates). Scale: 100% to 0% identity. DESeq package (v1.12.1).

5 RNA-seq: HEATMAPS (100 best P-values per stage)
Figure: heatmaps, one panel per stage (Cercariae, Schistosomula s#1, Schistosomula s#2, Schistosomula s#3, Adult worms). Samples in each panel: ♂1, ♂2, ♀1, ♀2. DESeq package (v1.12.1).
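The selection named in the slide title (the 100 best P-values per stage) amounts to ranking genes by P-value and keeping the first 100. A minimal sketch follows; the slide does not say whether raw or adjusted P-values were ranked, and the function name, gene identifiers and values below are hypothetical.

```python
def best_by_pvalue(pvalues, n=100):
    """Return the n gene IDs with the smallest P-values, best first."""
    ranked = sorted(pvalues.items(), key=lambda item: item[1])
    return [gene for gene, _ in ranked[:n]]


# Invented P-values for one stage; real input would come from the DESeq output.
toy_stage = {"gene_a": 0.20, "gene_b": 0.001, "gene_c": 0.03, "gene_d": 0.5}
top2 = best_by_pvalue(toy_stage, n=2)
```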

6 Proportion of categories
Quality analysis of the de novo transcriptome
Description of the type of matches between the Cufflinks transcripts (XLOC) and the reference transcripts (Smp_ID v5.2)

Each line gives the type of match, the number of XLOC and the proportion of XLOC. Lines are grouped by match category, followed by the proportion of the category where the slide gives it.

Smp overlap (27.86%)
  Complete match of intron chain: 6642 (19.11%)
  Contained: 113 (0.33%)
  Potentially novel isoform (fragment), at least one splice junction is shared with a reference transcript: 2417 (6.95%)
  Single exon transfrag overlapping a reference exon and at least 10 bp of a reference intron, indicating a possible pre-mRNA fragment: 272 (0.78%)
  Generic exonic overlap with a reference transcript: 239 (0.69%)

Intronic transcript
  A transfrag falling entirely within a reference intron: 6613 (19.03%)

Others (7.72%)
  Possible polymerase run-on fragment (within 2 kb of a reference transcript): 1466 (4.22%)
  Exonic overlap with reference on the opposite strand: 1193 (3.43%)
  An intron of the transfrag overlaps a reference intron on the opposite strand (likely due to read mapping errors): 24 (0.07%)

Intergenic
  Unknown, intergenic transcript: 15776 (45.39%)

No category (-)
  Repeat. Currently determined by looking at the soft-masked reference sequence and applied to transcripts where at least 50% of the bases are lower case: 0.00%
  (.tracking file only, indicates multiple classifications)

Cuffcompare (Cufflinks v2.2.1)
*Source of the S. mansoni genome reference: ftp://ftp.sanger.ac.uk/pub/pathogens/Schistosoma/mansoni/Latest_assembly_annotation_others/add_utrs.gff
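The proportions on this slide can be recomputed from the XLOC counts. The sketch below assumes that the total is the sum of the ten counts listed on the slide and that the repeat class, shown as 0.00%, contributes no transcripts; the variable and function names are hypothetical.

```python
# XLOC counts per match type, grouped by match category, as listed on the slide.
counts = {
    "Smp overlap": {
        "complete match of intron chain": 6642,
        "contained": 113,
        "potentially novel isoform": 2417,
        "possible pre-mRNA fragment": 272,
        "generic exonic overlap": 239,
    },
    "Intronic transcript": {"within a reference intron": 6613},
    "Others": {
        "possible polymerase run-on fragment": 1466,
        "exonic overlap on the opposite strand": 1193,
        "intron overlap on the opposite strand": 24,
    },
    "Intergenic": {"unknown, intergenic transcript": 15776},
}

# Assumption: no XLOC outside these ten classes.
total = sum(n for rows in counts.values() for n in rows.values())


def pct(n):
    """Percentage of all XLOC, rounded to two decimals as on the slide."""
    return round(100 * n / total, 2)


row_pct = {name: pct(n) for rows in counts.values() for name, n in rows.items()}
category_pct = {cat: pct(sum(rows.values())) for cat, rows in counts.items()}
```

Under these assumptions the recomputed category proportions add up to 100%, which is a quick way to check that no row of the flattened table was lost.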

7 ChIP-seq: QUALITY OF THE METRICS
Samples (columns): Male adults, Female adults, Male cercariae, Female cercariae; for each group: Unbound_1, Unbound_2, H3K27Me3_1, H3K27Me3_2
Rows on the slide: Raw data; Groomed; QC passed; Aligned (=1); Aligned (=1) %; Aligned (>1); Aligned (>1) %; Total mapping %; Used for peak calling; Number of peaks

Values available per row, in slide order:
Aligned (=1) %: 53.40%, 50.69%, 56.29%, 53.17%, 47.07%, 49.55%, 48.14%, 50.83%, 54.15%, 49.83%, 59.74%, 56.70%, 43.18%, 48.00%, 49.05%, 52.95%
Aligned (>1) %: 42.33%, 45.89%, 39.21%, 42.74%, 46.01%, 45.67%, 43.25%, 42.45%, 46.46%, 36.18%, 39.57%, 52.27%, 47.85%, 46.43%, 42.11%
Total mapping %: 95.72%, 96.59%, 95.51%, 95.91%, 95.07%, 95.56%, 93.81%, 94.07%, 96.60%, 96.29%, 95.92%, 96.27%, 95.45%, 95.85%, 95.48%, 95.06%
Number of peaks: N.A., 8363, 6947, 14302, 5697, 7116, 11382, 5044, 4719

8 ChIP-seq: REPLICATE CONSISTENCY OF EPICHIP ANALYSIS
Figure: H3K27Me3 enrichment around the TSS, two panels. x-axis: position on the gene (bases); y-axis: H3K27Me3 enrichment.
Cercariae panel: Male cercariae, replicates 1 and 2; Female cercariae, replicates 1, 2 and 3.
Adults panel: Male adults, replicates 1 and 2; Female adults, replicates 1, 2 and 3.
TSS = Transcription Start Site

9 ChIP-seq: Male and Female H3K27Me3 enrichments, depending on the developmental stages.
Figure, two panels. A: Males (curves: Cercariae, Adults). B: Females (curves: Cercariae, Adults). x-axis: position on the gene (bases), from -1000 to +5000, with 0 = TSS; y-axis ticks from 0.5 to 1.
TSS = Transcription Start Site

10 Statistical test of EpiChIP profile differences
Comparison                  Extreme differences   Z value   p
Comparison between sexes
  Cercariae                 76.50%                41.91     <0.001
  Adults                    10.70%                5.862
Comparison between stages
  Males                     56.30%                30.811
  Females                   25.60%                13.997

All pairwise comparisons were significant (Kolmogorov-Smirnov two-sample tests; p<0.001). The extreme differences given by the Kolmogorov-Smirnov two-sample tests show that:
(i) The difference between the adult male and adult female distributions is low (10.7% of maximum difference) compared with the cercarial stage (76.5% of maximum difference).
(ii) The change in chromatin structure from cercariae to adults is about twice as large in males as in females (56.3% of maximum difference for males vs 25.6% for females).
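The slide reports an extreme difference and a Z value for each Kolmogorov-Smirnov two-sample test. A minimal pure-Python sketch of both quantities follows. It assumes that the extreme difference is the largest gap between the two empirical cumulative distributions and that Z is this gap multiplied by sqrt(n*m/(n+m)), where n and m are the two sample sizes. The slide does not state which software or formula produced its values, and the function name and toy samples are hypothetical.

```python
from bisect import bisect_right
from math import sqrt


def ks_two_sample(a, b):
    """Return (D, Z): the largest ECDF gap and D * sqrt(n*m/(n+m))."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    n, m = len(a_sorted), len(b_sorted)
    d = 0.0
    # The largest gap between two step functions is reached at an observed value.
    for x in a_sorted + b_sorted:
        fa = bisect_right(a_sorted, x) / n  # fraction of a that is <= x
        fb = bisect_right(b_sorted, x) / m  # fraction of b that is <= x
        d = max(d, abs(fa - fb))
    return d, d * sqrt(n * m / (n + m))


# Invented samples standing in for two enrichment profiles.
d, z = ks_two_sample([1, 2, 3, 4], [3, 4, 5, 6])
```

An extreme difference of 1 (100%) means the two samples do not overlap at all, and 0 means their empirical distributions are identical, which is the scale on which the percentages in the table are read.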

