
1 Update on HTProcess Apps Sciplant May 8, 2014

2 HTProcessPipeline Purpose
– Provide a more functional set of commonly needed applications for RNASeq and Genome Assembly
– Provide tools that allow bio-scientists to spend more time considering the science of their data analysis path, and less time mousing, clicking, and typing
– Key attributes: pipeline analysis environment, documentation of the analysis, smart information management (including metadata)

3 Current Active List

4 HTProcess_fastqc-0.1
Creates the main HTProcess directory of read files
Creates a manifest file to describe the reads in a library of sequencing files
Runs FastQC on each read file. For paired read files, tests for proper pairing of reads
Takes in up to 3 different folders of reads: left reads, right reads, and unpaired reads
Prepares a single report that opens in the user's browser when clicked
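The slides do not say how the app tests for proper pairing. One common approach, sketched here as an assumption rather than as the app's actual method, is to walk both FASTQ files in step and compare read IDs. The function names are hypothetical, and the sketch assumes plain 4-line FASTQ records.

```python
import io
from itertools import zip_longest


def read_ids(handle):
    """Yield the read ID of each 4-line FASTQ record, minus any /1 or /2 suffix."""
    for index, line in enumerate(handle):
        if index % 4 != 0:
            continue  # only the first line of each record is a header
        fields = line[1:].split()  # drop the leading "@", keep the first token
        if not fields:
            continue  # tolerate a blank trailing line
        name = fields[0]
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        yield name


def properly_paired(left, right):
    """True if both files hold the same read IDs in the same order."""
    # zip_longest pads the shorter file with None, so a length mismatch fails.
    return all(a == b for a, b in zip_longest(read_ids(left), read_ids(right)))


# Tiny in-memory examples (made-up reads, not data from the slides).
LEFT = "@r1/1\nACGT\n+\nIIII\n@r2/1\nGGCC\n+\nIIII\n"
RIGHT = "@r1/2\nTTAA\n+\nIIII\n@r2/2\nCCGG\n+\nIIII\n"
RIGHT_SHUFFLED = "@r2/2\nCCGG\n+\nIIII\n@r1/2\nTTAA\n+\nIIII\n"

print(properly_paired(io.StringIO(LEFT), io.StringIO(RIGHT)))
print(properly_paired(io.StringIO(LEFT), io.StringIO(RIGHT_SHUFFLED)))
```

Counting lines in groups of four, rather than looking for "@", avoids mistaking a quality line that starts with "@" for a header.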

5 HTProcess_fastqc-0.1 Example: fastqc_summary.html

6 HTProcess_Reads

7

8 HTProcess.log
HTPROCESS1 Mon May 5 15:38:12 MST 2014
fastqc is finished testing 2 files in the first paired read directory.
fastqc is finished testing 2 files in the second paired read directory.
fastqc is finished testing 1 files in the directory for single reads.
Reads1 and Reads2 have the same number of files. Testing for valid pairing.
SRR566981.sra_1.fastq,SRR566981.sra_2.fastq properly ordered
SRR567164.sra_1.fastq,SRR567164.sra_2.fastq properly ordered
All Trim settings have been set to trim settings 1. Edit them on manifest_file.txt to customize trimming.
Starting creation of summary file for FASTQC reports
First Phase of HTPROCESS1 FINISHED Mon May 5 15:46:07 MST 2014
The summary file for all the FASTQC reports has been created.
HTPROCESS-FASTQC FINISHED Mon May 5 15:46:40 MST 2014

9 Manifest File - example
HTProcess1_Reads
......................................................
Library_name=testfiles
Library_num=1
condition=testing
pairing=paired_and_unpaired
pair_spacing=400
pair_sd=35
pair_type=fragment
encoding_SRR566981.sra_1.fastq=1.5
max_len_SRR566981.sra_1.fastq=78
encoding_SRR567164.sra_1.fastq=1.5
max_len_SRR567164.sra_1.fastq=76
encoding_SRR566981.sra_2.fastq=1.5
max_len_SRR566981.sra_2.fastq=78
encoding_SRR567164.sra_2.fastq=1.5
max_len_SRR567164.sra_2.fastq=76
encoding_fragSc_1.fq=1.9
max_len_fragSc_1.fq=101
library_max=101
Paired Reads
!PPP SRR566981.sra_1.fastq,SRR566981.sra_2.fastq
!PPP SRR567164.sra_1.fastq,SRR567164.sra_2.fastq
Reads1
!XXX SRR566981.sra_1.fastq
!XXX SRR567164.sra_1.fastq
Reads2
!YYY SRR566981.sra_2.fastq
!YYY SRR567164.sra_2.fastq
ReadsS
!ZZZ fragSc_1.fq
......................................................
!!!TRIM SETTINGS!!!
......................................................
!PairTrim 1 SRR566981.sra_1.fastq,SRR566981.sra_2.fastq
!PairTrim 1 SRR567164.sra_1.fastq,SRR567164.sra_2.fastq
!SingleTrim 1 fragSc_1.fq
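The manifest mixes `key=value` settings, `!TAG` entries, dotted separators, and plain headings. The sketch below shows one way such a file could be read. It assumes one entry per line (the slide text does not preserve line breaks), and `parse_manifest` is a hypothetical helper, not part of the HTProcess apps.

```python
from collections import defaultdict


def parse_manifest(text):
    """Split manifest text into key=value settings and '!TAG ...' entries."""
    settings = {}
    tagged = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or set(line) == {"."}:
            continue  # blank line or dotted separator
        if line.startswith("!!!"):
            continue  # section banner such as "!!!TRIM SETTINGS!!!"
        if line.startswith("!"):
            tag, _, rest = line[1:].partition(" ")
            # Fields are separated by spaces and/or commas.
            tagged[tag].append(rest.replace(",", " ").split())
        elif "=" in line:
            key, _, value = line.partition("=")
            settings[key] = value
        # Anything else ("Paired Reads", "Reads1", ...) is a plain heading.
    return settings, dict(tagged)


# Shortened excerpt of the manifest shown on the slide.
SAMPLE = """\
HTProcess1_Reads
......
Library_name=testfiles
pairing=paired_and_unpaired
encoding_fragSc_1.fq=1.9
Paired Reads
!PPP SRR566981.sra_1.fastq,SRR566981.sra_2.fastq
ReadsS
!ZZZ fragSc_1.fq
......
!!!TRIM SETTINGS!!!
......
!PairTrim 1 SRR566981.sra_1.fastq,SRR566981.sra_2.fastq
!SingleTrim 1 fragSc_1.fq
"""

settings, tagged = parse_manifest(SAMPLE)
print(settings["Library_name"])
print(tagged["PairTrim"])
```

Keeping the tag (`PPP`, `PairTrim`, and so on) as the dictionary key lets a downstream step ask for exactly the class of reads it needs.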

10 Apps for creating the input directories, and for creating them and running HTProcess_fastqc

11 HTProcess_trimmomatic_0.32
Trimmomatic is a multi-function trimmer for paired or unpaired reads
Basic trimming of a given number of bases on either end
Removes contaminants that match sequences given by the user in a separate fasta file
– e.g. adapter, primer sequences
2 different methods for quality trimming
Allows for 2 different programs or sets of settings to be used with the reads in a library
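To make these features concrete, here is a hedged sketch of how a paired-end Trimmomatic command line could be assembled. The `PE` mode, the `-phred33`/`-phred64` flags, and the `HEADCROP`, `ILLUMINACLIP`, `SLIDINGWINDOW`, and `MINLEN` steps are standard Trimmomatic options. The rest is assumed for illustration: the mapping from the manifest's encoding values, the `TrmU1_`/`TrmU2_` names for unpaired output, the jar path, the adapter file name, and the step values. How the app itself builds its command is not shown in the slides.

```python
def phred_flag(encoding):
    """Map a manifest encoding value to a Trimmomatic quality flag (assumed mapping)."""
    flags = {"1.5": "-phred64", "1.9": "-phred33"}  # Illumina 1.5 vs Sanger/Illumina 1.9
    if encoding not in flags:
        raise ValueError(f"unrecognised encoding: {encoding!r}")
    return flags[encoding]


def trimmomatic_pe_command(jar, left, right, encoding, steps):
    """Build the argument list for one paired-end Trimmomatic run."""
    return [
        "java", "-jar", jar, "PE", phred_flag(encoding),
        left, right,
        f"TrmPr1_{left}",   # surviving left mates (prefix as in the manifest)
        f"TrmU1_{left}",    # left reads whose mate was dropped (hypothetical name)
        f"TrmPr2_{right}",  # surviving right mates
        f"TrmU2_{right}",   # right reads whose mate was dropped (hypothetical name)
        *steps,
    ]


# Illustrative settings only; the values are not taken from the slides.
STEPS = [
    "HEADCROP:5",                       # basic trimming from the start of each read
    "ILLUMINACLIP:adapters.fa:2:30:10", # remove user-supplied adapter/primer sequences
    "SLIDINGWINDOW:4:15",               # one of the quality-trimming methods
    "MINLEN:36",                        # drop reads that end up too short
]

cmd = trimmomatic_pe_command(
    "trimmomatic-0.32.jar",
    "SRR566981.sra_1.fastq",
    "SRR566981.sra_2.fastq",
    "1.5",
    STEPS,
)
print(" ".join(cmd))
```

Returning a list rather than a single string means the command can be passed directly to `subprocess.run` without shell quoting issues.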

12 Manifest File - example
HTProcess1_Reads
......................................................
Library_name=testfiles
Library_num=1
condition=testing
pairing=paired_and_unpaired
pair_spacing=400
pair_sd=35
pair_type=fragment
encoding_SRR566981.sra_1.fastq=1.5
max_len_SRR566981.sra_1.fastq=78
encoding_SRR567164.sra_1.fastq=1.5
max_len_SRR567164.sra_1.fastq=76
encoding_SRR566981.sra_2.fastq=1.5
max_len_SRR566981.sra_2.fastq=78
encoding_SRR567164.sra_2.fastq=1.5
max_len_SRR567164.sra_2.fastq=76
encoding_fragSc_1.fq=1.9
max_len_fragSc_1.fq=101
library_max=101
Paired Reads
!PPP SRR566981.sra_1.fastq,SRR566981.sra_2.fastq
!PPP SRR567164.sra_1.fastq,SRR567164.sra_2.fastq
Reads1
!XXX SRR566981.sra_1.fastq
!XXX SRR567164.sra_1.fastq
Reads2
!YYY SRR566981.sra_2.fastq
!YYY SRR567164.sra_2.fastq
ReadsS
!ZZZ fragSc_1.fq
......................................................
!!!TRIM SETTINGS!!!
......................................................
!PairTrim 1 SRR566981.sra_1.fastq,SRR566981.sra_2.fastq
!PairTrim 1 SRR567164.sra_1.fastq,SRR567164.sra_2.fastq
!SingleTrim 1 fragSc_1.fq
Slide note on the trim settings: Change to 2 to use a separate program to trim the reads in this file!

13 Inputs for HTProcess_trimmomatic_0.32

14 Settings for trimmomatic

15 Output Files for HTProcess_trimmomatic_0.32

16 Combined unpaired reads for the entire library

17 Output Files for HTProcess_trimmomatic_0.32
Individual single read files for those who want to run all reads in a single library

18 Manifest File
HTProcess1_Reads
......................................................
Library_name=testfiles
Library_num=1
condition=testing
pairing=paired_and_unpaired
pair_spacing=400
pair_sd=35
pair_type=fragment
encoding_SRR566981.sra_1.fastq=1.5
max_len_SRR566981.sra_1.fastq=78
encoding_SRR567164.sra_1.fastq=1.5
max_len_SRR567164.sra_1.fastq=76
encoding_SRR566981.sra_2.fastq=1.5
max_len_SRR566981.sra_2.fastq=78
encoding_SRR567164.sra_2.fastq=1.5
max_len_SRR567164.sra_2.fastq=76
encoding_fragSc_1.fq=1.9
max_len_fragSc_1.fq=101
library_max=101
Paired Reads
!PPP SRR566981.sra_1.fastq,SRR566981.sra_2.fastq
!PPP SRR567164.sra_1.fastq,SRR567164.sra_2.fastq
Reads1
!XXX SRR566981.sra_1.fastq
!XXX SRR567164.sra_1.fastq
Reads2
!YYY SRR566981.sra_2.fastq
!YYY SRR567164.sra_2.fastq
ReadsS
!ZZZ fragSc_1.fq
......................................................
!!!TRIM SETTINGS!!!
......................................................
!PairTrim 1 SRR566981.sra_1.fastq,SRR566981.sra_2.fastq
!PairTrim 1 SRR567164.sra_1.fastq,SRR567164.sra_2.fastq
!SingleTrim 1 fragSc_1.fq
......................................................
!!!TRIMMED READS!!!
......................................................
!TRIMMED_Pr TrmPr1_SRR566981.sra_1.fastq,TrmPr2_SRR566981.sra_2.fastq
!TRIMMED_Pr TrmPr1_SRR567164.sra_1.fastq,TrmPr2_SRR567164.sra_2.fastq
!TRIMMED_S TrmS_testfiles.fastq
......................................................
!!!TRIMMED ORPHAN AND INDIVIDUAL SINGLES!!!
......................................................
Not used for normal analysis with a completely uniform library
......................................................
!TRIMMED_OS TrmSos_SRR566981.sra_1.fastq
!TRIMMED_OS TrmSos_SRR567164.sra_1.fastq
!TRIMMED_OS TrmSos_fragSc_1.fq

19 Manifest File
!!!TRIM SETTINGS!!!
......................................................
!PairTrim 1 SRR566981.sra_1.fastq,SRR566981.sra_2.fastq
!PairTrim 1 SRR567164.sra_1.fastq,SRR567164.sra_2.fastq
!SingleTrim 1 fragSc_1.fq
......................................................
!!!TRIMMED READS!!!
......................................................
!TRIMMED_Pr TrmPr1_SRR566981.sra_1.fastq,TrmPr2_SRR566981.sra_2.fastq
!TRIMMED_Pr TrmPr1_SRR567164.sra_1.fastq,TrmPr2_SRR567164.sra_2.fastq
!TRIMMED_S TrmS_testfiles.fastq
......................................................
!!!TRIMMED ORPHAN AND INDIVIDUAL SINGLES!!!
......................................................
Not used for normal analysis with a completely uniform library
......................................................
!TRIMMED_OS TrmSos_SRR566981.sra_1.fastq
!TRIMMED_OS TrmSos_SRR567164.sra_1.fastq
!TRIMMED_OS TrmSos_fragSc_1.fq
Keep track of which reads are to be used for which analysis path with the entries in the manifest file
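The trimmed-read entries in this manifest appear to follow a naming pattern: `TrmPr1_`/`TrmPr2_` for surviving pairs, `TrmS_<Library_name>.fastq` for the combined singles, and `TrmSos_` for orphan and individual single files. The sketch below reproduces the entries shown on the slide from that pattern. The rule is inferred from this single example, and `trimmed_entries` is a hypothetical helper, so treat it as illustration only.

```python
def trimmed_entries(library_name, pairs, singles):
    """Return manifest lines for trimmed reads, following the pattern seen on the slide."""
    lines = []
    for left, right in pairs:
        lines.append(f"!TRIMMED_Pr TrmPr1_{left},TrmPr2_{right}")
    # One combined file of unpaired reads for the whole library.
    lines.append(f"!TRIMMED_S TrmS_{library_name}.fastq")
    # Orphan files are named after the left file of each pair (as in the example),
    # followed by the individually trimmed single-read files.
    for left, _right in pairs:
        lines.append(f"!TRIMMED_OS TrmSos_{left}")
    for single in singles:
        lines.append(f"!TRIMMED_OS TrmSos_{single}")
    return lines


entries = trimmed_entries(
    "testfiles",
    [
        ("SRR566981.sra_1.fastq", "SRR566981.sra_2.fastq"),
        ("SRR567164.sra_1.fastq", "SRR567164.sra_2.fastq"),
    ],
    ["fragSc_1.fq"],
)
print("\n".join(entries))
```

A downstream app working on a uniform library would then read only the `!TRIMMED_Pr` and `!TRIMMED_S` lines and ignore the `!TRIMMED_OS` ones.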

20 HTProcess_tophat-2.0.11
Nearly finished
Produces BAM files for all trimmed reads
Will also produce a merged BAM file to reflect the whole library

21 Manifest file vs Metadata
In the future, if metadata can be read and written by an app, then:
– The manifest file could be replaced by metadata
– The manifest file could be populated by metadata
– The metadata could be populated by the app, but the manifest file could be created, too, for a more portable list of files used

22 Mobile/Tablet Use?
The HTProcess apps are written, in part, with the idea that tablet/touchscreen interfaces may be better supported by the DE
HTProcess apps may work within a more pipeline-oriented interface within the DE or a separate/related interface

23 Additional Apps
HTProcess_Kmergenie
– Analyze kmer coverage of reads
HTProcess_Cufflinks
– If I have time
RSEM (not HTProcess)
Updates of older apps

