Call SNPs & Infer Phylogeny (CSI Phylogeny)


1 Call SNPs & Infer Phylogeny (CSI Phylogeny)
Rolf Sommer Kaas

2 Single Nucleotide Polymorphism
Assumption: SNPs are random and independent.
Find differences (SNP calling) compared to a reference sequence. A close reference is better.
Make a pseudo-alignment (independence assumption).
Infer phylogeny.
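As a rough illustration of the pseudo-alignment idea, the sketch below substitutes each sample's called SNPs into a copy of the reference, so that every sample ends up with a sequence of the same length. The function name, the toy reference, and the SNP dictionaries are hypothetical and are not taken from CSI Phylogeny itself.

```python
def pseudo_sequence(reference, snps):
    """Return a copy of the reference with the called SNP bases substituted.

    `snps` maps 0-based reference positions to the called base.
    """
    seq = list(reference)
    for pos, base in snps.items():
        seq[pos] = base
    return "".join(seq)


reference = "ACGTACGTAC"            # toy reference (assumption)
samples = {
    "sample_A": {2: "T", 7: "C"},   # hypothetical SNP calls
    "sample_B": {2: "T"},
}

# Every sample gets a sequence as long as the reference: a pseudo-alignment.
alignment = {name: pseudo_sequence(reference, snps) for name, snps in samples.items()}
for name, seq in alignment.items():
    print(name, seq)
```

Because each position is treated independently, no real multiple alignment step is needed; columns line up by reference coordinate.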

3 CSI Phylogeny – SNP calling
Raw reads:
Map reads to reference (BWA)
Call all possible SNPs (Samtools)
Filter positions and SNPs using coverage, quality, and z-score, putting ignored SNPs in an “ignored” file. Z-score = ()/()
Prune SNPs
Determine coverage along reference > coverage_file.gz
Output: VCF file (filtered/flt)

Assembled sequence:
NUCMER (part of MUMmer)
Filter based on length from end + pruning
Output: VCF file (filtered/flt)
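The filtering and pruning steps for raw reads might look roughly like the sketch below. It is not the pipeline's code: the `Snp` record, the function names, and the toy calls are hypothetical, the z-score test is left out because its formula is not given here, and the pruning rule (drop any SNP that has a neighbour closer than a given number of positions) is an assumption about what pruning means. The thresholds are the default depth and the recommended quality values listed under the run options.

```python
from typing import NamedTuple


class Snp(NamedTuple):
    pos: int         # position on the reference
    depth: int       # read depth at the position
    qual: float      # SNP quality
    map_qual: float  # mapping quality


def passes_filters(snp, min_depth=10, min_qual=30, min_map_qual=25):
    """Depth and quality thresholds only; no z-score test in this sketch."""
    return (snp.depth >= min_depth
            and snp.qual >= min_qual
            and snp.map_qual >= min_map_qual)


def prune(snps, distance=10):
    """Assumed pruning rule: drop SNPs with a neighbour closer than `distance`."""
    snps = sorted(snps, key=lambda s: s.pos)
    kept = []
    for i, snp in enumerate(snps):
        near_prev = i > 0 and snp.pos - snps[i - 1].pos < distance
        near_next = i + 1 < len(snps) and snps[i + 1].pos - snp.pos < distance
        if not (near_prev or near_next):
            kept.append(snp)
    return kept


def filter_snps(snps, prune_distance=10, **thresholds):
    """Return (kept, ignored), mirroring the 'ignored' file idea."""
    passed = [s for s in snps if passes_filters(s, **thresholds)]
    ignored = [s for s in snps if not passes_filters(s, **thresholds)]
    kept = prune(passed, prune_distance)
    ignored += [s for s in passed if s not in kept]
    return kept, ignored


# Hypothetical calls: two clustered SNPs, one low-depth SNP, one clean SNP.
calls = [Snp(100, 40, 222, 60), Snp(105, 38, 125, 60),
         Snp(500, 4, 73, 60), Snp(900, 50, 200, 60)]
kept, ignored = filter_snps(calls)
print([s.pos for s in kept], [s.pos for s in ignored])
```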

4 Variant Calling Format (VCF)
##fileformat=VCFv4.1
##..
#CHROM POS ID REF ALT QUAL FILTER
NODE_ C T 73 .
NODE_ G A 125 .
NODE_ A C 222 .

INFO FORMAT qbot_dir/tmp//0807T15624_R1.trim.sorted.bam
DP=1644;VDB=0.0408;AF1=0.5;AC1=1;DP4=2,2,3,14;MQ=60;FQ=63.7;PV4=0.23,4.1e-09,1,0.23 GT:PL:GQ 0/1:103,0,91:94
DP=82;VDB=0.0418;AF1=0.5163;AC1=1;DP4=1,2,19,28;MQ=60;FQ=-15.1;PV4=1,1e-11,1,1 GT:PL:GQ 0/1:155,0,12:15
DP=86;VDB=0.0421;AF1=1;AC1=2;DP4=0,0,40,39;MQ=60;FQ=-265 GT:PL:GQ 1/1:255,238,0:99
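To make the INFO, FORMAT, and sample columns easier to read, here is a minimal sketch that splits them into dictionaries, using the last record shown on this slide as input. The helper names are hypothetical; the interpretation of DP4 as high-quality ref-forward, ref-reverse, alt-forward, and alt-reverse base counts follows the samtools convention.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-only keys map to True."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def parse_sample(fmt, sample):
    """Pair the FORMAT keys (e.g. GT:PL:GQ) with the sample column values."""
    return dict(zip(fmt.split(":"), sample.split(":")))


info = parse_info("DP=86;VDB=0.0421;AF1=1;AC1=2;DP4=0,0,40,39;MQ=60;FQ=-265")
sample = parse_sample("GT:PL:GQ", "1/1:255,238,0:99")

# DP4: ref-forward, ref-reverse, alt-forward, alt-reverse high-quality bases.
ref_fwd, ref_rev, alt_fwd, alt_rev = map(int, info["DP4"].split(","))
alt_fraction = (alt_fwd + alt_rev) / (ref_fwd + ref_rev + alt_fwd + alt_rev)

print(sample["GT"], info["DP"], alt_fraction)
```

For this record every high-quality base supports the alternative allele, which matches the homozygous 1/1 genotype.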

5 CSI Phylogeny – Alignment + Tree
Raw reads / Assembled sequence
VCF file (filtered)
Find shared positions in all input VCF files (final output)
Known accuracy issue
SNP matrix (pairwise)
Concatenate SNPs + infer ML phylogeny (FastTree)
Final phylogeny
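The shared-position and pairwise-matrix steps can be sketched as follows. This is a toy illustration under stated assumptions: the input dictionary (sample name to position-to-base mapping) and the function names are hypothetical, and the real pipeline reads this information from the filtered VCF files.

```python
from itertools import combinations


def shared_positions(calls_per_sample):
    """Positions that are present in every sample."""
    position_sets = [set(calls) for calls in calls_per_sample.values()]
    return sorted(set.intersection(*position_sets))


def snp_distance(seq_a, seq_b):
    """Number of differing columns between two equal-length sequences."""
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Hypothetical toy input: sample -> {position: base at that position}
calls = {
    "s1": {10: "A", 25: "C", 40: "G", 77: "T"},
    "s2": {10: "A", 25: "T", 40: "G"},
    "s3": {10: "G", 25: "T", 40: "G", 77: "T"},
}

positions = shared_positions(calls)  # position 77 is missing in s2, so it is dropped

# Concatenate the bases at the shared positions into one sequence per sample.
concatenated = {name: "".join(bases[p] for p in positions)
                for name, bases in calls.items()}

# Pairwise SNP matrix, stored as {(sample_a, sample_b): distance}.
matrix = {(a, b): snp_distance(concatenated[a], concatenated[b])
          for a, b in combinations(sorted(concatenated), 2)}
print(positions, concatenated, matrix)
```

The concatenated sequences are what a tree builder such as FastTree would take as its alignment input.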

6 CSI Phylogeny – How to run it
Environment: /home/projects/pr_hackinars/data/19may/load_CSI_phylogeny_env.sh

7 CSI Phylogeny – How to run it
Recommendation: start a screen session.

Ks_SNP_tree_pipeline.pl --qbot_dir <dir> --out_dir <dir> --reference <fasta> <arg1> ..<argN>

Options:
--disable_qbot_call  qbot will not be invoked automatically. You will have to run it yourself in the qbot dir with the command: qbot -s scripts -e error -o ouput
--temp_dir <dir>  Location of tmp files. Default is the tmp folder inside the qbot_dir.
--filter_depth <int>  Default: 10.
--filter_relative_depth <int>  Default: 10, meaning 10 pct of the average depth.
--filter_min_depth <int>  Default: 10. Only applicable when using the relative depth option.
--filter_prune <int>  Default: 10, meaning 10 SNPs.
--filter_map_quality <int>  Default: 0. Recommended: 25.
--filter_snp_quality <int>  Default: 0. Recommended: 30.
--filter_z <double>  Default: 0. Recommended: 1.96.
--queue_filter_mem <str>  Default: 3900m. Increase only if your filter job gets killed due to memory.
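As a small convenience, the command line above could be assembled programmatically, for example with the recommended filter values. The sketch below only builds and prints the command; it does not run it. The helper name, directory names, and input file names are hypothetical, and placing the options before the positional arguments is an assumption, since the slide does not say where they go.

```python
import shlex


def build_command(qbot_dir, out_dir, reference, inputs, **filters):
    """Build the argument list for Ks_SNP_tree_pipeline.pl (not executed here)."""
    cmd = ["Ks_SNP_tree_pipeline.pl",
           "--qbot_dir", qbot_dir,
           "--out_dir", out_dir,
           "--reference", reference]
    for name, value in filters.items():
        cmd += [f"--{name}", str(value)]   # e.g. --filter_z 1.96
    cmd += list(inputs)                    # <arg1> ..<argN>
    return cmd


cmd = build_command(
    "qbot", "out", "ref.fasta",            # hypothetical paths
    ["sample1.fastq", "sample2.fastq"],    # hypothetical input files
    filter_map_quality=25,                 # recommended values from the slide
    filter_snp_quality=30,
    filter_z=1.96,
)
print(shlex.join(cmd))
```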

8 CSI Phylogeny – How to run it

