2 What is Genomes on the Cloud (gotCloud)?
gotCloud is a sequence analysis pipeline
– Integrative: Alignment, QC, Variant Calling, Phasing
– Seamless: Requires only simple configuration files
– Robust: against unexpected failures & stops
– Scalable: to many thousands of genomes
gotCloud also provides…
– A set of many useful software tools
– Software library (C++) for sequence analysis

3 How can I use gotCloud?

4 GotCloud Alignment Pipeline
End-to-end analysis
– Fully automated and parallelized (with quality controls)
– Requires only a simple fastq list file
Pipeline: FASTQ -> bwa / mosaik -> Raw BAM (one per sample) -> dedup & recab -> Processed BAM (one per sample) -> qplot & verifyBamID -> QC
Example fastq list file:
SAMPLE_ID  FASTQ1               FASTQ2
Sample1    Sm1_Run1_1.fastq.gz
Sample1    Sm1_Run2_1.fastq.gz
Sample2    Sm2.fastq.gz         .
Sample3    Sm3_1.fastq.gz       Sm3_2.fastq.gz
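The fastq list file on this slide can be generated rather than typed by hand. The following is a minimal sketch, not GotCloud's own tooling: the function name `format_fastq_list` is hypothetical, the tab delimiter is an assumption (the slide does not show the separator), and only the two rows that are fully legible in the transcript are used.

```python
# Column names as they appear on the slide.
HEADER = ["SAMPLE_ID", "FASTQ1", "FASTQ2"]

# The two rows that are fully legible in the transcript.
# None means "no second FASTQ" (single-end data).
EXAMPLE_ROWS = [
    ("Sample2", "Sm2.fastq.gz", None),
    ("Sample3", "Sm3_1.fastq.gz", "Sm3_2.fastq.gz"),
]


def format_fastq_list(rows):
    """Return the list file as text, one sample/FASTQ pair per line.

    Assumption: fields are tab-separated. A missing second FASTQ is
    written as ".", following the Sample2 row on the slide.
    """
    lines = ["\t".join(HEADER)]
    for sample_id, fastq1, fastq2 in rows:
        lines.append("\t".join([sample_id, fastq1, fastq2 or "."]))
    return "\n".join(lines) + "\n"


print(format_fastq_list(EXAMPLE_ROWS), end="")
```

Redirecting the printed text into `fastq.txt` would give a file that can be passed to `--list`, provided the delimiter assumption matches your GotCloud version.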

5 Running GotCloud Alignment Pipeline
Example command
./gotcloud align --list fastq.txt --out outputDir --numjobs 3 --threads 2
– ./gotcloud: run GotCloud
– align: alignment pipeline
– --list fastq.txt: list of FASTQs
– --out outputDir: where to write output
– --numjobs 3: run 3 samples concurrently
– --threads 2: 2 CPU threads per sample
CRAM support is in beta testing
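When the alignment command is launched from a script, it helps to build the argument list in one place and to check how many CPU threads the chosen settings can occupy. This is a sketch under stated assumptions: the helper names are hypothetical, only the flags shown on the slide are used, and the thread estimate assumes each concurrent sample uses at most `--threads` threads.

```python
import shlex


def build_align_command(list_file, out_dir, numjobs, threads,
                        gotcloud="./gotcloud"):
    """Assemble the align command from the slide as an argument list."""
    return [
        gotcloud, "align",
        "--list", list_file,
        "--out", out_dir,
        "--numjobs", str(numjobs),
        "--threads", str(threads),
    ]


def max_concurrent_threads(numjobs, threads):
    """Upper bound on busy CPU threads: samples in flight x threads per sample."""
    return numjobs * threads


cmd = build_align_command("fastq.txt", "outputDir", numjobs=3, threads=2)
print(shlex.join(cmd))               # the command line from the slide
print(max_concurrent_threads(3, 2))  # 6
```

The list form can be handed to `subprocess.run(cmd, check=True)` on a machine where GotCloud is installed; the sketch only prints it.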

6 What to Expect from GotCloud Alignment Pipeline
– Aligned and post-processed BAM files
– Summary statistics and graphs from qplot
– Contamination checking from verifyBamID
  – With and/or without external genotype data
[Figure panels: Base Quality, GC-bias, Insert Size]

7 GotCloud Variant Calling Pipelines: Overview of SNP Calling Pipeline
Stages: Variant Calling, Variant Filtering, Haplotyping
Pipeline: Processed BAM (one per sample) -> samtools -> Genotype Likelihood (one per sample) -> glfMultiples -> Unfiltered VCF -> infoCollector, SVM -> Filtered VCF -> beagle / thunder -> Phased VCF -> EPACTS (External) -> Association Results
– End-to-end analysis
– Very efficient
– Small memory (<1G)
– Scalable to >1,000s
– High parallelization
– Fault-tolerant
– Requires only list of BAMs
[Diagram labels: Deep Genomes, Shallow Genomes, Targeted Exomes; Single Sample, Many Samples]

8 GotCloud Variant Calling Pipelines
Example commands
./gotcloud snpcall --list bams.txt --out outputDir --numjobs 10
./gotcloud indel --list bams.txt --out outputDir --numjobs 10
./gotcloud genomestrip --list bams.txt --out outputDir --numjobs 10
./gotcloud ldrefine --list bams.txt --out outputDir --numjobs 10
./gotcloud mei --list bams.txt --out outputDir --numjobs 10 *
– snpcall/indel/genomestrip/mei: variant caller to run
– ldrefine: run beagle/thunderVCF genotype refinement
– --list bams.txt: list of bams per sample
– --out outputDir: where to write output
– --numjobs 10: run 10 jobs concurrently
* Coming soon..
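Because every variant-calling step takes the same three options, a multi-step run can be planned as a list of commands. The sketch below is illustrative only: the function names are hypothetical, the set of subcommands is copied from the slide, and `mei` is rejected because the slide marks it as coming soon.

```python
# Subcommands listed on the slide; "mei" is marked "coming soon" there.
AVAILABLE = ("snpcall", "indel", "genomestrip", "ldrefine")
COMING_SOON = ("mei",)


def build_variant_command(subcommand, list_file, out_dir, numjobs,
                          gotcloud="./gotcloud"):
    """Return one variant-calling command as an argument list."""
    if subcommand in COMING_SOON:
        raise ValueError(f"{subcommand} is marked as coming soon on the slide")
    if subcommand not in AVAILABLE:
        raise ValueError(f"unknown subcommand: {subcommand}")
    return [gotcloud, subcommand,
            "--list", list_file, "--out", out_dir, "--numjobs", str(numjobs)]


def plan_variant_calls(subcommands, list_file, out_dir, numjobs):
    """Build the commands for several steps, in the order given."""
    return [build_variant_command(s, list_file, out_dir, numjobs)
            for s in subcommands]


# Example: SNP calling followed by genotype refinement.
for command in plan_variant_calls(["snpcall", "ldrefine"],
                                  "bams.txt", "outputDir", 10):
    print(" ".join(command))
```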

9 GotCloud for Large-scale Sequencing
Study                             Genome  Exome  N      Populations  # SNPs
1000 Genomes                      ~6x     ~40x   2,535  Many         69.1M
Type 2 Diabetes                   ~5x     ~80x   2,850  Europeans    26.7M
Exome Sequencing Project          .       ~80x   6,916  EUR+AFR      1.92M
Sardinian Sequencing              ~4x     .      3,520  Sardinians   23.1M
Bipolar Sequencing                ~12x    .      2,825  Europeans    43.7M
Nephrotic Syndrome                ~4x     .      464    Many         25.6M
Age-related Macular Degeneration  ~6x     .      3,000  Europeans    36.2M
HUNT                              ~4x     .      1,200  Norwegians   23.0M

10 Can I detect clinically important variants using gotCloud?

11 Variant Call Examples using GotCloud
Examples from APOL1 gene
– Nephrotic syndrome associated genes
– Available at
SNPs: Nephrotic syndrome risk allele
– APOL1 G1 allele
  A G 18 PASS AN=124;AC=2;AF= …
Indels: Nephrotic syndrome risk allele
– APOL1 G2 allele
  AATAATT A 756 PASS AN=114;AC=2;AF= …
Structural Variants nearby APOL1 loci
  C.PASS AN=124;AC=2;END=
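The INFO strings on this slide follow the usual VCF `KEY=VALUE;KEY=VALUE` layout, where AN is the number of called alleles and AC is the count of the alternate allele. A minimal sketch of reading them is given here; the function names are hypothetical, a biallelic site is assumed, and the AC/AN ratio it computes is only the raw count ratio, which need not equal the AF value that is elided on the slide.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-only keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def allele_count_ratio(fields):
    """AC / AN for a biallelic site (assumption: AC holds a single integer)."""
    return int(fields["AC"]) / int(fields["AN"])


# AN and AC as shown for the SNP example on the slide.
info = parse_info("AN=124;AC=2")
print(info)
print(round(allele_count_ratio(info), 4))  # 2 / 124, rounded to 0.0161
```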

13 How can I use gotCloud on the Cloud?

14 GotCloud on the Cloud
GotCloud supports many high-performance computing cluster systems
– MOSIX
– SLURM
– Sun Grid Engine (SGE)
– Portable Batch System (PBS)
GotCloud also supports the Amazon Cloud
– In 1000G low-pass genomes examples…
  – Average cost is $20 per genome
– GotCloud Amazon Machine Images (AMI) available
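The per-genome figure above supports a quick back-of-the-envelope budget. This sketch simply scales the slide's $20 average; the function names are hypothetical, and real costs will depend on coverage, instance type, and pricing at the time of the run.

```python
# Average from the 1000G low-pass examples quoted on the slide.
COST_PER_GENOME_USD = 20


def estimate_cost(n_genomes, cost_per_genome=COST_PER_GENOME_USD):
    """Rough total cost in USD for n_genomes at a flat per-genome rate."""
    return n_genomes * cost_per_genome


def genomes_within_budget(budget_usd, cost_per_genome=COST_PER_GENOME_USD):
    """How many genomes a budget covers at the same flat rate."""
    return budget_usd // cost_per_genome


print(estimate_cost(1000))          # 20000
print(genomes_within_budget(5000))  # 250
```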

15 GotCloud on Amazon Cloud
Getting started…
– GotCloud AMI is publicly available
– The AMI contains example input files
  You can run today’s demo on your own
Adding computing power for your own analysis…
– GotCloud AMI supports StarCluster (SGE-compatible)
– You can add as many nodes as you need.
– You will need to upload your own files to Amazon
  Have your data mounted or use wget/curl/lftp

16 Thanks!
or https://github.com/statgen/gotcloud
https://drive.google.com/file/d/0B9LqMcsR6cysalhwazB5WkVPOTg/view?usp=sharing
Gonçalo Abecasis, Mary Kate Wing, Hyun Min Kang, Goo Jun, Adrian Tan, Terry Gliedt, Tom Blackwell, Alan Kwong

17 GotCloud on Amazon Cloud: Launch Amazon Instance
Assuming that you have set up your account and security key, and that you have selected the GotCloud AMI and machine size (instance type):
– Check the selected instance type
– Click Launch when ready

18 GotCloud on Amazon Cloud: Connecting to Your Instance
– After launching, check if the State is “Running”
– Click “Connect” to launch the terminal when ready

19 GotCloud on Amazon Cloud: Connecting to Your Instance
– Username (e.g. ubuntu)
– Path to your key (previously selected)
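Outside the browser-based terminal, the same two inputs (username and key path) are what a standard `ssh -i` invocation needs. The following sketch only assembles that command; the helper name, the key file name `my-key.pem`, and the host name `ec2-host.example.com` are placeholders, not values from the presentation.

```python
import shlex


def build_ssh_command(key_path, host, username="ubuntu"):
    """Assemble an ssh command that logs in with a private key file."""
    return ["ssh", "-i", key_path, f"{username}@{host}"]


# Placeholder key file and host name; substitute the values for your instance.
ssh_cmd = build_ssh_command("my-key.pem", "ec2-host.example.com")
print(shlex.join(ssh_cmd))
```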

20 GotCloud on Amazon Cloud: Running GotCloud
– Check if the demo input files are available

21 GotCloud on Amazon Cloud: Running GotCloud
– Type a GotCloud command line

22 GotCloud on Amazon Cloud: Checking Output Files
– Examine the output directory

23 GotCloud on Amazon Cloud: Checking Output Files
– Examine the variant of interest
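One way to examine a variant of interest is to scan the output VCF for a chromosome and position. The sketch below relies only on the standard eight leading VCF columns; the function name is hypothetical, and the example records use made-up coordinates, since the presentation does not show the real ones.

```python
# The eight fixed leading columns of a VCF data line.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def find_variants(vcf_lines, chrom, pos):
    """Return the records at chrom:pos as dicts keyed by column name."""
    hits = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        parts = line.rstrip("\n").split("\t")
        if len(parts) < len(VCF_COLUMNS):
            continue  # skip blank or malformed lines
        record = dict(zip(VCF_COLUMNS, parts[:len(VCF_COLUMNS)]))
        if record["CHROM"] == chrom and int(record["POS"]) == pos:
            hits.append(record)
    return hits


# Hypothetical records: the positions 100 and 200 are invented for illustration.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "22\t100\t.\tA\tG\t18\tPASS\tAN=124;AC=2",
    "22\t200\t.\tC\tT\t50\tPASS\tAN=124;AC=5",
]

for record in find_variants(EXAMPLE_VCF, "22", 100):
    print(record["REF"], record["ALT"], record["FILTER"], record["INFO"])
```

For a real output file, the same function can be fed an open file handle, for example `find_variants(open(path), chrom, pos)`; compressed VCFs would need `gzip.open(path, "rt")` instead.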

24 GotCloud on Amazon Cloud: Don’t Forget to Terminate

