What is genomesontheCloud?

2 What is genomesontheCloud?
GotCloud is a sequence analysis pipeline
  Integrative: alignment, QC, variant calling, phasing
  Seamless: requires only simple configuration files
  Robust: against unexpected failures and stops
  Scalable: to many thousands of genomes
GotCloud also provides
  A set of many useful software tools
  A software library (C++) for sequence analysis
(Briefly talk about the software tools and library included in GotCloud.)

3 How can I use gotCloud?

4 GotCloud Alignment Pipeline
Alignment & QC, from a single sample to many samples
[Diagram: FASTQ -> bwa / mosaik -> raw BAM (one per sample) -> dedup & recab -> processed BAM (one per sample) -> qplot & verifyBamID -> QC]
End-to-end analysis
  Fully automated and parallelized (with quality controls)
  Requires only a simple FASTQ list file

SAMPLE_ID  FASTQ1               FASTQ2
Sample1    Sm1_Run1_1.fastq.gz  Sm1_Run2_1.fastq.gz
Sample2    Sm2.fastq.gz         .
Sample3    Sm3_1.fastq.gz       Sm3_2.fastq.gz
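
The FASTQ list file shown on this slide could be produced and sanity-checked along the following lines. This is a minimal sketch, not part of GotCloud: it assumes the columns are tab-separated (the slide does not state the delimiter), reuses the slide's placeholder file names, and treats "." as presumably meaning "no second FASTQ".

```shell
#!/bin/sh
# Sketch: write the FASTQ list file from the slide and check its shape.
# Assumptions: tab-separated columns; file names are the slide's placeholders.
set -eu

list="fastq.txt"

{
  printf 'SAMPLE_ID\tFASTQ1\tFASTQ2\n'
  printf 'Sample1\tSm1_Run1_1.fastq.gz\tSm1_Run2_1.fastq.gz\n'
  printf 'Sample2\tSm2.fastq.gz\t.\n'   # "." presumably marks an absent second FASTQ
  printf 'Sample3\tSm3_1.fastq.gz\tSm3_2.fastq.gz\n'
} > "$list"

# Count rows that do not have exactly three tab-separated fields.
bad_rows=$(awk -F '\t' 'NF != 3 { n++ } END { print n + 0 }' "$list")
echo "rows with a wrong field count: $bad_rows"
```

A check like this catches a row with a missing column before the pipeline is launched, which is cheaper than discovering it after jobs have been submitted.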

5 Running GotCloud Alignment Pipeline
Example command
  ./gotcloud align --list fastq.txt --out outputDir --numjobs 3 --threads 2
Arguments
  ./gotcloud         run GotCloud
  align              alignment pipeline
  --list fastq.txt   list of FASTQs
  --out outputDir    where to write output
  --numjobs 3        run 3 samples concurrently
  --threads 2        CPU threads per sample
CRAM support is in beta testing.
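
Since `--numjobs` sets how many samples run concurrently and `--threads` sets the CPU threads per sample, their product is a rough upper bound on the threads requested at once. The sketch below prints that bound and then the alignment command from the slide. The `run` helper and the `DRY_RUN` variable are hypothetical conveniences of this sketch, not GotCloud options; with the default `DRY_RUN=1` the command is only printed, so the script can be tried without GotCloud installed.

```shell
#!/bin/sh
# Sketch: wrap the alignment command from the slide in a dry-run helper.
set -eu

list="fastq.txt"
out="outputDir"
numjobs=3    # samples processed concurrently
threads=2    # CPU threads per sample

# run: print the command when DRY_RUN=1 (the default), execute it otherwise.
# DRY_RUN is a variable of this sketch only, not something GotCloud reads.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Assumed upper bound on simultaneous threads: jobs x threads per job.
echo "threads requested at peak: $((numjobs * threads))"
run ./gotcloud align --list "$list" --out "$out" --numjobs "$numjobs" --threads "$threads"
```

Setting `DRY_RUN=0` in the environment would execute the command instead of printing it.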

6 What to Expect from GotCloud Alignment Pipeline
Aligned and post-processed BAM files
Summary statistics and graphs from qplot
Contamination checking from verifyBamID
  With and/or without external genotype data
[Figure panels: Base Quality, GC-bias, Insert Size]

7 GotCloud Variant Calling Pipelines
Overview of SNP Calling Pipeline
Data types: deep genomes, shallow genomes, targeted exomes
[Diagram: processed BAM (one per sample) -> samtools -> genotype likelihood (one per sample) -> glfMultiples -> unfiltered VCF -> infoCollector, SVM -> filtered VCF -> beagle / thunder -> phased VCF -> EPACTS (external) -> association results]
End-to-end analysis
Very efficient
  Small memory (<1G)
  Scalable to 1,000s of samples
  High parallelization
Fault-tolerant
Requires only a list of BAMs

8 GotCloud Variant Calling Pipelines
Variant types: SNPs, indels, structural variants
Example commands
  ./gotcloud snpcall --list bams.txt --out outputDir --numjobs 10
  ./gotcloud indel --list bams.txt --out outputDir --numjobs 10
  ./gotcloud genomestrip --list bams.txt --out outputDir --numjobs 10
  ./gotcloud ldrefine --list bams.txt --out outputDir --numjobs 10
  ./gotcloud mei --list bams.txt --out outputDir --numjobs 10   (coming soon)
Arguments
  snpcall / indel / genomestrip / mei   variant caller to run
  ldrefine                              run beagle/thunderVCF genotype refinement
  --list bams.txt                       list of BAMs per sample
  --out outputDir                       where to write output
  --numjobs 10                          run 10 jobs concurrently

9 GotCloud for Large-scale Sequencing
Study                             Genome  Exome  N      Populations  # SNPs
1000 Genomes                      ~6x     ~40x   2,535  Many         69.1M
Type 2 Diabetes                   ~5x     ~80x   2,850  Europeans    26.7M
Exome Sequencing Project          .              6,916  EUR+AFR      1.92M
Sardinian Sequencing              ~4x            3,520  Sardinians   23.1M
Bipolar Sequencing                ~12x           2,825               43.7M
Nephrotic Syndrome                               464                 25.6M
Age-related Macular Degeneration                 3,000               36.2M
HUNT                                             1,200  Norwegians   23.0M

10 Can I detect clinically important variants using gotCloud?

11 Variant Call Examples using GotCloud
Examples from the APOL1 gene (nephrotic syndrome associated genes)
Available at
SNPs: nephrotic syndrome risk allele, APOL1 G1 allele
  A G PASS AN=124;AC=2;AF= …
Indels: nephrotic syndrome risk allele, APOL1 G2 allele
  AATAATT A PASS AN=114;AC=2;AF= …
Structural variants near the APOL1 locus
  C <DEL> . PASS AN=124;AC=2;END=
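
The INFO fields in these records carry the allele count (AC) and the total number of called alleles (AN), so a simple allele-count frequency can be read off as AC / AN. The sketch below does that for the two INFO strings whose AN and AC are fully shown on the slide. The helper name `af_from_info` is hypothetical, and the ratio is only the naive count-based estimate: it need not match the AF value that is elided on the slide, which the caller may compute differently.

```shell
#!/bin/sh
# Sketch: naive allele frequency (AC / AN) from a VCF INFO string.
set -eu

# af_from_info: print AC/AN to 4 decimals for a semicolon-separated INFO string.
# Hypothetical helper for illustration; prints nothing if AN is missing or zero.
af_from_info() {
  printf '%s\n' "$1" | awk -F ';' '{
    for (i = 1; i <= NF; i++) { split($i, kv, "="); val[kv[1]] = kv[2] }
    if (val["AN"] > 0) printf "%.4f\n", val["AC"] / val["AN"]
  }'
}

af_from_info 'AN=124;AC=2'   # SNP example: 2 / 124
af_from_info 'AN=114;AC=2'   # indel example: 2 / 114
```

With AN=124 and AC=2, two of the 124 called alleles carry the variant, which corresponds to two heterozygous carriers or one homozygous carrier among 62 genotyped samples.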

13 How can I use gotCloud on the Cloud?

14 GotCloud on the Cloud
GotCloud supports many high-performance computing cluster systems
  MOSIX
  SLURM
  Sun Grid Engine (SGE)
  Portable Batch System (PBS)
GotCloud also supports the Amazon Cloud
  In the 1000G low-pass genomes example, the average cost is $20 per genome
  GotCloud Amazon Machine Images (AMI) are available

15 GotCloud on Amazon Cloud
Getting started
  The GotCloud AMI is publicly available
  The AMI contains example input files
  You can run today’s demo on your own
Adding computing power for your own analysis
  The GotCloud AMI supports StarCluster (SGE-compatible)
  You can add as many nodes as you need
  You will need to upload your own files to Amazon
  Have your data mounted or use wget/curl/lftp

16 Thanks!
https://github.com/statgen/gotcloud
Gonçalo Abecasis, Adrian Tan, Mary Kate Wing, Terry Gliedt, Hyun Min Kang, Tom Blackwell, Goo Jun, Alan Kwong

17 GotCloud on Amazon Cloud : Launch Amazon Instance
This step assumes that you have set up your account and security key, and that you have selected the GotCloud AMI and a machine size (instance type).
[Screenshot callouts: selected instance type; click Launch when ready]

18 GotCloud on Amazon Cloud : Connecting to Your Instance
After launching, check that the state is “Running”
Click “Connect” to launch the terminal when ready

19 GotCloud on Amazon Cloud : Connecting to Your Instance
Username (e.g. ubuntu)
Path to your key (previously selected)
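
The username and key path called out on this slide are the two pieces an SSH client needs besides the instance address. A minimal sketch of assembling the connection command is shown below. The key file name and the host name are hypothetical placeholders, not values from the slides; substitute the key you selected at launch and the public DNS name shown for your running instance.

```shell
#!/bin/sh
# Sketch: assemble the SSH command for a running instance.
set -eu

user="ubuntu"                                     # username, as on the slide
key_path="./gotcloud-demo.pem"                    # hypothetical key file name
host="ec2-203-0-113-10.compute-1.amazonaws.com"   # hypothetical host name

# ssh refuses private keys that other users can read, so the key file is
# normally restricted first, e.g.: chmod 400 "$key_path"
ssh_cmd="ssh -i $key_path $user@$host"
echo "$ssh_cmd"
```

The script only prints the command, so it can be checked before any instance exists.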

20 GotCloud on Amazon Cloud : Running GotCloud
Check if the demo input files are available
List of BAMs per sample

21 GotCloud on Amazon Cloud : Running GotCloud
Type a GotCloud command line

22 GotCloud on Amazon Cloud : Checking Output Files
Examine the output directory

23 GotCloud on Amazon Cloud : Checking Output Files
Examine the variant of interest

24 GotCloud on Amazon Cloud : Don’t Forget to Terminate

