Metagenomics Tools
  Ion Proton sequencer
    In: sample DNA
    Out: 50M DNA fragments
  NCBI nucleotide database
    DNA fragments, 15M+ records
  Do the math: 50M * 15M = 7.5 x 10^14, i.e. on the order of 10^14 queries
  mpiBLAST
    Highly parallelized BLAST algorithm
    NGS sample DNA is queried against the NCBI DB
    CSU Cray XT6m, 2,016 CPU cores
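The "do the math" step can be reproduced in a few lines. This is only a back-of-the-envelope sketch: it treats every fragment-versus-record pairing as one query, as the slide does, and the function and variable names are illustrative rather than part of mpiBLAST. The per-core figure assumes the work divides evenly across the 2,016 cores, which is an idealization.

```python
# Figures taken from the slide; names are illustrative, not an mpiBLAST API.
FRAGMENTS = 50_000_000   # reads produced by the Ion Proton run
RECORDS = 15_000_000     # lower bound on NCBI nucleotide records ("15M+")
CORES = 2_016            # CSU Cray XT6m CPU cores


def naive_comparisons(fragments: int, records: int) -> int:
    """Count every fragment-versus-record pairing as one query."""
    return fragments * records


total = naive_comparisons(FRAGMENTS, RECORDS)
per_core = total / CORES  # assumes a perfectly even split (idealized)

print(f"total pairings:    {total:.1e}")
print(f"pairings per core: {per_core:.1e}")
```

Even under the even-split assumption, each core would still face several hundred billion pairings, which is the motivation for a parallel tool.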
Metagenomics
  Dr. Toni Piaggio, National Wildlife Research Center, Fort Collins
  Florida Everglades water samples (4)
  “What species are in the water?”
  CSU NextGen Sequencing Core: Ion Proton; 2 weeks
  CSU Cray: 1,000 cores, 24 hours, 4 runs; 1 week
  Results
Metagenomics
  Rarefaction curves
    Estimate species richness
    Asymptotic?
    Find rare species
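One common way to draw a rarefaction curve is Hurlbert's analytic formula, which gives the expected number of species seen when n reads are drawn without replacement from a sample of N reads. The sketch below uses that formula; the slides do not say which method was used, and the per-species read counts are made up for illustration.

```python
from math import comb


def rarefaction(counts, depths):
    """Expected species richness at each subsampling depth.

    counts: reads assigned to each species in the full sample
    depths: numbers of reads to subsample (0 <= depth <= sum(counts))
    """
    total = sum(counts)
    curve = []
    for n in depths:
        if not 0 <= n <= total:
            raise ValueError(f"depth {n} outside 0..{total}")
        denom = comb(total, n)
        # A species is missed only if all n reads come from the other species.
        expected = sum(1 - comb(total - c, n) / denom for c in counts)
        curve.append(expected)
    return curve


# Hypothetical read counts for five species, one of them rare.
counts = [50, 30, 15, 4, 1]
depths = [0, 1, 10, 25, 50, 100]
curve = rarefaction(counts, depths)
for n, s in zip(depths, curve):
    print(f"{n:4d} reads -> {s:.2f} expected species")
```

If the curve flattens toward an asymptote, more sequencing is unlikely to reveal many new species; a curve that is still climbing at full depth suggests rare species are being missed.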
Computational Resources
  Oak Ridge Titan Cray XK7 Supercomputer
    300K CPU cores; 50M GPU cores
    mpiBLAST, NCBI nucleotide DB
    Query 100% of sample DNA
  CSU Cray XT6m Supercomputer
    2,016 CPU cores
    mpiBLAST, NCBI nucleotide DB
    Query 1% of sample DNA
  Strong scaling
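Strong scaling asks how run time changes when the core count grows while the problem size stays fixed. A minimal sketch of the usual speedup and efficiency calculation follows; the timings are hypothetical placeholders, not measurements from the CSU Cray or Titan.

```python
def strong_scaling(base, run):
    """Return (speedup, efficiency) of `run` relative to `base`.

    Each argument is a (cores, hours) pair for the same fixed problem size.
    """
    base_cores, base_hours = base
    cores, hours = run
    speedup = base_hours / hours       # how much faster the larger run is
    ideal = cores / base_cores         # speedup under perfect scaling
    return speedup, speedup / ideal


# Hypothetical (cores, wall-clock hours) pairs for one fixed query set.
timings = [(250, 96.0), (500, 48.0), (1000, 30.0)]
base = timings[0]
results = {cores: strong_scaling(base, (cores, hours))
           for cores, hours in timings}

for cores, (speedup, efficiency) in results.items():
    print(f"{cores:5d} cores: speedup {speedup:.2f}, "
          f"efficiency {efficiency:.0%}")
```

An efficiency near 100% means added cores are being used well; a falling value indicates communication or serial overhead starting to dominate.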
Summary
  Big Data Issues
    Semiconductor sequencer data
    Large-scale database queries
    High-performance computing