The iPlant Collaborative: Community Cyberinfrastructure for Life Science. Jason Williams, Cold Spring Harbor Laboratory, iPlant. www.iPlantCollaborative.org

Presentation transcript:

The iPlant Collaborative: Community Cyberinfrastructure for Life Science. Jason Williams, Cold Spring Harbor Laboratory, iPlant

The iPlant Collaborative Vision: How can we prepare for science we can’t anticipate?

The iPlant Collaborative Vision: Enable life science researchers and educators to use and extend cyberinfrastructure to understand and ultimately predict the complexity of biological systems.

The iPlant Collaborative: What is cyberinfrastructure?
"Cyberinfrastructure consists of computing systems, data storage systems, instruments and data repositories, visualization environments, and people, linked together by software and networks to improve research productivity and enable breakthroughs not otherwise possible." -- Craig Stewart
iPlant makes computation, data storage, cloud services, and software tools easily available to informaticians and researchers, leveraging existing CI investments.

Biological Cyberinfrastructure: The Problem of Big Data in Biology

The iPlant Collaborative: Community-Identified Science Enablement Drivers
Sequence Read Archive (SRA) now holds more than 2 quadrillion bases (chart plotted on a log scale).

The iPlant Collaborative: Community-Identified Science Enablement Drivers
Fulfilling our vision will mean enabling access to datasets and tools:
- Environmental data
- Phenotype data
- Phylogenetic Inferences
- Ecological Models
- Crop Models
- Association Studies
- Molecular Networks
Genomic data and analysis:
- Reference guided assembly
- De novo assembly
- RNA-Seq (expression; gene/isoform discovery)
- Variant calling
- Genome/Transcriptome annotation
- ChIP-Seq/Integration of epigenetic information
- Multiple sequencing platforms
- New and evolving technologies

The iPlant Collaborative: Community-Identified Science Enablement Drivers
This means working with a vast landscape of data and tools:
- Genomic data
- Environmental data
- Phenotype data
- Phylogenetic Inferences
- Ecological Models
- Crop Models
- Association Studies
- Molecular Networks
- Predictive and synthetic
- Knowledge gathering
- Retrodictive insights
Genomic data and analysis:
- Reference guided assembly
- De novo assembly
- RNA-Seq (expression; gene/isoform discovery)
- Variant calling
- Genome/Transcriptome annotation
- ChIP-Seq/Integration of epigenetic information
- Multiple sequencing platforms
- New and evolving technologies

The iPlant Collaborative: Community-Identified Science Enablement Drivers
Navigating this landscape requires cyberinfrastructure:
- Genomic data
- Environmental data
- Phenotype data
- Phylogenetic Inferences
- Ecological Models
- Crop Models
- Association Studies
- Molecular Networks
- Predictive and synthetic
- Knowledge gathering
- Retrodictive insights

The iPlant Collaborative: How iPlant CI meets the challenge
Genotypic:
- Comparative Genomics
- Sequencing & Assembly
- Annotation
Environmental:
- Environmental datasets
- Climate model products
Phenotypic:
- Image-based Phenotyping
- Molecular Phenotyping
- Trait Data
Tools for inference:
- Phylogenetic
- Ecological Models
- Crop Models
- Association Studies
- Molecular Networks
Status legend:
- Need identified, specifics required
- In progress
- Foundation in place

The iPlant Collaborative: Where iPlant is today and where we are going
- iPlant renewed by NSF
- September 2013 began next 5-year period
- Scientific Advisory Board
- Focus on Genotype-Phenotype science
- NSF recommended expansion of scope beyond plants

The iPlant Collaborative: What we have to offer you
- Data Management & Storage Resources
- Access to High Performance Computing Resources
- Tool Integration System
- Application Programming Interfaces (APIs)
- Cloud Computing Resources
- Genotype To Phenotype Science Enablement Portfolio
- Tree of Life Science Enablement Portfolio
- Image Analysis Platform (BISQUE)

The iPlant Collaborative: What we have to offer you
- End Users (Biologists)
- Computational Access
- TeraGrid / XSEDE

The iPlant Collaborative: What we have to offer you
Base Cluster (Dell/Intel/Mellanox):
- Intel Sandy Bridge processors
- Dell dual-socket nodes w/32GB RAM (2GB/core)
- 6,400 nodes
- 56 Gb/s Mellanox FDR InfiniBand interconnect
- More than 100,000 cores, 2.2 PF peak performance
Co-Processors:
- Intel Xeon Phi “MIC” Many Integrated Core processors
- Special release of “Knights Corner” (61 cores)
- All MIC cards are on site at TACC; more than 6,000 installed
- 7+ PF peak performance
Max Total Concurrency:
- Exceeds 500,000 cores
- 1.8M threads
Entered production operations on January 7, 2013
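The base-cluster figures on this slide can be cross-checked with a little arithmetic. The sketch below is an illustrative check only, not part of the original presentation: the function names are hypothetical, and it assumes the 2GB/core figure applies uniformly to every one of the 6,400 nodes.

```python
# Figures quoted on the slide for the base cluster.
NODES = 6_400
RAM_PER_NODE_GB = 32
RAM_PER_CORE_GB = 2


def cores_per_node(ram_per_node_gb: int, ram_per_core_gb: int) -> int:
    """Cores per node implied by the RAM-per-node and RAM-per-core figures."""
    return ram_per_node_gb // ram_per_core_gb


def base_cluster_cores(nodes: int, ram_per_node_gb: int, ram_per_core_gb: int) -> int:
    """Total base-cluster cores, assuming every node is configured identically."""
    return nodes * cores_per_node(ram_per_node_gb, ram_per_core_gb)


if __name__ == "__main__":
    per_node = cores_per_node(RAM_PER_NODE_GB, RAM_PER_CORE_GB)
    total = base_cluster_cores(NODES, RAM_PER_NODE_GB, RAM_PER_CORE_GB)
    print(f"{per_node} cores per node, {total:,} cores in total")
```

Under that assumption the result is 16 cores per node and 102,400 cores overall, which agrees with the slide's "more than 100,000 cores".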

How iPlant CI Enables Discovery. Solution: Discovery Environment
An extensible platform for science:
- High-powered computing
- Data sharing/collaboration
- Easy to use interface
- Virtually limitless apps
- Analysis history (provenance)

How iPlant CI Enables Discovery. Solution: Atmosphere
- On-demand computing resource built on a cloud infrastructure
- Virtual machine pre-configured with software, memory requirements, and processing power
- iPlant authentication, storage, and HPC capabilities
- Build custom images/appliances and share with community
- Cross-platform desktop access to GUI applications in the cloud (using VNC)

How iPlant CI Enables Discovery: What Atmosphere means to bioinformaticians
- “What my users used to call me for, they now do on their own through Atmosphere. Now I can scale up my user community.” Nathan Miller, Univ. Wisconsin, Madison
- BLAST 400k transcripts against NCBI nr in 36 h vs. 2 months
- Use iPlant Data Store to move 1500 high-res images per day for analysis
- “iPlant is a great equalizer.” Mike Covington, UC Davis
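To put the BLAST comparison above into a single number, the following sketch computes an approximate speedup. It is a rough illustration rather than a measurement: the slide says only "2 months", so the 30-day month and the assumption of continuous runtime are mine, and the names are hypothetical.

```python
HOURS_PER_DAY = 24
ASSUMED_DAYS_PER_MONTH = 30  # assumption: the slide only says "2 months"

# Baseline runtime, assuming the job ran continuously for two 30-day months.
BASELINE_HOURS = 2 * ASSUMED_DAYS_PER_MONTH * HOURS_PER_DAY
# Runtime reported on the slide.
REPORTED_HOURS = 36


def speedup(baseline_hours: float, improved_hours: float) -> float:
    """Ratio of the baseline runtime to the improved runtime."""
    if improved_hours <= 0:
        raise ValueError("improved_hours must be positive")
    return baseline_hours / improved_hours


if __name__ == "__main__":
    factor = speedup(BASELINE_HOURS, REPORTED_HOURS)
    print(f"Approximate speedup: {factor:.0f}x")
```

With these assumptions the baseline is 1,440 hours, so the reported 36 hours corresponds to roughly a 40-fold reduction in wall-clock time.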

How iPlant CI Enables Discovery. Solution: iPlant Data Store
- All data within the same platform: speed and accessibility
- Access your data from multiple iPlant services
- Automatic data backup, redundant between University of Arizona and University of Texas (NSF data management plan)
- Multiple ways to share data with collaborators
- Multi-threaded high-speed transfers
- Default 100GB allocation; >1TB allocations available with justification

Source              Time (s)
CD                  320
Berkeley Server     150
External Drive      36*
USB2.0 Flash        30
iPlant Data Store   18*
My Computer         15
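The timing figures listed on this slide are easier to compare as ratios. The sketch below ranks the sources and expresses each time relative to the iPlant Data Store. It is illustrative only: the helper names are hypothetical, the slide does not say what operation was timed, and the footnote for the two asterisked entries is not in the transcript, so all six values are treated alike.

```python
# Times in seconds as listed on the slide. Two entries (External Drive and
# iPlant Data Store) carry an asterisk whose footnote is not in the
# transcript, so they are treated like the others here.
TRANSFER_TIMES_S = {
    "CD": 320,
    "Berkeley Server": 150,
    "External Drive": 36,
    "USB2.0 Flash": 30,
    "iPlant Data Store": 18,
    "My Computer": 15,
}


def fastest_first(times: dict) -> list:
    """Source names ordered from shortest to longest time."""
    return sorted(times, key=times.get)


def relative_to(times: dict, reference: str) -> dict:
    """Each time divided by the reference source's time, rounded to 0.1."""
    ref = times[reference]
    return {source: round(seconds / ref, 1) for source, seconds in times.items()}


if __name__ == "__main__":
    ratios = relative_to(TRANSFER_TIMES_S, "iPlant Data Store")
    for source in fastest_first(TRANSFER_TIMES_S):
        print(f"{source:<18} {TRANSFER_TIMES_S[source]:>4} s  {ratios[source]:>5}x")
```

On these numbers the Data Store is second only to the local "My Computer" entry, and the CD takes about 17.8 times as long.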

How iPlant CI Enables Discovery: Educational CI
- Green Line will bring RNA-Seq and HPC to the classroom
- RNA-Seq data will support classroom genome annotation projects on Red Line in Web Apollo
- Blue Line brings DNA barcoding and phylogenetics into the classroom (more than 35K student-generated sequences uploaded so far); sequences can be exported to GenBank

How iPlant CI Enables Discovery: People … a key component of CI
Tools & Services Workshops:
- Targeted to researchers
- Hands-on learning modules
- Individual consultations
Genomics in Education Workshops:
- Targeted to educators
- Pair bioinformatics with classroom labs
- Help for generating lesson plans
Webinars:
- Pair with asynchronous learning materials
- Reach broader audiences
- Follow up with workshop learners

How iPlant CI Enables Discovery: Many more applications not covered here…
- Standalone apps: TNRS, TreeViewer, PhytoBisque, etc.
- iPlant Semantic Web: “intelligent” workflow authoring
- Foundation/Agave API: for programmers embedding iPlant CI capabilities

The iPlant Collaborative: Your colleagues

Leadership Team: Steve Goff (UA), Dan Stanzione (TACC), Matthew Vaughn (TACC), Nirav Merchant (UA), Doreen Ware (CSHL), Michael Schatz (CSHL), David Micklos (CSHL), Ann Stapleton (UNC Wilmington), Ron Vetter (UNC Wilmington)

Staff: Greg Abram, Sonali Aditya, Ritu Arora, Roger Barthelson, Rob Bovill, Brad Boyle, Gordon Burleigh, John Cazes, Mike Conway, Victor Cordero, Rion Dooley, Aaron Dubrow, Andy Edmonds, Dmitry Fedorov, Melyssa Fratkin, Michael Gatto, Utkarsh Gaur, Cornel Ghiban, Steve Gregory, Matthew Hanlon, Natalie Henriques, Uwe Hilgert, Nicole Hopkins, EunSook Jeong, Logan Johnson, Chris Jordan, Kathleen Kennedy, Mohammed Khalfan, David Knapp, Lars Koersterk, Sangeeta Kuchimanchi, Kristian Kvilekval, Sue Lauter, Tina Lee, Andrew Lenards, Monica Lent, Zhenyuan Lu, Eric Lyons, Aaron Marcuse-Kubitz, Naim Matasci, Sheldon McKay, Robert McLay, Nathan Miller, Steve Mock, Martha Narro, Shannon Oliver, Benoit Parmentier, Jmatt Peterson, Dennis Roberts, Paul Sarando, Jerry Schneider, Bruce Schumaker, Edwin Skidmore, Brandon Smith, Mary Margaret Sprinkle, Sriram Srinivasan, Josh Stein, Lisa Stillwell, Jonathan Strootman, Peter Van Buren, Hans Vasquez-Gross, Rebeka Villarreal, Ramona Walls, Liya Wang, Anton Westveld, Jason Williams, John Wregglesworth, Weijia Xu

Faculty Advisors & Collaborators: Ali Akoglu, Kobus Barnard, Timothy Clausner, Brian Enquist, Damian Gessler, Ruth Grene, John Hartman, Matthew Hudson, David Lowenthal, B.S. Manjunath, David Neale, Brian O’Meara, Sudha Ram, David Salt, Mark Schildhauer, Doug Soltis, Pam Soltis, Edgar Spalding, Alexis Stamatakis, Steve Welch

Students: Peter Bailey, Jeremy Beaulieu, Devi Bhattacharya, Storme Briscoe, YaDi Chen, David Choi, Barbara Dobrin, John Donoghue, Yekatarina Khartianova, Chris La Rose, Amgad Madkour, Aniruddha Marathe, Andre Mercer, Kurt Michaels, Zack Pierce, Andrew Predoehl, Sathee Ravindranath, Kyle Simek, Gregory Striemer, Jason Vandeventer, Nicholas Woodward, Kuan Yang

Postdocs: Barbara Banbury, Christos Noutsos, Solon Pissis, Brad Ruhfel

The ONE URL for all things iPlant at PAG this year