The iPlant Collaborative Community Cyberinfrastructure for Life Science Arthropod Genomics Research in ARS Workshop Jason Williams Cold.


1 The iPlant Collaborative Community Cyberinfrastructure for Life Science Arthropod Genomics Research in ARS Workshop Jason Williams / @JasonWilliamsNY Cold Spring Harbor Laboratory, iPlant

2 Goals for today’s talk Begin the process of adopting/adapting iPlant to build your own community capacity Learn about what you hope iPlant may be able to offer Highlight existing capabilities of the platform Explain some of the context and rationale behind iPlant

3 The iPlant Collaborative Vision Enable life science researchers and educators to use and extend iPlant's foundational cyberinfrastructure to understand and ultimately predict the complexity of biological systems and their dynamic nature under various environmental conditions.

4 The iPlant Collaborative What is Cyberinfrastructure?

5 The iPlant Collaborative What is Cyberinfrastructure? Platforms, tools, datasets Storage and compute Training and support

6 The iPlant Collaborative What problems can iPlant Solve? Crops and model plant systems Animal and livestock Agronomic microbes, insects…

7 The iPlant Collaborative What problems can iPlant Solve? iPlant is built for Data

8 The iPlant Collaborative How was iPlant built?

9 The iPlant Collaborative Landscape of community identified priorities Genomic data and analysis: Reference guided assembly De novo assembly RNA-Seq (expression; gene/isoform discovery) Variant calling Genome/Transcriptome annotation ChIP-Seq/Integration of epigenetic information Multiple sequencing platforms New and evolving technologies

10 The iPlant Collaborative Landscape of community identified priorities Genotypic Environmental Phenotypic Comparative Genomics Sequencing & Assembly Annotation Environmental datasets Climate model products Image-based Phenotyping Molecular Phenotyping Trait Data In planning In progress Foundation in place Evolutionary Models Ecological Models Association Studies Pathway Analysis

11 iPlant is a collaborative virtual organization The iPlant Collaborative Who makes up iPlant?

12 The iPlant Collaborative How is iPlant funded? Funded by NSF First funding ($50 Million) in 2008 Renewal funding ($50.3 Million) in 2013: Scientific Advisory Board; Focus on Genotype-Phenotype science; NSF recommended expansion of scope beyond plants

13 Ultracentrifuge - Electrophoresis Cycle sequencing – HTS ~20 years Technology… Transition… Enablement…

14 The iPlant Collaborative What a unified platform gets you Ability to access and manage data Software to analyze data Computing resources Skills and help to use software and interpret results Get Science Done

15 The iPlant Collaborative What a unified platform gets you Metadata management Ability to share data and workflows Open source sustainable tools Reproducibility

16 The iPlant Collaborative What a unified platform gets you High-performance and scalable computing Ability to automate and collaborate Funding spent on science, not software or hardware Productivity

17 The iPlant Collaborative Support for a diverse user base Bioinformatics Users: Easy-to-use tools/interfaces (little or no command-line) Generous data storage, end-to-end workflows Access to training and support

18 The iPlant Collaborative Support for a diverse user base Bioinformaticians: (More) access to HPC Make tools and algorithms more accessible to users Better ways to manage large-project metadata

19 The iPlant Collaborative Support for a diverse user base Bioinformatics Engineers (community/core support): Ways to scale support for community or institutional users Optimization of software Shared data storage and user portals

20 The iPlant Collaborative Products What do you get with your account?

21 The iPlant Collaborative Products We strive to be the CI Lego blocks Danish 'leg godt' - 'play well' Also translates as 'I put together' in Latin If a solution is not available you can craft your own using iPlant CI components

22 iPlant Data Store Initial 100 GB allocation – TB allocations available Automatic data backup Easy upload/download and sharing The resources you need to share and manage data with your lab, colleagues and community

23 Discovery Environment Hundreds of bioinformatics Apps in an easy-to-use interface A platform that can run almost any bioinformatics application Seamlessly integrated with data and high performance computing User extensible – add your own applications

24 Atmosphere Cloud computing for the life sciences Simple: One-click access to more than 200 virtual machine images Flexible: Fully customize your software setup Powerful: Integrated with iPlant computing and data resources

25 Science APIs Fully customize iPlant resources Science-as-a-service platform Define your own compute, and storage resources (local and iPlant) Build your own app store of scientific codes and workflows

26 DNA Subway Educational workflows for Genomes, DNA Barcoding, RNA-Seq Commonly used bioinformatics tools in streamlined workflows Teach important concepts in biology and bioinformatics Inquiry-based experiments for novel discovery and publication of data

27 Bisque Image analysis, management, and metadata Secure image storage, analysis, and data management Integrate existing applications or create new ones Custom visualization and image handling routines and APIs

28 The iPlant Collaborative Genome Assembly and Annotation

29 The iPlant Collaborative Genome Assembly and Annotation
Annotation of the Loblolly Pine Mega genome (Jill Wegrzyn): 20.15 Gb assembly, split into 40 jobs, 216 CPU/job (8640 CPU total), 17 hours
TACC Lonestar Supercomputer: 22,656 CPU cores on 1,888 nodes
Genome                 Assembly    Size (Mb)   CPU    Run Time
Arabidopsis thaliana   TAIR10      120         600    02:44
Arabidopsis thaliana   TAIR10      120         1500   01:27
Zea mays               RefGen_v2   2067        2172   2:53
Campbell et al. Plant Physiology. December 4, 2013, DOI:10.1104/pp.113.230144
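The figures on this slide can be checked with a little arithmetic. The sketch below is not from the talk; it assumes the run times are given as hours:minutes and reads the two Arabidopsis rows as 600 CPUs in 02:44 and 1,500 CPUs in 01:27. The function names are illustrative only.

```python
def parse_hhmm(text):
    """Convert an 'hh:mm' string to minutes (assumed format of the Run Time column)."""
    hours, minutes = text.split(":")
    return int(hours) * 60 + int(minutes)


def total_cpus(jobs, cpus_per_job):
    """Total CPUs used when a run is split into equally sized jobs."""
    return jobs * cpus_per_job


def scaling(cpus_a, time_a, cpus_b, time_b):
    """Return (speedup, parallel efficiency) going from run A to run B."""
    speedup = parse_hhmm(time_a) / parse_hhmm(time_b)
    efficiency = speedup / (cpus_b / cpus_a)
    return speedup, efficiency


# Loblolly pine annotation: 40 jobs x 216 CPU/job, 17 hours wall time.
pine_cpus = total_cpus(40, 216)   # 8640, matching the slide
pine_cpu_hours = pine_cpus * 17   # 146880 CPU-hours

# Arabidopsis TAIR10: 600 CPUs vs 1500 CPUs.
speedup, efficiency = scaling(600, "02:44", 1500, "01:27")
print(pine_cpus, pine_cpu_hours, round(speedup, 2), round(efficiency, 2))
```

Under these assumptions, going from 600 to 1,500 CPUs gives a bit less than a 2x speedup for 2.5x the CPUs, so the extra cores shorten the run but at lower per-core efficiency.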

30 The iPlant Collaborative An Evolving Data Commons specimen collection analysis project creation publication data discovery and re-use

31 The iPlant Collaborative Challenge: Transform existing datasets to do custom queries

32 The iPlant Collaborative Leveraging iPlant Data Store and iRODS

33 The iPlant Collaborative Collaborating with us

34 The iPlant Collaborative “Powered by iPlant” supports a variety of ways of using the iPlant infrastructure underneath another application that communicates with users, usually outside the iPlant project. Other major projects have adopted the iPlant CI as their underlying infrastructure (some completely, some in limited ways – more on this later).

35 Example “ Powered by iPlant ” Impact CoGE usage and user count after federation and interoperability with iPlant

36 Extended Support Make bioinformatics tools better “We find example after example of codes that get well below 0.01% of peak on a single core. By the end of the year, it will be difficult to get a server below 20 cores. There is little sympathy for data/computing challenges when the software is willing to ignore at least 95-99.99% of available performance.” D. Stanzione, Director, TACC

37 The iPlant Collaborative Getting tools out there GenSel installed by developers, made available through the DE For whole-genome predictions, widely used in breeding Dorian Garrick, Iowa State University

38 The iPlant Collaborative Solving problems faster iAnimal genotyping pipeline developed for 1000 Bulls processes two terabytes (TB) of raw sequence data to DNA variants in less than 8 hours James Koltes, Iowa State University
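To put the 1000 Bulls figure in perspective, the sketch below converts "two terabytes in less than 8 hours" into a sustained data rate. It is an illustration, not part of the talk, and it assumes decimal units (1 TB = 10^12 bytes); the function name is hypothetical.

```python
def min_throughput_mb_per_s(terabytes, hours, bytes_per_tb=10**12):
    """Lower bound on sustained throughput in MB/s (decimal units assumed)."""
    seconds = hours * 3600
    return terabytes * bytes_per_tb / seconds / 10**6


# 2 TB of raw sequence processed in under 8 hours.
rate = min_throughput_mb_per_s(2, 8)
print(f"at least {rate:.1f} MB/s sustained")
```

Because the slide says "less than 8 hours", the result is a floor on the end-to-end rate, not a measured value.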

39 Where to go from here: iPlant Learning Center Get Started Guide Tutorials and Videos Documentation Upcoming Events Workshops Webinars

40 iPlant can come to you… Tools & Services Workshops Genomics in Education Workshops Targeted to researchers Hands-on learning modules Individual consultations Targeted to educators Pair bioinformatics with classroom labs Help for generating lesson plans Pairs with asynchronous learning Reach broader audiences Follow up with workshop learners Webinars

41 Where to go from here: If iPlant can, we’ll help show you how… If iPlant can’t, we’ll find the path that gets you what you need Don’t hesitate to ask “Can iPlant do this?” Keep asking at ask.iplantcollaborative.org

42 Staff: Greg Abram Sonali Aditya Ritu Arora Roger Barthelson Rob Bovill Brad Boyle Gordon Burleigh John Cazes Mike Conway Victor Cordero Rion Dooley Aaron Dubrow Andy Edmonds Dmitry Fedorov Melyssa Fratkin Michael Gatto Utkarsh Gaur Cornel Ghiban Executive Team Steve Goff - UA Matthew Vaughn - TACC Nirav Merchant – UA Eric Lyons - UA Doreen Ware – CSHL Current and Former: Faculty Advisors & Collaborators: Ali Akoglu Kobus Barnard Timothy Clausner Brian Enquist Damian Gessler Ruth Grene John Hartman Matthew Hudson David Lowenthal B.S. Manjunath Students: Peter Bailey Jeremy Beaulieu Devi Bhattacharya Storme Briscoe YaDi Chen David Choi Barbara Dobrin David Neale Brian O’Meara Sudha Ram David Salt Mark Schildhauer Doug Soltis Pam Soltis Edgar Spalding Alexis Stamatakis Steve Welch Zhenyuan Lu Aaron Marcuse-Kubitz Robert McLay Nathan Miller Steve Mock Martha Narro Benoit Parmentier Jmatt Peterson Dennis Roberts Paul Sarando Jerry Schneider Bruce Schumaker Steve Gregory Matthew Hanlon Natalie Henriques Uwe Hilgert Nicole Hopkins EunSook Jeong Logan Johnson Chris Jordan Kathleen Kennedy Mohammed Khalfan David Knapp Lars Koesterke Sangeeta Kuchimanchi Kristian Kvilekval Sue Lauter Tina Lee Edwin Skidmore Brandon Smith Mary Margaret Sprinkle Sriram Srinivasan Josh Stein Lisa Stillwell Jonathan Strootman Peter Van Buren Hans Vasquez-Gross Rebeka Villarreal Ramona Walls Liya Wang Anton Westveld Jason Williams John Wregglesworth Weijia Xu Andrew Predoehl Sathee Ravindranath Kyle Simek Gregory Striemer Jason Vandeventer Nicholas Woodward Kuan Yang Postdocs: Barbara Banbury Christos Noutsos Solon Pissis Brad Ruhfel John Donoghue Yekatarina Khartianova Chris La Rose Amgad Madkour Aniruddha Marathe Andre Mercer Kurt Michaels Zack Pierce The iPlant Collaborative Who makes up iPlant?

43 Download these slides… www.iplantc.org/arswiki1 @JasonWilliamsNY @iPlantCollab Jason Williams – Williams@cshl.edu

