iPlant Collaborative Tools and Services Workshop: Overview of the iPlant Data Store


What is “Big Data”?
Big Data is a term applied to data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time. Big Data sizes are a constantly moving target, currently ranging from a few dozen terabytes to many petabytes of data in a single data set. (Source: Wikipedia)

High-Throughput Biology (Not Just Sequence Data)
  Genotype: one run of 11 days generates 4 TB of raw data, or 600,000,000,000 bases of DNA sequence (200 human genomes)
  Phenotype: in 1 day, 30 camera sets record ~200 movies of dynamic root growth, about 4 GB a day
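To put the figures above in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: it takes the slide's numbers at face value and assumes decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes). The variable names are illustrative.

```python
# Figures quoted on the slide (decimal units assumed)
SEQUENCING_BYTES = 4e12            # 4 TB of raw data per run
SEQUENCING_DAYS = 11               # length of one run
BASES = 600_000_000_000            # bases of DNA sequence per run
GENOMES = 200                      # "200 human genomes"
IMAGING_BYTES_PER_DAY = 4e9        # 4 GB of root-growth movies a day

# Bases per genome implied by the slide
bases_per_genome = BASES / GENOMES

# Average raw data produced per day by each source
sequencing_bytes_per_day = SEQUENCING_BYTES / SEQUENCING_DAYS
ratio = sequencing_bytes_per_day / IMAGING_BYTES_PER_DAY

print(f"Bases per genome: {bases_per_genome:.1e}")
print(f"Sequencing: {sequencing_bytes_per_day / 1e9:.0f} GB/day")
print(f"Sequencing produces about {ratio:.0f}x the daily volume of imaging")
```

The slide's figures imply 3 billion bases per genome and an average of roughly 364 GB of raw sequence data per day.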

What makes Big Data different?
Why isn't saving, moving, or copying Big Data as simple as using the tools we already have? Because quantitative changes in scale introduce qualitative differences and complications.

Some Complications of Big Data
  Difficult or slow transfers
  Expense of storage and backup
  Difficult to share and publish
  Metadata
  Analysis

The iPlant Data Store: Scalable, Reliable, Redundant, High-performance
  Access your data from multiple iPlant services
  Automatic data backup (redundant between University of Arizona and University of Texas)
  Multiple ways to share data with collaborators
  Multi-threaded high-speed transfers
  Default 100 GB allocation; allocations >1 TB available with justification
[Logos: TeraGrid, XSEDE]

Scalable, Reliable, Redundant, High-performance
  iRODS is an open-source data management system
  iRODS supports many data-intensive projects, such as NSF TeraGrid and the Large Synoptic Survey Telescope

There are multiple ways to access the Data Store
  Through the Discovery Environment
  iDrop stand-alone client
  iCommands
  iRODS FUSE (mounted volume in a Linux environment)

Some important items we won’t see in the demo
  Replication between Arizona and Texas
  Key component of your NSF data management plan
  Worry free!

Some important items we won’t see in the demo

Source            Destination    Copy Method    Time (seconds)
CD                My Computer    cp             320
Berkeley Server   My Computer    scp            150
External Drive    My Computer    cp             36
USB 2.0 Flash     My Computer    cp             30
iDS               My Computer    iget           18
My Computer       My Computer    cp             15

Close to optimum conditions; transfer between Univ. of Arizona and UC Berkeley.
100 GB: 29m15s (1 GB per 17.5 seconds)
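The 100 GB benchmark can be turned into a throughput figure and used to estimate other transfers. This is a minimal sketch assuming decimal units (1 GB = 1000 MB) and a constant rate, which real networks rarely deliver; `estimated_hours` is a hypothetical helper, not part of any iPlant tool.

```python
# Benchmark quoted on the slide: 100 GB in 29m15s
TRANSFER_GB = 100
TRANSFER_SECONDS = 29 * 60 + 15

seconds_per_gb = TRANSFER_SECONDS / TRANSFER_GB
mb_per_second = TRANSFER_GB * 1000 / TRANSFER_SECONDS   # assumes 1 GB = 1000 MB
mbit_per_second = mb_per_second * 8


def estimated_hours(size_gb, rate_seconds_per_gb=seconds_per_gb):
    """Estimate transfer time in hours at a constant rate (an idealization)."""
    return size_gb * rate_seconds_per_gb / 3600


hours_for_4tb = estimated_hours(4000)   # a 4 TB data set, as in the sequencing example

print(f"{seconds_per_gb:.2f} s/GB, {mb_per_second:.0f} MB/s, {mbit_per_second:.0f} Mbit/s")
print(f"4 TB at this rate: about {hours_for_4tb:.1f} hours")
```

At the benchmark rate, a 4 TB data set would take about 19.5 hours even under these near-optimum conditions.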

Some important items we won’t see in the demo
One of the complications of big data transfers is that you will always be limited by your local connection and institutional policies.

iPlant Data Store Hands-on Lab

iPlant Data Store Lab
By the end of this module you should be able to:
  Import large files into the DE using a URL
  Bulk upload large files into the DE
  Understand metadata and annotate a file using the AVU format
  Share your data with a colleague or another user
  Get started with iCommands (a command-line interface)
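In iRODS, an AVU is an Attribute, Value, Unit triple attached to a file or collection; the unit is optional. The sketch below models that structure and shows how one triple maps onto the arguments of the `imeta add -d` iCommand. The file name and annotations are hypothetical, and the helper only builds the argument list; it does not contact the Data Store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AVU:
    """One metadata triple: attribute, value, and an optional unit."""
    attribute: str
    value: str
    unit: str = ""


# Hypothetical annotations for a hypothetical sequence file
annotations = [
    AVU("organism", "Arabidopsis thaliana"),
    AVU("read_length", "100", "bp"),
]


def find(avus, attribute):
    """Return every AVU whose attribute name matches."""
    return [a for a in avus if a.attribute == attribute]


def imeta_add_args(data_object, avu):
    """Build (but do not run) the argument list for `imeta add -d`."""
    args = ["imeta", "add", "-d", data_object, avu.attribute, avu.value]
    if avu.unit:
        args.append(avu.unit)
    return args


for avu in annotations:
    print(" ".join(imeta_add_args("reads.fastq", avu)))
```

Keeping the unit separate from the value lets a search compare values without parsing strings such as "100bp".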

iPlant Data Store Lab
Goal: Import files into the Data Store, annotate them with metadata, and share them with a colleague.
  Task 1: Import a file into the DE from a URL
  Task 2: Import a “large” file using iDrop in the DE
  Task 3: Mark up your files with metadata
  Task 4: Share your data with a colleague or other user

iPlant Data Store Lab
Please log in to the Discovery Environment. Follow along with the instructor, or follow along with the handouts on your own.

Quick iCommands demo
Commands demonstrated: iinit, ils, iget, iexit
Prompts shown by iinit:
  Enter the host name (DNS) of the server to connect to: data.iplantcollaborative.org
  Enter the port number: 1247
  Enter your irods user name:
  Enter your irods zone: iplant
  Enter your current iRODS password:
Learn more in the online documentation:
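The answers given at the iinit prompts can also be kept in a client environment file. The sketch below assumes an iRODS 4.x client, which reads a JSON file at ~/.irods/irods_environment.json (older 3.x clients used a different file format). The host, port, and zone are the ones from the slide; the user name is a placeholder. The code only prints the JSON and writes nothing to disk.

```python
import json

# Connection settings from the slide; the user name is a placeholder
environment = {
    "irods_host": "data.iplantcollaborative.org",
    "irods_port": 1247,
    "irods_user_name": "your_username",   # placeholder, replace with your own
    "irods_zone_name": "iplant",
}

# Serialize in the layout an iRODS 4.x client expects (assumption stated above)
environment_json = json.dumps(environment, indent=4)
print(environment_json)
```

The password is deliberately left out: iinit asks for it interactively and stores it separately in scrambled form.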

iPlant Supports the Life Cycle of Data
  Store, Mark up, Search, Transfer, Analyze, Visualize, Collaborate, Share
[Figure: Data processed by Algo1 and Algo2 into Results A and Results B, shown both pre-publication and post-publication]