National Data Service Consortium Development and Outreach Activities

1 National Data Service Consortium Development and Outreach Activities

2 The National Data Service Consortium
Towards a world where it is easier to publish, link, search, and reuse data of all kinds
  Advancing discovery by enabling open sharing of data
  Increase collaboration within/across fields
  Bring together R&D efforts surrounding data
  Interoperability and extensibility of data cyberinfrastructure efforts
  In collaboration with the RDA, implementing standards and protocols

3 High Level Work Breakdown Structure
Breakdown of seed-funded development and outreach activities
Presented to Technical Advisory Committee, March 2016
Three main activities:
  NDS Labs
  NDS Share
  NDS Overall Mission

5 NDS Labs Towards developing out components of a U.S. NDS
NDS Labs Workbench Cloud resources/storage Support

6 1. NDS Labs Developer/User Tools (i.e. NDS Labs Workbench)
Encapsulation of services
  e.g. Docker containers, JSON descriptions
  Determine how to encapsulate external instances of services as well (e.g. Globus, BrownDog, ...)
Cluster provisioning & monitoring tools
  Command line tools leveraging Kubernetes to simplify the process of setting up a cluster
  Containerized ELK deployed as a standard service
Command line & REST API interface
  Allow users to set up new project spaces and add users
  Allow user to add a new service
  Allow user to select and deploy services
  Allow user to provision resources, e.g. number of CPUs, storage, computation resource to use, storage resource to use
  Tools to monitor deployed services, start/stop services
Graphical Web Interface on top of REST interface
  Catalog of available services
  Canvas to deploy/control services
  Links to deployed services
  Links to specific service logs
  GUI access to all CLI functionality (e.g. admin abilities)?
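
The "JSON descriptions" item above could look roughly like the following. This is only a minimal sketch: the field names (`key`, `image`, `ports`, `cpus`, `storage_gb`, `depends_on`) and the image name are hypothetical and are not taken from the actual NDS Labs Workbench format.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ServiceSpec:
    """Hypothetical JSON description of one containerized service."""
    key: str                                        # short identifier for the service
    image: str                                      # Docker image reference
    ports: list = field(default_factory=list)       # TCP ports the service exposes
    cpus: float = 1.0                               # requested CPU cores
    storage_gb: int = 0                             # requested persistent storage
    depends_on: list = field(default_factory=list)  # keys of required services

    def validate(self):
        """Return a list of human-readable problems (empty if the spec is usable)."""
        errors = []
        if not self.key:
            errors.append("key must not be empty")
        if not self.image:
            errors.append("image must not be empty")
        if self.cpus <= 0:
            errors.append("cpus must be positive")
        if self.storage_gb < 0:
            errors.append("storage_gb must not be negative")
        for port in self.ports:
            if not 1 <= port <= 65535:
                errors.append(f"port {port} is out of range")
        return errors


def to_json(spec):
    """Serialize a spec to the JSON text a catalog or REST API could store."""
    return json.dumps(asdict(spec), indent=2, sort_keys=True)


def from_json(text):
    """Rebuild a spec from its JSON text."""
    return ServiceSpec(**json.loads(text))


if __name__ == "__main__":
    spec = ServiceSpec(key="notebook", image="example/notebook:1.0",
                       ports=[8888], cpus=2, storage_gb=10)
    print(to_json(spec))
    print(spec.validate())
```

Keeping validation separate from serialization would let a CLI, a REST endpoint, and a GUI share the same checks before anything is deployed.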

8 1. NDS Labs Developer/User Tools (i.e. NDS Labs Workbench)
Catalog of services
  Database of services (e.g. persistently within etcd)
  Easy means of updating added services (e.g. fetching latest versions)
  Web portal (i.e. an app-store-like front end for services)
  Establish workflow for ingesting new services (as automatic and simple as possible)
  Ability to browse services along with documentation on each service (e.g. external links, APIs, ...)
  Database interface allowing separate NDS Labs deployments to pull services from here
Development Environment
  Encapsulate development environment around deployed tools and provide as a single container
  Documentation on developing within NDS Labs
  YouTube videos on developing within NDS Labs
Stand up official NDS Labs instance
  Deploy on resources such as Nebula
  Web interface to request account/access (will be used by pilots)
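
The catalog items above (a persistent database of services plus an easy way to pick up newer versions) can be sketched as follows. This is an illustration under stated assumptions: a plain Python dict stands in for a key-value store such as etcd, and the `/services/<name>` key layout, the `ServiceCatalog` class, and the dotted numeric version scheme are all hypothetical.

```python
import json


class ServiceCatalog:
    """Toy service catalog backed by a dict (a stand-in for a store like etcd)."""

    def __init__(self, store=None):
        self._store = {} if store is None else store

    @staticmethod
    def _key(name):
        # Hypothetical key layout: one entry per service name.
        return f"/services/{name}"

    @staticmethod
    def _version_tuple(version):
        # "1.10.0" -> (1, 10, 0) so that versions compare numerically.
        return tuple(int(part) for part in version.split("."))

    def get(self, name):
        """Return the stored entry for a service, or None if it is unknown."""
        raw = self._store.get(self._key(name))
        return json.loads(raw) if raw is not None else None

    def put(self, name, version, image):
        """Store the entry only if it is newer than what is already there.

        Returns True when the catalog was updated, False otherwise.
        """
        current = self.get(name)
        if current is not None:
            if self._version_tuple(current["version"]) >= self._version_tuple(version):
                return False
        entry = {"name": name, "version": version, "image": image}
        self._store[self._key(name)] = json.dumps(entry)
        return True

    def names(self):
        """List the names of all catalogued services in sorted order."""
        return sorted(key.rsplit("/", 1)[1] for key in self._store)


if __name__ == "__main__":
    catalog = ServiceCatalog()
    catalog.put("notebook", "1.2.0", "example/notebook:1.2.0")
    catalog.put("notebook", "1.10.0", "example/notebook:1.10.0")
    print(catalog.get("notebook"), catalog.names())
```

Because entries are stored as JSON strings under flat keys, a separate deployment could pull the same entries by reading the same key prefix.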

10 1. NDS Labs Developer/User Tools (i.e. NDS Labs Workbench)
Production Deployment Support
  Ability to issue a command or push a button from GUI to deploy on user's resources (e.g. AWS)
  Explore means of indexing data added in future to deployed services (to be used by NDS Share portal)
  Easy update mechanism to allow distributed instances to remain up to date
  Explore means of centralized authentication across resources as well as deployed services
Maintenance
  Refactor code base, separate into distinct components
  Add documentation to code

12 1. NDS Labs Support for Multiple Resources
Modification of tools to abstract away underlying resources and allow multiple resources to be leveraged simultaneously
  Provide means of allowing users to enter credentials (e.g. Amazon account, XRAC)
  Intelligent resourcing tools (e.g. select compute resources near data)
  Manual resourcing tools (e.g. CLI/GUI modifications allowing users to deploy specific services on specific resources)
  Migration tools to move services across underlying platforms
Deploy official NDS Labs instance across available resources
  SDSC Cloud and TACC Rodeo
  PSC
  Jetstream
  Amazon
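
One way to read "select compute resources near data" is as a small placement rule: among resources with enough free capacity, prefer the one that is cheapest to reach from where the data lives. The sketch below assumes exactly that; the `Resource` type, the site names, and the `transfer_cost` table are hypothetical illustrations, not part of any NDS Labs tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical description of one compute resource."""
    name: str
    site: str        # where the resource is hosted
    free_cpus: int   # CPU cores currently available


def pick_resource(resources, data_site, cpus_needed, transfer_cost):
    """Pick the resource closest to the data that can fit the request.

    transfer_cost maps (data_site, resource_site) to a relative cost;
    a resource at the data's own site costs nothing, and a pair missing
    from the table is treated as unreachable (infinite cost).
    Returns None when no resource has enough free CPUs.
    """
    candidates = [r for r in resources if r.free_cpus >= cpus_needed]
    if not candidates:
        return None

    def cost(resource):
        if resource.site == data_site:
            return 0.0
        return transfer_cost.get((data_site, resource.site), float("inf"))

    # Lowest transfer cost first; break ties in favor of more free CPUs.
    return min(candidates, key=lambda r: (cost(r), -r.free_cpus))


if __name__ == "__main__":
    pool = [
        Resource("cluster-a", "site-a", free_cpus=4),
        Resource("cluster-b", "site-b", free_cpus=32),
        Resource("cluster-c", "site-c", free_cpus=16),
    ]
    costs = {("site-a", "site-b"): 5.0, ("site-a", "site-c"): 2.0}
    print(pick_resource(pool, "site-a", 8, costs))
```

A manual resourcing tool, as listed above, would simply bypass this rule and let the user name the resource directly.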

13 1. NDS Labs Populate Services Catalog
Select several technologies for each required NDS component (e.g. archives, publishing, etc.) and encapsulate them
  e.g. DataONE, iRODS, SciServer, ...
Identify other relevant technologies for each required NDS component and engage their developers to encapsulate them
Tool Launcher (as one mechanism of running code near data)
  Support for various data sources/services within NDS Labs
  Support for various tools within NDS Labs (e.g. Jupyter, RStudio, ...)

15 NDS Labs Workbench

16 NDS Share Production data management resources
Exploration capabilities across data repositories

17 2. NDS Share Repository of Last Resort
data.share.nationaldataservice.org
  Globus endpoint on NCSA hardware (santiago)
  Skinned front face
  ...
Build up and categorize datasets per domain which can be utilized for experimentation with NDS Labs

19 2. NDS Share NDS Share Portal
Federated search across NDS component archives and NDS Labs deployed resources
  Utilize repos listed here:
  Pilot effort among DataNets/DIBBs
Web interface
  Search box
  Catalog of available archives

20 2. NDS Share Repository Recommender
Decision support tool to suggest an archive based on a number of criteria (e.g. scientific domain)
  Explore leveraging DataNet SEAD virtual archive component
Command line tool
  Run in folder containing data
Web interface
Data migration support
  Should an archive/repository be going away, identify a new archive and a plan to move the data to the new repository (possibly an executable workflow)
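
A decision-support tool of this kind can be sketched as a filter-then-rank step over a list of candidate archives. The example below is only an illustration: the `Archive` fields, the scoring weights, and the archive names are hypothetical, and a real recommender would draw on many more criteria than scientific domain, dataset size, and DOI support.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Archive:
    """Hypothetical description of one candidate archive."""
    name: str
    domains: frozenset      # scientific domains the archive targets
    max_dataset_gb: float   # largest dataset the archive accepts
    mints_dois: bool        # whether the archive assigns DOIs


def recommend(archives, domain, size_gb, needs_doi=False):
    """Return archive names that can hold the dataset, best match first.

    Hard constraints (size limit, DOI requirement) filter archives out;
    soft criteria (domain match, DOI support) only affect the ranking.
    """
    ranked = []
    for archive in archives:
        if size_gb > archive.max_dataset_gb:
            continue
        if needs_doi and not archive.mints_dois:
            continue
        score = 0
        if domain in archive.domains:
            score += 2   # a domain match counts most
        if archive.mints_dois:
            score += 1
        ranked.append((score, archive.name))
    # Highest score first; alphabetical order keeps ties deterministic.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in ranked]


if __name__ == "__main__":
    candidates = [
        Archive("general-archive", frozenset(), 50.0, True),
        Archive("materials-archive", frozenset({"materials"}), 500.0, True),
        Archive("genomics-archive", frozenset({"genomics"}), 1000.0, False),
    ]
    print(recommend(candidates, "materials", 100.0))
```

The same function could back both the command line tool (run in the folder containing the data, with `size_gb` measured from disk) and the web interface.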

21 2. NDS Share Published datasets
Identify storage options
Mint DOIs

23 NDS Mission
Engage scientific, cyberinfrastructure, library, publishing communities
Pursue broad interoperability of data management technologies

24 3. NDS Mission Protocols, Interfaces, Standards
Work with RDA to identify standards that can be implemented within components that can be leveraged by US NDS
  Disseminate recommendations to components
  Serve as a testbed for RDA efforts
    Rice Genome Variant Discovery
    ...
  Implementation tasks:
    Data Types Registry WG: Host DTR
    Data Types Registry WG: Add support in BrownDog to use DTR to identify types and generate previews
    Data Description Registry Interoperability WG: Leverage Research Data Switchboard towards federated search portal in NDS Share
    Data Description Registry Interoperability WG: Implement utilized protocols within collaborating archive efforts
    Metadata Standards Directory WG: Add support for metadata transformations in BrownDog, use case TERRA
Work towards implementing standards within funded NDS efforts
  Towards data preservation within project and the long-term data management of its data products
  Begin implementing support for:
    OAI-PMH
      Adapters that transform native interface? Directly modify code?
    BagIt
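
To make the BagIt item concrete, the following is a minimal sketch of packaging a directory as a bag using only the Python standard library. It writes the `bagit.txt` declaration, copies the payload under `data/`, and records SHA-256 checksums in `manifest-sha256.txt`, which are the required elements of a BagIt bag. The function names `make_bag` and `verify_bag` are hypothetical, and optional parts of the format (tag manifests, `bag-info.txt`, escaping of unusual file names) are left out.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def make_bag(src_dir, bag_dir):
    """Copy src_dir into a new bag at bag_dir and write the required tag files."""
    bag = Path(bag_dir)
    payload = bag / "data"
    shutil.copytree(src_dir, payload)  # creates bag_dir and bag_dir/data

    # The bag declaration: format version and tag-file encoding.
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # The payload manifest: one "<checksum> <path>" line per payload file.
    lines = []
    for path in sorted(p for p in payload.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(bag).as_posix()}\n")
    (bag / "manifest-sha256.txt").write_text("".join(lines), encoding="utf-8")
    return bag


def verify_bag(bag_dir):
    """Return True if every file listed in the manifest matches its checksum."""
    bag = Path(bag_dir)
    manifest = (bag / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, rel_path = line.split(maxsplit=1)
        target = bag / rel_path
        if not target.is_file():
            return False
        if hashlib.sha256(target.read_bytes()).hexdigest() != digest:
            return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "dataset"
        source.mkdir()
        (source / "readme.txt").write_text("example payload\n", encoding="utf-8")
        bag = make_bag(source, Path(tmp) / "dataset-bag")
        print(verify_bag(bag))
```

Whether packaging like this lives in an adapter around an archive or inside the archive's own code is the open question the slide raises for OAI-PMH, and the same trade-off applies here.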

25 3. NDS Mission Outreach
NDSC Workshops
  Two per year
NDS Labs and NDS Share Tutorials
  NDSC workshops
  YouTube recordings
  Other conferences, workshops, venues (e.g. IEEE eScience, SC, XSEDE, ...)
NDS Labs and Share user support
  Pilot efforts
  Others?
Community organization
  Committee recruitment and meetings
  Explore sustainability plans
Maintain NDS wiki(s)
Social media (e.g. Twitter, blog, etc.)
NDS website

27 Pilots/Collaborative Efforts
NIST Materials Data Facility
ARPA-E TERRA REF
iSee Crops in Silico
NIH BD2K KnowEng
NASA Access Terra Fusion

28 Resources
NCSA Nebula (unused computational resources)
NCSA Storage Condo (100 TB for MDF)
SDSC Cloud (100 TB of storage, some computation)
TACC Rodeo (40 CPU cores)
ROGER (1 PB for TERRA)
Blue Waters (2.5 PB for TERRA, 32 million SUs and 2 PB for Access)
