Metadata in the iPlant Collaborative Cyberinfrastructure Birds of a Feather meeting at PAG XXII, Jan. 14, 2014.

From the iPlant Data Strategy: “The vision for iPlant CI data capabilities is to provide flexible, adaptive and scalable data infrastructure that enables users and communities to implement best practices for data management.”

How to enable best practices for data management in iPlant:
1. A way to add and edit metadata
2. Metadata templates for common file types
3. Search and browse iDS based on metadata and file content
4. Support for unstructured and structured (relational) data within the iDS
5. Interoperability with key external data sources
6. Benefits/features that are aligned with the use of popular file types
7. An iPlant Data Commons for public data

KEY ELEMENTS OF THE iPLANT DATA STRATEGY

1. CI to enable users to add and edit metadata using simple and flexible interfaces, including customizable metadata components.
– a web-based user interface accessible via the DE
– upload of metadata as a CSV file
– access to all metadata entities via iPlant APIs
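As a rough illustration of the CSV upload path, here is a minimal Python sketch that reads attribute/value pairs from CSV text. The two-column layout with an `attribute,value` header row and the function name `parse_metadata_csv` are assumptions made for this example; the slides do not specify the file format iPlant accepts.

```python
import csv
import io


def parse_metadata_csv(text):
    """Parse CSV text into an attribute -> value dict.

    Assumes a header row with 'attribute' and 'value' columns
    (a hypothetical layout, not a documented iPlant format).
    """
    reader = csv.DictReader(io.StringIO(text))
    metadata = {}
    for row in reader:
        attribute = (row.get("attribute") or "").strip()
        if not attribute:
            continue  # skip rows without an attribute name
        metadata[attribute] = (row.get("value") or "").strip()
    return metadata


# Attribute names taken from the mock-up in the slides.
sample = "attribute,value\nattribute_1,value_a\nattribute_2,value_b\n"
parsed = parse_metadata_csv(sample)
print(parsed)
```

A dict keeps one value per attribute; if repeated attributes must be preserved, a list of pairs would be the safer structure.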

Current DE metadata interface

[Screenshot: metadata dialog for /iplant/home/user/file.txt. A two-column Attribute/Value table lists attribute_1 = value_a, attribute_2 = value_b, attribute_3 = value_c, attribute_4 = value_d. Controls: Add, Delete, Templates, Browse Templates, OK, Cancel.]

2. Project data management templates and best practices for organizing, handling and managing data for diverse use cases, including:
– groups or consortia working on large-scale genome and transcriptome sequencing projects or species range maps
– single-PI laboratories focused on specific analyses such as RNA-Seq experiments or phenotype data sets

[Mock-up: the metadata dialog for /iplant/home/user/file.txt with a Browse Templates panel open. Step shown: select a template. Templates listed: Metagenomic Sequence (MIMS), Eukaryotic Genome Sequence (MIGS), Genome Sequence in iDS, Item 1. Panel controls: Insert, Cancel. Panes: Attributes, Preview.]

[Mock-up: Browse Templates panel with a template selected. The Attributes Preview pane lists: project, specimen identifier, collection date, geographic location nam…, geographic location longi…, geographic location latit…, genus, species, infraspecific name. Panel controls: Insert, Cancel.]

[Mock-up sequence: metadata dialog for /iplant/home/user/file.txt after inserting the template "Metagenomic Sequence Metadata". Required attributes are marked with an asterisk (*).
Attribute / Value
project* / jackson
specimen identifier / 54769
collection date* / T19:23
sequencing method* / (empty)
A tooltip on the collection date field reads: "All of these are ISO8601 compliant time stamps: T19:23:10+00:00…"
With sequencing method left empty, the dialog shows the error "This field is required."
In the final frame, sequencing method is filled in with "DOI#".
Controls: Add, Delete, Browse Templates, OK, Cancel.]
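The template behavior in the mock-ups (required fields and ISO 8601 time stamps) can be sketched as a small validator. This is only an illustration under stated assumptions: the `TEMPLATE` structure and the `validate` and `is_iso8601` helpers are hypothetical, and the full date in the example record is invented because the date portion is elided in the slides.

```python
from datetime import datetime

# Hypothetical template: attribute name -> required flag.
# Names and required markers (*) follow the mock-up; the structure is assumed.
TEMPLATE = {
    "project": True,
    "specimen identifier": False,
    "collection date": True,
    "sequencing method": True,
}


def is_iso8601(value):
    """Return True if value parses as an ISO 8601 date or date-time.

    datetime.fromisoformat accepts common ISO 8601 forms; the exact
    coverage depends on the Python version.
    """
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def validate(record, template=TEMPLATE):
    """Return a dict of attribute -> error message (empty if valid)."""
    errors = {}
    for attribute, required in template.items():
        value = record.get(attribute, "").strip()
        if required and not value:
            errors[attribute] = "This field is required."
        elif attribute == "collection date" and value and not is_iso8601(value):
            errors[attribute] = "Not an ISO 8601 time stamp."
    return errors


# Example record; the date 2014-01-14 is a made-up placeholder.
record = {
    "project": "jackson",
    "specimen identifier": "54769",
    "collection date": "2014-01-14T19:23",
    "sequencing method": "",
}
errors = validate(record)
print(errors)
```

Returning all errors at once, rather than stopping at the first, matches a form that flags each problem field next to its input.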

3. CI to support searching and browsing based on metadata attributes and suitable file content.
– provenance/system metadata and scientific metadata
– across both private data and public data
– ontology-enhanced searches

Search capabilities
Search API: users will be able to search by
– file or folder name
– any metadata attribute or value
– date created
– date last modified
– creator
– file size
– file type
– tool that created the file
– analysis that created a file or folder
– constraints (and, or, xor)
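To make the and/or/xor constraints concrete, here is a small Python sketch that combines search predicates over an in-memory list of file records. The record layout, the sample files, and the helper names (`by_creator`, `by_metadata`, `combine`, `search`) are all hypothetical; this is not the iPlant Search API.

```python
from operator import and_, or_, xor

# Hypothetical in-memory index of file records (not real iPlant data).
FILES = [
    {"name": "reads.fastq", "creator": "alice", "metadata": {"project": "jackson"}},
    {"name": "notes.txt", "creator": "bob", "metadata": {"project": "jackson"}},
    {"name": "map.csv", "creator": "alice", "metadata": {"project": "range-maps"}},
]


def by_creator(creator):
    """Predicate: file was created by the given user."""
    return lambda f: f["creator"] == creator


def by_metadata(attribute, value):
    """Predicate: file carries the given metadata attribute/value pair."""
    return lambda f: f["metadata"].get(attribute) == value


def combine(op, left, right):
    """Combine two predicates with a boolean operator (and_, or_, xor)."""
    return lambda f: op(left(f), right(f))


def search(files, predicate):
    """Return the names of all files matching the predicate."""
    return [f["name"] for f in files if predicate(f)]


alice = by_creator("alice")
jackson = by_metadata("project", "jackson")

both = search(FILES, combine(and_, alice, jackson))
either = search(FILES, combine(or_, alice, jackson))
exactly_one = search(FILES, combine(xor, alice, jackson))
print(both, either, exactly_one)
```

Because `combine` returns another predicate, constraints can be nested to any depth.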

Search capabilities
Users will be able to make "smart folders", that is, folders for all the files that match a set of search criteria.
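One way to picture a smart folder is as a saved set of search criteria that is re-evaluated each time the folder is opened. The sketch below assumes that reading; the `SmartFolder` class, the record layout, and every path except /iplant/home/user/file.txt are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SmartFolder:
    """A named, saved search; stores criteria, not a copy of the files."""
    name: str
    criteria: Callable[[Dict], bool]

    def contents(self, files: List[Dict]) -> List[str]:
        # Re-run the criteria on every call so new matches appear automatically.
        return [f["path"] for f in files if self.criteria(f)]


# Hypothetical file records.
files = [
    {"path": "/iplant/home/user/file.txt", "metadata": {"project": "jackson"}},
    {"path": "/iplant/home/user/other.txt", "metadata": {"project": "demo"}},
]

folder = SmartFolder(
    name="jackson project",
    criteria=lambda f: f["metadata"].get("project") == "jackson",
)
before = folder.contents(files)

# A newly added matching file shows up without changing the folder definition.
files.append({"path": "/iplant/home/user/new.txt", "metadata": {"project": "jackson"}})
after = folder.contents(files)
print(before, after)
```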

4. Support for unstructured, semi-structured, and structured (relational) data within the iDS.
– Document-based and NoSQL approaches to support unstructured and semi-structured data
– Support for large matrix-based data sets (e.g., in GBS, GWAS, etc.)
– A way for users to search and access data in iPlant-hosted projects that include MySQL and PostgreSQL databases

5. Interoperability with key external data sources, including, but not limited to:
– Ability to use external data in analyses run through iPlant, e.g., import from BioMart
– Access to databases like CoGe, PO, MaizeGDB
– Ability to push/publish/link data housed in the iDS to canonical public repositories like NCBI, Data Dryad
– Ability to engage semantic services and semantic pipelines based on metadata and ontological reasoning systems

6. Benefits/features that are aligned with the use of popular file types.
– provide suitable utilities, tools, integration, and documentation on best data management practices for projects using these formats

Demo: http://mirrors.iplantcollaborative.org/browse/iplant/home/shared/iplant_public_test

7. An iPlant Data Commons that provides stable access to objects in the iDS, including:
– The option to make data public and permanent (un-editable).
– Issuing multiple permanent identifiers (unique IDs) as needed (e.g., DOI, NOID, ARK) while packaging the content in standards-compliant formats.