Metadata in the iPlant Collaborative Cyberinfrastructure Birds of a Feather meeting at PAG XXII, Jan. 14, 2014.


1 Metadata in the iPlant Collaborative Cyberinfrastructure Birds of a Feather meeting at PAG XXII, Jan. 14, 2014

2 From the iPlant Data Strategy: “The vision for iPlant CI data capabilities is to provide flexible, adaptive and scalable data infrastructure that enables users and communities to implement best practices for data management.”

3 How to enable best practices for data management in iPlant: 1. A way to add and edit metadata 2. Metadata templates for common file types 3. Search and browse iDS based on metadata and file content 4. Support for unstructured and structured (relational) data within the iDS 5. Interoperability with key external data sources 6. Benefits/features that are aligned with the use of popular file types 7. An iPlant Data Commons for public data

4 KEY ELEMENTS OF THE iPLANT DATA STRATEGY

5 1. CI to enable users to add and edit metadata using simple and flexible interfaces, including customizable metadata components. – a web-based user interface accessible via the DE – upload metadata as csv file – access to all metadata entities via iPlant APIs
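The slide mentions uploading metadata as a csv file but does not specify the layout. As a minimal sketch, assuming a two-column file with one `attribute,value` pair per row and no header (an assumption, not iPlant's documented format), such a file could be read into attribute/value pairs like this. The function name `read_metadata_csv` is hypothetical.

```python
import csv
import io


def read_metadata_csv(text):
    """Parse 'attribute,value' rows into a dict, skipping blank rows.

    Assumed layout: two columns, no header row. Not an iPlant specification.
    """
    metadata = {}
    for row in csv.reader(io.StringIO(text)):
        if not row or not row[0].strip():
            continue  # ignore empty lines and rows without an attribute name
        attribute = row[0].strip()
        value = row[1].strip() if len(row) > 1 else ""
        metadata[attribute] = value
    return metadata


# Sample values borrowed from the mock-up slides later in the deck.
sample = (
    "project,jackson\n"
    "specimen identifier,54769\n"
    "collection date,2008-01-23T19:23\n"
)
parsed = read_metadata_csv(sample)
```

A later row with the same attribute name overwrites an earlier one here; a real uploader would have to decide whether repeated attributes are allowed.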

6 Current DE metadata interface

7 [Mock-up of the metadata dialog for /iplant/home/user/file.txt: an Attribute/Value table (attribute_1 = value_a, attribute_2 = value_b, attribute_3 = value_c, attribute_4 = value_d) with Add, Delete, Templates, Browse Templates, OK and Cancel controls.]

8 2. Project data management templates and best practices for organizing, handling and managing data for diverse use cases, including: – groups or consortia working on large-scale genome and transcriptome sequencing projects or species range maps – single-PI laboratories focused on specific analyses, such as RNA-Seq experiments or phenotype data sets

9 [Mock-up of the metadata dialog for /iplant/home/user/file.txt, as on slide 7: an Attribute/Value table with Add, Delete, Templates, Browse Templates, OK and Cancel controls.]

10 [Mock-up: the Browse Templates dialog opened over the metadata dialog for /iplant/home/user/file.txt. Prompt: "Select a template". Templates listed: Metagenomic Sequence (MIMS), Eukaryotic Genome Sequence (MIGS), Genome Sequence in iDS, Item 1. Panel: Attributes Preview. Buttons: Insert, Cancel.]

11 [Mock-up: the Browse Templates dialog with the Attributes Preview panel filled in. Templates listed: Metagenomic Sequence (MIMS), Eukaryotic Genome Sequence (MIGS), Genome Sequence in iDS, Item 1, Item 3, Item 5. Attributes Preview: project, specimen identifier, collection date, geographic location nam…, geographic location longi…, geographic location latit…, genus, species, infraspecific name. Buttons: Insert, Cancel.]

12 [Mock-up: the same Browse Templates dialog and Attributes Preview list as on slide 11.]

13 [Mock-up: the metadata dialog for /iplant/home/user/file.txt after a template has been inserted as an accordion item titled "Template: Metagenomic Sequence Metadata". Attribute/Value rows: project* = jackson; specimen identifier = 54769; collection date* = 2008-01-23T19:23; sequencing method* = (empty). Controls: Add, Delete, Browse Templates, OK, Cancel.]

14 [Mock-up: the same dialog as on slide 13, with a help note on the collection date field: "All of these are ISO8601 compliant time stamps: 2008-01-23T19:23:10+00:00…"]

15 [Mock-up: the same dialog as on slide 13, with the OK button highlighted.]

16 [Mock-up: the same dialog as on slide 13, showing the validation message "This field is required."]

17 [Mock-up: the same dialog, now with sequencing method* = DOI# filled in and the OK button highlighted.]
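The mock-ups on slides 13 to 17 suggest two checks on a template form: starred attributes must be filled in, and the collection date must be an ISO 8601 timestamp. A minimal sketch of that validation follows. The `TEMPLATE` structure, the `validate` function, and the reading of the asterisk as "required" are assumptions made for illustration, not the DE's actual implementation. Note that `datetime.fromisoformat` accepts only a subset of ISO 8601 on Python versions before 3.11.

```python
from datetime import datetime

# Hypothetical template description: attribute -> (required?, kind).
# Attribute names and the starred (assumed required) fields follow the mock-up.
TEMPLATE = {
    "project": (True, "text"),
    "specimen identifier": (False, "text"),
    "collection date": (True, "timestamp"),
    "sequencing method": (True, "text"),
}


def validate(metadata, template=TEMPLATE):
    """Return a dict of attribute -> error message; empty means the form is valid."""
    errors = {}
    for attribute, (required, kind) in template.items():
        value = metadata.get(attribute, "").strip()
        if not value:
            if required:
                errors[attribute] = "This field is required."
            continue
        if kind == "timestamp":
            try:
                datetime.fromisoformat(value)
            except ValueError:
                errors[attribute] = "Not an ISO 8601 timestamp."
    return errors


# The form as shown on slide 16: sequencing method is still empty.
form = {
    "project": "jackson",
    "specimen identifier": "54769",
    "collection date": "2008-01-23T19:23",
    "sequencing method": "",
}
errors = validate(form)
```

Filling in the missing field, as on slide 17, makes `validate` return an empty dict, which is the condition under which a form like this would let the user press OK.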

18 3. CI to support searching and browsing based on metadata attributes and suitable file content. – provenance/system metadata and scientific metadata – across both private data and public data – ontology enhanced searches

19 Search capabilities Search API: users will be able to search by – file or folder name – any metadata attribute or value – date created – date last modified – creator – file size – file type – tool that created the file – analysis that created a file or folder – constraints (and, or, xor)
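To illustrate how search criteria might be combined with the and, or, and xor constraints listed above, here is a small in-memory sketch. It is not the iPlant Search API: the record layout in `FILES` and the helper names (`has_metadata`, `created_by`, `combine`, `search`) are hypothetical, chosen only to show the logic.

```python
import operator
from functools import reduce

# Hypothetical file records; the field names are illustrative only.
FILES = [
    {"name": "reads.fastq", "creator": "user", "metadata": {"project": "jackson"}},
    {"name": "notes.txt", "creator": "user", "metadata": {}},
    {"name": "genome.fasta", "creator": "other", "metadata": {"project": "jackson"}},
]


def has_metadata(attribute, value):
    """Match files whose metadata has the given attribute/value pair."""
    return lambda f: f["metadata"].get(attribute) == value


def created_by(creator):
    """Match files created by the given user."""
    return lambda f: f["creator"] == creator


def combine(op, *predicates):
    """Combine one or more predicates with 'and', 'or', or 'xor'."""
    ops = {"and": operator.and_, "or": operator.or_, "xor": operator.xor}
    return lambda f: reduce(ops[op], (bool(p(f)) for p in predicates))


def search(files, predicate):
    """Return the names of all files that satisfy the predicate."""
    return [f["name"] for f in files if predicate(f)]


in_project = has_metadata("project", "jackson")
mine = created_by("user")

both = search(FILES, combine("and", in_project, mine))
either = search(FILES, combine("or", in_project, mine))
exactly_one = search(FILES, combine("xor", in_project, mine))
```

Here `both` contains only `reads.fastq`, `either` contains all three files, and `exactly_one` contains `notes.txt` and `genome.fasta`.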

20 Search capabilities Users will be able to make "smart folders", that is, folders for all the files that match a set of search criteria.
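One way to picture a smart folder is as a saved search that is re-evaluated whenever the folder is opened, so its contents track the data store rather than being copied into it. The sketch below assumes that reading; the `SmartFolder` class, its fields, and the second file path are hypothetical and do not describe how the DE implements the feature.

```python
from dataclasses import dataclass, field


@dataclass
class SmartFolder:
    """A named set of metadata criteria, evaluated on demand (illustrative only)."""

    name: str
    criteria: dict = field(default_factory=dict)

    def contents(self, files):
        """Re-run the saved search against the current file listing."""
        return [
            f["path"]
            for f in files
            if all(f["metadata"].get(k) == v for k, v in self.criteria.items())
        ]


files = [
    {"path": "/iplant/home/user/file.txt", "metadata": {"project": "jackson"}},
    {"path": "/iplant/home/user/other.txt", "metadata": {"project": "smith"}},
]
folder = SmartFolder("jackson files", {"project": "jackson"})
before = folder.contents(files)

# A newly tagged file shows up without the folder itself being modified.
files.append({"path": "/iplant/home/user/reads.fastq", "metadata": {"project": "jackson"}})
after = folder.contents(files)
```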

21 4. Support for unstructured, semi-structured, and structured (relational) data within the iDS. – Document-based and NoSQL approaches to support unstructured and semi-structured data – Support for large matrix-based data sets (e.g., in GBS, GWAS, etc.) – A way for users to search and access data in iPlant-hosted projects that include MySQL and PostgreSQL databases

22 5. Interoperability with key external data sources, including, but not limited to: – Ability to use external data in analyses run through iPlant, e.g., import from BioMart – Access to databases like CoGe, PO, MaizeGDB – Ability to push/publish/link data housed in iDS to canonical public repositories like NCBI, Data Dryad – Ability to engage semantic services and semantic pipelines based on metadata and ontological reasoning systems.

23 6. Benefits/features that are aligned with the use of popular file types. – provide the suitable utilities, tools, integration, and documentation on best data management practices for projects utilizing these formats

24 Demo: http://mirrors.iplantcollaborative.org/browse/iplant/home/shared/iplant_public_test

25 7. An iPlant Data Commons that provides stable access to objects in the iDS, including: – The option to make data public and permanent (un-editable). – Issuing multiple permanent identifiers (unique IDs) as needed (e.g., DOI, NOID, ARK) while packaging the content in standards-compliant formats.

