RDA Wheat Data Interoperability Cookbook and last developments, 9th March 2015, San Diego.


1 RDA Wheat Data Interoperability Cookbook and last developments, 9th March 2015, San Diego

2 The WDI working group in brief
● Endorsed by RDA in March 2014
● Members: approximately 30 members and 15 active members (Wheat scientists, data and metadata technologists)
● The goal: contribute to the improvement of Wheat related data interoperability by
    ● Building a common interoperability framework (metadata, data formats and vocabularies)
    ● Providing guidelines for describing, representing and linking Wheat related data

3 Initial plans
Deliverables
● A report of the survey of existing standards
● A cookbook intended for the Wheat data managers community, which provides them with guidelines on what data formats, metadata, vocabularies and ontologies they should use to describe, represent and link different types of Wheat data
● A library of linked vocabularies and ontologies in machine readable formats with respect to the Linked Data standards
● A prototype which showcases the gain of interoperability

4 Where we are
● Surveys
    ● Landscape of Wheat related standards and their use by the community
    ● Comprehensive overview of Wheat related ontologies and vocabularies
● Workshops
● Recommendations
    ● Mappings between different data formats
    ● Actions to conduct in order to improve the current level of Wheat related data interoperability
    ● Interoperability use cases
● Implementation
    ● Interactive cookbook: recommendations + guidelines
    ● A repository of Wheat related linked vocabularies (Bioportal)

5 Wheat related standards survey and workshop

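6 Data formats currently used (standardized, tool specific, non standardized) and recommendations, by data type

Data type: SNPs
Formats currently used: VCF (standardized); BAM/SAM, BED, VARSCAN, VEP (tool specific)
Recommendation: VCF files generated by using the survey sequences of IWGSC + metadata about VCF files to enrich the information about the SNPs

Data type: Genome annotations
Formats currently used: Genbank Flat File, General Feature Format (GFF), EMBL (standardized)
Recommendation: GFF3 + specifications with regard to the description of specific columns

Data type: Germplasms
Formats currently used: MCPD, ABCD, Darwin Core, Darwin Core Germplasm (standardized); Grin Global (tool specific); tabulated (non standardized)
Recommendation: MCPD

Data type: Gene expression
Formats currently used: many format standards laid out by repositories such as NCBI (GEO) and EBI ArrayExpress
Recommendation: existing format standards laid out by the repositories such as NCBI (GEO) and EBI ArrayExpress + ENA

Data type: Physical maps
Formats currently used: GFF (standardized); Cmap, fpc (tool specific)
Recommendation: GFF3

Data type: Genetic maps
Formats currently used: Cmap, gnpmap (tool specific)
Recommendation: GFF3 (to be confirmed)

Data type: Phenotypes
Formats currently used: Drops, ped, isa-tab, ephesis (tool specific); tabulated (non standardized)
Recommendation: ISA-Tab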
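Since GFF3 is the recommended format for genome annotations and physical maps, a small illustration of its layout may help. The sketch below is not part of the cookbook: it only assumes the nine tab-separated columns defined by the GFF3 specification, and the feature line (sequence name, source, gene ID) is made up for the example.

```python
from urllib.parse import unquote

# The nine columns defined by the GFF3 specification, in order.
GFF3_COLUMNS = (
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
)


def parse_gff3_line(line):
    """Split one GFF3 feature line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    # Coordinates are 1-based integers with start <= end.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Column 9 holds semicolon-separated tag=value pairs, URL-escaped.
    attributes = {}
    if record["attributes"] != ".":
        for pair in record["attributes"].split(";"):
            if not pair:
                continue
            key, _, value = pair.partition("=")
            attributes[unquote(key)] = unquote(value)
    record["attributes"] = attributes
    return record


# Hypothetical feature line, for illustration only.
example = "chr1A\texample_source\tgene\t1000\t2500\t.\t+\t.\tID=gene0001;Name=hypothetical_gene"
feature = parse_gff3_line(example)
print(feature["type"], feature["start"], feature["end"], feature["attributes"]["ID"])
```

The "specifications with regard to the description of specific columns" mentioned in the recommendation would sit on top of such a parser, for instance as checks on which attribute tags must be present.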

7 Examples of use cases

Title: Searching for germplasm with specific traits
Description: Example of searching for germplasm with specific traits - tagged with ontology terms?
Data types: Germplasm, Phenotype
Challenges:
● Metadata very important ~ standardized format
● Association of genes to traits, linked to germplasm, marker information
● Need for quality controls - how confident are you of the data source?
● Provenance of the germplasm - pedigree, ownership
● Standard system for tracking germplasm, names

Title: Identification of wheat genes that control root growth
Description: Requires: annotated genes (Gene Ontology, PFam, and other functional annotation)
Data types: Genomic annotations? - Gene location? (IWGS-SS ID or MIPS HCS link)
Challenges: Mapping between wheat genes and orthologs from other species (deduce function by seq. similarity); access to RNASeq data (genes that are not expressed in roots may be irrelevant); mapping of wheat genes and information on their function based on literature

Title: Query on trial data associated with varieties
Data types: Phenotypic data, GIS data, (wheat economy/production data)
Description: To search wheat varieties with distribution maps, production figures, performances in wheat mega environments, associated projects worldwide plus layers of climatic data on specific wheat production areas and disease prevention information.
Challenges: Phenotypic data should be linked to GIS data. Using keywords or ontology terms, a system or a tool should be able to pull out such information from different websites/systems developed by the wheat community.
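The first use case (germplasm tagged with ontology terms) can be illustrated with a minimal sketch. Everything in it is hypothetical: the `Germplasm` record, the accession identifiers and the `EX_TRAIT:` term identifiers are placeholders, not values from any real wheat database or ontology. It only shows why shared term identifiers make the search a simple set comparison.

```python
from dataclasses import dataclass, field


@dataclass
class Germplasm:
    """Hypothetical germplasm record tagged with trait ontology term IDs."""
    accession_id: str
    name: str
    trait_terms: set = field(default_factory=set)


def find_by_traits(collection, required_terms):
    """Return the records tagged with every term in required_terms."""
    required = set(required_terms)
    return [g for g in collection if required <= g.trait_terms]


# Placeholder data: accession IDs and term IDs are invented for the example.
collection = [
    Germplasm("ACC-001", "variety A", {"EX_TRAIT:0001", "EX_TRAIT:0002"}),
    Germplasm("ACC-002", "variety B", {"EX_TRAIT:0002"}),
    Germplasm("ACC-003", "variety C", {"EX_TRAIT:0001", "EX_TRAIT:0002", "EX_TRAIT:0003"}),
]

matches = find_by_traits(collection, ["EX_TRAIT:0001", "EX_TRAIT:0002"])
print([g.accession_id for g in matches])
```

If two databases tag the same trait with different free-text labels, this comparison fails silently, which is the interoperability problem the challenges above describe.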

9 Wheat related ontologies and vocabularies survey

10 The objectives of the survey
● Assess the level of visibility and interoperability of Wheat related vocabularies and ontologies
    ● Is the vocabulary/ontology updated regularly?
    ● What license and/or copyright is used?
    ● Is the vocabulary/ontology part of any ontology communities or listing services?
    ● Is the vocabulary/ontology used or implemented in any database/repository?
    ● Does the vocabulary/ontology interlink and/or map to other vocabularies and ontologies?
    ● Does the vocabulary/ontology
● Identify the domain covered by the ontologies and vocabularies
● Refine the cookbook
● Collect more interoperability use cases
● Collect some technical details

11 The objectives of the survey
Guidelines and Repository
● What level of visibility/operability?
● What content?
● What formats, and technologies?

12 The Wheat related BioPortal allows one to:
● search for terms across multiple ontologies
● browse mappings between terms in different ontologies
● receive recommendations on which ontologies are most relevant for a corpus
● annotate text with terms from ontologies
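As a rough illustration of the term search, the sketch below builds a query URL in the style of the public NCBO BioPortal REST API, which exposes a `/search` endpoint taking `q`, `ontologies` and `apikey` parameters. Whether the Wheat related instance exposes the same endpoint is an assumption; the base URL, the ontology acronym and the API key are placeholders, and no request is actually sent.

```python
from urllib.parse import urlencode

# Placeholder base URL: the address of the Wheat related instance is not
# given in the slides, so this value is hypothetical.
BASE_URL = "https://bioportal.example.org/api"


def build_search_url(term, ontologies=(), api_key="YOUR_API_KEY"):
    """Build a term-search URL, assuming an NCBO BioPortal style /search endpoint."""
    params = {"q": term, "apikey": api_key}
    if ontologies:
        # Restrict the search to a comma-separated list of ontology acronyms.
        params["ontologies"] = ",".join(ontologies)
    return f"{BASE_URL}/search?{urlencode(params)}"


# "EXAMPLE_TRAIT_ONTOLOGY" is a made-up acronym used only for illustration.
url = build_search_url("grain yield", ontologies=["EXAMPLE_TRAIT_ONTOLOGY"])
print(url)
```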

13 Next steps
● Metadata (harmonization, minimal metadata sets)
● Mappings
● Next workshop (summer 2015)
● Review and complete the recommendations
● Refine and complete the guidelines and the best practices
● Finalize the repository of Wheat related vocabularies
● Implement the prototype

14 Thanks!

