OzNome 5-star Tool: A Rating System for making data FAIR and Trustable


1 OzNome 5-star Tool: A Rating System for making data FAIR and Trustable
Simon Cox, Jonathan Yu
20 October 2017
Land and Water
Abstract - The OzNome initiative is seeking to connect information infrastructures across Australia and enable researchers, industry and key partners to achieve productivity gains around their discovery, access and use of data. While its origins are in earth and environmental data, the intended scope is more comprehensive. Tools and methods developed through OzNome will provide access to existing and emerging trusted, curated and well governed data ecosystems. A key concern in seeking to enhance the connectivity of information infrastructures is to understand the current state of a component data system, in particular considering the lifecycle of data from discovery, access, use and publication. Characterising the current state allows organisations and individuals to understand areas for improving their data collection, publication and connectivity to the broader community and digital ecosystem.

2 OzNome – “A connected Australia”
CSIRO-led initiative to enhance and connect information infrastructures across Australia.
Tools, products, services
Methods, approaches, practices
Infrastructure
Initial focus on enhancing and connecting environmental information infrastructures in Australia, starting with CSIRO L&W.
OzNome FAIR 5-star

3 Motivation
Environmental data comes from many sources
Solving big problems requires the data to be connected
In order to be connectable, the data should be FAIR

4 How to make information ‘connectable’?
Follow the FAIR principles
Assessment tool with recommendations on improvements

5 OzNome FAIR 5-star

6 Citeable via a stable, persistent web identifier
Findable:
  Citeable via a stable, persistent web identifier
  Described with appropriate metadata
  Indexed in a well known system, e.g. Google search, catalog search
Accessible:
  Available on the web? Via standardised web service?
  Curated with a commitment that this data will be available long term
  Indexed in a well known system
Interoperable/Reusable:
  Common formats
  Discoverable, community-endorsed schema or data model
  Unambiguous definitions for all elements (e.g. column definitions, units of measure) linked to accessible (standard) definitions
  Linked to other data using external identifiers (e.g. URIs)
  Licensed
Trusted:
  Usage information available
  Part of a regular data collection program or series, with clear maintenance arrangements and update schedule
  Commitment that this data will be available long term

7 Data rating criteria
  published
  hosted
  curated
  updated, maintained
  licensed
  citeable
  described
  findable
  loadable
  useable
  comprehensible
  connected, linked
  assessable
  trusted
The OzNome team have developed a set of criteria under 14 headings to assess data collection, publication and service provisioning arrangements. This is based on experience with data services and supply chains, and on rating systems and maturity models proposed over the years across different initiatives, e.g. 5* Linked Open Data, FAIR, Schema.org.

8 Is it intended to be published? How? How often?
Key word: Levels
published (Implicitly FAIR?)
  a. No external access
  b. External access, non-web protocol (e.g. physical media distribution)
  c. Published via the web
hosted (FAIR - Accessible)
  a. not on web
  b. files on web-server
  c. repository with web interface
  d. web service - local API
  e. RESTful web service - OpenAPI/Swagger
  f. standard web API (SPARQL, OGC WMS/WFS/WCS/SOS/WPS, ...)
curated (More than FAIR!)
  a. once-off dump, no ongoing commitment
  b. best effort
  c. institutional repository
  d. certified repository
updated, maintained (More than FAIR!)
  a. one-time dataset
  b. part of data series, occasional/irregular update
  c. part of data series with regular updates
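The lettered levels on this slide run from least to most mature, so each answer can be read as a position on a scale. The following is a minimal sketch of that idea in Python, not the tool's own code: the names `LEVELS`, `describe` and `maturity` are hypothetical, and the 0-to-1 normalisation is an assumption made for illustration only.

```python
# Level descriptions copied from the slide, ordered from least to most mature.
LEVELS = {
    "published": [
        "No external access",
        "External access, non-web protocol",
        "Published via the web",
    ],
    "hosted": [
        "not on web",
        "files on web-server",
        "repository with web interface",
        "web service - local API",
        "RESTful web service - OpenAPI/Swagger",
        "standard web API (SPARQL, OGC WMS/WFS/WCS/SOS/WPS, ...)",
    ],
    "curated": [
        "once-off dump, no ongoing commitment",
        "best effort",
        "institutional repository",
        "certified repository",
    ],
    "updated, maintained": [
        "one-time dataset",
        "part of data series, occasional/irregular update",
        "part of data series with regular updates",
    ],
}


def describe(criterion: str, letter: str) -> str:
    """Return the slide's description for a level letter such as "c"."""
    index = ord(letter.lower()) - ord("a")
    levels = LEVELS[criterion]
    if not 0 <= index < len(levels):
        raise ValueError(f"{criterion!r} has no level {letter!r}")
    return levels[index]


def maturity(criterion: str, letter: str) -> float:
    """Assumed normalisation: 0.0 for level a, 1.0 for the top level."""
    describe(criterion, letter)  # raises ValueError for an unknown letter
    index = ord(letter.lower()) - ord("a")
    return index / (len(LEVELS[criterion]) - 1)
```

Keeping the levels as ordered lists means a criterion with six levels (hosted) and one with three (published) can be compared on the same scale without hard-coding numbers for each letter.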

9 FAIR – Findable, Reusable
Indexed? Identified? Licensed?
Key word: Levels
licensed
  a. no licence
  b. licence described in text
  c. standard licence (e.g. Creative Commons)
citeable
  a. Not citeable
  b. Local identifier (may change)
  c. Web identifier (transient URL or query)
  d. Persistent web identifier (PURL, DOI, handle, ARK, etc)
described
  a. no metadata
  b. text description (abstract) and keywords
  c. basic metadata (e.g. Dublin Core)
  d. specialized metadata (e.g. Darwin Core, ISO 19115, scientific data profile of schema.org)
  e. rich metadata using (standard) RDF vocabularies (e.g. DCAT, ADMS, PROV, GeoDCAT, OMV, VoID)
findable
  a. not indexed
  b. indexed in a local, organizational catalogue
  c. metadata harvested or pushed into a community (e.g. Research Data Australia, Re3Data) or jurisdictional catalogue
  d. visible in general-purpose indexes (Google, Bing)
  e. highly ranked in general-purpose indexes

10 Format, structure, semantics, links
FAIR - Interoperable
Key word: Levels
loadable
  a. bespoke file format
  b. standard data-format, denoted by a MIME-type (CSV, JSON, XML, netCDF, etc)
  c. choice from multiple standard formats
useable
  a. implicit schema, not formalized
  b. explicit schema, formalized in DDL, XSD, data-package, RDFS/OWL, JSON-Schema or similar
  c. community schema, available from a (standard) location
comprehensible
  a. local field labels
  b. field labels linked to text explanations
  c. standard labels (e.g. CF Conventions, UCUM units)
  d. some field names linked to standard, externally managed vocabularies
  e. all field names linked to standard, externally managed vocabularies
connected, linked
  a. no links
  b. in-bound links from a catalogue or landing page
  c. out-bound links to related data
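The top two "comprehensible" levels differ only in whether some or all field names are linked to externally managed vocabularies, which is one of the few distinctions on this slide that could be checked mechanically. A rough sketch under that assumption follows; the function name and the idea of a field-to-URI mapping are hypothetical and not taken from the OzNome tool.

```python
from typing import Mapping, Optional


def vocabulary_link_level(field_links: Mapping[str, Optional[str]]) -> Optional[str]:
    """Suggest a 'comprehensible' level from field-to-vocabulary links.

    field_links maps each field name to the URI of its vocabulary
    definition, or to None when the field is not linked.
    Returns "e" when every field is linked, "d" when only some are,
    and None when no field is linked, because levels a to c cannot be
    told apart from link information alone.
    """
    if not field_links:
        raise ValueError("at least one field is required")
    linked = [uri is not None for uri in field_links.values()]
    if all(linked):
        return "e"
    if any(linked):
        return "d"
    return None
```

For example, a dataset whose columns map to `{"et": "http://example.org/vocab/et", "date": None}` (a made-up URI) would be suggested as level d; a human assessor would still need to confirm that the vocabulary is standard and externally managed.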

11 netCDF metadata example

12 netCDF metadata example - interoperable

13 Quality, provenance, trusted?
Key word: Levels
assessable (FAIR – Reusable)
  a. No quality or lineage information
  b. Lineage statement in text
  c. Formal provenance trace (W3C PROV-O or similar)
trusted (More than FAIR?)
  a. no information about usage
  b. usage statistics available
  c. Clearly endorsed by reputable organization or framework

14 5-star assessment tool http://oznome.csiro.au/5star/
The rating system is implemented as the 5* OzNome Data tool (available at http://oznome.csiro.au/5star/), which allows users to carry out a self-assessment by classifying the 14 facets into five qualities of data: Findable, Accessible, Interoperable, Reusable and Trusted. For each quality, the user assesses the current state of their data offerings and products and is given a rating out of 5 stars. In doing the self-assessment, users can also explore ways to improve their data collection and how it is accessed by others. This gives data providers graduated targets for improving their data collection and publishing process.
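The slides do not say how the tool turns facet answers into stars, so the following is only a sketch of one plausible scheme. The grouping of facets into qualities in `QUALITIES` loosely follows the slide headings and is an assumption, as is the averaging rule; the level counts in `LEVEL_COUNT` are taken from the criteria slides.

```python
from statistics import mean

# Number of lettered levels per facet, as listed on the criteria slides.
LEVEL_COUNT = {
    "published": 3, "hosted": 6, "curated": 4, "updated, maintained": 3,
    "licensed": 3, "citeable": 4, "described": 5, "findable": 5,
    "loadable": 3, "useable": 3, "comprehensible": 5,
    "connected, linked": 3, "assessable": 3, "trusted": 3,
}

# ASSUMPTION: this grouping is illustrative, not the tool's actual mapping.
QUALITIES = {
    "Findable": ["citeable", "described", "findable"],
    "Accessible": ["published", "hosted"],
    "Interoperable": ["loadable", "useable", "comprehensible", "connected, linked"],
    "Reusable": ["licensed", "assessable"],
    "Trusted": ["curated", "updated, maintained", "trusted"],
}


def star_ratings(answers: dict) -> dict:
    """Map facet answers (level letters) to a 0-5 star rating per quality.

    ASSUMPTION: each facet is scaled to 0..1 by its position among its
    levels, and a quality's stars are 5 times the mean of its facets.
    """
    ratings = {}
    for quality, facets in QUALITIES.items():
        fractions = [
            (ord(answers[facet]) - ord("a")) / (LEVEL_COUNT[facet] - 1)
            for facet in facets
        ]
        ratings[quality] = round(5 * mean(fractions), 1)
    return ratings
```

Reporting one rating per quality, rather than a single overall score, matches the slide's description and shows a provider which quality has the most room for improvement.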

15 Interoperable? > 10 years of information standards work in CSIRO

16 OzNome maturity estimation
Findable via ANDS/RDA
Accessible - Available as web service
Interoperable/Reusable: web services, standard schema. Standard vocabularies.
Trusted: Reliable operationalised
The 5* OzNome Data tool has been used to evaluate well-known services, such as data available via the Australian Soil Resource Information System (ASRIS) (see Figure 2).

17 OzNome maturity estimation
Findable via Google search
Accessible - publish limited set from 2005
Available as web service
Interoperable/Reusable: some web services, reference definitions as text
Trusted: Reliable operationalised updates of 2005 data
Findable/Accessible: dataset test deployments and aggregates via connected infrastructure (internal CSIRO)
Enhanced connectivity via web services
Interoperable/Reusable: web services, reference definitions as Linked Data and externally hosted observable properties vocabulary definitions
Trusted: Not operational and no trusted repository
This tool can be used to carry out a first-pass assessment of current data provisioning arrangements. In this presentation we will provide details of how this tool was developed, as well as of the case studies carried out. In particular, we will present a case study carried out in partnership with the Bureau of Meteorology around the Australian Water Resource Assessments web services, which used the OzNome 5-star tool to assess the current state and provide recommendations for improving data provisioning arrangements, to enhance the potential uptake and use of their data assets.

18 OzNome data assessment criteria
Key word: Matching FAIR Principle
published: Implicitly FAIR
hosted: A1 - A2
curated: More than FAIR
updated, maintained:
licensed: R1.1
citeable: F1
described: R1, F2, F3
findable:
loadable: I1
useable: I2, R1.3
comprehensible: I2
connected, linked: I3
assessable: R1.2
trusted:
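The table on this slide can also be read in the other direction, from a FAIR principle to the OzNome criteria that address it. A small sketch of that lookup follows. Only the codes that appear on the slide are included; criteria with no code on the slide are left out, "A1 - A2" is treated as the two codes A1 and A2 (an assumption), and the names `FAIR_PRINCIPLES` and `criteria_for` are hypothetical.

```python
# Codes as given on the slide; "A1 - A2" is assumed to mean A1 and A2.
FAIR_PRINCIPLES = {
    "hosted": ["A1", "A2"],
    "licensed": ["R1.1"],
    "citeable": ["F1"],
    "described": ["R1", "F2", "F3"],
    "loadable": ["I1"],
    "useable": ["I2", "R1.3"],
    "comprehensible": ["I2"],
    "connected, linked": ["I3"],
    "assessable": ["R1.2"],
}


def criteria_for(principle: str) -> list:
    """Return the criteria whose slide entry lists the given principle code."""
    return [
        criterion
        for criterion, codes in FAIR_PRINCIPLES.items()
        if principle in codes
    ]
```

For instance, `criteria_for("I2")` collects both "useable" and "comprehensible", which reflects the slide listing I2 against two separate criteria.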

19 Summary & conclusions
Augment FAIR principles: curated, updated, maintained, trusted
Add specific guidance on maturity within each criterion
Tuned to geospatial/environmental data
Form-based tool for self-assessment

20 Links and references
FAIR principles
FAIR principles and metrics for evaluation
OzNome data assessment criteria
5-star tool

21 Thank you
Land and Water
Simon J D Cox, Research Scientist
w people.csiro.au/Simon-Cox
Jonathan Yu, Research Scientist
w people.csiro.au/Jonathan-Yu

22 Current AWRA data ecosystem
Diagram labels: Data Providers (e.g. Govt agencies, research groups); Data users; 3rd parties or new researchers; General public; Policy Reports; Research Modelling environments; AWRA-L historical; AWRA-L 2005-; AWRA definitions; Geofabric features; Web application; File download

23 Fundamental Questions
In what ways can we assess the FAIRness of a digital resource?
To what degree can we automate this assessment?
Must we treat each type of digital resource differently?
Who will use the metrics? The producers, the funders, or the users?
Can one resource be more FAIR than another?
Will/should FAIRness assessments impact funding decisions?
Should only one organization define these metrics? Or can anybody make their own metrics?
What happens if a digital resource scores well against one set of metrics, but not another?

24 AWRA-L Draft Vocabularies
Referenceable metadata for implicit observable properties
Online AWRA vocabulary register (draft)
Mappings from AWRA data to online draft AWRA vocabularies

25 AWRA Draft Vocabulary example
Evapotranspiration
Potential Evapotranspiration
feature of interest: Critical Zone
object of interest: Water
has unit of measure: Millimeter

26 FAIR - Interoperable
Key word: Levels
loadable
  a. bespoke file format
  b. standard data-format, denoted by a MIME-type (CSV, JSON, XML, netCDF, etc)
  c. choice from multiple standard formats

27 FAIR - Interoperable
Key word: Levels
loadable
  a. bespoke file format
  b. standard data-format, denoted by a MIME-type (CSV, JSON, XML, netCDF, etc)
  c. choice from multiple standard formats
useable
  a. implicit schema, not formalized
  b. explicit schema, formalized in DDL, XSD, data-package, RDFS/OWL, JSON-Schema or similar
  c. community schema, available from a (standard) location

28 FAIR - Interoperable
Key word: Levels
loadable
  a. bespoke file format
  b. standard data-format, denoted by a MIME-type (CSV, JSON, XML, netCDF, etc)
  c. choice from multiple standard formats
useable
  a. implicit schema, not formalized
  b. explicit schema, formalized in DDL, XSD, data-package, RDFS/OWL, JSON-Schema or similar
  c. community schema, available from a (standard) location
comprehensible
  a. local field labels
  b. field labels linked to text explanations
  c. standard labels (e.g. CF Conventions, UCUM units)
  d. some field names linked to standard, externally managed vocabularies
  e. all field names linked to standard, externally managed vocabularies

29 OzNome FAIR 5-star

30 37 repositories

31 Scoring the resources

32 Overall evaluation

33 Unpacking data
N: One symbol, many meanings
Nitrogen, Newtons, North, Noon, Neutral, November, Moles, No!


