Jonathan Yu, Environmental Informatics, CSIRO Land and Water

Presentation on theme: "Jonathan Yu, Environmental Informatics, CSIRO Land and Water"— Presentation transcript:

1 Jonathan Yu, Environmental Informatics, CSIRO Land and Water
Hacking Weaving an integrated environmental data ecosystem using Linked Data and resource identifiers
Jonathan Yu, Environmental Informatics, CSIRO Land and Water
Acknowledgements to the OzNome team at CSIRO (Ben Leighton, Simon Cox, Paul Box, David Lemon, Hendra Wijaya, Ramneek Singh, Ashley Sommer, Andrew Freebairn, Peter Fitch, Matt Stenson, Sally Tetreault-Campbell, Andy Reeson, Todd Sanderson)

2 Data science and modelling
Cloud computing (HPC, *aaS): access to compute anywhere, anytime, at scale
Data (repositories, web services, infrastructure, formats…): access to data at increasing volume, variety, velocity
Linked Data resources (ontologies, vocabs, semantics): capturing an increasing amount of precise info models and definitions for all scientific concepts
Containerisation, collab tech: reproducible, repeatable, collaborative, shareable, mobile orchestration of workflows
Data science and modelling: increasingly powerful, shareable environments for access and use of data for science
We live in an exciting time computing wise…

3 Data challenges
Dealing with data silos: unconnected, unknown, unmanaged. Improving data access and connectivity.
Making sense of the data: understanding and using data in context. Integrating it with other datasets.
Data access and connectivity: the majority of data points are not yet connected today, and companies often do not have the right platforms to curate, publish, aggregate and analyse.

4 Weaving an integrated environmental data ecosystem using Linked Data and resource identifiers

5 Users
Access and use any data precisely
Using native tools
Run any-everytime
From any-everywhere
Reproducible
Shareable/linkable/citeable
Collaborative
“Just works”

6 Providers/producers
Concrete metrics of use
Dialogue with users
Shareable/linkable/citeable
Collaborative
Easy to share data

7 Leverage existing resources Meet users where they are now
Show the possible with modern computing and Linked Data approaches

8 Hacking to link across existing data
The aim is to network existing information infrastructures with the Knowledge Network. Leverage what is already there, on the principle that everything has a “Cool” identifier.

9 Link to tools people already use

10 Link up communities that already exist
The aim is to network existing information infrastructures with the Knowledge Network. Leverage what is already there, on the principle that everything has a “Cool” identifier.

11 Typical researcher user experience…
[Diagram labels: Data providers (e.g. government agencies, research groups); Data users; 3rd parties or new researchers; Policy; General public; Research; Spatial features; Web application; File download; Reports; Modelling environments; X data 2005-; X definitions; X historical]
Current AWRA data ecosystem: does that resonate with you?

12 Register Integrate Transform Analyse Synthesize Spatial features
[Diagram labels: Data providers (e.g. government agencies, research groups); Data users; 3rd parties or new researchers; Policy; General public; Research; Spatial features; Domain definitions; Data; Reports; Modelling environments; Discover; Access; Understand; Transform; Integrate; Deliver; Register outputs; Critical mass; Easy]

13 Bootstrapping existing resources minting identifiers and APIs…
[Diagram labels: Data providers (e.g. government agencies, research groups); Knowledge Network; Definition Service; Data broker APIs; Any spatial feature; Any definition; Any gridded data; Data users; 3rd parties or new researchers; Policy; General public; Research; Reports; Modelling environments]
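As a rough illustration of what "minting identifiers" for existing resources could look like, the sketch below builds an HTTP identifier from a resource kind and the local code it already has. The `http://example.org/id` namespace, the `mint_identifier` helper, the kinds and the sample codes are all hypothetical; the slides do not describe the actual minting scheme.

```python
from urllib.parse import quote

# Hypothetical namespace; a real deployment would use its own stable domain.
BASE = "http://example.org/id"


def mint_identifier(kind: str, local_id: str, base: str = BASE) -> str:
    """Build an HTTP identifier from a resource kind and its existing local code."""
    # Percent-encode both parts so spaces and slashes cannot break the path.
    return f"{base}/{quote(kind, safe='')}/{quote(local_id, safe='')}"


# Illustrative resources only: a spatial feature, a definition, a gridded dataset.
existing = [
    ("geo-feature", "801"),
    ("definition", "soil moisture"),
    ("dataset", "awra/2005"),
]

for kind, local_id in existing:
    print(mint_identifier(kind, local_id))
```

The point of the sketch is that the existing local code is reused rather than replaced, so the identifier can be minted without changing the source system.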

14 CSIRO Knowledge Network
URIs/HTTP identifiers for everything…
Indexing: 70,000+ datasets; 170,000+ geo-features; AusGovt organisations list (1220 organisations, ABNs next)
Possible extensions: LinkedIn, ORCID, Twitter, etc.
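One common way to use an HTTP identifier from code is content negotiation: the same URI is requested with an `Accept` header that asks for a machine-readable representation. The sketch below only prepares such a request with the Python standard library and does not send it. The URI and the `build_linked_data_request` helper are hypothetical, and whether the Knowledge Network supports these media types is an assumption, not something the slides state.

```python
from urllib.request import Request


def build_linked_data_request(identifier: str, media_type: str = "text/turtle") -> Request:
    """Prepare (but do not send) a GET request asking for a given representation."""
    return Request(identifier, headers={"Accept": media_type}, method="GET")


# Hypothetical identifier for a geo-feature.
req = build_linked_data_request("http://example.org/id/geo-feature/801")
print(req.get_method(), req.full_url, req.get_header("Accept"))

# The same identifier could be requested as JSON-LD by changing the media type.
req_json = build_linked_data_request(
    "http://example.org/id/geo-feature/801", "application/ld+json"
)
```

Passing `req` to `urllib.request.urlopen` would perform the actual lookup against a live service.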

15 View of analytics on a data provider’s knowledge repository
Link to feature here:
Data visualisation of key themes in the knowledge repository
Analysis of the top 100 themes in the knowledge repository
Social analytics about knowledge people love
Social analytics about knowledge people are viewing
Knowledge network | Jonathan Yu

16 Example PID for spatial thing: Rpub doc:

17 Jupyter NB: Github link
Example PID for data collection: Jupyter NB: Github link

18 Example: cookie-cut a spatial region (using its PID) from a continental-scale dataset via a web service
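The cookie-cut idea can be sketched locally as follows: a region is looked up by its PID, and only the grid cells whose centres fall inside that region are kept. This is a stand-in that assumes a bounding-box region and an in-memory grid; the `REGIONS` registry, the PID and the `cookie_cut` function are hypothetical, and the actual web service in the talk presumably does the subsetting server-side against real geometries.

```python
# Hypothetical registry: PID -> bounding box (min_lon, min_lat, max_lon, max_lat).
REGIONS = {
    "http://example.org/id/geo-feature/demo": (2.0, 1.0, 4.0, 3.0),
}


def cookie_cut(grid, origin_lon, origin_lat, cell, pid, regions=REGIONS):
    """Return the part of `grid` whose cell centres lie inside the region's box.

    grid[0] is the southernmost row; cell (r, c) has its centre at
    (origin_lon + (c + 0.5) * cell, origin_lat + (r + 0.5) * cell).
    """
    min_lon, min_lat, max_lon, max_lat = regions[pid]
    rows = [r for r in range(len(grid))
            if min_lat <= origin_lat + (r + 0.5) * cell <= max_lat]
    cols = [c for c in range(len(grid[0]))
            if min_lon <= origin_lon + (c + 0.5) * cell <= max_lon]
    return [[grid[r][c] for c in cols] for r in rows]


# Toy "continental" grid: 5 rows by 6 columns, value encodes (row, column).
continental = [[10 * r + c for c in range(6)] for r in range(5)]
subset = cookie_cut(continental, 0.0, 0.0, 1.0,
                    "http://example.org/id/geo-feature/demo")
print(subset)  # [[12, 13], [22, 23]]
```

The caller only supplies the region's identifier, never its geometry, which is the convenience the slide is pointing at.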

19 Weaving an integrated environmental data ecosystem using Linked Data and resource identifiers

20 Data science and modelling
Cloud computing
Data
Linked Data resources
Containerisation, collab tech
Data science and modelling
It just works!

21 Challenges Making it intuitive and easy
Need more “Linked Data” tools/libraries and examples at user end of ecosystem (e.g. Python,
Need more resources to bootstrap
Conventions/protocols about the content
Base ingredients to work with

22 ABS Statistical Areas Level 2
ABS Greater Capital Cities Classification
Labels: SA2, GCCSA

23 Common convention for encoding spatial things
Original

RDF/SKOS:
    skos:prefLabel "Australian Capital Territory" ;
    skos:notation "801"^^abs:SA4_Code ;
    skos:notation "8ACTE"^^abs:GCC_Code ;
    skos:exactMatch "8"^^abs:STE_CODE16 .

SKOS-inspired KVP:
    prefLabel: "Australian Capital Territory"
    notation: "801"
    related: "abs-gcc:8ACTE"
    exactMatch: "abs-ste:8"
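To show how the two encodings on this slide relate, here is a minimal sketch that turns the SKOS-inspired key-value record into Turtle text. The subject URI, the `abs-gcc` and `abs-ste` namespace URIs and the helper functions are hypothetical; only the SKOS namespace is the real one. It treats `prefLabel` and `notation` as plain literals and everything else as a prefixed reference, and it does not escape special characters in literals.

```python
PREFIXES = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "abs-gcc": "http://example.org/abs/gcc/",  # hypothetical namespace
    "abs-ste": "http://example.org/abs/ste/",  # hypothetical namespace
}

# The key-value record as shown on the slide.
record = {
    "prefLabel": "Australian Capital Territory",
    "notation": "801",
    "related": "abs-gcc:8ACTE",
    "exactMatch": "abs-ste:8",
}

LITERAL_KEYS = {"prefLabel", "notation"}


def expand(curie: str) -> str:
    """Expand a prefix:local value into a full URI in angle brackets."""
    prefix, _, local = curie.partition(":")
    return f"<{PREFIXES[prefix]}{local}>"


def kvp_to_turtle(subject: str, kvp: dict) -> str:
    """Serialise one key-value record as a Turtle statement about `subject`."""
    lines = []
    for key, value in kvp.items():
        obj = f'"{value}"' if key in LITERAL_KEYS else expand(value)
        lines.append(f"    skos:{key} {obj}")
    header = f"@prefix skos: <{PREFIXES['skos']}> .\n\n"
    return header + f"<{subject}>\n" + " ;\n".join(lines) + " ."


turtle = kvp_to_turtle("http://example.org/id/geo-feature/801", record)
print(turtle)
```

The key-value form stays easy to read and write by hand, while a small conversion like this keeps it linkable to other vocabularies.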

24 What do these keywords mean? 5-star vocabulary terms?
Precision vs. existence
How trusted is this?

25 Build on existing resources Meet users where they are now
Hack
Show the possible with modern computing and Linked Data approaches
Bootstrapping via identifiers for things so we can demonstrate value
Better conventions for content

26 Thanks!
Land and Water
Jonathan Yu, Research scientist
t e jonathan.yu [at] csiro.au

