GLOBAL BIODIVERSITY INFORMATION FACILITY


1 GLOBAL BIODIVERSITY INFORMATION FACILITY
Designing a Global Network to Accommodate Contributions from all Sources and Technical Abilities
Tim Robertson
GBIF Secretariat

2 Content
How the GBIF index is built
Joining the GBIF network
  Technical requirements
  Documentation on services and standards
The use of current protocols for data harvesting
Simplified full dataset harvesting
The new GBIF integrated publishing toolkit
Extending the model – Simple Transfer Schema task group

3 Today: How the network is structured

4 Today: Entry requirements

5 Basis of Record: Data served
(Source: GBIF Data Portal October 2008)

6 Basis of Record: What the standards say

7 Comparison: International Standards Organisation
2-digit country codes (ISO 3166)
Multilingual (English, French + external translations)
Simple tab-delimited file format
Loads straight into a database for reuse
As simple as it needs to be…
For controlled vocabularies, could this approach be adopted?
Could removing complex technical schemas allow for easier contribution?
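To illustrate how little machinery a tab-delimited controlled vocabulary needs, here is a minimal sketch. The column names (`code`, `name_en`, `name_fr`) and the three sample rows are assumptions made for the example, not the layout of any official ISO file.

```python
import csv
import io

# Hypothetical sample: a tab-delimited vocabulary with a header row.
# Column names and rows are illustrative, not an official ISO file layout.
SAMPLE = (
    "code\tname_en\tname_fr\n"
    "DK\tDenmark\tDanemark\n"
    "FR\tFrance\tFrance\n"
    "DE\tGermany\tAllemagne\n"
)

def load_vocabulary(text):
    """Return {code: row_dict} from tab-delimited text with a header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["code"]: row for row in reader}

def normalise_country(value, vocabulary):
    """Map a raw value to a vocabulary code, or None if it is not recognised."""
    candidate = value.strip().upper()
    return candidate if candidate in vocabulary else None

vocab = load_vocabulary(SAMPLE)
```

A contributor could then check a raw value with `normalise_country(" dk ", vocab)` without needing any schema tooling.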

8 Harvesting: Using existing protocols
Provider has a TAPIR wrapper
Wrapper allows for 200 records per request
260,000 records to harvest
1,300 request/response pairs
9 hours total
500MB of XML transferred
Extracted to a 32MB delimited file for the index
Compressed to 3MB
Why not produce this on the provider?
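The figures on this slide can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the derived per-request time and size ratios are simple averages and assume the 9 hours were spent entirely on the paged requests.

```python
import math

# Figures quoted on the slide.
total_records = 260_000
records_per_request = 200
total_hours = 9
xml_mb = 500
delimited_mb = 32
compressed_mb = 3

# Number of paged requests needed to fetch every record.
requests_needed = math.ceil(total_records / records_per_request)

# Average wall-clock time per request/response pair, in seconds.
seconds_per_request = total_hours * 3600 / requests_needed

# How much larger the XML transfer is than the files the index actually needs.
xml_vs_delimited = xml_mb / delimited_mb
xml_vs_compressed = xml_mb / compressed_mb

print(f"requests: {requests_needed}")
print(f"seconds per request: {seconds_per_request:.1f}")
print(f"XML is {xml_vs_delimited:.1f}x the delimited file")
print(f"XML is {xml_vs_compressed:.0f}x the compressed file")
```

On these numbers each request averages roughly 25 seconds, and the XML transferred is well over a hundred times the size of the compressed file that would carry the same index content.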

9 Harvesting: Streamlining the process
Benefits
Indexes can be more up to date
  better for the user
  benefits the provider
Provider systems can be left to answer specific real queries
  the original purpose for the wrapper software
Easy for small data publishers to produce
Already done in an ad-hoc manner for very large providers
Not dissimilar to the Sitemaps protocol

10 Harvesting: Streamlining the process
If this is already being done in an ad-hoc manner, should it be defined as a standard?
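As a rough idea of what a provider-side full dataset dump could look like, the sketch below writes records as a gzip-compressed, tab-delimited file and reads them back. The column names and the function names `write_dump` and `read_dump` are hypothetical; the slides do not define a file layout, so this is one possible shape rather than a proposed standard.

```python
import csv
import gzip
import io

# Hypothetical column set; a real dump would use agreed terms.
COLUMNS = ["id", "scientificName", "countryCode", "basisOfRecord"]

def write_dump(records, fileobj):
    """Write dict records as gzip-compressed, tab-delimited text to a binary file object."""
    with gzip.open(fileobj, "wt", encoding="utf-8", newline="") as text:
        writer = csv.DictWriter(
            text, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(records)

def read_dump(data):
    """Read the compressed dump back into a list of dicts (the harvester side)."""
    with gzip.open(io.BytesIO(data), "rt", encoding="utf-8", newline="") as text:
        return list(csv.DictReader(text, delimiter="\t"))

# Illustrative records only.
sample = [
    {"id": "1", "scientificName": "Puma concolor", "countryCode": "US",
     "basisOfRecord": "specimen"},
    {"id": "2", "scientificName": "Lynx lynx", "countryCode": "NO",
     "basisOfRecord": "observation"},
]

buffer = io.BytesIO()
write_dump(sample, buffer)
restored = read_dump(buffer.getvalue())
```

A harvester would fetch one such file in a single request instead of paging through the wrapper, leaving the wrapper free for specific queries.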

11 GBIF: The integrated publishing toolkit (IPT)
Publishing of
  Occurrence data
  Checklist data
  Taxonomic data
  Dataset descriptive data (metadata)
Key features
  Embedded data cache
    takes load off the “LIVE” system
    allows for file-based importing
  Web application to search and browse data
  TAPIR, WFS, WMS, TCS, EML, RSS, “Local DwC Index”
  Simple extensions – the “star schema”
  Can be used in a hosting environment
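The “star schema” idea of simple extensions can be pictured as one core table with any number of extension tables pointing back to it. The sketch below is an assumption about the general shape only: the field name `coreid`, the extension names, and the sample rows are all hypothetical and are not taken from the IPT itself.

```python
from collections import defaultdict

# Hypothetical core rows: one per occurrence record, each with a unique id.
core_rows = [
    {"id": "occ-1", "scientificName": "Puma concolor", "countryCode": "US"},
    {"id": "occ-2", "scientificName": "Lynx lynx", "countryCode": "NO"},
]

# Hypothetical extension tables: every row points back to a core row via "coreid".
extensions = {
    "images": [
        {"coreid": "occ-1", "url": "http://example.org/a.jpg"},
        {"coreid": "occ-1", "url": "http://example.org/b.jpg"},
    ],
    "identifications": [
        {"coreid": "occ-2", "identifiedBy": "A. Example"},
    ],
}

def attach_extensions(core, ext_tables):
    """Return core records with each extension's rows attached under its name."""
    # Index every extension table by the core id it refers to.
    index = {name: defaultdict(list) for name in ext_tables}
    for name, rows in ext_tables.items():
        for row in rows:
            index[name][row["coreid"]].append(row)

    joined = []
    for record in core:
        merged = dict(record)
        for name in ext_tables:
            # A core record may have zero, one, or many rows per extension.
            merged[name] = index[name].get(record["id"], [])
        joined.append(merged)
    return joined

records = attach_extensions(core_rows, extensions)
```

Adding a new kind of data then means adding another extension table keyed on the core id, with no change to the core itself.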

12 GBIF: The integrated publishing toolkit (IPT)

13 GBIF: The integrated publishing toolkit (IPT)
Ready for “alpha” testing – please enquire!
Demonstrations by Markus Döring and Tim Robertson all week
  Poster
  Lunchtime session Tuesday

14 Extending the model: More data types
The data being mobilised is largely “single core entity”
  the “Occurrence Record”
Integrating with other areas?
  Earth observation networks
  Ecological networks
Task group to investigate specific use cases to determine a Common Transfer Schema:
  Primarily data modeling experience
  Technical implementation
  Presentation to TDWG community
Perhaps multiple core entities, each extensible?

15 Extending the model: More data types

16 Extending the model: More data types

17 Contact
Tim Robertson
GBIF Secretariat
Universitetsparken 15
2100 Copenhagen
Denmark

