Making FAAM Flights Discoverable


1 Making FAAM Flights Discoverable
28/04/2016 Graham Parton

2 FAAM Data in CEDA Archives
- Over 950 flights, 2004 to present
- More than 2 TB of data
- Core products routinely delivered and ingested
- Non-core products available only sporadically
- Changing project data access requirements
- Ingestion processes now largely automated, but reliant on Project information being set up manually

3 Finding FAAM Data
- Historically done purely through archive navigation
- Reliant on the user knowing where to look
- Required the user to know flight years and numbers to go via /badc/faam
- OR required CEDA to manually maintain links in project data collections
- CEDA Data Catalogue had very poor FAAM flight coverage

4 Why Cataloguing Changed
- Cataloguing was not sustainable as a manual activity
- EU INSPIRE legislation mandated making all geospatial data discoverable
- EUFAR programme also required flight discoverability
- Export to external discovery services (e.g. data.gov, NERC Data Catalogue Service) following ISO 19115/19156
- Seeking a consistent approach to cataloguing and content

5 CEDA data catalogue
- CEDA’s new ISO-compliant catalogue launched Oct 2014
- Migration of CEDA’s previous catalogue content to maintain data citations
- Increase in granularity from 340 collections to over 3000 datasets
- Since roll-out: wholesale review of content and updates to supporting architecture (more to be done)
- New service is scriptable!
- FAAM record creation became a reality

6 Information Sources
- Carefully maintained flight information gathered over the years (flight number to project(s) mapping)
- Consistent directory convention followed
- Use of CEDA file-naming convention: instruments and platform details
- Details from FAAM site and NERC Grants on the Web
- Harvested file metadata (Elasticsearch for Flight Finder Tool)
- CEDA’s security database
- README files for each flight
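
As an illustration of how a file-naming convention can yield instrument and platform details, here is a minimal Python sketch. The pattern `<instrument>_<platform>_<YYYYMMDD>_<extra>.<ext>` and the example file name are assumptions made for this sketch, not a statement of the CEDA convention itself; `parse_filename` is a hypothetical helper.

```python
import re
from typing import Optional

# ASSUMPTION: names look like <instrument>_<platform>_<YYYYMMDD>[_<extra>].<ext>.
# This pattern is illustrative only and is not taken from the slides.
FILENAME_RE = re.compile(
    r"^(?P<instrument>[a-z0-9-]+)"
    r"_(?P<platform>[a-z0-9-]+)"
    r"_(?P<date>\d{8})"
    r"(?:_(?P<extra>.+))?"
    r"\.(?P<ext>[A-Za-z0-9]+)$"
)


def parse_filename(name: str) -> Optional[dict]:
    """Return the named parts of a file name, or None if it does not match."""
    match = FILENAME_RE.match(name)
    return match.groupdict() if match else None


# Hypothetical file name used purely as an example.
example = parse_filename("core_faam_20160428_v004_r0_b950.nc")
not_data = parse_filename("00README")
```

Files that do not follow the assumed pattern (such as a README) return `None`, so a harvesting script can skip them or flag them for manual checking.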

7 Preparing for FAAM record creation
- Collate missing information (project descriptions, acronym mappings, instrument details)
- Ensure consistency in archive-derived details
- Work up interactions with CEDA’s services: Elasticsearch, securityDB
- Adjust catalogue structure
- Produce 1500 lines of Python utilising the catalogue’s underpinning architecture (Django)
- Test, verify, refine, re-test, verify further and now... roll out

8 950+ Flights Now Catalogued
950+ FAAM Flight dataset records uniquely linked to the archive and:
- Project record(s): WHY the flights took place
- Project data collections, linking to other associated project data (and 3rd party data resources)
- Listing of instruments on each flight
- Linked to EUFAR flight finder tool
All discoverable by:
- Project name
- Acronym
- Instrument
- Platform (FAAM)
- Flight number

9

10 Show and tell…

11 To be done
- Automated script to update/create records during ingestion
- Updates to records when new data arrives
- Establishing better metadata provision from source (FAAM)
- Links from Flight Finder tracks
- ARSF and other EUFAR aircraft data (CMIP5, CMIP6 etc)
- Updates to missing geographic/temporal info from Flight Finder’s Elasticsearch content
- Scraping parameters from files via Elasticsearch

12 What the script does
Reads in FAAM Project details from a CSV file and:
- Finds existing Project record in catalogue, or
- Creates new Project record
- Returns a dictionary where keys are project abbreviations and values are MOLES Projects
Gathers list of flights and internal paths from archive (/badc/faam/data/*/*)
Gathers the following for each flight:
- Internal path
- Platform details (including link to existing FAAM Platform record)
- Flight number and project abbreviation (from 00README file)
- Instrument groups (from directories) and instrument abbreviations (thanks to CEDA file-naming convention)
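
The find-or-create step and the flight listing might be sketched as follows. This is not the actual script: `InMemoryCatalogue`, `Project`, `load_projects` and `list_flight_dirs` are hypothetical names, the CSV column names are assumed, and a temporary directory stands in for /badc/faam/data with an assumed year/flight layout.

```python
import csv
import io
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Project:
    abbreviation: str
    title: str


class InMemoryCatalogue:
    """Hypothetical stand-in for the real catalogue (not the MOLES/Django models)."""

    def __init__(self):
        self._projects = {}

    def get_or_create_project(self, abbreviation, title):
        """Return (project, created); reuse the record if the abbreviation exists."""
        key = abbreviation.strip().upper()
        created = key not in self._projects
        if created:
            self._projects[key] = Project(key, title.strip())
        return self._projects[key], created


def load_projects(csv_text, catalogue):
    """Map project abbreviation -> Project, creating records only when missing."""
    projects = {}
    # ASSUMPTION: the CSV has 'abbreviation' and 'title' columns.
    for row in csv.DictReader(io.StringIO(csv_text)):
        project, _ = catalogue.get_or_create_project(row["abbreviation"], row["title"])
        projects[project.abbreviation] = project
    return projects


def list_flight_dirs(data_root):
    """List <root>/*/* directories, mirroring the /badc/faam/data/*/* pattern."""
    return sorted(p for p in Path(data_root).glob("*/*") if p.is_dir())


# Made-up project details, for illustration only.
catalogue = InMemoryCatalogue()
csv_text = "abbreviation,title\nABC,Example Project A\nxyz,Example Project X\n"
projects = load_projects(csv_text, catalogue)

# A throwaway directory tree stands in for the archive.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for flight in ("b950", "b951"):
        (root / "2016" / flight).mkdir(parents=True)
    (root / "2016" / "notes.txt").write_text("not a flight")
    flights = [p.name for p in list_flight_dirs(root)]

# Asking again for an existing abbreviation reuses the record.
_, created_again = catalogue.get_or_create_project("abc", "Example Project A")
```

Normalising abbreviations before the lookup keeps repeated runs from creating duplicate Project records, which matters if the script is re-run during ingestion.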

13 What the script does
Creates 1 Dataset record per flight:
- Creates title and abstract based on templates, platform, instrument groups and project info
- Collects bounding box and temporal extent by either:
  - polling the Elasticsearch content supporting the EUFAR Flight Finder Tool
  - details gathered from flight log
- Presets/templates for some fields (e.g. keywords, update frequency, lineage and quality statements)
- Linked to Project record(s)
- Polls SecurityDB to get permission settings
Diagram labels: Project; Dataset A; Archive Data; Title, Abstract, Keywords, Lineage, Quality, Geo-temp, Permission, etc.
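
A rough sketch of templated titles and abstracts, plus extents derived from track points, is given below. The template wording, the function names and the sample track are all invented for illustration; the real script obtains extents from Elasticsearch or the flight log, whereas this sketch uses an in-memory list.

```python
from datetime import datetime
from string import Template

# ASSUMPTION: template wording is invented; the slides do not give the real text.
TITLE_TEMPLATE = Template(
    "FAAM Flight $flight: airborne atmospheric measurements for the $project project"
)
ABSTRACT_TEMPLATE = Template(
    "Data collected on board the $platform during flight $flight "
    "for the $project project. Instrument groups: $instruments."
)


def describe_flight(flight, project, platform, instrument_groups):
    """Fill the title and abstract templates for one flight."""
    fields = {
        "flight": flight.upper(),
        "project": project,
        "platform": platform,
        "instruments": ", ".join(sorted(instrument_groups)),
    }
    return TITLE_TEMPLATE.substitute(fields), ABSTRACT_TEMPLATE.substitute(fields)


def extents(track):
    """Return (bounding box, (start, end)) from (time, lat, lon) points.

    Plain min/max is used, so a track crossing the antimeridian
    would need extra handling that this sketch leaves out.
    """
    track = list(track)
    if not track:
        raise ValueError("track has no points")
    times = [t for t, _, _ in track]
    lats = [lat for _, lat, _ in track]
    lons = [lon for _, _, lon in track]
    bbox = {"south": min(lats), "north": max(lats), "west": min(lons), "east": max(lons)}
    return bbox, (min(times), max(times))


# Made-up flight details and track points, for illustration only.
title, abstract = describe_flight("b950", "ABC", "FAAM aircraft", ["core", "cloud"])
track = [
    (datetime(2016, 4, 28, 9, 0), 52.1, -0.6),
    (datetime(2016, 4, 28, 10, 30), 53.4, -2.2),
    (datetime(2016, 4, 28, 12, 0), 51.8, 0.3),
]
bbox, period = extents(track)
```

Keeping the wording in templates means a change to the standard title or abstract text can be applied to every flight record by re-running the script.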

14 What the script does continued…
Then adds:
- Related parties (authors, CEDA officer, publisher etc)
- Creates (if needed) and links to data collections and the main FAAM data collection
- Constructs and links to data acquisition details: instrument abbreviations from file names are used to link to the correct instrument records
- Links to FAAM platform record
- Link to a record detailing the FAAM flight itself
Diagram labels: FAAM Collection; Dataset; Dataset Collection A; Archive Data; Platform; Project; Instrument; Operation; Acquisition
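
The linking described on this slide could look roughly like the sketch below. The classes are simplified, hypothetical stand-ins rather than the catalogue's actual data model, and the instrument lookup table and record names are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Acquisition:
    platform: str
    instruments: list = field(default_factory=list)


@dataclass
class Dataset:
    title: str
    projects: list
    acquisition: Acquisition
    collections: list = field(default_factory=list)


def link_instruments(abbreviations, instrument_records):
    """Split abbreviations into matched instrument records and unmatched ones."""
    found = [instrument_records[a] for a in abbreviations if a in instrument_records]
    missing = [a for a in abbreviations if a not in instrument_records]
    return found, missing


def link_collections(dataset, collections, project_abbreviation, main_collection="FAAM"):
    """Add the dataset to its project collection and to the main collection.

    Collections are created on first use; repeated calls do not add duplicates.
    """
    for name in (project_abbreviation, main_collection):
        members = collections.setdefault(name, [])
        if dataset.title not in members:
            members.append(dataset.title)
        if name not in dataset.collections:
            dataset.collections.append(name)


# Hypothetical lookup of instrument records keyed by file-name abbreviation.
instrument_records = {"core": "Core instrument record", "ccn": "CCN instrument record"}
found, missing = link_instruments(["core", "xyz"], instrument_records)

dataset = Dataset(
    title="FAAM Flight B950",
    projects=["ABC"],
    acquisition=Acquisition(platform="FAAM platform record", instruments=found),
)
collections = {}
link_collections(dataset, collections, "ABC")
link_collections(dataset, collections, "ABC")  # second call changes nothing
```

Returning the unmatched abbreviations separately gives a list of instrument records that still need creating by hand, rather than silently dropping them.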

