The Process of Data Ingestion in ÆKOS Andrew Graham and Matt Schneider TERN Ecoinformatics Data Analysts Logos used with consent. Content of this presentation.

Presentation transcript:

The Process of Data Ingestion in ÆKOS Andrew Graham and Matt Schneider TERN Ecoinformatics Data Analysts Logos used with consent. Content of this presentation except logos is released under TERN Attribution Licence Data Licence v1.0

Introduction: The Data Analyst Role with TERN Ecoinformatics. Analysis of source data and methods; ÆKOS system development and domain modelling; contextual description of the data; publication of data into ÆKOS.

The AEKOS Framework: 1. Upper Context: Party, Project, Scope, etc. 2. Domain Model (Ontology): Observed entities, their features and relationships. 3. Description Model: Methods and definitions. 4. Indexing Model: Search and federation.

Upper Context. Provides context for datasets: contact details; high-level objectives of the program; licensing details and conditions of use; statement of scope; alignment with national metadata standards (ANDS); statement of curation processes applied to data.

Understanding Field Sampling: Schematic view of sampling configuration.

Methodological work-flow: Study Location Selection; Study Location Visit; Study Location Establishment; Sampling Unit Selection; Vegetation Assessment; Physical Assessment; Landscape Assessment; Soil Assessment; Fire Evidence; Surface Cover; Disturbance Evidence; Vertebrate Evidence; Climate Evidence; Species Assessment; Species Life Stage; Vegetation Assemblage; Voucher Collection; Canopy Age-class; Canopy Assessment; Structural Formation; Overstorey Measurement.

Authored Method Descriptions: Start with published method manuals; enrich existing method descriptions (protocols) with external web links and other resources; clarify questions about methods; divide the protocol into smaller method descriptions.

Authored Method Descriptions: Use a consistent format across datasets to allow comparison; link each data value directly to the specific method of measurement; this allows rapid assessment of the suitability of data for re-use; eventually, a method catalogue for researchers.
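The direct linkage between a data value and its method of measurement can be pictured with a minimal sketch. This is not the ÆKOS implementation: the class names, the `method_id` key, and the example catalogue entry are all hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MethodDescription:
    """One small method description (hypothetical structure)."""
    method_id: str
    title: str
    source_manual: str


@dataclass(frozen=True)
class Observation:
    """A recorded value that points at the method used to measure it."""
    feature: str
    value: str
    method_id: str


def method_for(obs: Observation, catalogue: Dict[str, MethodDescription]) -> MethodDescription:
    """Look up the method description behind a single data value."""
    return catalogue[obs.method_id]


# Hypothetical catalogue entry and observation, for illustration only.
catalogue = {
    "canopy-cover-01": MethodDescription(
        method_id="canopy-cover-01",
        title="Canopy cover estimate",
        source_manual="Example survey manual (hypothetical)",
    )
}
obs = Observation(feature="canopy cover", value="35 %", method_id="canopy-cover-01")
linked = method_for(obs, catalogue)
```

Because every value carries a method key, a re-user could check how a measurement was taken before deciding whether the data suit their purpose.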

Definition of source datasets. Analysis and definition of source data types: observation data; taxonomic concepts (a specific type of reference data); reference data (i.e. lookup tables); images and other artefacts.

Mapping to the ÆKOS Domain Model (diagram). Entities: Study Location; Sampling Unit; Study Location Visit; Spatial Point; Species Organism Group; Voucher Specimen; Landscape. Attributes shown: mudmap; comment; visit date; observers; disturbance; datum; x coord; y coord; identifier; marker type; determined identity; accession No.; determiner; field identity; life form; cover/abundance; life stage; phenology; dominance; slope; aspect; landform pattern. Relationships shown: selects; represents; contains; represented by.
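As a rough illustration of mapping a source record onto a common model, the sketch below uses three of the entities named on the slide. The grouping of attributes under each entity, the source column names (`SITE_ID`, `EASTING`, and so on), and the sample values are assumptions made for the example; the slide does not specify them.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SpatialPoint:
    datum: str
    x_coord: str
    y_coord: str


@dataclass
class StudyLocationVisit:
    visit_date: str
    observers: List[str]


@dataclass
class StudyLocation:
    identifier: str
    point: SpatialPoint
    visits: List[StudyLocationVisit] = field(default_factory=list)


def map_row(row: Dict[str, str]) -> StudyLocation:
    """Map one source row onto the sketch model; values are copied unchanged."""
    location = StudyLocation(
        identifier=row["SITE_ID"],
        point=SpatialPoint(row["DATUM"], row["EASTING"], row["NORTHING"]),
    )
    # Observers are assumed to be stored as a semicolon-separated string.
    location.visits.append(
        StudyLocationVisit(row["VISIT_DATE"], row["OBSERVERS"].split(";"))
    )
    return location


# Hypothetical source row, for illustration only.
row = {
    "SITE_ID": "SA-001",
    "DATUM": "GDA94",
    "EASTING": "281000",
    "NORTHING": "6132000",
    "VISIT_DATE": "2012-05-14",
    "OBSERVERS": "A. Example;B. Example",
}
location = map_row(row)
```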

Indexing. Enrichment of data with common indexes: project-level traits; data management traits; ecological process traits (disturbance and land-use); measurement details; species taxonomy; vegetation assemblage (e.g. NVIS Major Veg. Groups); jurisdictional and bio-geographic boundaries; spatially derived features (e.g. distance from road, slope, aspect, etc.).
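One way to picture index enrichment is as a function that derives search terms from a record while leaving the record itself untouched. The sketch below is only illustrative: the field names, the name-mapping table, and the sample record are hypothetical and do not describe the actual ÆKOS indexing model.

```python
from typing import Dict


def build_index_entry(record: Dict[str, str], accepted_names: Dict[str, str]) -> Dict[str, str]:
    """Derive index terms for one record without altering the record."""
    entry = {
        "project": record["project"],
        "jurisdiction": record["state"],
    }
    # Use a mapped name when one is known, otherwise keep the field name.
    field_name = record["species"]
    entry["species"] = accepted_names.get(field_name, field_name)
    return entry


# Hypothetical name mapping and record, for illustration only.
accepted_names = {"Old name example": "Accepted name example"}
record = {"project": "Example survey", "state": "SA", "species": "Old name example"}
entry = build_index_entry(record, accepted_names)
```

Keeping the index separate from the record is consistent with the principle, stated elsewhere in the presentation, that data values are not changed.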

Federated Taxonomy

The AEKOS Ingestion DSL (Screen cap of Eclipse...): Source data query; vocabulary management; method description; mapping to the common model; populate indexes; upper context authoring; sandbox testing.

Data Work-flow: Point of truth is always the source database; data values are not changed; data issues are fed back to Data Providers; an automatic data refresh mechanism has been developed; corrections are made in the source database and fed back to AEKOS on the next push; after the first load, only new records and edits are pushed; update frequency is defined for each dataset.
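The refresh behaviour described above (a full first load, then only new records and edits) can be sketched as a simple filter on a modification timestamp. This is an assumption about how such a mechanism might work, not a description of the AEKOS refresh code; the `modified` field and the function name are hypothetical.

```python
from datetime import datetime
from typing import Dict, List, Optional


def records_to_push(source_rows: List[Dict], last_push: Optional[datetime]) -> List[Dict]:
    """Select the rows to send on the next push.

    On the first load (no previous push) every row is sent; afterwards only
    rows created or edited since the last push are sent.
    """
    if last_push is None:
        return list(source_rows)
    return [row for row in source_rows if row["modified"] > last_push]


# Hypothetical source rows, for illustration only.
rows = [
    {"id": 1, "modified": datetime(2013, 1, 10)},
    {"id": 2, "modified": datetime(2013, 3, 2)},
]
first_load = records_to_push(rows, None)
next_push = records_to_push(rows, datetime(2013, 2, 1))
```

Because the selection reads from the source rows and never writes to them, the source database stays the point of truth.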

Quality Assurance. ÆKOS QA and review: team review of the domain modelling of every dataset ingested; sandbox test ingestion before publishing to ÆKOS; review of method descriptions by other team members; internal code validation and error checking.

Quality Assurance. Data Providers QA: review method descriptions; review upper context. Portal feedback: review data content in the portal; use the portal and suggest enhancements and changes (look and feel; index traits; data accuracy and representation); feedback survey and facility on portal.

Thank you. Contact Details: Data Analyst – Matt Schneider; Data Analyst – Andrew Graham; Website