GeoViQua: Advances in data quality disclosing Ivette Serral Center of Research in Ecology and Forestry Applications (CREAF)

1 GeoViQua: Advances in data quality disclosing Ivette Serral Center of Research in Ecology and Forestry Applications (CREAF) contact@geoviqua.org

2 www.geoviqua.org Foz do Iguaçu, 21-23 November 2012
QUAlity aware VIsualisation for the Global Earth Observation System of Systems
It is an FP7 project (2011-2014) devoted to showing the quality information embedded in GEOSS data.
10 partners, 7 countries

3 The problem
GEOSS data is handled by means of the GEOSS Common Infrastructure (GCI).
Is there quality information in the GCI?
–There is some, in the form of ISO 19115 DQ elements and lineage
–But not enough
The GCI does not follow a global model for quality.
The GCI content is shown and searchable on the GEO Portal.
The GEO Portal search and results:
–results are not ranked by quality
–quality indicators are not easily comparable
–spatially distributed uncertainty is not included

4 Community view: data quality?
Many researchers refer to the "famous five" as the common criteria for evaluating spatial data quality:
–lineage; completeness; consistency; positional accuracy; and attribute accuracy.
Broad scientific acceptance of these common spatial quality elements does not mean that they cover every fitness-for-use evaluation:
–user requirements can go far beyond the widely accepted famous five.
We used semi-structured telephone and face-to-face interviews with a variety of geospatial data users and experts from a number of countries and application domains.
More information at: http://www.geoviqua.org/Docs/SubmittedDeliverables/D2_1_GeoViQua.pdf

5 What about users?
Users are very interested in good-quality metadata records
–And in information that can help to assess the fitness-for-use of the data
Users find metadata records typically incomplete, with essential data omitted
–This makes the process of dataset discovery and selection more difficult
Users are also interested in soft knowledge about data quality
–Data providers' comments on the overall quality of a dataset, known data errors, potential data usage
–Peer reviews and recommendations (they contact their peers to obtain suggestions)
–Dataset provenance, citation and licensing information
Citation is incomplete (lack of valid producer contact details), and licensing is often missing
Citation: users rely on data from producers with a good reputation
Currently, some of these cannot be recorded in standard metadata
Users need to compare metadata records easily and systematically
–Side-by-side visualisation of all metadata elements would allow geospatial datasets to be compared more effectively, especially when datasets are very similar and differences are hard to distinguish

6 The ISO classical view
[Figure labels: Quality indicators; Provenance/Lineage; Usage]

7 A quality model is much more than positional accuracy
There are many quantifiable aspects that can be recorded:
–Consistency, completeness, positional, thematic and temporal accuracy…
There are many qualitative aspects that are needed:
–Lineage (traceability), scientific papers, user feedback, data usage…

8 The GeoViQua data model treats statistical uncertainties
[Figure: value of the vertical DEM accuracy, in m; labelled values 3.6 and 1.2, 3.6]
Explicit recognition that the errors acceptably fit a Normal distribution with mean 1.2
An overall positive bias was observed (a difficult feature to convey by traditional means)
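
To illustrate what a distribution adds over a single accuracy figure, here is a minimal Python sketch. It assumes that 1.2 m is the mean error and that 3.6 m is the standard deviation of the error. The slide gives both numbers but does not state the role of 3.6, so that reading is an assumption.

```python
from statistics import NormalDist

# Assumption: mean error (bias) of 1.2 m and standard deviation of 3.6 m.
# The slide shows both values; treating 3.6 as the standard deviation
# is our reading, not something the slide states.
dem_error = NormalDist(mu=1.2, sigma=3.6)

# Probability that an individual error is positive.
p_positive = 1.0 - dem_error.cdf(0.0)

# Central 95% interval of the error, in metres.
low = dem_error.inv_cdf(0.025)
high = dem_error.inv_cdf(0.975)

print(f"P(error > 0) = {p_positive:.3f}")
print(f"95% of errors fall between {low:.2f} m and {high:.2f} m")
```

With a non-zero mean the interval is not symmetric around zero, which is the kind of bias a single accuracy value cannot express.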

9 Two models of data quality are needed
Producers' quality metadata
–In the producer's metadata records
–Encoded in the classical ISO 19115/19139
–Some extensions required
–Stored in the current catalogues (GEOSS Clearinghouse, etc.)
Users' quality metadata
–In independent metadata repositories
–Linked to the producer's metadata by id
–Future component of the GCI?
–Contains comments, likes, star ratings, etc.

10 Advances in quality models: GVQ - producer quality model
http://schemas.geoviqua.org/GVQ/3.1.0

11 Advances in quality models: GVQ - producer quality model
1. Publications. Based on ISO 19115 CI_Citation and extended with ISO 690 elements. Added to a number of quality elements within the metadata document: an existing DQ_ or MD_ element is extended to allow a referenceDoc to be added.
2. Discovered issues. Added a discovered issue class (e.g., a problem which the producer has identified during generation of a dataset) to the DQ_DataQuality element.
3. Reference datasets used for evaluation. Added to the dataEvaluation section of ISO 19157 to allow recording the reference dataset used to assess the quality indicator.
4. Traceability. Added a new metaquality type to allow the lineage of a data quality assessment to be recorded, along with its representativity and coverage. This is a requirement of the QA4EO principles.
More information: Lucy Bastin [l.bastin@aston.ac.uk] and a poster in this session room

12 Advances in quality models: GVQ - user quality model
http://schemas.geoviqua.org/GVQ/3.1.0

13 Advances in quality models: GVQ - user quality model
ISO 19115 only provides MD_Usage to report how users apply the dataset in their activities. This is insufficient for GEOSS needs, so GeoViQua has elaborated this model from scratch.
A user can submit a GVQ_FeedbackItem in the form of:
–A user comment.
–A rating mark.
–A usage report supported by a citation of a report.
–A link to external feedback (blog pages, Google Docs document, etc.).
–A metadata override that amends a producer metadata value.
–A quality label (GEO Label).
These items are related to a dataset through an identifier.
More information: Lucy Bastin [l.bastin@aston.ac.uk] and a poster in this session room
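
The link between user feedback and a producer record can be pictured with a small Python sketch. This is not the GVQ schema: the class name FeedbackItem, its field names and the sample identifiers are hypothetical, chosen only to show six kinds of item held in a separate repository and joined to a dataset by identifier.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class FeedbackKind(Enum):
    # The six forms of feedback listed on the slide.
    COMMENT = "comment"
    RATING = "rating"
    USAGE_REPORT = "usage_report"
    EXTERNAL_LINK = "external_link"
    METADATA_OVERRIDE = "metadata_override"
    QUALITY_LABEL = "quality_label"


@dataclass
class FeedbackItem:
    # Hypothetical stand-in for GVQ_FeedbackItem; field names are illustrative.
    dataset_id: str  # identifier of the producer metadata record
    kind: FeedbackKind
    content: str


def feedback_for(dataset_id: str, items: List[FeedbackItem]) -> List[FeedbackItem]:
    """Return the feedback items that target one dataset identifier."""
    return [item for item in items if item.dataset_id == dataset_id]


# Made-up repository content, for illustration only.
repository = [
    FeedbackItem("dataset-001", FeedbackKind.COMMENT, "Cloud cover affects the northern tiles."),
    FeedbackItem("dataset-001", FeedbackKind.RATING, "4"),
    FeedbackItem("dataset-002", FeedbackKind.EXTERNAL_LINK, "https://example.org/review"),
]

for item in feedback_for("dataset-001", repository):
    print(item.kind.value, "->", item.content)
```

Keeping only the identifier in each item means the producer record never has to change when users add feedback.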

14 Advances in quality models: GVQ - user quality model
The GeoViQua Quality Model is explained in the GEOSS Best Practice Twiki: http://wiki.ieee-earth.org/GEOSS_Tutorials
It has been presented in the AIP5 session and it is a contribution to the GEOSS Standards and Interoperability Forum (SIF).
More information: Anna Riverola [Anna.Riverola@uab.cat]

15 Advances in visualizing metadata quality information
GeoViQua has developed the Q-Rubric tool, an extension of the earlier NOAA version.
It is an XSLT tool that converts XML metadata files into an HTML scoring page.
It analyses every piece of ISO quality metadata and rates it by presence/absence (attributing one point when metadata exists, but not penalizing if information is missing).
It helps users to evaluate how many metadata elements related to data quality are provided.
It adds two new information groups related to ISO quality: Quality and Usage.
The GEOSS representation style has been applied to the original Rubric tables.
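
The presence/absence rule can be sketched in a few lines. The Q-Rubric itself is an XSLT stylesheet. The Python below is only an illustration of the scoring idea, and the list of checked elements is a small subset picked for the example, not the tool's actual list.

```python
import xml.etree.ElementTree as ET

# Namespace used by ISO 19139 encodings of ISO 19115 metadata.
GMD = "http://www.isotc211.org/2005/gmd"

# Illustrative subset of elements to look for (an assumption of this sketch).
CHECKED_ELEMENTS = [
    "DQ_AbsoluteExternalPositionalAccuracy",
    "DQ_CompletenessOmission",
    "DQ_ConceptualConsistency",
    "LI_Lineage",
    "MD_Usage",
]


def presence_score(xml_text: str) -> dict:
    """Give one point per checked element that appears at least once."""
    root = ET.fromstring(xml_text)
    present = {elem.tag for elem in root.iter()}
    # Missing elements score 0; they are never given a negative score.
    return {name: int(f"{{{GMD}}}{name}" in present) for name in CHECKED_ELEMENTS}


# Minimal made-up record with one completeness report and a lineage element.
SAMPLE = f"""
<gmd:MD_Metadata xmlns:gmd="{GMD}">
  <gmd:dataQualityInfo>
    <gmd:DQ_DataQuality>
      <gmd:report><gmd:DQ_CompletenessOmission/></gmd:report>
      <gmd:lineage><gmd:LI_Lineage/></gmd:lineage>
    </gmd:DQ_DataQuality>
  </gmd:dataQualityInfo>
</gmd:MD_Metadata>
"""

scores = presence_score(SAMPLE)
total = sum(scores.values())
print(scores)
print(f"{total} of {len(CHECKED_ELEMENTS)} checked elements present")
```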

16 Advances in visualizing metadata quality information
Some results from the GCI:
–97203 metadata records held in the Clearinghouse; 96867 analysed
–14.79% do not define the mandatory topic category
–80.63% do not have any quality element (of any class)
–Quality: positional accuracy is the most populated class, with 37.77% documented, followed by completeness with 36.06% and logical consistency with 18.79%. Only 0.50% relates to thematic accuracy.
–Lineage: 35.27% do not have any lineage sub-element defined.
–Usage: 0.60% of elements documented.
Conclusions:
–Metadata providers do not comply with the ISO Core Mandatory. Many topic categories present only 75% completeness.
–This impacts metadata search engines for data discovery requests.
Download it: http://www.geoviqua.org/docs/isoRubricQHTML.xsl
More information: Alaitz Zabala [Alaitz.Zabala@uab.cat]

17 Advances in visualizing quality information I
Integrating UncertWeb project proposals: use NetCDF-U
The Network Common Data Form (NetCDF) is one of the primary methods of self-documenting data storage and access in the international geosciences research and education community and beyond.
NetCDF-U Conventions are used to formally qualify the uncertainty information in geospatial data encoded in the netCDF-3 format, by means of concepts from the UncertML best practice of the UncertWeb project.
NetCDF-U Conventions are designed to be fully compatible with the netCDF Climate and Forecast Conventions, the de facto standard for a large amount of data in the Fluid Earth Science community.
It is now a discussion paper in OGC.

18 Advances in visualizing quality information I
Many of the data involved in the GeoViQua scenarios are encoded in NetCDF, an open-source file format that gives strength and freedom to encode metadata.
GeoViQua is developing tools for reading and writing NetCDF-U files and for importing/exporting from/to other raster formats.
[Figures: a NetCDF file opened with the NASA software Panoply; a NetCDF file exported to an IMG file and opened with the new tool]
More information: Victor Zaldo [v.zaldo@creaf.uab.cat]

19 Advances in visualizing quality information II
Integration of quality information with the OGC Web Map Service: WMS-Q
WMS 1.3.0 currently does not support the integration of quality information well.
It does not define how a data layer can be semantically associated with its corresponding uncertainty layers.
The WMS-Q specification is proposed to stay as far as possible within the bounds of the WMS 1.3.0 specification, requiring as few extensions as possible.
To integrate dataset-level quality information into the WMS, we propose to slightly expand the type attribute of the MetadataURL element with unstructured and other-structured options, and to add a description element to the MetadataURL element.
Pixel-level uncertainty information can be encoded using the NetCDF Uncertainty Conventions (NetCDF-U).
Work tested in the OGC interoperability experiment OWS-9
More information: Jon Blower [j.d.blower@reading.ac.uk]
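
As a rough picture of the MetadataURL proposal, the Python sketch below builds such an element. The type attribute and the Format and OnlineResource children come from WMS 1.3.0. The value "unstructured" follows the slide. The element name Description, the URL and the text are assumptions of this sketch, and the WMS namespace is left out to keep it short.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK)


def metadata_url(url: str, type_value: str, mime: str, description: str) -> ET.Element:
    """Build a WMS 1.3.0 style MetadataURL element with the proposed additions."""
    element = ET.Element("MetadataURL", {"type": type_value})
    ET.SubElement(element, "Format").text = mime
    ET.SubElement(element, "OnlineResource", {
        f"{{{XLINK}}}type": "simple",
        f"{{{XLINK}}}href": url,
    })
    # "Description" is a hypothetical name for the proposed description element.
    ET.SubElement(element, "Description").text = description
    return element


quality_link = metadata_url(
    "https://example.org/quality/report.html",  # placeholder URL
    "unstructured",                             # proposed new type value
    "text/html",
    "Free-text quality report for this layer",
)
print(ET.tostring(quality_link, encoding="unicode"))
```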

20 Advances in visualizing quality information III
Preliminary results from experiments with colour coding:
Quality should be intercomparable, i.e. the saturation should be intuitively comparable even across hues/categories. Perceptual colour models make this possible.
Hue represents category, and saturation represents the "Purity for the parcel enrichment" (in percent) or the certainty.
[Figure: maps dated 16.12.2006 and 22.03.07, annotated "Nearly uncertain in both campaigns" and "Gain in certainty"]
More information: Simon Thum [simon.thum@igd.fraunhofer.de]
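
The hue/saturation mapping can be sketched as follows. Note the swap: this example uses the plain HSV model from Python's standard library, which is not a perceptual colour model, so equal saturations are not guaranteed to look equally strong across hues as the slide requires. The category names and hue values are hypothetical.

```python
import colorsys

# Hypothetical category-to-hue table (hue as a fraction of the colour wheel).
CATEGORY_HUE = {"forest": 0.33, "water": 0.60, "urban": 0.0}


def quality_colour(category: str, certainty_percent: float) -> tuple:
    """Map category to hue and certainty (0-100) to saturation; return 8-bit RGB."""
    # Clamp the certainty so that out-of-range inputs cannot break the mapping.
    saturation = max(0.0, min(100.0, certainty_percent)) / 100.0
    r, g, b = colorsys.hsv_to_rgb(CATEGORY_HUE[category], saturation, 1.0)
    return (round(r * 255), round(g * 255), round(b * 255))


for certainty in (0, 50, 100):
    print(certainty, quality_colour("water", certainty))
```

At zero certainty every category collapses to the same white, so uncertain pixels look alike whatever their class.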

21 Advances in visualizing quality information IV
Creation of a Carbon Atlas portal
Combining the possibilities of web mapping with the comparison of models including uncertainty: a combination of ncWMS (server) and OpenLayers (client):
1. Possibility to compare models with each other:
ncWMS: a Web Map Service for geospatial data stored in CF-compliant NetCDF files (developed and maintained by the Reading e-Science Centre)
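
A client compares models by requesting the same map extent for each model layer. The sketch below builds standard WMS 1.3.0 GetMap URLs in Python. The server address and the layer names are placeholders, not the Carbon Atlas endpoints, and the portal itself uses OpenLayers rather than Python.

```python
from urllib.parse import urlencode


def getmap_url(base_url: str, layer: str, bbox: tuple, width: int, height: int) -> str:
    """Build a WMS 1.3.0 GetMap request for one layer in CRS:84 (lon/lat order)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value asks for the default style
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
        "TRANSPARENT": "true",
    }
    return f"{base_url}?{urlencode(params)}"


SERVER = "https://example.org/ncWMS/wms"  # placeholder server
LAYERS = ("modelA/npp", "modelB/npp")     # hypothetical layer names

# One request per model, same extent and size, so the images can be overlaid.
urls = [getmap_url(SERVER, layer, (-180, -90, 180, 90), 512, 256) for layer in LAYERS]
for url in urls:
    print(url)
```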

22 Advances in visualizing quality information IV
2. Creation of a comparison map (based on the IPCC's visualization method): pixel colour = difference between models, patterns = percentage of agreement between models.
The ncWMS server needs to be extended with the possibility to associate pattern/raster.
More information: Pascal Evano [p.evano@creaf.uab.cat]

23 Advances in applied scenarios I
Datasets selected to cover a wide range of geospatial characteristics

24 Advances in applied scenarios II
Uncertainty assessment for continuous and categorical variables
Continuous variables: uncertainty of citizens' weather data in relation to the official Met Office data. More information: Dan Cornford [D.Cornford@aston.ac.uk]
Categorical variables: spatialized quality indicators derived from a satellite image classification, at global, local and pixel uncertainty levels. Several statistical classification methods are used. More information: Eva Sevillano [Eva.Sevillano@uab.cat]
[Figure labels: Cat1, Cat2, Cat3; Classification; Probability of success (%); Fidelity]
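
For the continuous case, one common way to summarise how a series of observations departs from a reference is the mean error (bias) together with the root-mean-square error. The slide does not say which statistics the project uses, so the Python sketch below is only an illustration, and the temperature values are made up.

```python
import math


def bias_and_rmse(observed, reference):
    """Return (mean error, root-mean-square error) of observed against reference."""
    if len(observed) != len(reference) or not observed:
        raise ValueError("need two non-empty series of equal length")
    errors = [o - r for o, r in zip(observed, reference)]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return bias, rmse


# Made-up temperatures in degrees Celsius, for illustration only.
citizen = [15.0, 17.0, 20.0, 18.0]
official = [14.0, 16.0, 18.0, 18.0]

bias, rmse = bias_and_rmse(citizen, official)
print(f"bias = {bias:.2f}, RMSE = {rmse:.2f}")
```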

25 Advances in including data quality in search
Quality search integrated in the EuroGEOSS Discovery and Access Broker, to be applied to the GEO Portal.

26 Advances in including data quality in search
Retrieve quality information embedded in metadata
More information: Lorenzo Bigagli [lorenzo.bigagli@pin.unifi.it]

27 Advances in labelling the quality: the GEO Label
What is it?
–The GEO Label is intended to assist the user to assess the scientific relevance, quality, acceptance and societal needs of the components (ST-09-02 Task Team, 2010).
Purposes?
–be a quality indicator for GEOSS geospatial data and datasets;
–improve user recognition and trust in datasets that carry a GEO label;
–assist in searching by providing users with visual clues of dataset quality and relevance; and
–increase visibility of EO data.
GEO label development:
–The GeoViQua project is currently undertaking research to define and evaluate the concept of a GEO label.
–The development is carried out in three phases.
[Figure: the three phases, marked "Done!" and "In progress!"]

28 Advances in labelling the quality: the GEO Label
Phase I study:
–Overall, the GEO label questionnaire results show that users and producers agree on the benefits of introducing a GEO label, with no distinct difference between user and producer views.
–The majority of respondents support an all-in-one drill-down interrogation facility as the key GEO label function.
Phase II study:
–The GEO labels will be a graphical representation generated individually for each dataset in the GEOSS (or other data portals and clearinghouses), based on the quality information that is available for that dataset.
–A second online questionnaire-based survey was run to identify the designs that convey quality information to users in the most efficient and comprehensible way.
Currently:
–At this stage we are analysing the results of GEO label study II to fully define and establish a GEO label that meets the needs of the geodata user community.
–Phase III: we will create physical prototypes which will be used in a human subject study.
More information: Victoria Lush [lushv@aston.ac.uk]

29 The future
Many possibilities have been shown. The project now enters a development phase in which the concepts presented and the prototypes need to be developed further.
Move the GeoViQua Quality Model towards broader adoption.
Develop a user feedback system prototype.
Test search and visualization developments in a GEO Portal replica (ESA contribution).
Work with the GEO Architecture committees to move some of these contributions towards adoption in the GCI.

30 Advances in promoting feedback to GEOSS
www.ogc.uab.cat/GEOSSBack/index_eng.htm
Just a prototype to play with and to demonstrate a concept
More information: Joan Masó [joan.maso@uab.cat]

31 GeoViQua: Advances in data quality disclosing Thanks! Ivette Serral Center of Research in Ecology and Forestry Applications (CREAF) contact@geoviqua.org

