Development of Program Level Product Quality Metrics
Robert Frouin (1), Rama Hampapuram (2), Greg Hunolt (3), Kamel Didan (4), and others (5)
(1) Scripps Institution of Oceanography, (2) GSFC / ESDIS, (3) SGT, (4) UofA, (5) MEaSUREs PIs
ESDSWG Meeting – MPARWG Breakout, October 2010, New Orleans
Goal
The purpose is to stimulate discussion of product quality metrics that are useful to the program (managers, missions, etc.). That discussion may (and should) lead to agreement on an approach for providing program-level metric(s) on the usability of MEaSUREs products by the user community.
- This discussion started in Aug (involving all MEaSUREs PIs)
- Some level of detail (or a way forward) “needs” to be worked out, preferably at this meeting
Context
With global-scale, multi-temporal data records increasingly available and easier to acquire and use for science, it becomes imperative that a program-level product quality metric be in place to ensure that these records properly support science and policy making.
There are four overarching themes:
1. Traceability (reproducibility, repeatability, etc.)
2. Fidelity (high quality, known error and uncertainty, etc.)
3. Transparency (community algorithms, good practices, documentation, interoperability, etc.)
4. Impact (science, economics, society, etc.)
MEaSUREs and Product Quality
“Product Quality” has two parts
- Scientific quality of data
- Usability of the package consisting of data and documentation
Projects may track these in detail for their own purposes
- Details may vary from project to project
Programmatic interest is in tracking progress and aggregated reporting
- Common, agreed-upon definitions across projects
- Simple (small number of) metrics for indicating overall progress in individual projects as well as in the Program as a whole
Starting Points
Progress so far
- Robert Frouin’s list of criteria: Uniqueness, Interpretability, Accuracy, Consistency, Completeness, Relevance, Accessibility, Level of usability
- Greg Hunolt’s strawman tables
Importance of Assessing Product Quality
- To measure how well products conform to “requirements” (who defines the requirements, and how?).
- To track maturity and progress (e.g., accuracy and coverage).
- To ascertain whether products are used “properly” (consider user creativity!).
- To take necessary corrective actions or make improvements.
Objective
- To determine what program-level product quality metrics would make sense, i.e., be meaningful, clear and concise, and practical to collect and report.
- Dimensions and criteria should be defined for programmatic assessments and planning, i.e., they may differ from the detailed standards for product quality developed at the project level.
NASA Guidelines for Ensuring Quality of Information
- From NASA’s viewpoint, the basic standard of information quality has three components: utility, objectivity, and integrity.
- In ensuring the quality of the disseminated NASA “information”, all of these components must be “sufficiently” addressed.
- Utility: Refers to the extent that the information can be used for its intended purpose, by its intended audience.
- Objectivity: Refers to the extent that the information is accurate, clear, complete, and unbiased.
- Integrity: Refers to the protection of NASA’s information from unauthorized access, revision, modification, corruption, falsification, and inadvertent or unintentional destruction.
- The disseminated information and the methods used to produce this information should be as transparent as possible so that they can, in principle, be reproduced by qualified individuals.
Dimensions and Criteria to Consider for Product Quality Metrics
- Uniqueness: How unique is the data set? Can it be obtained from other sources at the same temporal and spatial resolution, over the same time period, with the same accuracy? How “meaningful” is this, and how can it be measured?
- Interpretability: Is the data clearly defined, with appropriate symbols and units? Is the data easily comprehended? Are the algorithms explained adequately? Are possible usages and limitations of the data documented properly?
- Accuracy: How well does the data agree with independent, correct sources of information (reference data), especially in situ measurements? How biased is the data? How does accuracy depend on spatial and temporal scales, geographic region, and season?
- Consistency: Is the data always produced in the same way (e.g., from one time period to the next)? Is the data coherent spatially and temporally, and does it remain within the expected domain of values? Is the data in accordance with other (relevant) data or information?
- Completeness: Is some data missing (e.g., due to algorithm limitations or nonexistent input)? Is the data sufficiently comprehensive (e.g., long-term, extended spatially) and accurate for usability?
- Relevance: How significant or appropriate is the data for the applications envisioned? What advantages are provided by the data?
- Accessibility: How available, and how easily and quickly retrievable, is the data? Is the data sufficiently up-to-date? Can the data be easily manipulated? Does the data have security restrictions?
Straw Man Approach to Product Quality Metrics
- Usability is an overarching criterion because, for a product to be fully usable, the product must not only be of high science quality, but that quality, along with all other information required for use of the product, must be documented.
- This suggests the possibility of defining a set of usability levels that would address not only intrinsic science quality but also the other factors that contribute to, or are required for, a product to be usable (i.e., documentation, accessibility, and support service).
Straw Man Usability Levels
Columns: Usability Level | Science Quality Level | Documentation Level | Accessibility / Support Services Level
High
Usable with Difficulty | High | Medium
Limited Usability | Qualified High | Medium | Low
Poor / Unusable | Uncertain | Poor | Low
- The usability levels would derive from the science quality, documentation, and accessibility levels, in which criteria defined previously could come into play.
Straw Man Intrinsic Science Quality Levels
Columns: Intrinsic Science Quality Level | Maturity Level | Factor 2 | Factor 3
High | Validated Stage 3 | High
Qualified High | Validated Stage 1 or 2 | Medium
Uncertain | Beta or Provisional | Low
The “Factors” could be selected criteria that apply to Intrinsic Science Quality. Each criterion or “factor” used could have its set of questions, and the answers to those questions could be the basis for “High”, “Medium” or “Low” for that factor.
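The idea that answers to a set of questions could yield “High”, “Medium” or “Low” for a factor might be sketched as follows. This is only an illustration: the yes/no answer format, the example questions, and the two thresholds (two thirds and one third of questions answered “yes”) are assumptions and are not part of the straw man.

```python
from fractions import Fraction


def factor_level(answers, high_cut=Fraction(2, 3), medium_cut=Fraction(1, 3)):
    """Map yes/no answers for one factor to "High", "Medium" or "Low".

    answers: dict of question text -> bool (True means "yes").
    high_cut, medium_cut: hypothetical thresholds on the share of "yes" answers.
    """
    if not answers:
        raise ValueError("at least one question is required")
    share_yes = Fraction(sum(answers.values()), len(answers))
    if share_yes >= high_cut:
        return "High"
    if share_yes >= medium_cut:
        return "Medium"
    return "Low"


# Hypothetical questions for an "accuracy" factor (illustration only).
accuracy_answers = {
    "Compared against in situ reference data?": True,
    "Bias quantified?": True,
    "Accuracy characterized by region and season?": False,
}
accuracy_level = factor_level(accuracy_answers)
print(accuracy_level)
```

Exact fractions are used so that a score sitting right on a threshold is classified predictably; a project could equally weight questions or use more than two answer values.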
Straw Man Documentation Levels
Columns: Documentation Level | Factor 1 | Factor 2 | Factor 3
Rows: High, Medium, Low, Poor
Straw Man Accessibility / Support Services Levels
Columns: Accessibility / Support Services Level | Product Format | Tools | Factor 3
Excellent | Widely used standard | Tools for all platforms available | High
Very Good | Limited use standard | Limited tools available | Medium
Marginal | Non-standard format | Do it yourself | Low
Poor | Proprietary | May be a proprietary tool if any | Really Bad
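As a rough sketch, the Product Format and Tools columns of this table could be encoded as lookups and combined into one level. The descriptors are copied from the table, but the rule for combining them (take the worse of the two factor levels) is an assumption made here for illustration; the straw man does not state how factors combine.

```python
# Level names from the straw man table, ordered best to worst.
LEVELS = ["Excellent", "Very Good", "Marginal", "Poor"]

# Descriptors copied from the table, keyed by factor.
FORMAT_LEVEL = {
    "Widely used standard": "Excellent",
    "Limited use standard": "Very Good",
    "Non-standard format": "Marginal",
    "Proprietary": "Poor",
}
TOOLS_LEVEL = {
    "Tools for all platforms available": "Excellent",
    "Limited tools available": "Very Good",
    "Do it yourself": "Marginal",
    "May be a proprietary tool if any": "Poor",
}


def accessibility_level(product_format, tools):
    """Return the worse of the two factor levels (assumed combination rule)."""
    levels = [FORMAT_LEVEL[product_format], TOOLS_LEVEL[tools]]
    # A larger index in LEVELS means a worse level, so max() picks the worse one.
    return max(levels, key=LEVELS.index)


example_level = accessibility_level("Widely used standard", "Limited tools available")
print(example_level)
```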
- In this approach, the metrics associated with usability, intrinsic science quality, documentation, and accessibility / support services should be defined for those items that need to be tracked at the program level, i.e., items that are both important and potentially problematic, or that are a key measure of a project’s process.
- Some level of detail is necessary. Some criteria must be objective, since perceptions of the individuals involved with product development may be subjective.
- The metrics should provide information on the state of the product both without conceptual knowledge of the application (project-independent) and with specific applications in mind (project-dependent).
Interaction with Users (who measures the metric?)
- The quality of a product as perceived by users, or the real-world quality of products, may be very different from the analysis by those involved in generating the products.
- User surveys are complementary to internal (i.e., collected from stakeholders) metrics. They are necessary to assess, using comparative analysis, proper usage and adequate documentation and accessibility, which may lead to corrective actions for improving product quality.
[Figure: Summer NDVI comparisons; Winter EVI comparisons]
Same sensor(s) and a “simple” reprocessing (C4 to C5) lead to major change, sometimes 10+%
Consider
- A published paper using the MODIS C4 data record
- A new analysis with C5 confirmed the basic findings of the published paper, but there were noticeable spatial differences
  - Some had issues with the differences
[Figure: C4-based Amazon response to 2005 drought; C5-based Amazon response to 2005 drought. Saleska, Didan, Huete & Da Rocha (Science 2007)]
Implications for the carbon cycle
[Figure: MODIS C4 EVI-based annual GPP; MODIS C5 EVI-based annual GPP; C5 - C4 annual GPP difference; C5 - C4 annual GPP percent difference]
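The difference and percent-difference panels could be computed along the following lines. This is a minimal sketch: the slide does not say which collection is the denominator, so taking the percent difference relative to C4 is an assumption, and the numbers are made up for illustration.

```python
def difference_maps(c4, c5):
    """Return (difference, percent_difference) lists for paired C4/C5 values.

    Percent difference is taken relative to C4 (an assumed convention).
    A missing value (None) gives None in both outputs; a zero C4 value
    gives None for the percent difference only.
    """
    diffs, percents = [], []
    for old, new in zip(c4, c5):
        if old is None or new is None:
            diffs.append(None)
            percents.append(None)
            continue
        delta = new - old
        diffs.append(delta)
        percents.append(None if old == 0 else 100.0 * delta / old)
    return diffs, percents


# Made-up annual GPP values for four pixels (illustration only).
c4_gpp = [2.0, 4.0, None, 0.0]
c5_gpp = [2.5, 3.0, 1.0, 0.5]
diff, pct_diff = difference_maps(c4_gpp, c5_gpp)
print(diff)
print(pct_diff)
```

Pixels with missing or zero reference values are left undefined instead of being forced to a number, so they cannot inflate or hide a real change.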
Also consider
Data from MODIS that describe the behavior of a patch of vegetation
- Use all data (most users do it)
  - Documentation is not clear as to what not to do. For example, atmospherically corrected data gives a false sense of being “corrected”.
- Filter and use the remaining data (few users do it, but then it becomes a challenge to use RS data in general)
- Find a workaround
  - Case-by-case basis
The challenge is how to make sense of these issues
- Error and uncertainty reported as a single number by MODIS (global multi-temporal data) is for the most part useless!
- Synoptic TS data is quite problematic
- Know the limitations of the data
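The “filter and use remaining data” option can be illustrated with a small sketch. The flag names and the EVI values below are hypothetical and do not correspond to actual MODIS quality layers; the point is only that filtering should report how much data survives, since that fraction determines how usable the remaining time series is.

```python
def filter_observations(values, flags, accepted=frozenset({"good"})):
    """Keep values whose flag is in `accepted`; return (kept, fraction_kept)."""
    if len(values) != len(flags):
        raise ValueError("values and flags must have the same length")
    kept = [v for v, f in zip(values, flags) if f in accepted]
    fraction = len(kept) / len(values) if values else 0.0
    return kept, fraction


# Made-up EVI time series with hypothetical per-observation flags.
evi = [0.42, 0.10, 0.45, 0.47, 0.05, 0.50, 0.48, 0.44]
flags = ["good", "cloud", "good", "good", "cloud", "aerosol", "good", "good"]
kept, fraction_kept = filter_observations(evi, flags)
print(kept, fraction_kept)
```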
Global clouds & data usefulness metrics
Global data performance
[Figure: panels JFM, AMJ, JAS, OND, and Annual average; scale in % (tick labels 50, 75)]
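Assuming the panels show the percentage of usable observations per quarter (the transcript does not say how the figure was computed), a per-quarter summary for one location might be sketched as below. The record format is hypothetical, and the “Annual” entry here is computed over all records, which may differ from an average of the four quarterly values.

```python
from collections import defaultdict

# Month number -> quarter label, using the labels from the slide.
QUARTERS = {1: "JFM", 2: "JFM", 3: "JFM", 4: "AMJ", 5: "AMJ", 6: "AMJ",
            7: "JAS", 8: "JAS", 9: "JAS", 10: "OND", 11: "OND", 12: "OND"}


def usable_percent_by_quarter(records):
    """records: iterable of (month, usable) pairs; month is 1-12, usable is bool.

    Returns {label: percent usable} per quarter present in the input,
    plus an "Annual" entry computed over all records.
    """
    totals = defaultdict(int)
    usable = defaultdict(int)
    for month, ok in records:
        for key in (QUARTERS[month], "Annual"):
            totals[key] += 1
            usable[key] += bool(ok)
    return {key: 100.0 * usable[key] / totals[key] for key in totals}


# Made-up observations for one pixel (illustration only).
records = [(1, True), (2, False), (5, True), (6, True),
           (8, False), (8, True), (11, False), (12, False)]
summary = usable_percent_by_quarter(records)
print(summary)
```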