Quality of PSI Robbin te Velde Helsinki, 19-20 April 2007.


1 Quality of PSI Robbin te Velde Helsinki, 19-20 April 2007

2 Outline of the presentation
Short (philosophical) introduction on quality
Data management & data quality (practice and theory)
Conventional data management
PSI enlightened models
Quality and pricing

3 Defining the elusive concept of Quality (I)
Common definitions of quality (Garvin, 1984):
Transcendent: “quality is neither mind nor matter, but a third entity independent of the two […] even though Quality cannot be defined, you know what it is” (Pirsig, 1974)
Product-based: “differences in quality amount to differences in the quantity of some desired ingredient or attribute” (Abbott, 1955)
Manufacturing-based: “quality means conformance to requirements” (Crosby, 1984)
Value-based: “quality means best for certain customer conditions. These conditions are (a) the actual use and (b) the selling price of the product” (Feigenbaum, 1961)
User-based: “quality is fitness for use” (Juran, 1988)

4 Defining the elusive concept of Quality (II)
There is no unambiguous definition of quality.
Each definition stresses different dimensions of quality management. The specific interpretation of quality is therefore not a neutral process: it is both cause and effect of internal (management x staff) and external (organisation x customer; buyer x supplier) relations.
In each era one particular definition has been dominant. Over the centuries there has been a shift from the transcendent definition to the product-based and manufacturing-based ones, and then via the value-based definition back to the more transcendent user-based definition.

5 The grim reality of data quality
Lack of metadata management
– no common definitions exist of what data means (e.g., a shared vocabulary)
No clarity on data ownership
– users create, modify and access data, but nobody sees it as their responsibility to own it (fear of a ‘blame culture’)
Poor data quality
– no common, consistent way of validating data across applications
Massive data redundancy and fractured, inconsistent data across different systems
– significant data re-keying
– maintenance of master data attributes done in different systems
– two-way data flow between systems to synchronise the same data
Business process outsourcing occurring without process integration and/or integrated master data management
Fractured, unmanaged, unstructured content
– no CMS and/or taxonomy to organise the content

6 Data Quality Improvement as part of Data Management (I)
Data management functions:
– Database Administration
– Data Security Management
– Data Architecture, Analysis & Design
– Metadata Management
– Data Warehousing & Business Intelligence
– Reference & Master Data Management
– Data Quality Improvement
– Unstructured Data Management
– Data Stewardship, Strategy & Governance
– Regulatory Compliance (SOX, etc.)
Activities listed under Data Quality Improvement:
– Data Quality Analysis (including Data Profiling)
– Data Cleanup Campaigns and Programs
– Data Quality Requirements Analysis
– Data Quality Auditing and Certification

7 Conventional Data Management (II)
This model is still very much within the manufacturing-based tradition of quality control
Quality is defined as the accuracy of the product (the data)
Assumes existence of ex ante, objective, uniform quality criteria
Works well but only under certain conditions (stable, well-defined operational environment)
Example I: Unique identification of companies at the Dutch Basic Business Register (BBR) [source: Human Inference]
[Figure: Primary process, Quality criteria (filter) with accept and reject outcomes, Re-use]
Example II: Lack of common data definition between Czech Statistical Office (CZSO) and Ministry of Environment (MoE) [source: prof. Jiri Hřebiček, Masaryk University, Brno, Czech Republic]
[Figure: hospital; Same value; Primary process (CZSO: kg); Primary process (MoE: tons); EC database; Re-use of information (investors)]
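The unit mismatch in Example II and the accept/reject filter of the conventional model can be illustrated with a minimal sketch. It rests on assumptions that are not in the slides: the `Report` class, the unit labels `"kg"` and `"t"`, the threshold `max_kg` and the sample value 12.0 are all hypothetical. The slide only states that CZSO works in kg, that MoE works in tons, and that the "same value" reaches both.

```python
from dataclasses import dataclass

# Conversion factors to kilograms. The unit labels are assumptions for
# illustration; the slide only says that CZSO uses kg and MoE uses tons.
TO_KG = {"kg": 1.0, "t": 1000.0}


@dataclass
class Report:
    source: str   # hypothetical label, e.g. "CZSO" or "MoE"
    value: float  # the number as reported
    unit: str     # unit the receiving office assumes


def to_kg(report: Report) -> float:
    """Normalise a reported value to kilograms."""
    return report.value * TO_KG[report.unit]


def ex_ante_filter(report: Report, max_kg: float) -> str:
    """Accept or reject against a fixed criterion defined in advance."""
    if report.unit not in TO_KG:
        return "reject"
    return "accept" if 0 <= to_kg(report) <= max_kg else "reject"


# The same reported number, read under two different unit conventions.
czso = Report("CZSO", 12.0, "kg")
moe = Report("MoE", 12.0, "t")

print(to_kg(czso), to_kg(moe))
print(ex_ante_filter(czso, max_kg=5000.0), ex_ante_filter(moe, max_kg=5000.0))
```

Under these assumptions the identical number differs by a factor of 1000 once units are made explicit, and a uniform ex ante criterion accepts one reading and rejects the other. A fixed filter only works when the data definition is shared.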

8 Limitations of conventional Data Management
Ex ante objective criteria are never 100% complete
– It is impossible to define beforehand all possible combinations
– If you do not include enough combinations you miss the ‘fuzzy’ ones
– If many combinations are included, filtering takes too long
The (futile) effort to go for 100% accuracy hampers process outsourcing
Reference data has different meanings to different people; the quality of this reference data is related to the requirements of each user
Example III: Address validation at RWTÜV AG (Germany) [source: Human Inference]
350,000 objects x 3 types of address x 15 object categories. Search options were based only on exact matches, so ‘fuzzy’ duplicates (e.g., alternatively spelled names) were not found. The number of combinations was already so big that filtering took several seconds. This often resulted in duplicates: once users had searched for a few seconds without any results, they simply created a new record.
Example IV: Content syndication at Dutch government portal (overheid.nl)
The official Dutch government portal Overheid.nl has a strict policy not to allow any content from third (private) parties on the website. This is not a particularly citizen-centered approach, but the official policy statement is that they only want “100% certified” information and thus do not accept content generated by processes that are not fully under their own control.
Example V: Use of business rules for UK security trading (private sector) [source: Finsoft Ltd]
Reference data has different meanings to different users, and the quality of this data is related to the requirements of each user. Some reference data may be more critical than others depending on its use at the time. The solution chosen is to build a unique, dynamic list of business rules for each user, based on the qualitative feedback obtained from that user.
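The exact-match problem in Example III can be illustrated with a small sketch. This is not the Human Inference or RWTÜV implementation. It uses Python's standard `difflib` module as a stand-in for whatever matching technique was actually deployed, and the company names, the helper `find_candidates` and the cutoff of 0.8 are hypothetical.

```python
from difflib import get_close_matches

# Size of the search space quoted on the slide:
# 350,000 objects x 3 types of address x 15 object categories.
combinations = 350_000 * 3 * 15


def find_candidates(name, existing, cutoff=0.8):
    """Return existing names that are similar to `name`, ignoring case."""
    by_key = {entry.casefold(): entry for entry in existing}
    hits = get_close_matches(name.casefold(), list(by_key), n=5, cutoff=cutoff)
    return [by_key[hit] for hit in hits]


# Hypothetical register content and a query with an alternative spelling.
records = ["Muller GmbH", "Schmidt AG", "Meyer und Partner"]
query = "Mueller GmbH"

exact = [r for r in records if r == query]   # exact search finds nothing
fuzzy = find_candidates(query, records)      # similarity search finds the variant

print(combinations)
print(exact, fuzzy)
```

With exact matching the user sees no result and is tempted to create a new record. A similarity search surfaces the likely duplicate, at the price of scoring every candidate, which is the speed-versus-coverage trade-off the slide describes.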

9 Hidden assumptions of conventional (closed) model for PSI quality control
There is a strict split between (public sector) generation of data and (private sector) re-use of that information
The flow of data is unidirectional
The generator of the data is solely responsible for the quality of the data
Lack of quality of PSI is an important obstacle for re-use
Example VI: Conventional (‘closed’) model for PSI quality control (cf. the Czech waste case)
Example VI*: Conventional (‘closed’) model for PSI quality control (ex post quality control outsourced to private sector, e.g. Acxiom)
[Figures: Primary process, ex ante quality control, Re-use, ex post quality control and Final use, divided between public sector and private sector]

10 ‘Enlightened’ models for PSI quality control
The generation, re-use and final use are intertwined
The flow of data is multidirectional
The public content holder does not have to be the generator but is always at least partly responsible for the quality of the data
Lack of fitness for (re)use is an important obstacle for re-use (not lack of primary data quality per se)
Example VII: Intertwined, multidirectional data flows
Example VIII: Co-management of quality (geo-information Norway)
Example IX: Co-management of quality by final user (Latvia)
Example X: Central role of government in quality management
[Figures: each example shows Primary process, Re-use and Final use, divided between public sector and private sector]

11 Quality and pricing
Many public content holders fear that opening up their information (for free) to the public at large cannibalizes their income from commercial re-use.
In general, though, low-end and high-end markets can very well co-exist. The price discrimination is based on differences in quality.
In the specific case of information goods, this quality does not always refer to the quality of the primary data itself but especially to the ‘fitness for re-use’.
Example XI: Commercial re-use excludes free use for public at large
[Figure: Primary process, Re-use and Final use, divided between public sector and private sector; labels: First price, no price]
Example XII: High-end market (‘fit for re-use’) coexists with low-end market
[Figure: Primary process, Re-use and Final use, divided between public sector and private sector; labels: high quality, low quality, ‘fit for re-use’]

12 Contact
Robbin te Velde (tevelde@dialogic.nl)
20 April 2007

