CSU Data Stewardship Committee Kickoff Meeting April 4, 2003.


1 CSU Data Stewardship Committee Kickoff Meeting April 4, 2003

2 What is Data Stewardship?
Data stewardship is the process of managing information necessary to support program and financial managers, and assuring data captured and reported is accurate, accessible, timely, and useable for decision-making and activity monitoring.
–U.S. Department of the Interior

3 What is Data Stewardship? (cont'd)
Data Stewardship has, as its main objective, the management of the corporation's data assets in order to improve their reusability, accessibility, and quality. It is the Data Stewards' responsibility to approve business naming standards, develop consistent data definitions, determine data aliases, develop standard calculations and derivations, document the business rules of the corporation, monitor the quality of the data in the data warehouse, define security requirements, and so forth….
–Claudia Imhoff, Ph.D., President, Intelligent Solutions, Inc.

4 What is Data Stewardship? (cont'd)
Stewardship programs focus on improving data quality, reducing data duplication, formalizing accountability for data, and improving business and IT productivity. An effective Data Stewardship program will rapidly improve the ROI from data warehousing and business intelligence efforts, application integration efforts, ERP, CRM, content and knowledge management, and EAI efforts.
–Robert Seiner, Publisher, The Data Administration Newsletter

5 Why do we need data stewardship?
Consider the costs of poor data quality
–Incorrect enrolled student counts
–Incorrect flexibly scheduled course section counts
–Incorrect alumni data
Why reinvent the wheel – and differently every time, at that!
–Reports that claim to show the same information, but with different results
–Leads to decisions based on information that is incorrect or improperly understood

6 Data Stewardship Committee charge
The charge of the Data Stewardship Committee (DSC) is to define, validate, organize and protect data assets, thus enabling areas throughout the University to make decisions based upon high-quality, easily usable information.

7 Creating our common vision
What products should we develop?
–Data marts
–Data quality metrics
–Metadata repository/data dictionary
–Other?

8 Creating our common vision (cont'd)
What services should we provide?
–Change control
–Other?

9 What is a data mart?
…the restriction of the data warehouse to a single business process or to a group of related business processes targeted toward a particular business group.
–Ralph Kimball, Ph.D., CEO, Ralph Kimball Associates
A data mart is a subject-specific collection of organizational data which can be used for analytical purposes relating to specific business questions or functions. A data mart contains only that data which is needed to respond to the specified business questions.
–David Fuller

10 What is a data mart? (cont'd)
Data marts are usually derived by taking many tables and flattening them into a few tables
Data marts are easier to query and report from
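
The flattening step can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module and an in-memory database; the table and column names (student, course, enrollment, enrollment_mart) are hypothetical and are not taken from any CSU system.

```python
import sqlite3


def build_enrollment_mart(conn):
    """Flatten three normalized tables into one wide reporting table."""
    conn.executescript("""
        DROP TABLE IF EXISTS enrollment_mart;
        CREATE TABLE enrollment_mart AS
        SELECT s.student_id,
               s.name  AS student_name,
               c.course_id,
               c.title AS course_title,
               e.term
        FROM enrollment e
        JOIN student s ON s.student_id = e.student_id
        JOIN course  c ON c.course_id  = e.course_id;
    """)


def demo():
    """Create hypothetical source tables, build the mart, and query it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT);
        CREATE TABLE enrollment (student_id INTEGER, course_id TEXT, term TEXT);
        INSERT INTO student VALUES (1, 'Jane Doe'), (2, 'John Roe');
        INSERT INTO course VALUES ('CS101', 'Intro to Computing');
        INSERT INTO enrollment VALUES (1, 'CS101', '2003SP'), (2, 'CS101', '2003SP');
    """)
    build_enrollment_mart(conn)
    # Reporting now needs no joins: one table answers the question.
    return conn.execute(
        "SELECT student_name, course_title, term "
        "FROM enrollment_mart ORDER BY student_id"
    ).fetchall()


print(demo())
```

A report writer querying enrollment_mart does not need to know how the three source tables relate, which is the sense in which a mart is easier to query and report from.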

11 What are data quality metrics?
…there is no meaningful concept of data quality in the real world; it is only as a by-product of the deficiencies of abstracting and representing reality that data quality arises as an issue at all.
–Matt Duckham, Dept. of Computer Science, University of Keele, UK
Metrics are ways to measure data quality
–How many values in a column are valid (internal consistency)?
  What state is ZZ?
–How many values across columns are consistent (external consistency)?
  Why does Ms. Jane Doe have a gender of Male?
Metrics show data quality improving or worsening over time
We are developing a data quality architecture (DQA) for use with a variety of data sources
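
As a rough sketch of the two kinds of metric, the functions below compute the share of valid values in one column and the share of rows whose title and gender agree. This is not the committee's DQA; the reference sets, field names, and sample rows are all hypothetical.

```python
# Hypothetical reference data; a real list of valid codes would come from the SMEs.
VALID_STATES = {"CA", "CO", "OH"}
TITLE_TO_GENDER = {"Ms.": "Female", "Mrs.": "Female", "Mr.": "Male"}


def internal_consistency(rows, column, valid_values):
    """Share of rows whose value in `column` belongs to the set of valid values."""
    if not rows:
        return 1.0
    return sum(1 for row in rows if row[column] in valid_values) / len(rows)


def external_consistency(rows):
    """Share of rows whose title and gender agree.

    Titles that imply no gender (e.g. "Dr.") are counted as consistent.
    """
    if not rows:
        return 1.0
    consistent = 0
    for row in rows:
        expected = TITLE_TO_GENDER.get(row["title"])
        if expected is None or expected == row["gender"]:
            consistent += 1
    return consistent / len(rows)


SAMPLE = [
    {"title": "Ms.", "name": "Jane Doe", "gender": "Male", "state": "OH"},
    {"title": "Mr.", "name": "John Roe", "gender": "Male", "state": "ZZ"},
    {"title": "Dr.", "name": "Pat Poe", "gender": "Female", "state": "CA"},
    {"title": "Mrs.", "name": "Ann Lee", "gender": "Female", "state": "CO"},
]

print(internal_consistency(SAMPLE, "state", VALID_STATES))  # 0.75: "ZZ" is not a valid state
print(external_consistency(SAMPLE))  # 0.75: Ms. Jane Doe is recorded as Male
```

Recording these two numbers on each data load would give the trend over time that the slide describes.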

12 What is a metadata repository?
Meta data is all physical data and knowledge-containing information about the business and technical processes, and data, used by a corporation…. While meta data repositories perform all of the functions of a data dictionary, their scope is far greater.
–David Marco, President, Enterprise Warehousing Solutions
What features might a metadata repository have?

13 What is a metadata repository? (cont'd)
Definitions of columns and tables.
The ability to determine which tables contain a given column, or a column with a given description – e.g., which tables contain Academic Sub-Plan.
The ability to query the queries – e.g., find all existing queries with IPEDS in their description.
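
The lookups listed on this slide could be sketched as follows. This is an illustrative in-memory structure, not a description of any actual repository product; the class names, the ACAD_SUB_PLAN column name, the ACAD_PLAN_HIST table, and the sample queries are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnDef:
    table: str
    column: str
    description: str


@dataclass
class QueryDef:
    name: str
    description: str


@dataclass
class MetadataRepository:
    columns: list = field(default_factory=list)
    queries: list = field(default_factory=list)

    def tables_with_column(self, column):
        """Tables that contain a column with exactly this name."""
        return sorted({c.table for c in self.columns if c.column == column})

    def tables_with_description(self, text):
        """Tables that contain a column whose description mentions `text`."""
        needle = text.lower()
        return sorted({c.table for c in self.columns if needle in c.description.lower()})

    def queries_described_by(self, text):
        """'Query the queries': names of queries whose description mentions `text`."""
        needle = text.lower()
        return sorted(q.name for q in self.queries if needle in q.description.lower())


# Hypothetical catalog entries for illustration only.
repo = MetadataRepository(
    columns=[
        ColumnDef("ACAD_PROG", "ACAD_SUB_PLAN", "Academic Sub-Plan"),
        ColumnDef("ACAD_PLAN_HIST", "ACAD_SUB_PLAN", "Academic Sub-Plan"),
        ColumnDef("ACAD_PROG", "PROG_ACTION", "Program Action"),
    ],
    queries=[
        QueryDef("FALL_ENROLLMENT", "IPEDS fall enrollment counts"),
        QueryDef("ALUMNI_LIST", "Alumni mailing list"),
    ],
)

print(repo.tables_with_description("Academic Sub-Plan"))
print(repo.queries_described_by("IPEDS"))
```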

14 What is a metadata repository? (cont'd)
The ability to determine which queries reference a given column and/or specific values of that column – e.g., which queries use employee status L as one of their criteria?
The ability to determine the path (menu group > panel group > panel) to follow to reach a particular panel – e.g., how do I get to the Application Data panel?
The ability to determine which columns from which tables appear on a given panel, and vice versa – e.g., from which table and column is Program Action populated in panel X, and conversely, which panels, in addition to panel X, are populated by ACAD_PROG?
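
A small sketch of these cross-reference lookups, assuming the repository stores query criteria, panel paths, and panel-to-column mappings as simple records. Apart from ACAD_PROG, the Application Data panel, Program Action, and employee status L, which appear on the slide, every name below (column names, query names, menu and panel group names) is hypothetical.

```python
# (query name, table, column, value) for each criterion a query applies; hypothetical.
QUERY_CRITERIA = [
    ("ACTIVE_STAFF", "JOB", "EMPL_STATUS", "A"),
    ("STAFF_ON_LEAVE", "JOB", "EMPL_STATUS", "L"),
]

# panel -> (menu group, panel group, panel); hypothetical navigation names.
PANEL_PATHS = {
    "Application Data": ("Menu Group A", "Panel Group B", "Application Data"),
}

# (panel, field label, table, column); hypothetical column names.
PANEL_FIELDS = [
    ("Panel X", "Program Action", "ACAD_PROG", "PROG_ACTION"),
    ("Panel Y", "Program Status", "ACAD_PROG", "PROG_STATUS"),
    ("Application Data", "Admit Term", "ADM_APPL_DATA", "ADMIT_TERM"),
]


def queries_using(column, value=None):
    """Queries with a criterion on `column`, optionally restricted to one value."""
    return sorted(
        name
        for name, _table, col, val in QUERY_CRITERIA
        if col == column and (value is None or val == value)
    )


def path_to(panel):
    """Navigation path to a panel as 'menu group > panel group > panel'."""
    return " > ".join(PANEL_PATHS[panel])


def source_of(panel, label):
    """(table, column) that populates a field on a panel, or None if unknown."""
    for p, field_label, table, column in PANEL_FIELDS:
        if p == panel and field_label == label:
            return (table, column)
    return None


def panels_fed_by(table):
    """All panels that show at least one column from `table`."""
    return sorted({p for p, _label, t, _column in PANEL_FIELDS if t == table})


print(queries_using("EMPL_STATUS", "L"))
print(path_to("Application Data"))
print(source_of("Panel X", "Program Action"))
print(panels_fed_by("ACAD_PROG"))
```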

15 What is a metadata repository? (cont'd)
A metadata repository can capture data definitions, not create them
Definitions more detailed than those already stored somewhere must be provided by subject matter experts (SMEs)
The metadata repository can provide a framework for the systematic capture and publication of this metadata

16 What is change control?
In this context, it means controlling certain changes to the data
–Adding new values to critical columns
–Any other changes that can impact reporting
These changes should be brought to the attention of this committee before they are made
–Data users can assess and discuss the impact
–The changes can be published before they are made
–Users can modify reports as needed before they risk publishing incorrect information
But how do we define critical data?
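
One way to notice that a new value has appeared in a critical column is to compare the values present in the data against an approved list. The sketch below assumes such a list exists for each critical column; the column name EMPL_STATUS and its approved codes are hypothetical.

```python
def unapproved_values(rows, critical_columns):
    """Return {column: sorted values found in the data but not on the approved list}.

    `critical_columns` maps each critical column name to its set of approved values.
    Columns with no unapproved values are left out of the result.
    """
    findings = {}
    for column, approved in critical_columns.items():
        seen = {row[column] for row in rows if column in row}
        new_values = sorted(seen - approved)
        if new_values:
            findings[column] = new_values
    return findings


# Hypothetical approved list and sample data.
CRITICAL_COLUMNS = {"EMPL_STATUS": {"A", "L", "T"}}
ROWS = [
    {"EMPL_ID": 1, "EMPL_STATUS": "A"},
    {"EMPL_ID": 2, "EMPL_STATUS": "L"},
    {"EMPL_ID": 3, "EMPL_STATUS": "X"},  # a value nobody announced
]

print(unapproved_values(ROWS, CRITICAL_COLUMNS))  # {'EMPL_STATUS': ['X']}
```

A check like this only detects a change after the fact; the process described on the slide aims to have such changes announced before they are made.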

17 Identifying Critical Data
Data stewardship over all CSU data is not cost-effective
The prerequisite to
–developing data marts
–implementing a data quality architecture
–developing a metadata repository/data dictionary
–and instituting change control over our data
is identification of the critical data over which we will maintain stewardship

18 University Data
University data are institutional assets and are held by the university to support its fundamental instructional, research, and public service missions.
–Arizona State University

19 University Data (cont'd)
UNIVERSITY INFORMATION -- A data element is considered UNIVERSITY INFORMATION if it provides support to and meets the needs of units of the University. Examples of UNIVERSITY INFORMATION include, but are not limited to, many of the elements supporting financial management, student curricula, payroll, personnel management, and capital equipment inventory. Data may be considered UNIVERSITY INFORMATION if it satisfies one or more of the following criteria:
A. It is used for planning, managing, reporting, or auditing a major administrative function;
B. It is referenced or used by an organizational unit to conduct University business;
C. It is included in an official University administrative report;
D. It is used to derive an element that meets the criteria above.
…Data that may be managed locally may yet have significant impact if it is used in a manner that can impact University operations….
–Georgia State University
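
Criterion D means an element can qualify indirectly, through what is derived from it. A minimal sketch of that rule, assuming we already know which elements meet criteria A to C and which inputs each derived element uses; the element names are hypothetical, and the sketch applies D one step only (inputs of elements meeting A to C), which is one reading of "the criteria above".

```python
def university_information(meets_a_to_c, derived_from):
    """Elements that meet A, B or C, plus elements used to derive them (criterion D).

    meets_a_to_c: set of element names satisfying criterion A, B or C.
    derived_from: dict mapping a derived element to the set of elements it is computed from.
    """
    result = set(meets_a_to_c)
    for element, inputs in derived_from.items():
        if element in meets_a_to_c:
            result |= set(inputs)  # criterion D: inputs to a qualifying element
    return result


# Hypothetical elements for illustration.
MEETS_A_TO_C = {"ENROLLED_HEADCOUNT"}
DERIVED_FROM = {
    "ENROLLED_HEADCOUNT": {"ENROLL_STATUS", "TERM"},
    "LOCAL_MAILING_FLAG": {"DEPT_NOTE"},
}

print(sorted(university_information(MEETS_A_TO_C, DERIVED_FROM)))
# ['ENROLLED_HEADCOUNT', 'ENROLL_STATUS', 'TERM']
```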

20 University Data (cont'd)
No one owns University data but the University
We may be data stewards, custodians, users, producers, etc., but we are not owners of the data

21 Tasks
Identifying University Data
–Which columns in which tables (or which fields on which panels) do we need to
  Define?
  Use in reporting / analysis / decision-making?
  Quality assure?
  Exercise change control over?
What are our relative priorities?
–Data marts
–Data quality
–Data dictionary
–Other products?

22 Future possibilities
Statistical analysis of data
Data mining
–The diapers and beer discovery
–Registration patterns
–Student attrition patterns
Prerequisites
–Better data structure
–Better data quality

23 Thank You!

