CHAPTER 3 Data and Knowledge Management


1 CHAPTER 3 Data and Knowledge Management

2 Chapter 3: Data and Knowledge Management
3.1 Managing Data
3.2 The Database Approach
3.3 Database Management Systems
3.4 Data Warehouses and Data Marts
3.5 Knowledge Management
Copyright John Wiley & Sons Canada

3 LEARNING OBJECTIVES
Identify three common challenges in managing data, and describe one way organizations can address each challenge using data governance.
Name six problems that can be minimized by using the database approach.
Demonstrate how to interpret relationships depicted in an entity-relationship diagram.
Discuss at least one main advantage and one main disadvantage of relational databases.

4 LEARNING OBJECTIVES (continued)
Identify the six basic characteristics of data warehouses and data marts.
Demonstrate the use of a multidimensional model to store and analyze data.
List two main advantages of using knowledge management, and describe the steps in the knowledge management system cycle.

5 OPENING CASE 3.1 BIG DATA
The Problem
The amount of digital data increases tenfold every five years. Scientists say that we are undergoing a new revolution, the “Industrial Revolution of Data,” and they have coined the term “Big Data” to describe the superabundance of data available today. This superabundance creates problems of storage space, speed, time, structure, quantity, and quality of data.

6 THE SOLUTION
For many organizations, the first step in managing Big Data was to deal with the problem of information silos. Silos are collections of information that are stored and isolated in separate functional areas. Organizations began to integrate this information into a database environment and then to develop data warehouses to serve as decision-making tools.
Next, they turned their attention to the business of data and information management; that is, making sense of their proliferating data.
Seeing a market need for data management, Oracle, IBM, Microsoft, and SAP together have spent more than $15 billion in recent years to purchase software firms specializing in data management and business intelligence.

7 THE RESULTS
The way information is managed touches all areas of life. Today, the availability of abundant yet small-scale data enables companies to cater to niche markets, and even individual customers, anywhere in the world.
Some industries have led the way in gathering and exploiting data. For example, credit card companies monitor every purchase and can accurately identify fraudulent ones, using rules derived by analyzing billions of transactions.

8 DISCUSSION
What market do you believe will experience the most growth in “Big Data”? Smart phones? Tablets?
What type of “Big Data” is used at a university?

9 3.1 MANAGING DATA
The Difficulties of Managing Data
Data Governance
IT applications require data. These data should be of high quality, meaning they should be accurate, complete, timely, consistent, accessible, relevant, and concise.
Difficulties in managing data:
The amount of data is increasing exponentially.
Data are scattered throughout organizations and collected by many individuals using various methods and devices.
Data come from many sources.
Data security, quality, and integrity are critical.

10 DIFFICULTIES IN MANAGING DATA
The amount of data increases exponentially over time.
Data are scattered throughout organizations.
Data are obtained from multiple internal and external sources.
Data degrade over time.
Data are subject to data rot.
Data security, quality, and integrity are critical, yet easily jeopardized.
Information systems that do not communicate with each other can result in inconsistent data.
Federal regulations.
Data rot: a term that refers primarily to problems with the media on which the data are stored. Over time, temperature, humidity, and exposure to light can cause physical problems with storage media and thus make it difficult to access the data.

11 DATA GOVERNANCE
Data governance: an approach to managing information across an entire organization.
Master data management: a process that spans all of an organization’s business processes and applications.
Master data: a set of core data, such as customer, product, employee, vendor, and geographic location, that span all of the enterprise’s information systems.

12 MASTER DATA MANAGEMENT
John Stevens registers for Introduction to Management Information Systems (ISMN 3140) from 10 AM until 11 AM on Mondays and Wednesdays in Room 41 Smith Hall, taught by Professor Rainer.

Transaction Data                              Master Data
John Stevens                                  Student
Intro to Management Information Systems       Course
ISMN 3140                                     Course No.
10 AM to 11 AM                                Time
Mondays and Wednesdays                        Weekday
Room 41 Smith Hall                            Location
Professor Rainer                              Instructor

13 3.2 THE DATABASE APPROACH
Databases minimize the following problems:
Data redundancy: The same data are stored in many places.
Data isolation: Applications cannot access data associated with other applications.
Data inconsistency: Various copies of the data do not agree.
Databases are arranged so that one set of software programs, the database management system, provides all users with access to all the data.

14 DATABASE APPROACH (CONTINUED)
Database management systems (DBMS) maximize the following:
Data security: Keeping the organization’s data safe from theft, modification, and/or destruction. Databases have strong security measures in place to deter mistakes and attacks.
Data integrity: Data must meet certain constraints (e.g., no alphabetic characters in a Social Insurance Number field; student grade point averages cannot be negative).
Data independence: Applications and data are independent of one another. They are not linked to each other, so all applications are able to access the same data.
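The data integrity point can be illustrated with a small sketch. It assumes SQLite (through Python's standard sqlite3 module) as the DBMS; the table layout and the second student are hypothetical and are not taken from the slides. The CHECK clause encodes the rule that a grade point average cannot be negative, so the DBMS itself rejects a bad row.

```python
import sqlite3


def make_student_table() -> sqlite3.Connection:
    """Create an in-memory STUDENT table with an integrity constraint on gpa."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: the CHECK clause is the integrity constraint.
    conn.execute(
        """
        CREATE TABLE student (
            student_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            gpa        REAL CHECK (gpa >= 0)
        )
        """
    )
    return conn


def try_insert(conn: sqlite3.Connection, student_id: int, name: str, gpa: float) -> bool:
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO student VALUES (?, ?, ?)", (student_id, name, gpa)
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_student_table()
accepted = try_insert(conn, 1, "John Stevens", 3.4)           # valid GPA
rejected = try_insert(conn, 2, "Hypothetical Student", -1.0)  # violates the CHECK
```

Because the rule lives in the database rather than in each application, every application that writes to the table is held to the same constraint.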

15 DATABASE MANAGEMENT SYSTEMS
Figure 3.1 University Database Management System

16 DATA HIERARCHY
Bit (binary digit): the smallest unit of data a computer can process; a “0” or a “1”.
Byte: eight bits; represents a single character (e.g., a letter, number, or symbol).
Field: a logical grouping of related characters (e.g., a word, small group of words, or identification number).
Record: a logical grouping of related fields (e.g., a student in a university database).
File (or table): a logical grouping of related records.
Database: a logical grouping of related files.
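The hierarchy above can be sketched in a few lines of Python. This is only an illustration: it assumes ASCII text (where one character fits in one byte), and the record layout, IDs, and course entry are hypothetical.

```python
from dataclasses import dataclass


def to_bits(char: str) -> str:
    """Return the eight bits (one byte) that encode a single ASCII character."""
    return format(ord(char), "08b")


@dataclass
class StudentRecord:
    """A record: a logical grouping of related fields."""
    student_id: str  # field
    name: str        # field
    major: str       # field


# A file (or table): a logical grouping of related records.
student_file = [
    StudentRecord("S001", "John Stevens", "Management"),
    StudentRecord("S002", "Peng Xu", "Finance"),
]

# A database: a logical grouping of related files.
database = {
    "student": student_file,
    "course": [{"course_no": "ISMN 3140", "title": "Intro to Management Information Systems"}],
}

bits_for_A = to_bits("A")  # the byte for the character "A"
```

Reading from the bottom up, each level is built only from instances of the level beneath it, which is the point of the hierarchy.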

17 HIERARCHY OF DATA FOR A COMPUTER-BASED FILE
Figure 3.2 Hierarchy of data in University database

18 DATA HIERARCHY (CONTINUED)
Figure: examples of a bit (binary digit) and a byte (eight bits).

19 DATA HIERARCHY (CONTINUED)
Example of Field and Record

20 DATA HIERARCHY (CONTINUED)
Example of Field and Record

21 DESIGNING THE DATABASE
Data model: a diagram that represents the entities in the database and their relationships.
Entity: a person, place, thing, or event about which an organization maintains information (e.g., customer, employee, or product). A record generally describes an entity.
Instance: a specific, unique representation of the entity. For example, a particular student is an instance of the entity STUDENT.
Attribute: a characteristic or quality of a particular entity.
Primary key: a field that uniquely identifies a record. Every record in a file must contain at least one field that uniquely identifies that record so that it can be retrieved, updated, and sorted.
Secondary keys: other fields that have some identifying information but typically do not identify the record with complete accuracy. For example, the student’s major would be a secondary key if a user wanted to find all students in a particular major field of study. It should not be the primary key, however, because many students can have the same major.
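The difference between a primary key and a secondary key can be sketched as follows, again assuming SQLite via Python's sqlite3 module. The student IDs, the majors, and the third student are hypothetical. The primary key returns at most one record, while the secondary key (major) can return several.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "student_id INTEGER PRIMARY KEY, "  # primary key: unique per record
    "name TEXT NOT NULL, "
    "major TEXT NOT NULL)"              # secondary key: not unique
)
# An index on the secondary key speeds up lookups by major.
conn.execute("CREATE INDEX idx_student_major ON student (major)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [
        (1, "John Stevens", "Management"),
        (2, "Peng Xu", "Management"),
        (3, "Hypothetical Student", "Finance"),
    ],
)


def find_by_id(conn: sqlite3.Connection, student_id: int):
    """Primary-key lookup: returns one row or None."""
    return conn.execute(
        "SELECT name FROM student WHERE student_id = ?", (student_id,)
    ).fetchone()


def find_by_major(conn: sqlite3.Connection, major: str) -> list:
    """Secondary-key lookup: may return many rows."""
    rows = conn.execute(
        "SELECT name FROM student WHERE major = ? ORDER BY name", (major,)
    )
    return [row[0] for row in rows]


def insert_duplicate_id(conn: sqlite3.Connection) -> bool:
    """Try to reuse primary key 1; return True only if the DBMS allowed it."""
    try:
        with conn:
            conn.execute("INSERT INTO student VALUES (1, 'Another Student', 'Finance')")
        return True
    except sqlite3.IntegrityError:
        return False
```

Calling `find_by_major(conn, "Management")` returns two names, which is exactly why major cannot serve as the primary key.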

22 ENTITY-RELATIONSHIP MODELING
Database designers plan the database design in a process called entity-relationship (ER) modeling.
ER diagrams consist of entities, attributes, and relationships.
Entity classes: groups of entities of a certain type.
Instance: the representation of a particular entity within an entity class.
Identifiers: attributes that are unique to an entity instance.

23 RELATIONSHIPS BETWEEN ENTITIES
Cardinality and modality are the indicators of the business rules in a relationship.
Cardinality refers to the maximum number of times an instance of one entity can be associated with an instance of the related entity.
Modality refers to the minimum number of times an instance of one entity can be associated with an instance of the related entity.
Figure 3.3 Cardinality and Modality Symbols
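One way to make the two definitions concrete is to treat modality as a minimum count and cardinality as a maximum count, and check a set of associations against them. The sketch below does that in plain Python. The business rules it uses (a student may hold zero or one parking permit; a student must be enrolled in at least one class, with no upper limit) are assumptions chosen for illustration, not rules stated in the slides.

```python
from collections import Counter
from typing import Optional


def check_relationship(
    instances: list,
    links: list,
    modality: int,
    cardinality: Optional[int],
) -> list:
    """Return the instances whose number of links breaks the business rule.

    modality    = minimum number of associations allowed
    cardinality = maximum number of associations allowed (None means "many")
    """
    counts = Counter(left for left, _right in links)
    violations = []
    for inst in instances:
        n = counts.get(inst, 0)
        too_few = n < modality
        too_many = cardinality is not None and n > cardinality
        if too_few or too_many:
            violations.append(inst)
    return violations


students = ["Peng Xu", "John Stevens"]
permit_links = [("Peng Xu", 91778)]  # (student, parking permit number)
class_links = [("Peng Xu", 76890)]   # (student, class number)

# Assumed rule: zero or one permit per student (modality 0, cardinality 1).
permit_violations = check_relationship(students, permit_links, 0, 1)

# Assumed rule: at least one class per student, no maximum (modality 1, cardinality many).
class_violations = check_relationship(students, class_links, 1, None)
```

Here John Stevens violates only the class rule, because the minimum (modality) of one class is not met, while the optional permit rule is satisfied by both students.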

24 ENTITY-RELATIONSHIP DIAGRAM MODEL
STUDENT, PARKING PERMIT, CLASS, and PROFESSOR are entity classes. An instance of an entity class is the representation of a particular entity. Therefore, a particular STUDENT (Peng Xu, ) is an instance of the STUDENT entity class; a particular parking permit (91778) is an instance of the PARKING PERMIT entity class; a particular class (76890) is an instance of the CLASS entity class; and a particular professor (Teresa De Carvalho, ) is an instance of the PROFESSOR entity class.

25 3.3 DATABASE MANAGEMENT SYSTEMS
Database management system (DBMS): a set of programs that provide users with tools to add, delete, access, and analyze data stored in one location.
Relational database model: based on the concept of two-dimensional tables.
Structured Query Language (SQL): allows users to perform complicated searches by using relatively simple statements or keywords.
Query by Example (QBE): allows users to fill out a grid or template to construct a sample or description of the data they want.
Data dictionary: provides information on each attribute, such as its name, whether it is a key or part of a key, the type of data expected (e.g., alphanumeric, numeric, dates), and valid values.
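A short sketch can show both an SQL query and a simple data dictionary. It assumes SQLite via Python's sqlite3 module; the table, the GPA values, and the 3.5 cutoff are hypothetical. SQLite's `PRAGMA table_info` statement returns one row per column (position, name, declared type, not-null flag, default value, primary-key flag), which is used here as a stand-in for a data dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "student_id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "major TEXT, "
    "gpa REAL)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        (1, "John Stevens", "Management", 3.4),
        (2, "Peng Xu", "Finance", 3.8),
        (3, "Hypothetical Student", "Finance", 2.9),
    ],
)

# SQL: a relatively simple statement that performs the search.
honour_roll = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM student WHERE gpa >= 3.5 ORDER BY name"
    )
]


def data_dictionary(conn: sqlite3.Connection, table: str) -> list:
    """Describe each attribute of a table: name, data type, required, key."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [
        {
            "name": r[1],           # attribute name
            "type": r[2],           # declared data type
            "required": bool(r[3]), # NOT NULL flag
            "is_key": bool(r[5]),   # part of the primary key
        }
        for r in rows
    ]


student_dictionary = data_dictionary(conn, "student")
```

A production data dictionary would also record valid values and descriptions; this sketch covers only what the DBMS itself stores about each attribute.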

26 STUDENT DATABASE EXAMPLE
Figure 3.5 Example of Student Database

27 NORMALIZATION
Normalization is a method for analyzing and reducing a relational database to its most streamlined form for minimum redundancy, maximum data integrity, and best processing performance.
Data are normalized when the attributes in a table depend only on the primary key.

28 NON-NORMALIZED RELATION
Consider the first column (labelled Order). This column contains multiple entries for each Order—four rows for Order 11, six rows for Order 12, and so on. These multiple rows for an Order are called repeating groups. The table also has multiple entities: ORDER, PART, SUPPLIER, and CUSTOMER. When you normalize the data, you want to eliminate repeating groups to create normalized tables, each containing only one entity.

29 NORMALIZING THE DATABASE (PART A)

30 NORMALIZING THE DATABASE (PART B)

31 NORMALIZATION PRODUCES ORDER
The normalization process breaks down the relation ORDER into smaller relations: ORDER, SUPPLIER, CUSTOMER, ORDERED-PARTS, and PART. Each of these relations describes a single entity. The result is conceptually simpler, and it eliminates repeating groups.
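The decomposition described above can be sketched in plain Python. The flat rows, codes, and names below are hypothetical (the figures with the actual data are not reproduced in this transcript), and the sketch assumes each part has a single supplier. It splits one non-normalized table into the five relations ORDER, CUSTOMER, SUPPLIER, PART, and ORDERED-PARTS.

```python
COLUMNS = (
    "order", "customer", "customer_name",
    "part", "part_desc", "supplier", "supplier_name", "qty",
)

# Non-normalized relation: order 11 repeats once per part ordered.
flat_orders = [
    (11, "C1", "Customer A", "P1", "Bolt", "S1", "Supplier X", 10),
    (11, "C1", "Customer A", "P2", "Nut", "S2", "Supplier Y", 5),
    (12, "C2", "Customer B", "P1", "Bolt", "S1", "Supplier X", 7),
]


def normalize(rows: list) -> dict:
    """Split the flat rows into relations that each describe one entity."""
    orders, customers, suppliers, parts, ordered_parts = {}, {}, {}, {}, []
    for row in rows:
        r = dict(zip(COLUMNS, row))
        # Keyed dictionaries store each entity once, however often it repeats.
        customers[r["customer"]] = {"name": r["customer_name"]}
        suppliers[r["supplier"]] = {"name": r["supplier_name"]}
        parts[r["part"]] = {"description": r["part_desc"], "supplier": r["supplier"]}
        orders[r["order"]] = {"customer": r["customer"]}
        # ORDERED-PARTS links an order to a part; quantity depends on both.
        ordered_parts.append({"order": r["order"], "part": r["part"], "qty": r["qty"]})
    return {
        "order": orders,
        "customer": customers,
        "supplier": suppliers,
        "part": parts,
        "ordered_parts": ordered_parts,
    }


tables = normalize(flat_orders)
```

After the split, the name "Customer A" is stored once instead of once per ordered part, so a change to it is made in a single place.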

32 3.4 DATA WAREHOUSING AND DATA MARTS
A data warehouse is a repository of historical data organized by subject to support decision makers in the organization.
A data mart is a low-cost, scaled-down version of a data warehouse that is designed for end-user needs in a small organization, or in a strategic business unit or department of a large organization.
Basic characteristics of data warehouses and data marts:
Organized by business dimension or subject: Data are organized by subject (for example, by customer, vendor, product, price level, and region).
Use online analytical processing (OLAP): OLAP involves the analysis of accumulated data by end users (usually in a data warehouse). In contrast, online transaction processing (OLTP) typically involves a database, where data from business transactions are processed online as soon as they occur.
Integrated: Data are collected from multiple systems and are integrated around subjects.
Time variant: Data warehouses and data marts maintain historical data.
Nonvolatile: Data warehouses and data marts are nonvolatile, meaning that only IT professionals can change or update the data.
Multidimensional: Data warehouses and data marts store data in a multidimensional structure, which consists of more than two dimensions. A common representation for this structure is the data cube.

33 THE ENVIRONMENT FOR DATA WAREHOUSING AND DATA MARTS
The environment for data warehouses and data marts includes:
Source systems that provide data to the data warehouse or data mart.
Data integration technology and processes that are needed to prepare the data for use.
Different architectures for storing data in an organization’s data warehouse or data marts.
Different BI tools and applications for the variety of users.
Metadata, data quality, and governance processes, which must be in place to ensure that the data warehouse or data mart meets its purposes.

34 DATA WAREHOUSE FRAMEWORK
This figure shows the process of building and using a data warehouse.

35 RELATIONAL DATABASES
The next series of slides shows the relationship between relational databases and a multidimensional data structure (or data cube).

36 MULTIDIMENSIONAL DATABASE
A common source for the data in data warehouses is the company’s operational databases, which can be relational databases. This figure displays how data would be represented by a three-dimensional matrix (or data cube).

37 EQUIVALENCE BETWEEN RELATIONAL AND MULTIDIMENSIONAL DATABASES
This figure displays the equivalence between these relational and multidimensional databases.
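The equivalence between the two representations can be sketched in plain Python. The dimensions chosen here (product, region, year) and all sales figures are hypothetical, since the figures themselves are not reproduced in this transcript. Relational rows are loaded into a cube keyed by one value per dimension, and fixing any subset of dimensions gives a slice of the cube.

```python
from collections import defaultdict

# Relational form: one row per (product, region, year) with a sales measure.
sales_rows = [
    ("Nuts", "East", 2011, 50),
    ("Nuts", "West", 2011, 60),
    ("Bolts", "East", 2011, 40),
    ("Nuts", "East", 2012, 70),
    ("Bolts", "West", 2012, 90),
]


def build_cube(rows: list) -> dict:
    """Store the same facts in a three-dimensional structure (a data cube)."""
    cube = defaultdict(int)
    for product, region, year, sales in rows:
        cube[(product, region, year)] += sales
    return dict(cube)


def total(cube: dict, product=None, region=None, year=None) -> int:
    """Sum sales over every cell that matches the dimensions that were fixed."""
    return sum(
        value
        for (p, r, y), value in cube.items()
        if (product is None or p == product)
        and (region is None or r == region)
        and (year is None or y == year)
    )


cube = build_cube(sales_rows)
nuts_sales = total(cube, product="Nuts")                  # slice on one dimension
east_2011_sales = total(cube, region="East", year=2011)   # slice on two dimensions
```

Both forms hold the same facts; the cube simply makes it cheap to ask questions along any combination of dimensions, which is what OLAP tools rely on.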

38 DATA INTEGRATION (ETL)
Data integration extracts data from source systems, transforms them, and loads them into a data mart or warehouse (ETL).
It can be performed by hand-written code (e.g., SQL queries) or by commercial data-integration software.
Data can be transformed to make them more useful.
The period of time during which new data are loaded into the warehouse or mart is known as the “load window.”
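A hand-written ETL step might look like the following minimal sketch. It assumes two in-memory SQLite databases standing in for a source system and a warehouse; the table names, the messy source values, and the cleaning rules (trim whitespace, standardize capitalization, convert amounts to numbers) are all hypothetical.

```python
import sqlite3

# Source system: operational data with inconsistent formatting.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (product TEXT, region TEXT, amount TEXT)")
source.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(" nuts ", "east", "50"), ("Bolts", "WEST", "40.5"), ("nuts", "East", "25")],
)

# Target: the data warehouse (or data mart).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales_fact (product TEXT, region TEXT, amount REAL)")


def extract(conn: sqlite3.Connection) -> list:
    """Extract: read the raw rows from the source system."""
    return conn.execute("SELECT product, region, amount FROM sales").fetchall()


def transform(rows: list) -> list:
    """Transform: clean the rows so that values are consistent and typed."""
    return [
        (product.strip().title(), region.strip().title(), float(amount))
        for product, region, amount in rows
    ]


def load(conn: sqlite3.Connection, rows: list) -> None:
    """Load: write the cleaned rows into the warehouse in one transaction."""
    with conn:
        conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)


clean_rows = transform(extract(source))
load(warehouse, clean_rows)
```

After the transform step, " nuts " and "nuts" are recognized as the same product, so warehouse queries that group by product give consistent totals.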

39 STORING THE DATA
Central enterprise data warehouse: the most common architecture is one central enterprise data warehouse, without data marts.
Independent data marts: store data for a single application or a few applications, such as in marketing or finance.
Hub and spoke: stores data in a central data warehouse while simultaneously maintaining dependent data marts that obtain their data from the central repository.

40 STORING DATA (CONTINUED)
Metadata: data about data. IT personnel need information about data sources; database, table, and column names; refresh schedules; and data usage measures. Users’ needs include data definitions, the available report/query tools, report distribution information, and help desk contact information.
Data quality: The quality of the data in the warehouse must be adequate to satisfy users’ needs.
Governance: requires that people, committees, and processes be in place.
Users: There are a large number of potential BI users, including IT developers; front-line workers; analysts; information workers; managers and executives; and suppliers, customers, and regulators. Some users are information producers, who create information for others; IT developers and analysts typically fall into this category. Other users, including managers and executives, are information consumers who utilize information created by others.

41 3.5 KNOWLEDGE MANAGEMENT
Knowledge management (KM): a process that helps organizations manipulate important knowledge that is part of the organization’s memory, usually in an unstructured format.
Knowledge: information that is contextual, relevant, and useful.
Intellectual capital (or intellectual assets): another term often used for knowledge.

42 KNOWLEDGE MANAGEMENT (CONTINUED)
Explicit knowledge: objective, rational, technical knowledge that has been documented. Examples: policies, procedural guides, reports, products, strategies, goals, core competencies.
Tacit knowledge: cumulative store of subjective or experiential learning. Examples: experiences, insights, expertise, know-how, trade secrets, understanding, skill sets, and learning.

43 KNOWLEDGE MANAGEMENT (CONTINUED)
Knowledge management systems (KMSs): systems that use modern information technologies (the Internet, intranets, extranets, databases) to systematize, enhance, and expedite knowledge management within a single firm and among multiple firms.
Best practices: the most effective and efficient ways of doing things.

44 KNOWLEDGE MANAGEMENT SYSTEM CYCLE
1. Create knowledge. Knowledge is created as people determine new ways of doing things or develop know-how. Sometimes external knowledge is brought in.
2. Capture knowledge. New knowledge must be identified as valuable and be represented in a reasonable way.
3. Refine knowledge. New knowledge must be placed in context so that it is actionable. This is where tacit qualities (human insights) must be captured along with explicit facts.
4. Store knowledge. Useful knowledge must then be stored in a reasonable format in a knowledge repository so that other members of the organization can access it.
5. Manage knowledge. Like a library, the knowledge must be kept current. To accomplish this objective, knowledge must be reviewed regularly to verify that it is relevant and accurate.
6. Disseminate knowledge. Knowledge must be made available in a useful format to anyone in the organization who needs it, anywhere and anytime.

45 KNOWLEDGE MANAGEMENT SYSTEM CYCLE

46 CHAPTER CLOSING
Organizations can use knowledge management to develop best practices, the most effective and efficient ways of doing things, and to make these practices readily available to a wide range of employees.
The database approach minimizes data redundancy, data isolation, and data inconsistency, and it maximizes data security, data integrity, and data independence.
Master data management provides companies with the ability to store, maintain, exchange, and synchronize a consistent, accurate, and timely “single version of the truth” for the company’s core master data.
