Presentation is loading. Please wait.

Presentation is loading. Please wait.

Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D.

Similar presentations


Presentation on theme: "Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D."— Presentation transcript:

1 Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D.

2 File Organization Concepts Computer system uses hierarchies –Database: Group of related files –File: Group of records of same type –Record: Group of related fields Record: Describes an entity (person, place, thing) –Field: Group of characters Attribute: Characteristic describing the entity –e.g., Date or Grade belong to entity COURSE

3 The Data Hierarchy

4 Problems With Traditional File Environment Files maintained by different departments –Data redundancy and inconsistency Data redundancy: Duplicate data in multiple files Data inconsistency: Same attribute, different values –Program-data dependence Changes in program requires changes to data accessed by program

5 Traditional File Processing

6 The Database Approach to Data Management Database –Data organized to serve many applications by centralizing data and controlling redundant data Database management system (DBMS) –Separates logical and physical views of data –Solves problems of traditional file environment Controls redundancy Eliminated inconsistency Enables central management and security

7 Human Resources Database with Multiple Views

8 Relational DBMS Data as 2-dimension tables called relations or files Each table contains data on entity and attributes Table: Grid of columns and rows –Rows: Records for different entities –Columns: Represents attribute (field) for entity –Key field: Field used to uniquely identify each record Primary key: Field in table used for key fields Foreign key: Primary key used in second table as look-up field to identify records from original table

9 Relational Database Tables

10

11 Operations of a Relational DBMS Basic operations to develop useful sets of data –SELECT: Creates subset of data of all records that meet stated criteria –JOIN: Combines relational tables to provide more information than available in individual tables

12 Basic Relational DBMS Operations Select Part_Number = 137 or 150, Join by Supplier_Number

13 Example of an SQL Query Select Statement: Query data for specific info Conditional Selection: ID which rows of a table are displayed, based on criteria contained in the WHERE clause Joining Two Tables: Used to combine data from two or more tables and display the results

14 An Access Query

15 Designing Databases Design process identifies –Relationships among data elements, redundant database elements –Most efficient way to group data elements to meet business requirements, needs of app programs Normalization –Minimize redundant data elements

16 Normalization of Order

17 Using Databases to Improve Performance and Decision Making For very large databases and systems, special capabilities and tools are required for analyzing large quantities of data and for accessing data from multiple systems –Data warehousing –Data mining

18 Database Warehouses Store current and historical data from many core operational transaction systems Consolidates and standardizes information for use across enterprise, but data cannot be altered Provide query, analysis, and reporting tools

19 Components of a Data Warehouse

20 Business Intelligence Tools for consolidating, analyzing, and providing access to vast amounts of data to help users make better business decisions –e.g., Harrah’s Entertainment analyzes customers to develop gambling profiles and identify most profitable customers Principle tools include –Software for database query and reporting –Online analytical processing (OLAP) –Data mining

21 Online Analytical Processing (OLAP) Supports multidimensional data analysis –Gives first glimpse of possible relationships Enables viewing data using multiple dimensions –Each aspect of information (product, pricing, cost, region, time period) is different dimension e.g., How many washers sold in East in June? OLAP enables rapid, online answers to ad hoc queries

22 Multidimensional Data Model

23 Data Mining More discovery driven than OLAP –Finds hidden patterns, relationships in large dbs –Infers rules to predict future behavior –The patterns and rules are used to guide decision making and forecast the effect of those decisions Popularly used to provide detailed analyses of patterns in customer data for 1:1 marketing campaigns or to identify profitable customers

24 Using Databases to Improve Performance and Decision Making Predictive analysis –Uses data mining techniques, historical data, and assumptions about future conditions to predict outcomes of events e.g., Probability a customer will respond to an offer or purchase a specific product Data mining seen as challenge to individual privacy –Combines information from many diverse sources to create detailed “data image” about each of us e.g., income, driving habits, hobbies, families, and political interests

25 Text Mining: For’ and ‘Against’ Exercise 1.Read the article and the following statement. 2.Summarize the best evidence you can give FOR, or in support of, the statement. 3.Summarize the best evidence you can give AGAINST the statement. 4.Include only accurate evidence The benefits of text mining greatly outweigh the costs.

26 Web Mining Discovery and analysis of useful patterns and information from WWW –e.g., to understand customer behavior, evaluate effectiveness of Web site, etc. Web content mining –Knowledge extracted from content of Web pages Web structure mining –e.g., links to and from Web page Web usage mining –User interaction data recorded by Web server

27 Managing Data Resources Establishing an information policy –Information policy: Specifies firm’s rules, procedures, roles for sharing, standardizing data –Data administration: Responsible for specific policies and procedures; data governance –Database administration: Database design and management group responsible for defining, organizing, implementing, maintaining database

28 Ensuring Data Quality More than 25% critical data in Fortune 1000 company databases is inaccurate or incomplete –Before new database in place, need to identify and correct faulty data and establish better routines for editing data once database in operation –Most data quality problems stem from faulty input

29 Managing Data Resources Data quality audit –Structured survey of the accuracy and level of completeness of the data in an IS Data cleansing –Detecting, and correcting data that are incorrect, incomplete, improperly formatted, or redundant –Enforces consistency among different sets of data from separate IS


Download ppt "Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D."

Similar presentations


Ads by Google