Download presentation

Presentation is loading. Please wait.

Published byMadelyn Quarterman Modified over 2 years ago

1
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-1 David M. Kroenke Database Processing Chapter 15 Business Intelligence & Data Warehousing

2
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-2 Business Intelligence (BI) Systems Business Intelligence (BI) systems are information systems that assist managers and other professionals: –To analyze current and past activities, and –To predict future events. Two broad categories: –Reporting –Data mining

3
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-3 The Relationship of Operational and BI Applications

4
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-4 Reporting Applications Reporting system applications: –Filter, Sort, Group, Simple Calculations using SQL –Classify entities (customers, products, employees, etc.) RFM Analysis –Deal with critical report delivery

5
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-5 Data Mining Applications Data mining applications are used to: –Perform what-if analysis –Make predictions –Facilitate decision making Data mining applications use sophisticated statistical and mathematical techniques. Report delivery is not as critical.

6
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-6 Need for Data Warehousing Integrated, company-wide view of high-quality information (from disparate databases) Separation of operational and informational systems and data (for improved performance) Comparison of Operational and Informational Systems

7
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-7 Definitions Data Warehouse: –A subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision- making processes –Subject-oriented: e.g. customers, patients, students, products –Integrated: Consistent naming conventions, formats, encoding structures; from multiple data sources –Time-variant: Can study trends and changes –Non-updatable: Read-only, periodically refreshed Data Mart : –A data warehouse that is limited in scope

8
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-8 Data Warehouse vs. Data Mart

9
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-9 Components of a Data Warehouse

10
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-10 Data Warehouse and Data Marts

11
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-11 Data Warehouses and Data Marts: Problems of Using Transaction Data for BI

12
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-12 The ETL Process Extract/Capture –Static vs. Incremental Transform –Scrub or data cleansing –Data selection, joining, aggregation Load and Index –Refresh vs. Update

13
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-13 Data Warehouse ETL Sequence

14
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-14 Components of a Star Schema

15
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-15 Star Schema Example

16
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-16 Issues Regarding Star Schema Dimension table keys should be surrogate : –Keys may change over time –Length/format consistency Granularity of Fact Table – what level of detail? –Transactional grain – finest level –Aggregated grain – more summarized –Finer grain: better market basket analysis capability, but much more data (more dimension tables, more rows in fact table) Duration of the database – how much history should be kept? –Natural duration – 13 months or 5 quarters –Financial institutions may need longer duration –Older data is more difficult to source and cleanse

17
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-17 Data Warehousing at Wal-Mart As of 2000 (Foote and Krishnamurthi, 2001) –Held 7.5 TB, with plans to reach 24 TB (1TB = 250M pages of text) –Kept 65 weeks of data –Had invested $4 Billion –Power users generated $12,000/query As of 2005 (Wall Street Journal, December 3-4, 2005) –Held 570 TB (more than Internet’s fixed pages) –Predicted Hurricane Ivan would spur demand for easy breakfasts Stocked Florida stores with Pop-Tarts

18
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-18 Reporting Systems: RFM Analysis RFM Analysis analyzes and ranks customers according to purchasing patterns: –R = Recent (most recent order) –F = Frequent (how often an order is made) –M = Money (dollar amount of orders) Customers are sorted into five groups, each containing 20% of the customers. Each group is given a numerical value: –1 = Top 20% –2, 3, 4 = Each 20% in between top and bottom 20% –5 = Bottom 20%

19
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-19 Reporting Systems: RFM Analysis (Continued) Ajax ordered recently (1), orders often (1) but does not order the most expensive items (3) – Try to sell Ajax more expensive goods! Bloominghams has not ordered recently (5), but has ordered often (1) and purchased the most expensive items (1). This customer may be looking for a different vendor – better call!

20
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-20 Reporting Systems: OnLine Analytical Processing [OLAP] An OLAP report has measures and dimensions: –Measure — A data item of interest. –Dimension — A characteristic of a measure. OLAP cube — A presentation of a measure with associated dimensions. –An OLAP cube can have any number of axes. –The terms OLAP cube and OLAP report are synonymous. OLAP allows drill-down — a further division of the data into more detail.

21
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-21 Reporting Systems: OLAP Drill Down: Product Family by Store Type

22
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-22 Reporting Systems: OLAP Drill Down: Product Family and Store Location by Store Type

23
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-23 Reporting Systems: OLAP Drill Down: Store Location and Product Family by Store Type

24
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-24 Data Mining Applications Data mining applications use sophisticated statistical and mathematical techniques to find patterns and relationships that can be used to classify and predict. –Unsupervised data mining — Statistical techniques are used to identify groups of entities with similar characteristics. Cluster Analysis –Supervised data mining: A model is developed. Statistical techniques are used to estimate parameter values of the model. –Regression analysis

25
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-25 Data Mining Applications: The Convergence of the Disciplines

26
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-26 Data Mining Applications: Three Popular Data Mining Techniques Decision tree analysis — Classifies entities into groups based on past history. Logistic regression — Produces equations that offer probabilities that certain events will occur. Neural Networks — Complex statistical prediction techniques

27
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-27 Data Mining Applications: Market Basket Analysis Market Basket Analysis — Determines patterns of associated buying behavior. –Support — The probability that two items will be purchased together. –Confidence — The probability that an item will be purchased given the fact that the customer has already purchased another particular item. –Lift — the ratio of confidence to the basic probability that a particular item will be purchased.

28
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-28 Data Mining Applications: Market Basket Analysis Example

Similar presentations

Presentation is loading. Please wait....

OK

Chapter 11: Data Warehousing

Chapter 11: Data Warehousing

© 2017 SlidePlayer.com Inc.

All rights reserved.

Ads by Google