DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-1 David M. Kroenke Database Processing Chapter 15 Business Intelligence.

1 DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 15-1 David M. Kroenke Database Processing Chapter 15 Business Intelligence & Data Warehousing

2 Business Intelligence (BI) Systems
Business Intelligence (BI) systems are information systems that assist managers and other professionals:
– To analyze current and past activities, and
– To predict future events.
Two broad categories:
– Reporting
– Data mining

3 The Relationship of Operational and BI Applications

4 Reporting Applications
Reporting system applications:
– Filter, sort, group, and make simple calculations using SQL
– Classify entities (customers, products, employees, etc.), for example with RFM analysis
– Deal with critical report delivery
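
The filter, sort, group, and simple-calculation operations listed above map directly onto SQL clauses. The following is a minimal sketch using Python's built-in sqlite3 module; the `orders` table, its columns, and the sample rows are hypothetical and are not taken from the slides.

```python
import sqlite3

# Hypothetical sample data; table and column names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Ajax", "West", 120.0),
        ("Ajax", "West", 80.0),
        ("Bloominghams", "East", 900.0),
        ("Caruthers", "East", 40.0),
    ],
)

# Filter (WHERE), group (GROUP BY), simple calculations (COUNT, SUM), sort (ORDER BY).
report = conn.execute(
    """
    SELECT region, COUNT(*) AS order_count, SUM(amount) AS total_sales
    FROM orders
    WHERE amount >= 50
    GROUP BY region
    ORDER BY total_sales DESC
    """
).fetchall()

for region, order_count, total_sales in report:
    print(region, order_count, total_sales)
```

Each clause corresponds to one of the reporting operations, so a report definition can often be read straight off the query.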

5 Data Mining Applications
Data mining applications are used to:
– Perform what-if analysis
– Make predictions
– Facilitate decision making
Data mining applications use sophisticated statistical and mathematical techniques.
Report delivery is not as critical as it is for reporting applications.

6 Need for Data Warehousing
– Integrated, company-wide view of high-quality information (from disparate databases)
– Separation of operational and informational systems and data (for improved performance)
Comparison of Operational and Informational Systems

7 Definitions
Data Warehouse:
– A subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes
– Subject-oriented: e.g., customers, patients, students, products
– Integrated: consistent naming conventions, formats, and encoding structures; drawn from multiple data sources
– Time-variant: can study trends and changes
– Non-updatable: read-only, periodically refreshed
Data Mart:
– A data warehouse that is limited in scope

8 Data Warehouse vs. Data Mart

9 Components of a Data Warehouse

10 Data Warehouse and Data Marts

11 Data Warehouses and Data Marts: Problems of Using Transaction Data for BI

12 The ETL Process
Extract/Capture
– Static vs. incremental
Transform
– Scrubbing (data cleansing)
– Data selection, joining, aggregation
Load and Index
– Refresh vs. update
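
The three ETL stages can be sketched in a few lines. This is a minimal illustration only, assuming hypothetical source records and an in-memory list standing in for the warehouse. The incremental extract is modeled with a `last_loaded` date, which is an assumption and not something the slides specify.

```python
from datetime import date

# Hypothetical operational records (names and fields are assumptions).
SOURCE = [
    {"id": 1, "customer": " ajax ", "amount": "120.50", "updated": date(2006, 1, 5)},
    {"id": 2, "customer": "Bloominghams", "amount": "900", "updated": date(2006, 2, 1)},
    {"id": 3, "customer": "CARUTHERS", "amount": None, "updated": date(2006, 2, 3)},
]


def extract(source, last_loaded=None):
    """Static extract takes everything; incremental takes rows changed since last_loaded."""
    if last_loaded is None:
        return list(source)
    return [row for row in source if row["updated"] > last_loaded]


def transform(rows):
    """Scrub: drop rows with missing amounts, normalize names, convert types."""
    cleaned = []
    for row in rows:
        if row["amount"] is None:
            continue  # unusable record; a real pipeline would log or repair it
        cleaned.append(
            {
                "id": row["id"],
                "customer": row["customer"].strip().title(),
                "amount": float(row["amount"]),
            }
        )
    return cleaned


def load(warehouse, rows):
    """Update mode: append new rows; a refresh would rebuild the target instead."""
    warehouse.extend(rows)
    return warehouse


warehouse = []
load(warehouse, transform(extract(SOURCE)))  # static (full) extract
delta = transform(extract(SOURCE, last_loaded=date(2006, 1, 31)))  # incremental extract
print(len(warehouse), [r["customer"] for r in delta])
```

Keeping the stages as separate functions makes it easy to swap a static extract for an incremental one without touching the scrubbing or loading logic.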

13 Data Warehouse ETL Sequence

14 Components of a Star Schema

15 Star Schema Example

16 Issues Regarding Star Schema
Dimension table keys should be surrogate keys:
– Natural (business) keys may change over time
– Length/format consistency
Granularity of the fact table: what level of detail?
– Transactional grain: finest level
– Aggregated grain: more summarized
– Finer grain: better market basket analysis capability, but much more data (more dimension tables, more rows in fact table)
Duration of the database: how much history should be kept?
– Natural duration: 13 months or 5 quarters
– Financial institutions may need longer duration
– Older data is more difficult to source and cleanse
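
The surrogate-key point can be illustrated with a small schema. The sketch below uses Python's built-in sqlite3 module; the table names `product_dim` and `sales_fact`, their columns, and the sample product are all hypothetical. It shows a fact table at transactional grain that references the dimension only through a warehouse-generated key, so a change to the source system's product code leaves the fact rows untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension table: surrogate key generated by the warehouse,
    -- natural (source-system) key kept as an ordinary attribute.
    CREATE TABLE product_dim (
        product_key  INTEGER PRIMARY KEY AUTOINCREMENT,
        product_code TEXT NOT NULL,
        description  TEXT
    );
    -- Fact table at transactional grain: one row per line item sold.
    CREATE TABLE sales_fact (
        product_key INTEGER NOT NULL REFERENCES product_dim(product_key),
        sale_date   TEXT NOT NULL,
        quantity    INTEGER NOT NULL,
        amount      REAL NOT NULL
    );
    """
)
cur = conn.execute(
    "INSERT INTO product_dim (product_code, description) VALUES (?, ?)",
    ("PT-100", "Toaster pastry"),
)
key = cur.lastrowid
conn.execute("INSERT INTO sales_fact VALUES (?, '2006-01-05', 2, 5.98)", (key,))

# The source system renumbers the product; fact rows are untouched
# because they reference the surrogate key, not the product code.
conn.execute(
    "UPDATE product_dim SET product_code = 'PASTRY-0100' WHERE product_key = ?",
    (key,),
)

row = conn.execute(
    """
    SELECT d.product_code, SUM(f.amount)
    FROM sales_fact f JOIN product_dim d USING (product_key)
    GROUP BY d.product_code
    """
).fetchone()
print(row)
```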

17 Data Warehousing at Wal-Mart
As of 2000 (Foote and Krishnamurthi, 2001):
– Held 7.5 TB, with plans to reach 24 TB (1 TB = 250M pages of text)
– Kept 65 weeks of data
– Had invested $4 billion
– Power users generated $12,000/query
As of 2005 (Wall Street Journal, December 3-4, 2005):
– Held 570 TB (more than the Internet’s fixed pages)
– Predicted Hurricane Ivan would spur demand for easy breakfasts, and stocked Florida stores with Pop-Tarts

18 Reporting Systems: RFM Analysis
RFM Analysis analyzes and ranks customers according to purchasing patterns:
– R = Recent (how recently the customer ordered)
– F = Frequent (how often the customer orders)
– M = Money (dollar amount of orders)
For each of R, F, and M, customers are sorted into five groups, each containing 20% of the customers. Each group is given a numerical value:
– 1 = Top 20%
– 2, 3, 4 = Each 20% in between the top and bottom 20%
– 5 = Bottom 20%
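
The scoring rule above can be sketched as a small function that ranks customers separately on each of the three measures. The customer figures below are made up for illustration, and the handling of ties (broken by input order) is an assumption, since the slides do not say how ties are resolved.

```python
from datetime import date

# Hypothetical customers: (name, last order date, number of orders, total dollars).
CUSTOMERS = [
    ("Ajax", date(2006, 3, 1), 40, 3000.0),
    ("Bloominghams", date(2005, 6, 1), 38, 9000.0),
    ("Caruthers", date(2006, 1, 15), 5, 500.0),
    ("Davidson", date(2005, 11, 20), 12, 4000.0),
    ("Eagle", date(2006, 2, 10), 20, 1500.0),
]


def quintile_scores(values):
    """Map each value to 1..5, where 1 is the top 20% (largest values)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    scores = [0] * len(values)
    for rank, idx in enumerate(order):
        scores[idx] = rank * 5 // len(values) + 1
    return scores


def rfm(customers):
    """Return {name: (R, F, M)} with each score computed independently."""
    r = quintile_scores([c[1].toordinal() for c in customers])  # later date = more recent
    f = quintile_scores([c[2] for c in customers])
    m = quintile_scores([c[3] for c in customers])
    return {c[0]: (r[i], f[i], m[i]) for i, c in enumerate(customers)}


scores = rfm(CUSTOMERS)
for name, (r, f, m) in scores.items():
    print(f"{name}: R={r} F={f} M={m}")
```

Because each measure is ranked on its own, a customer can score well on one and poorly on another, which is what makes the three-digit score useful for targeting.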

19 Reporting Systems: RFM Analysis (Continued)
Ajax has ordered recently (1) and orders often (1), but does not order the most expensive items (3). Try to sell Ajax more expensive goods!
Bloominghams has not ordered recently (5), but has ordered often (1) and purchased the most expensive items (1). This customer may be looking for a different vendor. Better call!

20 Reporting Systems: OnLine Analytical Processing (OLAP)
An OLAP report has measures and dimensions:
– Measure: a data item of interest.
– Dimension: a characteristic of a measure.
OLAP cube: a presentation of a measure with associated dimensions.
– An OLAP cube can have any number of axes.
– The terms OLAP cube and OLAP report are synonymous.
OLAP allows drill-down: a further division of the data into more detail.
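
Measures, dimensions, and drill-down can be sketched with plain aggregation. In the example below the measure is a sales amount and the dimensions are product family, store type, and store state; the rows and the helper name `cube` are hypothetical. Drilling down simply means aggregating over one more dimension.

```python
from collections import defaultdict

# Hypothetical sales rows: (product_family, store_type, store_state, store_sales).
SALES = [
    ("Drink", "Supermarket", "CA", 100.0),
    ("Drink", "Supermarket", "WA", 50.0),
    ("Food", "Supermarket", "CA", 300.0),
    ("Food", "Deluxe Supermarket", "WA", 200.0),
    ("Drink", "Deluxe Supermarket", "CA", 25.0),
]
DIMENSIONS = {"product_family": 0, "store_type": 1, "store_state": 2}


def cube(rows, dims):
    """Sum the measure (store_sales) for each combination of the chosen dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[DIMENSIONS[d]] for d in dims)
        totals[key] += row[3]
    return dict(totals)


# Summary view: product family by store type.
summary = cube(SALES, ["product_family", "store_type"])
# Drill down: split each cell further by store state.
detail = cube(SALES, ["product_family", "store_type", "store_state"])

print(summary[("Drink", "Supermarket")])
print(detail[("Drink", "Supermarket", "CA")], detail[("Drink", "Supermarket", "WA")])
```

The detail cells for a given family and store type always add up to the corresponding summary cell, which is a handy sanity check on any drill-down report.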

21 Reporting Systems: OLAP Drill Down: Product Family by Store Type

22 Reporting Systems: OLAP Drill Down: Product Family and Store Location by Store Type

23 Reporting Systems: OLAP Drill Down: Store Location and Product Family by Store Type

24 Data Mining Applications
Data mining applications use sophisticated statistical and mathematical techniques to find patterns and relationships that can be used to classify and predict.
– Unsupervised data mining: statistical techniques are used to identify groups of entities with similar characteristics.
  Example: cluster analysis
– Supervised data mining: a model is developed, and statistical techniques are used to estimate parameter values of the model.
  Example: regression analysis
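
Supervised data mining, as described above, starts from a model and estimates its parameters from data. The following is a minimal sketch, assuming a straight-line model y = a + b*x and made-up data points; ordinary least squares is used here as the estimation technique, which is one common choice and not necessarily the one the slides have in mind.

```python
def fit_line(xs, ys):
    """Estimate intercept a and slope b of y = a + b*x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data: advertising spend (x) vs. units sold (y).
spend = [1.0, 2.0, 3.0, 4.0]
units = [3.0, 5.0, 7.0, 9.0]
a, b = fit_line(spend, units)

# Use the fitted model to predict an unseen case.
prediction = a + b * 5.0
print(a, b, prediction)
```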

25 Data Mining Applications: The Convergence of the Disciplines

26 Data Mining Applications: Three Popular Data Mining Techniques
– Decision tree analysis: classifies entities into groups based on past history.
– Logistic regression: produces equations that offer probabilities that certain events will occur.
– Neural networks: complex statistical prediction techniques.
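
For reference, the kind of equation that logistic regression produces has the standard form below. The symbols are generic notation and are not taken from the slides: $x_1, \dots, x_k$ are the predictor values and $\beta_0, \dots, \beta_k$ are the estimated coefficients.

```latex
P(\text{event} \mid x_1, \dots, x_k) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}
```

The result always lies between 0 and 1, which is why it can be read as a probability.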

27 Data Mining Applications: Market Basket Analysis
Market Basket Analysis: determines patterns of associated buying behavior.
– Support: the probability that two items will be purchased together.
– Confidence: the probability that an item will be purchased given that the customer has already purchased another particular item.
– Lift: the ratio of confidence to the base probability that a particular item will be purchased.
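
The three measures defined above can be computed directly from a list of transactions. The sketch below estimates each probability as a fraction of baskets; the baskets themselves and the function names are hypothetical.

```python
# Hypothetical transactions; each set is one customer's basket.
BASKETS = [
    {"mask", "fins"},
    {"mask", "fins", "tank"},
    {"mask", "tank"},
    {"fins"},
    {"tank", "weights"},
]


def probability(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    return sum(1 for b in baskets if items <= b) / len(baskets)


def support(a, b, baskets):
    """P(a and b purchased together)."""
    return probability({a, b}, baskets)


def confidence(a, b, baskets):
    """P(b purchased | a purchased)."""
    return support(a, b, baskets) / probability({a}, baskets)


def lift(a, b, baskets):
    """Confidence divided by the base probability of buying b."""
    return confidence(a, b, baskets) / probability({b}, baskets)


print(support("mask", "fins", BASKETS))
print(confidence("mask", "fins", BASKETS))
print(lift("mask", "fins", BASKETS))
```

A lift above 1 suggests that buying the first item makes the second more likely than its base rate; a lift below 1 suggests the opposite.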

28 Data Mining Applications: Market Basket Analysis Example

