Published by Julianna Daniel. Modified over 5 years ago.
1
Business Analytics and Decision Making OLTP, OLAP & SAP
Chapter 9 & SAP Materials Updated 2019
2
OLTP vs. OLAP
Online Transaction Processing (OLTP) = relational database systems
Online Analytical Processing (OLAP)
3
OLAP via Data Warehousing
Online Transaction Processing (OLTP): querying databases with 3NF tables. Operations' data feeds predefined reports.
Periodic transfers move operations' data, along with flat files, into the data warehouse.
Online Analytical Processing (OLAP): data warehousing and data mining, usually on de-normalized data. Supports interactive data analysis.
4
OLTP & OLAP in Enterprise Systems
Enterprise systems (Enterprise Resource Planning systems) support both.
Example: An SAP-based system can be a TPS, MIS, and DSS for the entire organization.
DSS capability draws on data warehousing and cubing.
Other major software vendors also offer OLAP: Microsoft, IBM, Oracle…
5
Data Warehousing Goals
Data warehouse (DW):
- Integrates data from different sources to get a larger picture of the business.
- Yields a multidimensional view of data by creating data cubes.
- Allows for statistical analysis on large data sets (testing hypotheses about relationships between pieces of data).
- Allows for discovering new relationships by querying cubes or by applying data mining software.
6
Extraction, Transformation, and Loading
Preparations performed on data: the ETL process.
- Extract: transaction data from diverse systems.
- Transform/Transport: convert "Client" to "Customer"; apply standard product numbers; convert currencies; fix region codes.
- Load: into the data warehouse, where all data must be consistent.
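The transform step above can be sketched in Python. This is only an illustration: the record layout, the product-number map, the exchange rates, and the region-code map are hypothetical values, not taken from the slides.

```python
# Hypothetical lookup tables (assumptions for illustration only).
STANDARD_PRODUCT_NUMBERS = {"A-17": "P0017", "17a": "P0017"}
USD_PER_UNIT = {"USD": 1.0, "EUR": 1.1}  # assumed exchange rates
REGION_FIXES = {"N.E.": "NE", "northeast": "NE"}


def transform(record: dict) -> dict:
    """Bring one extracted transaction record into the warehouse's conventions."""
    out = dict(record)
    # Convert "Client" to "Customer".
    if "Client" in out:
        out["Customer"] = out.pop("Client")
    # Apply standard product numbers.
    out["Product"] = STANDARD_PRODUCT_NUMBERS.get(out["Product"], out["Product"])
    # Convert currencies to one reporting currency (USD is an assumption).
    out["Amount"] = round(out["Amount"] * USD_PER_UNIT[out["Currency"]], 2)
    out["Currency"] = "USD"
    # Fix region codes.
    out["Region"] = REGION_FIXES.get(out["Region"], out["Region"])
    return out


def etl(extracted: list) -> list:
    """Extraction is assumed done; transform each record and 'load' into a list."""
    return [transform(r) for r in extracted]


source_rows = [
    {"Client": "Acme", "Product": "17a", "Amount": 100.0,
     "Currency": "EUR", "Region": "N.E."},
    {"Customer": "Birdland", "Product": "P0042", "Amount": 50.0,
     "Currency": "USD", "Region": "NE"},
]
warehouse_rows = etl(source_rows)
```

After the run, both records use the same field name, product numbering, currency, and region code, which is the consistency the data warehouse requires.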
7
Three-Dimensional View of Data: Cube
Created in a data warehouse.
[Figure: sales cube with three dimensions: Sales Date (months in year: 1, 2, 3), Product (P1, P2, P3, P4), and Sales at Location (L1, L2, L3).]
Logic is similar to a crosstab query and a pivot table.
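One way to picture the cube is as a table of cells keyed by (month, product, location). The sketch below assumes hypothetical sales amounts; only the dimension labels (months 1 to 3, P1 to P4, L1 to L3) come from the slide.

```python
from collections import defaultdict

# Hypothetical sales facts: (month, product, location, amount).
sales = [
    (1, "P1", "L1", 100),
    (1, "P2", "L1", 40),
    (2, "P1", "L2", 70),
    (2, "P1", "L1", 30),
    (3, "P3", "L3", 90),
]

# Build the cube: one cell per (month, product, location) combination.
cube = defaultdict(int)
for month, product, location, amount in sales:
    cube[(month, product, location)] += amount


def total(month=None, product=None, location=None):
    """Sum cells, fixing any dimension that is given and collapsing the rest."""
    return sum(
        amount
        for (m, p, l), amount in cube.items()
        if month in (None, m) and product in (None, p) and location in (None, l)
    )


p1_all = total(product="P1")                    # one dimension fixed
p1_at_l1 = total(product="P1", location="L1")   # two dimensions fixed
grand_total = total()                           # all dimensions collapsed
```

Fixing one or two dimensions and summing over the rest is the same operation a crosstab query or pivot table performs.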
8
Data Hierarchy
Levels: Year, Quarter, Month, Week, Day.
Roll-up: to get higher-level totals.
Drill-down: to get lower-level details.
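Roll-up and drill-down over a date hierarchy can be sketched as follows. The daily figures are hypothetical, and the week level is left out to keep the example short.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales (the lowest level of the hierarchy).
daily_sales = {
    date(2019, 1, 15): 100,
    date(2019, 2, 10): 50,
    date(2019, 4, 2): 70,
    date(2019, 4, 3): 30,
}


def quarter_of(day):
    """Calendar quarter (1 to 4) of a date."""
    return (day.month - 1) // 3 + 1


def roll_up(level):
    """Aggregate daily values to 'month', 'quarter', or 'year' totals."""
    key_for = {
        "month": lambda d: (d.year, d.month),
        "quarter": lambda d: (d.year, quarter_of(d)),
        "year": lambda d: (d.year,),
    }[level]
    totals = defaultdict(int)
    for day, amount in daily_sales.items():
        totals[key_for(day)] += amount
    return dict(totals)


def drill_down(year, quarter):
    """Return the daily details behind one quarterly total."""
    return {d: v for d, v in daily_sales.items()
            if d.year == year and quarter_of(d) == quarter}


by_quarter = roll_up("quarter")
by_year = roll_up("year")
q2_details = drill_down(2019, 2)
```

Each roll-up level is computed from the same lowest-level rows, which is why the finest time column is the basis for every higher-level total.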
9
Data Warehouse Tables: Star Design
Design is:
- Hierarchical (dimension tables have no direct association)
- De-normalized (fact table): Price and Quantity are inputted into the fact table; most dimensions are replicated in the fact table.
Dimension tables:
- Sale (SaleDate, Quantity, Discount, StoreID)
- Product (ProductID, Price)
- Customer (CustomerID, Detail)
- Location (LocationID, Detail)
Fact table*: inputted from Product, Sale, and Customer.
Calculated fact: Revenue = Price * Quantity per customer, product, and period.
Revenue is broken down by product, sales location, and desired time period (time column/s: day of year, or even smaller; the basis for roll-up).
New keys, often combined, are usually used in the fact table (e.g., SaleTbl#-Row#).
*Software vendors may have their own terminology; e.g., in SAP systems the facts may sometimes be called "key figures" or "measures". The term "measure" signifies the measurability of attributes, a quantitative data type.
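A star design can be tried out with Python's built-in sqlite3 module. The sketch below is a simplification under stated assumptions: the table name SalesFact, the sample rows, and the choice to fold the Sale attributes directly into the fact table are illustrative, not the slide's exact schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product  (ProductID INTEGER PRIMARY KEY, Price REAL);
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Detail TEXT);
CREATE TABLE Location (LocationID INTEGER PRIMARY KEY, Detail TEXT);
-- Fact table (hypothetical name): keys to each dimension, the inputted
-- Price and Quantity, and the calculated fact Revenue = Price * Quantity.
CREATE TABLE SalesFact (
    ProductID  INTEGER REFERENCES Product(ProductID),
    CustomerID INTEGER REFERENCES Customer(CustomerID),
    LocationID INTEGER REFERENCES Location(LocationID),
    SaleDate   TEXT,
    Price      REAL,
    Quantity   INTEGER,
    Revenue    REAL
);
INSERT INTO Product  VALUES (1, 10.0), (2, 25.0);
INSERT INTO Customer VALUES (1, 'hypothetical customer A'), (2, 'hypothetical customer B');
INSERT INTO Location VALUES (1, 'hypothetical store L1'), (2, 'hypothetical store L2');
""")

# Hypothetical sales: (ProductID, CustomerID, LocationID, SaleDate, Quantity).
sales = [
    (1, 1, 1, "2019-01-15", 3),
    (2, 1, 1, "2019-01-20", 1),
    (1, 2, 2, "2019-02-05", 2),
]
for product_id, customer_id, location_id, sale_date, quantity in sales:
    # Price is copied from the dimension into the fact table (de-normalized).
    (price,) = conn.execute(
        "SELECT Price FROM Product WHERE ProductID = ?", (product_id,)
    ).fetchone()
    conn.execute(
        "INSERT INTO SalesFact VALUES (?, ?, ?, ?, ?, ?, ?)",
        (product_id, customer_id, location_id, sale_date,
         price, quantity, price * quantity),
    )

# Revenue broken down by sales location and month.
revenue_by_location_month = conn.execute("""
    SELECT LocationID, substr(SaleDate, 1, 7) AS SaleMonth, SUM(Revenue)
    FROM SalesFact
    GROUP BY LocationID, SaleMonth
    ORDER BY LocationID, SaleMonth
""").fetchall()
```

Because Price and Revenue are stored in the fact table, the breakdown query needs no joins; that is the trade-off de-normalization makes.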
10
Data Warehouse for Tyson Foods
Dimension tables (truncated) provide inputs for "facts" (calculated attributes) in the fact table.
11
Data Warehouse Tables: Snowflake Design
Design is:
- Network-like (dimension tables can connect directly)
- Still partly normalized (Sale-Customer-City)
Dimension tables:
- Product (ItemID, Description, Price, Category)
- Sale (SaleID, SaleDate, CustomerID, Discount, SalesTax)
- Customer (CustomerID, Phone, FirstName, LastName, Address, ZipCode, CityID)
- City (CityID, ZipCode, City, State)
Fact table:
- OLAPItems (MerchTblRow, SaleTblRow, Price, Quantity)
Advantage: the design of the fact table is simpler (Customer and City are kept out of it), and processing is faster with a reduced DW schema.
12
SAP Data Warehouse
[Figure: SAP data warehouse cube details; some of the items shown can also be dimensions.]
13
Multidimensional View of Data – Precursors to DW: Excel Pivot Table
Dimensions: Last Name, ID, Quarter, Month.
Facts (key figures, measures).
Data can be placed in rows or columns.
By grouping months, one can instantly get quarterly or monthly totals.
14
Microsoft Platform On Azure Cloud
15
SQL 99: Multidimensional Data Views
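SELECT Category, Month, Sum,
       GROUPING (Category) AS Gc, GROUPING (Month) AS Gm
FROM …
GROUP BY CUBE (Category, Month...)

Result columns: Category, Month, Sum, Gc, Gm.
Detail rows show a value in both Category and Month (Bird, Cat, …).
Subtotal rows show (null) in the column that was aggregated away, and the grand-total row shows (null) in both.
Gc and Gm are 1 where the (null) marks a total and 0 otherwise.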
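To make the CUBE result concrete, here is a plain-Python emulation of GROUP BY CUBE with GROUPING() flags. It is a sketch of what the clause computes, not how a DBMS implements it, and the amounts are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical detail rows: (Category, Month, Amount).
rows = [("Bird", 1, 100), ("Bird", 2, 50), ("Cat", 1, 70), ("Cat", 2, 30)]
dimensions = ("Category", "Month")


def group_by_cube(rows):
    """Emulate GROUP BY CUBE (Category, Month) with GROUPING() flags."""
    result = []
    positions = range(len(dimensions))
    # CUBE aggregates over every subset of the grouping columns.
    for size in range(len(dimensions), -1, -1):
        for kept in combinations(positions, size):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[i] if i in kept else None for i in positions)
                totals[key] += row[-1]
            # GROUPING(col) is 1 when col was aggregated away, else 0.
            flags = tuple(0 if i in kept else 1 for i in positions)
            for key, amount in totals.items():
                result.append(key + (amount,) + flags)
    return result


# Each row: (Category, Month, Sum, Gc, Gm); None plays the role of (null).
cube_rows = group_by_cube(rows)
```

With two grouping columns the emulation produces four groupings: the detail rows, totals per Category, totals per Month, and the grand total.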
16
SQL GROUPING SETS - Hiding Details
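SELECT Category, Month, Sum
FROM …
GROUP BY GROUPING SETS ( ROLLUP (Category), ROLLUP (Month), ( ) )

Result columns: Category, Month, Amount.
Only totals are returned, with no detail rows: totals per Category (Month is (null)), totals per Month (Category is (null)), and the grand total (both are (null)).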
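The idea of grouping sets can be sketched in plain Python. This simplified version assumes three explicit sets, (Category), (Month), and the empty set, rather than the nested ROLLUP form in the query above, and it uses hypothetical amounts.

```python
from collections import defaultdict

# Hypothetical detail rows: (Category, Month, Amount).
rows = [("Bird", 1, 100), ("Bird", 2, 50), ("Cat", 1, 70), ("Cat", 2, 30)]
columns = ("Category", "Month")


def group_by_grouping_sets(rows, grouping_sets):
    """Aggregate once per grouping set; columns outside the set become None."""
    result = []
    for grouping_set in grouping_sets:
        totals = defaultdict(int)
        for row in rows:
            key = tuple(row[i] if name in grouping_set else None
                        for i, name in enumerate(columns))
            totals[key] += row[-1]
        result.extend(key + (amount,) for key, amount in totals.items())
    return result


# Each row: (Category, Month, Amount); None plays the role of (null).
summary = group_by_grouping_sets(rows, [("Category",), ("Month",), ()])
```

No grouping set contains both columns, so the detail rows never appear; that is how the details are hidden.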
17
SQL: RANK Functions
Calculates rank orders; useful for analysis.

SELECT Employee, SalesValue,
       RANK() OVER (ORDER BY SalesValue DESC) AS rank,
       DENSE_RANK() OVER (ORDER BY SalesValue DESC) AS dense
FROM Sales
ORDER BY SalesValue DESC, Employee;

Employee  SalesValue  rank  dense
Jones     18,000      1     1
Blau      16,000      2     2
Smith     16,000      2     2
Whitt     14,000      4     3

DENSE_RANK does not skip numbers: after the tie at 2, RANK continues at 4 while DENSE_RANK continues at 3.

Another example: find the rank of a particular salesperson:

SELECT RANK('Jones') WITHIN GROUP (ORDER BY Employee)
FROM Sales;

Such advances in SQL motivate DBMS vendors to support OLAP and data warehousing.
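The ranking query can be run with Python's built-in sqlite3 module, assuming the bundled SQLite is version 3.25 or later (the first release with window functions). Whitt's sales value of 14,000 is an assumption, and the aliases sales_rank and dense_sales_rank are used instead of rank and dense because RANK is a reserved word in some database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (Employee TEXT, SalesValue INTEGER)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?)",
    # Whitt's value is assumed for illustration.
    [("Jones", 18000), ("Smith", 16000), ("Blau", 16000), ("Whitt", 14000)],
)

ranked = conn.execute("""
    SELECT Employee, SalesValue,
           RANK()       OVER (ORDER BY SalesValue DESC) AS sales_rank,
           DENSE_RANK() OVER (ORDER BY SalesValue DESC) AS dense_sales_rank
    FROM Sales
    ORDER BY SalesValue DESC, Employee
""").fetchall()

for employee, sales_value, sales_rank, dense_sales_rank in ranked:
    print(f"{employee:<6} {sales_value:>6} {sales_rank} {dense_sales_rank}")
```

Blau and Smith tie at rank 2; the next RANK value is 4, while DENSE_RANK continues with 3.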
18
Data Mining
Goal: To discover unknown relationships in the data that can be used to make better decisions.
- Exploratory analysis.
- A bottom-up approach that scans the data to find relationships.
- Uses some statistical routines, but they are not sufficient: statistics relies on averages, and sometimes the important data lies in more detailed pairs.
- Supervised by developer vs. unsupervised (self-organizing artificial neural networks).
20
Common Techniques for Data Mining
1. Classification/Prediction
2. Association Rules/Market Basket Analysis
3. Clustering
Some are based heavily on classical statistics (1), others use specialized mining software (2), and yet others combine statistics with specialized mining software (3). Currently, these techniques are considered part of data analytics.
21
1. Classification (Prediction)
Purpose: "Classify" things that are causes and those that are effects.
Examples:
- Which borrowers/loans are most likely to be successful?
- Which customers are most likely to want a new item?
- Which companies are likely to file bankruptcy?
- Which workers are likely to quit in the next six months?
- Which startup companies are likely to succeed?
- Which tax returns are fraudulent?
22
Classification Process
- Clearly identify the outcome/dependent variable.
- Identify potential variables that might affect the outcome.
- Use sample data to test and validate the model.
- Run the data through the model.
Techniques: regression/correlation analysis, decision trees and tables (below).
Sample data (columns: Income, Credit History, Job Stability, Credit Success): 50000, Good, Yes; 75000, Mixed, Bad, No.
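The process above can be sketched with a small decision table in Python. All records are hypothetical, and a majority-vote lookup table stands in for the regression or decision-tree techniques a real project would use.

```python
from collections import Counter, defaultdict

# Hypothetical loan records: (credit_history, job_stability, credit_success).
# The last field is the outcome (dependent variable); the others are predictors.
training = [
    ("Good", "Good", "Yes"),
    ("Good", "Good", "Yes"),
    ("Good", "Bad", "Yes"),
    ("Good", "Bad", "No"),
    ("Good", "Bad", "Yes"),
    ("Mixed", "Good", "Yes"),
    ("Mixed", "Bad", "No"),
    ("Mixed", "Bad", "No"),
]
validation = [
    ("Good", "Good", "Yes"),
    ("Mixed", "Bad", "No"),
    ("Mixed", "Good", "No"),
]


def build_decision_table(records):
    """Map each combination of predictor values to its most common outcome."""
    outcomes = defaultdict(Counter)
    for *predictors, outcome in records:
        outcomes[tuple(predictors)][outcome] += 1
    return {combo: counts.most_common(1)[0][0]
            for combo, counts in outcomes.items()}


def classify(table, predictors, default="No"):
    """Predict an outcome; fall back to a default for unseen combinations."""
    return table.get(tuple(predictors), default)


table = build_decision_table(training)

# Validate the model on sample data that was not used to build it.
correct = sum(classify(table, rec[:-1]) == rec[-1] for rec in validation)
accuracy = correct / len(validation)
```

Keeping a separate validation sample is what lets the model be tested before new data is run through it.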
23
2. Association/Market Basket
Purpose: Determine what events or items go together/co-occur.
Example: What items are customers likely to buy together? (Business use: consider putting the two together to increase cross-selling.)
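Co-occurrence is usually quantified with support and confidence. The sketch below computes both over a handful of hypothetical baskets; the items and counts are made up for illustration.

```python
from itertools import combinations

# Hypothetical market baskets.
baskets = [
    {"burger", "fries", "soda"},
    {"burger", "fries"},
    {"burger", "soda"},
    {"fries", "soda"},
    {"burger", "fries", "salad"},
]


def support(items):
    """Fraction of baskets that contain every item in `items`."""
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent):
    """Estimated share of baskets with the antecedent that also hold the consequent."""
    return support(antecedent | consequent) / support(antecedent)


# Support of every item pair, highest first.
all_items = sorted(set().union(*baskets))
pair_support = sorted(
    ((support({a, b}), a, b) for a, b in combinations(all_items, 2)),
    reverse=True,
)
burger_to_fries = confidence({"burger"}, {"fries"})
```

Here burger and fries have the highest pair support, so they would be the first candidates for cross-selling.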
24
Association Challenges
- If an item is rarely purchased, any other item bought with it seems important. So combine items into categories.
- Some relationships are obvious: burger and fries.
- Some relationships are puzzling, meaningless, or misleading. A hardware store found that toilet rings sell well only when a new store first opens. But what does it mean?
Caution applies to data analytics: mere relationships without background knowledge can be misleading.
25
3. Cluster Analysis
Purpose: Determine groups of people or other entities.
Examples:
- Are there groups of customers? (If so, we could target them; market segmentation.)
- Do the locations for our stores have elements in common? (If so, we can search for similar clusters for new locations.)
- Do employees have common characteristics? (If so, we can hire similar, or dissimilar, people.)
Good clusters have a large inter-cluster distance and a small intra-cluster distance.
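A minimal k-means sketch illustrates the two distances. The customer data and the hand-picked starting centroids are assumptions; real tools choose starting points and the number of clusters more carefully.

```python
from math import dist

# Hypothetical customers: (visits per month, average spend).
points = [(1, 10), (2, 12), (1, 11), (9, 50), (10, 52), (8, 49)]


def mean(cluster):
    """Centroid of a non-empty list of 2-D points."""
    return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))


def k_means(points, centroids, iterations=10):
    """Plain k-means: assign to nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return clusters, centroids


# Starting centroids are picked by hand for this illustration.
clusters, centroids = k_means(points, centroids=[points[0], points[3]])

# Largest distance from a point to its own centroid (intra-cluster).
intra = max(dist(p, centroids[i]) for i, c in enumerate(clusters) for p in c)
# Distance between the two centroids (inter-cluster).
inter = dist(centroids[0], centroids[1])
```

With these points the inter-cluster distance is far larger than the intra-cluster distance, which indicates two well-separated customer groups.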
26
Summary: From Data Warehousing/Mining to Data Analytics
EXPLANATORY ANALYTICS PREDICTIVE ANALYTICS