Introduction to Data Warehousing. From DBMS to Decision Support DBMSs widely used to maintain transactional data Attempts to use of these data for analysis,

1 Introduction to Data Warehousing

2 From DBMS to Decision Support
– DBMSs are widely used to maintain transactional data
– Attempts to use these data for analysis, exploration, identification of trends, etc. have led to Decision Support Systems
– Rapid growth since the mid-1970s
– DBMS vendors have answered this trend by adding new features to existing products
– Rarely enough

3 DBs for Decision Support
– Trend towards Data Warehousing
– Data Warehousing: consolidation of data from several databases, which are in turn maintained by individual business units, along with historical and summary information

4 Characteristics of TPSs
Characteristic                      OLTP
Typical operation                   Update
Level of analytical requirements    Low
Screens                             Unchanging
Amount of data per transaction      Small
Data level                          Detailed
Age of data                         Current
Orientation                         Records

5 TPS vs Decision Support
Complex Analysis
– Historical information to analyze
– Data needs to be integrated
– Database design: Denormalized, star schema
OLTP
– Information to support day-to-day service
– Data stored at transaction level
– Database design: Normalized

6 MIS and Decision Support
– MIS systems provided business data
– Reports were developed on request
– Reports provided little analysis capability
– No personal ad hoc access to data
Diagram labels: Production platforms, Operational reports, Decision makers, Ad hoc access

7 Analyzing Data from Operational Systems
– Data structures are complex
– Systems are designed for high performance and throughput
– Data is not meaningfully represented
– Data is dispersed
– TPS systems are unsuitable for intensive queries
Diagram labels: Production platforms, ERP, Operational reports

8 Data Extract Processing
– End user computing offloaded from the operational environment
– User’s own data
Diagram labels: Operational systems, Extracts, Decision makers

9 Management Issues
– Extract explosion
– Duplicated effort
– Multiple technologies
– Obsolete reports
– No metadata
Diagram labels: Operational systems, Extracts, Decision makers

10 Data Quality Issues
– No common time basis
– Different calculation algorithms
– Different levels of extraction
– Different levels of granularity
– Different data field names
– Different data field meanings
– Missing information
– No data correction rules
– No drill-down capability

11 From Extract to Warehouse DSS
– Controlled
– Reliable
– Quality information
– Single source of data
Diagram labels: Internal and external systems, Data warehouse, Decision makers

12 Data Warehousing Architecture
Diagram labels: External Data Sources, Operational Databases; Extract, Clean, Transform, Load, Refresh; Data Warehouse, Metadata repository; Serves; OLAP, Visualisation, Data Mining
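The extract, clean, transform and load steps named on this slide can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two source systems, their field names, and the sample rows are invented, and the refresh step is not shown. It also touches two of the data quality issues listed earlier (different field names, no correction rules) by mapping each source to one common layout and setting aside rows that cannot be repaired.

```python
from datetime import date

# Hypothetical extracts from two operational systems; the field names and
# values are invented for illustration only.
SALES_SYSTEM = [
    {"cust_id": "C1", "amount": "120.50", "sold_on": "2024-01-15"},
    {"cust_id": "C2", "amount": "n/a", "sold_on": "2024-01-16"},
]
BILLING_SYSTEM = [
    {"customer": "C1", "total": 80.0, "day": date(2024, 1, 20)},
]


def extract():
    """Pull raw rows from each source, tagging where each row came from."""
    for row in SALES_SYSTEM:
        yield "sales", row
    for row in BILLING_SYSTEM:
        yield "billing", row


def clean_and_transform(source, row):
    """Map source-specific fields to one common layout; None if unusable."""
    try:
        if source == "sales":
            return {
                "customer_id": row["cust_id"],
                "amount": float(row["amount"]),
                "day": date.fromisoformat(row["sold_on"]),
            }
        return {
            "customer_id": row["customer"],
            "amount": float(row["total"]),
            "day": row["day"],
        }
    except ValueError:
        return None  # e.g. a non-numeric amount or a malformed date


def load(warehouse, rejected):
    """Append clean records to the warehouse and keep rejects for review."""
    for source, row in extract():
        record = clean_and_transform(source, row)
        if record is None:
            rejected.append((source, row))
        else:
            warehouse.append(record)


warehouse, rejected = [], []
load(warehouse, rejected)
```

Keeping rejected rows in their own list, rather than dropping them silently, is one simple way to make missing or malformed source data visible to whoever maintains the warehouse.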

13 Business Motivators
– Provide superior services and products
– Know the business
– New products
– Invest in customers
– Retain customers
– Invest in technology
– Reinvent to face new challenges

14 Centralised data warehouse Federated data warehouse

15 Tiered data warehouse

16 Data Warehouses Vs Data Marts
Property               Data Warehouse        Data Mart
Scope                  Enterprise            Department
Subjects               Multiple              Single-subject
Data Source            Many                  Few
Size (typical)         100 GB to > 1 TB      < 100 GB
Implementation time    Months to years       Months

17 End-user Access Tools
High performance is achieved by pre-planning the requirements for joins, summations, and periodic reports by end-users.
There are five main groups of access tools:
– Data reporting and query tools
– Application development tools
– Executive information system (EIS) tools
– Online analytical processing (OLAP) tools
– Data mining tools

18 Data Usage - $1000 questions Need to complement RDBMS technology with a flexible, multidimensional view of data

19

20 The Functionality of OLAP
– Rotate and drill down
– Create and examine calculated data
– Determine comparative or relative differences
– Perform exception and trend analysis
– Perform advanced analytical functions
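A few of these operations can be sketched in plain Python under stated assumptions: the fact rows, the dimension names in `COLUMNS`, and the `aggregate` helper are all hypothetical and stand in for what an OLAP tool would do over a real warehouse. The sketch shows a summary by year, a drill-down into one year, a rotation that changes the leading axis, and one calculated comparative figure.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, quarter, region, product, revenue).
FACTS = [
    (2023, "Q1", "North", "Wine", 100.0),
    (2023, "Q1", "South", "Wine", 60.0),
    (2023, "Q2", "North", "Pasta", 40.0),
    (2024, "Q1", "North", "Wine", 150.0),
    (2024, "Q1", "South", "Pasta", 90.0),
]
COLUMNS = ("year", "quarter", "region", "product")


def aggregate(facts, dims, where=None):
    """Sum revenue grouped by the named dimensions, with equality filters."""
    where = where or {}
    totals = defaultdict(float)
    for *keys, revenue in facts:
        row = dict(zip(COLUMNS, keys))
        if all(row[d] == v for d, v in where.items()):
            totals[tuple(row[d] for d in dims)] += revenue
    return dict(totals)


# Summary level: one total per year.
by_year = aggregate(FACTS, ["year"])

# Drill down: break 2023 into quarter and region.
drill_2023 = aggregate(FACTS, ["quarter", "region"], where={"year": 2023})

# Rotate: view the same facts with product as the leading axis.
by_product_year = aggregate(FACTS, ["product", "year"])

# Calculated, comparative data: relative change between the two years.
growth = by_year[(2024,)] / by_year[(2023,)] - 1
```

A real OLAP engine would typically precompute or index such aggregates rather than rescan the facts on every request; the rescan here only keeps the sketch short.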

21 The star structure
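The slide's diagram is not preserved in the transcript, so the following is only a generic sketch of a star structure: one central fact table whose foreign keys point to surrounding dimension tables. The table names, columns, and rows are assumptions for illustration, built in an in-memory SQLite database through Python's standard `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star: fact_sales in the centre, three dimension tables around it.
conn.executescript("""
CREATE TABLE dim_time    (time_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE dim_store   (store_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (
    time_id    INTEGER REFERENCES dim_time(time_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    revenue    REAL
);
INSERT INTO dim_time    VALUES (1, 2024, 'Q1'), (2, 2024, 'Q2');
INSERT INTO dim_store   VALUES (1, 'Leeds'), (2, 'York');
INSERT INTO dim_product VALUES (1, 'Wine'), (2, 'Pasta');
INSERT INTO fact_sales  VALUES
    (1, 1, 1, 100.0), (1, 2, 1, 60.0), (2, 1, 2, 40.0), (2, 2, 1, 30.0);
""")

# A typical star query: join the fact table to the dimensions it needs,
# then group by descriptive attributes taken from those dimensions.
rows = conn.execute("""
    SELECT t.quarter, p.name, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_time    AS t ON t.time_id = f.time_id
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY t.quarter, p.name
    ORDER BY t.quarter, p.name
""").fetchall()
```

Each analytical question becomes a join from the fact table outward to a few dimensions, which is why this denormalized layout suits decision support better than a fully normalized transactional design.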

22 Multidimensional Database Model
The data is found at the intersection of dimensions.
Diagram labels: Store, Time, FINANCE; Store, Product, Time, SALES, Customer
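The idea that data sits at the intersection of dimensions can be sketched as a mapping from coordinate tuples to values. This is a toy model, not how a multidimensional database stores its cubes; the sample values and the `cell` and `slice_cube` helpers are hypothetical, and the cube uses only the Store, Product and Time labels from the slide (Customer is left out for brevity).

```python
# Order of the coordinates in every key of the cube.
DIMENSIONS = ("store", "product", "time")

# Hypothetical SALES cube: each cell is addressed by one value per dimension.
SALES = {
    ("Leeds", "Wine", "2024-Q1"): 100.0,
    ("York", "Wine", "2024-Q1"): 60.0,
    ("Leeds", "Pasta", "2024-Q2"): 40.0,
}


def cell(cube, **coords):
    """Return the value at the intersection of the given coordinates, or None."""
    key = tuple(coords[d] for d in DIMENSIONS)
    return cube.get(key)


def slice_cube(cube, dim, value):
    """Fix one dimension to a value, yielding a cube with one dimension fewer."""
    i = DIMENSIONS.index(dim)
    return {k[:i] + k[i + 1:]: v for k, v in cube.items() if k[i] == value}


leeds_wine_q1 = cell(SALES, store="Leeds", product="Wine", time="2024-Q1")
q1_only = slice_cube(SALES, "time", "2024-Q1")
```

Here `q1_only` is keyed by (store, product) alone, which mirrors how fixing one dimension of a cube leaves a lower-dimensional view of the same data.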

23 Data Mining

24 Data mining functions
Associations
– 85 percent of customers who buy a certain brand of wine also buy a certain type of pasta
Sequential patterns
– 32 percent of female customers who order a red jacket buy a gray skirt within six months
Classifying
– Frequent customers are those with incomes about $50,000 and having two or more children
Clustering
– Market segmentation
Predicting
– Predict the revenue value of a new customer based on that person's demographic variables
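A statement such as "85 percent of customers who buy a certain brand of wine also buy a certain type of pasta" is the confidence of an association rule. The sketch below shows how that percentage and the related support measure are computed; the baskets are invented, so the resulting figures are illustrative and unrelated to the slide's 85 percent.

```python
# Hypothetical purchase baskets, one set of items per customer.
BASKETS = [
    {"wine", "pasta"},
    {"wine", "pasta", "bread"},
    {"wine", "cheese"},
    {"pasta"},
    {"wine", "pasta"},
]


def count(baskets, items):
    """Number of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for basket in baskets if items <= basket)


def support(baskets, items):
    """Fraction of all baskets that contain every item in `items`."""
    return count(baskets, items) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Of the baskets containing the antecedent, the fraction that also
    contain the consequent."""
    both = set(antecedent) | set(consequent)
    return count(baskets, both) / count(baskets, antecedent)


# Rule "wine -> pasta": 3 of the 4 baskets with wine also contain pasta.
wine_to_pasta = confidence(BASKETS, {"wine"}, {"pasta"})
wine_support = support(BASKETS, {"wine"})
```

Practical association mining also has to search over many candidate item sets efficiently; this sketch only evaluates one rule that is already given.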
