The Need for Data Analysis 2 Managers track daily transactions to evaluate how the business is performing Strategies should be developed to meet organizational.

2 The Need for Data Analysis
- Managers track daily transactions to evaluate how the business is performing
- Strategies should be developed to meet organizational goals using operational databases
- Data analysis provides information about short-term tactical evaluations and strategies

3 Business Intelligence
- Comprehensive, cohesive, integrated tools and processes
- Capture, collect, integrate, store, and analyze data
- Generate information to support business decision making
- Framework that allows a business to transform:
  - Data into information
  - Information into knowledge
  - Knowledge into wisdom

5 Decision Support Data
- BI effectiveness depends on the quality of data gathered at the operational level
- Operational data is seldom well suited to decision support tasks
- Operational data must be reformatted to be useful for business intelligence

6 Operational Data vs. Decision Support Data
- Operational data
  - Mostly stored in relational databases
  - Optimized to support transactions representing daily operations
- Decision support data differs from operational data in three main areas:
  - Time span
  - Granularity
  - Dimensionality

7 Decision Support Database Requirements
- Specialized DBMS tailored to provide fast answers to complex queries
- Four main requirements:
  - Database schema
  - Data extraction and loading
  - End-user analytical interface
  - Database size

8 Decision Support Database Requirements (cont’d.)
- Database schema
  - Complex data representations
  - Aggregated and summarized data
  - Queries extract multidimensional time slices
- Data extraction and filtering
  - Supports different data sources
    - Flat files
    - Hierarchical, network, and relational databases
    - Multiple vendors
  - Checking for inconsistent data
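The extraction and filtering step can be illustrated with a minimal Python sketch. It assumes a small flat file held in memory; the column names (`store_id`, `sku`, `units_sold`) and the two consistency rules are hypothetical and chosen only for illustration, not taken from the slides.

```python
import csv
import io

# Hypothetical flat-file extract; the column names are assumptions.
FLAT_FILE = """store_id,sku,units_sold
S1,A100,3
S2,A100,-1
,B200,5
S1,B200,2
"""

def extract_and_filter(text):
    """Split rows from a flat file into clean rows and rejected rows."""
    clean, rejected = [], []
    for row in csv.DictReader(io.StringIO(text)):
        units = int(row["units_sold"])
        # Assumed consistency rules: a store must be named and
        # a quantity cannot be negative.
        if not row["store_id"] or units < 0:
            rejected.append(row)
        else:
            clean.append({"store_id": row["store_id"],
                          "sku": row["sku"],
                          "units_sold": units})
    return clean, rejected

clean_rows, rejected_rows = extract_and_filter(FLAT_FILE)
```

Rejected rows are kept rather than discarded so that they can be reviewed and corrected at the source.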

9 Decision Support Database Requirements (cont’d.)
- End-user analytical interface
  - One of the most critical DSS DBMS components
  - Permits the user to navigate through data to simplify and accelerate the decision-making process
- Database size
  - In 2005, Wal-Mart had 260 terabytes of data in its data warehouses
  - DBMS must support very large databases (VLDBs)

10 The Data Warehouse
- Subject-oriented, integrated, time-variant, and nonvolatile collection of data
- Provides support for decision making
- Usually a read-only database optimized for data analysis and query processing
- Requires time, money, and considerable managerial effort to create

11 Data Warehouse is subject-oriented

12 Data Warehouse is integrated
Data from multiple sources must be converted to a standard format before it populates a data warehouse. Items that need standardization include:
- Naming conventions
- Codes
- Data attributes
- Measurements
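As a rough sketch of what this standardization can look like, the Python example below maps records from two source systems onto one naming convention, one set of codes, and one unit of measurement. The source layouts, field names, and gender codes are all hypothetical.

```python
# Hypothetical records from two source systems with different conventions.
source_a = [{"cust_name": "Ann", "gender": "F", "height_cm": 170.0}]
source_b = [{"CustomerName": "Bob", "sex": "male", "height_in": 70.0}]

# Assumed standard codes for the warehouse.
GENDER_CODES = {"f": "F", "female": "F", "m": "M", "male": "M"}
CM_PER_INCH = 2.54

def from_a(rec):
    """Source A already uses centimetres; only names and codes change."""
    return {"customer_name": rec["cust_name"],
            "gender": GENDER_CODES[rec["gender"].lower()],
            "height_cm": rec["height_cm"]}

def from_b(rec):
    """Source B uses different names, spelled-out codes, and inches."""
    return {"customer_name": rec["CustomerName"],
            "gender": GENDER_CODES[rec["sex"].lower()],
            "height_cm": round(rec["height_in"] * CM_PER_INCH, 1)}

warehouse_rows = [from_a(r) for r in source_a] + [from_b(r) for r in source_b]
```

After conversion, every row has the same attribute names, the same code set, and the same unit, regardless of the system it came from.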

13 Data Warehouse is integrated

14 Data Warehouse is integrated

16 Data Warehouse is time-variant
To discover trends in the business, analysts need large amounts of data, so historical data is kept in a data warehouse. For example, one can retrieve data from 3, 6, or 12 months ago, or even older data. This contrasts with a transaction system, where often only the most recent data is kept. For example, a transaction system may hold only the most recent address of a customer, whereas a data warehouse can hold all addresses associated with that customer.
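The customer-address contrast can be sketched in a few lines of Python. This is an illustrative model only: the names `operational`, `warehouse_history`, `change_address`, and `address_as_of` are hypothetical, and a real warehouse would keep the history in a table rather than a list.

```python
from datetime import date

# Transaction system: one current address per customer, overwritten on change.
operational = {}
# Warehouse: every address ever recorded, with the date it became effective.
warehouse_history = []

def change_address(customer_id, address, effective):
    """Overwrite the operational value and append to the warehouse history."""
    operational[customer_id] = address
    warehouse_history.append((customer_id, address, effective))

def address_as_of(customer_id, when):
    """Return the address in effect on the given date, or None if unknown."""
    candidates = [(eff, addr)
                  for cid, addr, eff in warehouse_history
                  if cid == customer_id and eff <= when]
    return max(candidates)[1] if candidates else None

change_address("C1", "12 Oak St", date(2022, 1, 10))
change_address("C1", "98 Elm Ave", date(2023, 6, 1))
```

Here `operational["C1"]` holds only the latest address, while `address_as_of("C1", date(2022, 12, 31))` can still recover the earlier one from the history.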

17 Data Warehouse is a nonvolatile collection of data
Data is stored in a data warehouse for read-only use and changes rarely, if at all. Once data is in the data warehouse, it should not change, so historical data in a data warehouse should never be altered.

18 Data Granularity in a Data Warehouse
Data is stored at different levels of detail in operational systems and in a data warehouse. Data is not normally stored in summarized form in an operational system, whereas a data warehouse is required to store data in summarized form.

19 Data Granularity in a Data Warehouse (cont’d.)
The more detail there is in the fact table, the higher its granularity, and vice versa.

20 Data Granularity in a Data Warehouse (cont’d.)
Example: Say we have a data mart with a single fact (Sales) and three dimensions (Time, Organization, and Product).
- The fact table contains three metrics (Unit Price, Units Sold, and Total Sale Amount).
- The Time dimension consists of four hierarchical elements (Year, Quarter, Month, and Day).
- The Organization dimension consists of three hierarchical elements (Region, District, and Store).
- The Product dimension consists of two hierarchical elements (Product Family and SKU).

21 Data Granularity in a Data Warehouse (cont’d.)
The metrics in the Sales fact table must be stored at some intersection of the dimensions (i.e., Time, Organization, and Product). Hence, in this data mart, the highest granularity at which we can store Sales metrics is Day/Store/SKU (i.e., the lowest level in each dimensional hierarchy). Conversely, the lowest granularity to which we can aggregate Sales metrics in this data mart is Year/Region/Product Family (i.e., the highest level in each dimensional hierarchy). We may also, for a variety of performance reasons, choose to store Sales metrics at some intermediate level of granularity (e.g., Month/District/SKU).
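The relationship between these granularity levels can be sketched as a roll-up over fact rows. The Python example below is a minimal illustration with made-up sample rows; the field names and the `roll_up` helper are hypothetical. It sums only the additive metrics (Units Sold and Total Sale Amount), since a unit price cannot meaningfully be added across rows.

```python
from collections import defaultdict

# Hypothetical fact rows at the finest grain: Day / Store / SKU.
sales = [
    {"year": 2024, "month": 1, "day": 5, "region": "East", "district": "D1",
     "store": "S1", "family": "Snacks", "sku": "SK1", "units_sold": 3, "total_sale": 6.0},
    {"year": 2024, "month": 1, "day": 9, "region": "East", "district": "D1",
     "store": "S2", "family": "Snacks", "sku": "SK1", "units_sold": 2, "total_sale": 4.0},
    {"year": 2024, "month": 2, "day": 1, "region": "East", "district": "D2",
     "store": "S3", "family": "Snacks", "sku": "SK2", "units_sold": 1, "total_sale": 5.0},
    {"year": 2024, "month": 2, "day": 1, "region": "West", "district": "D3",
     "store": "S4", "family": "Drinks", "sku": "SK3", "units_sold": 4, "total_sale": 8.0},
]

def roll_up(rows, keys):
    """Aggregate the additive metrics to the grain named by `keys`."""
    totals = defaultdict(lambda: {"units_sold": 0, "total_sale": 0.0})
    for row in rows:
        group = tuple(row[k] for k in keys)
        totals[group]["units_sold"] += row["units_sold"]
        totals[group]["total_sale"] += row["total_sale"]
    return dict(totals)

# Intermediate grain: Month / District / SKU.
by_month_district_sku = roll_up(sales, ["year", "month", "district", "sku"])
# Coarsest grain in this data mart: Year / Region / Product Family.
by_year_region_family = roll_up(sales, ["year", "region", "family"])
```

Rolling up is a one-way operation: the coarser tables can be derived from the Day/Store/SKU rows, but the daily detail cannot be recovered from the aggregates, which is why the choice of stored grain matters.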

22 Granularity levels in a data warehouse

23 The Data Warehouse (cont’d.)
- Data mart
  - Small, single-subject data warehouse subset
  - More manageable data set than a data warehouse
  - Provides decision support to a small group of people
  - Typically lower cost and shorter implementation time than a data warehouse
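The idea of a data mart as a single-subject subset can be shown with a very small Python sketch. The `subject` tag on each row and the `build_data_mart` helper are assumptions made for illustration; real warehouses usually organize subjects as separate tables or schemas.

```python
# Hypothetical warehouse rows, each tagged with its subject area.
warehouse = [
    {"subject": "sales", "store": "S1", "amount": 120.0},
    {"subject": "inventory", "store": "S1", "on_hand": 40},
    {"subject": "sales", "store": "S2", "amount": 75.5},
]

def build_data_mart(rows, subject):
    """Copy only the rows for one subject area into a separate, smaller data set."""
    return [dict(row) for row in rows if row["subject"] == subject]

sales_mart = build_data_mart(warehouse, "sales")
```

The mart is a copy, so the small group that uses it can query it without touching the full warehouse.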

24 Data Marts

25 Components of a Data Warehouse

26 Data Sources
Source data coming into the warehouse can be divided into four categories:
- Production data: data coming from operational databases.
- Internal data: data held in private files of employees and departments (not in an operational database).
- Archived data: data available in backups of operational databases.
- External data: data that is not stored by the organization but comes from external sources and is useful to the organization.

27 Example of Production Data
- Data related to doctors, patients, and treatments in a hospital system.
- This system will be an operational database or an online transaction processing system.
- Users enter information in this system on a regular basis.
- Data coming from this information system into the data warehouse is called production data.

28 Example of Internal Data
In a hospital, some data may be stored not in the operational database but in Excel sheets and Word files:
- Manual registration slips of patients, from when the operational database was not active.
- Standard operating procedure (SOP) documents that cannot be stored in the operational system.
- Notes taken by a doctor about his patients in a Word document.
- A list of patients who visited a doctor for a consultation but were not registered patients of the hospital.

29 Example of Archived Data
- Backups of databases are maintained on a regular basis.
- When the amount of data stored in an operational database increases, data is stored in backup files.
- Backup files are normally kept on off-line storage such as magnetic tape.
- For example, a backup of a hospital’s database is maintained on a regular basis.
- This archived data is useful to a data warehouse because it provides historical information.

30 Example of External Data
- A car rental company has a system to store data about the vehicles it provides for rent.
- The company needs to maintain information from different manufacturers about new car models.
- This information is external to the car rental company and is not part of its system.

