Presentation is loading. Please wait.

Presentation is loading. Please wait.

Data Warehouse & Business Intelligence

Similar presentations


Presentation on theme: "Data Warehouse & Business Intelligence"— Presentation transcript:

1 Data Warehouse & Business Intelligence
Concepts & Architecture Sanjeev

2 Data Warehouse & Business Intelligence
Topics To Be Discussed: Why Do We Need A Data Warehouse ? What Exactly Is A Data Warehouse ? Features Of Data Warehouse Sources Of Data Warehouse Data Warehouse Designs Data Warehouse – Data Usage Why There Is A Need of Business Intelligence ? DWH & BI - Architecture Case Study

3 Data Access Problem – Data In “Jail”
Data Warehouse Why Do We Need A Data Warehouse ? Data Access Problem – Data In “Jail” The single key to survival in the 1990s (and beyond) is - being able to analyze, plan and react to changing business conditions in a much more rapid fashion. To do this, top managers, analysts and knowledge workers in our enterprises need more and better information. Information technology itself has made possible the revolutions in the way that Organizations today operate throughout the world. More and more powerful computers on everyone‘s desks, and Communication networks that span the globe BUT STILL Executives and decision makers can't get their hands on critical information that already exists in the organization. Continued…

4 Data Access Problem – Data In “Jail”
Data Warehouse Why Do We Need A Data Warehouse ? Data Access Problem – Data In “Jail” Organizations- large and small, create billions of bytes of data about all aspects of their business, millions of individual facts about their customers, products, operations and people But for the most part, this data is locked up in a myriad of computer systems and is exceedingly difficult to get at. This phenomenon has been described as "data in jail". Only a small fraction of the data that is captured, processed and stored in the enterprise is actually available to executives and decision makers. Technologies for the manipulation and presentation of data have literally exploded. Large segments of the enterprise are still "data poor.“ Whatever is BETTER, FASTER and CHEAPER, is not FUNCTIONALLY COMPLETE. Continued…

5 Data Access Problem – Data In “Jail” Solution – A Data Warehouse
Why Do We Need A Data Warehouse ? Data Access Problem – Data In “Jail” Solution – A Data Warehouse A set of significant new concepts and tools have evolved providing all the key people in the enterprise with access to whatever level of information needed for the enterprise to survive and prosper in an increasingly competitive world. The term that has come to characterize this new technology is “Data Warehousing.” to provide an Organization flexible, effective and efficient means of getting at the sets of data that have come to represent one of the Organization‘s most critical and valuable assets. To make sure that the enterprise-wise information should be available for decision making purpose at all levels, at any point of time.

6 Data Warehouse What Exactly Is A Data Warehouse ?
A Data Warehouse is a special kind of database, which stores SUBJECT ORIENTED, INTEGRATED, TIME VARIANT, NON-VOLATILE collection of data in support of management’s decision making process. It is a structured repository of historic data of the Organization which support managerial decision making. It is developed in an evolutionary process by integrating data from non-integrated legacy systems. Many design elements that optimize transaction processing are inefficient (in several ways) in a data warehouse. Managerial access to data for decision making requires access mechanisms that would violate many principles of regular DB design, like Normalization, Security, Integrity, etc. Continued…

7 Data Warehouse - Features
Key Features of Data Warehouse : Subject Oriented Integrated Time Variant Non-volatile Data is integrated & loaded by Subject. In DWH, data is obtained from various sources and is kept in a consistent format. For example, in a operational system, multiple ID values are generated for the same order in Order, Account Receivable & Product databases. But, in DWH, one single ID value is used for 1 order. In DWH, every data component is stored for a designated time period (3/10/20 yrs) in comparison to current operational data. Data is stored at summary level, with no frequent data-updates as compare to transaction level data, which is updated frequently. Create Read ODS Insert Replace Cust Prod Update Delete Order 1999 1998 2000 Load Read DWH DWH Load Read

8 Data Warehouse - Sources
Sources of Data Warehouse Data DWH Archives Historic Data Current system of records Recent History Operational Transactions Future Data Source

9 Data Warehouse Designs
Various Data Warehouse Designs Virtual DWH Central DWH Distributed DWH Enterprise DWH End users are allowed to access operational databases directly. Provides flexibility as well as the minimum amount of redundant data that must be loaded and maintained. Put the largest unplanned query load on operational systems. Mostly used where undefined needs to access operational data from a relatively large class of end-users are very high and the likely frequency of requests is low. A single physical database that contains all of the data for a specific functional area, department, division, or enterprise, for a specific time period. Often selected where there is a common need for informational data The data stored in the DWH is accessible from one place and must be loaded and maintained on a regular basis. They are data warehouses in which the certain components of the data warehouse are distributed across a number of different physical databases. Increasingly, large Organizations are pushing decision-making down to lower and lower levels of the Organization and in turn pushing the data needed for decision making down (or out) to the LAN or local computer serving the local decision-maker. Distributed DWH usually involve the most redundant data and, as a consequence, most complex loading and updating processes. If an Organization has a single, all-encompassing database, EDWH is recommended. If the Organization has fragmented databases then an enterprise data mart must be constructed for the data warehouse, which will - reflect the data after transformations - assists with retrieval and end-user access.

10 Data Warehouse - Data Nature of Data Warehouse Data
Data in a DWH is always historic and is static in nature. It is used to look at the information over periods of time. It is usually built from the operational data available in the Organization. The data may not necessarily be from with-in the Organization. Appropriate Use of Data Warehouse Data Produce Reports For Long Term Trend analysis Produce Reports Aggregating Enterprise Data Produce Reports of Multiple Dimensions (Earned revenue by month by product by branch) Inappropriate Use of Data Warehouse Data Replace Operational Systems Replace Operational Systems’ Reports Analyze Current Operational Results

11 Data Warehouse & Business Intelligence
SQL Is Inadequate – Need of B.I. SQL is inadequate for analytical applications due to the following reasons: The conditions in WHERE clause often contains too many AND , OR conditions. OR conditions are poorly handled by most RDBMS. Statistical functions such as standard deviations are not supported by SQL. Aggregation over time is not supported. Users often need to pose related queries to get the desired results. There is no convenient way to express commonly occurring groups of queries. Most of the times, the SQL queries are not optimized and hence take lot of time to produce results. Many business operations are hard or impossible to express in SQL Comparisons (with aggregation) Multiple Aggregations Reporting features To overcome the above limitations, some business intelligence is required over the available data using a separate set of tools, which can help in doing all kind of required analysis and generating reports. Continued…

12 Data Warehouse & Business Intelligence
DWH & Business Intelligence To overcome the limitations of SQL, Business Intelligence is required for the following tasks: Data Integration – extracting the data from different heterogeneous sources and store it in a consistent format at one location (DWH). Data Transformation – transform the data in the required format before loading, so that data can be maintained in a consistent format. Data Marts, Multi-dimensional databases and cubes – creating data-marts, multi-dimensional databases and cubes, which will act as a source for various reports and trend-analysis. Data Access & Analysis – make the data available to the end-users in the form of reports and dash-boards and help the data analyst to do different kind of analysis to support the decision making at the senior management level.

13 Data Warehouse & BI Architecture
End-To-End Data Warehouse & BI Architecture Data Sources Extraction Staging Load Central DWH Data Marts Data Access & Analysis

14 Data Warehouse - Case Study
The Problem - Data in ERP “Jail” – Virginia Tech University Data structures difficult to understand and inefficient to access for analysis and reports Data values change so point-in-time data lost Growing backlog of report requests The Solution – A DWH for ERP Data – Virginia Tech University Initial charge – Build a data warehouse Initial vision – Create business view of administrative data for Virginia Tech USER DATA WAREHOUSE TRANSACTIONAL ERP SYSTEM Data Access Architecture

15 Data Warehouse - Case Study
The Solution – A DWH for ERP Data Laying The Foundation Planning Surveyed other institutions Did site visits and interviews Established scope Identified first subject area Drafted project plan Delivered management briefings Staff Education and Training Data Warehouse Institute Ralph Kimball Approach to design DWH Staffing DBA Data Administrator Data Warehouse Architects Training Coordinators Web Application Developers Other Resources Hardware Software

16 Data Warehouse - Case Study
The Solution – A DWH for ERP Data Building The Data Warehouse Strategy Build by subject area Develop iteratively Design for enterprise Design Star Schema Time Dimension Transaction Detail Surrogate Keys Conformed Dimensions Slowly Changing Dimensions The Design – Multi Dimensional

17 Data Warehouse - Case Study
The Solution – A DWH for ERP Data The Design – STAR Schema FACT TABLE DIMENSIONS DIMENSIONS

18 Data Warehouse - Case Study
The Solution – A DWH for ERP Data Confirmed Dimensions

19 Data Warehouse - Case Study
The Solution – A DWH for ERP Data Design - Slowly Changing Data (Dimensions) There are 3 ways to manage the change in the slowing changing dimension data: Overwrite changed attribute in the same record add new record for the new value use additional fields for old and new values in the same record Proper standards should be followed while designing the DWH Object names should be meaningful and standardized Indicators should be used to simplify the queries Descriptions should be provided along with each piece of code Data should be available with business descriptions to make it clear to the end-users Special Features External data may be included Derivations, calculations, aggregations and summary data should be included

20 Data Warehouse - Case Study
The Solution – A DWH for ERP Data Building The Data Warehouse Development Process Data Model Design (Erwin) Source-To-Target mapping Business Definitions ETL Development / Testing (Data Stage) Data Verification Process Control Checks Pilot User Training Data Access Strategy Stewardship same as ERP ERP security definitions leveraged Warehouse security built as part of ETL Training precedes access

21 Data Warehouse - Case Study
The Solution – A DWH for ERP Data Building The Data Warehouse

22 Data Warehouse - Case Study
The Solution – A DWH for ERP Data Query Example Create a report which will show employee id, name, current hire date, gender, ethnicity, rank and tenure of all full time minority faculties. Result - ERP Query select spriden_id, concat(spriden_last_name,concat(', ',concat(spriden_first_name,concat(' ', spriden_mi)))), to_char(pebempl_current_hire_date,'DD-MON-YYYY'), decode(spbpers_sex,'M','Male','F','Female'), stvethn_desc, ptrrank_desc, ptrtenr_desc from spriden, spbpers, pebempl, stvethn, perrank a, ptrrank, perappt c, ptrtenr where pebempl_empl_status = 'A’ and pebempl_ecls_code in ('2A','2B','2C','2F','2G','2H','2K','2L', '3A','3B','3C','3D','3H','3I','3J','3M') and pebempl_pidm = spbpers_pidm and (spbpers_sex = 'F' or spbpers_ethn_code != '1') and pebempl_pidm = spriden_pidm and spriden_change_ind is null and spbpers_ethn_code = stvethn_code and pebempl_pidm = a.perrank_pidm and a.perrank_action_date = (select MAX(perrank_action_date) from perrank b where b.perrank_pidm = a.perrank_pidm) and a.perrank_rank_code = ptrrank_code and pebempl_pidm = c.perappt_pidm and c.perappt_action_date = (select max(perappt_action_date) from perappt d where c.perappt_pidm = d.perappt_pidm) and perappt_tenure_code = ptrtenr_code

23 Data Warehouse - Case Study
The Solution – A DWH for ERP Data Query Example Create a report which will show employee id, name, current hire date, gender, ethnicity, rank and tenure of all full time minority faculties. Result - DWH Query select ssn_fin_num, current_full_name, salary_hire_date, gender_desc, ethnicity_desc, rank_desc, tenure_desc from employee where current_record_ind = 'Y’ and active_employee_ind = 'Y‘ and faculty_ind = 'Y’ and full_time_ind = 'Y’ and (gender_code = 'F' or ethnicity_code != '1')

24 Data Warehouse - Case Study
The Solution – A DWH for ERP Data DWH – Design Features DWH Metadata System has Business definitions maintained by Data Experts, which are stored with the data Data Architecture is structured for query Data can be accessed by various clients DWH is designed to include historic data DWH provides a stable business view of data

25 Data Warehouse & BI Architecture


Download ppt "Data Warehouse & Business Intelligence"

Similar presentations


Ads by Google