Basic Concepts of Datawarehousing An Overview Prasanth Gurram.


1 Basic Concepts of Datawarehousing An Overview Prasanth Gurram

2 How to answer these Business Queries?
- What is the sales distribution region wise?
- What is the Defaulter’s Profile?
- What are the slow movers in my product line?
- How did my revenue improve in the past 5 years?
- Which of my Sales Agents are doing better?
- Who are my profitable customers?
- Currency Risk, Interest Rate Risk, Liquidity Risk
- Strategic Planning / Budgeting
- Which channel costs me more and pays less?

3 Decision Support Systems (DSS)
- Enable users to get a “Business View” of the data
- Facilitate data-based decision making that drives and improves the business
- Discover “Hidden Trends”

Decision Support Systems (DSS) are interactive computer-based systems intended to help decision makers use data and models to identify and solve problems and to make decisions. The Data Warehouse is the foundation of the DSS process. It is a strategy and a process for staging corporate data.

4 Driving Forces for DSS
[Diagram labels: Customers, Reform, Technology, Business Speed. RESULT: COMPETITION]

5 Scenario without DSS
- In earlier days, tools and techniques were unavailable for acquiring data from various sources to answer business questions and make decisions
- More effort went into data formatting than into data analysis
- Static and inflexible report generation
- Time lag in accessing the information at a central place

6 OLTP v/s DSS Environment

OLTP Environment
- get data IN
- large volumes of simple transaction queries
- continuous data changes
- low processing time
- mode of processing
- transaction details
- data inconsistency
- mostly current data

DSS Environment
- get information OUT
- small number of diverse queries
- periodic updates only
- high processing time
- mode of discovery
- subject oriented, summaries
- data consistency
- historical data is relevant

7 OLTP v/s DSS Environment

OLTP Environment
- high concurrent usage
- highly normalized data structure
- static applications
- automates routines

DSS Environment
- low concurrent usage
- fewer tables, but more columns per table
- dynamic applications
- facilitates creativity
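
The contrast between the two environments can be illustrated with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the `sales` table, its columns, and the sample figures are assumptions made for illustration and do not come from the slides.

```python
import sqlite3

# In-memory database standing in for both workloads (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)

# OLTP style: many small writes, each touching one row ("get data IN").
rows = [
    ("North", 2022, 100.0),
    ("North", 2023, 150.0),
    ("South", 2022, 80.0),
    ("South", 2023, 120.0),
]
conn.executemany("INSERT INTO sales (region, year, amount) VALUES (?, ?, ?)", rows)

# OLTP style read: fetch the details of one transaction by its key.
one_sale = conn.execute(
    "SELECT region, amount FROM sales WHERE id = ?", (1,)
).fetchone()

# DSS style read: scan the history and summarize by subject ("get information OUT").
by_region = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
```

Here `one_sale` holds a single transaction, while `by_region` holds one summary row per region computed over all years.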

8 Benefits for Business User
- Flexible Information Access
- High Availability
- Ease of Use
- Quality & Completeness of Data
- Focus on Information Processing
- Information Base for Knowledge Discovery

9 Available line of technology
- Advances in DBMS technology
- Data warehousing
- On-line analytical processing
- Data mining

10 Datawarehouse
- Data warehouses store large volumes of data that are frequently used by DSS. A data warehouse is maintained separately from the organization’s operational databases.
- A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data.
  - Subject-oriented: Contains information regarding objects of interest for decision support: sales by region, by product, etc.
  - Integrated: Data are typically extracted from multiple, heterogeneous data sources (e.g., from sales, inventory, billing DBs etc.).
  - Time-variant: Contains historical data, over a longer horizon than the operational system.
  - Nonvolatile: Data is not (or rarely) directly updated.
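
The time-variant and nonvolatile properties can be sketched as an append-only history that is queried "as of" a date. This is a minimal illustration, not a prescribed design; the `CustomerSnapshot` and `CustomerHistory` names and the sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CustomerSnapshot:
    """One historical version of a customer record (hypothetical schema)."""
    customer_id: int
    city: str
    valid_from: date


class CustomerHistory:
    """Append-only store: rows are added, never updated in place (nonvolatile)."""

    def __init__(self):
        self._rows = []

    def load(self, snapshot):
        # New facts are appended; older versions stay available.
        self._rows.append(snapshot)

    def as_of(self, customer_id, when):
        # Time-variant: return the latest version valid on the given date.
        candidates = [
            r for r in self._rows
            if r.customer_id == customer_id and r.valid_from <= when
        ]
        return max(candidates, key=lambda r: r.valid_from, default=None)


history = CustomerHistory()
history.load(CustomerSnapshot(1, "Pune", date(2020, 1, 1)))
history.load(CustomerSnapshot(1, "Delhi", date(2022, 6, 1)))
```

An operational system would typically overwrite the city; here both versions remain, so a question about 2021 and a question about 2023 each get the answer that was true at the time.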

11 Datawarehouse
- Is the enabling technology that facilitates improved business decision-making
- It’s a process, not a product
- A technique for assembling and managing a wide variety of data from multiple operational systems for decision support and analytical processing

It’s a journey, not a destination...

12 DW Components
[Architecture diagram. Labels: Legacy System; feed systems FS1, FS2, ..., FSn; Extraction; Transmission (Network); Staging Area; Cleansing; Transformation; ODS; Aggregation; Summarization; DW; Data Mart Population; data marts DM1, DM2, ..., DMn; OLAP; Analysis; Knowledge Discovery; Metadata Layer]

13 Operational Process
- Data extraction
- Data Cleansing and Transformation
- Data Load and refresh
- Build derived data and views
- Service queries
- Administer the warehouse

14 Extraction Process (Data Capturing)
- Extract the incremental data from the feed system
- Store the extracted data in a temporary area
- Extract data from multiple, heterogeneous, and external sources

[Diagram labels: Data Capturing Process; Feed System Application; Business Transactions; Incremental Data; Control Metadata]
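
Incremental extraction can be sketched with a "last extracted" watermark kept in the control metadata. The slides do not say how increments are detected, so the watermark approach, the `updated_at` field, and the function name below are assumptions.

```python
from datetime import datetime


def extract_incremental(feed_rows, control_metadata):
    """Return only the rows changed since the last run and advance the watermark."""
    last_extracted = control_metadata["last_extracted_at"]
    increment = [row for row in feed_rows if row["updated_at"] > last_extracted]
    if increment:
        # Remember how far this run got, so the next run skips these rows.
        control_metadata["last_extracted_at"] = max(
            row["updated_at"] for row in increment
        )
    return increment


# Hypothetical feed-system rows and control metadata.
feed = [
    {"txn_id": 1, "updated_at": datetime(2024, 1, 1, 9, 0)},
    {"txn_id": 2, "updated_at": datetime(2024, 1, 2, 9, 0)},
    {"txn_id": 3, "updated_at": datetime(2024, 1, 3, 9, 0)},
]
control = {"last_extracted_at": datetime(2024, 1, 1, 12, 0)}

# The returned list plays the role of the temporary area on the feed side.
staging_area = extract_incremental(feed, control)
```

Running the function a second time against the same feed returns an empty list, because the watermark has already moved past every row.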

15 Extraction Process (Data Transmission)
- Transmit the extracted data from the Feed system to the Staging area
- Periodicity of transmission (daily / weekly) depends upon the feed system

[Diagram labels: Feed System Side; Incremental Data; FTP; Network Cloud; Staging area; Incremental Data]

16 Cleansing Process
- Detect errors in the data and rectify them when possible
- Mark each record Good/Bad
- Generate the cleansing reports and mail them to the DWA and Feed System representatives

[Diagram labels: Raw data (Staging Area); Process Metadata; Cleansing Rules; Control Metadata; Cleansing Process; Cleansing Reports; Good; Bad; Clean data]
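
The detect, rectify, and mark steps can be sketched as a rule-driven pass over staged rows. The rule names, field names, and the whitespace-trimming fix are illustrative assumptions; mailing the report is left out.

```python
def cleanse(raw_rows, rules):
    """Split staged rows into Good and Bad and count failures per rule."""
    good, bad = [], []
    report = {name: 0 for name, _ in rules}
    for raw in raw_rows:
        # Rectify what can be fixed safely: trim stray whitespace in text fields.
        row = {k: v.strip() if isinstance(v, str) else v for k, v in raw.items()}
        failed = [name for name, check in rules if not check(row)]
        for name in failed:
            report[name] += 1
        if failed:
            bad.append({**row, "status": "Bad", "failed_rules": failed})
        else:
            good.append({**row, "status": "Good"})
    return good, bad, report


# Hypothetical cleansing rules: (rule name, predicate over a row).
RULES = [
    ("customer_id present", lambda r: bool(r.get("customer_id"))),
    (
        "amount is non-negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]

staged = [
    {"customer_id": " C001 ", "amount": 250.0},
    {"customer_id": "", "amount": 40.0},
    {"customer_id": "C003", "amount": -5.0},
]
good, bad, report = cleanse(staged, RULES)
```

Keeping the rules as data rather than hard-coding them mirrors the slide's separation of Cleansing Rules from the Cleansing Process.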

17 Transformation Process
- Transform the cleaned Operational Data into DSS Data
- Load the DSS data into the ODS
- The ODS contains the current DSS data at the lowest level of granularity

[Diagram labels: Clean Operational Data; Transformation Process; Operational Data Store; Control Metadata; Process Metadata; Mapping Detail; Transformation Rule]
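
Mapping details and transformation rules can be sketched as a table that maps each source column to a target column plus a conversion. The column names and the rules in `MAPPING` are hypothetical examples, not taken from the presentation.

```python
from datetime import date

# Mapping detail: source column -> (target column, transformation rule).
MAPPING = {
    "cust_nm": ("customer_name", str.title),
    "amt": ("amount", float),
    "txn_dt": ("transaction_date", date.fromisoformat),
}


def transform(clean_row, mapping):
    """Rename columns and apply the rule attached to each mapping entry."""
    return {
        target: rule(clean_row[source])
        for source, (target, rule) in mapping.items()
    }


clean_row = {"cust_nm": "asha rao", "amt": "12.50", "txn_dt": "2024-03-01"}
ods_row = transform(clean_row, MAPPING)
```

The resulting `ods_row` keeps one record per transaction, which corresponds to the lowest level of granularity described for the ODS.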

18 Summarization Process
- Summarize and aggregate ODS data and populate the Warehouse
- Periodicity of the Summarization Process depends upon the level of summarization at the Warehouse (weekly, monthly, daily)

[Diagram labels: ODS; Summarization Process; Weekly; Monthly; Yearly; DW; Control Metadata]
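
Summarization at different levels can be sketched by grouping detailed ODS rows under a period key. The levels follow the Weekly / Monthly / Yearly labels of the diagram; the row layout and function names are assumptions.

```python
from collections import defaultdict
from datetime import date


def period_key(day, level):
    """Map a date to the period it belongs to at the requested level."""
    if level == "weekly":
        iso = day.isocalendar()
        return (iso[0], iso[1])  # (ISO year, ISO week number)
    if level == "monthly":
        return (day.year, day.month)
    if level == "yearly":
        return (day.year,)
    raise ValueError(f"unknown summarization level: {level}")


def summarize(ods_rows, level):
    """Aggregate detailed ODS amounts into one total per period."""
    totals = defaultdict(float)
    for row in ods_rows:
        totals[period_key(row["date"], level)] += row["amount"]
    return dict(totals)


# Hypothetical detailed rows from the ODS.
ods_rows = [
    {"date": date(2024, 1, 1), "amount": 10.0},
    {"date": date(2024, 1, 3), "amount": 5.0},
    {"date": date(2024, 2, 10), "amount": 7.0},
]
monthly = summarize(ods_rows, "monthly")
yearly = summarize(ods_rows, "yearly")
```

The same detailed rows feed every level, so a coarser summary can always be rebuilt from the ODS without touching the feed systems.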

19 Metadata
- Data about Data
- Used to maintain the Datawarehouse
- Control data
  - Static: roles, permissions, naming standards, source system names, locations, target names, transformation and mapping rules
  - Dynamic: scheduling, scripts, load statistics, space usage, backup statistics
- Business data
  - Business rules, who validates data, who controls, how they validate
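
The split between static control data, dynamic control data, and business data can be sketched with a few record types. The class names, fields, and sample values are hypothetical and cover only a subset of the items listed on the slide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StaticControlMetadata:
    """Rarely changing facts about sources, targets, and standards."""
    source_system: str
    target_name: str
    naming_standard: str


@dataclass
class DynamicControlMetadata:
    """Facts that change with every run, such as load statistics."""
    load_statistics: list = field(default_factory=list)

    def record_load(self, rows_loaded, rows_rejected):
        self.load_statistics.append(
            {"rows_loaded": rows_loaded, "rows_rejected": rows_rejected}
        )


@dataclass(frozen=True)
class BusinessMetadata:
    """Business rule plus who is responsible for validating it."""
    business_rule: str
    validated_by: str


static = StaticControlMetadata("billing_db", "dw.sales_fact", "snake_case")
dynamic = DynamicControlMetadata()
dynamic.record_load(rows_loaded=980, rows_rejected=20)
business = BusinessMetadata("amount must be non-negative", "finance team")
```

Marking the static and business records as frozen reflects that they change through administration, not as a side effect of a load, while the dynamic record grows with each run.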

20 DW Components/Tools
- Extraction/transformation/load tool (family of tools including data modeling tool, extraction tool, metadata repository, and DW administration tools)
- Metadata exchange architecture (API used to integrate all components of the DW with central metadata)
- Target databases (relational, multidimensional, hybrid)
- Data access and analysis tools for end users
- Database servers, operating systems, networks

21 DW Tools

