AN OVERVIEW OF DATA WAREHOUSING


DEFINITION OF DATA WAREHOUSING
A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management’s decisions. In simple terms, data warehousing is the process of combining all of the enterprise's data into one central repository that can be accessed by anyone from any department at any time.

WHAT YOU NEED TO KNOW ABOUT DATA WAREHOUSING TERMINOLOGY
- Data Warehouse: An enterprise-wide data store comprising data from all company information systems as well as data from external sources.
- Data Mart: A subset of a data warehouse that includes information from one department.
- Data Mining: A process used for decision support and for discovering information from deep within a database.
- Decision Support System (DSS): A set of programs based on statistical models or trend analysis that assists in management decisions.

- Executive Information System (EIS): A structured interface to predefined reports that provide highly summarized, top-level information about the business.
- On-Line Analytical Processing (OLAP): Database software tools that make use of an intermediary database to store summary information and pre-defined calculations.
- On-Line Transaction Processing (OLTP): A system that keeps track of an establishment’s daily transactions and updates the warehouse at periodic intervals.
- Data Cleansing: Ensures that the data extracted from the operational database contains valid information. The data can be evaluated on either a logical or a technical level.
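A minimal sketch of data cleansing at the two levels mentioned above. The slides do not define the levels, so the interpretation here is an assumption: "technical" means a field can be parsed at all, and "logical" means the parsed value makes business sense. The field names (`amount`, `order_date`) and the rules are hypothetical.

```python
from datetime import date


def technical_errors(row):
    """Technical level (assumed meaning): can each field be parsed at all?"""
    errors = []
    try:
        float(row.get("amount"))
    except (TypeError, ValueError):
        errors.append("amount is not numeric")
    try:
        date.fromisoformat(row.get("order_date"))
    except (TypeError, ValueError):
        errors.append("order_date is not an ISO date")
    return errors


def logical_errors(row, as_of):
    """Logical level (assumed meaning): do the parsed values make sense?"""
    errors = []
    if float(row["amount"]) < 0:
        errors.append("amount is negative")
    if date.fromisoformat(row["order_date"]) > as_of:
        errors.append("order_date is in the future")
    return errors


def cleanse(rows, as_of):
    """Split rows into (clean, rejected); rejected rows carry their errors."""
    clean, rejected = [], []
    for row in rows:
        # Logical checks only run on rows that passed the technical checks.
        errors = technical_errors(row) or logical_errors(row, as_of)
        if errors:
            rejected.append((row, errors))
        else:
            clean.append(row)
    return clean, rejected
```

Rejected rows are kept together with their reasons rather than silently dropped, so they can be reviewed or corrected at the source.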

THE NEED FOR DATA WAREHOUSING
- Company information cannot be adequately analyzed in the form in which it is currently stored. Data is spread over multiple tables, too detailed, difficult to find, not documented, not accessible, not in a proper format, or all of the above.
- The operational database needs to be optimized for data entry, since operational data is generated continuously.
- Simultaneous transaction processing and query processing can greatly slow down the system and give inconsistent results.
- Operational data processing differs significantly from decision support data processing.

WHAT IS DATA WAREHOUSING?
The main purpose of a data warehouse is to turn operational data into information. This leads to the basic functions of a data warehouse: extracting data from existing internal or external operational databases; cleaning, transforming, and auditing the data; and organizing it in an optimized format that allows fast, easy access to and analysis of the company's performance.
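The basic functions listed above can be sketched as a small extract, clean, transform, and load pipeline. This is only an illustration using Python's built-in `sqlite3` module. The table names (`orders`, `sales_fact`), the columns, and the cleaning rules are hypothetical, and the "audit" is reduced to a simple row-count comparison.

```python
import sqlite3


def extract(operational):
    """Extract: read the raw rows from the operational database."""
    return operational.execute(
        "SELECT order_id, customer, amount FROM orders"
    ).fetchall()


def clean(rows):
    """Clean: drop rows with a missing customer or a non-positive amount."""
    return [
        (order_id, customer, amount)
        for order_id, customer, amount in rows
        if customer and customer.strip() and amount is not None and amount > 0
    ]


def transform(rows):
    """Transform: bring customer names into one consistent format."""
    return [(oid, cust.strip().upper(), amt) for oid, cust, amt in rows]


def load(warehouse, rows):
    """Load: write the prepared rows into the warehouse in one transaction."""
    with warehouse:
        warehouse.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)


def run_etl(operational, warehouse):
    """Run all steps; return (rows extracted, rows now in the warehouse)."""
    extracted = extract(operational)
    load(warehouse, transform(clean(extracted)))
    in_warehouse = warehouse.execute(
        "SELECT COUNT(*) FROM sales_fact"
    ).fetchone()[0]
    return len(extracted), in_warehouse


def make_demo_databases():
    """Build two in-memory databases with a few hypothetical sample rows."""
    operational = sqlite3.connect(":memory:")
    operational.execute(
        "CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)"
    )
    operational.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, " alice ", 10.0), (2, None, 5.0), (3, "Bob", -1.0)],
    )
    operational.commit()
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE sales_fact (order_id INTEGER, customer TEXT, amount REAL)"
    )
    return operational, warehouse
```

Calling `run_etl(*make_demo_databases())` runs the whole pipeline once. Comparing the two returned counts shows how many rows the cleaning step removed.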

THE DATA WAREHOUSING PIECES
- Software to extract the information from the operational database and load it into the data warehouse.
- Software for removing inconsistencies from the data coming from various sources (data cleansing).
- Hardware and database software for the decision support database.
- Software for managing the metadata (stored database definition).
- Business intelligence software that presents the data in the warehouse to the users.
- Data mining software that looks for hidden patterns in the information.

COMPONENTS OF DATA WAREHOUSE
- Data Extract
- Data Cleansing
- Data Load
- Data Storage
- Warehouse Management
- Data Mining
- Data Visualization
- EIS

DIFFERENT SYSTEMS: DSS
Decision support systems assist decision makers as they access the information they need to improve their decision quality. Characteristics of a decision support system:
- it generally relies on databases that are different from an OLTP database,
- the number of transactions in a data warehouse database is usually one per day,
- this transaction (the load) stores all new, updated, and changed data in the data warehouse,
- if the load process fails, the entire database is restored rather than rolled back record by record,
- the data warehouse database represents a static picture between two load processes.
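The load behaviour described above (one batch load, with a full restore if it fails) can be sketched as follows. The warehouse is modelled as a plain Python dictionary to keep the example self-contained. The function name `nightly_load`, the record layout, and the validation rule are hypothetical. A real warehouse would restore from a database backup rather than an in-memory copy.

```python
import copy


def nightly_load(warehouse, batch):
    """Apply one batch load; restore the whole warehouse if any record fails.

    `warehouse` is a dict mapping a key to a record (hypothetical layout).
    `batch` is a list of (key, record) pairs holding new and changed data.
    Returns True if the load succeeded, False if the warehouse was restored.
    """
    # Take a snapshot of the complete warehouse before the load starts.
    snapshot = copy.deepcopy(warehouse)
    try:
        for key, record in batch:
            if "amount" not in record:
                raise ValueError(f"record {key!r} has no amount")
            warehouse[key] = record
    except Exception:
        # Restore everything from the snapshot instead of undoing
        # the already-applied records one by one.
        warehouse.clear()
        warehouse.update(snapshot)
        return False
    return True
```

Between two successful calls the dictionary does not change, which corresponds to the "static picture between two load processes" described above.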

DIFFERENT SYSTEMS: OLTP
OLTP systems are the applications used in day-to-day business, such as:
- order entry,
- inventory management,
- payroll,
- production tracking.
Another established term for these applications is OSS (Operational Support Systems). These systems handle millions of transactions per day and ensure data consistency between all of the related tables in the database. Each update of related data is committed to the database. Depending on the purpose of the OLTP system, the contents of the database tables change continuously during the business day.

DIFFERENT SYSTEMS: OLAP
Online Analytical Processing (OLAP) is a totally different way to look at data compared to the OLTP method. To analyze a company's performance, the view of the data has to be:
- more global,
- summarized,
- not focused on detailed pieces of information.
This way of looking at data is called OLAP. Users with OLAP requirements generally cannot be satisfied with a database architecture optimized for fast single-record retrieval. In their case, the database architecture has to be more open and optimized to create summarized data quickly and support the user's analytical requirements.
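To make the contrast concrete, the sketch below runs an OLTP-style single-record lookup and an OLAP-style summary against the same small table, using Python's built-in `sqlite3` module. The `orders` table and its sample rows are hypothetical.

```python
import sqlite3


def build_orders():
    """Create an in-memory table with a few hypothetical orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [
            (1, "East", 2023, 100.0),
            (2, "East", 2024, 150.0),
            (3, "West", 2023, 200.0),
            (4, "West", 2024, 50.0),
            (5, "East", 2024, 25.0),
        ],
    )
    conn.commit()
    return conn


def oltp_lookup(conn, order_id):
    """OLTP-style access: fetch one detailed record by its key."""
    return conn.execute(
        "SELECT order_id, region, year, amount FROM orders WHERE order_id = ?",
        (order_id,),
    ).fetchone()


def olap_summary(conn):
    """OLAP-style access: a summarized, global view across all records."""
    return conn.execute(
        "SELECT region, year, SUM(amount), COUNT(*) FROM orders "
        "GROUP BY region, year ORDER BY region, year"
    ).fetchall()
```

The lookup touches a single row through the primary key, while the summary has to read every row. This is why the two workloads favour differently optimized databases.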

DATA WAREHOUSE ARCHITECTURE
There are three key components of the data warehouse architecture:
1. The enterprise data store
2. Host data sources
3. Client access
“A typical enterprise data warehouse solution architecture is not just a collection of products, but rather an integrated whole consisting of products and an articulated set of interfaces between them, targeted at solving a particular set of problems.”

DATA WAREHOUSE PRODUCTS: SYBASE
Sybase has a complete end-to-end, data-mart-driven solution for enterprise, distributed decision support. The Sybase Data Warehouse consists of components including design and modeling, transformation and movement, as well as data storage and access tools optimized to extend enterprise decision support systems to the Internet and beyond. WarehouseArchitect supports all three DW layers in terms of data modeling, metadata, and data import, and interfaces with the various third-party tools that have an active part in the DW environment. WarehouseArchitect allows you to:
- Import source information from OLTP databases.
- Design DW and DM models.
- Generate and maintain DW and DM databases.
- Enable extraction command scripts to automate data transfer.
- Export/import multidimensional information to/from OLAP engines.
- Generate reports on design activities.

DATA WAREHOUSE PRODUCTS: ORACLE
Oracle Applications Data Warehouse is a powerful decision-support application that allows business managers to perform complex analysis of data extracted from Oracle's new client/server applications. Oracle provides a fully assembled decision-support product combining the strength of data management software with on-line analytical processing. Oracle Applications Data Warehouse provides analysis, query, and reporting tools, and data-mapping capabilities, that allow users to access data quickly and easily, with the power to "drill down" for more detail. The tools also enable users to perform sophisticated multi-dimensional analysis and modeling. Some key features of Oracle Applications Data Warehouse:
- Integrating data from other sources.
- Drilling down and slicing data into user-defined views.
- Analyzing data using dimensions, hierarchies, or views.
- Additional options including ranking, sorting, exception filtering, and color coding of data.

DATA WAREHOUSE PRODUCTS: IBM
IBM Visual Warehouse is the low-cost, end-to-end solution for data warehousing. With Visual Warehouse, you can extract a wide variety of heterogeneous data, transform the data using SQL, and store the data for use in decision support. IBM's datamart offering is the Visual Warehouse solution. Like datamarts, Visual Warehouse lets you:
- replicate data from a variety of operational data sources; aggregate, summarize, cleanse, or enhance the data; and make it available to end users through their favorite decision support tools.
- create business views, which model the structure and format of your data using an interface that people can understand and work with.
- experience IBM expertise and know-how. Our experienced consultants can help you with everything from the design and planning stages of your datamart to the implementation and maintenance of a production system.

DATA MODELING & ITS IMPORTANCE
Data modeling is the process of analyzing and representing the "things" (entities) about which an enterprise must know. Just as a complex building requires a guiding architectural plan, building a complex data infrastructure requires a guiding data model. When performed properly, data modeling is essential for effective management of the enterprise's information resources. It is required both for effective shared databases with integrated information systems and for effective data warehouses with comprehensive decision support systems.

DATA MODELS: OPERATIONAL vs DATA WAREHOUSE
The data warehouse model has distinct differences from its operational counterpart:
Operational Data Model | Data Warehouse Data Model
Data supports operational processes | Data supports tactical and strategic processes (decision support and executive information support)
Fully normalized for effective integrity management | De-normalized for efficient retrieval
Current data values | Historical data values
Minimal derived data | High degree of summarized data
Contains all operational data | Contains only data that has value over time

DIMENSIONAL MODEL
Dimensional modeling is a technique that helps the database designer build information structures that satisfy the way a business user asks for information. Its most important advantage is that it matches end users’ needs for simplicity. The purpose of dimensional modeling is to provide the data warehouse and managed-query tools with a database definition that lends itself to subject-oriented information processing. In this way, information can be re-arranged and presented to the end user in multiple views, from different perspectives. Another name for the dimensional model is the STAR JOIN SCHEMA. There is one dominant table (the fact table) in the center of the schema, with multiple joins connecting it to other tables (the dimension tables).
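A minimal sketch of a star join schema, built with Python's `sqlite3` module: one fact table in the center, joined to three dimension tables. All table names, column names, and sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE dim_date    (date_id    INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
-- The fact table sits in the center and references every dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    units      INTEGER,
    revenue    REAL
);
"""

# Whitelist of columns a caller may group by (never interpolate raw input).
GROUPABLE = {"category": "p.category", "city": "s.city", "year": "d.year"}


def build_star():
    """Create the schema in memory and fill it with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
        (1, "Laptop", "Hardware"), (2, "Mouse", "Hardware"),
        (3, "Office Suite", "Software")])
    conn.executemany("INSERT INTO dim_store VALUES (?, ?)",
                     [(1, "Boston"), (2, "Denver")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                     [(1, 2023, 4), (2, 2024, 1)])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)", [
        (1, 1, 1, 2, 2000.0), (2, 1, 1, 10, 200.0),
        (3, 2, 2, 5, 500.0), (1, 2, 2, 1, 1000.0)])
    conn.commit()
    return conn


def revenue_by(conn, dimension):
    """Star join: total revenue grouped by one dimension attribute."""
    column = GROUPABLE[dimension]
    query = f"""
        SELECT {column}, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_store   AS s ON s.store_id   = f.store_id
        JOIN dim_date    AS d ON d.date_id    = f.date_id
        GROUP BY {column}
        ORDER BY {column}
    """
    return conn.execute(query).fetchall()
```

Calling `revenue_by` with `"category"`, `"city"`, or `"year"` re-arranges the same facts into different views, which is the "multiple views, from different perspectives" idea described above.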

DIMENSIONAL MODEL: STAR SCHEMA DIAGRAM

ENTITY-RELATION MODEL
The ER model describes data as entities, relationships, and attributes. This model seeks to drive all the redundancy out of the data, which means that a transaction that changes any data only needs to touch the database in one place.
+ Its advantage is that data exists in a highly normalized form.
- In comparison to the star schema model, the ER model does not show which entity is “strong”.
- For queries that span many records or many tables, ER diagrams are too complex for users to understand and for DBMS software to navigate successfully.

ENTITY-RELATION DIAGRAM

TRENDS AND ISSUES IN WAREHOUSING
- Move towards Web-based data warehouses:
  - lack of security,
  - business value.
- Need to assess the costs vs. benefits of implementing a data warehouse.
- Increasing need for customized models.