Data Warehousing: Defined and Its Applications Pete Johnson April 2002.

Slides:



Advertisements
Similar presentations
Chapter 13 The Data Warehouse
Advertisements

Database Management3-1 L3 Database Management Santa R. Susarapu Ph.D. Student Virginia Commonwealth University.
Management Information Systems, Sixth Edition
Data Warehouse Architecture Sakthi Angappamudali Data Architect, The Oregon State University, Corvallis 16 th May, 2005.
Database – Part 3 Dr. V.T. Raja Oregon State University External References/Sources: Data Warehousing – Mr. Sakthi Angappamudali.
MIS DATABASE SYSTEMS, DATA WAREHOUSES, AND DATA MARTS MBNA
ICS 421 Spring 2010 Data Warehousing (1) Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 3/18/20101Lipyeow.
Chapter 9 DATA WAREHOUSING Transparencies © Pearson Education Limited 1995, 2005.
Chapter 3 Database Management
Introduction to Data Warehousing. From DBMS to Decision Support DBMSs widely used to maintain transactional data Attempts to use of these data for analysis,
Database – Part 2b Dr. V.T. Raja Oregon State University External References/Sources: Data Warehousing – Sakthi Angappamudali at Standard Insurance; BI.
Components and Architecture CS 543 – Data Warehousing.
DATA WAREHOUSING.
Data Warehousing DSCI 4103 Dr. Mennecke Introduction and Chapter 1.
Designing a Data Warehouse
Lecture-8/ T. Nouf Almujally
M ODULE 5 Metadata, Tools, and Data Warehousing Section 4 Data Warehouse Administration 1 ITEC 450.
Data Conversion to a Data warehouse Presented By Sanjay Gunasekaran.
Basic Concepts of Datawarehousing An Overview Prasanth Gurram.
MIS DATABASE SYSTEMS, DATA WAREHOUSES, AND DATA MARTS MBNA ebay
©Silberschatz, Korth and Sudarshan18.1Database System Concepts - 5 th Edition, Aug 26, 2005 Buzzword List OLTP – OnLine Transaction Processing (normalized,
Database Systems – Data Warehousing
MD240 - MIS Oct. 4, 2005 Databases & the Data Asset Harrah’s & Allstate Cases.
Systems analysis and design, 6th edition Dennis, wixom, and roth
Data Warehouse Overview September 28, 2012 presented by Terry Bilskie.
Business Intelligence Zamaneh Jahed. What is Business Intelligence? Business Intelligence (BI) is a broad category of applications and technologies for.
Case 2: Emerson and Sanofi Data stewards seek data conformity
© 2007 by Prentice Hall 1 Introduction to databases.
Database Design Part of the design process is deciding how data will be stored in the system –Conventional files (sequential, indexed,..) –Databases (database.
Data warehousing and online analytical processing- Ref Chap 4) By Asst Prof. Muhammad Amir Alam.
Data Warehouse Fundamentals Rabie A. Ramadan, PhD 2.
1 Data Warehouses BUAD/American University Data Warehouses.
OLAP & DSS SUPPORT IN DATA WAREHOUSE By - Pooja Sinha Kaushalya Bakde.
1 Reviewing Data Warehouse Basics. Lessons 1.Reviewing Data Warehouse Basics 2.Defining the Business and Logical Models 3.Creating the Dimensional Model.
5-1 McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved.
5 - 1 Copyright © 2006, The McGraw-Hill Companies, Inc. All rights reserved.
CISB594 – Business Intelligence
Building Data and Document-Driven Decision Support Systems How do managers access and use large databases of historical and external facts?
MANAGING DATA RESOURCES ~ pertemuan 7 ~ Oleh: Ir. Abdul Hayat, MTI.
Sachin Goel (68) Manav Mudgal (69) Piyush Samsukha (76) Rachit Singhal (82) Richa Somvanshi (85) Sahar ( )
Data Warehouse. Group 5 Kacie Johnson Summer Bird Washington Farver Jonathan Wright Mike Muchane.
Data resource management
CISB594 – Business Intelligence Data Warehousing Part I.
Data Warehouses and OLAP Data Management Dennis Volemi D61/70384/2009 Judy Mwangoe D61/73260/2009 Jeremy Ndirangu D61/75216/2009.
Chapter 5 DATA WAREHOUSING Study Sections 5.2, 5.3, 5.5, Pages: & Snowflake schema.
Management Information Systems, 4 th Edition 1 Chapter 8 Data and Knowledge Management.
DATA RESOURCE MANAGEMENT
© 2003 Prentice Hall, Inc.3-1 Chapter 3 Database Management Information Systems Today Leonard Jessup and Joseph Valacich.
Advanced Database Concepts
2 Copyright © 2006, Oracle. All rights reserved. Defining Data Warehouse Concepts and Terminology.
INTRODUCTION TO INFORMATION SYSTEMS LECTURE 9: DATABASE FEATURES, FUNCTIONS AND ARCHITECTURES PART (2) أ/ غدير عاشور 1.
Management Information Systems by Prof. Park Kyung-Hye Chapter 7 (8th Week) Databases and Data Warehouses 07.
Data Mining and Data Warehousing: Concepts and Techniques What is a Data Warehouse? Data Warehouse vs. other systems, OLTP vs. OLAP Conceptual Modeling.
Intro to MIS – MGS351 Databases and Data Warehouses
Chapter 8 Business Intelligence & ERP
Defining Data Warehouse Concepts and Terminology
Decision Support System by Simulation Model (Ajarn Chat Chuchuen)
Data warehouse and OLAP
Chapter 13 The Data Warehouse
Data Warehouse.
Defining Data Warehouse Concepts and Terminology
المحاضرة 4 : مستودعات البيانات (Data warehouse)
MANAGING DATA RESOURCES
Data Warehouse and OLAP
Data Warehouse Overview September 28, 2012 presented by Terry Bilskie
An Introduction to Data Warehousing
Data Warehouse.
Data Warehousing Concepts
Chapter 3 Database Management
Data Warehouse and OLAP
Presentation transcript:

Data Warehousing: Defined and Its Applications Pete Johnson April 2002

Introduction The Past and The Problem What is a Data Warehouse? Components of a Data Warehouse OLAP, Metadata, Data Mining Getting the Data in Benefits vs. Costs Conclusion & Questions

The Past and The Problem Only had scattered transactional systems in the organization – data spread among different systems Transactional systems were not designed for decision support analysis Data constantly changes on transactional systems Lack of historical data Often resources were taxed with both needs on the same systems

The Past and The Problem Operational databases are designed to keep transactions from daily operations. It is optimized to efficiently update or create individual records A database for analysis on the other hand needs to be geared toward flexible requests or queries (Ad hoc, statistical analysis)

What is a Data Warehouse? Data warehousing is an architectural model designed to gather data from various sources into a single unified data model for analysis purposes.

What Is a Data Warehouse? Term was introduced in 1990 by William Immon A managed database in which the data is: Subject Oriented Integrated Time Variant Non Volatile

Subject Oriented Organized around major subject areas in the enterprise (Sales, Inventory, Financial, etc.) Only includes data which is used in the decision making processes Elements used for transactional processing are removed

Integrated Data from different sources are brought together and consolidated The data is cleaned and made consistent Example – Bank Systems using Different Codes Loan Department – COMM Transactional System - C

Time Variant Data in a Data Warehouse contains both current and historical information Operational Systems contain only current data Systems typically retain data: Operational Systems – 60 to 90 Days Data Warehouse – 5 to 10 Years

Non Volatile Operational systems have continually changing data Data Warehouses continually absorb current data and integrates it with its existing data (Aggregate or Summary tables) Example of volatile data would be an account balance at a bank

What Is a Data Warehouse? Not a product, it is a process Combination of hardware and software Concept of a Data Warehouse is not new, but the technology that allows it is

What Is a Data Warehouse? Can often be set up as one VLDB (Very Large Database) or a collection of subject areas called Data Marts. There are now tools which “unify” these Data Marts and make it appear as a single database.

What Is a Data Warehouse? Transformation of Data to Information Transaction Processing Cleansing / & Normalization Relational Warehouse SQL reporting Exploration / Analysis Data Information

Components of a Data Warehouse Four General Components: Hardware DBMS - Database Management System Front End Access Tools Other Tools In all components scalability is vital Scalability is the ability to grow as your data and processing needs increase

Components of a Data Warehouse - Hardware Power - # of Processors, Memory, I/O Bandwidth, and Speed of the Bus Availability – Redundant equipment Disk Storage - Speed and enough storage for the loaded data set Backup Solution - Automated and be able to allow for incremental backups and archiving older data

Components of a Data Warehouse - DBMS Physical storage capacity of the DBMS Loading, indexing, and processing speed Availability Handle your data needs Operational integrity, reliability, and manageability

Components of a Data Warehouse - Front End & Other Tools Query Tools (SQL & GUI based) Report Writers Metadata Repositories OLAP (Online Analytical Processing) Data Mining Products

Components of a Data Warehouse – Metadata Repositories Metadata is Data about Data. Users and Developers often need a way to find information on the data they use. Information can include: Source System(s) of the Data, contact information Related tables or subject areas Programs or Processes which use the data Population rules (Update or Insert and how often) Status of the Data Warehouse’s processing and condition

Components of a Data Warehouse – OLAP Tools OLAP - Online Analytical Processing. It works by aggregating detail data and looks at it by dimensions Gives the ability to “Drill Down” in to the detail data Decision Support Analysis Tool Multidimensional DB focusing on retrieval of precalculated data Ends the “big reports” with large amounts of detailed data These tools are often graphical and can run on a “thin client” such as a web browser

Components of a Data Warehouse – Data Mining Answers the questions you didn’t know to ask Analyzes great amounts of data (usually contained in a Data Warehouse) and looks for trends in the data Technology now allows us to do this better than in the past

Components of a Data Warehouse – Data Mining Most famous example is the Huggies - Heineken case Used in Retail sector to analyze buying habits Used in financial areas to detect fraud Used in the stock market to find trends Used in scientific research Used in national security

Getting the Data In Data will come from multiple databases and files within the organization Also can come from outside sources Examples: Weather Reports Demographic information by Zip Code

Getting the Data In Three Steps : 1. Extraction Phase 2. Transformation Phase 3. Loading Phase

Getting the Data In Extraction Phase: Source systems export data via files or populates directly when the databases can “talk” to each other Transfers them to the Data Warehouse server and puts it into some sort of staging area

Getting the Data In Transformation Phase: Takes data and turns it into a form that is suitable for insertion into the warehouse Combines related data Removes redundancies Common Codes (Commercial Customer) Spelling Mistakes (Lozenges) Consistency (PA,Pa,Penna,Pennsylvania) Formatting (Addresses)

Getting the Data In Loading Phase: Places the cleaned data into the DBMS in its final, useable form Compare data from source systems and the Data Warehouse Document the load information for the users

Benefits vs. Costs

Benefits Creates a single point for all data System is optimized and designed specifically for analysis Access data without impacting the operational systems Users can access the data directly without the direct help from IT dept

Costs Cost of implementation & maintenance (hardware, software, and staffing) Lack of compatibility between components Data from many sources are hard to combine, data integrity issues Bad designs and practices can lead to costly failures

Conclusion What is a Data Warehouse? Components of a Data Warehouse How the Data Gets In OLAP, Metadata, and Data Mining Benefits vs. Costs

Questions