© 2003, Prentice-Hall Chapter 2 - 1 Chapter 2: The Data Warehouse Modern Data Warehousing, Mining, and Visualization: Core Concepts by George M. Marakas.

Presentation transcript:


Stores, Warehouses and Marts
- A data warehouse is a collection of integrated databases designed to support a DSS.
- An operational data store (ODS) stores data for a specific application. It feeds the data warehouse a stream of desired raw data.
- A data mart is a lower-cost, scaled-down version of a data warehouse, usually designed to support a small group of users (rather than the entire firm).
- The metadata is information that is kept about the warehouse.

The Data Warehouse Environment
- The organization’s legacy systems and data stores provide data to the data warehouse or mart.
- During the transfer of data from the various sources, cleansing or transformation may occur, so the data in the DW is more uniform.
- Simultaneously, metadata is recorded.
- Finally, the DW or mart may be used to create one or more “personal” warehouses.
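
The flow described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source systems, their field names, and the `load` helper are all hypothetical and do not come from the text. It shows two legacy extracts being cleansed into one uniform layout while a metadata entry is recorded for each transfer.

```python
from datetime import datetime, timezone

# Hypothetical extracts from two legacy systems that encode the same facts differently.
ORDERS_SYSTEM_A = [{"cust": " alice ", "amount": "12.50", "date": "2003-01-15"}]
ORDERS_SYSTEM_B = [{"customer_name": "BOB", "amt_cents": 990, "order_date": "15/01/2003"}]


def cleanse_a(row):
    """Map a System A row onto the (assumed) warehouse layout."""
    return {
        "customer": row["cust"].strip().title(),
        "amount": float(row["amount"]),
        "order_date": row["date"],  # already in YYYY-MM-DD form
    }


def cleanse_b(row):
    """Map a System B row onto the (assumed) warehouse layout."""
    day, month, year = row["order_date"].split("/")
    return {
        "customer": row["customer_name"].strip().title(),
        "amount": row["amt_cents"] / 100,
        "order_date": f"{year}-{month}-{day}",
    }


def load(warehouse, metadata, source_name, rows, cleanse):
    """Cleanse rows into the warehouse and record metadata about the transfer."""
    cleaned = [cleanse(r) for r in rows]
    warehouse.extend(cleaned)
    metadata.append({
        "source": source_name,
        "rows_loaded": len(cleaned),
        "transformation": cleanse.__name__,
        "loaded_at": datetime.now(timezone.utc).isoformat(),
    })


warehouse, metadata = [], []
load(warehouse, metadata, "system_a", ORDERS_SYSTEM_A, cleanse_a)
load(warehouse, metadata, "system_b", ORDERS_SYSTEM_B, cleanse_b)
```

After both loads, every row in `warehouse` has the same keys and date format, and `metadata` holds one entry per source describing what was loaded and how.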

Organizational Data Flow and Data Storage Components (figure)

Characteristics of a Data Warehouse
- Subject oriented: organized based on use
- Integrated: inconsistencies removed
- Nonvolatile: stored in read-only format
- Time variant: data are normally time series
- Summarized: in decision-usable format
- Large volume: data sets are quite large
- Non-normalized: often redundant
- Metadata: data about data are stored
- Data sources: data come from nonintegrated sources

A Data Warehouse is Subject Oriented (figure)

Data in a Data Warehouse are Integrated (figure)

The Data Warehouse Architecture
The architecture consists of various interconnected elements:
- Operational and external database layer: the source data for the DW
- Information access layer: the tools the end user works with to extract and analyze the data
- Data access layer: the interface between the operational and information access layers
- Metadata layer: the data directory or repository of metadata information

The Data Warehouse Architecture (cont.)
Additional layers are:
- Process management layer: the scheduler or job controller
- Application messaging layer: the “middleware” that transports information around the firm
- Physical data warehouse layer: where the actual data used in the DSS are located
- Data staging layer: all of the processes necessary to select, edit, summarize and load warehouse data from the operational and external databases
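
As a rough illustration of what the data staging layer does, the sketch below walks through select, edit, summarize and load on a handful of made-up rows. The row layout, the `stage` function and the `physical_warehouse` dictionary are assumptions introduced for this example only.

```python
from collections import defaultdict

# Hypothetical operational rows: (store, date, sale amount in cents, status).
operational_rows = [
    (1023, "2003-01-15", 1999, "posted"),
    (1023, "2003-01-20", 500, "posted"),
    (1023, "2003-02-02", 1250, "posted"),
    (2040, "2003-01-07", 3000, "void"),
    (2040, "2003-01-09", 4200, "posted"),
]


def stage(rows):
    """Select posted sales, edit dates down to a month, and total per store and month."""
    totals = defaultdict(int)
    for store, date, cents, status in rows:
        if status != "posted":           # select: skip rows the warehouse should not see
            continue
        month = date[:7]                 # edit: "2003-01-15" becomes "2003-01"
        totals[(store, month)] += cents  # summarize: monthly total per store
    return dict(totals)


# Load: here the "physical warehouse" is just a dictionary standing in for a database.
physical_warehouse = {}
physical_warehouse.update(stage(operational_rows))
```

The voided sale never reaches the warehouse, and the three sales for store 1023 collapse into one total per month, which is the summarized, decision-usable form the earlier slides describe.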

Components of the Data Warehouse Architecture (figure)

Data Warehousing Typology
- The virtual data warehouse: the end users have direct access to the data stores, using tools enabled at the data access layer
- The central data warehouse: a single physical database contains all of the data for a specific functional area
- The distributed data warehouse: the components are distributed across several physical databases

Data Have Data: The Metadata
- The name suggests some high-level technological concept, but it really is fairly simple. Metadata are “data about data”.
- With the emergence of the data warehouse as a decision support structure, the metadata are considered as much a resource as the business data they describe.
- Metadata are abstractions: high-level data that provide concise descriptions of lower-level data.

The Metadata in Action
The metadata are essential ingredients in the transformation of raw data into knowledge. They are the “keys” that allow us to handle the raw data. For example, a line in a sales database may contain: 1023 K This is mostly meaningless until we consult the metadata (in the data directory) that tells us it was store number 1023, product K596 and sales of $
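
To make the idea concrete, here is a minimal sketch of a data directory being used to interpret a raw line. The directory layout and the `interpret` function are hypothetical, and the sales amount `250.00` is a placeholder chosen for this example, not the figure from the slide.

```python
from decimal import Decimal

# A hypothetical data directory: field name, meaning, and parser for each position.
SALES_DIRECTORY = [
    ("store_number", "Store number", int),
    ("product_code", "Product code", str),
    ("sales_amount", "Sales in dollars", Decimal),
]


def interpret(raw_line, directory):
    """Pair each whitespace-separated value with the field the directory describes."""
    values = raw_line.split()
    if len(values) != len(directory):
        raise ValueError("record does not match the directory layout")
    return {
        name: parse(value)
        for (name, _meaning, parse), value in zip(directory, values)
    }


# The amount 250.00 is a made-up placeholder value.
record = interpret("1023 K596 250.00", SALES_DIRECTORY)
```

Without `SALES_DIRECTORY`, the string is three unlabeled tokens; with it, each token gets a name, a meaning and a type.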

The Need for Consistency in the Metadata
- The data warehouse is set up for the benefit of business analysts and executives across all functional areas.
- In their individual databases, the different areas may define and store data according to their own version of the “truth”.
- When data are retrieved from these different areas and placed in the warehouse, the transformation and cleansing process ensures that there is a single, integrated “truth” at the organizational level.
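
One way to picture the move from several local “truths” to a single one is a reference table that translates each area's codes into one warehouse encoding. The departments, codes and the `to_warehouse_value` function below are invented for illustration and are not taken from the text.

```python
# Hypothetical per-department encodings of the same attribute (customer status).
ENCODINGS = {
    "sales":   {"A": "active", "I": "inactive"},
    "billing": {"1": "active", "0": "inactive"},
    "support": {"open": "active", "closed": "inactive"},
}


def to_warehouse_value(department, local_code):
    """Translate a department's local code into the single warehouse encoding."""
    try:
        return ENCODINGS[department][local_code]
    except KeyError:
        # Unknown codes are reported rather than guessed, so the conflict gets resolved.
        raise ValueError(f"no mapping for {department!r} code {local_code!r}") from None


rows = [("sales", "A"), ("billing", "0"), ("support", "open")]
integrated = [to_warehouse_value(dept, code) for dept, code in rows]
```

Raising an error on an unmapped code is a design choice in this sketch: silently passing it through would let a second version of the “truth” leak into the warehouse.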

Interviewing the Data: Metadata Extraction
Regardless of the nature of a query, certain aspects of the metadata are important to all decision-makers. Some of these are:
- What tables, attributes and keys does the DW contain?
- Where did each set of data come from?
- What transformations were applied during cleansing?

Interviewing the Data: Metadata Extraction (cont.)
- How have the metadata changed over time?
- How often do the data get reloaded?
- Are there so many data elements that you need to be careful what you ask for?
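
Several of these questions can be answered by querying the database's own catalog. The sketch below uses Python's standard `sqlite3` module with an in-memory database as a stand-in for a warehouse; the `sales` table and the `load_log` table are hypothetical and exist only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (store_number INTEGER, product_code TEXT, amount REAL,
                        PRIMARY KEY (store_number, product_code));
    -- Hypothetical load log kept alongside the warehouse tables.
    CREATE TABLE load_log (table_name TEXT, source TEXT, transformation TEXT, loaded_at TEXT);
    INSERT INTO load_log VALUES
        ('sales', 'pos_system', 'currency normalized to USD', '2003-01-31'),
        ('sales', 'pos_system', 'currency normalized to USD', '2003-02-28');
""")

# What tables does the warehouse contain?
tables = [name for (name,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# What attributes and keys does a table have? PRAGMA table_info returns
# (cid, name, type, notnull, dflt_value, pk) for each column.
columns = [(row[1], row[2], row[5] > 0)
           for row in conn.execute("PRAGMA table_info(sales)")]

# Where did the data come from, what was done to it, and how often is it reloaded?
loads = conn.execute(
    "SELECT source, transformation, loaded_at FROM load_log "
    "WHERE table_name = ? ORDER BY loaded_at", ("sales",)).fetchall()
```

Here `tables` and `columns` come from SQLite's built-in catalog, while `loads` depends on a log that the loading process would have to maintain itself.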

Components of the Metadata
- Transformation maps: records that show what transformations were applied
- Extraction history: records that show what data was analyzed
- Algorithms for summarization: methods available for aggregating and summarizing
- Data ownership: records that show origin
- Access patterns: records that show what data are accessed and how often

Typical Mapping Metadata
Transformation mapping records include:
- Identification of original source
- Attribute conversions
- Physical characteristic conversions
- Encoding/reference table conversions
- Naming changes
- Key changes
- Values of default attributes
- Logic to choose from multiple sources
- Algorithmic changes
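
A transformation mapping record could be represented as a small data structure. The sketch below is one possible shape, not a standard layout: the `AttributeMapping` class, its fields and the sample mappings are assumptions made for illustration. It covers a few of the items above (original source, naming changes, attribute and encoding conversions, default values).

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class AttributeMapping:
    """One transformation-map entry for a single warehouse attribute (hypothetical layout)."""
    source_system: str             # identification of original source
    source_attribute: str          # name before the change
    target_attribute: str          # name after the change
    convert: Callable[[Any], Any]  # attribute or encoding conversion
    default: Any = None            # value used when the source attribute is missing


def apply_mappings(source_row, mappings):
    """Build a warehouse row by applying each mapping to the source row."""
    target = {}
    for m in mappings:
        if m.source_attribute in source_row:
            target[m.target_attribute] = m.convert(source_row[m.source_attribute])
        else:
            target[m.target_attribute] = m.default
    return target


MAPPINGS = [
    AttributeMapping("pos_system", "STORE_NO", "store_number", int),
    AttributeMapping("pos_system", "UOM", "unit", {"EA": "each", "CS": "case"}.get),
    AttributeMapping("pos_system", "REGION", "region", str, default="unknown"),
]

row = apply_mappings({"STORE_NO": "1023", "UOM": "CS"}, MAPPINGS)
```

Because the mappings are plain data, the same list that drives the load can also be stored and shown to analysts as the record of what was changed.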

Implementing the Data Warehouse
Kozar assembled a list of “seven deadly sins” of data warehouse implementation:
1. “If you build it, they will come”: the DW needs to be designed to meet people’s needs
2. Omission of an architectural framework: you need to consider the number of users, volume of data, update cycle, etc.
3. Underestimating the importance of documenting assumptions: the assumptions and potential conflicts must be included in the framework

“Seven Deadly Sins”, continued
4. Failure to use the right tool: a DW project needs different tools than those used to develop an application
5. Life cycle abuse: in a DW, the life cycle really never ends
6. Ignorance about data conflicts: resolving these takes a lot more effort than most people realize
7. Failure to learn from mistakes: since one DW project tends to beget another, learning from the early mistakes will yield higher quality later

Data Warehouse Technologies
- No one currently offers an end-to-end DW solution. Organizations buy bits and pieces from a number of vendors and hope to make them work together.
- SAS, IBM, Software AG, Information Builders and Platinum offer solutions that are at least fairly comprehensive.
- The market is very competitive. Table 2-6 in the text lists 90 firms that produce DW products.

The Future of Data Warehousing
As the DW becomes a standard part of an organization, there will be efforts to find new ways to use the data. This will likely bring with it several new challenges:
- Regulatory constraints may limit the ability to combine sources of disparate data.
- These disparate sources are likely to contain unstructured data, which is hard to store.
- The Internet makes it possible to access data from virtually “anywhere”. Of course, this just increases the disparity.