The Data Warehouse “A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of “all” an organisation’s data in support.

Presentation transcript:

The Data Warehouse
“A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of “all” an organisation’s data in support of management’s decision making process.”
–Data warehouses developed because some questions cannot be answered from a single operational database.
–E.g. if you want to ask “How much does this customer owe?” then the sales database is probably the one to use. However, if you want to ask “Was this ad campaign more successful than that one?”, you require data from more disparate sources, e.g. production, marketing etc.

Organizational Data Flow and Data Storage Components

Characteristics of a Data Warehouse
Subject oriented – organized based on use, e.g. business process
Integrated – inconsistencies removed
Nonvolatile – stored in read-only format
Time variant – data are normally time series
Summarized – in decision-usable format
Large volume – data sets are quite large
Non-normalized – often redundant

Non-volatile and non-normalised Data
Data in the warehouse is not updated in real time but is refreshed from operational systems on a regular basis. New data is always added as a supplement to the database, rather than as a replacement. Data is non-normalised; this is achieved using the starflake and similar schemas… © Pearson Education Limited 1995, 2005
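The "supplement, not replacement" idea can be illustrated with a small sketch. It assumes a hypothetical `sales_history` table and made-up customers, balances and load dates; none of these names come from the slides, and a real warehouse load would be far more involved.

```python
import sqlite3

def refresh(conn, load_date, rows):
    """Append one batch of operational rows, tagged with its load date."""
    conn.executemany(
        "INSERT INTO sales_history (load_date, customer, balance) VALUES (?, ?, ?)",
        [(load_date, customer, balance) for customer, balance in rows],
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
# Hypothetical warehouse table: every row keeps the date it was loaded.
conn.execute(
    "CREATE TABLE sales_history (load_date TEXT, customer TEXT, balance REAL)"
)

# Two periodic refreshes; the second adds rows and never overwrites the first.
refresh(conn, "2005-01-31", [("C1", 100.0), ("C2", 250.0)])
refresh(conn, "2005-02-28", [("C1", 80.0), ("C2", 300.0)])

row_count = conn.execute("SELECT COUNT(*) FROM sales_history").fetchone()[0]
history = conn.execute(
    "SELECT load_date, balance FROM sales_history "
    "WHERE customer = 'C1' ORDER BY load_date"
).fetchall()
print(row_count, history)
```

Because earlier rows are never updated, the table holds a time series for each customer, which is what makes the data time variant as well as nonvolatile.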

A data warehouse process model

Data Warehousing Architecture
Fusion and cleansing: sourcing, acquisition, cleanup and transformation of data
–Implementing data warehouses involves extracting data from operational systems, including legacy systems, and putting it into a suitable format.
–Fusion and cleansing tools perform all the conversions, summarisations, key changes, structural changes, and condensations needed to transform disparate data into information that can be used by decision support tools.
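A minimal sketch of the conversions, key changes, structural changes and summarisation described above. The two source systems, their field names and all values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical extracts from two operational systems with inconsistent formats.
legacy_rows = [
    {"cust": "c-001", "amount_pence": 12550},
    {"cust": "c-002", "amount_pence": 9900},
]
web_rows = [
    {"customer_id": "C001", "amount": "74.50"},
    {"customer_id": "C002", "amount": "25.50"},
]

def transform_legacy(row):
    # Key change: "c-001" -> "C001"; conversion: pence -> pounds.
    return {
        "customer": row["cust"].upper().replace("-", ""),
        "amount": row["amount_pence"] / 100,
    }

def transform_web(row):
    # Structural change: rename the key field; conversion: string -> float.
    return {"customer": row["customer_id"], "amount": float(row["amount"])}

def summarise(rows):
    # Summarisation: one total per customer across all sources.
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)

cleaned = [transform_legacy(r) for r in legacy_rows]
cleaned += [transform_web(r) for r in web_rows]
totals = summarise(cleaned)
print(totals)
```

Once both sources share one key format and one unit, the per-customer totals can be computed without knowing which system a row came from.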

Data in a Data Warehouse are Integrated

Meta Data
A key concept behind the data warehouse (D.W.) is metadata.
–Metadata is data about the data (which has come from the data sources). It shows what data is contained in the D.W., where it came from, and what changes have been made to it. Metadata is an essential ingredient in the transformation of raw data into knowledge: it is the “key” that allows us to handle the raw data.
–For example, a line in a sales database may contain: 1023 K
–This is mostly meaningless until we consult the metadata (in the data directory), which tells us it was store number 1023, product K596 and sales of $
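The role of the data directory can be sketched as follows. The `METADATA` layout and the `describe` helper are hypothetical, and the sales figure in the sample record is a made-up placeholder, since the original example's value is not available.

```python
# Hypothetical data directory: field name, meaning, and type converter.
METADATA = [
    ("store_no", "store number", int),
    ("product", "product code", str),
    ("sales", "sales amount in dollars", float),
]

def describe(raw_line):
    """Interpret a whitespace-separated raw record using the metadata."""
    values = raw_line.split()
    return {
        name: convert(value)
        for (name, _meaning, convert), value in zip(METADATA, values)
    }

# The sales value 120.00 is a placeholder, not a figure from the slides.
record = describe("1023 K596 120.00")
print(record)
```

Without `METADATA`, the raw line is just three tokens; with it, each token gains a name, a meaning and a type.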

Data marts
A data mart is a data store that is subsidiary to a data warehouse of integrated data. The data mart is directed at a partition of data (subject area) that is created for the use of a dedicated group of users, and is sometimes termed a “subject warehouse”. The data mart might be a set of denormalised, summarised or aggregated data that can be placed on the data warehouse database or, more often, on a separate physical store. Data marts are “dependent data marts” when the data is sourced from the data warehouse. Independent data marts represent fragmented solutions to a range of business problems in the enterprise. However, such a concept should not be deployed, as it lacks the “data integration” concept that is associated with data warehouses.

Data Warehousing Typology
–The D.W. can be at a single location, i.e. a central data warehouse, or
–The collection of data is replicated around multiple locations, i.e. a distributed data warehouse. This means users have a local copy of the data warehouse, which can improve query run-times and reduce communications overheads. (Note: the principles associated with distributed databases apply equally to distributed data warehouses.)

Data Warehousing Design DT211/4

Designing Data Warehouses
We need to find answers to questions such as:
–Which user requirements are most important?
–Which data should be considered first…
The database component of a data warehouse is described using a technique called dimensionality modelling.

Dimensionality modelling
A logical design technique that aims to present the data in a standard, intuitive form that allows for high-performance access. Every dimensional model (DM) is composed of one table with a composite primary key, called the fact table, and a set of smaller tables called dimension tables.
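A minimal sketch of this structure, using SQLite from Python. The table and column names are hypothetical, and the choice of (time, branch) as the composite key is an assumption made only to keep the example small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Dimension tables: small, one simple key each.
CREATE TABLE dim_time   (time_id   INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_branch (branch_id INTEGER PRIMARY KEY, city TEXT, region TEXT);

-- Fact table: its primary key is composed of the dimension keys.
CREATE TABLE fact_property_sale (
    time_id       INTEGER REFERENCES dim_time(time_id),
    branch_id     INTEGER REFERENCES dim_branch(branch_id),
    selling_price REAL,
    PRIMARY KEY (time_id, branch_id)
);
""")

conn.execute("INSERT INTO dim_time VALUES (1, 2005, 5)")
conn.execute("INSERT INTO dim_branch VALUES (10, 'CityA', 'RegionX')")
conn.execute("INSERT INTO fact_property_sale VALUES (1, 10, 150000.0)")

# A second fact row with the same key combination is rejected.
try:
    conn.execute("INSERT INTO fact_property_sale VALUES (1, 10, 99000.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

fact_rows = conn.execute("SELECT COUNT(*) FROM fact_property_sale").fetchone()[0]
print(duplicate_rejected, fact_rows)
```

Each component of the fact table's key is also a foreign key to one dimension table, so every fact row is identified by its combination of dimension values.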

Fact and dimension tables for each business process of DreamHome

ER model of property sales business process of DreamHome

Star schema for property sales of DreamHome

Dimensionality modelling
A star schema is a logical structure that has a fact table containing factual data in the centre, surrounded by dimension tables containing reference data, which can be denormalised. For example, the dimension tables (property for sale, client, branch and staff) all have city, region and county repeated.

Dimensionality modelling
Star schemas can be used to speed up query performance by denormalising reference information into a single dimension table. For example, the dimension tables (property for sale, client, branch and staff) all have city, region and county repeated.
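The effect of keeping city, region and county inside the dimension table can be sketched as follows. The tables, keys and values are hypothetical; the point is only that a query by region needs a single join from the fact table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalised dimension: city, region and county are stored on each row
-- instead of being split out into separate lookup tables.
CREATE TABLE dim_branch (
    branch_id INTEGER PRIMARY KEY, city TEXT, region TEXT, county TEXT
);
CREATE TABLE fact_property_sale (branch_id INTEGER, selling_price REAL);

INSERT INTO dim_branch VALUES
    (1, 'CityA', 'RegionX', 'CountyP'),
    (2, 'CityB', 'RegionX', 'CountyP'),
    (3, 'CityC', 'RegionY', 'CountyQ');
INSERT INTO fact_property_sale VALUES
    (1, 100000), (2, 150000), (3, 200000), (1, 50000);
""")

# One join reaches the region; no chain of city -> region lookups is needed.
sales_by_region = conn.execute("""
    SELECT b.region, SUM(f.selling_price)
    FROM fact_property_sale AS f
    JOIN dim_branch AS b ON b.branch_id = f.branch_id
    GROUP BY b.region
    ORDER BY b.region
""").fetchall()
print(sales_by_region)
```

The cost is redundancy: 'RegionX' and 'CountyP' are stored once per branch rather than once overall.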

Database Design Methodology for Data Warehouses
The ‘methodology’ includes the following steps:
–Choosing the process
–Choosing the facts and dimensions
–Choosing the facts
–Storing pre-calculations in the fact table

Choosing the process
The process (function) refers to the subject matter of a particular data warehouse; it should answer the most commercially important business questions. Identify the discrete business processes, for example property sales.

ER model of property sales business process of DreamHome

Choosing the facts
Decide what a record of the fact table is to represent, e.g. property sales. Facts should be numeric and additive. Identify the dimensions of the fact table. The content of the fact table also determines the content of each dimension table. Dimensions set the context for asking questions about the facts in the fact table, e.g. ClientBuyer: client no., client name, city, region, county.
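Why facts should be numeric and additive can be shown with a short sketch. The sales records below are hypothetical; the `total_by` helper is an illustrative name, not part of any methodology.

```python
from collections import defaultdict

# Hypothetical fact records: one numeric fact plus two dimension values each.
sales = [
    {"branch": "B1", "client_city": "CityA", "selling_price": 100000},
    {"branch": "B1", "client_city": "CityB", "selling_price": 150000},
    {"branch": "B2", "client_city": "CityA", "selling_price": 200000},
]

def total_by(dimension):
    """Sum the additive fact, grouped by the chosen dimension."""
    totals = defaultdict(int)
    for row in sales:
        totals[row[dimension]] += row["selling_price"]
    return dict(totals)

by_branch = total_by("branch")
by_city = total_by("client_city")
grand_total = sum(row["selling_price"] for row in sales)
print(by_branch, by_city, grand_total)
```

An additive fact can be summed along any dimension and the group totals always add up to the same grand total, so the dimension chosen only changes the context of the question, not the answer overall.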

Star schema for property sales of DreamHome