Data Warehousing Design DT211/4. Designing Data Warehouses To begin a data warehouse project, we need to find answers for questions such as: – Which user.


1 Data Warehousing Design DT211/4

2 Designing Data Warehouses
To begin a data warehouse project, we need to find answers for questions such as:
– Which user requirements are most important and which data should be considered first?
– Should the project be scaled down into something more manageable?
– Should the infrastructure for a scaled-down project be capable of ultimately delivering a full-scale enterprise-wide data warehouse?

3 Designing Data Warehouses Interviews provide the necessary information for the top-down view (user requirements) and the bottom-up view (which data sources are available and will be supported over the next few years) of the data warehouse. The database component of a data warehouse is described using a technique called dimensionality modeling.

4 Dimensionality modeling A logical design technique that aims to present the data in a standard, intuitive form that allows for high-performance access. Every dimensional model (DM) is composed of one table with a composite primary key, called the fact table, and a set of smaller tables called dimension tables.
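As a minimal sketch of this structure, the Python snippet below uses the standard sqlite3 module to build one fact table and three dimension tables in memory. The table and column names are illustrative assumptions loosely based on the DreamHome property sales example, not the schema from the slides.

```python
import sqlite3

# Hypothetical, simplified tables; all names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim     (time_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE branch_dim   (branch_id INTEGER PRIMARY KEY, city TEXT, region TEXT);
CREATE TABLE property_dim (property_id INTEGER PRIMARY KEY, type TEXT, city TEXT);

-- Fact table: its composite primary key is made up of the foreign keys
-- that point at the dimension tables.
CREATE TABLE property_sale_fact (
    time_id       INTEGER REFERENCES time_dim(time_id),
    branch_id     INTEGER REFERENCES branch_dim(branch_id),
    property_id   INTEGER REFERENCES property_dim(property_id),
    selling_price REAL,
    PRIMARY KEY (time_id, branch_id, property_id)
);
""")


def primary_key_columns(connection, table):
    """Return the primary-key columns of a table, in key order."""
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk); pk is the
    # 1-based position of the column within the primary key, or 0.
    keyed = [(row[5], row[1]) for row in rows if row[5] > 0]
    return [name for _, name in sorted(keyed)]


print(primary_key_columns(conn, "property_sale_fact"))
```

Each dimension table has a simple single-column key, while the key of the fact table is the combination of those keys.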

5 Fact and dimension tables for each business process of DreamHome

6 ER model of property sales business process of DreamHome

7 Star schema for property sales of DreamHome

8 Dimensionality modeling A star schema is a logical structure that has a fact table containing factual data in the centre, surrounded by dimension tables containing reference data, which can be denormalised. Facts are generated by events that occurred in the past and are unlikely to change, regardless of how they are analyzed. It is important to treat fact data as read-only reference data that will not change over time. The most useful fact tables contain one or more measures, or ‘facts’, that occur for each record and are numeric and additive.

9 Dimensionality modeling Star schemas can be used to speed up query performance by denormalizing reference information into a single dimension table. For example, the dimension tables (property for sale, client, branch and staff) all have city, region and county repeated. This avoids having to join certain tables.
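The effect of denormalizing can be sketched as follows. In this hypothetical example the branch dimension carries city, region and county directly, so totalling sales by region needs only one join from the fact table. The table names, column names and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalised dimension: city, region and county are stored on every row
-- instead of being split out into separate lookup tables.
CREATE TABLE branch_dim (
    branch_id INTEGER PRIMARY KEY,
    city TEXT, region TEXT, county TEXT
);
CREATE TABLE property_sale_fact (
    branch_id     INTEGER REFERENCES branch_dim(branch_id),
    property_id   INTEGER,
    selling_price REAL,
    PRIMARY KEY (branch_id, property_id)
);
-- Invented sample data.
INSERT INTO branch_dim VALUES
    (1, 'CityA', 'RegionX', 'CountyP'),
    (2, 'CityB', 'RegionX', 'CountyQ'),
    (3, 'CityC', 'RegionY', 'CountyR');
INSERT INTO property_sale_fact VALUES
    (1, 101, 100000.0), (2, 102, 150000.0), (3, 103, 200000.0);
""")

# One join is enough; no separate city or region table has to be visited.
sales_by_region = conn.execute("""
    SELECT b.region, SUM(f.selling_price)
    FROM property_sale_fact AS f
    JOIN branch_dim AS b ON b.branch_id = f.branch_id
    GROUP BY b.region
    ORDER BY b.region
""").fetchall()

print(sales_by_region)
```

The trade-off is redundancy: the same region value is repeated on many dimension rows, which is usually acceptable because dimension tables are small relative to the fact table.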

10 Database Design Methodology for Data Warehouses
The methodology includes the following steps:
– Choosing the process
– Choosing the grain
– Identifying and conforming the dimensions
– Choosing the facts
– Storing pre-calculations in the fact table
– Choosing the duration of the database

11 Step 1: Choosing the process The process (function) refers to the subject matter of a particular data warehouse, chosen to answer the most commercially important business questions. Identify the discrete business processes, for example property sales.

12 ER model of an extended version of DreamHome © Pearson Education Limited 1995, 2005

13 ER model of property sales business process of DreamHome

14 Step 2: Choosing the grain Decide what a record of the fact table is to represent, e.g. an individual property sale. Identify the dimensions of the fact table. The grain decision for the fact table also determines the grain of each dimension table. Also include time as a core dimension, which is always present in star schemas.
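To make the idea of grain concrete, the sketch below holds the same invented sales at two grains: one record per individual sale, and one record per month and branch. The field layout and the helper name roll_up_to_month are assumptions for illustration only.

```python
from collections import defaultdict

# Finest grain: one record per individual property sale (invented data).
# Layout: (sale_date, branch_no, property_no, selling_price)
individual_sales = [
    ("2004-01-15", "B001", "P101", 100000),
    ("2004-01-20", "B001", "P102", 150000),
    ("2004-02-03", "B002", "P103", 200000),
]


def roll_up_to_month(sales):
    """Aggregate individual sales to a coarser grain: one record per (month, branch)."""
    totals = defaultdict(int)
    for sale_date, branch_no, _property_no, price in sales:
        month = sale_date[:7]  # 'YYYY-MM'
        totals[(month, branch_no)] += price
    return dict(totals)


monthly_sales = roll_up_to_month(individual_sales)
print(monthly_sales)
```

The fine grain can always be rolled up to the coarse one, but the property dimension is lost in the monthly records and cannot be recovered from them, which is why the grain has to be decided before the dimensions.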

15 Star schema for property sales of DreamHome

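16 Step 3: Identifying and conforming the dimensions Dimensions set the context for asking questions about the facts in the fact table. For example, ClientBuyer: client no., client name, city, region, county. If any dimension occurs in two data marts (subsets of the data warehouse), it must be exactly the same dimension in both, or one must be a subset of the other; such a dimension is said to be conformed.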
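A conformed dimension can be sketched as one dimension table shared by two fact tables. In the hypothetical example below, a single branch dimension serves both a property sales fact table and a property advertising fact table, so results from the two can be lined up branch by branch. All names and rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One branch dimension, used unchanged by both fact tables.
CREATE TABLE branch_dim (branch_id INTEGER PRIMARY KEY, city TEXT);

CREATE TABLE property_sale_fact (
    branch_id     INTEGER REFERENCES branch_dim(branch_id),
    property_id   INTEGER,
    selling_price REAL,
    PRIMARY KEY (branch_id, property_id)
);
CREATE TABLE advert_fact (
    branch_id   INTEGER REFERENCES branch_dim(branch_id),
    advert_id   INTEGER,
    advert_cost REAL,
    PRIMARY KEY (branch_id, advert_id)
);

-- Invented sample data.
INSERT INTO branch_dim VALUES (1, 'CityA'), (2, 'CityB');
INSERT INTO property_sale_fact VALUES
    (1, 101, 100000.0), (1, 102, 50000.0), (2, 103, 80000.0);
INSERT INTO advert_fact VALUES
    (1, 1, 500.0), (1, 2, 250.0), (2, 3, 300.0);
""")

# Each fact table is summarised separately, then matched on the shared dimension.
sales_and_adverts = conn.execute("""
    SELECT b.city,
           (SELECT SUM(s.selling_price) FROM property_sale_fact AS s
             WHERE s.branch_id = b.branch_id),
           (SELECT SUM(a.advert_cost) FROM advert_fact AS a
             WHERE a.branch_id = b.branch_id)
    FROM branch_dim AS b
    ORDER BY b.branch_id
""").fetchall()

print(sales_and_adverts)
```

If each data mart kept its own differently keyed branch table, this comparison would need a mapping between the two, which is the situation conforming is meant to avoid.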

17 Star schemas for property sales and property advertising

18 Step 4: Choosing the facts The grain of the fact table determines which facts can be used in the data mart. Facts should be numeric and additive.
Unusable facts include:
– non-numeric facts
– non-additive facts
– facts at a different granularity from the other facts in the table
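The difference between an additive and a non-additive fact can be shown with a small worked example. The figures below are invented, and the commission rate is a hypothetical measure chosen only because ratios are a common case of non-additive facts.

```python
# Invented sample rows: (selling_price, commission_paid)
sales = [(100_000, 1_000), (300_000, 9_000)]

# Additive facts: summing them across records gives a meaningful total.
total_price = sum(price for price, _ in sales)
total_commission = sum(commission for _, commission in sales)

# Non-additive fact: a per-sale commission rate. Adding the rates is
# meaningless, and even averaging them gives the wrong overall rate.
rates = [commission / price for price, commission in sales]
mean_of_rates = sum(rates) / len(rates)

# The overall rate has to be derived from the additive facts instead.
overall_rate = total_commission / total_price

print(total_price, total_commission)
print(mean_of_rates, overall_rate)
```

Here the per-sale rates are 1% and 3%, so their mean is about 2%, while the overall rate computed from the totals is 2.5%. Storing the additive components and deriving the ratio at query time avoids the discrepancy.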

19 Star schema for property sales of DreamHome

20 Property rentals with fact table corrected

21 Step 5: Storing pre-calculations in the fact table Once the facts have been selected, each should be re-examined to determine whether there are opportunities to use pre-calculations. For example, in the lease fact table: total revenue = total rent - (client allowance + staff commission). It is also important to decide on the duration of the fact table, that is, how far back in time it goes: typically 1 to 2 years. If the duration is too long, the fact table becomes very large, older data can be difficult to access, and there may be problems in reading it.
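A pre-calculation of this kind can be sketched as a derived field that is computed once, when the fact record is created, and then stored alongside the other facts. The formula is the one given above; the class name LeaseFact, the field names and the sample figures are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LeaseFact:
    """Hypothetical lease fact record with one pre-calculated measure."""
    total_rent: int
    client_allowance: int
    staff_commission: int
    # Derived at load time and stored, so queries do not recompute it.
    total_revenue: int = field(init=False)

    def __post_init__(self):
        # total revenue = total rent - (client allowance + staff commission)
        self.total_revenue = self.total_rent - (
            self.client_allowance + self.staff_commission
        )


lease = LeaseFact(total_rent=1200, client_allowance=100, staff_commission=60)
print(lease.total_revenue)  # 1040
```

Storing the result costs a little extra space per record but removes the risk of different queries applying the formula inconsistently.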

22 Fact and dimension tables for each business process of DreamHome

23 Dimensional model (fact constellation) for the DreamHome data warehouse

