Loading Multiple Fact Data Structures in QlikView QlikView 8.50+

Slides:



Advertisements
Similar presentations
Testing Relational Database
Advertisements

Dimensional Modeling.
CHAPTER OBJECTIVE: NORMALIZATION THE SNOWFLAKE SCHEMA.
BY LECTURER/ AISHA DAWOOD DW Lab # 2. LAB EXERCISE #1 Oracle Data Warehousing Goal: Develop an application to implement defining subject area, design.
Dimensional Modeling CS 543 – Data Warehousing. CS Data Warehousing (Sp ) - Asim LUMS2 From Requirements to Data Models.
Dimensional Modeling – Part 2
Data Warehousing - 3 ISYS 650. Snowflake Schema one or more dimension tables do not join directly to the fact table but must join through other dimension.
CSE6011 Warehouse Models & Operators  Data Models  relations  stars & snowflakes  cubes  Operators  slice & dice  roll-up, drill down  pivoting.
Build a database IV: Create queries for a new Access database Overview: Ask your data — create queries It’s time to create queries, one of the most powerful.
1//hw Cherniak Software Development Corporation ARM Features Presentation Alacrity Results Management (ARM) Major Feature Description.
4-1 INTERNET DATABASE CONNECTOR Colorado Technical University IT420 Tim Peterson.
IMS 6217: Data Warehousing / Business Intelligence Part 3 1 Dr. Lawrence West, Management Dept., University of Central Florida Analysis.
1 Data Warehousing Lecture-13 Dimensional Modeling (DM) Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for Agro-Informatics Research.
Microsoft ® Access ® 2010 Training Create Queries for a New Database If a yellow security bar appears at the top of the screen in PowerPoint, click Enable.
DataMart - Advanced Presenter: Dave Bennett.  Advanced use of Datamart  Review - Datamart and its tools  Review - Finding data  Review -Connecting.
DIMENSIONAL MODELLING. Overview Clearly understand how the requirements definition determines data design Introduce dimensional modeling and contrast.
Examining data using Microsoft Access Queries Using Criteria and Calculations SESSION 3.2 This section covers specifying an exact match condition in a.
Chapter 1 Adamson & Venerable Spring Dimensional Modeling Dimensional Model Basics Fact & Dimension Tables Star Schema Granularity Facts and Measures.
BI Terminologies.
Decision Support and Date Warehouse Jingyi Lu. Outline Decision Support System OLAP vs. OLTP What is Date Warehouse? Dimensional Modeling Extract, Transform,
1 Chapter 10 Joins and Subqueries. 2 Joins & Subqueries Joins – Methods to combine data from multiple tables – Optimizer information can be limited based.
Ch3 Data Warehouse Dr. Bernard Chen Ph.D. University of Central Arkansas Fall 2009.
Designing a Data Warehousing System. Overview Business Analysis Process Data Warehousing System Modeling a Data Warehouse Choosing the Grain Establishing.
UNIT-II Principles of dimensional modeling
Chapter 4 Logical & Physical Database Design
Student Centered ODS ETL Processing. Insert Search for rows not previously in the database within a snapshot type for a specific subject and year Search.
1 Copyright © 2009, Oracle. All rights reserved. Oracle Business Intelligence Enterprise Edition: Overview.
CSC314 DAY 8 Introduction to SQL 1. Chapter 6 © 2013 Pearson Education, Inc. Publishing as Prentice Hall SQL OVERVIEW  Structured Query Language  The.
Physical Layer of a Repository. March 6, 2009 Agenda – What is a Repository? –What is meant by Physical Layer? –Data Source, Connection Pool, Tables and.
Database Constraints ICT 011. Database Constraints Database constraints are restrictions on the contents of the database or on database operations Database.
Extending and Creating Dynamics AX OLAP Cubes
John Tran Business Program Manager, The Suddath Companies
Databases: What they are and how they work
Logical Database Design and the Rational Model
SQL Query Getting to the data ……..
View Integration and Implementation Compromises
392-"JOIN" us on DataServers Journey
Dynamic SQL Writing Efficient Queries on the Fly
Creating & Editing Reports
Data Virtualization Demoette… Flat-File Data Sources
On-Line Analytic Processing
Data warehouse and OLAP
Advanced QlikView Performance Tuning Techniques
Physical Database Design and Performance
Using Partitions and Fragments
Dynamic SQL Writing Efficient Queries on the Fly
Data Warehouse.
Star Schema.
Entity Relationship Diagram
REPORT QUERY BUILDER featuring: Report Query Viewer.
Please use speaker notes for additional information!
Relational Database Model
Java Programming Arrays
Java Programming Loops
Chapter 4 Summary Query.
Dr. Bernard Chen Ph.D. University of Central Arkansas Fall 2009
Retail Sales is used to illustrate a first dimensional model
Dimensional Modeling.
Price and Volume Measures
Chapter 4 Company Code Global Parameters
Adding Multiple Logical Table Sources
Retail Sales is used to illustrate a first dimensional model
Contents Preface I Introduction Lesson Objectives I-2
Dimensional Model January 16, 2003
DWH – Dimesional Modeling
Price and Volume Measures
Joins and other advanced Queries
A – Pre Join Indexes.
Analysis Services Analysis Services vs. the Data Warehouse vs. OLTP DB
Getting Started with Data
Presentation transcript:

Loading Multiple Fact Data Structures in QlikView QlikView 8.50+ Phil Bishop Principal – Natural Synergies, Inc. phil.bishop@naturalsynergies.com

Goals and Agenda Introduce, define, and demonstrate Multiple Fact Table (a.k.a. Link Tables) Structure. Define limitations and benefits of Multiple Fact Table Design. Demonstrate use of generic keys. Demonstrate absolute market share using Set Analysis. Provide practice example for students to learn design techniques.

The Typical Question Users: Id UserId Password Level LastLogin Active Posted: Oct 10 2006 - 03:17 AM Subject: Help on Table definitions (and not only..) Hi, i'm working with QlikView Enterprise 7.5 and with sql server tables (loaded via ODBC) that contains my intranet data (calls, planning, fax and so on..) I have these tables: Users (the intranet users) Customers (our customers) Calls (the daily calls) Planning (our daily planning) i'm trying to build a working sheet that allow me, selecting data, users and customer to show the calls and appointments of that users with that customer. Well i'm in trouble.. i don't understand how to join correctly the Calls and the Planning table to users and customers. these are my tables: Users: Id UserId Password Level LastLogin Active Group CompleteName IsAGroup Customers: CUSTOMER_ID CUSTOMER_NAME ADDREESS CAP CITY PR P_IVA COD_FISCALE PHONE FAX Planning: UserId (ID of Users table) Date Time CUSTOMER_ID Note Calls: Date CUSTOMER_ID Operator UserId (ID of Users table) Argument 3

The Typical Question – A Possible Solution This structure will load, as is, into QlikView cleanly, with no circular loops, and 1 synthetic key: 4

The Typical Question – Add Product/Market However, if Product information is added, and Product is added to Calls, Market added to Planning: The structure quickly degrades to a circular loop and a loosely coupled table 5

Problems with traditional Data Structures Synthetic keys Greater possibility of circular loops Selection of one type of data precludes other types of data Difficult to add new types of data, especially fact Does not support Generic Keys cleanly Relationships are typically complex and hard to understand 6

Multiple Fact Table Structure An alternative approach for the QlikView logical data structure in complex applications. Multiple Fact Table Structures are designed to avoid the problems with traditional structures by creating a star schema in QlikView, separating data into Dimensions and a single Fact table. This is cleaner, easier, and more efficient than multiple Fact tables in a traditional link table structure. Then, associating these two types of data with “linking” tables to obtain Generic Key benefits 7

Multiple Fact Data Structures Design Technique Phil Bishop Principal – Natural Synergies, Inc. phil.bishop@naturalsynergies.com 8

Design Technique - Dimensions Dimension tables contain relatively static component data, such as Products, Customers, Employees, and Dates. All selectable fields (including chart dimensions) should reside in these tables (and optionally in a secondary link table). These tables will almost always have a single key field. It is possible that some dimension tables will share the same key field. Dimension tables can also include "counting“ tables, which contain the DISTINCT key field values and a constant (typically 1) used to sum the current number of selected values, e.g. sum(Physician_Count) Note: It is generally good practice in QlikView to try to stay as close to the STAR schema as possible, and avoid normalizing (or snowflaking) dimension tables. This will result in the shortest possible chain lengths (traverses) when resolving a chart expression. 9

Design Technique – Dimensions Example D_Territory: Load Territory as [%Territory], Territory, District, Region FROM [Sales Geography.csv] (ansi, txt, delimiter is ',', embedded labels); 10

Open LinkTable85_Exercise.qvw Review Dimension tab in load script 11

Design Technique - Fact The Fact table contains transaction data. The data is almost always time-based, and each row is typically defined by combinations of dimension keys, e.g. Date, Product, Customer (Physician in this example). The final load of the Fact Table data will contain all key fields. Use generic for key fields that do not exist in a particular fact table set of records. Other fields will include measure fields, e.g. TRx (sales amount) that will be aggregated in charts and other UI objects. In addition, flag fields should be included here. The Fact Table should NEVER contain selectable fields, since selection in one set of fact table records will cause almost all other fact table records to become non referenced (due to specific keys). 12

Design Technique – Fact Example All_Facts: Load // KEYS DateIndex as [%Date], $(v_A) $(v_S) PhysicianID as [%Territory_Physician], ProductID as [%Product], $(v_A) as [%DetailPositionLink], // FLAGS -1*(right(ProductID,2)='01') as flag_TRx_Company, -1*(DateIndex < 4) as flag_curr_3m, -1*(DateIndex > 3 and DateIndex < 7) as flag_prev_3m, applymap('M_Current_Year',DateIndex,0) as flag_curr_y, // MEASURES TRx FROM Sales_Source.qvd (qvd) WHERE TRx <> 0; Generic Key for sales territory Define flags here to avoid double counting w/ generic keys Should rarely, if ever need zero values 13

Open LinkTable85_Exercise.qvw Add additional Fact table records from Calls_Sample.qvd - Use [Fact Table: Call Activity] sheet - Concatenate to All_Facts table - Define all standard key fields - Define all standard flag fields - Use 1 as Sample_Count - Include SampleQuantity as a measure 14

Design Technique – Secondary Link Tables Secondary Link tables are optional, but serve a number of purposes. These tables define relationships between dimension and fact tables as well as between multiple dimension tables. For example, the relationship between territories and physicians will be defined in the secondary link table L2_Alignment. This feature allows a generic key that still honors the inherent data relationships and hierarchies. This is where all generic keys are defined. A generic key is a specific value (typically 'ALL', set in variable [v_A]) that replaces the existing value in the portion of the field (or the field itself) that connects to the FACT table. The generic value should never exist in the key field link to the dimension table. Selection fields are allowed in these tables, but are generally rare. Recommendation is to use DISTINCT on all loads 15

Design Technique – Secondary Link Tables Example This is link to Fact table. It should match Fact table keys. Example: L2_Detail_Position: Load DISTINCT [%DetailPosition] as [%DetailPositionLink], [%DetailPosition] Resident D_Detail_Position; // GENERIC DETAIL POSITION CONCATENATE (L2_Detail_Position) $(v_A) as [%DetailPositionLink], This is the Generic value. This key links directly to the Dimension table. These types of keys should never be generic. 16

Generic Keys Phil Bishop Principal – Natural Synergies, Inc. phil.bishop@naturalsynergies.com 17

Generic keys The main purpose of generic keys is to allow the use of a standard set of Fact table keys without requiring key values for all dimensions and all types of fact data. A generic key is a specific value (typically 'ALL', set in variable [v_A]) that replaces the existing value in the portion of the field (or the field itself) that connects to the Fact table. Make sure that the value you use will never occur in your actual data. The generic value should never exist in the key field link to the dimension table. 18

Generic keys - Single Key Field To create a generic value for a single key, simply repeat each record with the generic value. 19

Generic keys - Multiple Key Fields To create a generic value for multiple keys, reload data for all required combinations of generic data. 20

Open LinkTable85_Exercise.qvw Add load for generic Territory in table [L2_Alignment] - Use [Secondary Link Tables] sheet - Concatenate to L2_Alignment table - Use defined variables for generic key and separator 21

Calculating Absolute Market Share Using Set Analysis Phil Bishop Principal – Natural Synergies, Inc. phil.bishop@naturalsynergies.com 22

Set Analysis With Set Analysis, we no longer need to pre-aggregate denominator values for market share. Use a Set that defines the current selection minus any product selection for the denominator. In this case, we should also add a select for the associated Market in order to get the correct result regardless of selection state. 23

Set Analysis Calculating Market Share: Market Share = Sum of Product Sales / Sum of all Products in Market Sales In our example: Sum of Product Sales = Sum (TRx) Sum of all Products in Market Sales = Sum ({ $<Product=>} TRx) Since the calculation is taking place within a chart, we need to account for the dimensions using TOTAL <MonthYear,Market> { $<Product=,Market={'$(=concat(DISTINCT Market,chr(39)&','&chr(39)))'}>} Since it is possible that no Market is selected, we need to account for this in the Set Analysis by “selecting” the possible markets. Use the Concat() function to list the Market values associated with each Product (value in the dimension) 24

Set Analysis seems to use chart cache a little different Set Analysis Observations: Be careful of syntax Avoid using key fields Set Analysis seems to use chart cache a little different Understand use of TOTAL when working within chart dimensions Be careful of using Suppress When Value Is Null on chart dimensions 25

Exercise 4 Open LinkTable85_Exercise.qvw Client would like to add Goal data to this document. This is a rolling comparison of 24 months of sales data used to compare to the 24 months of current data for Company products. Goal data is located in Sales_Goal.qvd Goal data has Product and Physician (Customer), but no time dimension. Add a new tab in load script, and add Goal data to All_Facts Add script code to account for generic dates Add a bar chart comparing sum(TRx) to sum(TRx_Goal) by Product_Company 26