Cubes for Flat Table Land Michael P. Antonovich http://SharePointMike.wordpress.com #SharePointMikeA
My Published Books Speaker at: User’s Guide to the Apple ][ - 1983 FoxPro 2 Programming Guide – 1992 Debugging and Maintaining FoxPro – 1992 Using Visual FoxPro 3.0 – 1995 Using Visual FoxPro 5.0 – 1996 Office and SharePoint User’s Guide – 2007 Office and SharePoint User’s Guide – 2010 Speaker at: Code Camp 2009, 2010, 2011, 2012 Orlando SharePoint Saturday 2011 & 2012 Tampa, 2012 Orlando SQL Saturday - #1, #4, #8, #10, #15, #16, #21, #32, #38, #49, #62, #74, #79, #85, #86, #110, #130, #151, #168 IT PRCamp – Jacksonville 2012 OPASS Mtg October 24, 2012
Some Basic BI Terminology IMPORTANT BI TERMS Aggregate A mathematical function that allows you to summarize values of an attribute Dimension A dimension is essentially a look-up table that may define a hierarchy or drill-down path such as Year > Quarter > Month Measure A measure is something that identifies a value Fact A fact is another term for a measure that contains numeric data that can be grouped along one or more dimensional hierarchy Star Schema All dimension tables radiate out from a single fact table Snowflake Schema One fact table may relate to another fact table before relating to dimension tables. One dimension table can also have a related dimension table A Pivot table or chart is usually based around a single fact table OPASS Mtg October 24, 2012
Two Models in SSAS Multidimensional Model Tabular Model No major functionality changes since 2008 R2 Tabular Model Visually and functionally resembles PowerPivot 2012 Both can be installed as separate instances on the same server. OPASS Mtg October 24, 2012
Advantages of the MultiDimensional Model Tested technology since SQL 2000 Pre-calculated aggregates provide performance enhancements. Can handle larger data since it can store data on disk (MOLAP) or directly query the relational data source (ROLAP) Uses MDX which is supported by many 3rd party client tools. OPASS Mtg October 24, 2012
Disadvantages of the MultiDimensional Model Model is getting ‘old’ and is not being revised. (designed for 32 bit, row based data and disk storage). MDX is perceived as being difficult to learn. Processing a multidimensional model can result in substantial downtime for large models. Changes to one table require the entire model to be reprocessed. Not compatible with PowerPivot OPASS Mtg October 24, 2012
Advantages of the Tabular Model A 100% memory-based model provides greater performance. The xVelocity analytics column based engine offers significant query performance improvements. Queries and formulas use DAX which is ‘easier’ to learn than MDX. (MDX is also supported) Queries data from many different data sources. OPASS Mtg October 24, 2012
Disadvantages of the Tabular Model Does not support many-to-many relationships. Does not support true role-playing dimensions. Does not support cell-level security. Does not support security on measures. Does not support translations of metadata for locales. Does not support ragged hierarchies. OPASS Mtg October 24, 2012
Which to Choose? For most applications (60-70%) either model will work. Do you currently have a model in Multidimensional mode? Are you just learning Analysis Services? Licensing issues? Compatibility with PowerPivot? Hardware? Performance Issues? OPASS Mtg October 24, 2012
SSAS Tabular Uses DAX DAX Stands for Data Analysis Expressions DAX is used to: Create calculated columns Create custom measures OPASS Mtg October 24, 2012
Basic Syntax DAX Data Types DAX Operators Integer Real Currency DAX expressions always begin with an equal sign: = Column References can be qualified or unqualified TableName[ColumnName] [ColumnName] DAX Data Types DAX Operators Integer Real Currency Date(DateTime) TRUE/FALSE (Boolean) Text +, - *, / =, <> >,< >=, <= & AND && OR || NOT ! OPASS Mtg October 24, 2012
DAX Functions 2010 Version consisted of 135 functions 71 functions are similar to Excel functions 69 have the same name – 2 do not TEXT FORMAT DATEDIF YEARFRAC 64 functions are unique to DAX Aggregate data functions Date related functions 2012 Version has a little over 170 functions (no, I will not cover them all today) OPASS Mtg October 24, 2012
Types of DAX Calculations Simple Calculations Calculated columns within fact tables Calculated columns for dimension tables Calculated columns between tables Calculated columns to eliminate lookup tables Calculated columns to serve as links to tables using multiple columns (Calculated columns are calculated for every row in the table) Context is the row Aggregate Calculations Calculate unique measures Context is in the evaluation of the pivot data (Aggregate measures are only calculated for the displayed data in the Pivot table) OPASS Mtg October 24, 2012
Tabular Model Can Import From Microsoft Access 2003, 2007, 2010 Microsoft SQL Server 2005, 2008, 2008 R2 Oracle Relational DB 9i, 10g, 11g Teradata V2R6, V12 IBM Relational Database 8.1 Sybase Relational Databases Many other ODBC Databases Text files (.txt, .tab, .csv) Analysis Services Cubes from SQL Server Data Feeds using Atom 1.0 Format Excel Files from 97-2003, 2007, 2010 OPASS Mtg October 24, 2012
Demo 1a: Retrieve Data from Multiple Sources Open C:\Contoso2012\Stores.xlsx and rename to C:\Contoso2012\SQLSaturday1.xlsx Go to PowerPivot window and load SQL Server database: Contoso2012 using all tables Add to Data Model, Stores from the current spreadsheet. This is a linked table. Add Access database ProductCategories. Add Excel file: Geography.xlsx Use the following data sources: Open C:\Contoso2012\Stores.xlsx and rename to C:\Contoso2012\SQLSaturday1.xlsx Go to PowerPivot window and load SQL Server database: Contoso2012 using all tables Add to Data Model, Stores from the current spreadsheet. This is a linked table. Add Access database ProductCategories. Add Excel file: Geography.xlsx OPASS Mtg October 24, 2012
Demo 1a: Load Data OPASS Mtg October 24, 2012
Loading Data into the Tabular Model Demo OPASS Mtg October 24, 2012
Demo 1b: Create Relationships OPASS Mtg October 24, 2012
Demo 1c: Show Diagram View OPASS Mtg October 24, 2012
Creating Relations Between Tables Demo OPASS Mtg October 24, 2012
Technical vs. Useless Columns Technical Columns Used to link tables (IDs) Use to calculate other columns Hide from Pivot Table Field List Useless Columns Came in when data imported from data source Not used in pivot table or to link tables Delete to improve performance OPASS Mtg October 24, 2012
Demo 2: Eliminate Useless Columns and Hide Technical Columns OPASS Mtg October 24, 2012
Denormalize Data Model Eliminate tables and columns that are not going to be used. Flatten structure by created calculated dimension attributes based on values in other tables. Hide columns used in calculations but which users no longer need to see. OPASS Mtg October 24, 2012
Create a Hierarchy Predefine common hierarchies for users Hierarchies are defined from the largest grouping to the smallest: Product Category Product Subcategory Product After defining the hierarchy, you can remove the individual columns used to define the hierarchy. OPASS Mtg October 24, 2012
Demo 3: Define a Product Hierarchy OPASS Mtg October 24, 2012
Demo 4: Demo of Cube (so far) Using Excel Pivot OPASS Mtg October 24, 2012
Building Hierarchies Demo OPASS Mtg October 24, 2012
Create a Calculated Measure For those times when a built-in measure just isn’t enough… …you need a custom measure creating using DAX to satisfy the need. What is new in 2012 is that calculated measures can now be defined in the calculation area of the fact table. OPASS Mtg October 24, 2012
Creating Custom DAX Measures For example, suppose you want to display the percent increase or decrease in sales by product in your stores channel for this year vs last year. You need a new measure to calculate store sales: StoreSales:=CALCULATE(SUM([SalesAmount]),DimChannel[Ch annelName]="Store") By default, the above calculates sales for the entire table. However, in the pivot table, we can use the dimension: YEAR as a filter or slicer to perform the calculation by each year in the table. OPASS Mtg October 24, 2012
Dimensions Serve as Filters Use Time Functions to calculate measures for other time periods. LastYrSales:=CALCULATE([StoreSales], DATEADD(DimDate[Datekey],-1,year)) The above expression allows us to reference an existing expression but apply an additional filter to the calculation of StoreSales (which is already filtering on the channel: store). That additional filter in this case calculates the Store sales for the date one year prior to the current date of the record. OPASS Mtg October 24, 2012
Calculate the Percent Change in Sales Given the prior two calculated measures, store sales for the current year and store sales for the prior year for each period in the cube, you can now calculate the percent change using an expression like: YearlyGrowth:=([StoreSales]-[LastYrSales])/[LastYrSales] OPASS Mtg October 24, 2012
Using Error Checking Actually, the above sample works only because the slicer limited the calculations to specific years. However, in general, you need to check equations for error conditions like dividing by zero by using a formula more like: YearlyGrowth:=IF(ISBLANK([StoreSales]) || ISBLANK([LastYrSales]), 0, ([StoreSales]- [LastYrSales])/[LastYrSales]) OPASS Mtg October 24, 2012
Demo 5a: Define a Calculated Measure OPASS Mtg October 24, 2012
Demo 5b: The Pivot Table with Calculated Measures OPASS Mtg October 24, 2012
Turning a Calculated Measure into a KPI KPI are nothing more than calculated measures in a fact table that are compared to a target value to determine whether the value is good or bad. OPASS Mtg October 24, 2012
Adding KPI Calculations What is a KPI? Key Performance Indicator Key Performance Indicators provide information at a glance to indicate status of a measureable fact about your company/organization OPASS Mtg October 24, 2012
Adding KPI Calculations A KPI Needs: A Base Value A Target Value A number of status intervals Thresholds for each interval Symbols to use to indicate status OPASS Mtg October 24, 2012
Demo 6a: Using DAX to Create a KPI C:SQLSaturday/DAX/Vaccinations2Time =COUNTROWS(DISTINCT(Patient_Vaccinations[VisitID])) OPASS Mtg October 24, 2012
Demo 6b: Using DAX to Create a KPI C:SQLSaturday/DAX/Vaccinations2Time =COUNTROWS(DISTINCT(Patient_Vaccinations[VisitID])) OPASS Mtg October 24, 2012
Adding a KPI Demo OPASS Mtg October 24, 2012
Sorting by Other Fields You notice in the previous demo that while the rows displayed the sales by month, the months were sorted alphabetically, not chronologically. No one will accept that. How can you sort the months correctly. PowerPivot 2012 introduces a Sort by Another Column feature! OPASS Mtg October 24, 2012
Define a Calculated Column with the Month # OPASS Mtg October 24, 2012
Associate the Month Label with the New Column OPASS Mtg October 24, 2012
Demo 6c: Correctly Ordered Months C:SQLSaturday/DAX/Vaccinations2Time =COUNTROWS(DISTINCT(Patient_Vaccinations[VisitID])) OPASS Mtg October 24, 2012
Sorting on Alternate Columns Demo OPASS Mtg October 24, 2012
Useful Links My blog is running a series of articles on working with the Tabular model http://SharePointMike.wordpress.com Using the SSAS Tabular Model Week 1 http://sharepointmike.wordpress.com/2012/10/06/using-the-ssas-tabular-model-week-1/ Gathering Data From Different Data Sources Week 2 http://sharepointmike.wordpress.com/2012/10/13/using-the-ssas-tabular-model-week-2/ Displaying your first Pivot Table from a Tabular Model http://sharepointmike.wordpress.com/2012/10/20/using-the-ssas-tabular-model-week-3/ Hierarchies http://sharepointmike.wordpress.com/2012/10/27/using-the-ssas-tabular-model-week-4-hierarchies http://sharepointmike.wordpress.com/2012/11/03/using-the-ssas-tabular-model-week-5-hierarchies-2 KPIs http://sharepointmike.wordpress.com/2012/11/10/using-the-ssas-tabular-model-week-6-kpi Clean-Up in Week 7 http://sharepointmike.wordpress.com/2012/11/17/using-the-ssas-tabular-model-clean-up-in-week-7 DAX On-line Function Reference http://technet.microsoft.com/en-us/library/ee634396.aspx OPASS Mtg October 24, 2012
Got Questions? OPASS Mtg October 24, 2012
Thank You Michael P. Antonovich Mike@micmin.org Blog site: Don’t forget your evaluations. Michael P. Antonovich Mike@micmin.org Blog site: http://sharepointmike.wordpress.com/ OPASS Mtg October 24, 2012