Implementing Common Business Calculations in DAX


Implementing Common Business Calculations in DAX Chris Webb chris@crossjoin.co.uk www.crossjoin.co.uk

Who am I?
- Chris Webb (chris@crossjoin.co.uk)
- Independent consultant specialising in Analysis Services and MDX (and PowerPivot and DAX!): http://www.crossjoin.co.uk
- Blogger: http://cwebbbi.spaces.live.com
- Author:
  - MDX Solutions 2nd Edition
  - Expert Cube Development With Analysis Services 2008

Agenda
- What is DAX and why should I learn it?
- Calculated columns and calculated measures
- Row context and filter context
- Calculate()
- Values()
- Demos, demos, demos!

What is DAX?
- DAX is the new calculation language for PowerPivot
  - MDX is still the query language
- It is a multidimensional calculation language like MDX, but the design goals were:
  - Make it easy to do common calculations (easier than MDX)
  - Make it easy to use for Excel power users, hence the Excel-based syntax
- DAX expressions are limited to a single (though often very long) line of code

Why learn DAX?
- Your power users will be learning it, and you’ll need to understand what’s going on in their PowerPivot models
- It’s very likely that it will be used in future versions of SSAS, possibly as an alternative to MDX for some calculations
- “We love both DAX and MDX, and MDX is not going away. The challenge for us is to bring them together. We have some promising directions. That said, I think you’ll see DAX evolving in new important directions that MDX will never cover, and a large and growing portion of the calc work will be done in DAX. So my advice stands: all you guys need to become DAX gurus ASAP.” (Amir Netz, Microsoft)

DAX Syntax and Functions
- DAX syntax is based on Excel syntax
  - The only thing to watch out for is that you need to use the && and || operators for AND and OR
- Supports around 80 Excel functions
- Uses its own data types
  - In Excel you only have numbers or strings
- Uses the concept of BLANK rather than NULL
  - PowerPivot BLANKs behave in exactly the same way as SSAS nulls
- The UI for editing DAX is terrible; you can use Notepad++ and Colin Banfield’s DAX language template instead
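
As a rough illustration of the operator and BLANK points on this slide, a calculated column might look like the sketch below. It borrows the Consultants[Skill] and Consultants[Measure1] columns used later in the deck; the "Senior" and "Other" labels are invented for the example, and the formula is spread over several lines only for readability.

```dax
// Calculated column sketch: || is logical OR, && is logical AND
=IF(
    ISBLANK(Consultants[Measure1]),
    BLANK(),                      // return BLANK rather than a NULL
    IF(
        (Consultants[Skill] = "MDX" || Consultants[Skill] = "DAX")
            && Consultants[Measure1] > 3,
        "Senior",                 // hypothetical label
        "Other"                   // hypothetical label
    )
)
```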

Referencing Columns and Measures
- DAX usually requires fully qualified names, eg:
  - MyTable[MyColumn]
  - MyTable[MyMeasure]
- You can use unqualified names in calculated column definitions when referring to other columns in the same table
- DAX cannot use standard Excel cell references to data elsewhere in the workbook

Calculated Columns
- DAX can be used to create two types of calculation: calculated columns and calculated measures
- Calculated columns are derived columns in a PowerPivot table
  - They are defined inside the PowerPivot UI
  - After they are created, they behave like any other column
  - Their values are calculated immediately after the real data has been loaded into PowerPivot
- They can be used to do basic ETL work, eg concatenating first and last names, deriving years from dates etc
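
The two ETL-style examples mentioned on this slide could be sketched as follows. The Customers table and its FirstName and LastName columns are assumptions made for illustration; DimDate[FullDateAlternateKey] is the date column used later in the deck.

```dax
// Calculated column on a hypothetical Customers table:
// unqualified names work because both columns are in the same table
=[FirstName] & " " & [LastName]

// Calculated column on DimDate: derive the calendar year from a date
=YEAR(DimDate[FullDateAlternateKey])
```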

Calculated Measures
- Calculated measures provide the numeric values you see aggregated in pivot tables
  - All measures are calculated measures in PowerPivot!
- They are defined within the pivot table in Excel
- They are calculated at query time
- Basic sum, count, min, max and average calculated measures can be created from the right-click menu
- More advanced calculated measures can be created by entering your own DAX expression

Row Context and Filter Context
- Row context refers to the current row where a calculation is taking place
  - In a calculated measure there is no row context unless a DAX function that iterates over a table (such as FILTER()) creates one
- Filter context refers to the currently selected item on each column on each table
  - Very similar to the CurrentMember function in MDX, but handles multiselect elegantly
  - Filter context follows the one-to-many relationships between tables
- Both row and filter context must be taken into account when writing expressions for calculated measures
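
One way to see the difference is to compare a plain aggregation with an iterating one. The sketch below assumes FactInternetSales has OrderQuantity and UnitPrice columns, which are not mentioned in the deck itself.

```dax
// Measure: SUM aggregates a column over the rows that the
// current filter context leaves visible; no row context is involved
=SUM(FactInternetSales[SalesAmount])

// Measure: SUMX creates a row context, evaluates the expression once
// for each visible row of FactInternetSales, then sums the results
=SUMX(
    FactInternetSales,
    FactInternetSales[OrderQuantity] * FactInternetSales[UnitPrice]  // assumed columns
)
```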

Row and Filter Context Examples

Filter Context

Row Context

Row Context with Calculation

Calculate()
- The Calculate() function is the key to all advanced DAX calculation functionality
- Signature: Calculate(Expression, SetFilter1, SetFilter2,...)
- It allows any expression to be evaluated in a specific filter context
- It works as follows:
  - Modifies the current filter context according to the SetFilter arguments you pass in
  - Shifts the row context onto the filter context
  - Evaluates your expression in the new filter context
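
The "shifts the row context onto the filter context" step is easiest to see in a calculated column. The sketch below assumes both formulas are entered as calculated columns on DimDate and that DimDate has a one-to-many relationship to FactInternetSales.

```dax
// Version 1: a calculated column has a row context but no filter
// context, so every row of DimDate shows the grand total
=SUM(FactInternetSales[SalesAmount])

// Version 2: CALCULATE turns the current DimDate row into a filter
// context, which follows the relationship to FactInternetSales,
// so each row shows the sales for that date only
=CALCULATE(SUM(FactInternetSales[SalesAmount]))
```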

Calculate()
- SetFilter arguments can either be:
  - Boolean expressions, eg
      CALCULATE(COUNTROWS(Consultants), Consultants[Skill]="MDX")
  - Table functions, so the filter context for any table is set to the rows returned by the table function, eg
      CALCULATE(COUNTROWS(Consultants), FILTER(Consultants, Consultants[Measure1]>3))
- Useful table functions include:
  - Values(), which returns a list of distinct values in a column in the current filter context
  - All(), which returns a list of all the values in a column ignoring filter context
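
The two kinds of SetFilter argument are closely related: a Boolean expression on a single column behaves like a FILTER() over all the values of that column. A sketch using the Consultants table from the slide:

```dax
// Boolean SetFilter argument
=CALCULATE(COUNTROWS(Consultants), Consultants[Skill] = "MDX")

// The same filter written out as a table function
=CALCULATE(
    COUNTROWS(Consultants),
    FILTER(ALL(Consultants[Skill]), Consultants[Skill] = "MDX")
)
```

Both forms replace any existing filter on Consultants[Skill] rather than adding to it.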

Ratio to All and Ratio to Parent
- A common example of a ‘ratio to all’ or ‘ratio to parent’ calculation is market share
- The numerator for this kind of calculation is easy:
    SUM('Consultants'[Measure1])
- The denominator is the ‘all’ or ‘parent’ total
- To get the total value of a measure with all filter context cleared we need to use All(Table), eg:
    CALCULATE(SUM('Consultants'[Measure1]), ALL(Consultants))
- To get the total value of a measure with filter context for just one column removed, we need All(Column), eg:
    CALCULATE(SUM('Consultants'[Measure1]), ALL(Consultants[Consultant]))

Ratio to All and Ratio to Parent
- This makes the final calculations:
    =SUM('Consultants'[Measure1]) / CALCULATE(SUM('Consultants'[Measure1]), ALL(Consultants))
  and
    =SUM('Consultants'[Measure1]) / CALCULATE(SUM('Consultants'[Measure1]), ALL(Consultants[Consultant]))
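
If the denominator can be zero or empty, the ratio can be wrapped in a guard. This is a sketch of one possible variant, not part of the original slides; it relies on BLANK comparing as equal to 0 in DAX.

```dax
// Ratio to all, returning BLANK when the grand total is zero or empty
=IF(
    CALCULATE(SUM(Consultants[Measure1]), ALL(Consultants)) = 0,
    BLANK(),
    SUM(Consultants[Measure1])
        / CALCULATE(SUM(Consultants[Measure1]), ALL(Consultants))
)
```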

Previous Year Growth
- The basic algorithm to calculate year-on-year growth is:
    ((Sales for Current Time Period) - (Sales for Same Time Period in Previous Year)) / (Sales for Same Time Period in Previous Year)
- The big problem is how to calculate sales for the same time period in the previous year in the absence of SSAS-like hierarchies

Previous Year Growth
- The following is the best approach:
    =CALCULATE(SUM(FactInternetSales[SalesAmount])
    , DATEADD(DimDate[FullDateAlternateKey], -1, YEAR)
    , ALL(DimDate))
- Works by:
  - Finding all the dates in the current year
  - Shifting each date back one year
  - Setting this as the filter context
  - Summing SalesAmount for these dates
- BUT it will only give correct results if you have a complete set of dates in your year!

Previous Year Growth
- It is always a good idea in DAX to break complex calculations up into a series of simpler calculated measures
- So, if we use the previous formula to define a calculated measure called Previous Year Sales, the growth calculation becomes:
    =IF(FactInternetSales[Previous Year Sales]=0
    , BLANK()
    , (SUM(FactInternetSales[SalesAmount]) - FactInternetSales[Previous Year Sales])
      / FactInternetSales[Previous Year Sales])
- Notice that, like in MDX, we need to trap division by zero
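
As a possible alternative, the Previous Year Sales measure could be written with SAMEPERIODLASTYEAR(), which shifts the dates in the current context back one year in the same way as DATEADD(..., -1, YEAR). This is a sketch against the same tables, not the formula shown in the deck.

```dax
// Alternative definition of Previous Year Sales
=CALCULATE(
    SUM(FactInternetSales[SalesAmount]),
    SAMEPERIODLASTYEAR(DimDate[FullDateAlternateKey]),
    ALL(DimDate)   // still needed here because the join is on a surrogate key
)
```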

Rules for Time Intelligence functions
Five rules for using Time Intelligence functions:
1. Never use the datetime column from the fact table in time functions.
2. Always create a separate Time table, and make sure it contains complete years.
3. Create relationships between fact tables and the Time table.
4. Make sure that relationships are based on a datetime column (and NOT based on another artificial key column).
5. The datetime column in the Time table should be at day granularity (without fractions of a day).

If you don’t follow the rules...
- As we’ve already seen, not having complete years makes it hard to do relative time calculations
- In my example, since I joined on a surrogate key and not a DateTime key, I needed to add All(DimDate) to my calculation
  - You can’t use RELATED() to bring the date down to the fact table either, because this causes a circular reference error!
  - You need to either alter the underlying relational table or view, or import the date dimension table twice
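
For comparison, here is a sketch of the previous-year calculation as it might look when the rules are followed. It assumes the fact table is related to DimDate on the datetime column itself, in which case the extra All(DimDate) argument should not be required.

```dax
// Assumes FactInternetSales is joined to DimDate on
// DimDate[FullDateAlternateKey] rather than on a surrogate key
=CALCULATE(
    SUM(FactInternetSales[SalesAmount]),
    DATEADD(DimDate[FullDateAlternateKey], -1, YEAR)
)
```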

Year to Date
- Luckily there are many built-in time intelligence functions for common calculations
- Eg for doing year-to-date sums, we have TotalYTD:
    =TOTALYTD(
      SUM(FactInternetSales[SalesAmount])
    , DimDate[FullDateAlternateKey]
    , ALL(DimDate))
- In this case, TotalYTD is a variant of Calculate with some filters set automatically
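
To make the "variant of Calculate" point concrete, the same year-to-date sum can be sketched with CALCULATE() and the DATESYTD() table function:

```dax
// Equivalent year-to-date sum written with CALCULATE
=CALCULATE(
    SUM(FactInternetSales[SalesAmount]),
    DATESYTD(DimDate[FullDateAlternateKey]),  // dates from the start of the year to the last date in context
    ALL(DimDate)
)
```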

Total to Date
- A total-to-date gets the running total from the first date we have in our data
- There is no built-in function for it
- We need to use Calculate and set the filter to all dates from the first ever date to the last date in the current context
- We can use DATESBETWEEN to get this date range
  - The first date can be obtained by passing BLANK() as the start date
  - The last date can be obtained with the LASTDATE() function

Total to Date
- The final version is:
    =CALCULATE(
      SUM(FactInternetSales[SalesAmount])
    , DATESBETWEEN('DimDate'[FullDateAlternateKey]
      , BLANK()
      , LASTDATE('DimDate'[FullDateAlternateKey]))
    , All('DimDate'))
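
The same running total can also be sketched without DATESBETWEEN(), by filtering all dates up to the latest date in the current context. This is an alternative formulation offered for comparison, not the one from the deck.

```dax
// Running total: keep every date on or before the last date in context
=CALCULATE(
    SUM(FactInternetSales[SalesAmount]),
    FILTER(
        ALL(DimDate),
        // MAX is evaluated in the original filter context,
        // so it returns the last date currently selected
        DimDate[FullDateAlternateKey] <= MAX(DimDate[FullDateAlternateKey])
    )
)
```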

Values() – like CurrentMember but better!
- The Values() function acts like the MDX CurrentMember function
- But it is better: Values() returns a table, so it handles multiselect
  - Although when the table it returns only contains one row we can still do a direct comparison with another value
- The Distinct() function works exactly the same way as Values() but:
  - Values() will return the Unknown Member
  - Distinct() will not
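
A sketch of the single-row comparison described on this slide, using the Consultants[Skill] column from earlier; the text labels returned are invented for the example.

```dax
// Only compare VALUES() with a scalar when exactly one skill is in context
=IF(
    COUNTROWS(VALUES(Consultants[Skill])) = 1,
    IF(
        VALUES(Consultants[Skill]) = "MDX",
        "MDX selected",        // hypothetical label
        "Other skill selected" // hypothetical label
    ),
    BLANK()                    // multiselect or no selection
)
```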

Distinct Count
- To find a distinct count, we need to count the rows in a table containing only the distinct values from a column
- The Values() function returns such a table
- Because it’s a table, all we need to do is count the number of rows in it:
    =COUNTROWS(VALUES(Consultants[Skill]))
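
If the Unknown Member should be left out of the count, Distinct() can be swapped in for Values(), following the difference between the two functions described in the deck. Later releases of DAX also added a dedicated DISTINCTCOUNT() function, which is not available in the version discussed here.

```dax
// Distinct count of skills that ignores the Unknown Member row
=COUNTROWS(DISTINCT(Consultants[Skill]))
```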

Ranks
- There is no rank function in PowerPivot
- So, for example, if we wanted to calculate a rank for dates by sales, we would need to use the following approach for each date:
  1. Find a table containing the complete list of all dates
  2. Filter that list of all dates to find those whose sales were higher
  3. Count the number of rows in that table
  4. Add 1

Ranks
- This gives, as a first attempt:
    =COUNTROWS(
      FILTER(ALL(DimDate[DateKey])
      , (DimDate[DailySales](VALUES(DimDate[DateKey])) < DimDate[DailySales])))
    + 1

Rank only for Dates
- With this knowledge, we can avoid doing the calculation when more than one date is in the filter context:
    =IF(
      COUNTROWS(VALUES(DimDate[DateKey]))>1
    , BLANK()
    , COUNTROWS(
        FILTER(ALL(DimDate[DateKey])
        , (DimDate[DailySales](VALUES(DimDate[DateKey])) < DimDate[DailySales])))
      + 1)
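
The DimDate[DailySales](...) syntax in the rank formula is a shorthand for wrapping the measure in CALCULATE(). The sketch below spells that out, and then shows roughly the same rank using RANKX(), a function that was added to DAX in later versions and is not available in the release covered by this deck.

```dax
// Same rank logic with the Measure(filter) shorthand written as CALCULATE
=IF(
    COUNTROWS(VALUES(DimDate[DateKey])) > 1,
    BLANK(),
    COUNTROWS(
        FILTER(
            ALL(DimDate[DateKey]),
            // sales for the date in the pivot cell < sales for the date being iterated
            CALCULATE(DimDate[DailySales], VALUES(DimDate[DateKey]))
                < DimDate[DailySales]
        )
    ) + 1
)

// Later DAX versions only: RANKX ranks the current date's DailySales
// against all dates, highest sales first by default
=IF(
    COUNTROWS(VALUES(DimDate[DateKey])) > 1,
    BLANK(),
    RANKX(ALL(DimDate[DateKey]), DimDate[DailySales])
)
```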

Time Utility dimensions in PowerPivot
- Time Utility dimensions are an SSAS technique for applying one calculation to multiple measures
  - Also known as Date Tool or Shell dimensions
- Since PowerPivot does not allow calculated members on anything other than the Measures dimension, it isn’t possible to create a true Time Utility dimension
- But you can build something almost the same:
  - Create a calculated measure that allows you to select which real measure you want
  - Create a calculated measure that allows you to select which calculation you want, and apply it to the previous calculated measure
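
The two-measure approach above might be sketched as follows. Everything about the selector tables is an assumption: MeasureSelector[MeasureName] and CalcSelector[CalcName] are hypothetical tables with no relationships to the rest of the model, the first measure is assumed to be named Selected Measure, and FactInternetSales[OrderQuantity] is an assumed column.

```dax
// Measure 1, named "Selected Measure": picks the real measure
// according to the item chosen in the hypothetical MeasureSelector table
=IF(
    COUNTROWS(VALUES(MeasureSelector[MeasureName])) = 1,
    IF(
        VALUES(MeasureSelector[MeasureName]) = "Sales Amount",
        SUM(FactInternetSales[SalesAmount]),
        SUM(FactInternetSales[OrderQuantity])    // assumed column
    ),
    SUM(FactInternetSales[SalesAmount])          // default when there is no single selection
)

// Measure 2: applies the calculation chosen in the hypothetical
// CalcSelector table to Measure 1
=IF(
    COUNTROWS(VALUES(CalcSelector[CalcName])) = 1,
    IF(
        VALUES(CalcSelector[CalcName]) = "Year to Date",
        TOTALYTD(
            FactInternetSales[Selected Measure],
            DimDate[FullDateAlternateKey],
            ALL(DimDate)
        ),
        FactInternetSales[Selected Measure]      // any other item: no calculation applied
    ),
    FactInternetSales[Selected Measure]          // default when there is no single selection
)
```

Putting the selector columns on slicers or report filters then lets one pair of measures stand in for every measure and calculation combination.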

Final Thoughts
- DAX does the easy stuff very easily
- DAX does the medium-hard stuff well too
  - Not sure if power users will get it, but:
    - It’s easier than MDX
    - It’s more SQL-like, so more developers will get it
    - It’s more elegant than MDX in many ways, eg multiselect
- DAX does not do the really hard stuff well at all
  - So for financial apps, for example, MDX still wins
  - But for how long?
- I like DAX a lot

Links
- http://blogs.msdn.com/powerpivot/
- http://www.powerpivot.com/
- http://powerpivotgeek.com
- http://powerpivottwins.com/
- http://powerpivotpro.com/
- http://www.powerpivot-info.com/
- http://cwebbbi.spaces.live.com
- http://social.msdn.microsoft.com/Forums/en-US/sqlkjpowerpivotforexcel/threads
- http://www.business-intelligence.kdejonge.net/
- http://sqlblog.com/blogs/marco_russo/

Thanks!
