Sqlbi.com. sqlbi.com When Many To Many Are Really Too Many Alberto Ferrari Senior Consultant SQLBI.COM.


1 sqlbi.com

2 sqlbi.com When Many To Many Are Really Too Many Alberto Ferrari Senior Consultant SQLBI.COM

3 Who’s Speaking? BI Expert and Consultant • Problem Solving • Complex Project Assistance • Data Warehouse Assessments and Development • Courses, Trainings and Workshops. SQLBI: Branch of a Microsoft BI Gold Partner. Book Author, SSAS Maestro. alberto.ferrari@sqlbi.com. «Spaghetti English» • Get prepared, I can’t help it

4 Agenda • Many To Many Relationships: Data Model, DAX Formula pattern • Classical Many To Many • Cascading Many To Many • Survey Data Model • Basket Analysis

5 Prerequisites • We will use, but not describe: Tabular basics; DAX basics; evaluation contexts and interactions with relationships; data modeling with Many To Many • After all … this is a 400 session

6 Current Accounts Model M2M

7 No support for M2M in Tabular • Facts: Multidimensional handles M2M relationships; Tabular does not • Thus, my fastest ever session ends here • Or … we can dive into DAX and play with it

8 Demo – Classical M2M • We will always start by looking at the final result • Then, we dive into the DAX code

9 The M2M DAX Pattern • Leverages: CALCULATE; row contexts; filter contexts; automatic transformation of row context into filter context using CALCULATE • Next slide: the formula. Keep it in mind: it will appear quite often from now on

10 The DAX Formula of M2M Pattern AmountM2M := CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, CALCULATE ( COUNTROWS (Bridge_AccountCustomer) ) > 0 ) )

11 What the formula should perform • Filter on Dim_Customer • Filter on Dim_Account

12 How the formula works AmountM2M := CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, CALCULATE ( COUNTROWS (Bridge_AccountCustomer) ) > 0 ) ) • The filter is applied by the Tabular data model

13 How the formula works AmountM2M := CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, CALCULATE ( COUNTROWS (Bridge_AccountCustomer) ) > 0 ) ) • The filter is applied by the Tabular data model • FILTER: for each row in Dim_Account • CALCULATE: filters the bridge

14 How the formula works AmountM2M := CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, CALCULATE ( COUNTROWS (Bridge_AccountCustomer) ) > 0 ) ) • The filter is applied by the Tabular data model • FILTER: for each row in Dim_Account • CALCULATE: filters the bridge • Only rows in Dim_Account where COUNTROWS () > 0 survive the FILTER

15 How the formula works AmountM2M := CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, CALCULATE ( COUNTROWS (Bridge_AccountCustomer) ) > 0 ) ) • This filter is applied by the Tabular data model • FILTER: for each row in Dim_Account • CALCULATE: filters the bridge • Only rows in Dim_Account where COUNTROWS () > 0 survive the FILTER • SUM is evaluated only for the accounts filtered from the customers
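
The steps on slides 12 to 15 can be traced on a toy data set. The following is a minimal plain-Python emulation, not DAX and not the speaker's demo: the table contents, the customer names and the function name amount_m2m are all made up for illustration. It only mimics the order in which the filters take effect.

```python
# Toy tables; every name and value here is an assumption for illustration.
dim_account = ["A1", "A2", "A3"]
bridge_account_customer = [
    ("A1", "Mark"), ("A2", "Mark"), ("A2", "Paul"), ("A3", "Luke"),
]
fact_transaction = [
    {"account": "A1", "amount": 100},
    {"account": "A2", "amount": 50},
    {"account": "A3", "amount": 30},
]


def amount_m2m(selected_customers):
    # Step 1: the customer selection reaches the bridge through the
    # Dim_Customer -> Bridge relationship (done by the data model).
    visible_bridge = [
        (a, c) for a, c in bridge_account_customer if c in selected_customers
    ]
    # Step 2: FILTER iterates Dim_Account; the inner CALCULATE restricts the
    # bridge to the current account, and only accounts with > 0 rows survive.
    accounts = {
        account
        for account in dim_account
        if sum(1 for a, _ in visible_bridge if a == account) > 0
    }
    # Step 3: the outer CALCULATE sums the facts of the surviving accounts.
    return sum(t["amount"] for t in fact_transaction if t["account"] in accounts)


print(amount_m2m({"Mark"}))          # 150
print(amount_m2m({"Paul"}))          # 50
print(amount_m2m({"Mark", "Paul"}))  # 150, account A2 is counted only once
```

The last call shows the non-additive behaviour that is typical of many-to-many: the total for Mark and Paul together is not the sum of their individual totals, because they share account A2.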

16 The Karma of CALCULATE AmountM2M_Wrong := CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, COUNTROWS (Bridge_AccountCustomer) > 0 ) ) AmountM2M_Correct := CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, CALCULATE ( COUNTROWS (Bridge_AccountCustomer) ) > 0 ) )

17 Wrong formula in action AmountM2M := CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, COUNTROWS (Bridge_AccountCustomer) > 0 ) ) • All the rows in the Account table survived the FILTER

18 Many-to-many in action AmountM2M := CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, CALCULATE ( COUNTROWS (Bridge_AccountCustomer) ) > 0 ) ) • Two rows in the Account table survived the FILTER

19 The Karma of CALCULATE AmountM2M_Wrong := CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, COUNTROWS (Bridge_AccountCustomer) > 0 ) ) AmountM2M_Correct := CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, CALCULATE ( COUNTROWS (Bridge_AccountCustomer) ) > 0 ) )
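
Why the missing CALCULATE matters can be shown with a small plain-Python emulation. This is only a sketch on invented data: the flag context_transition is a hypothetical stand-in for "the row context on Dim_Account is turned into a filter on the bridge", which is what the inner CALCULATE does in the correct formula.

```python
# Toy tables; names and values are assumptions for illustration.
dim_account = ["A1", "A2", "A3"]
bridge = [("A1", "Mark"), ("A2", "Mark"), ("A2", "Paul"), ("A3", "Luke")]


def surviving_accounts(selected_customers, context_transition):
    # The customer filter context already restricts the bridge.
    visible = [(a, c) for a, c in bridge if c in selected_customers]
    survivors = []
    for account in dim_account:  # FILTER: row context on Dim_Account
        if context_transition:
            # With CALCULATE: the current account also filters the bridge.
            count = sum(1 for a, _ in visible if a == account)
        else:
            # Without CALCULATE: the row context does not filter the bridge,
            # so every account sees the same count.
            count = len(visible)
        if count > 0:
            survivors.append(account)
    return survivors


print(surviving_accounts({"Mark"}, context_transition=False))  # ['A1', 'A2', 'A3']
print(surviving_accounts({"Mark"}, context_transition=True))   # ['A1', 'A2']
```

With the toy data, the variant without context transition keeps every account, and the variant with it keeps only the two accounts that belong to the selected customer, which mirrors the two outcomes described on slides 17 and 18.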

20 Many to Many - Conclusions • Complexity: Not in the data model. Only in the formula. Good understanding of CALCULATE and of relationships in Tabular needed. This is just the beginning of our journey.

21 Cascading Many To Many

22 Filter on Dim_Category • Filter on Dim_Account • The pattern is the same, but this time we need to jump two steps to complete our task

23 Cascading M2M- Demo

24 Cascading Many To Many CALCULATE ( CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER ( Dim_Account, CALCULATE ( COUNTROWS (Bridge_AccountCustomer) ) > 0 ) ), FILTER ( Dim_Customer, CALCULATE ( COUNTROWS (Bridge_CustomerCategory) ) > 0 ) )

25 Cascading Many To Many • Generic Formula • Works with any number of steps • Be careful: always start with the farthest table, and move one step at a time in the direction of the fact table; one CALCULATE for each step • Complexity: M x N (geometric … )
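
The "one step at a time, starting from the farthest table" rule can be sketched in plain Python. This is an emulation on invented data, not the DAX engine: the helper name step, the category values and the table contents are assumptions. Each call to step plays the role of one CALCULATE/FILTER pair in the cascading formula.

```python
# Toy tables; names and values are assumptions for illustration.
bridge_customer_category = [("Mark", "Gold"), ("Paul", "Silver"), ("Luke", "Gold")]
bridge_account_customer = [("A1", "Mark"), ("A2", "Mark"), ("A2", "Paul"), ("A3", "Luke")]
fact_transaction = [("A1", 100), ("A2", 50), ("A3", 30)]


def step(bridge, selected_far):
    """One M2M hop: keep the near-side keys that have at least one bridge
    row whose far-side key is currently selected."""
    return {near for near, far in bridge if far in selected_far}


def amount_cascading(selected_categories):
    # Farthest table first: category -> customer ...
    customers = step(bridge_customer_category, selected_categories)
    # ... then one step closer to the fact table: customer -> account.
    accounts = step(bridge_account_customer, customers)
    return sum(amount for account, amount in fact_transaction if account in accounts)


print(amount_cascading({"Gold"}))    # 180
print(amount_cascading({"Silver"}))  # 50
```

Adding a further bridge would mean adding one more call to step before the existing ones, which is the Python counterpart of wrapping the formula in one more CALCULATE.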

26 Cascading Alternative Need for some ETL here

27 Cascading Alternative The bridge now «feels» three different filters, but the formula becomes a classical many to many and runs faster

28 Flattened Cascading Many To Many • Flattened Data Model • Faster than the cascading one • Simpler formula • Needs ETL • Worth a try in Multidimensional too …
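
The ETL step behind the flattened model is not shown in the slides. One possible reading, sketched here in plain Python on invented data, is a join of the two bridges on the customer key, so that a single bridge row carries account, customer and category. The names flat_bridge and amount_flattened are hypothetical.

```python
# Toy tables; names and values are assumptions for illustration.
bridge_account_customer = [("A1", "Mark"), ("A2", "Mark"), ("A2", "Paul"), ("A3", "Luke")]
bridge_customer_category = [("Mark", "Gold"), ("Paul", "Silver"), ("Luke", "Gold")]
fact_transaction = [("A1", 100), ("A2", 50), ("A3", 30)]

# ETL step (assumed): join the two bridges on the customer key.
flat_bridge = sorted(
    (account, customer, category)
    for account, customer in bridge_account_customer
    for cust, category in bridge_customer_category
    if cust == customer
)


def amount_flattened(selected_categories):
    # A single hop is now enough: the flattened bridge is filtered directly
    # by category, as in the classical many-to-many pattern.
    accounts = {a for a, _, cat in flat_bridge if cat in selected_categories}
    return sum(amount for a, amount in fact_transaction if a in accounts)


print(len(flat_bridge))              # 4
print(amount_flattened({"Gold"}))    # 180
print(amount_flattened({"Silver"}))  # 50
```

The trade-off is the one listed on the slide: the formula needs only one step, but the flattened bridge has to be produced and kept up to date outside the model.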

29 Survey

30 Survey Data Model • Facts: One customer answers many questions. One question is answered by many customers. We want to use a PivotTable to query this.

31 Survey Scenario • Question: «What is the yearly income of consultants?» In other words … • Take the customers who answered «Consultant» to the «Job» question • Slice them using the answer to the question «Yearly Income»

32 Survey - Demo

33 Survey Analytical Data Model • Two «Filter» Dimensions: one for «Job» = «Consultants», one for «Yearly Income» • Question1 = «Job» • Answer1 = «Consultant» • Question2 = «Yearly Income»

34 Survey: The Final Result • Value = Customer Count • The fact table, this time, acts as the bridge

35 Survey Analytical Data Model • No relationship between the fact table and the two filter dimensions, because there is only one value for ID_Answer

36 Survey Analytical Data Model • The two filters: are applied on the Customers table; use separate instances of the Answers table

37 Survey: The DAX Formula IF ( COUNTROWS (VALUES (Filter1[ID_Answer])) = 1 && COUNTROWS (VALUES (Filter2[ID_Answer])) = 1, CALCULATE ( COUNTROWS (Customers), FILTER ( Customers, CALCULATE ( COUNTROWS (Answers), Answers[ID_Answer] = VALUES (Filter2[ID_Answer]) ) > 0 ), FILTER ( Customers, CALCULATE ( COUNTROWS (Answers), Answers[ID_Answer] = VALUES (Filter1[ID_Answer]) ) > 0 ) ) ) • Additional conditions to set the relationships with two tables in DAX only

38 Survey: The DAX Formula IF ( COUNTROWS (VALUES (Filter1[ID_Answer])) = 1 && COUNTROWS (VALUES (Filter2[ID_Answer])) = 1, CALCULATE ( COUNTROWS (Customers), FILTER ( Customers, CALCULATE ( COUNTROWS (Answers), Answers[ID_Answer] = VALUES (Filter2[ID_Answer]) ) > 0 ), FILTER ( Customers, CALCULATE ( COUNTROWS (Answers), Answers[ID_Answer] = VALUES (Filter1[ID_Answer]) ) > 0 ) ) ) • The two references to «Answers» work on separate instances of the same table
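
What the survey formula computes can be emulated in a few lines of plain Python. This is a sketch on invented data (customers C1 to C3, the income bands and the function name customer_count are assumptions): each of the two filters selects the customers who gave a specific answer, and the result is the size of the intersection.

```python
# Toy fact table (customer, question, answer); all values are assumptions.
answers = [
    ("C1", "Job", "Consultant"), ("C1", "Yearly Income", "50-100K"),
    ("C2", "Job", "Consultant"), ("C2", "Yearly Income", "100-150K"),
    ("C3", "Job", "Teacher"), ("C3", "Yearly Income", "50-100K"),
]


def customers_who_answered(question, answer):
    # Plays the role of one FILTER ( Customers, CALCULATE ( COUNTROWS (Answers), ... ) > 0 ).
    return {c for c, q, a in answers if q == question and a == answer}


def customer_count(filter1, filter2):
    # The two filters use separate passes over the same fact table,
    # and only customers that satisfy both are counted.
    return len(customers_who_answered(*filter1) & customers_who_answered(*filter2))


print(customer_count(("Job", "Consultant"), ("Yearly Income", "50-100K")))   # 1
print(customer_count(("Job", "Consultant"), ("Yearly Income", "100-150K")))  # 1
print(customer_count(("Job", "Teacher"), ("Yearly Income", "100-150K")))     # 0
```

The fact table acts as the bridge here: it is the only place where a customer is linked to an answer, and it is scanned once per filter.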

39 Survey - Conclusions • Very powerful data model • Very compact: duplicates only dimensions; different from the same pattern in Multidimensional • Handles any number of questions: we have shown two, but it is not a limit • Interesting topics: fact table as the bridge; relationships set in DAX; can be queried with a simple PivotTable

40 Basket Analysis

41 Basket Analysis: The Scenario Of all the customers who have bought a Mountain Bike, how many have never bought a mountain tire tube?

42 Basket Analysis in SQL • Two iterations over the fact table needed: SELECT COUNT (DISTINCT A.CustomerID) FROM FactSales A INNER JOIN FactSales B ON A.CustomerID = B.CustomerID WHERE A.ProductModel = 'MOUNTAIN TIRE TUBE' AND A.Year <= 2004 AND B.ProductModel = 'MOUNTAIN-100' AND B.Year <= 2004

43 Look at the query plan … This is the fact table … Do you really want to self-join it?

44 Basket Analysis: The Data Model Of all the customers who have bought a Mountain Bike, how many have never bought a mountain tire tube? We can filter «Mountain Tire Tube» with this table but… where do we filter «Mountain Bike»?

45 Basket Analysis: The Data Model 1° filter: «Mountain Tire Tube» 2° filter: «Mountain Bike» No relationships

46 Basket Analysis- Demo

47 The Final Result • 1° filter: «Mountain Bike» • 2° filter: «Mountain Tire Tube» • HavingProduct = Bought Both • NotHavingProduct = Bought Bike, No Tire

48 HavingProduct := CALCULATE ( COUNTROWS (DISTINCT (FactSales[CustomerKey])), FILTER ( ALL (DimTime), DimTime[TimeKey] < […] > 0 ) )

49 Count the number of customers…

50 HavingProduct := CALCULATE ( COUNTROWS (DISTINCT (FactSales[CustomerKey])), FILTER ( ALL (DimTime), DimTime[TimeKey] < […] > 0 ) ) • In the period of time before the end of the currently selected filter

51 HavingProduct := CALCULATE ( COUNTROWS (DISTINCT (FactSales[CustomerKey])), FILTER ( ALL (DimTime), DimTime[TimeKey] < […] > 0 ) ) • This filter needs a bit of attention…

52 FILTER ( DimCustomer, SUMX ( ProductFilter, CALCULATE ( COUNTROWS (FactSales), ALL (FactSales), FactSales[CustomerKey] = EARLIER (DimCustomer[CustomerKey]), FactSales[ProductKey] = EARLIER (ProductFilter[ProductKey]), FILTER ( ALL (DimTime), DimTime[TimeKey] < […] > 0 ) ) • Only the customers

53 FILTER ( DimCustomer, SUMX ( ProductFilter, CALCULATE ( COUNTROWS (FactSales), ALL (FactSales), FactSales[CustomerKey] = EARLIER (DimCustomer[CustomerKey]), FactSales[ProductKey] = EARLIER (ProductFilter[ProductKey]), FILTER ( ALL (DimTime), DimTime[TimeKey] < […] > 0 ) ) • Only the customers • Where, for each filtered product

54 FILTER ( DimCustomer, SUMX ( ProductFilter, CALCULATE ( COUNTROWS (FactSales), ALL (FactSales), FactSales[CustomerKey] = EARLIER (DimCustomer[CustomerKey]), FactSales[ProductKey] = EARLIER (ProductFilter[ProductKey]), FILTER ( ALL (DimTime), DimTime[TimeKey] < […] > 0 ) ) • Only the customers • Where, for each filtered product • The number of rows in the fact table, for that product

55 FILTER ( DimCustomer, SUMX ( ProductFilter, CALCULATE ( COUNTROWS (FactSales), ALL (FactSales), FactSales[CustomerKey] = EARLIER (DimCustomer[CustomerKey]), FactSales[ProductKey] = EARLIER (ProductFilter[ProductKey]), FILTER ( ALL (DimTime), DimTime[TimeKey] < […] > 0 ) ) • Only the customers • Where, for each filtered product • The number of rows in the fact table, for that product • Is greater than zero, i.e. there exists at least one sale

56 FILTER ( DimCustomer, SUMX ( ProductFilter, CALCULATE ( COUNTROWS (FactSales), ALL (FactSales), FactSales[CustomerKey] = EARLIER (DimCustomer[CustomerKey]), FactSales[ProductKey] = EARLIER (ProductFilter[ProductKey]), FILTER ( ALL (DimTime), DimTime[TimeKey] < […] > 0 ) ) • Only the customers • Where, for each filtered product • The number of rows in the fact table, for that product • Is greater than zero, i.e. there exists at least one sale • These conditions are needed because there are no relationships between Filter and the fact table. Thus, we add them manually in DAX

57 Ok, but «NotHavingProduct»? NotHavingProduct := CALCULATE ( COUNTROWS ( FILTER ( DimCustomer, CALCULATE (COUNTROWS (FactSales)) = 0 ) ), FILTER ( ALL (DimTime), DimTime[TimeKey] < […] > 0 ) ) • Instead of: COUNTROWS ( DISTINCT (FactSales[CustomerKey]) )

58 Nice, but where are the M2M? NotHavingProduct := CALCULATE ( COUNTROWS ( FILTER ( DimCustomer, CALCULATE (COUNTROWS (FactSales)) = 0 ) ), FILTER ( ALL (DimTime), DimTime[TimeKey] < […] > 0 ) ) • 1° M2M Pattern • 2° M2M Pattern

59 Denali new features • Multiple relationships between tables: role dimensions, role keys • Active / Inactive relationships • USERELATIONSHIP: new filter function; selects a model relationship to use in calculations • These additions make the data model much easier

60 Inactive Relationships in Denali Inactive Relationships

61 The Formula with Denali COUNTROWS ( FILTER ( DimCustomer, CALCULATE ( COUNTROWS (FactSales), USERELATIONSHIP ( FactSales[ProductKey], ProductFilter[ProductKey]) ) > 0 && CALCULATE ( COUNTROWS (FactSales), USERELATIONSHIP ( FactSales[ProductKey], DimProduct[ProductKey]) ) = 0 ) )

62 The formula with Denali • It is not: shorter; faster (but it is too early to evaluate) • It is: much clearer; shows the real intentions of the author • Key point: it can be written by Excel users; it is straightforward, no complex and geeky logic • In this example, Denali moves toward the users

63 Basket Analysis: Conclusions • Modeled with the many to many pattern: fact table as the bridge; relationships set in DAX • Very powerful • Search for missing values: «easy» as searching for existing ones
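
The two basket-analysis measures reduce to set operations once the customers of each product are known. The following plain-Python emulation uses invented sales rows and hypothetical function names; it is a sketch of the logic only, with the year bound standing in for the time filter of the DAX formulas.

```python
# Toy fact table (customer, product, year); all values are assumptions.
fact_sales = [
    ("C1", "Mountain-100", 2003), ("C1", "Mountain Tire Tube", 2004),
    ("C2", "Mountain-100", 2004),
    ("C3", "Mountain Tire Tube", 2003),
    ("C4", "Mountain-100", 2005),
]


def buyers(product, up_to_year):
    # Customers with at least one sale of the product up to the given year.
    return {c for c, p, y in fact_sales if p == product and y <= up_to_year}


def having_product(filter_product, product, up_to_year):
    # Bought the filter product AND the other product.
    return len(buyers(filter_product, up_to_year) & buyers(product, up_to_year))


def not_having_product(filter_product, product, up_to_year):
    # Bought the filter product but NEVER the other product.
    return len(buyers(filter_product, up_to_year) - buyers(product, up_to_year))


print(having_product("Mountain-100", "Mountain Tire Tube", 2004))      # 1
print(not_having_product("Mountain-100", "Mountain Tire Tube", 2004))  # 1
print(not_having_product("Mountain-100", "Mountain Tire Tube", 2005))  # 2
```

Searching for missing purchases differs from searching for existing ones only by the set operator (difference instead of intersection), which matches the remark that the two searches are equally «easy».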

64 No support for M2M in Tabular • Facts: Multidimensional handles M2M relationships. Tabular does not (apparently). DAX is the key to handle them • Many interesting models can be created leveraging DAX and the many to many patterns

65 Conclusions • Many To Many Relationships: can be handled in Tabular, even if not very easily • Very powerful: Cascading Many To Many; Survey Data Model; Basket Analysis • Are they worth some hours of study?

66 sqlbi.com

67 Links • SQLBI Website: www.sqlbi.com • PowerPivot Workshop: www.powerpivotworkshop.com • Alberto Ferrari blog: www.sqlblog.com/blogs/alberto_ferrari • Marco Russo blog: www.sqlblog.com/blogs/marco_russo • For any question contact us at info@sqlbi.com

68 Coming up… # SQLBITS
Speaker | Title | Room
Quest | Trivia Quiz: Test Your SQL Server and IT Knowledge and Win Prizes | Aintree
SQL Server Community | SQL Server Community presents : The A to Z of SQL Nuggets | Lancaster
SQLSentry | Real Time and Historical Performance Troubleshooting with SQL Sentry | Pearce
Attunity | Data Replication Redefined – best practices for replicating data to SQL Server | Empire
Fusion-io | Is it a bird? Is it a plane? | Derby

