Designing Efficient Workbooks Brian Montgomery Tableau Software

1 Designing Efficient Workbooks Brian Montgomery Tableau Software

2 What is efficiency? Design - focus on communicating the message
Performance - responds in a timely fashion (< 15 sec)

3 Visual Analytics

4 This is you. The swimming pool – it’s your data.
This is what you do with your data, using Tableau. You swim through it – you interact with it. No one’s data is as clear as this pool, so you use the interactivity of Tableau to help you explore. The power of Tableau is interactivity. ©2011 Tableau Software Inc. All rights reserved.

5 We’re Faster When We Can See Data

6 We’re Faster When We Can See Data

7 We’re Faster When We Can See Data

8 Pre-Attentive Visual Attributes

9 Best Practices Overview
Representing data for humans
Color
Maps
Creating dashboards

10 How Do Humans Like Their Data?
Time: on an x-axis
Location: on a map
Comparing values: bar chart
Exploring relationships: scatter plot
Relative proportions: treemap

11 How Do Humans Like Their Data?
Orient data so people can read it easily Good Better

12 Color Me Impressed Humans can only distinguish ~8 colors
This is not helpful.

13 Color Me Impressed Humans can only distinguish ~8 colors
This is helpful.

14 Color Me Impressed For quantitative data, color intensity and diverging color palettes work well

15 Mapping to Insight Use maps when location is relevant

16 Mapping to Insight Use filled maps (“choropleths”) for defined areas and only ONE measure

17 Mapping to Insight Filled maps won’t work for multiple measures

18 Mapping to Insight Maps don’t have to be geographic

19 Mapping to Insight Maps don’t have to be geographic

20

21 Make a good first impression
First up is the always-critical first impression. You need to pull your audience in and draw them to engage, because even if your audience is absolutely forced to use your tool, because their manager or yours has ordered them to, they can still miss the point, miss the power of the interactivity, and fail to engage and really get something out of it.

22 Zen in simplicity: simple, easy, beautiful
Even the first surface-level impression can be tough to get right. Within the development team, as we design Tableau, we strive to adhere to a philosophy that we call the “Tableau Zen”. What does that mean? If I had to reduce it to three words, they would be: simple, easy, and beautiful. That’s what we want to convey, and that’s what we want to make. I want to talk about that word, “simple”. Simple combines well with interactivity. Simple doesn’t mean trivial. Simple doesn’t mean shallow, or limited. To us simplicity is about deep, submerged, harnessed power.

23 Zen in simplicity
These interfaces, of course, are the opposite of simple. These look terrible, scary, and confusing. But could these be made simple without giving up their power and functionality? It’s not easy to do sometimes, but my answer is yes.

24 Zen in simplicity How simple is it to use Google?
Very, right? There’s an easy flow. You try something-

25 A “simple” visual Title: Purpose Top Overview to Details Bottom
And as far as a simple layout, I generally think those things should go top to bottom. The important question is, where are they going to look first, and where will their eyes move next? This ordering tries to cooperate with the natural progression of learning more and more as our eyes move to take in the content.

26 Dashboards Dashboards bring together multiple views

27 Dashboards Dashboards should pass the 5-second test

28 Catch their eye
These images are eye-gaze heat-maps, from studies showing how people consume information visually. This goes for web pages, documents, anything structured. Top to bottom is the standard approach we take. We also scan left-to-right, at least in our English-speaking culture. Now, you can break this flow: faces draw our attention. Big text draws our attention. Bold things, colorful things, bright things draw our attention. There’s a fine line sometimes between catching our audience’s attention and distracting them. So as we plan out a Tableau dashboard, I recommend mostly cooperating with the typical progression. We want to structure our layout so people find it easy to digest, and that means remembering this “F” shaped pattern of eye gaze scanning. Start with the high level (30,000 foot view) and then move down through the levels of detail.

29 Dashboarding for the 5-second Test
Most important view goes on top or top-left
Legends go near their views
Avoid using multiple color schemes on a single dashboard
Use 5 views or fewer in dashboards
Provide interactivity

30 Dashboarding for the 5-second Test
Use your words!
Titles
Axes
Key facts and figures
Units
Remove extra digits in numbers
Great tooltips

31 Performance - Basic Principles
Here are three broad principles that will help you author efficient dashboards and views:
Everything in moderation!
If it is not fast in the database, then it will not be fast in Tableau.
If it is not fast in Tableau Desktop, then it will not be fast in Tableau Server.

32 Data Connection: Extract
An Extract is:
a persistent cache of data that is written to disk and reproducible
a columnar data store - a format where the data has been optimized for analytic querying
completely disconnected from the database during querying - it can be used offline
refreshable (complete or incremental)
not constrained by the amount of physical RAM available
often much faster than the underlying live data connection.

33 When to use extracts vs. live connections?
Extracts are recommended:
When query execution is slow (slow database)
When using Custom SQL
When offline analysis is required
When additional functionality (set, rank, distinct, median) is needed (MS JET driver)
When a smaller data set is needed (fewer columns/rows, aggregation)
When using Tableau Online or Reader
Extracts are not recommended:
When you need to leverage database functionality: balance resource usage by sessions
When the workbook uses pass-through RAWSQL functions
When real-time analysis is required
When robust user-level security is required - except extracts that have been published to Tableau Server
When a massive data set is needed (millions or billions of rows)

34 Extracts: Best Practices
Publish the extract to Tableau Server to prevent redundancy - it saves disk space and CPU resources for refresh
Hide unused fields - it reduces the number of columns in the extract
Aggregate - it decreases data extract size while retaining the necessary level of detail

35 Extract Performance | Aggregate
Decreased number of rows and size
Aggregate views or custom SQL can also be used for very specific aggregation

36 Extract Performance | Aggregate
Simple Excel ~8,000 records – 3,211 kb to 942 kb; performance virtually the same
Big Excel 1,000,000 records – mb to 39 mb; 1 minute to seconds
Big SQL ~5,000,000 records – 2+ gb to 193 mb; many seconds to few seconds
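The row reduction that aggregation gives can be sketched in a few lines. This is only an illustration in plain Python, not how Tableau builds an extract: the order rows, the month-level rollup, and the name `aggregate_to_month` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical order-level rows: (order_date, region, sales).
detail_rows = [
    ("2010-01-03", "East", 10.0),
    ("2010-01-17", "East", 5.0),
    ("2010-01-20", "West", 7.0),
    ("2010-02-02", "East", 3.0),
    ("2010-02-09", "West", 4.0),
    ("2010-02-25", "West", 6.0),
]


def aggregate_to_month(rows):
    """Roll order-level rows up to one row per (month, region)."""
    totals = defaultdict(float)
    for order_date, region, sales in rows:
        month = order_date[:7]  # "YYYY-MM" prefix of an ISO date string
        totals[(month, region)] += sales
    return dict(totals)


monthly = aggregate_to_month(detail_rows)
print(len(detail_rows), "detail rows ->", len(monthly), "aggregated rows")
```

The totals are unchanged by the rollup; only the level of detail (individual orders) is given up, which is why the aggregation level should match what the views actually need.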

37 Drill Through Using our Extracts
Once you have gotten to the largest dataset, you have already filtered twice.
Intro: You can drill from one level of detail to another so that you only query some of the data at the most granular level and only when needed.
How To: We should show the tricks of pausing auto updates and also how to structure a basic action. We do not want to get too far into actions, because those are covered elsewhere.
Exercise: Build actions from one extract to another and then pause automatic updates. Close the workbook and re-open. The workbook should render much faster and only when a selection is made. The drill path should go like this: Extract 3 -> Extract 2 -> Original Data.

38 Things to know about queries
Query logs can be found here (between <QUERY> and </QUERY>): “C:\Users\<username>\Documents\My Tableau Repository\Logs\log.txt”
Query performance can be recorded: Help > Settings and Performance > Start Performance Recording
Join culling is used to improve query performance. It allows Tableau to query only the relevant tables instead of all tables defined in the joins.
Custom SQL cannot leverage join culling.
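A minimal sketch of pulling the queries out of a log, assuming only what the slide states: the query text sits between <QUERY> and </QUERY>. The function name `extract_queries` and the sample text are hypothetical, and real log lines may carry other fields around the markers.

```python
import re
from typing import List

# Assumption: each query appears between literal <QUERY> and </QUERY> markers.
# Nothing else about the log layout is assumed here.
QUERY_PATTERN = re.compile(r"<QUERY>(.*?)</QUERY>", re.DOTALL)


def extract_queries(log_text: str) -> List[str]:
    """Return the text of every <QUERY>...</QUERY> span, whitespace-trimmed."""
    return [match.strip() for match in QUERY_PATTERN.findall(log_text)]


# Hypothetical sample, not a real Tableau log excerpt.
sample_log = (
    "begin <QUERY>SELECT SUM(sales) FROM orders</QUERY> end\n"
    "begin <QUERY>\n  SELECT region FROM orders GROUP BY region\n</QUERY> end\n"
)

queries = extract_queries(sample_log)
for query in queries:
    print(query)
```

To run it against a real log, read the file at the path given on the slide and pass its contents to `extract_queries`.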

39 Things to know about Blending
Blend data when you have more than one data source.
Join data when you use two data connections from the same data source - it improves performance and filtering control.
The cardinality of the blending field(s) matters - blending queries the data from both data sources at the level of the linking fields and then merges the results of both queries together in memory.
Avoid blending on dimensions with large numbers of unique values (e.g. Order ID, Customer ID, exact date/time).
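Why cardinality matters can be seen in a small sketch of the two steps described above: aggregate each source at the linking field, then merge the two results in memory. This is a plain-Python illustration, not Tableau's implementation; the data, the function names, and the choice to keep every primary-source key are assumptions made for the example.

```python
from collections import defaultdict


def aggregate_by_link(rows, link_field, measure):
    """Sum `measure` per distinct value of `link_field` (one result row per value)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[link_field]] += row[measure]
    return dict(totals)


def blend(primary_rows, secondary_rows, link_field, primary_measure, secondary_measure):
    """Aggregate both sources at the linking field, then merge the results in memory."""
    primary = aggregate_by_link(primary_rows, link_field, primary_measure)
    secondary = aggregate_by_link(secondary_rows, link_field, secondary_measure)
    # Keep every key from the primary source; missing secondary values become None.
    return {key: (value, secondary.get(key)) for key, value in primary.items()}


# Hypothetical data sources.
sales = [
    {"region": "East", "order_id": 1, "sales": 100.0},
    {"region": "East", "order_id": 2, "sales": 50.0},
    {"region": "West", "order_id": 3, "sales": 70.0},
]
targets = [
    {"region": "East", "order_id": 1, "target": 120.0},
    {"region": "West", "order_id": 3, "target": 80.0},
]

by_region = blend(sales, targets, "region", "sales", "target")
by_order = blend(sales, targets, "order_id", "sales", "target")
print(len(by_region), "merged rows on region;", len(by_order), "on order_id")
```

The merged result has one row per distinct linking value, so linking on something like an order ID makes both query results, and the in-memory merge, grow with the number of orders.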

40 Blending

41 Filtering categorical dimensions
Consider the following visualization – a map of the United States with a mark for each postcode:

42 Filtering categorical dimensions
There are several ways we could filter the map to just show the postcodes for the Western region (the red dots):
(a) We could select all the marks in the Western region and keep-only the selection.
(b) We could select all the marks outside of the Western region and exclude the selection.
(c) We could keep-only on another attribute such as the Region dimension.
(d) We could filter by range – either on the postcode values or the latitude/longitude values.

43 Filtering categorical dimensions
Options (a) and (b) will perform poorly; they can even be slower than the unfiltered data set. This is because they are expressed as a discrete list of postcode values that are filtered IN or OUT by the database, either through a complex WHERE clause or by joining with a temp table that has been populated with the selection.

44 Filtering categorical dimensions
Option (c) is fast because the resulting filter (WHERE Region = 'West') is very simple and can be efficiently processed by the database. However, this approach becomes less effective as the number of dimension members increases.

45 Filtering categorical dimensions
Option (d) is fast too, because the ranged filter also allows the database to evaluate a simple filter clause, resulting in fast execution. (WHERE POSTCODE >= AND POSTCODE < 60000 OR POSTCODE >= AND POSTCODE < ) However, this approach, unlike a filter on a related dimension, doesn’t become more complex as the number of dimension members increases.

46 Filtering categorical dimensions
Take-away: Ranged filters are often faster to evaluate than large itemized lists of discrete values and they should be used in preference to a keep-only or exclude for large mark sets if possible.
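The difference between an itemized filter and a ranged filter can be sketched with SQLite standing in for the database. The table, the postcode values, and the assumption that postcodes 80000 and above form the selected region are all made up for illustration; the point is only the shape of the two WHERE clauses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE marks (postcode INTEGER, region TEXT)")
# Hypothetical data: postcodes 80000-99900 stand in for the selected region.
conn.executemany(
    "INSERT INTO marks VALUES (?, ?)",
    [(code, "West" if code >= 80000 else "Other") for code in range(70000, 100000, 100)],
)

# The marks a user would have selected on the map.
selected = [row[0] for row in conn.execute("SELECT postcode FROM marks WHERE region = 'West'")]

# Keep-only style filter: the statement carries one placeholder per selected mark.
placeholders = ", ".join("?" for _ in selected)
in_list_sql = f"SELECT COUNT(*) FROM marks WHERE postcode IN ({placeholders})"
in_list_count = conn.execute(in_list_sql, selected).fetchone()[0]

# Ranged filter: two comparisons, however many marks fall inside the range.
range_sql = "SELECT COUNT(*) FROM marks WHERE postcode >= ? AND postcode < ?"
range_count = conn.execute(range_sql, (80000, 100000)).fetchone()[0]

print("IN-list statement length:", len(in_list_sql), "rows:", in_list_count)
print("Range statement length:", len(range_sql), "rows:", range_count)
```

Both statements return the same rows, but the itemized one grows with every selected mark while the ranged one stays the same size.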

47 Filtering dates: discrete, range, relative
Date filters are extremely common and fall into three categories:
Relative Date Filters – show a date range that is relative to a specific day
Range of Date Filters – show a defined range of discrete dates
Discrete Date Filters – show individual dates that you’ve selected from a list
The method used can have an impact on the efficiency of the resulting query.

48 Filtering dates: Discrete
A discrete date filter includes individual dates or entire date levels. This filter type results in the date expression being passed to the database as a dynamic calculation:
SELECT [FactSales].[Order Date], SUM([FactSales].[SalesAmount])
FROM [dbo].[FactSales] [FactSales]
WHERE (DATEPART(year,[FactSales].[Order Date]) = 2010)
GROUP BY [FactSales].[Order Date]
This WHERE clause can result in poor query execution because the table isn’t partitioned on the DATEPART value: some databases will evaluate the calculation across all partitions, even though this is not necessary.

49 Filtering dates: Range
A range date filter includes a specific range of contiguous dates. This filter type results in the following query structure being passed to the database:
SELECT [FactSales].[Order Date], SUM([FactSales].[SalesAmount])
FROM [dbo].[FactSales] [FactSales]
WHERE (([FactSales].[Order Date] >= {ts ' :00:00'}) AND ([FactSales].[Order Date] <= {ts ' :00:00'}))
GROUP BY [FactSales].[Order Date]
This type of WHERE clause is very efficient because it leverages indexes.
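The index point can be checked on a small scale. The sketch below uses SQLite as a stand-in, so it is an analogy rather than the SQL Server behaviour the slides describe: `strftime('%Y', ...)` plays the role of DATEPART, and the table, index, and data are hypothetical. Other engines and optimizers may plan these queries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (order_date TEXT, sales_amount REAL)")
conn.execute("CREATE INDEX idx_order_date ON fact_sales (order_date)")
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?)",
    [(f"{year}-{month:02d}-15", 10.0) for year in (2009, 2010, 2011) for month in range(1, 13)],
)


def plan(sql):
    """Return SQLite's query plan for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the plan detail


# Discrete-style filter: a function is applied to the column before comparing.
discrete_sql = (
    "SELECT order_date, SUM(sales_amount) FROM fact_sales "
    "WHERE strftime('%Y', order_date) = '2010' GROUP BY order_date"
)
# Range-style filter: the bare column is compared against two constants.
range_sql = (
    "SELECT order_date, SUM(sales_amount) FROM fact_sales "
    "WHERE order_date >= '2010-01-01' AND order_date < '2011-01-01' GROUP BY order_date"
)

discrete_plan = plan(discrete_sql)
range_plan = plan(range_sql)
print("discrete:", discrete_plan)
print("range:   ", range_plan)

discrete_rows = sorted(conn.execute(discrete_sql).fetchall())
range_rows = sorted(conn.execute(range_sql).fetchall())
```

In SQLite the range form is planned as an index SEARCH on `order_date`, while the computed form has to visit every row and evaluate the function; both return the same twelve rows.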

50 Filtering dates: Relative
A relative date filter includes a defined range of contiguous dates. This filter type results in the following query structure being passed to the database:
SELECT [FactSales].[Order Date], SUM([FactSales].[SalesAmount])
FROM [dbo].[FactSales] [FactSales]
WHERE (([FactSales].[Order Date] >= DATEADD(year,(-2),DATEADD(year, DATEDIFF(year, 0, {ts ' :37:51.490'}), 0))) AND ([FactSales].[Order Date] < DATEADD(year,1,DATEADD(year, DATEDIFF(year, 0, {ts ' :37:51.490'}), 0))))
GROUP BY [FactSales].[Order Date]
The resulting WHERE clause uses a ranged date filter, so this is also an efficient form of date filter.
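The nested DATEADD/DATEDIFF expressions in that WHERE clause only compute two fixed boundaries from the current timestamp. A rough Python equivalent, with a hypothetical function name and an arbitrary example date, looks like this:

```python
from datetime import date


def relative_year_window(today, years_back=2):
    """Return (start, end) for a half-open range: `years_back` prior years plus the current year."""
    # DATEADD(year, DATEDIFF(year, 0, now), 0): truncate "now" to the start of its year.
    start_of_year = date(today.year, 1, 1)
    # DATEADD(year, -2, start_of_year): inclusive lower bound.
    start = date(start_of_year.year - years_back, 1, 1)
    # DATEADD(year, 1, start_of_year): exclusive upper bound.
    end = date(start_of_year.year + 1, 1, 1)
    return start, end


# Arbitrary example date, chosen only for illustration.
start, end = relative_year_window(date(2013, 6, 14))
print(start, end)
```

Because both bounds are constants by the time rows are compared, the database sees the same kind of filter as an explicit date range.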

51 Quick Filters Despite the name, too many quick filters will actually slow you down. Try a more guided analytics approach using action filters instead.

52 Filter: Enumerated
Enumerated quick filters require Tableau to query the data source for all potential field values before the quick filter object can be rendered. These include:
Multiple value list
Single value list
Compact list
Slider
Measure filters
Ranged date filters

53 Filter: Non-Enumerated
Non-enumerated quick filters, on the other hand, do not require knowledge of the potential field values. These include:
Custom value list
Wildcard match
Relative date filters
Browse period date filters
Non-enumerated quick filters reduce the number of quick-filter-related queries that need to be executed by the data source.

54 Context Filters
By default, all filters that you set in Tableau are computed independently: each filter accesses all rows in your data source without regard to other filters. Context filters are implemented by writing the filter result set to a temporary table. All other filters then depend on the context filter and process only the data stored in the temporary table. Context filters are used to improve performance; however, the process of creating the temporary table can be expensive on the database.

55 Context Filters
A context filter is recommended when:
It reduces the size of the data set significantly – filtering 90% of the data
It is used against slowly changing dimensions – if the filter is updated, the database must re-compute and re-write the temporary table.
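The temporary-table mechanism can be sketched with SQLite. This is an illustration of the idea described on the slides, not the SQL Tableau actually generates; the table, column names, and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, segment TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("West", "Consumer", 10.0),
        ("West", "Corporate", 20.0),
        ("East", "Consumer", 30.0),
        ("East", "Corporate", 40.0),
        ("South", "Consumer", 50.0),
    ],
)

# Step 1: the context filter is evaluated once and written to a temporary table.
# This is the potentially expensive part, repeated whenever the filter changes.
conn.execute("CREATE TEMP TABLE context_orders AS SELECT * FROM orders WHERE region = 'West'")

# Step 2: every other filter reads only the temporary table, not the full data set.
context_rows = conn.execute("SELECT COUNT(*) FROM context_orders").fetchone()[0]
consumer_sales = conn.execute(
    "SELECT SUM(sales) FROM context_orders WHERE segment = 'Consumer'"
).fetchone()[0]
print(context_rows, consumer_sales)
```

The trade-off is visible in the two steps: the dependent queries scan fewer rows, but each change to the context filter means rebuilding the temporary table.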

56 Filters: Best Practices
Avoid “Exclude” - it prevents Tableau from leveraging indexes and forces a scan of all of the selected data
Avoid discrete filters - use a range filter whenever possible
Use relative date filters - they are an easy way to show the current week or month without republishing the workbook
Avoid too many quick filters - use guided analytics with action filters instead

57 Calculations
Follow these techniques to ensure your calculations are as efficient as possible.
Date functions – TODAY() is better than NOW(): use NOW() only if you need the time stamp level of detail; otherwise use TODAY() for date level calculations.
Use Booleans whenever possible.
Slower: IF [Date] = TODAY() THEN "Today" ELSE "Not Today" END
Faster: [Date] = TODAY()

58 Calculations Logic statements – ELSEIF is better than ELSE IF
A nested IF computes a second IF statement rather than being computed as part of the first.
Slower: IF [Region] = "East" and [Customer Segment] = "consumer" THEN "East-Consumer" ELSE IF [Region] = "East" and [Customer Segment] <> "consumer" THEN "East-All Others" END END
Faster: IF [Region] = "East" and [Customer Segment] = "consumer" THEN "East-Consumer" ELSEIF [Region] = "East" and [Customer Segment] <> "consumer" THEN "East-All Others" END

59 Calculations Avoid redundant logic checks
Slower: IF [Sales] < 10 THEN "Bad" ELSEIF [Sales] >= 10 AND [Sales] < 30 THEN "OK" ELSEIF [Sales] >= 30 THEN "Great" END
Faster: IF [Sales] < 10 THEN "Bad" ELSEIF [Sales] >= 30 THEN "Great" ELSE "OK" END
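That the shorter calculation gives the same labels can be checked by mirroring both versions in Python. This is only an analogy for the branch logic, not Tableau's calculation engine, and it does not model NULL values; the function names and sample values are hypothetical.

```python
def label_redundant(sales):
    """Mirror of the slower calculation: every branch re-tests its full range."""
    if sales < 10:
        return "Bad"
    elif sales >= 10 and sales < 30:  # ">= 10" repeats what the first branch already ruled out
        return "OK"
    elif sales >= 30:
        return "Great"
    return None


def label_simplified(sales):
    """Mirror of the faster calculation: two tests, with everything else falling to ELSE."""
    if sales < 10:
        return "Bad"
    elif sales >= 30:
        return "Great"
    else:
        return "OK"


samples = [0, 9.99, 10, 29.99, 30, 1000]
labels = [label_simplified(value) for value in samples]
agree = all(label_redundant(value) == label_simplified(value) for value in samples)
print(labels, agree)
```

The simplified form evaluates at most two comparisons per row instead of up to four, which is the saving the slide is pointing at.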

60 Calculations Parameters for conditional calculations - they let the end user change a calculation dynamically.
Slow: IF [Parameters].[Date Part Picker] = "Year" THEN DATEPART('year',[Order Date]) ELSEIF [Parameters].[Date Part Picker] = "Quarter" THEN DATEPART('quarter',[Order Date]) ELSE NULL END

61 Calculations
Faster: IF [Parameters].[Date Part Picker] = 1 THEN DATEPART('year',[Order Date]) ELSEIF [Parameters].[Date Part Picker] = 2 THEN DATEPART('quarter',[Order Date]) ELSE NULL END

62 Calculations Fastest:
DATEPART([Parameters].[Date Part Picker],[Order Date])

