Hugo Kornelis Now where does THAT estimate come from? The nuts and bolts of cardinality estimation.

WHAT estimate????

BIG Thanks to SQL Sat Denmark sponsors GOLD SILVER BRONZE

Raffle and goodbye beer: Remember to visit the sponsors, stay for the raffle and goodbye beers. Join our sponsors for a lunch break session in: cust 0.01 and cust 1.06. We hope you’ll all have a great Saturday. Regis, Kenneth

Hugo Kornelis I make SQL Server Fast

Hugo Kornelis: I make SQLServerFast.com. Blog: http://sqlserverfast.com/blog. Execution Plan Reference: http://sqlserverfast.com/epr. Resources: deck and demo for this session are there. Articles (planned).

Hugo Kornelis: I do other community things. Technical editor for various books: SQL Server Execution Plans, 3rd edition (“Real Soon Now”™). 11 years MVP (2006 – 2016, SQL Server/Data Platform). Lots of other things.

Hugo Kornelis: I work. Independent database consultant. Will do (almost) anything for money.

Hugo Kornelis: Contact details. Email: hugo@perFact.info. Twitter: @Hugo_Kornelis

Which version? “Old” (legacy) cardinality estimator: introduced in SQL Server 7.0, unchanged until SQL Server 2012. “New” cardinality estimator: introduced in SQL Server 2014, small changes in later versions. Better? Database compatibility level. Trace flags 2312 (force new) / 9481 (force old), usually used with the OPTION (QUERYTRACEON nnnn) query hint. SQL 2016+: ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = [ ON | OFF ];

Overview. Cardinality estimation: What? Why? How? The usual suspects. Statistics. Single-table queries (simple filters, complex filters). Multiple tables (single equijoin condition, single non-equijoin condition, multiple join conditions).

Cardinality estimation: What is it? Prediction of the number of rows that an operator will return

Cardinality estimation: Why is it important? Determines plan choice: bad join strategy, non-optimal index choice, serial or parallel plan.

Cardinality estimation: Why is it important? Determines memory grant. Operators that store data in memory: Sort, Hash Match (Join), Hash Match (Aggregate). Required total memory is computed based on cardinality estimates. Query execution waits until memory is available. Insufficient memory: spill to tempdb.

The usual suspects: Table variables. No statistics. Estimated table cardinality: always 1 row, except … Fix: temporary tables.

The usual suspects: Multi-statement table-valued functions. Implemented using a table variable. No statistics. Estimated cardinality: always 1 row; changed to 100 rows in SQL Server 2014. Fixes: inline table-valued function; copy results to a temporary table; SQL Server 2017: Interleaved Execution (rest of plan recompiled after the table-valued function is evaluated; restrictions apply).

DEMO The usual suspects: Table variables; multi-statement table-valued functions.

The usual suspects: Stale and outdated statistics. AUTO_UPDATE_STATISTICS. AUTO_UPDATE_STATISTICS_ASYNC. Trace flag 2371: SQL Server 2008R2 and up; on by default as of SQL 2016+ (requires compatibility level 130). http://blogs.msdn.com/b/saponsqlserver/archive/2011/09/07/changes-to-automatic-update-statistics-in-sql-server-traceflag-2371.aspx

The usual suspects: Unrepresentative statistics. Use a higher sampling rate, or use FULLSCAN. Drawbacks: slower, more resources used. Cannot be configured for automatic statistics updates. Not guaranteed to work.

The usual suspects: Parameter sniffing. Estimates based on the value of the first execution. Plan reused for later executions, even when variables change.

The usual suspects: OPTIMIZE FOR hint. Estimates based on a hard-coded value, even when the actual value is different.

Statistics (header): number of rows sampled; total number of rows in table; last update of these statistics.

Statistics (density vector): 1.0 / COUNT(DISTINCT LastName); 1.0 / COUNT(DISTINCT LastName, FirstName, MiddleName).

Statistics (histogram): 7 rows have LastName = ‘Brown27’. 146 rows have LastName > ‘Brook6’ and LastName < ‘Brown27’, with 23 distinct LastName values in those rows, so on average ~6.35 (146 / 23) rows for each of those values.

Statistics DEMO DBCC SHOW_STATISTICS

Statistics: Assumptions made when using statistics. Independence: predicates are not correlated. Uniformity: values are evenly spread. Containment / Inclusion: values searched for will exist.

Single-table queries: Comparison with a variable. Equality: density * table rows = 4.05679E-05 * 29750 = 1.2069 (density from DBCC SHOW_STATISTICS; row count from sys.dm_db_partition_stats).

Single-table queries: Comparison with a variable. Inequality: fixed selectivity assumption of 30%, so 0.3 * 29750 = 8925 (row count from sys.dm_db_partition_stats).
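
The two rules for a comparison with a variable can be written out as a minimal Python sketch. The function names are hypothetical and exist only for illustration; the density, the row count, and the 30% figure are the ones shown on the slides.

```python
def estimate_equality_unknown(density: float, table_rows: int) -> float:
    """Estimate for `col = @variable`: average density times table row count."""
    return density * table_rows


def estimate_inequality_unknown(table_rows: int, fixed_selectivity: float = 0.3) -> float:
    """Estimate for `col > @variable`: a fixed selectivity guess times table row count."""
    return fixed_selectivity * table_rows


# Values from the slides: density 4.05679E-05, 29750 rows in the table.
equality_estimate = estimate_equality_unknown(4.05679e-05, 29750)
inequality_estimate = estimate_inequality_unknown(29750)
print(round(equality_estimate, 4), round(inequality_estimate, 1))
```

Neither function looks at the histogram: with an unknown value, only the density and the row count are available to the estimator.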

DEMO Single-table queries Comparison with a variable Equality Inequality

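Single-table queries: Comparison with a constant, sniffed parameter, or sniffed variable. Equality, value equal to a histogram step: 7 * 29750 / 29750 = 7 (rows from the histogram, multiplied by the table row count from sys.dm_db_partition_stats and divided by the row count in DBCC SHOW_STATISTICS).

Single-table queries: Comparison with a constant, sniffed parameter, or sniffed variable. Equality, value inside a histogram range: 6.347826 * 29750 / 29750 = 6.347826.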
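
As a rough sketch of the arithmetic on these two slides: the histogram supplies a row count (either the rows for a step value or the average rows per distinct value inside a range), and that number is scaled by the ratio of the current table row count to the row count recorded in the statistics. The function names below are hypothetical.

```python
def avg_range_rows(range_rows: float, distinct_range_rows: float) -> float:
    """Average rows per distinct value inside a histogram range."""
    return range_rows / distinct_range_rows


def scale_to_table(histogram_rows: float, table_rows: int, stats_rows: int) -> float:
    """Scale a histogram-based row count to the current size of the table."""
    return histogram_rows * table_rows / stats_rows


# Step value: 7 rows have LastName = 'Brown27'.
step_estimate = scale_to_table(7, 29750, 29750)

# Value inside the range: 146 rows spread over 23 distinct values.
range_estimate = scale_to_table(avg_range_rows(146, 23), 29750, 29750)
print(step_estimate, round(range_estimate, 6))
```

In the slide example the table and the statistics both report 29750 rows, so the scaling factor is 1 and the histogram numbers pass through unchanged.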

DEMO Single-table queries Comparison with a constant, sniffed parameter, or sniffed variable Equality

Single-table queries: Comparison with a constant, sniffed parameter, or sniffed variable. Inequality: 1 + 34 + 1 + 0 + 1 = 37; 37 * 29750 / 29750 = 37. (The next build of the slide shows 37 replaced by 36.)

Single-table queries: Comparison with a constant, sniffed parameter, or sniffed variable. Inequality, with interpolation inside a range: 1 + 0 + 1 + (~67% * 34) = ~24.78; ~24.78 * 29750 / 29750 = ~24.78 (actual estimate: 24.6755).
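
The interpolation on this slide can be sketched as follows: rows from histogram entries that fall completely inside the searched interval are added up, and a partially covered range contributes only the covered fraction of its rows. This is a simplified illustration with a hypothetical function name; the ~67% is the slide's approximation, and the real estimator arrives at 24.6755, so its interpolation is evidently not exactly this.

```python
def estimate_range_predicate(fully_covered_rows, partial_range_rows, covered_fraction):
    """Sum the fully covered histogram rows, plus a linear share of one partial range."""
    return sum(fully_covered_rows) + covered_fraction * partial_range_rows


# Slide values: 1 + 0 + 1 rows fully covered, about 67% of a 34-row range.
interpolated = estimate_range_predicate([1, 0, 1], 34, 0.67)
print(round(interpolated, 2))
```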

DEMO Single-table queries Comparison with a constant, sniffed parameter, or sniffed variable Equality Inequality

Single-table queries: Comparison with a constant, sniffed parameter, or sniffed variable. Inequality, changes on SQL Server 2014: different interpolation; the difference between < and <= is observed. > ‘Zwilling4’ → 24.0085; >= ‘Zwilling4’ → 25.0085.

DEMO Ascending Key problem Not just for keys! (And not just ascending either)

Ascending Key problem. Workaround since SQL Server 2005 SP1: trace flag 2389 (“known ascending columns”) and trace flag 2390 (“all other columns”). Value out of range and column indexed? Find MIN/MAX from table. If within actual range, use average density. Example: Statistics: rowcount 10,000; highest value 250; density 0.004. Actual: rowcount 11,500; highest value 300. WHERE value = 300: Density * Rowcount = 0.004 * 10,000 = 40. WHERE value = 310: out of actual range, so estimate is still 1. WHERE value = 290: in new range; estimate should be 40, but is 1.

Ascending Key problem. Same workaround, inequality. Example: Statistics: rowcount 10,000; highest value 250; density 0.004. Actual: rowcount 11,500; highest value 300. WHERE value > 290: interpolation based on 1,500 rows in 250-300 range: 332. WHERE value > 310: out of actual range, so estimate is still 1.

Ascending Key problem. Change in SQL Server 2014, equality: value out of range? (Regardless of “known ascending” and index!) Estimate as if a variable was used, based on number of rows added. Example: Statistics: rowcount 10,000; highest value 250; density 0.004. Actual: rowcount 11,500; highest value 300. WHERE value = 300: Density * Rowcount = 0.004 * 10,000 = 40. WHERE value = 310: Density * Rowcount = 0.004 * 10,000 = 40. WHERE value = 290: Density * Rowcount = 0.004 * 10,000 = 40.

Ascending Key problem. Change in SQL Server 2014, inequality: value out of range? (Regardless of “known ascending” and index!) Estimate as if a variable was used, based on number of rows added. Example: Statistics: rowcount 10,000; highest value 250; density 0.004. Actual: rowcount 11,500; highest value 300. WHERE value > 290: Inequality assumption * New rows = 0.3 * 1,500 = 450. WHERE value > 310: Inequality assumption * New rows = 0.3 * 1,500 = 450.
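
The out-of-range behaviour in the examples above can be summarised in a small Python sketch. It only reproduces the slide arithmetic: the function name and parameters are hypothetical, the legacy branch models the case without trace flags 2389/2390 (where the slides show an estimate of 1), and the density of 0.004 is the value used in the slide calculations.

```python
def estimate_beyond_histogram(use_new_ce, is_equality, density, stats_rows, rows_added):
    """Estimate for a value above the highest histogram step, per the slide examples."""
    if not use_new_ce:
        # Legacy estimator, no trace flags: the value is assumed not to exist.
        return 1.0
    if is_equality:
        # SQL Server 2014: treated like a comparison with a variable.
        return density * stats_rows
    # SQL Server 2014 inequality: fixed 30% guess applied to the rows added.
    return 0.3 * rows_added


# Statistics: 10,000 rows, density 0.004. The table has since grown by 1,500 rows.
legacy = estimate_beyond_histogram(False, True, 0.004, 10_000, 1_500)
new_equality = estimate_beyond_histogram(True, True, 0.004, 10_000, 1_500)
new_inequality = estimate_beyond_histogram(True, False, 0.004, 10_000, 1_500)
print(legacy, round(new_equality, 1), round(new_inequality, 1))
```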

Single-table queries: Multiple predicates. SQL 7 – SQL 2012: assume independence. Estimate selectivity of each predicate (Estimate / Table cardinality). Find combined selectivity (Selectivity 1 * Selectivity 2 * Selectivity 3 * …). Find rowcount (Table cardinality * Combined selectivity).

Single-table queries: Multiple predicates. Example: Table has 100,000 rows. WHERE Col1 = ‘A’ → 4,500 rows; Selectivity = 4,500 / 100,000 = 0.045. WHERE Col2 = @Variable2 → Density = 0.025; Selectivity = 0.025. WHERE Col3 > @Variable3 → fixed selectivity; Selectivity = 0.3.

Single-table queries: Multiple predicates. Example: Table has 100,000 rows. WHERE Col1 = ‘A’ AND Col2 = @Variable2 AND Col3 > @Variable3: Selectivity = 0.045 * 0.025 * 0.3 = 0.0003375. Estimate = 100,000 * 0.0003375 = 33.75.
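
The independence calculation is simple enough to show as a minimal Python sketch. The function name is hypothetical; the selectivities and the row count are the ones from the example.

```python
from math import prod


def estimate_independent(table_rows: int, selectivities: list[float]) -> float:
    """Legacy estimator rule: multiply all predicate selectivities, then scale to the table."""
    return table_rows * prod(selectivities)


# Col1 = 'A' (0.045), Col2 = @Variable2 (0.025), Col3 > @Variable3 (0.3).
independent_estimate = estimate_independent(100_000, [0.045, 0.025, 0.3])
print(round(independent_estimate, 2))
```

Because every additional predicate multiplies the estimate down, correlated predicates tend to produce estimates that are far too low under this rule.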

Single-table queries: Multiple predicates. SQL 2014: assumed partially dependent. Estimate selectivity of each predicate. Sort in order of ascending selectivity (the most selective predicate comes first). Find combined selectivity (“exponential backoff”): Selectivity 1 × (Selectivity 2)^(1/2) × (Selectivity 3)^(1/4) × … Find rowcount.

Single-table queries: Multiple predicates. Example: Table has 100,000 rows. Selectivity values: 0.045, 0.025, 0.3. Ascending order: 0.025, 0.045, 0.3. Selectivity = 0.025 × 0.045^(1/2) × 0.3^(1/4) ≈ 0.00392488. Estimate = 100,000 * 0.00392488 = 392.488.
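
The exponential backoff calculation, sketched in Python with a hypothetical function name: the most selective predicate counts in full, the next one is raised to the power 1/2, the next to 1/4, and so on. Limiting the calculation to the four most selective predicates is an assumption that is not stated on the slides.

```python
def exponential_backoff(selectivities: list[float], max_predicates: int = 4) -> float:
    """Combine selectivities as s1 * s2**(1/2) * s3**(1/4) * ..., most selective first."""
    ordered = sorted(selectivities)[:max_predicates]  # assumption: cap at four predicates
    combined = 1.0
    for position, selectivity in enumerate(ordered):
        combined *= selectivity ** (1 / 2 ** position)
    return combined


backoff_selectivity = exponential_backoff([0.045, 0.025, 0.3])
backoff_estimate = 100_000 * backoff_selectivity
print(round(backoff_selectivity, 8), round(backoff_estimate, 3))
```

For the same three predicates, the independence rule gives 33.75 rows and exponential backoff gives roughly 392 rows, which shows how much less aggressively the newer rule reduces the estimate.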

Single-table queries: Multiple predicates. Fix (for SQL 7 – SQL 2012): create multi-column statistics (these are never created automatically). Will use average density for the column combination. Not used on SQL 2014 RTM; was fixed in SQL 2016.

Single-table queries DEMO Multiple predicates

Multiple tables DEMO Simple joins One equality predicate

One equality predicate: Aligning histograms. First histogram: steps Abercrombie0 (2 rows), Abolrous0 (1), Ackerman0 (2), with ranges (rows/distinct values) of 98/49 and 149/149 between them. Second histogram: steps Abercrombie0 (2), Abercrombie48 (2), Abolrous9 (1), Ackerman0 (2), with ranges of 84/42, 61/55 and 100/100 between them.

One equality predicate: steps present in both histograms. Abercrombie0: 2 * 2 = 4. Ackerman0: 2 * 2 = 4. Running total: 4 + 4.

One equality predicate: steps present in one histogram only, matched against the average rows per value of the enclosing range in the other histogram. Abercrombie48: (98/49) * 2 = 4. Abolrous0: 1 * (61/55) ≈ 1.1. Abolrous9: (149/149) * 1 = 1. Running total: 4 + 4 + 4 + 1.1 + 1.

One equality predicate: range from Abercrombie0 to Abercrombie48. ~20% * 48 = 9.6 distinct values; 9.6 * (98/49) * (84/42) = 38.4. Running total: 4 + 4 + 4 + 1.1 + 1 + 38.4.

One equality predicate: range from Abercrombie48 to Abolrous0. ~80% * 48 = 38.4 distinct values on one side, ~80% * 54 = 43.2 distinct values on the other; 38.4 * (98/49) * (61/55) ≈ 85.2. Running total: 4 + 4 + 4 + 1.1 + 1 + 38.4 + 85.2.

One equality predicate: range from Abolrous0 to Abolrous9. ~10% * 148 = 14.8 on one side, ~20% * 54 = 10.8 on the other; 10.8 * (149/149) * (61/55) ≈ 12.0. Running total: 4 + 4 + 4 + 1.1 + 1 + 38.4 + 85.2 + 12.0.

One equality predicate: range from Abolrous9 to Ackerman0. ~90% * 148 = 133.2 on one side, 100 on the other; 100 * (149/149) * (100/100) = 100. Running total: 4 + 4 + 4 + 1.1 + 1 + 38.4 + 85.2 + 12.0 + 100.

One equality predicate: total. 4 + 4 + 4 + 1.1 + 1 + 38.4 + 85.2 + 12.0 + 100 = 249.7; 249.7 * 29750 / 29750 = 249.7 (actual estimate: 267.455).
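
The walkthrough above can be retraced in a short Python sketch. It reproduces only the slide arithmetic, not SQL Server's actual implementation (the slide itself notes that the real estimate is 267.455). The helper name is hypothetical, and pairing each term with a particular step or range is an interpretation of the slide layout.

```python
def range_join(distinct_left, distinct_right, avg_rows_left, avg_rows_right):
    """Matches within an aligned range: the smaller distinct count,
    times the average rows per value on each side."""
    return min(distinct_left, distinct_right) * avg_rows_left * avg_rows_right


# Step values: rows on one side times rows (or average rows) on the other side.
step_matches = [
    2 * 2,            # Abercrombie0, a step in both histograms
    2 * 2,            # Ackerman0, a step in both histograms
    (98 / 49) * 2,    # a 2-row step against the 98/49 range
    1 * (61 / 55),    # a 1-row step against the 61/55 range
    (149 / 149) * 1,  # a 1-row step against the 149/149 range
]

# Aligned ranges, using the slide's approximate splits of the distinct values.
range_matches = [
    range_join(9.6, 42, 98 / 49, 84 / 42),
    range_join(38.4, 43.2, 98 / 49, 61 / 55),
    range_join(14.8, 10.8, 149 / 149, 61 / 55),
    range_join(133.2, 100, 149 / 149, 100 / 100),
]

legacy_join_estimate = sum(step_matches) + sum(range_matches)
print(round(legacy_join_estimate, 1))
```

Taking the minimum of the two distinct counts reflects the containment assumption: every value on the side with fewer distinct values is assumed to find a match on the other side.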

One equality predicate, SQL Server 2014: Much simpler! Both histograms are reduced to their lowest and highest steps, Abercrombie0 (2) and Ackerman0 (2), with a single range of 248 rows / 199 distinct values between them.

One equality predicate, SQL Server 2014: matching steps. Abercrombie0: 2 * 2 = 4. Ackerman0: 2 * 2 = 4. Running total: 4 + 4.

One equality predicate, SQL Server 2014: range. 199 * (248/199) * (248/199) ≈ 309.1. Total: 4 + 4 + 309.1 = 317.1; 317.1 * 29750 / 29750 = 317.1 (actual estimate: 316.5).

One equality predicate, SQL Server 2014: actual. 200 * (250/200) * (250/200) = 312.5; 2 * 2 = 4; 4 + 312.5 = 316.5; 316.5 * 29750 / 29750 = 316.5 (actual estimate: 316.5).
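
Both variants of the simplified calculation fit one small function, sketched here in Python. The name and signature are hypothetical; the inputs are the numbers from the slides. The first call treats both boundary steps separately, the second folds one of them into the range, which is the variant that matches the actual estimate.

```python
def coarse_join_estimate(rows_left, distinct_left, rows_right, distinct_right, step_matches):
    """One merged range per histogram, plus separately matched boundary steps."""
    shared_distinct = min(distinct_left, distinct_right)
    range_part = shared_distinct * (rows_left / distinct_left) * (rows_right / distinct_right)
    return range_part + sum(step_matches)


# First attempt: range of 248 rows / 199 distinct values, both boundary steps separate.
first_attempt = coarse_join_estimate(248, 199, 248, 199, [2 * 2, 2 * 2])

# Variant matching the actual estimate: 250 rows / 200 distinct values, one boundary step.
actual_method = coarse_join_estimate(250, 200, 250, 200, [2 * 2])
print(round(first_attempt, 1), actual_method)
```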

Multiple tables Simple joins One inequality predicate Nothing documented So … let’s speculate! SQL Server 7.0 – 2012: Variation on equality algorithm

One inequality predicate (same two histograms as for the equality predicate). Step Abercrombie0 (2 rows): 84 + 2 + 61 + 1 + 100 + 2 = 250; 2 * 250 = 500. Running total: 500.

One inequality predicate: ~20% * 48 = 9.6 distinct values. 2 + 61 + 1 + 100 + 2 = 166, plus half of the 84-row range (84 * 0.5); 9.6 * (98/49) * (166 + (84 * 0.5)) = 3993.6. Running total: 500 + 3993.6.

One inequality predicate: 61 + 1 + 100 + 2 = 164; 98/49 * 164 = 328. Running total: 500 + 3993.6 + 328.

One inequality predicate: (11.8 * 61/55) + 1 + 100 + 2 = 116.09, plus (43.2 * 61/55) * 0.5; 38.4 * 98/49 * (116.09 + ((43.2 * 61/55) * 0.5)) = 10755.56. Running total: 500 + 3993.6 + 328 + 10755.56.

One inequality predicate: (10.8 * 61/55) + 1 + 100 + 2 = 114.98; 1 * 114.98 = 114.98. Running total: 500 + 3993.6 + 328 + 10755.56 + 114.98.

One inequality predicate: 114.98 (from the previous step), plus (10.8 * 61/55) * 0.5; 14.8 * 149/149 * (114.98 + ((10.8 * 61/55) * 0.5)) = 1790.34. Running total: 500 + 3993.6 + 328 + 10755.56 + 114.98 + 1790.34.

One inequality predicate: 100 + 2 = 102; 149/149 * 102 = 102. Running total: 500 + 3993.6 + 328 + 10755.56 + 114.98 + 1790.34 + 102.

One inequality predicate: 2, plus half of the 100-row range (100 * 0.5); 133.2 * 149/149 * (2 + (100 * 0.5)) = 6926.4. Running total: 500 + 3993.6 + 328 + 10755.56 + 114.98 + 1790.34 + 102 + 6926.4.

One inequality predicate: total. 500 + 3993.6 + 328 + 10755.56 + 114.98 + 1790.34 + 102 + 6926.4 = 24510.88 (actual estimate: 30797.9).

Multiple tables Simple joins One inequality predicate Nothing documented So … let’s speculate! SQL Server 7.0 – 2012: Variation on equality algorithm SQL Server 2014: Same with the simplified histogram

One inequality predicate, SQL Server 2014 (simplified histograms: steps Abercrombie0 (2) and Ackerman0 (2), with a range of 248/199 between them, on both sides). 248 + 2 = 250; 2 * 250 = 500. Running total: 500.

One inequality predicate, SQL Server 2014: 250 * (250 * 0.5) = 31250. Running total: 500 + 31250.

One inequality predicate, SQL Server 2014: total. 500 + 31250 = 31750 (actual estimate: 31750).
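
Since the slides present this algorithm as speculation, the following Python sketch only retraces the arithmetic and makes no claim about what the engine really does. The function and parameter names are hypothetical. The reading assumed here is that the rows at the lowest step match every remaining row on the other side, while each of the remaining rows is assumed to match half of the remaining rows on the other side.

```python
def inequality_join_estimate(first_step_rows, remaining_rows_left, remaining_rows_right):
    """Speculative inequality-join arithmetic from the slides (simplified histograms)."""
    boundary = first_step_rows * remaining_rows_right
    overlap = remaining_rows_left * (remaining_rows_right * 0.5)
    return boundary + overlap


# Slide values: 2 rows at the lowest step, 248 + 2 = 250 remaining rows on each side.
inequality_estimate_2014 = inequality_join_estimate(2, 250, 250)
print(inequality_estimate_2014)
```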

Multiple tables: Complex joins. Two equality predicates, SQL Server 7.0 – 2012: compute selectivity for each predicate, then multiply (assume independence). Cartesian product (estimate): 5875625 (2975 * 1975). Join on LastName only (estimate): 2492.33; Selectivity = 2492.33 / 5875625 = 4.242E-04. Join on FirstName only (estimate): 30884.5; Selectivity = 30884.5 / 5875625 = 5.256E-03. Combined selectivity = 4.242E-04 * 5.256E-03 = 2.230E-06. Estimate = 2.230E-06 * 5875625 = 13.1006 (actual estimate: 13.1007).
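
A minimal Python sketch of this calculation, using a hypothetical function name and the numbers from the slide: each single-predicate join estimate is converted to a selectivity against the Cartesian product, the selectivities are multiplied, and the result is scaled back up.

```python
def join_estimate_independent(rows_left, rows_right, single_predicate_estimates):
    """Multiply per-predicate join selectivities, then scale to the Cartesian product."""
    cartesian = rows_left * rows_right
    combined_selectivity = 1.0
    for estimate in single_predicate_estimates:
        combined_selectivity *= estimate / cartesian
    return combined_selectivity * cartesian


# Join on LastName only: 2492.33 rows; join on FirstName only: 30884.5 rows.
two_predicate_estimate = join_estimate_independent(2975, 1975, [2492.33, 30884.5])
print(round(two_predicate_estimate, 3))
```

The result lands within rounding distance of the actual estimate of 13.1007, which supports the idea that the legacy estimator applies plain independence here.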

Multiple tables: Complex joins. Two equality predicates, SQL Server 7.0 – 2012: compute selectivity for each predicate and multiply (assume independence), or use density for the column combination (multi-column statistics, or statistics for a multi-column index). DISTINCT p.FirstName, p.LastName (estimate): 27550. DISTINCT p2.FirstName, p2.LastName (estimate): 19200. DISTINCT matching combinations: 19200. Rows per combination in p: rows * density = 1.08. Rows per combination in p2: rows * density = 1.03. Estimate: 19200 * 1.08 * 1.03 = 21358.1 (actual estimate: 21327.1).

Multiple tables: Complex joins. Two equality predicates, SQL Server 2014: estimate #distinct combinations on each side; multiply smallest with estimated densities. → Same as SQL Server 7.0-2012 with multi-column statistics (even when there are no multi-column statistics).

Multiple tables Complex joins Equality and inequality SQL Server 7.0 – 2012 Compute selectivity for each predicate Multiply (assume independence) Same as for two equality predicates Multi-column statistics never used

Multiple tables Complex joins Equality and inequality SQL Server 2014 Estimate cardinality of each input Assume each row from large input matches one row from smaller input

Multiple tables Complex joins Join with extra filter predicates on other columns SQL Server 7.0 – 2012 Assume filters are correlated Determine filter selectivity Scale down histograms Align scaled-down histograms

Multiple tables Complex joins Join with extra filter predicates on other columns SQL Server 2014 Assumes no correlation between filters Align original histograms (simplified) Determine filter selectivity Reduce estimate after histogram alignment

THE END. Questions? Email: hugo@perFact.info Twitter: @Hugo_Kornelis
