1 Indexing Your Data Warehouse Troy Gallant, MTA

2 Agenda
- A little about me
- Indexing review
- Enterprise Data Warehouse (EDW) vs. OLTP
- EDW structure
- EDW indexing
  - Too many / too few
  - Considerations
  - Dimension / fact indexing
- Maintenance

4 Bio
- 15 years as a database professional
- Last 2 yrs in NYC, all previous in Jax
- Microsoft MTA certified
- Speaker – 16x SQL Saturday, 4x JSSUG
- Working on MS in IT Mgmt
- Twitter: @GratefulDBA
- LinkedIn: https://www.linkedin.com/in/tgallant
- Website: http://www.troygallant.com
- Email: tgallant@outlook.com

5 Indexing Review
- Broad definition
- What an index DOES do
- What an index DOESN’T do

7 Types of Indexes
- Heap*
- Clustered
- Non-clustered
- Non-clustered w/ included columns
- Unique
- Full-text
- Spatial
- Filtered
- XML
- Columnstore
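
A few of the index types listed above, sketched in T-SQL. The table `dbo.SalesOrder`, its columns, and all index names are hypothetical and exist only to show the syntax; the filtered and columnstore examples assume a SQL Server version that supports those features.

```sql
-- Hypothetical table; with no clustered index defined it is stored as a heap.
CREATE TABLE dbo.SalesOrder (
    SalesOrderID INT           NOT NULL,
    CustomerID   INT           NOT NULL,
    OrderDate    DATE          NOT NULL,
    OrderStatus  VARCHAR(20)   NOT NULL,
    TotalDue     DECIMAL(18,2) NOT NULL
);

-- Clustered: the table's rows are stored in the index, ordered by the key
-- (only one clustered index per table).
CREATE CLUSTERED INDEX CIX_SalesOrder_OrderDate
    ON dbo.SalesOrder (OrderDate);

-- Unique non-clustered: a separate structure that also enforces uniqueness.
CREATE UNIQUE NONCLUSTERED INDEX UX_SalesOrder_SalesOrderID
    ON dbo.SalesOrder (SalesOrderID);

-- Non-clustered with included columns: the INCLUDE columns are stored at the
-- leaf level only, so the index can cover queries that select them.
CREATE NONCLUSTERED INDEX IX_SalesOrder_CustomerID
    ON dbo.SalesOrder (CustomerID)
    INCLUDE (OrderDate, TotalDue);

-- Filtered: indexes only the rows that match the WHERE predicate.
CREATE NONCLUSTERED INDEX IX_SalesOrder_OpenOrders
    ON dbo.SalesOrder (OrderDate)
    WHERE OrderStatus = 'Open';

-- Non-clustered columnstore: column-oriented storage for scan-heavy queries.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_SalesOrder
    ON dbo.SalesOrder (CustomerID, OrderDate, TotalDue);
```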

8 EDW vs. OLTP (pt. 1)
- EDW definition
  - Single, complete, consistent
  - Decision-support
  - Integrate divergent information
  - Historical

9 EDW vs. OLTP (pt. 2)
- Comparisons (EDW vs. OLTP)
  - Integrated data vs. application-specific
  - Current/historical data vs. current data
  - Non-volatile vs. updated
  - Encoded vs. descriptive
  - Detailed/summarized vs. raw

11 EDW Structure
- Source
- Staging
- Storage
  - Dimensions
  - Fact tables
- Presentation

12 EDW Indexing (pt. 1)
- Too few indexes
  - Data loads quickly
  - Query response time (QRT) suffers
- Too many indexes
  - Data loads slowly
  - QRT improves
  - Storage requirements increase

14 EDW Indexing (pt. 2)
- Major considerations
  - Warehouse type
  - Size of tables
  - Access
    - How?
    - Who?
    - What?
  - Storage requirements
  - Response-time expectations

16 EDW Indexing (pt. 3)
- Dimensions
  - Clustered index on the business/natural key
    - Identifier from the source system
    - Enhances response time when this business key is used in a WHERE clause
  - Non-clustered index(es) (NCI)
    - Surrogate key
      - Usually the primary key
      - Meaningful only within the warehouse, not to the source system
      - Will expedite loads
    - Other columns found to be accessed frequently in searches, sorting, or grouping
    - Consider columns included in a hierarchy
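
A minimal sketch of this dimension layout in T-SQL. The table `dbo.DimCustomer`, its columns, and the Country/Region/City hierarchy are assumptions made for illustration; they are not from the slides.

```sql
-- Hypothetical customer dimension.
CREATE TABLE dbo.DimCustomer (
    CustomerKey  INT IDENTITY(1,1) NOT NULL,  -- surrogate key, generated in the warehouse
    CustomerBK   VARCHAR(20)       NOT NULL,  -- business/natural key from the source system
    CustomerName NVARCHAR(100)     NOT NULL,
    Country      NVARCHAR(50)      NOT NULL,
    Region       NVARCHAR(50)      NOT NULL,
    City         NVARCHAR(50)      NOT NULL,
    -- The primary key stays on the surrogate key, but as a non-clustered index.
    CONSTRAINT PK_DimCustomer PRIMARY KEY NONCLUSTERED (CustomerKey)
);

-- Cluster on the business key so WHERE CustomerBK = ... can seek directly.
-- Not declared UNIQUE, since a Type 2 dimension repeats the business key.
CREATE CLUSTERED INDEX CIX_DimCustomer_CustomerBK
    ON dbo.DimCustomer (CustomerBK);

-- NCI on columns that form a hierarchy and are often used to filter or group.
CREATE NONCLUSTERED INDEX IX_DimCustomer_Geography
    ON dbo.DimCustomer (Country, Region, City);
```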

18 EDW Indexing (pt. 4)
- Date & time dimensions
  - No business key
  - Consider a smart PK and cluster on it
    - YYYYMMDD
    - HHMMSSSS
  - A smart key retains proper sort order and simplifies range queries: you need one fewer join because the PK already contains the date/time
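
The smart-key idea might look like the following in T-SQL. The tables `dbo.DimDate` and `dbo.FactOrders`, their columns, and the sample dates are hypothetical; the point is only that a YYYYMMDD integer key lets a range filter run against the fact table without joining to the date dimension.

```sql
-- Hypothetical date dimension clustered on a smart integer key.
CREATE TABLE dbo.DimDate (
    DateKey      INT      NOT NULL,  -- smart key in YYYYMMDD form, e.g. 20240315
    FullDate     DATE     NOT NULL,
    CalendarYear SMALLINT NOT NULL,
    MonthNumber  TINYINT  NOT NULL,
    CONSTRAINT PK_DimDate PRIMARY KEY CLUSTERED (DateKey)
);

-- Derive the key from a date: CONVERT style 112 formats a date as yyyymmdd.
INSERT INTO dbo.DimDate (DateKey, FullDate, CalendarYear, MonthNumber)
SELECT CONVERT(INT, CONVERT(CHAR(8), d.FullDate, 112)),
       d.FullDate,
       YEAR(d.FullDate),
       MONTH(d.FullDate)
FROM (VALUES (CAST('2024-03-15' AS DATE))) AS d (FullDate);

-- Minimal hypothetical fact table that carries the same smart key.
CREATE TABLE dbo.FactOrders (
    DateKey     INT           NOT NULL,
    OrderAmount DECIMAL(18,2) NOT NULL
);

-- Range query on the fact table alone: because the key sorts in date order,
-- no join to dbo.DimDate is needed to restrict to March 2024.
SELECT SUM(f.OrderAmount) AS MarchOrderAmount
FROM dbo.FactOrders AS f
WHERE f.DateKey BETWEEN 20240301 AND 20240331;
```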

20 EDW Indexing (pt. 5)
- Type 2 SCD
  - Consider adding a four-part NCI that includes…
    - The business key
    - The record begin date
    - The record end date
    - The surrogate key
  - CREATE NONCLUSTERED INDEX MyDim_CoveringIndex ON MyDim (NaturalKEY, RecordStartDate) INCLUDE (RecordEndDate, SurrogateKEY), where MyDim stands in for your dimension table name
  - Can be very useful during ETL as well as for historical queries
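
To show why this index helps during ETL, here is a sketch of a surrogate-key lookup that it can cover. The table name `dbo.MyDim`, the column types, the sample values, and the convention that `RecordEndDate` is exclusive (with open rows set to a far-future date) are all assumptions; adjust the date predicates to match your own SCD convention.

```sql
-- Hypothetical Type 2 dimension using the column names from the slide.
CREATE TABLE dbo.MyDim (
    SurrogateKEY    INT IDENTITY(1,1) NOT NULL,
    NaturalKEY      VARCHAR(20)       NOT NULL,
    RecordStartDate DATE              NOT NULL,
    RecordEndDate   DATE              NOT NULL,  -- assumption: exclusive; '9999-12-31' for the current row
    CONSTRAINT PK_MyDim PRIMARY KEY NONCLUSTERED (SurrogateKEY)
);

-- The four-part index: two key columns plus two included columns.
CREATE NONCLUSTERED INDEX MyDim_CoveringIndex
    ON dbo.MyDim (NaturalKEY, RecordStartDate)
    INCLUDE (RecordEndDate, SurrogateKEY);

-- ETL-style lookup: which version of the dimension row was in effect
-- on the date of an incoming fact record? (Sample values are made up.)
DECLARE @NaturalKey VARCHAR(20) = 'CUST-0042';
DECLARE @EventDate  DATE        = '2024-03-15';

SELECT d.SurrogateKEY
FROM dbo.MyDim AS d
WHERE d.NaturalKEY      =  @NaturalKey
  AND d.RecordStartDate <= @EventDate
  AND d.RecordEndDate   >  @EventDate;
```

Every column the query touches is in the index, so the lookup should not need to visit the base table.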

22 EDW Indexing (pt. 6)
- Fact table
  - Similar to indexing a dimension, with an eye towards partitioning
  - Usually best to cluster on the date key or date/time key
  - If the table is partitioned on a date column, use that column as the clustering key
  - Create NCIs on each of the FKs in the fact table
    - Consider combining the FK and date key (in that order) to enhance query response
  - Watch storage requirements
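
One way these fact-table guidelines could be put together in T-SQL. The table `dbo.FactSales`, the partition function and scheme names, the monthly boundary values, and the use of a single filegroup are illustrative assumptions, and table partitioning may require a specific SQL Server edition depending on the version.

```sql
-- Monthly partitions on an integer YYYYMMDD date key (boundaries are examples).
CREATE PARTITION FUNCTION pfSalesByMonth (INT)
    AS RANGE RIGHT FOR VALUES (20240101, 20240201, 20240301);

-- For simplicity, map every partition to the PRIMARY filegroup.
CREATE PARTITION SCHEME psSalesByMonth
    AS PARTITION pfSalesByMonth ALL TO ([PRIMARY]);

-- Hypothetical fact table, partitioned on the date key.
CREATE TABLE dbo.FactSales (
    DateKey     INT           NOT NULL,
    CustomerKey INT           NOT NULL,  -- FK to a customer dimension
    ProductKey  INT           NOT NULL,  -- FK to a product dimension
    SalesAmount DECIMAL(18,2) NOT NULL
) ON psSalesByMonth (DateKey);

-- Cluster on the partitioning column.
CREATE CLUSTERED INDEX CIX_FactSales_DateKey
    ON dbo.FactSales (DateKey)
    ON psSalesByMonth (DateKey);

-- One NCI per foreign key, each combined with the date key (FK first),
-- aligned to the same partition scheme.
CREATE NONCLUSTERED INDEX IX_FactSales_CustomerKey_DateKey
    ON dbo.FactSales (CustomerKey, DateKey)
    ON psSalesByMonth (DateKey);

CREATE NONCLUSTERED INDEX IX_FactSales_ProductKey_DateKey
    ON dbo.FactSales (ProductKey, DateKey)
    ON psSalesByMonth (DateKey);
```

Each additional NCI adds to load time and storage, which is the trade-off the last bullet warns about.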

24 Modifying the Scheme
- Over time your data warehouse will change to accommodate what’s happening in your organization
- Use tried-and-true transactional methods for tuning indexes…
  - DTA (Database Engine Tuning Advisor)
  - Execution plans
  - DMVs
    - sys.dm_db_index_usage_stats
    - sys.dm_db_index_operational_stats
    - sys.dm_db_missing_index_details
    - sys.dm_db_missing_index_columns
    - sys.dm_db_missing_index_group_stats
    - sys.dm_db_missing_index_groups
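
As a starting point, the DMVs above can be queried along these lines. Both queries are sketches: the column selection, the `TOP (20)` cutoff, and the ordering are arbitrary choices. The counters reset when the instance restarts, so read the results with the server's uptime in mind.

```sql
-- How each index in the current database is being used:
-- reads (seeks, scans, lookups) versus maintenance cost (updates).
SELECT OBJECT_NAME(i.object_id) AS TableName,
       i.name                   AS IndexName,
       us.user_seeks,
       us.user_scans,
       us.user_lookups,
       us.user_updates
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS us
       ON us.object_id   = i.object_id
      AND us.index_id    = i.index_id
      AND us.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
ORDER BY us.user_updates DESC;

-- Missing-index suggestions recorded by the optimizer, roughly ranked by
-- how often they were wanted and the estimated benefit.
SELECT TOP (20)
       mid.statement AS TableName,
       mid.equality_columns,
       mid.inequality_columns,
       mid.included_columns,
       migs.user_seeks,
       migs.avg_user_impact
FROM sys.dm_db_missing_index_group_stats AS migs
JOIN sys.dm_db_missing_index_groups AS mig
  ON mig.index_group_handle = migs.group_handle
JOIN sys.dm_db_missing_index_details AS mid
  ON mid.index_handle = mig.index_handle
WHERE mid.database_id = DB_ID()
ORDER BY migs.user_seeks * migs.avg_user_impact DESC;
```

Indexes with many updates but few seeks or scans are candidates for removal; missing-index suggestions are hints to evaluate, not indexes to create blindly.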

25 Thank you!!!
- Twitter: @GratefulDBA
- LinkedIn: https://www.linkedin.com/in/tgallant
- Web: http://troygallant.com
- Email: tgallant@outlook.com

