Antonio Abalos Castillo

Presentation transcript:

How to load your data faster and safer using Change Tracking in SQL Server
Antonio Abalos Castillo

Thank you to our sponsors!

Agenda
  Why faster data loads?
  What is Change Tracking?
  Design overview
  Demo/implementation
  Extra hints

Why faster data loads?
  Corporations load and replicate data in a variety of ways. These approaches often:
    Become unreliable or miss data over time
    Rely on unsupported ways to identify the data increment
    Are difficult to maintain
    Are not optimal at identifying the updated data
    Need extra programming effort
    Do not follow standards

Why faster data loads?
  Benefits of using this approach
    No programming overhead at the source
      Avoids timestamps, row GUIDs or any other programming artifact
      Change Tracking is transparent to applications
      Maintenance cost is effectively zero
    Very low performance impact on the source database
    Multiple target systems can get data from the same source DB using this approach
    We get only the latest version of each row, relative to the last version we loaded. All intermediate row states are skipped
    Running the delta more often decreases the execution time
    MERGE is the fastest data loading method (SCD remains a bad example)
    Minimally logged operations help performance (maybe more than you think)

What is Change Tracking?
  Change tracking is a lightweight solution that provides an efficient change tracking mechanism for applications
  Available since SQL Server 2008
  Requires Standard edition of SQL Server or higher
  Lightweight: the incremental performance overhead associated with using change tracking on a table is similar to the overhead incurred when an index is created for a table and needs to be maintained
  https://technet.microsoft.com/en-us/library/hh710064(v=sql.110).aspx
  https://msdn.microsoft.com/en-us/library/bb933875(v=sql.110).aspx

What is Change Tracking?
  Each insert/update/delete in each table is tracked by:
    The ID columns used in the table
    [Optional] The columns that were updated
  Changes are accumulated and reported by SQL Server according to the last version we got
  https://msdn.microsoft.com/en-us/library/hh710064(v=sql.110).aspx

What is Change Tracking?
  Enable Change Tracking
  Database level:
    ALTER DATABASE AdventureWorks2012
    SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);
  For each audited table:
    ALTER TABLE dbo.SalesOrderDetail
    ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON);

What is Change Tracking?
  Get changes from Change Tracking
  Get the current version:
    DECLARE @ver BIGINT = CHANGE_TRACKING_CURRENT_VERSION();
  Get the minimum valid version:
    DECLARE @mvv BIGINT = CHANGE_TRACKING_MIN_VALID_VERSION(OBJECT_ID('dbo.Sales'));
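The two versions above are what make automatic delta/full load detection possible: if the last version a target loaded is older than the minimum valid version, the tracked changes have already been cleaned up and a delta can no longer be trusted. A minimal sketch of that check follows. The bookkeeping table etl.SyncState and its columns are hypothetical names introduced only for illustration; they are not part of the original slides.

```sql
-- Hypothetical bookkeeping table (an assumption, not from the slides):
-- CREATE TABLE etl.SyncState (TableName sysname PRIMARY KEY, LastVersion BIGINT NULL);

-- Last version this target loaded; NULL means the table was never loaded.
DECLARE @last_ver BIGINT =
    (SELECT LastVersion FROM etl.SyncState WHERE TableName = N'dbo.Sales');

-- Oldest version for which change information is still available.
DECLARE @mvv BIGINT = CHANGE_TRACKING_MIN_VALID_VERSION(OBJECT_ID('dbo.Sales'));

IF @last_ver IS NULL OR @last_ver < @mvv
    PRINT 'Full load required: no usable change history for the last loaded version.';
ELSE
    PRINT 'Delta load possible: read CHANGETABLE(CHANGES ...) from the last loaded version.';
```

After a successful load, the version captured with CHANGE_TRACKING_CURRENT_VERSION() at the start of the run would be stored back as the new LastVersion.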

What is Change Tracking?
  Get changes from Change Tracking
  Get changes for one table:
    DECLARE @last_ver BIGINT = 82;
    SELECT CT.SalesID, CT.SYS_CHANGE_OPERATION, CT.SYS_CHANGE_COLUMNS
    FROM CHANGETABLE(CHANGES dbo.Sales, @last_ver) AS CT;
  https://technet.microsoft.com/en-us/library/cc280358(v=sql.105).aspx
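CHANGETABLE returns only the key columns and change metadata, so a delta extract usually joins back to the tracked table to pick up the current row values. The sketch below shows one way to do that. The Amount column is a hypothetical example column, and the column-mask check assumes the table was enabled with TRACK_COLUMNS_UPDATED = ON.

```sql
DECLARE @last_ver BIGINT = 82;

SELECT
    CT.SalesID,
    CT.SYS_CHANGE_OPERATION,          -- 'I' = insert, 'U' = update, 'D' = delete
    CT.SYS_CHANGE_VERSION,
    S.Amount,                         -- hypothetical column; NULL for deleted rows
    -- Whether the hypothetical Amount column is flagged in the update mask.
    CHANGE_TRACKING_IS_COLUMN_IN_MASK(
        COLUMNPROPERTY(OBJECT_ID('dbo.Sales'), 'Amount', 'ColumnId'),
        CT.SYS_CHANGE_COLUMNS) AS AmountChanged
FROM CHANGETABLE(CHANGES dbo.Sales, @last_ver) AS CT
-- LEFT JOIN keeps deleted rows, which no longer exist in dbo.Sales.
LEFT JOIN dbo.Sales AS S
    ON S.SalesID = CT.SalesID;
```

The LEFT JOIN is the design choice that matters here: an inner join would silently drop the deletes.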

Design overview
  Source
    Change Tracking enabled
    Isolation aware
  ETL
    Minimally logged operations
    Automatic delta/full load detection
  Target
    Staging area
    MERGE delta over target data
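The "MERGE delta over target data" step can be sketched as follows. This is an illustration under stated assumptions, not the presenter's implementation: the staging table stg.SalesDelta and the Amount column are hypothetical, and the staging table is assumed to hold one row per changed key together with its SYS_CHANGE_OPERATION.

```sql
-- Apply the staged delta to the target table in a single statement.
MERGE dbo.Sales AS T
USING stg.SalesDelta AS D               -- hypothetical staging table
    ON T.SalesID = D.SalesID
-- Rows deleted at the source are removed from the target.
WHEN MATCHED AND D.SYS_CHANGE_OPERATION = 'D' THEN
    DELETE
-- Rows that exist on both sides are overwritten with the latest values.
WHEN MATCHED THEN
    UPDATE SET T.Amount = D.Amount
-- New rows are inserted; deletes for rows the target never had are ignored.
WHEN NOT MATCHED BY TARGET AND D.SYS_CHANGE_OPERATION <> 'D' THEN
    INSERT (SalesID, Amount)
    VALUES (D.SalesID, D.Amount);
```

Because Change Tracking reports only the net change per key, a row inserted and then updated since the last load arrives as a single insert, which is why the update and insert branches key on whether the row exists in the target rather than on the operation code.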

Design overview
  Requirements:
    SQL Server source database
    Change Tracking enabled
    Integration Services
    MERGE statements (SQL 2008+)
  Other data sources: Change Data Capture (Oracle)

Demo
  Demo scenario (Windows Azure VNET)
    Server A: SQL Server, source database, Change Tracking
    Server B: SQL Server, target database, logging
    SSIS data flow from Server A to Server B

Extra hints - Best practices
  Transaction isolation strategy
    Enable SNAPSHOT isolation in the source database
    Or create a source snapshot database
    Index maintenance jobs can break big transactions at the source
  Watch out for complex data flows that may need to be broken down into simpler ones
    The best is to have a one-to-one copy of the source table, but this is not always possible
    How do we deal with deleted rows? (joining tables)
    Do we need to track changes in columns?
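The SNAPSHOT isolation option can be sketched like this: the version number and the changed rows are read inside one snapshot transaction, so the extract sees a single consistent point in time. The database name comes from the earlier slides; the Amount column is a hypothetical example, and GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Run once on the source database.
ALTER DATABASE AdventureWorks2012 SET ALLOW_SNAPSHOT_ISOLATION ON;
GO

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
    DECLARE @last_ver BIGINT = 82;      -- last version loaded by this target
    -- Version to store as the new "last version" once this load succeeds.
    DECLARE @next_ver BIGINT = CHANGE_TRACKING_CURRENT_VERSION();

    SELECT CT.SalesID, CT.SYS_CHANGE_OPERATION, S.Amount
    FROM CHANGETABLE(CHANGES dbo.Sales, @last_ver) AS CT
    LEFT JOIN dbo.Sales AS S
        ON S.SalesID = CT.SalesID;
COMMIT TRANSACTION;
```

Without the snapshot, rows could change between reading the version and reading the data, and the next delta might miss or repeat those changes.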

Extra hints - Trick list
  Use trace flag 610 (carefully)
  Use TABLOCK in the destination
  Use the ORDER hint in the destination
  Increase the DFT (Data Flow Task) memory
  Increase the DFT number of rows
  Run parallel tasks
  The Data Loading Performance Guide
  https://msdn.microsoft.com/en-us/library/dd425070.aspx
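In SSIS the table lock is a destination option, but the same idea can be shown in plain T-SQL. The sketch below is only an illustration: stg.SalesExtract, stg.SalesDelta and the Amount column are hypothetical names, and whether the insert is actually minimally logged depends on the recovery model and on the table's indexes.

```sql
-- Optional and session-scoped; test carefully before using it, as the slide warns.
-- DBCC TRACEON (610);

-- Start from an empty staging table for each run.
TRUNCATE TABLE stg.SalesDelta;

-- TABLOCK takes a table-level lock, one of the prerequisites for a minimally logged insert.
INSERT INTO stg.SalesDelta WITH (TABLOCK) (SalesID, Amount, SYS_CHANGE_OPERATION)
SELECT E.SalesID, E.Amount, E.SYS_CHANGE_OPERATION
FROM stg.SalesExtract AS E;
```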

Extra hints - Other tricks
  Databases in "simple" recovery model
  Change page torn detection to NONE
  Create a DATA file group and set it as DEFAULT
  Create as many files as CPUs in each file group (depends on storage)
  Separate the log file from the data files on different disks
  Consider using heaps for fast-load processes
  Consider using partitioned tables for regular tables
  Increase parallelism
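The database-level settings in this list map onto ALTER DATABASE options. A minimal sketch follows, assuming a target database called TargetDW and a file path under D:\Data; both are hypothetical, and PAGE_VERIFY is taken here to be the setting behind "page torn detection".

```sql
-- Hypothetical staging/target database; adjust the name and path to your environment.
ALTER DATABASE TargetDW SET RECOVERY SIMPLE;
ALTER DATABASE TargetDW SET PAGE_VERIFY NONE;

-- Dedicated file group for user data, made the default for new objects.
ALTER DATABASE TargetDW ADD FILEGROUP [DATA];
ALTER DATABASE TargetDW
    ADD FILE (NAME = N'TargetDW_Data1', FILENAME = N'D:\Data\TargetDW_Data1.ndf')
    TO FILEGROUP [DATA];
-- A file group needs at least one file before it can become the default.
ALTER DATABASE TargetDW MODIFY FILEGROUP [DATA] DEFAULT;
```

Turning off page verification trades corruption detection for load speed, so it is a choice for a reloadable staging database rather than for a system of record.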

Extra hints - Security considerations
  Catalog views
    sys.change_tracking_databases
    sys.change_tracking_tables
  Permissions
    SELECT permission on at least the primary key columns of the change-tracked table that is being queried
    VIEW CHANGE TRACKING permission on the table for which changes are being obtained
  https://msdn.microsoft.com/en-us/library/hh710064(v=sql.110).aspx
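A short sketch of how the catalog views and permissions above might be used. The etl_reader principal is a hypothetical database user or role introduced for illustration.

```sql
-- Which databases have Change Tracking enabled, and with what retention?
SELECT DB_NAME(database_id) AS database_name,
       is_auto_cleanup_on,
       retention_period,
       retention_period_units_desc
FROM sys.change_tracking_databases;

-- Which tables in the current database are tracked?
SELECT OBJECT_SCHEMA_NAME(object_id) AS schema_name,
       OBJECT_NAME(object_id)        AS table_name,
       is_track_columns_updated_on,
       min_valid_version
FROM sys.change_tracking_tables;

-- Minimum permissions for the ETL account (hypothetical principal etl_reader).
GRANT SELECT ON OBJECT::dbo.Sales TO etl_reader;
GRANT VIEW CHANGE TRACKING ON OBJECT::dbo.Sales TO etl_reader;
```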

Extra hints - Change Tracking vs. Change Data Capture

  Feature                         Change data capture (CDC)   Change tracking (CT)
  Tracked changes
    DML changes                   Yes                         Yes
  Tracked information
    Historical data               Yes                         No
    Whether column was changed    Yes                         Yes
    DML type                      Yes                         Yes

  CDC: collects historical values, and therefore much more data than CT
  CT: you have no idea how many updates were made to a row, nor the values that were updated
  https://technet.microsoft.com/en-us/library/cc280519(v=sql.105).aspx
  https://msdn.microsoft.com/en-us/library/bb933994.aspx

Other references
  Brent Ozar's guide to Change Tracking
  https://www.brentozar.com/archive/2014/06/performance-tuning-sql-server-change-tracking/
  Good guide for a data load implementation using Change Tracking
  https://www.timmitchell.net/post/2016/01/20/using-sql-server-change-tracking-for-incremental-loads