1 SQL Server 2014: In-Memory OLTP Adoption Considerations
Mike Weiner | @MikeW_CAT | michwe@microsoft.com

2 In-Memory OLTP Architectural Pillars – Review

SQL Server Integration
- Same manageability, administration, and development experience
- Integrated queries and transactions; integrated HA and backup/restore
- Principle: built-in
- Result: hybrid engine and integrated experience

Main-Memory Optimized
- Direct pointers to rows; indexes exist only in memory
- No buffer pool; no write-ahead logging; stream-based storage
- Principle: performance-critical data fits in memory
- Result: speed of an in-memory cache with the capabilities of a database

Non-Blocking Execution
- Multi-version optimistic concurrency control with full ACID support
- Lock-free data structures; no locks, latches, or spinlocks; no I/O during a transaction
- Principle: conflicts are rare
- Result: transactions execute to completion without blocking

T-SQL Compiled to Native Machine Code
- T-SQL compiled to machine code leveraging the VC compiler; a procedure and its queries become a C function
- Aggressive optimizations at compile time
- Principle: push decisions to compilation time
- Result: queries and business logic run at native-code speed

3 Introduction

Application patterns for In-Memory OLTP:
- In-Memory OLTP can meet demanding scale and performance goals with SQL Server
- In-Memory OLTP is a solution for applications suffering from RDBMS performance and scale bottlenecks
- There are common workload patterns which work well with In-Memory OLTP

Adoption methodology and considerations for In-Memory OLTP:
- Evaluate (early) whether In-Memory OLTP can address the application's needs; some workloads may not benefit, and some may benefit only after a large amount of work
- Migrate in stages; there are tools to help
- Consider aspects of the In-Memory OLTP engine in the design

4 Successful Patterns with In-Memory OLTP*

High Data Insert Rate
- Characteristics/challenge: primarily append-only store; inability to ingest the write workload
- Benefits: eliminate contention; minimize logging I/O

Read Performance and Scale
- Characteristics/challenge: high-performance read operations; unable to meet scale-up requirements
- Benefits: eliminate contention; efficient data retrieval; minimize code execution time (CPU efficiency for scale)

Compute-Heavy Data Processing
- Characteristics/challenge: insert/update/delete workload; heavy computation inside the database; read and write contention
- Benefits: eliminate contention; minimize code execution time; efficient data processing

Low Latency
- Characteristics/challenge: requires low-latency business transactions which typical database solutions cannot achieve; high concurrency exacerbates latency
- Benefits: eliminate contention; minimize code execution time; efficient data retrieval

Session State Management
- Characteristics/challenge: heavy insert, update, and point lookups; user scale under load from multiple stateless web servers
- Benefits: eliminate contention; efficient data retrieval; optional I/O reduction/removal

*See the whitepaper "In-Memory OLTP – Common Workload Patterns and Migration Considerations"

5 Early Workload Performance Gains
- Despite 20 years of optimizing for OLTP benchmarks, we still get ~2x on a workload derived from TPC-C
- Apps that take full advantage: e.g., web-app session state
- Apps with periodic bulk updates & heavy random reads
- Existing apps typically see a 4–7x improvement

6 In-Memory OLTP Impact within the Engine

Traditional execution stack: Client Connectivity → Query Execution → Data Access (Buffer Pool) → Transaction Logging
In-Memory OLTP execution stack: Client Connectivity → Procedure Execution → Data Access (Memory) → Transaction Logging

- Client connectivity: no improvement
- Procedure execution and data access: gains of 2x to 30x improvement (including concurrency gains)
- Transaction logging: same latency; potentially less volume for durable I/O; eliminated for non-durable tables

7 Adoption Steps
1. Assess the application: is the application suitable for In-Memory OLTP? Define goals and measure.
2. Identify bottlenecks: "analyze" through MDW and the 'Transaction Performance Analysis Report'.
3. Staged migration: "migrate" utilizing the advisors in Management Studio for help; measure against goals.
4. Understand considerations: memory, storage, indexing.

8 Assess Application
- Invest in workloads that are OLTP in nature, with few exceptions
- Move critical data sets/T-SQL into In-Memory OLTP and add more via subsequent dev/test iterations
- Some applications may require a lot of work to achieve the gains, while others migrate quite easily; if you lack the ability to change code, In-Memory OLTP in SQL Server 2014 won't work
- Some deployments may not be a good fit:
  - Hardware limitations
  - Long-running reporting workloads: row versions can consume a lot of memory if queries are long-running
  - Queries which execute much better with parallelism
  - Dependencies on locking, MARS, unsupported collations, or specialized indexes such as XML/full-text
  - Very dynamic, undefined workload patterns

9 Establish a Performance Baseline
- Production is great: more accurate and realistic data, but access is hard to get and profiling may impact the workload
- Simulation works, but requires care: scale, transaction mix, and the profiling period matter; be as realistic as possible with scale, transaction mix, hardware configuration, etc.
- Baseline using workload tools: ostress, Distributed Replay, Visual Studio test system, etc.
- In-Memory OLTP's Analyze, Migrate and Report toolset
- Performance Monitor/DMVs

10 Identify Bottlenecks
- Table contention: look at sys.dm_db_index_operational_stats, sys.dm_os_latch_stats, and the Latch Waits/sec counter; target the table with the heaviest latching
- Table access: look at sys.dm_db_index_operational_stats; target the table with the heaviest accesses
- Stored procedure CPU usage: look at sys.dm_exec_query_stats (or sys.dm_exec_procedure_stats, as in the queries on the next slide); target the stored procedure with the highest CPU usage
- Tooling in place: In-Memory OLTP's Analyze, Migrate and Report toolset

11 Manual Analysis (if MDW is too much overhead)

Latch and lock waits by table:

SELECT TOP (5)
    a.database_id,
    so.object_id,
    so.name AS TableName,
    a.page_latch_wait_count,
    a.page_latch_wait_in_ms,
    a.page_lock_wait_count,
    a.page_lock_wait_in_ms
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) AS a
INNER JOIN sys.objects AS so
    ON a.object_id = so.object_id
WHERE so.type = 'U'
  AND a.page_io_latch_wait_count > 0
ORDER BY a.page_latch_wait_count DESC;

T-SQL (stored procedure) CPU usage:

SELECT TOP (10)
    sp.database_id,
    so.name AS StoredProcName,
    sp.total_worker_time,
    sp.execution_count,
    sp.total_logical_reads,
    sp.total_logical_writes
FROM sys.dm_exec_procedure_stats AS sp
INNER JOIN sys.objects AS so
    ON sp.object_id = so.object_id
WHERE so.type = 'P'
  AND sp.database_id = DB_ID()
ORDER BY sp.total_worker_time DESC;

12 Testing with a realistic workload is critical

13 Taking an Iterative Approach to Migration
1. Identify bottlenecks in tables
2. Address unsupported constructs and migrate data
3. Perform minimal changes for interpreted access
4. Identify performance-critical transaction code
5. Address surface-area limitations and migrate code


15 Durability Considerations

DURABILITY = SCHEMA_AND_DATA
- Operations on data are always fully logged
- Data is stored in checkpoint files on disk and can be recovered
- Requires storage considerations

DURABILITY = SCHEMA_ONLY
- Transient data with no log or checkpoint files
- The schema can be recovered, but data cannot be recovered after a server restart or failure
- Reduces dependency on disk drives and improves overall transaction throughput
- Not universally applicable: use for select scenarios only (see the sketch below)
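
A minimal sketch of the two durability options, assuming SQL Server 2014 syntax and a database that already has a MEMORY_OPTIMIZED_DATA filegroup; table names, columns, and bucket counts are hypothetical:

-- Durable: data is logged and checkpointed, survives restart
CREATE TABLE dbo.SalesOrder
(
    OrderId INT NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    OrderDate DATETIME2 NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

-- Non-durable: the schema survives restart, the data does not
CREATE TABLE dbo.SessionState
(
    SessionId UNIQUEIDENTIFIER NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    Payload VARBINARY(8000) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);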

16 Storage Considerations for SCHEMA_AND_DATA Objects

Log drive
- Log I/O is still critical to latency
- Index operations and UNDO records are not logged

Data drive: checkpoint file pairs (CFPs), space and performance
- Place checkpoint files on a drive with high sequential I/O; needed for the offline checkpoint, garbage collection (merge: read/consolidate), and recovery
- Spread checkpoint files over an odd number of containers across multiple disks; data/delta allocation is round-robin across all containers, improving streaming I/O bandwidth (see the sketch below)
- The size of the files may be larger than the size of the tables in memory; start by allocating storage up to 4x the size of the durable tables
- An automated merge consolidates CFPs to remove deleted rows; merged CFPs go through state transitions and are eventually garbage collected, which requires multiple checkpoints and log truncation (take log backups per your recovery model)
- Use the workload's steady state to determine disk usage
http://blogs.technet.com/b/dataplatforminsider/archive/2014/01/16/storage-allocation-and-management-for-memory-optimized-tables.aspx
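
A sketch of adding a memory-optimized filegroup with multiple containers on separate disks; the database name and paths are hypothetical:

ALTER DATABASE Sales
    ADD FILEGROUP imoltp_fg CONTAINS MEMORY_OPTIMIZED_DATA;

-- Three containers (an odd number, per the guidance above);
-- data/delta files are allocated round-robin across them
ALTER DATABASE Sales
    ADD FILE (NAME = 'imoltp_c1', FILENAME = 'E:\imoltp\c1') TO FILEGROUP imoltp_fg;
ALTER DATABASE Sales
    ADD FILE (NAME = 'imoltp_c2', FILENAME = 'F:\imoltp\c2') TO FILEGROUP imoltp_fg;
ALTER DATABASE Sales
    ADD FILE (NAME = 'imoltp_c3', FILENAME = 'G:\imoltp\c3') TO FILEGROUP imoltp_fg;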

17 Memory Considerations

Memory allocation
- Size of rows (data types) and indexes, plus row versions
- Row versions must stay in memory until every transaction that could still read them has completed
- Garbage collection is asynchronous (unless under memory pressure) and can take time to remove rows
- Nonclustered hash index: fixed size, a function of bucket count, 8 bytes per bucket
- Nonclustered index: dynamic size, a function of row count plus overhead

Recommendations
- Around 250 GB max for SCHEMA_AND_DATA tables in a database
- For a 'common' OLTP workload, plan a 2x memory allocation (2 row versions)
- Consider other workloads which may require memory under the SQL Server process
- Use Resource Governor to control memory allocated at the database level: CREATE RESOURCE POOL … MAX_MEMORY_PERCENT; pools allocate only a percentage (around 80%) to memory-optimized objects for system stability (see the sketch below)
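
A sketch of binding a database to a resource pool with a memory cap; the pool and database names are hypothetical:

CREATE RESOURCE POOL imoltp_pool WITH (MAX_MEMORY_PERCENT = 80);
ALTER RESOURCE GOVERNOR RECONFIGURE;

-- Bind the database to the pool; the binding takes effect after the
-- database goes offline and back online
EXEC sp_xtp_bind_db_resource_pool N'Sales', N'imoltp_pool';
ALTER DATABASE Sales SET OFFLINE WITH ROLLBACK IMMEDIATE;
ALTER DATABASE Sales SET ONLINE;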

18 Estimating Memory Consumption
Memory Size = Table Size + SUM(Index Sizes)
- Table Size = Row Size * Row Count
- Row Size = SUM(Column Sizes) + Row Header Size
- Row Header Size = 24 + 8 * Index Count (bytes)
- Column Size = dependent on the column type and associated padding/overhead
- Hash Index Size = Bucket_Count * 8 bytes
- Nonclustered Index Size = Row Count * (Key Size + 8) bytes
A worked example follows. Details are available in Books Online: http://msdn.microsoft.com/en-us/library/dn205318(v=sql.120).aspx
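
A worked example under hypothetical assumptions: 1,000,000 rows, 40 bytes of column data per row, and two indexes (one hash index created with BUCKET_COUNT = 2,000,000, which the engine rounds up to the next power of two, 2,097,152; and one nonclustered index with an 8-byte key):

Row Header Size = 24 + 8 * 2 = 40 bytes
Row Size = 40 + 40 = 80 bytes
Table Size = 80 bytes * 1,000,000 rows ≈ 80 MB
Hash Index Size = 2,097,152 buckets * 8 bytes ≈ 17 MB
Nonclustered Index Size = 1,000,000 * (8 + 8) bytes ≈ 16 MB
Memory Size ≈ 113 MB before row versions; plan roughly 2x for a common OLTP workload.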

19 Further System Considerations
- No need to re-invest in hardware (although likely more memory)
- Processor: number of cores/sockets matters; all deployments so far have been 2–4 sockets; the engine is not NUMA-aware
- Network: bandwidth, latency, and configuration are still critical; Receive-Side Scaling configurations can become important
- Consider batching client calls to reduce the number of client/server round-trips and lower latency

20 Index Guidelines

Nonclustered memory-optimized index
- Execution is similar to a nonclustered index on disk-based tables
- Choose an index direction that conforms to the seek direction; the index returns rows only in the order defined for its key columns, so it should match the ORDER BY

Nonclustered hash memory-optimized index
- Has a bucket_count requirement: a bucket count that is too low or too high impacts read and write behavior and memory
- Performance is generally good up to a 5x bucket count; if the average number of key values per bucket is > 10, consider a nonclustered index instead
- Usually better to overestimate the bucket count than to underestimate it
- Queries on a subset of the columns in the hash index result in scans; the index key should be exactly the columns searched on (the DMV sketch below helps validate bucket counts)

Recall that indexes need to be re-created on startup.
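
A sketch for checking hash-index health via sys.dm_db_xtp_hash_index_stats; the chain-length threshold in the comment reflects the guidance above:

-- Long average chains (> ~10 rows per bucket) suggest too few buckets or a
-- non-selective key; many empty buckets suggest the count is oversized
SELECT OBJECT_NAME(hs.object_id) AS TableName,
       i.name AS IndexName,
       hs.total_bucket_count,
       hs.empty_bucket_count,
       hs.avg_chain_length,
       hs.max_chain_length
FROM sys.dm_db_xtp_hash_index_stats AS hs
JOIN sys.indexes AS i
    ON hs.object_id = i.object_id
   AND hs.index_id = i.index_id;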

21 Index Operations: Nonclustered vs. Nonclustered Hash

Specify bucket count
- Nonclustered: not required
- Nonclustered hash: required

Index scan, retrieve all table rows
- Nonclustered: supported
- Nonclustered hash: supported

Index seek on equality predicates (=)
- Nonclustered: supported*
- Nonclustered hash: performs better than nonclustered; the predicate must include all columns in the hash key, otherwise it results in a scan

Index seek on inequality predicates (>, <, >=, <=), BETWEEN
- Nonclustered: performs better than nonclustered hash*
- Nonclustered hash: results in a scan

Retrieve rows in a sort order matching the index definition
- Nonclustered: supported
- Nonclustered hash: not supported

Retrieve rows in a sort order matching the reverse of the index definition
- Nonclustered: not supported
- Nonclustered hash: not supported

*The predicate does not need to include all key columns, but must include at least the leading key columns; otherwise it results in a scan. A definition sketch of both index types follows.
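
A sketch declaring both index types on one table; names and the bucket count are hypothetical. In SQL Server 2014, indexes on memory-optimized tables must be declared inline with CREATE TABLE, and string index key columns require a BIN2 collation:

CREATE TABLE dbo.Quote
(
    QuoteId INT NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 2000000), -- equality lookups
    Symbol NVARCHAR(10) COLLATE Latin1_General_100_BIN2 NOT NULL,
    QuoteTime DATETIME2 NOT NULL,
    -- range predicates and ordered retrieval matching the declared direction
    INDEX ix_Symbol_Time NONCLUSTERED (Symbol, QuoteTime DESC)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);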

22 Interpreted Transact-SQL vs. Natively Compiled

Interpreted T-SQL access
- Accesses both memory-optimized and disk-based tables
- Less performant
- Virtually the full T-SQL surface area
- When to use: ad hoc queries; surface area not available natively; compilation at runtime can impact the plan (positively)

Natively compiled procedures
- Access only memory-optimized tables
- Maximum performance
- Limited T-SQL surface area
- When to use: well-defined OLTP-style operations; performance-critical business logic; query plan unlikely to change based on statistics (see the sketch below)
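
A minimal natively compiled procedure sketch, reusing the hypothetical dbo.SalesOrder table from the durability example. SQL Server 2014 requires SCHEMABINDING, an EXECUTE AS clause, and an ATOMIC block:

CREATE PROCEDURE dbo.usp_InsertSalesOrder
    @OrderId INT,
    @OrderDate DATETIME2
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH
    (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    -- Compiled to machine code at CREATE time; no interpretation at execution
    INSERT INTO dbo.SalesOrder (OrderId, OrderDate)
    VALUES (@OrderId, @OrderDate);
END;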

23 Query Performance
- Lack of parallelism can be problematic for some queries; consider the trade-off between faster (native) execution and parallel operation
- Joins between a memory-optimized table and a columnstore index can be problematic
- Statistics creation and updates can be problematic for large tables: the FULLSCAN requirement can take a long time, and if a query references many attributes, creating per-column statistics can have an impact
- Consider the impact of no auto-update of statistics:
  - For interop queries: determining how often and when to update can affect plans
  - For native procedures: load the data, update statistics, then create the natively compiled procedure (see the sketch below)
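
A sketch of the recommended order, reusing the hypothetical table and procedure from the earlier sketches; statistics on memory-optimized tables in SQL Server 2014 must be updated manually with FULLSCAN and NORECOMPUTE:

-- 1. Load the data (bulk insert, migration, etc.)
-- 2. Update statistics manually; there is no auto-update for memory-optimized tables
UPDATE STATISTICS dbo.SalesOrder WITH FULLSCAN, NORECOMPUTE;
-- 3. Create (or re-create) natively compiled procedures so their plans
--    compile against the fresh statistics, e.g. dbo.usp_InsertSalesOrder above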

24 Recap: Migration Steps
1. Assess the application: is the application suitable for In-Memory OLTP? Define goals and measure.
2. Identify bottlenecks: "analyze" through MDW and the 'Transaction Performance Analysis Report'.
3. Staged migration: "migrate" utilizing the advisors in Management Studio for help; measure against goals.
4. Understand considerations: memory, storage, indexing.

25 Review

Application patterns for In-Memory OLTP:
- In-Memory OLTP can meet demanding scale and performance goals with SQL Server
- In-Memory OLTP is a solution for applications suffering from engine performance and scale bottlenecks
- There are common workload patterns which work well with In-Memory OLTP

Adoption methodology and considerations for In-Memory OLTP:
- Evaluate (early) whether In-Memory OLTP can address the application's needs; some workloads may not benefit, and some may benefit only after a large amount of work
- Migrate in stages; there are tools to help
- Consider aspects of the In-Memory OLTP engine in the design

26 References

