Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 1 Oracle Active Data Guard Performance Joseph Meeks Director, Product Management Oracle.


Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 1 Oracle Active Data Guard Performance Joseph Meeks Director, Product Management Oracle High Availability Systems

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 2 Note to viewer  These slides provide performance data for various aspects of Data Guard and Active Data Guard – we are in the process of updating them for Oracle Database 12c.  They can be shared with customers, but are not intended as a canned presentation to be delivered in its entirety.  They provide SCs with data that can be used to substantiate Data Guard performance or to provide focused answers to particular concerns expressed by customers.

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 3 Note to viewer  See this FAQ for more customer and sales collateral – 66::::P75_ID,P75_AREAID:21704,2

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 4 Agenda – Data Guard Performance  Failover and Switchover Timings  SYNC Transport Performance  ASYNC Transport Performance  Primary Performance with Multiple Standby Databases  Redo Transport Compression  Standby Apply Performance

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 5 Data Guard 12.1 Example – Faster Failover  Chart: failover time vs. number of database sessions on primary and standby – 43 seconds and 48 seconds, each with 2,000 sessions on both primary and standby (preliminary results)

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 6 Data Guard 12.1 Example – Faster Switchover  Chart: switchover time vs. number of database sessions on primary and standby – 83 seconds with 500 sessions on both primary and standby; 72 seconds with 1,000 sessions on both primary and standby (preliminary results)

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 7 Agenda – Data Guard Performance  Failover and Switchover Timings  SYNC Transport Performance  ASYNC Transport Performance  Primary Performance with Multiple Standby Databases  Redo Transport Compression  Standby Apply Performance

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 8 Synchronous Redo Transport – Zero Data Loss  Primary database performance is impacted by the total round-trip time for an acknowledgement to be received from the standby database – The Data Guard NSS process transmits redo to the standby directly from the log buffer, in parallel with the local log file write – The standby receives the redo, writes it to a standby redo log file (SRL), then returns an ACK – The primary receives the standby ACK, then acknowledges commit success to the application  The following performance tests show the impact of SYNC transport on the primary database using various workloads and latencies  In all cases, transport was able to keep pace with redo generation – no lag  We are working on test data for Fast Sync (SYNC NOAFFIRM) in Oracle Database 12c (same process as above, except that the standby acknowledges the primary as soon as redo is received in memory – it does not wait for the SRL write)
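The commit-visibility rule above can be sketched as a simple latency model. This is an illustration with assumed numbers, not an Oracle-published formula: because NSS ships redo in parallel with the local log write, commit latency is roughly the slower of the two paths.

```python
# Hypothetical latency model for a Data Guard SYNC commit (illustration only;
# the function names and millisecond figures are assumptions).

def sync_commit_ms(local_write_ms, rtt_ms, standby_srl_write_ms):
    """Commit waits for whichever finishes last: the local log write,
    or the remote path (network round trip + standby SRL write)."""
    remote_path_ms = rtt_ms + standby_srl_write_ms
    return max(local_write_ms, remote_path_ms)

def fast_sync_commit_ms(local_write_ms, rtt_ms):
    """Fast Sync (SYNC NOAFFIRM, 12c): the standby ACKs on in-memory
    receipt, so the SRL write drops out of the remote path."""
    return max(local_write_ms, rtt_ms)

# Example: 1 ms local write, 5 ms RTT, 1 ms standby SRL write
print(sync_commit_ms(1, 5, 1))    # 6
print(fast_sync_commit_ms(1, 5))  # 5
```

The model makes the slide's point visible: once RTT dominates the local write time, network latency sets the floor on commit latency under SYNC.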

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 9 Test 1) Synchronous Redo Transport – OLTP with Random Small Inserts, < 1ms RTT Network Latency  Workload: – Random small inserts (OLTP) to 9 tables with 787 commits per second – 132K redo size, 1368 logical reads, 692 block changes per transaction  Sun Fire X4800 M2 (Exadata X2-8) – 1 TB RAM, 64 cores, Oracle Database, Oracle Linux – InfiniBand, seven Exadata cells, Exadata Software  Exadata Smart Flash Cache, Smart Flash Logging, and Write-Back Flash Cache provided significant gains

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 10 Test 1) Synchronous Redo Transport – OLTP with Random Small Inserts, < 1ms RTT Network Latency  Local standby, <1ms RTT (RTT = network round-trip time)  99MB/s redo rate  <1% impact on database throughput  1% impact on transaction rate  Chart: throughput with Data Guard synchronous transport enabled vs. Data Guard transport disabled

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 11 Test 2) Synchronous Redo Transport – Swingbench OLTP Workload with Metro-Area Network Latency  Exadata X2-8, 2-node RAC database – Smart Flash Logging, Write-Back Flash Cache  Swingbench OLTP workload – Random DMLs, 1 ms think time, 400 users, 30MB/s peak redo rate (workload differs from Test 1)  Transaction profile – 5K redo size, 120 logical reads, 30 block changes per transaction  1 and 5ms RTT network latency

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 12 Test 2) Synchronous Redo Transport – Swingbench OLTP Workload with Metro-Area Network Latency  30 MB/s redo  3% impact at 1ms RTT  5% impact at 5ms RTT  Transactions per second: baseline (no Data Guard) 6363 tps; Data Guard SYNC with 1ms RTT 6151 tps; Data Guard SYNC with 5ms RTT 6077 tps

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 13 Test 3) Synchronous Redo Transport  Exadata X2-8, 2-node RAC database – smart flash logging, smart write back flash  Large insert OLTP workload – 180+ transactions per second, 83MB/s peak redo rate, random tables  Transaction profile – 440K redo size, 6000 logical reads, 2100 block changes per transaction  1, 2 and 5ms RTT network latency Large Insert OLTP Workload with Metro-Area Network Latency

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 14 Test 3) Synchronous Redo Transport – Large Insert OLTP Workload with Metro-Area Network Latency  83 MB/s redo  <1% impact at 1ms RTT  7% impact at 2ms RTT  12% impact at 5ms RTT  Transactions per second: baseline (no Data Guard) 189 tps; 1ms RTT 188 tps; 2ms RTT 177 tps; 5ms RTT 167 tps
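The impact percentages can be cross-checked directly from the reported transaction rates. A quick sanity check (the slide's figures are rounded, so small differences are expected):

```python
# Recompute Test 3 impact from the reported tps values:
# baseline 189 tps; 188 / 177 / 167 tps at 1 / 2 / 5 ms RTT.

baseline_tps = 189.0
results = {"1ms": 188, "2ms": 177, "5ms": 167}

for rtt, tps in results.items():
    impact_pct = (baseline_tps - tps) / baseline_tps * 100
    print(f"{rtt}: {impact_pct:.1f}% impact")
# 1ms: 0.5%, 2ms: 6.3%, 5ms: 11.6% -- close to the slide's
# rounded "<1%", "7%", and "12%" figures.
```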

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 15 Test 4) Synchronous Redo Transport – Mixed OLTP Workload with Metro-Area Network Latency  Exadata X2-8, 2-node RAC database – Smart Flash Logging, Write-Back Flash Cache  Mixed workload with high TPS – Swingbench plus large insert workloads – high transaction rate with 112 MB/sec peak redo rate  Transaction profile – 4K redo size, 51 logical reads, 22 block changes per transaction  1, 2 and 5ms RTT network latency

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 16 Test 4) Synchronous Redo Transport Mixed OLTP workload with Metro-Area Network Latency Swingbench plus large insert  112 MB/s redo  3% impact at < 1ms RTT  5% impact at 2ms RTT  6% impact at 5ms RTT Note: 0ms latency on graph represents values falling in the range <1ms

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 17 Additional SYNC Configuration Details – For the Previous Series of Synchronous Transport Tests  No system bottlenecks (CPU, I/O, or memory) were encountered during any of the test runs – Primary and standby databases had 4GB online redo logs – Log buffer was set to the maximum of 256MB – OS max TCP socket buffer size was set to 128MB on both primary and standby – Oracle Net was configured on both sides to send and receive 128MB, with an SDU of 32K – Redo was shipped over a 10GigE network between the two systems – Approximately 8-12 checkpoints/log switches occurred per run

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 18 Customer References for SYNC Transport  Fannie Mae case study that includes performance data  Other SYNC references – Amazon – Intel – MorphoTrak (prior biometrics division of Motorola): case study, podcast, presentation – Enterprise Holdings – Discover Financial Services: podcast, presentation – Paychex – VocaLink

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 19 Synchronous Redo Transport – Caveat that Applies to ALL SYNC Performance Comparisons  Redo rates achieved are influenced by network latency, redo-write size, and commit concurrency – in a dynamic relationship that will vary for every environment and application  Test results illustrate how an example workload can scale with minimal impact on primary database performance  Actual mileage will vary with each application and environment  Oracle recommends that customers conduct their own tests, using their own workload and environment; Oracle tests are not a substitute

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 20 Agenda  Failover and Switchover Timings  SYNC Transport Performance  ASYNC Transport Performance  Primary Performance with Multiple Standby Databases  Redo Transport Compression  Standby Apply Performance

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 21 Asynchronous Redo Transport – Near Zero Data Loss  With ASYNC, the primary does not wait for standby acknowledgement  A Data Guard NSA process transmits directly from the log buffer in parallel with the local log file write – NSA reads from disk (the online redo log file) if the log buffer is recycled before redo transmission is complete  ASYNC has minimal impact on primary database performance  Network latency has little, if any, impact on transport throughput – Uses the Data Guard 11g streaming protocol and correctly sized TCP send/receive buffers  Performance tests are useful to characterize the maximum redo volume that ASYNC can support without a transport lag – The goal is to ship redo as fast as it is generated without impacting primary performance

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 22 Asynchronous Test Configuration Details  100GB online redo logs  Log buffer set to the maximum of 256MB  OS max TCP socket buffer size set to 128MB on primary and standby  Oracle Net configured on both sides to send and receive 128MB  Read buffer size set to 256 (_log_read_buffer_size=256) and archive buffers set to 256 (_log_archive_buffers=256) on primary and standby  Redo is shipped over the IB network between primary and standby nodes, which ensures that transport is not bandwidth constrained – Near-zero network latency, approximate throughput of 1200MB/sec

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 23 ASYNC Redo Transport Performance Test  Data Guard ASYNC transport can sustain very high rates ‒ 484 MB/sec on a single node ‒ Zero transport lag  Add RAC nodes to scale transport performance ‒ Each node generates its own redo thread and has a dedicated Data Guard transport process ‒ Performance will scale as nodes are added, assuming adequate CPU, I/O, and network resources  A 10GigE NIC on the standby receives data at a maximum of 1.2 GB/second ‒ The standby can be configured to receive redo across two or more instances  Chart: Oracle Database redo transport rate, 484 MB/sec

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 24 Data Guard 11g Streaming Network Protocol – High Network Latency has Negligible Impact on Network Throughput  The streaming protocol is new with Data Guard 11g  The test measured throughput with 0 – 100ms RTT  ASYNC tuning best practices – Set the correct TCP send/receive buffer size = 3 x BDP (bandwidth-delay product)  BDP = bandwidth x round-trip network latency – Increase the log buffer size if needed to keep the NSA process reading from memory  See the support note  Query X$LOGBUF_READHIST to determine the buffer hit rate  Chart: redo transport rate (MB/sec) vs. network latency
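The buffer-sizing rule above can be sketched as follows. The 1 Gb/s link and 30 ms RTT are illustrative assumptions, not values from the tests:

```python
# TCP socket buffer sizing per the slide's rule: buffer = 3 x BDP,
# where BDP (bandwidth-delay product) = bandwidth x round-trip latency.

def socket_buffer_bytes(bandwidth_bits_per_sec, rtt_seconds, factor=3):
    """Return the recommended socket buffer size in bytes."""
    bdp_bytes = bandwidth_bits_per_sec / 8 * rtt_seconds
    return int(factor * bdp_bytes)

# Assumed example: 1 Gb/s WAN link with 30 ms RTT
buf = socket_buffer_bytes(1_000_000_000, 0.030)
print(buf)  # 11250000 bytes, i.e. ~11 MB
```

The same rule explains why the tests above set 128MB OS socket buffer limits: a high-bandwidth link with tens of milliseconds of latency needs buffers far larger than OS defaults to keep the pipe full.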

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 25 Agenda  Failover and Switchover Timings  SYNC Transport Performance  ASYNC Transport Performance  Primary Performance with Multiple Standby Databases  Redo Transport Compression  Standby Apply Performance

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 26 Multi-Standby Configurations  A growing number of customers use multi-standby Data Guard configurations  Additional standbys are used for: – Local zero-data-loss HA failover combined with remote DR – Rolling maintenance to reduce planned downtime – Offloading backups, reporting, and recovery from the primary – Reader farms that scale read-only performance  This leads to the question: how is primary database performance affected as the number of remote transport destinations increases?  Diagram: Primary A ships SYNC to Local Standby B and ASYNC to Remote Standby C

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 27 Redo Transport in a Multi-Standby Configuration – Primary Performance Impact: 14 Asynchronous Transport Destinations  Chart: increase in CPU and change in redo volume, each compared to baseline, as the number of ASYNC destinations grows to 14

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 28 Redo Transport in a Multi-Standby Configuration – Primary Performance Impact: 1 SYNC and Multiple ASYNC Destinations  Chart: increase in CPU and change in redo volume, each compared to baseline, for zero, 1/0, 1/1, and 1/14 SYNC/ASYNC destination combinations

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 29 Redo Transport for Gap Resolution  Standby databases can be configured to request log files needed to resolve gaps from other standbys in a multi-standby configuration  A standby database that is local to the primary database is normally the preferred location to service gap requests – A local standby database is least likely to be impacted by network outages – Other standbys are listed next – The primary database services gap requests only as a last resort – Utilizing a standby for gap resolution avoids any overhead on the primary database

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 30 Agenda  Failover and Switchover Timings  SYNC Transport Performance  ASYNC Transport Performance  Primary Performance with Multiple Standby Databases  Redo Transport Compression  Standby Apply Performance

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 31 Redo Transport Compression – Conserve Bandwidth and Improve RPO when Bandwidth Constrained  Test configuration – 12.5 MB/second bandwidth – 22 MB/second redo volume  Uncompressed volume exceeds available bandwidth – Recovery Point Objective (RPO) impossible to achieve – perpetual increase in transport lag  A 50% compression ratio results in: – volume < bandwidth = RPO achieved – the ratio will vary across workloads  Requires Advanced Compression  Chart: transport lag (MB) vs. elapsed time (minutes) – 22 MB/sec uncompressed vs. 12 MB/sec compressed
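The lag behavior follows directly from the rates above. A minimal sketch; the 10-minute window is an assumption, and the 11 MB/sec compressed rate is taken as ~50% of the 22 MB/sec redo volume (the chart shows roughly 12 MB/sec):

```python
# Transport lag arithmetic for the bandwidth-constrained scenario.
# Figures from the slide: 12.5 MB/s bandwidth, 22 MB/s redo volume,
# ~50% compression ratio (so ~11 MB/s compressed -- an assumption).

bandwidth_mb_s = 12.5
redo_uncompressed_mb_s = 22.0
redo_compressed_mb_s = 11.0

def lag_after(seconds, redo_rate_mb_s, bandwidth_mb_s):
    """Transport lag in MB after `seconds`; 0 if bandwidth keeps up."""
    return max(0.0, (redo_rate_mb_s - bandwidth_mb_s) * seconds)

# After 10 minutes (600 s):
print(lag_after(600, redo_uncompressed_mb_s, bandwidth_mb_s))  # 5700.0 MB
print(lag_after(600, redo_compressed_mb_s, bandwidth_mb_s))    # 0.0
```

The uncompressed case accumulates lag at 9.5 MB/s indefinitely, which is why the RPO is unachievable; once compressed volume drops below bandwidth, lag stays at zero and the RPO can be met.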

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 32 Agenda  Failover and Switchover Timings  SYNC Transport Performance  ASYNC Transport Performance  Primary Performance with Multiple Standby Databases  Redo Transport Compression  Standby Apply Performance

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 33 Standby Apply Performance Test  Redo apply was first disabled to accumulate a large number of log files at the standby database; redo apply was then restarted to evaluate the maximum apply rate for this workload  All standby log files were written to disk in the Fast Recovery Area  Exadata Write-Back Flash Cache increased the redo apply rate from 72MB/second to 174MB/second using the test workload (Oracle ) – Apply rates will vary based upon platform and workload  Achieved volumes do not represent physical limits – They only represent the particular test configuration and workload; higher apply rates have been achieved in practice by production customers
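The practical effect of the higher apply rate can be illustrated by the time needed to drain an accumulated redo backlog at the two measured rates. The 500 GB backlog is an assumed figure for illustration:

```python
# Time to apply an accumulated redo backlog at the two measured
# apply rates (72 MB/s without write-back flash, 174 MB/s with).
# The 500 GB backlog size is an assumption, not from the tests.

backlog_mb = 500 * 1024  # 500 GB expressed in MB

for rate_mb_s in (72, 174):
    hours = backlog_mb / rate_mb_s / 3600
    print(f"{rate_mb_s} MB/s -> {hours:.1f} hours")
# 72 MB/s -> ~2.0 hours; 174 MB/s -> ~0.8 hours
```

The same arithmetic applies to steady-state operation: the apply rate must at least match the primary's sustained redo generation rate, or the standby falls progressively further behind.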

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 34 Apply Performance at the Standby Database  Test 1: no Write-Back Flash Cache  Exadata X2-2 quarter rack  Swingbench OLTP workload  72 MB/second apply rate – I/O bound during checkpoints – 1,762ms for checkpoint completion – 110ms DB File Parallel Write

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 35 Apply Performance at the Standby Database  Test 2: a repeat of the previous test, but with Write-Back Flash Cache enabled  Exadata X2-2 quarter rack  Swingbench OLTP workload  174 MB/second apply rate – Checkpoint completes in 633ms vs. 1,762ms – DB File Parallel Write is 21ms vs. 110ms

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 36 Two Production Customer Examples – Data Guard Redo Apply Performance  Thomson Reuters – Data Warehouse on Exadata, prior to Write-Back Flash Cache – While resolving a gap, observed an average apply rate of 580MB/second  Allstate Insurance – Data Warehouse ETL processing resulted in an average apply rate over a 3-hour period of 668MB/second, with peaks hitting 900MB/second

Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 37 Redo Apply Performance for Different Releases – Range of Observed Apply Rates for Batch and OLTP  Chart: standby apply rate (MB/sec) by release
