Presentation is loading. Please wait.

Presentation is loading. Please wait.

Copyright © 2006 Quest Software RAC be Nimble, RAC be Quick Bert Scalzo, Domain Expert, Oracle Solutions

Similar presentations


Presentation on theme: "Copyright © 2006 Quest Software RAC be Nimble, RAC be Quick Bert Scalzo, Domain Expert, Oracle Solutions"— Presentation transcript:

1 Copyright © 2006 Quest Software RAC be Nimble, RAC be Quick Bert Scalzo, Domain Expert, Oracle Solutions

2 1 About the Author … Domain Expert & Product Architect for Quest Software Oracle Background: Worked with Oracle databases for well over two decades (starting with version 4) Work history includes time at both Oracle Education and Oracle Consulting Academic Background: Several Oracle Masters certifications BS, MS and PhD in Computer Science MBA (general business) Several insurance industry designations Key Interests: Data Modeling Database Benchmarking Database Tuning & Optimization "Star Schema" Data Warehouses Oracle on Linux – and specifically: RAC on Linux Articles for: Oracle’s Technology Network (OTN) Oracle Magazine, Oracle Informant PC Week (eWeek) Articles for: Dell Power Solutions Magazine The Linux Journal

3 2 Books by Author … Potentially in 2008 …

4 3 Agenda RAC Performance –“RAC in the Box” syndrome –Must Optimize every subsystem RAC Optimization Approach –RAC Optimization Methodology –via Quest’s RAC Tools (just example - not a sales pitch) Real-world Scenario –Dell’s use of Quest’s Solution for Oracle RAC –“Best Practices” applied incrementally & results

5 4 Optimization Focus “Low Hanging Fruit” Obvious yet Overlooked Subtle yet Highly Critical Radically Different than other RAC tuning sessions: Not going to delve into obscure hardware, OS, network, Oracle and RAC tuning parameters or configurations, Just easy stuff that makes a big difference

6 5 Oracle RAC is Great, but … Too often people expect RAC to “auto-magically” function out of the box with little to no optimization During RAC optimization attempts, people far too often concentrate on just a single dimension – the Oracle database “stuff” (i.e. waits, parameters, etc) During RAC optimization attempts, Oracle is often too heavily weighted as the source and reason for most, if not all, of the performance bottlenecks Not enough “application nature” is identified and accounted for during the optimization process Result: Too many people achieve sub-par results!

7 6 I call it “RAC in the BOX” syndrome That’s just my stupid name for it (hope it catches) But nearly half the RAC sites I visit are suffering from RAC performance issues related to this! Still far too RAC experts among the general DBA population (although improving every day) Really no simple, single button tools yet to make RAC “auto-magically” “fire on all cylinders” Many people bail on RAC far too soon and simply fall-back to big SMP boxes (the evil that they know) But with just a little manual “box winding”, anyone should be able to “pop the RAC weasel” free

8 7 RAC is System, Must Tune its Entirety

9 8 RAC Performance = Sum of its Parts Application Nature (affects everything else below) Public Network Storage Network Storage Sub-System Oracle Instance Configuration Oracle Cluster Configuration Private Network (i.e. Interconnect)

10 9 RAC Performance Testing Methodology 1. Benchmark Factory tests or Oracle trace file playback generator 3. TOAD with DBA Module “AWR/ADDM Reports “ Identify and correct Top five wait events RAC Performance Testing Methodology 2. Spotlight on RAC Monitor RAC system for performance bottlenecks

11 10 Quest Solutions for Oracle RAC Benchmark Factory ® Test Oracle RAC environments rapidly and reliably Perform database “scalability” or “goal” testing to determine the most optimal RAC configuration Tests: –TPC-C –TPC-H –Trace File playback –etc, etc, etc … Let’s DBA concentrate on task at hand - Optimization

12 11

13 12 Quest Solutions for Oracle RAC Spotlight ® on RAC Monitor Oracle RAC environments rapidly and reliably Diagnose Oracle RAC environment health levels at –Node –Cluster –ASM –Instance –Interconnect Intelligent performance alerts plus market-leading GUI for entire RAC to instances architecture & bottlenecks Let’s DBA concentrate on task at hand - Diagnosing

14 13

15 14

16 15

17 16 Quest Solutions for Oracle RAC TOAD ® with DBA Module Expedite typical DBA management & tuning tasks Great Productivity Enhancing Features –Database Health Check –Database Probe –Database Monitor –AWR/ADDM Reports –UNIX Monitor –Stats Pack Reports (coming soon) See Toad World paper –Title: “Maximize Database Performance Via Toad for Oracle” –http://www.toadworld.com/Education/ToadWorldPapersandPodcasts/tabid/82/Default.aspxhttp://www.toadworld.com/Education/ToadWorldPapersandPodcasts/tabid/82/Default.aspx Let’s DBA concentrate on task at hand – Correcting (i.e. Fixing)

18 17

19 18 Real-world Scenario Quest strategic partner & customer Dell uses Quest’s solution for Oracle RAC to test the performance of the Oracle RAC architecture running on Dell Power Edge servers and EMC Clarion SAN & iSCSI Disk Arrays

20 19 DELL Success Story

21 20 Database Configuration used at DELL for the RAC test environment DELL Success Story

22 21 DELL Success Story Near Linear Scalability!

23 22 Step-By-Step Example Apply Methodology, “Best Practices”, and Quest’s RAC tools to optimize and quantify the approximate percentage of the improvements Note – will quote some specific examples for a given RAC setup, your mileage will surely vary Test = TPC-C (OLTP) for users, 10 GB

24 23 Step 1 - Application Nature Know Your Application Demands (this info flows downstream) OLTP vs. Data Warehousing –Primarily Read vs. Write –Average Transaction Size –Likelihood of “Dead Lock” –Logging, Flashback and Recovery Requirements –etc, etc, etc … Concurrent User Load Profile (i.e. user load over time) Focus on User Response Time Requirements –For example, TPC-C must run each transaction <= 2 seconds –Response Time ~= Wait Events Cary Millsap (Hotsos) calls this “Method R” Anjo Kolk & et al … (Oracle) call this “YAPP Method” Kyle Haily (PerfVision) paper “Waits Defined” etc, etc, etc … Don’t skip this step – cost can be enormous – and that no network, OS, or database tuning can compensate for !!!!!

25 24 Application “Best Practices” Well known rules: Write efficient SQL and/or PL/SQL code (explain plans) Use bind variables to reduce unnecessary “re-parsing” Often the underlying Application Code (e.g. Benchmark) NOT changeable, so you can’t do anything Deeper TPC-C Analysis (remember across all next steps): Primarily Read with Some Writes Small Average Transaction Size High Concurrency with Potential Deadlocks Logging for ACID compliance and no flashback

26 25 Public Network “Best Practices” Well known rules: Isolate Network (for single or related applications only) Use Gigabit Ethernet (consider “bonding” multiple cards) Use Layer 2 or 3 Switches and verify Gigabit throughput TPC-C Analysis Ramifications: Primarily Reads = Nothing Small Transaction = No jumbo frames, Standard SDU/TCU High Concurrency = Multiple Ethernet Segments (collisions) No Logging, etc… = Nothing

27 26 Storage Network “Best Practices” Well known rules: Isolate Network (for single or related applications only) Use Fiber Channel for SAN, 10GB Ethernet for NAS/iSCSI Consider multiple pathways per storage controller and HBA Consider TCP/IP offload engine (TOE) NIC’s or iSCSI HBA’s TPC-C Analysis Ramifications: Primarily Reads = Nothing Small Transaction = Jumbo frames since “Block Level” IO High Concurrency = Fiber Channel and Multiple Pathways (if budget) No Logging, etc… = Nothing

28 27 Storage Sub-System “Best Practices” Well known rules: More Smaller Disks generally higher overall throughput More memory cache generally higher overall throughput Avoid “write-back” mode if no backup power source (e.g. battery) Align Stripe Boundaries: drive, OS block, LVM, file sys, database block, etc Stripe Depth (i.e. size) from 256 KB to 1 MB Stripe Width (i.e. # disks) between 4 and 16 Stripe Depth = Stripe Width X Drive IO Size = One IO per Disk per IO request Average I/O <= Stripe Width X Stripe Depth Write-intensive = RAID 0+1/1+0 and Read-intensive = RAID 3 (sequential) or 5 (scattered) TPC-C Analysis Ramifications: Primarily Reads = RAID 5, Adjust cache memory allocations & look-ahead algorithms Small Transaction = Stripe Depth >= db_block_size X db_file_multiblock_read_count High Concurrency = Low Deadlock, so spread DB objects and partitions across LUN’s No Logging, etc… = Logging but no flashback, so write IO is reasonable, so RAID 5 OK

29 28 Oracle Instance “Best Practices” Well known rules: Size & Tune the SGA appropriately for application nature Choose reasonable block size based on application nature (?8K?) Partition large objects & indexes across storage devices & spindles Don’t assume any “golden rules” – i.e. test all assumptions! TPC-C Analysis Ramifications: Primarily Reads = opt_index_caching=80, opt_index_adj_cost=20 Small Transaction = Size redo logs correctly for small size X high load High Concurrency = cursor_space_for_time=t, cursor_sharng=similar No Logging, etc… = Turn off “Recycle Bin” but keep “LOGGING”

30 29 Oracle Cluster “Best Practices” Well known rules: Increase default SGA size for all ASM instances (64M too small) Interconnect is the most important bottleneck – many bonded NIC’s Consider hash partitions & reverse indexes to spread IO across nodes Don’t assume any “golden rules” – i.e. really test all assumptions!!! TPC-C Analysis Ramifications: Primarily Reads = Nothing Small Transaction = Decrease db_file_multiblock_read_count High Concurrency = Decrease block size (?4K?) – see next slide No Logging, etc… = Nothing

31 30 Single Instance – No Block Contention

32 31 Cluster – Block Contention Costs

33 32 Private Network “Best Practices” Well known rules: Isolate Network (for single cluster only – and 0% public) Use 10GB Ethernet or Inifini-Band (Dell found 15% RTI) Consider multiple pathways per HBA and storage controller Jumbo frames since high “Block Level” IO between nodes TPC-C Analysis Ramifications: Primarily Reads = Nothing Small Transaction = Nothing High Concurrency = Lower block size until interconnect traffic OK –Consider increasing the OS priority of the global cache cluster services No Logging, etc… = Nothing

34 33 Let’s Apply the Recommendations Read Count Block Size Cursor Space Cursor Share Index Cache Index Cost Jumbo Test 1168FalseExact0100False Test 228FalseExact0100False Test 324FalseExact0100False Test 424True 8020False Test 524True 8020True

35 34 Results – TPS (lesser interest here)

36 35 Results – Average Response Time Sub Second

37 36 Thank you Please offer any questions or comments Remember: Eliminate “RAC in the Box” syndrome – eat low hanging fruit Example was for TPC-C or OLTP type application TPC-H or Data Warehouse would NOT be the same Your mileage may well vary (especially percentages) Toad World Article: “Maximize Database Performance Via Toad for Oracle” Dell Power Solutions article:


Download ppt "Copyright © 2006 Quest Software RAC be Nimble, RAC be Quick Bert Scalzo, Domain Expert, Oracle Solutions"

Similar presentations


Ads by Google