
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Provisioning Storage for Oracle Database with ZFS and NetApp Mike Carew Oracle University UK

Content
In this presentation:
- Background: some interesting things about disks
- Key features of ZFS-managed storage
- Key features of NetApp-managed storage
- Provisioning storage for Oracle DB using ZFS
- Provisioning storage for Oracle DB using NetApp

A Few Interesting Things About Disks
There are two categories for all disks (including FC, SAS, SATA, PATA, SCSI, SSD):
- Failed ... and
- Failing
They are always a disappointment in:
- size
- speed
- reliability
We know this already; it is why we have RAID systems.

Trends in Storage
As disk capacity increases:
- MTBF decreases
- the disk bottleneck increases
Uncorrectable bit error rates have stayed roughly constant:
- 1 in 10^14 bits for desktop-class drives (10^14 bits / 8 is roughly 12 TB)
- 1 in 10^15 bits (~120 TB) for enterprise-class drives (allegedly)
- In practice, a bad sector every 8-20 TB (desktop and enterprise alike)

Some Facts: Measurements at CERN
How valuable is my data? How secure is my data on disk?
CERN wrote a simple application to write and then verify a 1 GB file:
- write 1 MB, sleep 1 second, and so on, until 1 GB has been written
- read 1 MB, verify, sleep 1 second, and so on
It ran continuously on servers with traditional hardware RAID. After 3 weeks they had found 152 instances of silent data corruption, having previously thought "everything was fine". Traditional hardware RAID only detected the "noisy" data errors; end-to-end verification is needed to catch silent data corruption.
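A minimal shell sketch of such a write/verify probe, in the spirit of the CERN test (the file path and sizes are illustrative, not CERN's actual tool):

    #!/bin/sh
    # Write a known 1 MB pattern 1024 times (1 GB total), pausing between
    # writes; then read each chunk back and compare checksums.
    F=/data/probe.dat                                  # hypothetical test file
    dd if=/dev/urandom of=/tmp/chunk bs=1048576 count=1 2>/dev/null
    SUM=$(cksum < /tmp/chunk)

    i=0
    while [ $i -lt 1024 ]; do
        dd if=/tmp/chunk of=$F bs=1048576 seek=$i conv=notrunc 2>/dev/null
        sleep 1
        i=$((i + 1))
    done

    i=0
    while [ $i -lt 1024 ]; do
        GOT=$(dd if=$F bs=1048576 skip=$i count=1 2>/dev/null | cksum)
        [ "$GOT" = "$SUM" ] || echo "silent corruption in chunk $i"
        sleep 1
        i=$((i + 1))
    done

Any mismatch reported here is exactly the class of error the RAID controller never surfaced.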

ZFS Key Features
- Pooled storage: defines the physical aspects of capacity and redundancy (see the sketch below)
- Transactional object store: the file system is always consistent
  - The application still has to deal with file content consistency, but ZFS manages the file system consistency
- End-to-end data integrity: recognition of, and recovery from, bit rot, lost writes, misdirected writes, phantom writes
- Snapshot backup through copy-on-write: lightweight, fast, low cost
- Unparalleled scalability
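A minimal sketch of pooled storage, assuming hypothetical device names (c0t1d0, c0t2d0) and pool/file system names reused in the rest of these examples:

    # Create a mirrored pool; capacity and redundancy are defined here, once.
    zpool create dbpool mirror c0t1d0 c0t2d0

    # File systems draw space from the pool; no partitioning or newfs step.
    zfs create dbpool/oradata
    zfs list dbpool/oradata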

ZFS Data Authentication
The checksum of each block is stored with its parent data structure. Isolating the checksum from the data it describes means the data can be genuinely validated.
- Safeguards against: bit rot, phantom writes, misdirected reads and writes, DMA parity errors, driver bugs, accidental overwrite

ZFS Self Healing
With redundant storage, ZFS detects a bad block via the checksum stored in the parent structure, reconstructs the data from an alternative copy, and rewrites the defective block to heal the data.
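A pool scrub exercises exactly this detect-and-repair path across every block; a minimal sketch, assuming the dbpool pool from the earlier example:

    # Verify every checksum in the pool, repairing from redundancy as needed.
    zpool scrub dbpool

    # The CKSUM column counts blocks that failed verification and were healed.
    zpool status -v dbpool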

Virtual Devices and Dynamic Striping
ZFS dynamically stripes data across all of the top-level virtual devices.
[Diagram: data striped across two stand-alone 36 GB devices, and across two 36 GB mirror devices]
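A minimal sketch of widening the stripe, continuing with the hypothetical dbpool from above:

    # Add a second top-level mirror vdev; ZFS immediately begins striping
    # new writes dynamically across both mirrors.
    zpool add dbpool mirror c0t3d0 c0t4d0
    zpool status dbpool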

RAID-Z Dynamic Stripe Width
- All writes are full-stripe writes, adjusted to the size of the I/O
- Each logical block is its own stripe
- Stripes are written to vdevs
- Avoids read-modify-write cycles
- Record size / block size / stripe size needs consideration for database use
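A minimal sketch of a single-parity RAID-Z vdev (device names hypothetical); note the later caveat that mirrors are preferred where IOPS matter:

    # Five-disk RAID-Z: data plus distributed parity, with full-stripe
    # writes sized to each logical block.
    zpool create zpool1 raidz c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0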

NetApp Key Features
- Write Anywhere File Layout (WAFL)
  - Coalesces otherwise random writes into contiguous sequential I/O
- Snapshots by reference: lightweight, low cost, fast
- Write optimized (and correspondingly not read optimized)
- NVRAM write cache for write performance and commitment
- Mature data management applications: data backup, DR replication, application integration (e.g. SnapManager for Oracle), all based around the snapshot
- ONTAP 8 Cluster-Mode has scale-out capabilities offering very high scalability

NetApp Data Authentication
Block checksums are co-located with the data.
- Not as extensive as the ZFS measures
- Safeguards against:
  - bit rot
- Other measures (RAID scrubbing) are needed to safeguard against:
  - phantom writes
  - misdirected writes

NetApp Disk Aggregation Technology
Dual-parity RAID groups: RAID-DP. The RAID group is the protection boundary; RAID-DP protects against two concurrent disk failures within the RAID group. DP is the only practical choice! Everything operates against a fixed-size file system block of 4 KB (the 4 KB WAFL block size is not configurable or negotiable).
[Diagram: a RAID group of data disks plus a parity disk and a double-parity disk]

NetApp Disk Aggregation Concept
The aggregate: aggregates are constructed from one or more RAID groups.
[Diagram: aggr0 built from RAID groups rg0, rg1 and rg2, each containing data (D), parity (P) and double-parity (DP) disks]
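A minimal sketch in 7-Mode ONTAP syntax (aggregate name, disk count and RAID group size are illustrative; exact options vary by ONTAP version):

    # Build a 16-disk RAID-DP aggregate; ONTAP lays out the RAID group(s).
    aggr create aggr1 -t raid_dp -r 16 16

    # Show the RAID group layout within the aggregate.
    aggr status -r aggr1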

NetApp Disk Space Allocation: Flexible Volumes
[Diagram: WAFL overhead takes roughly 10% of raw aggregate space; an adjustable aggregate snapshot reserve (5%) is set aside; each FlexVol defaults to an 80/20 split between the active file system and its .snapshot reserve]
- Aggregates: the NetApp aggregate is equivalent to the ZFS pool. It represents the usable capacity of the disks.
- Flexible volumes: the means of using space. They contain NAS file systems or SAN LUNs (you choose) and can be resized easily.
- Snapshot reserve: management of snapshot backup space is through the snapshot reserve.
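A minimal 7-Mode sketch of carving a flexible volume out of the aggregate and adjusting its snapshot reserve (names and sizes illustrative):

    # Create a 500 GB flexible volume inside aggr1.
    vol create oradata aggr1 500g

    # Trim the default 20% volume snapshot reserve down to 10%.
    snap reserve oradata 10

    # Volumes resize easily: grow by another 100 GB.
    vol size oradata +100g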

Disk and Data Protection
Data ONTAP protects against media flaws, misdirected writes and lost writes in several ways:
- RAID-4 and RAID-DP: protect against disk failure
- Media scrubbing: periodically checks block data against checksums
  - bit rot
- RAID scrubbing: periodically checks parity; checking in two dimensions is good
  - lost writes
  - misdirected writes
  - phantom writes

Provisioning Storage for Oracle with ZFS: Array Considerations
- ZFS is designed to work with JBOD and disk-level caches
- Arrays with NVRAM write caches should ignore ZFS cache-flush requests
- The general ZFS rules about using whole disks apply
- If a hardware RAID array is used to present LUNs, the quantity of LUNs should equal the number of physical disks
- Avoid dynamic space provisioning arrays for allocating LUNs for ZFS: ZFS uses the whole LUN space quickly, negating the benefits of thin and dynamic provisioning

Provisioning Storage for Oracle with ZFS: Pool Considerations
- If the array technology gives enough redundancy, use it; duplicating the protection may work against you
- ZFS may offer higher protection and recovery, but your array may give enough
- RAID-Z is not recommended where IOPS performance is important; use mirrors instead

Provisioning Storage for Oracle with ZFS: ZFS Record Size Considerations
- Match the ZFS record size to the Oracle database block size. The general rule is to set recordsize = db_block_size for the file system that contains the Oracle data files. This sets the maximum ZFS block size equal to DB_BLOCK_SIZE, and efficiencies ensue in read performance and buffer cache occupancy (see the sketch below).
- When db_block_size is less than the OS memory page size (8 KB on SPARC systems, 4 KB on x86 systems), set the record size to the page size.
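A minimal sketch, assuming an 8 KB db_block_size and the hypothetical dbpool/oradata file system from earlier:

    # Match the ZFS record size to the 8 KB Oracle block size.
    zfs set recordsize=8k dbpool/oradata
    zfs get recordsize dbpool/oradata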

Provisioning Storage for Oracle with ZFS: ZFS Record Size Considerations (contd.)
- Modifying the record size is not retrospective
  - Files must be copied after a record size change for the change to take effect (see the sketch below)
- Performance may be optimized with different block sizes for different DB components
  - Set appropriate record sizes on the file systems that contain the respective files using different block sizes
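A minimal sketch of rewriting an existing data file so it picks up a new record size (path hypothetical; the tablespace or database should be offline while the file is copied):

    # Blocks written before the recordsize change keep their old size;
    # copying the file rewrites it entirely with the new record size.
    cp -p /dbpool/oradata/users01.dbf /dbpool/oradata/users01.dbf.tmp
    mv /dbpool/oradata/users01.dbf.tmp /dbpool/oradata/users01.dbf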

Provisioning Storage for Oracle with ZFS: Improving Writing and Caching Performance
- The logbias ZFS property: latency or throughput
  - Redo: latency
  - Data: throughput
  - Unless storage throughput is saturated; then set the redo logbias to throughput as well, avoiding the double I/O of writing first to the ZIL and subsequently to the file system, with a consequent improvement in overall performance
- The primarycache ZFS property: controls what is cached in main memory (the primary ARC, the Adaptive Replacement Cache)
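A minimal sketch, assuming separate hypothetical file systems for data files and redo logs:

    # Redo wants low latency; data files want streaming throughput.
    zfs set logbias=latency dbpool/redo
    zfs set logbias=throughput dbpool/oradata

    # Cache both data and metadata for the data files in the ARC.
    zfs set primarycache=all dbpool/oradata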

Provisioning Storage for Oracle with ZFS: Use secondarycache (L2ARC)
- Available since Solaris 10 10/09
- Stores a cached copy of data for fast access
- SSD devices recommended
- Use the secondarycache ZFS property to determine which file systems use the secondary cache and what the cache contents should be
- For read-latency-sensitive workloads
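A minimal sketch, assuming a spare SSD at a hypothetical device name:

    # Attach an SSD to the pool as an L2ARC cache device.
    zpool add dbpool cache c2t0d0

    # Cache both data and metadata for the data file system.
    zfs set secondarycache=all dbpool/oradata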

Provisioning Storage for Oracle with ZFS: Separation of Data from Redo Logs
- Consider physically separating data files from redo logs by placing them in separate pools (see the sketch below)
- This reduces conflict between sometimes opposite storage needs:
  - large storage for data files, with the emphasis on throughput
  - small storage for redo logs, with the emphasis on latency
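A minimal sketch of the two-pool layout, combining the earlier pieces (device names hypothetical):

    # Wide striped mirrors for data file throughput.
    zpool create datapool mirror c3t1d0 c3t2d0 mirror c3t3d0 c3t4d0
    zfs set logbias=throughput datapool

    # A small dedicated mirror, biased for low-latency redo writes.
    zpool create redopool mirror c4t1d0 c4t2d0
    zfs set logbias=latency redopool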

Provisioning Storage for Oracle with NetApp: Write Performance
- Write performance is achieved primarily with NVRAM
- Remember: NetApp is write-optimized storage
- However, the physical disks must be able to keep up, otherwise the benefit of NVRAM is lost and performance falls back from memory speed to disk speed
- Aggregated write throughput is achieved with a single large aggregate for all volumes of all types: data, redo, control files

Provisioning Storage for Oracle with NetApp: Read Performance
- Read performance is achieved through a large aggregate with as many disks as possible/necessary
- Many small disks are better than a few large disks
- ONTAP 7.x is a 32-bit system and suffers a limit on aggregate size (16 TB)
- Large databases may need to span aggregates
- Use ONTAP 8.x (64-bit aggregates)

Provisioning Storage for Oracle with NetApp: SAN or NAS?
- SAN and NAS are both supported
  - FC, iSCSI, FCoE
  - NFS
- SAN implies some need for RAID management
- ZFS-managed LUNs provisioned from NetApp?
  - Suggest not doing this; ZFS is suited to JBOD
  - Or, if you must, focus on one or the other; do not try to use all the features of both. It is unnecessarily complicated.
- Oracle ASM-managed LUNs are a good solution, using ASM external redundancy: there is no need to mirror what is already highly redundant
- NFS on NetApp is a perfect solution: thin-provisioned file system space, good performance, and easy management (see the mount sketch below)
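A minimal sketch of mounting a NetApp NFS volume for Oracle data files on Solaris (filer name and paths hypothetical; check the current NetApp/Oracle best-practice documents for the options appropriate to your versions):

    # Mount options commonly recommended for Oracle data files over NFSv3.
    mount -F nfs -o rw,bg,hard,rsize=32768,wsize=32768,vers=3,proto=tcp,forcedirectio \
        filer1:/vol/oradata /u02/oradata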

Provisioning Storage for Oracle using ZFS: Backup & Recovery Integration
- Home-grown, self-engineered solutions
- Can use snapshots and clones
- Replication of snapshots to a secondary DR/backup location
  - the zfs send operation
  - fast and efficient
- Recommend using granular objects for ease of management
  - i.e. snap and send several small objects rather than a single massive file system, which may never succeed
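A minimal sketch of one snapshot-and-send cycle to a DR host (host, pool and snapshot names hypothetical; coordinating the snapshot with database hot backup mode is left out):

    # Take a point-in-time, copy-on-write snapshot of the data file system.
    zfs snapshot dbpool/oradata@nightly

    # Ship it to the DR box; later cycles can send just the increment with -i.
    zfs send dbpool/oradata@nightly | ssh drhost zfs receive drpool/oradata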

Provisioning Storage for Oracle using NetApp: Backup & Recovery Integration
- Mature data management tools
- Application-layer integration with backup snapshots
  - SnapManager for Oracle (SMO) offers hot backup integration for OS image copy backups
- SMO:
  - offers some integration with RMAN (snapshot image copy cataloging)
  - supports DB cloning
  - supports snapshot management of ASM disk groups built upon NAS files or SAN devices
- Storage-layer replication with mature tools
  - SnapMirror (async / sync / semi-sync)
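A minimal 7-Mode sketch of the storage-layer replication (filer and volume names hypothetical; SMO drives snapshot-based backups through its own interface):

    # On the destination filer: create the relationship and run the
    # baseline transfer from the source volume.
    snapmirror initialize -S filer1:oradata filer2:oradata_dr

    # Check transfer state and lag.
    snapmirror status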

Summary
- ZFS is a very powerful file system with unsurpassed scalability and many very interesting features.
- Deploying Oracle on ZFS requires a detailed knowledge of the demands Oracle places on the storage system, and of how to configure ZFS to meet them.
- NetApp is not as fully featured, but is a more mature environment.
- NetApp takes a simpler aggregation approach, although with some severe size limits if restricted to modern large disks on ONTAP 7.3.

Thank You
- I hope you've found the subject of interest.
- Thank you for listening.
- Any questions?