The latest scoop on the popular disk storage technology, how it works, and what it can do for you. Walter J. Alexander, IV, Technical Services Supervisor, Shelby County Schools. AETA, October 2007 (revised post-conference to correct errors).
AETA October 2007, Shelby County Schools. No, not that RAID! What Does RAID Mean?
AETA October 2007Shelby County Schools3 What Does RAID Mean? Redundant Array(s) of Inexpensive Disks The technology we now call RAID was developed (and patented) in 1978. The term RAID was first used in 1987 Redundant? That means theres another one to take over if the first one can no longer perform the job.
AETA October 2007Shelby County Schools4 What Does RAID Mean? Redundant Array(s) of Inexpensive Disks Okay, so whats an Array??? Here we have 3 separate disks (hard drives). If we take these 3 disks and treat them as a single unit, now we have an array! 36GB Now we have a 108GB disk!
AETA October 2007Shelby County Schools5 Why RAID? Fault Tolerance If a hard drive fails, your system could continue to run, and even allow you to make the repair without ever taking the system down. Users continue to work. No data is lost – thus no restores. Better Performance Disk reads or writes can occur more quickly, giving users what they need faster.
AETA October 2007Shelby County Schools6 Disk Types used in RAID Arrays ATA, PATA, IDE & EIDE These terms may be improperly used interchangeably. Usually refers to the common 3.5 hard drive found in most computer over the last 10 years. SCSI, Ultra SCSI, Wide SCSI, SCSI2, SCSI3 The standard for servers and high-end PCs from around 1993 until now. SATA Quickly becoming the new standard for PCs. Also becoming a strong contender in the server market. RAID itself will work with any of these disk types. RAID relies on the controller more than the type of disk.
AETA October 2007Shelby County Schools7 RAID Hardware vs Software Controllers RAID is a method of disk management and data input/output. RAID can be achieved via software or hardware methods. Software-Based RAID Hardware-Based RAID Software-Based RAID is built into some operating systems. Windows Server 2003 Linux (various versions)
AETA October 2007Shelby County Schools8 Software-Based RAID Pros Built into operating system. No additonal cost for RAID controller. Wizard-like GUI configuration makes setup easy. Cons Performance impact on operating system. Memory and process usage. May not include for all partitions of disks in RAID array. Problem recovery can be more difficult. When to Use When hardware-based RAID is not an option, but RAID is desired. Non-critical servers with large disk storage requirements.
AETA October 2007Shelby County Schools9 Hardware-Based RAID Pros Built into many newer servers. Easy to add PCI-type controller to servers. All the overhead of the controlling the RAID array is handled by the controller itself. More advanced recovery options for serious failures. Cons Possible additional cost for controller. BIOS-based setup utility may not be as easy to use as GUI utilities. When to Use Every time its available.
AETA October 2007Shelby County Schools10 RAID Basics RAID is more about the configuration of the drives, and how data is written to, and read from, those drives than it is about physical connections. Something must manage the RAID system – this must be either Hardware or Software. If Hardware, this is often called the RAID Controller. If Software, this is often called the RAID Manager. The RAID number (i.e. RAID-1, RAID-5) indicates the configuration of the disks, how data is written, and how data is read. This also indicates what happens if part of the disk storage system fails. The simplest forms of RAID require 2 hard drives. In most cases, all drives in the RAID array must be: The same size (capacity) The same speed The same interface. For all practical purposes, drives are likely the same exact model.
AETA October 2007Shelby County Schools11 RAID Terminology RAID – Redundant Array of Inexpensive Disks. Array – A grouping of multiple hard drives into a single entity. This entity is a RAID-x Array. Disk – The physical hard drive. There will be at least 2 of these when talking about RAID. Controller – The physical card (or built-in) that connects to the Disks. A RAID Controller has the smarts to arrange those disks into an array. Manager – The software component that has the smarts to arrange disks into an array. Parity – Refers to Parity Blocks used in some RAID configurations. These Parity Blocks provide the ability to reconstruct data after a drive failure. Cache – Memory set aside (usually part of a controller) to pre-store anticipated Read Data, or to queue Write Data waiting to be written. (Remember, you normally are dealing with either HARDWARE-BASED RAID, or SOFTWARE-BASED RAID… not both on the same server).
AETA October 2007Shelby County Schools12 RAID-0 Commonly known as a Striped Set. Requires at least 2 disks. There is NO Redundancy (AID-0?) Data is striped across disks. Pros Faster performance because there is no Parity generation. Faster reads and writes because data can be transferred in a parallel fashion. You get the full capacity of all disks for data storage. Cons If any disk fails, all the data is irrecoverable. When to Use Performance is the number-one goal. Data is backed up elsewhere. Downtime is not a problem (lengthy restores).
AETA October 2007Shelby County Schools13 RAID-1 Commonly known as a Mirrored Set. Requires at least 2 disks. All data is written to both disks. Pros Fault-Tolerance If either disk fails, the system can keep working on the remaining disk. Faster disk reads because data can data can be read in a parallel fashion. Cons You lose the capacity of the second disk. If each disk is 40GB, you only have 40GB of storage space (not 80GB). When to Use Servers that are only configured with two disks. Fault-tolerance is importance.
AETA October 2007Shelby County Schools14 RAID-3 Commonly known as a Striped Set with Dedicated Parity. Requires at least 3 disks. Data is striped to data disks, then parity is written to the parity disk. RAID-3 accesses all the disks at once, and therefore can only service one I/O request at a time. Pros Fault-Tolerance If a data disk fails, the system can keep working on the remaining disks. If the parity disk fails, the system can keep working but without parity. High disk read performance because data can data can be read in a parallel fashion. High disk write performance for large, sequential data. Cons You lose the capacity of the parity disk. Not efficient for small, random data. When to Use Very effective for large sequential data, such as video.
AETA October 2007Shelby County Schools15 RAID-4 Also known as a Striped Set with Dedicated Parity, but with multiple I/O calls. Requires at least 3 disks. Data is striped to data disks, then parity is written to the parity disk. Pros Fault-Tolerance If a data disk fails, the system can keep working with the remaining disks. If the parity disk fails, the system can keep working without parity, but with no noticable effect in performance. High disk read performance because data can data can be read in a parallel fashion. Can execute multiple I/O requests at once, assuming that data is on different physical disks. Cons You lose the capacity of the parity disk. Medium write performance. When to Use Effective for small, random data.
AETA October 2007Shelby County Schools16 RAID-5 Commonly known as a Striped Set with Distributed Parity. Requires at least 3 disks. Data and partity is striped to all disks in a round-robin fashion. Pros Fault-Tolerance If any disk fails, the system can keep working on the remaining disks. Best balance of cost, performance and fault-tolerance for most applications. High disk read performance because data can data can be read in a parallel fashion. Can execute multiple I/O requests at once, assuming that data is on different physical disks. Highly efficient. Cons Inefficient with large file transfers. Medium write performance. Disk failure will have a direct impact on performance. When to Use Most applications, including databases, file and print services, web servers and e-mail.
AETA October 2007Shelby County Schools17 RAID-6 Commonly known as a Striped Set with DUAL-Distributed Parity. Requires at least 4 disks. Data and partity is striped to all disks in a round-robin fashion. Pros Fault-Tolerance If any TWO disks fail, the system can keep working on the remaining disks. Cons You lose the storage capacity of two disks. When to Use Consider RAID-6 when you have RAID arrays with more than 10 large capacity physical disks.
AETA October 2007Shelby County Schools18 Nested RAID Levels Supported by many RAID controllers. Allows the use of multiple RAID strategies on the same set of physical disks. RAID 0+1 Minimum of 4 disks – must be even number (4, 6, 8, etc.) Striped set is then mirrored to another striped set. RAID 1+0 (RAID 10) Minimum of 4 disks – must be even number (4, 6, 8, etc.) Mirrored sets are then striped to another set of drives. RAID 5+0 (RAID 50) Minimum of 3 disks. Striped set across distributed parity RAID systems. RAID 5+1 Minimum of 4 disks – must be even number (4, 6, 8, etc.) Mirrored striped set with distributed parity. Sometimes called RAID 53.
AETA October 2007Shelby County Schools19 Non-Standard RAID Levels RAID-7 Created by Storage Computer Corporation. Added caching to RAID-3 and RAID-4 to improve performance. RAID-S Created by EMC Corporation. Also known as Parity RAID. Was offered as an alternative to RAID-5 on EMC Symmetrix Systems. No longer support on the latest releases of Enguinuity (the Symmetrix operating system). RAID-Z Created by SUN Microsystems. Adds an extra level of protection in the way data is written to a new location instead of overwriting existing data. Only writes full stripes of data, utilizing a mirroring technology for small data writes. RAID-Z2 Variation of RAID-Z with the ability to withstand two disk failures.
AETA October 2007Shelby County Schools20 Non-Standard RAID Levels Double-Parity Not exactly RAID-6. Two sets of Parity Information are generated. The two sets are not the same… they are based on different groups of data blocks. RAID-DP Created by Network Appliance Corp. Offers better performance over traditional RAID-6 via storage controller software. Offer protection of writing data to battery-protected NVRAM to ensure that no data is lost in the event of a power outage. Was offered as an alternative to RAID-5 on EMC Symmetrix Systems. No longer support on the latest releases of Enguinuity (the Symmetrix operating system). RAID 1.5 Created by HighPoint Systems. Sometimes incorrectly identified as RAID-15. Very close (if not exact) implementation of RAID-1. RAID-5E, 5EE & RAID 6E Introduced by IBM ServeRAID. The E stands for Enhanced. Variations of RAID-5 and RAID-6, with hot-spare drives that are an active part of the block rotation.
AETA October 2007Shelby County Schools21 Non-Standard RAID Levels ServeRAID 1E Utilized by some IBM and Sun storage systems. Uses 2-way mirroring. RAID-K Created by Kaleidescape for their KSERVER media storage units. Uses Double-Parity with proprietary modifications. Allows adding additional drives to existing array. Linux MD RAID-10 Part of the Linux kernel since 2.6.9 (software-based). Slight modification to the RAID-10 standard to allow less drives to constitute a RAID-10 array. Intel Matrix RAID Not a new RAID level. Allows physical disks to be broken into logical partitions. Those logical partitions can be part of separate RAID arrays. UNRAID Developed by Lime Technology. Unusual in that it does not require drives in RAID Array to be of matching size or speed.
AETA October 2007Shelby County Schools22 The RAID Experience Okay, all this RAID-1 and RAID-5 and stuff is great, but what does it all mean? The bottom line to me is fault-tolerance. Hard drives have moving parts, which means they will likely fail at some point. Keep those systems running, even if a hard drive fails. Better performance is nice, but fault-tolerance is the real selling point.
AETA October 2007Shelby County Schools23 The Cost of Non-RAID Typical Non-RAID Setup… 36GB72GB Heres my operating system hard drive. Heres my important data hard drive. Just lost my OS… cant boot, and even though my 72GB data drive is good, no one can get to it! Just lost my Data… server is still running, but users dont care because they cannot get to their data!
AETA October 2007Shelby County Schools24 RAID-1 Example Typical RAID-1 Setup… 36GB72GB Heres my operating system RAID-1 Array called Array 0 Heres my important data RAID-1 Array called Array 1 Just lost the first hard drive in Array 0. RAID-1 keeps Array 0 running on the second hard drive. Users may see a little slower reads, but things keep working. Just lost the second hard drive in Array 1. RAID-1 keeps Array-1 running on the first hard drive… many users dont know that anything happened! 36GB72GB
AETA October 2007Shelby County Schools25 RAID-5 Example Typical RAID-5 Setup… 72GB Heres all the disks in my server configured as a single RAID-5 array. Third hard drive has failed. Arrays get a little slower, but users keep working and no data is lost. 72GB New hard drive installed… RAID controller REBUILDS the data on that drive while the system is operating… users connected, programs runnings, etc.
The Case for RAID. RAID makes sense for all your servers. Smaller servers should have a minimum of two hard drives in a RAID-1 mirror configuration. Larger servers should have 3 or more hard drives in a RAID-5 striped set configuration. SAN, NAS and really big servers are more likely to use RAID-6, dual-parity configurations and such. No matter what you have, keep a spare hard drive or two in stock if possible!
RAID Bibliography: Lascon.co.uk, RAID animated graphics. Any questions? Drop me an e-mail! email@example.com Thank You!