Replicating VMware VVols: A technical deep dive


1 Replicating VMware VVols: A technical deep dive
[STO3305BES] Replicating VMware VVols: A technical deep dive into VVol array-based replication in vSphere 6.5
Claudio Calisto – Storage Solutions Architect
Nick Dyer – Principal SE, Nimble

2 A Brief History of 3PAR, Nimble & VMware VVols
New Product Introduction
A Brief History of 3PAR, Nimble & VMware VVols – a design partnership between HPE and VMware:
Aug 2011 – VMworld: Virtual Volumes introduced, with HPE as 1 of 5 original design partners
May 2012 – 3PAR Reference Platform: 3PAR selected as the FC reference platform
Jul 2014 – VVol Beta: 3PAR 1 of 3 partners ready on Beta Day 1
Mar 2015 – VVol GA with vSphere 6.0: 3PAR 1 of 4 partners ready
Jun 2015 – Nimble Tech Preview: Nimble tech preview released
Mar 2016 – Nimble GA: Nimble releases VVol support
Nov 2016 – VVol 2.0 GA with vSphere 6.5: 3PAR & Nimble ready on Day 1
Today – Continued development: 3PAR, Nimble and VMware continue working to develop and refine VVols

3 Why did VMware create a new storage architecture?
Today's challenges in external storage architectures:
- LUN-centric – storage pre-allocated into silos
- No visibility – cannot see inside a VMFS volume
- Poor efficiency – typically over-provisioning LUNs
- Increased administration – must always go to the storage admin
- Difficult reclamation – space reclamation is a manual process
- Hardware-centric – must use vendor tools & plug-ins
- Long provisioning – time-consuming manual requests
- Data services – array data services not aligned with VMs

So why did VMware feel the need to create a whole new architecture for external storage? VMware recognized that everything in today's world is moving toward an app-centric model, and while its legacy storage architecture works just fine, it has many shortcomings when it comes to supporting the modern data center. Virtualization has greatly evolved over the last decade and plays a key role in just about every aspect of IT. Storage has evolved just as much, and VMware designed VVols to meet the evolution of both storage and the cloud- and app-centric data center. Some of the challenges with the current architecture in vSphere center around it being very LUN-centric, which is not a good fit in today's app-centric world. Storage is pre-allocated into silos on storage arrays, which traps and wastes both capacity and performance. Block storage arrays have no visibility inside VMFS volumes, so they have no ability to interact directly with VMs. With LUNs, storage is always over-provisioned, which leads to poor efficiency and resource utilization. Administration effort increases because vSphere admins must constantly work with storage admins for any request. Space reclamation is a difficult, time-consuming, manual process, which often leads to poor resource utilization. Storage array vendor tools and plug-ins for vCenter must be used to manage physical storage resources, which makes management more cumbersome. Provisioning of physical storage resources is a manual process, which means longer waits for requests to be completed. And finally, storage array data services and features are not aligned at the VM level and can only be applied at the higher LUN level, which is inefficient. All of these challenges combine to make a very compelling case for breaking away from legacy storage architectures in vSphere, and they were the motivation for VMware to create the new VVols storage architecture.

Key take-away: There are many hidden challenges with traditional external storage architectures in vSphere which decrease efficiency and increase management overhead

4 What are VMware VVols?
New vSphere storage architecture to replace VMFS/NFS
A standard (VASA) for all storage vendors to adhere to
Enables vSphere to write VMs natively to storage arrays
Common storage management tasks are automated
Designed to be dynamic, eliminates LUN provisioning
Storage array features can be applied to individual VMs

VMware Virtual Volumes (or VVols for short) are a new vSphere storage architecture created to replace the existing VMFS and NFS implementations that have been used since ESX was first launched more than 10 years ago. VVols creates a unified standard and architecture for all storage vendors and storage protocols to adhere to, leveraging the vSphere APIs for Storage Awareness (VASA) to accomplish this. VVols enables vSphere to write virtual machines (VMs) natively to storage arrays without using any kind of file system. With VVols, common storage management tasks are automated, eliminating operational dependencies and providing simpler management. VVols is designed to be dynamic and efficient: no storage is pre-provisioned, and data is thinly written as VMs are created and run, which provides finer control of storage resources and data services at the VM level. This eliminates the need for any LUN provisioning and also enables storage arrays to automatically reclaim space when VMs are deleted or moved to another storage device. VVols allows storage arrays to be VM-aware, and array features can be applied directly to individual VMs instead of entire LUNs, providing more flexible consumption of storage resources with greater granularity for increased efficiency. To accomplish all this, vSphere Storage Policy-Based Management (SPBM) is used to allow policies built on array capabilities to be assigned to VMs.

Key take-away: VVols provides automated and dynamic management of storage resources and enables storage arrays to interact directly with VMs

5 How VVols transforms storage in vSphere
VMFS:
- LUN-Centric – siloed storage pools, array services aligned to LUNs
- Static – pre-allocated storage, over-provisioned resources
- Complex – complicated management using vendor-specific tools
- Time-consuming – longer provisioning, manual space reclamation

VVols:
- VM-Centric – no siloed storage pools, array services aligned to VMs
- Dynamic – dynamically allocated storage, using only what is needed
- Simple – simplified management using vSphere interfaces
- Effortless – instant provisioning, automatic space reclamation

Let's now look at how VVols transforms storage in vSphere and solves the challenges presented by external storage architectures using VMFS. We talked about how traditional VMFS storage is very siloed and LUN-centric, with everything aligned at the LUN level. With VVols this changes to a VM-centric model.

6 What changes between file and block protocols with VVols
Block (VMFS → VVols):
- Host adapter – iSCSI initiator (sw/hw) or HBA
- I/O transport – network or fabric
- Host presentation – LUNs → storage container
- File system – vSphere managed (VMFS) → vSphere native
- VM storage – VMDK files → VVols
- Storage visibility – datastore level → VM level

File (NFS → VVols):
- Host adapter – NFS client
- I/O transport – network
- Host presentation – mount point → storage container
- File system – array managed (NFS) → vSphere native
- VM storage – VMDK files → VVols
- Storage visibility – datastore level → VM level

7 Overview of VVols Storage Architecture
Protocol Endpoint: logical I/O proxy that serves as the data path between ESXi hosts and the VVols of their VMs
VASA Provider: software component that mediates out-of-band communication for VVols between vCenter Server, ESXi hosts and the storage array
Storage Container: pool of raw storage capacity that becomes a logical grouping of VVols, seen as a virtual datastore by ESXi hosts
Virtual Volume (VVol): container that encapsulates VM files, virtual disks and their derivatives
Storage Profile: set of rules that define storage requirements for VMs based on capabilities provided by the storage array (same model as VSAN)

On 3PAR no space is pre-allocated, and space for VVols can be drawn from any provisioning template (CPG).

[Diagram: vCenter Server and ESXi hosts use the control path (SPBM) to the array's VASA Provider, and the data path through the Protocol Endpoint to the VVols inside a Storage Container on the storage array.]

8 Storage Policy-Based Management
1. The storage array advertises its capabilities via the VASA Provider
2. Storage Policies are created in vSphere and assigned array capabilities (for example drive type 7.2K/15K/SSD, RAID1/RAID5/RAID6, thin, dedupe, encryption, caching, QoS, replica)
3. VMs are assigned a Storage Policy based on requirements and SLAs
4. SPBM provisions the VM on appropriate storage as defined by the policy and maintains compliance

9 Top Reasons Customers will want to Start Using VVols Now
1. You don't have to go all in – you can use VVols alongside VMFS, with easy migration using Storage vMotion; take your time with it
2. Get early VVol experience – don't wait until the last minute; how long did you wait to switch from ESX to ESXi, or from the vSphere Client to the Web Client?
3. Get your disk space back – no more over-provisioning or manual space reclamation; keep your array as thin as possible, automatically
4. Available in all vSphere editions – it's an architecture, not a feature; nothing to license, and it will replace VMFS one day
5. Let the array do the heavy lifting – an array is a purpose-built I/O engine; array features are more powerful and have better visibility into storage resources than vSphere has
6. Start using SPBM now – don't get left out; get started with the same SPBM that VSAN users have been enjoying
7. Snapshots don't suck anymore – no longer have to wait hours for I/O-intensive snapshot deletions to commit, plus your backups will complete faster
8. Easier for IT generalists – you don't need to be a storage admin; fully manage storage from within vSphere
9. One architecture to rule them all – …and in the darkness bind them. NFS, iSCSI, Fibre Channel, who cares? VVols provides a unified storage architecture across protocols
10. The VM is now a unit of storage – because it's all about the VM; no more LUNs, the VM is now a first-class citizen and the array has VM-level visibility

So when talking to customers about VVols you will inevitably be asked why they should consider using it. After all, what they are using today works perfectly fine.

10 VVol Replication Deep Dive

11 Nimble and 3PAR were both VMware design partners on VVols 2.0
Introduction
Nimble and 3PAR were both VMware design partners on VVols 2.0
Array replication support via SPBM was introduced in vSphere 6.5 (VASA 3.0)
Nimble supported replication of VVols in vSphere 6.0
Nimble and 3PAR were the first to complete VVol replication implementations (before the merger)
Replication is done at the VVol level (VM), not at the datastore level like SRM (LUN)
Replication Groups automatically created on the array contain the VVol objects
Designed to be managed in vSphere, not on the array side

12 Components of VVol Replication
Fault Domains [new]
- Something that fails as a whole (3PAR = storage array, Nimble = Nimble group)

Replication Groups [new]
- Replicate VVol-based VMs between fault domains
- Groups maintain consistent point[s]-in-time
  - 3PAR – only maintains the most recent point in time, which is no older than the most recent RPO
  - Nimble – uses Volume Collections [most recent point for RGs]; the vCenter plugin provides workflows for replication
- RPOs can be "stretched" when adding VMs or VVols to a replication group
- Groups are in a Source, Target, InTest or FailedOver state
- The terms Source/Primary/Protected are interchangeable
- The terms Target/Secondary/Recovery are interchangeable

13 VVol Replication diagram
Storage Containers (a.k.a. VVol datastores) [not new]
Replication occurs between containers on different arrays.
3PAR: storage containers are always visible at both sites, but the replica VVols within them do not appear until after a failover occurs.
Nimble: storage containers are implemented as folders
- Special "VVol" type
- Size limited
- Optionally QoS limited
- Replication puts volumes (VVols) into folders
- Replica VVols are in an offline state and do not show up in the datastore
- Failover brings the volumes online

This is just a graphical representation of the replication components we saw on the previous slide. One thing worth repeating is that the storage containers, a.k.a. VVol datastores, are not the objects of replication. Instead, each individual VVol is placed inside a replication group, and with HPE arrays those replication groups are associated with, but not the same as, storage containers on each array. Note, as you can see in one example here, we have replication groups both replicating into and out of the same storage containers. And while I'll discuss this in more detail later, I'll mention now that it is the vSphere storage policy that defines which volumes are replicated and which are not.

[Diagram: a replication group carries VVols, as selected by the vSphere storage policy (SPBM), between a storage container on the "local" array (fault domain) and a storage container on the "remote" array (fault domain).]

14 Preparing for VVol Replication
vSphere:
- vSphere 6.5 required
- Register the VASA Provider
- Mount Storage Containers (VVol datastores) at the primary and secondary sites
- Define a Storage Policy with replication rules (Components)

3PAR:
- 3PAR OS required
- Remote Copy license
- Connect the arrays via FC or iSCSI
- Define remote targets for Remote Copy
- Create Storage Containers on the primary and secondary arrays

Nimble:
- vSphere 6.0 or 6.5
- NimbleOS required for VASA 3.0
- No licenses required
- Configure source and destination as replication partners
- Create Folders (storage containers) on source and destination

Storage array: [3PAR] Nothing special or extra is required. Simply connect the two arrays together over FC or Ethernet links and define the remote target objects for the Remote Copy subsystem. [Nimble] Set up the VMware integration – register vCenter, register VASA – and configure the replication partners.
vSphere: [3PAR only] Connect/register the array's VASA provider (required for any VVol support), then define a storage policy that requires replication constraints/rules.

We'll talk more about the benefits in an upcoming slide, but one of the things to take away here is that getting ready for VVol replication doesn't introduce any new or complicated procedures. If you're already familiar with how to create a VVol-based VM, which is actually quite simple, the process is the same for VVol-based replication. The main requirements are to make sure your local and remote arrays are connected to each other, and then make sure you have the proper versions of your array software and your vSphere software. A PowerCLI sketch of the vSphere-side registration follows below.
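As a rough, hedged illustration of the vSphere-side preparation, a PowerCLI sketch along these lines can register an array's VASA provider and confirm the fault domains it exposes. The server name, provider name and URL below are placeholders; the exact VASA URL format is array-specific, so check the 3PAR or Nimble documentation.

# Connect to the vCenter at the site being prepared (placeholder server name)
Connect-VIServer -Server 'vcsa01.lab.local' -Credential (Get-Credential)

# Register the array's VASA provider; name, URL and credentials are placeholders
New-VasaProvider -Name '3PAR-VP-SiteA' -Url 'https://array-a.lab.local:8123/vasa' -Credential (Get-Credential)

# Verify registration and list the fault domains (arrays / Nimble groups) the providers advertise
Get-VasaProvider
Get-SpbmFaultDomain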

15 Creating Nimble Folder / Storage Container
Nimble folders are used for general organization
Setting the VVol management type makes a folder a Storage Container
Must set a capacity limit
Can optionally set a folder-level performance limit

16 Replication Storage Policies Summary
Granular (per VM / virtual disk)
Empowers the vSphere admin
- No need to coordinate with the storage admin
- No need to involve the storage admin during disaster recovery
- The storage admin may need to do some clean-up after a true disaster
New Component feature for SPBM allows Components to be defined once and re-used
Replication Components contain rules and constraints related to array replication
Components are attached to policies, which are attached to VMs
SPBM maintains compliance to ensure a VM resides on storage that can meet the policy definition

Although you'll pick up a lot more when we go through the examples and demonstration, consider this the key take-away slide describing the benefits of VVol replication. The most important things to note are that this is a per-VM capability, and a capability that is controlled by the vSphere admin. Not only is VM replication controlled through the vSphere interfaces, disaster recovery operations are controlled there as well, meaning it's the vSphere administrator who can initiate a failover or test failover operation. And thanks to policy-based management you can request specific replication capabilities. You'll see some examples of that in the next couple of slides.

17 Example Storage Policy [3PAR]
Target array & Storage Container
Target & snapshot CPG drive tiers
Remote Copy mode (Periodic only)
Desired RPO in minutes (5 min minimum)
Need only specify one replication constraint to enable VVol replication
Where possible, other VVol capabilities (e.g. deduplication) are mirrored at the remote site if the remote array/CPG allows for it

Here we see the capabilities for 3PAR-based replication. The first thing to note is that VVol replication is supported by the 3PAR Remote Copy subsystem. The first capability we see is the Remote Copy Target Storage Container. It lets you choose the target array as well as the storage container on that array where you'd like the VVol-based VMs to be surfaced after a failover occurs. But not only can you select the storage container on the target array, you can also define the drive tiers at that remote array. For example, if you are truly using the target array only as a disaster recovery array, you may choose to put your replicas on a lower-cost drive type, and that includes which drive tiers should be used for snapshots. That's supported using the 3PAR CPG capabilities; the remote array can advertise to the local array which storage containers and CPGs are available. Finally, you can select the type of replication mode you'd like. However, with our first 3PAR release only periodic replication is supported; in the future (reminding you of the disclaimer here) we should be able to offer other modes, such as synchronous replication. When using periodic replication you can also set the recovery point objective, and you'll note that you can specify a range of allowed values. This gives you the flexibility of choosing replication groups which have different RPO settings. I'll quickly mention here that with replication storage profiles you have the ability to choose existing replication groups when provisioning new VMs, or to create new replication groups. When creating new replication groups, these storage profile capabilities are used as a guide to the settings for the new group; you'll see how new groups are created in an upcoming slide. The last thing I'll mention is that capabilities not specifically associated with replication, but with the volumes themselves, are inherited by the replicas where possible. For example, if the local VM has requested deduplication, the replica will also have deduplication, as long as the remote array is licensed for that capability and the selected CPGs can support it. A hedged PowerCLI sketch of building such a policy follows below.
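For admins who prefer scripting to the Web Client wizard, a hedged PowerCLI sketch of building such a policy might look like the following. The capability names echo the rcopy* capabilities listed later in this deck, but the exact namespace-qualified names and value types your VASA provider advertises may differ, so list them with Get-SpbmCapability first; the policy name, target container value and RPO are illustrative only.

# List the replication-related capabilities the VASA provider advertises
$caps = Get-SpbmCapability | Where-Object { $_.Name -match 'rcopy' }
$caps | Select-Object Name

# Build rules from those capabilities (values are placeholders; value types vary by capability)
$modeRule = New-SpbmRule -Capability ($caps | Where-Object { $_.Name -match 'rcopyMode' })            -Value 'Periodic'
$rpoRule  = New-SpbmRule -Capability ($caps | Where-Object { $_.Name -match 'rcopyRPO' })             -Value 15
$tgtRule  = New-SpbmRule -Capability ($caps | Where-Object { $_.Name -match 'rcopyTargetContainer' }) -Value 's932:SanJoseContainer'

# Group the rules into a rule set and create the policy
$ruleSet = New-SpbmRuleSet -AllOfRules $modeRule, $rpoRule, $tgtRule
New-SpbmStoragePolicy -Name 'Replicated-Periodic-15min' -Description 'VVol replication, 15 min RPO' -AnyOfRuleSets $ruleSet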

18 Example Storage Policy [Nimble]
Protection schedule by minute, hour, day or week
Can set local and remote snapshot retention policies
Choose replication partner and frequency
Option to delete replicas from the partner when the source is deleted

19 Creating a new VM with replication
Choose an existing Replication Group or Automatic

Although you will see this in the demonstration, I wanted to capture it here on the slide. Once you've created a replication profile, this is what it looks like when creating a new VM with replication capabilities. The first thing to note is that we've selected the replication storage policy; that's no different than selecting a storage policy for any VVol. But once you select a policy that requires replication, you will see a new drop-down menu item in the dialog which asks you to specify the replication group. You have the option of choosing an existing replication group or creating a new one. On 3PAR, the "Automatic" replication group means the 3PAR array will create a new group specifically for this VM. You might ask why you would select an existing replication group: when VVols are in the same replication group they are guaranteed consistency across that group, so if you have two VMs that require the same consistency, you can ensure that consistency is maintained at the replicas by placing them in the same replication group. The same flow can also be scripted; a sketch follows below.
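A hedged PowerCLI sketch of the same provisioning flow follows. The VM name, datastore name, policy name, replication group name and host selection are all placeholders; whether the disks need an explicit configuration step depends on how the VM was created.

$policy = Get-SpbmStoragePolicy -Name 'Replicated-Periodic-15min'
$vvolDs = Get-Datastore -Name 'SanJoseContainer'      # the VVol datastore (storage container)

# Pick an existing, compatible replication group on the array
# (or skip this and let the array create one -- the "Automatic" choice in the UI)
$rg = Get-SpbmReplicationGroup -Name 'TinyVMgrp'      # placeholder group name

# Create the VM on the VVol datastore, then bind the VM home and all of its disks
# to the replication policy and the chosen replication group
$vm = New-VM -Name 'TinyVM4' -VMHost (Get-VMHost | Select-Object -First 1) -Datastore $vvolDs
$vm, ($vm | Get-HardDisk) | Get-SpbmEntityConfiguration |
    Set-SpbmEntityConfiguration -StoragePolicy $policy -ReplicationGroup $rg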

20 Changing a VM’s Replication Policy
Select VM Policies, Edit VM Storage Policies

Suppose you have an existing VM that is not currently being replicated. Replicating that VM is as simple as changing its storage policy to one that uses replication. That's what we're showing here: first you simply open the Edit VM Storage Policies dialog,

21 Changing a VM’s Replication Policy
To start replicating existing VMs:
- Change the storage policy to one that has replication constraints/rules
- Apply that replication policy to all of the VM's disks
- Select an existing replication group, or create a new one

To stop replicating a VM:
- Change the storage policy to one that does not have replication constraints/rules
- Apply that replication policy to all of the VM's disks

…then select the storage policy that supports replication, and finally assign the replication group.

Selecting a Storage Policy with Replication Constraints

22 Changing a VM’s Replication Policy
Selecting either a common or a per-storage-object Replication Group

I will note on this slide that when you select the replication group, you'll be asked to determine which virtual disks should be placed in which replication groups. Typically you'll place all of the VM's virtual disks in the same replication group as the VM itself. While that will assure VM consistency, placing all of a VM's virtual disks in the same replication group is not an absolute requirement. For example, perhaps you have a VM with an OS disk which does not need an aggressive RPO, but also a database on a separate disk that has very high availability requirements. In that case you might place the OS disk in a replication group with a higher RPO and the database in a replication group with a much lower RPO. You can do this as long as you do not have consistency requirements across those virtual disks, but when you fail over, you'd want to be sure to fail over both groups at the same time. A scripted sketch of this per-disk assignment follows below.
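A hedged sketch of that per-disk assignment in PowerCLI; all VM, disk, policy and group names are placeholders, and it assumes the two replication groups already exist on the array.

$vm     = Get-VM -Name 'SQLVM1'
$osDisk = Get-HardDisk -VM $vm -Name 'Hard disk 1'
$dbDisk = Get-HardDisk -VM $vm -Name 'Hard disk 2'

$osPolicy = Get-SpbmStoragePolicy -Name 'Replicated-4hr-RPO'
$dbPolicy = Get-SpbmStoragePolicy -Name 'Replicated-5min-RPO'
$osGroup  = Get-SpbmReplicationGroup -Name 'sqlvm1-os-grp'   # placeholder group names
$dbGroup  = Get-SpbmReplicationGroup -Name 'sqlvm1-db-grp'

# VM home + OS disk with a relaxed RPO, database disk with a tight RPO.
# Only split disks across groups when there is no consistency requirement
# between them, and remember to fail over both groups together.
$vm, $osDisk | Get-SpbmEntityConfiguration |
    Set-SpbmEntityConfiguration -StoragePolicy $osPolicy -ReplicationGroup $osGroup
$dbDisk | Get-SpbmEntityConfiguration |
    Set-SpbmEntityConfiguration -StoragePolicy $dbPolicy -ReplicationGroup $dbGroup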

23 Replication Groups on the Array
s932 cli% showvvolvm -sc SanJoseContainer -rcopy

VM_Name    GuestOS               VM_State  Num_vv  Physical(MB)  Logical(MB)
VMSanJose  other26xLinux64Guest  Unbound
TinyVM2    other26xLinux64Guest  Unbound
TinyVM1    other26xLinux64Guest  Unbound
TinyVM3    other26xLinux64Guest  Unbound
VMFremont  other26xLinux64Guest  Unbound
total

Output continued below:
RcopyStatus  RcopyGroup                  RcopyRole  RcopyLastSyncTime
Synced       VMSanJosgrp_3c12e41         Primary    :20:00 PDT
Synced       TinyVM2grp_80ed051          Primary    :22:00 PDT
None         NA
Synced       TinyVM3grp_1a3d5ba.r99931   Secondary  :20:10 PDT
Synced       VMFremongrp_ca4027f.r99931  Secondary  :21:10 PDT

Now that we are replicating some VVol-based VMs, let's take a look at what you can see from the array side. This slide shows output from our CLI that displays VVol-based VMs; it has been broken into segments so the text is readable. You can see that we have five VMs here, of which four are being replicated. Two of them are in the Primary role, meaning they are the source VMs for replication, and two are in the Secondary role, meaning they are the targets of replication, i.e. the replicas. I'll mention quickly at this point that although you can see replica VMs with the array's user interfaces, vSphere itself cannot see replicas until a failover occurs. We'll talk a little more about that when we walk through an example failover.

24 3PAR SSMC User Interface
VVol replication info Remote Copy Group Remote Copy Role Remote Copy Status This slide is our SSMC graphical user interface, which basically is displaying the same information. You do have the ability to dive down and look at more of the VMs details, but we won't show that here.

25 Nimble Replication Groups
1. Nimble replication groups (volume collections) are typically set up per-VM. There could be multiple replication groups per VM in cases where there are different protection schedules for individual [groups of] disks.
2. Note the local and remote recovery points.
[Screenshot labels: Replication Partner, Configuration and Data VVol, Source Folder (Storage Container), Application Type]

26 Disaster Recovery with VVols

27 vSphere Replication Components
[Diagram: a protected site and a recovery site, each with vCenter and ESXi hosts, an array with its Protocol Endpoint and built-in VASA Provider, and a VVol Storage Container; Remote Copy replicates between the arrays, and a DR orchestrator (PowerCLI) drives both sites through the vSphere API.]

Now that we are replicating VVol-based VMs, let's talk about disaster recovery. First, let's look at the components involved in replication and disaster recovery. You have the two sites, typically called the protected or primary site and the recovery or secondary site; we will also call them the source and target sites. At each site we have our vSphere instances, which include vCenter and the hypervisors, as well as the array and its built-in VASA provider. Within that we have our storage containers and the replication groups which are replicating some of the VVols. Finally, the main component used to control disaster recovery is what we will call the DR orchestrator; in this case it's PowerCLI. In the upcoming slides I will review the new interfaces in PowerCLI that support disaster recovery.

28 Types of Disaster Recovery operations
Planned Failover (controlled)
- Typically used for disaster avoidance, planned maintenance or relocating VMs
- Can be per VM or all VMs
- Primary and recovery sites reverse roles

Unplanned Failover (uncontrolled)
- Typically loss of power or hardware failure
- Not usually a per-VM event; all VMs are recovered
- Primary and recovery sites reverse roles

Test Failover (controlled)
- Typically used to validate VM recoverability
- Non-impactful
- Replica VMs are cloned and made visible to vSphere at the recovery site

Before we do that, let's talk about the types of failovers.

Planned: a planned failover is a more controlled failover, typically used for planned maintenance windows or intentionally relocating VMs, perhaps for load-balancing purposes. In a planned failover you control which replication groups, and their associated VMs, are failed over. And because both sites are available, the roles of the replication groups at each site are reversed as soon as the failover is completed: the source group becomes a target and the target group becomes a source. We'll do a walk-through of a planned failover in the upcoming slides.

Unplanned: in an unplanned failover you do not have access to both sites. This typically occurs because you've lost power or physical hardware at the primary site, and it is usually not a per-VM event but a per-array event: all groups on the source array would be failed over to target arrays at their recovery sites. In this case you don't have access to the primary site, so all operations occur at the recovery sites. We will review a demonstration of an unplanned failover once we walk through the planned failover slides.

Test: a test failover simply allows us to validate that a planned or unplanned failover would succeed if such an operation needed to occur. This is again a controlled operation. It works by making copies of the replica VMs at the target site; those copies are exposed to the ESXi hypervisors, and those virtual machines can be powered up and tested without impacting the VMs at the source site or the replicas at the target site. If you're interested in an example of a test failover, consider going to the VMware Virtual Blocks blog, where we have another demonstration and discuss it in more detail.

29 Recovery and Cloning – Nimble vCenter Plugin
Snapshot based recovery Choose from all available snapshots Local VM Recovery In-place restore of VM Cloned as a new VM Local Virtual Disk Recovery In-place restore of virtual disk Cloned as a new disk to same VM Cloned as a new disk to different VM Remote VM Recovery Cloned as new VM at remote site No need to stop replication Remote VM Cloning is done independent of VMware-controlled workflows

30 Planned Failover Workflow with VVols
Before Failover

Here's an example of a planned failover, looking at it from a detailed view; in this case we'll look at the individual VVols. The first thing to note is that we have two VMs at the local site and a third VM at the remote site. It's the two VMs at the local site that we will fail over. Let's review which VVols are associated with which VMs for clarity: here are the VVols associated with VM1, and here are the VVols associated with VM2. One thing you'll notice is that the swap volumes for these VMs are not in the replication group. There is no benefit in having them in the group, so there is no need to replicate them, which is a little more efficient than replicating a traditional storage LUN; excluding swap volumes from replication is vSphere's behavior. We can see the replica VVols for these VMs on the remote array, and as I previously mentioned, these replica VVols are not visible to vSphere. I've greyed them out in the diagram to represent that invisibility; vSphere cannot perform any operations on them. And to say it yet again, we can have VMs in storage containers replicating in both directions: VM3 is actually in the target storage container but replicating back to the local site. But it is VM1 and VM2 that we will fail over to the remote site. I'll mention once more that it's PowerCLI that will actually control this failover.

[Diagram: VM1 and VM2 at the local site, with their VVols (C1/D1, C2/D2, Sn1) in a source replication group inside the local storage container and hidden replica VVols in the target replication group at the remote site; swap VVols (Sw1, Sw2) sit outside the group; VM3 (C3/D3) replicates in the opposite direction.]

31 Planned Failover Workflow with VVols
Discover the VM-to-group relationships (Get-SpbmReplicationGroup / Get-SpbmReplicationPair)

The first thing we do in a planned failover is discover the replication groups and the VMs associated with those groups, so we can determine exactly what we will be failing over. The Get-SpbmReplicationGroup and Get-SpbmReplicationPair cmdlets, along with Get-VM, are used for that; a sketch follows below.
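A hedged sketch of the discovery step (VM names are illustrative, and property names on the pair objects may vary slightly between PowerCLI releases):

# Replicated VMs we intend to fail over
$vms = Get-VM -Name VM1, VM2

# Replication groups those VMs belong to at the source site
$sourceGroups = Get-SpbmReplicationGroup -VM $vms

# Source-to-target pairings; the target groups are what we will fail over later
$pairs        = Get-SpbmReplicationPair -Source $sourceGroups
$targetGroups = $pairs | Select-Object -ExpandProperty Target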

32 Planned Failover Workflow with VVols
Power down the source VMs (Stop-VM)

Once you have identified the VMs that will fail over, power them off at the source site. You will notice that the swap volumes go away.

33 Planned Failover Workflow with VVols
Perform a final sync (Sync-SpbmReplicationGroup)

At this point it's a very good idea to perform one last synchronization of the replication group; a combined sketch of the power-off and final sync follows below.
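A hedged sketch of the shutdown and final synchronization, continuing with the $vms and $sourceGroups variables from the discovery step:

# Power off the VMs that are about to be failed over
Stop-VM -VM $vms -Confirm:$false

# Push one last delta so the replicas are fully up to date before the failover
Sync-SpbmReplicationGroup -ReplicationGroup $sourceGroups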

34 Planned Failover Workflow with VVols
Issue a planned failover (Start-SpbmReplicationFailover)

Now we are ready to perform the failover operation. The failover is initiated on the target replication group. Once that occurs, the target replication group switches to the FailedOver state, and the replica VVols that were once hidden become visible. As the return value of the failover cmdlet you get the paths to the VMX configuration files of the failed-over VMs; these paths are used to register the VMs on the remote ESXi hypervisors. A sketch follows below.
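A hedged sketch of the failover itself, run against the target-site groups discovered earlier:

# Fail over the *target* replication groups; the cmdlet returns the datastore
# paths of the replica .vmx files that become visible at the recovery site
$vmxPaths = Start-SpbmReplicationFailover -ReplicationGroup $targetGroups -Confirm:$false
$vmxPaths   # keep these for the registration step that follows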

35 Planned Failover Workflow with VVols
Register the newly failed-over VMs (New-VM)

That's what we see here: New-VM takes those VMX paths as input to register the VMs in vCenter. The VMX path name contains the associated VVol datastore, and that datastore information can be used to identify the best ESXi hosts to run these VMs. A sketch follows below.
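A hedged sketch of registering the failed-over VMs from those paths; host selection is simplified to the first host, where in practice you would pick hosts that can see the target VVol datastore:

# Assumes we are now connected to the recovery-site vCenter
$recoveryHost = Get-VMHost | Select-Object -First 1

$newVMs = foreach ($vmx in $vmxPaths) {
    # Registers (does not clone) the VM from its replica .vmx path
    New-VM -VMFilePath $vmx -VMHost $recoveryHost
}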

36 Planned Failover Workflow with VVols
Apply an SPBM replication policy to the VMs (Set-SpbmEntityConfiguration)

vSphere does not replicate the storage profiles associated with the replica VMs. One reason this is not done is that it is impossible to know whether a storage capability on the local array is available on the remote array. So as soon as you register the VMs, you should apply a storage policy to them, and importantly, those policies should be compatible with the replication group the replica VVols are already in. In other words, before doing a failover you should create a storage policy that allows the VMs to replicate back to their original source array. A sketch follows below.
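A hedged sketch of re-applying a policy, assuming a "replicate back to the original site" policy was created beforehand and that all the failed-over VMs landed in the same (now failed-over) replication group; both names are placeholders.

$reversePolicy   = Get-SpbmStoragePolicy -Name 'Replicate-back-to-SiteA'   # placeholder policy name
$failedOverGroup = Get-SpbmReplicationGroup -VM $newVMs | Select-Object -First 1

foreach ($vm in $newVMs) {
    $vm, ($vm | Get-HardDisk) | Get-SpbmEntityConfiguration |
        Set-SpbmEntityConfiguration -StoragePolicy $reversePolicy -ReplicationGroup $failedOverGroup
}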

37 Planned Failover Workflow with VVols
Unregister the VMs at the primary site (Remove-VM)

While the ordering is not strictly required, at some point (now is a good time) we should unregister the VMs at the source site, to ensure that no operations occur on them; anything that happens to the VMs at the original source site will be lost once we reverse replication. We use Remove-VM to accomplish that. Note that we do not use the remove-from-disk option, but only unregister the VMs from vCenter. If you did happen to use the remove-from-disk option, the VVols would appear deleted, but behind the scenes the 3PAR array is aware that these VVols are in a group that may be reversed and will maintain them until that reverse operation occurs. A sketch follows below.
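A hedged sketch of the clean-up at the original source site; note the absence of -DeletePermanently, so only the inventory entries are removed:

# Run against the source-site vCenter: unregister, do not delete from disk
Remove-VM -VM $vms -Confirm:$false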

38 Planned Failover Workflow with VVols
Power up the VMs at the failed-over site (Start-VM)

At this point we could either reverse the replication group or go ahead and power up the VMs at the target site. We'll power up the VMs first.

39 Planned Failover Workflow with VVols
Reverse the replication direction (Start-SpbmReplicationReverse)

Finally, we can issue the replication reverse operation. This causes the original VVols at the local site to become replicas, and thus become hidden from vSphere at the local site. As part of the reverse operation a synchronization occurs, and we have completed our planned failover. A combined sketch of the power-up and reverse follows below.
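A hedged sketch of the final two steps, the power-up at the recovery site and the reverse itself:

# Power the registered VMs up at the recovery site
Start-VM -VM $newVMs

# Reverse the direction: the failed-over group becomes a Source group again
# and the original source VVols become hidden replicas
Start-SpbmReplicationReverse -ReplicationGroup $failedOverGroup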

40 Demo time!

41 Switch to 3PAR Replication unplanned failover demo
Play STO3305BUS_Siebert_UnplannedFailover.mp4 (5:11)

42 Switch to Nimble Replication demo
Play STO3305BUS_Siebert_NimbleDemo.mp4 (2:30)

43 Closing

44 The HPE 3PAR & Nimble advantage with VVOLs
Solid and mature
- 6+ years of development
- VMware design partner
- Fibre Channel reference platform

Simple and reliable
- Internal VASA Provider
- No external failure points
- Zero-step installation

Innovative and efficient
- Snapshots on different tiers
- Smallest capacity VM footprint
- Manage VVols folder by folder

Rich and powerful
- App-level optimized storage policies
- Architectures optimized for VVols
- VM recovery directly from vCenter

45 Call To Action – Find out more
While at VMworld:
- Visit HPE Booth #D301 to talk to our experts
- Check out the VVols Hands-On Lab for some hands-on time with Nimble VVols and replication
- Attend the VVols Partner Panel session (session ID #)
- Attend the HPE in-booth VVol sessions

Anytime:
- Stalk us on Twitter: @Ericsiebert, @Julian_cates, @Nick_Dyer_
- Download the DR scripts from GitHub
- Bookmark the Around The Storage Block blog

3PAR VVol key docs:
- 3PAR VMware VVol Implementation Guide
- 3PAR VMware VVol Replication Guide

Nimble VVol key docs:
- VMware Integration Guide
- VMware vSphere 6 Deployment Considerations

46 Thank You

47 Backup slides

48 Summarize and simplify this
Two Primaries Window
During a planned or unplanned failover, a window exists in which you have two source/primary replication groups. During this window, VMs that are changed, added to, or removed from replication groups will require resolution when the groups are reversed. With traditional storage, any changes at the original source datastore are completely wiped out. With VVols, it's possible to reverse some of the actions taken at the original source site. For example:
- A VM deleted at the primary is kept alive until the reverse-replication operation is performed, at which point the VM is recovered, assuming it still exists at the new source site.
- A new VM added to the original source replication group will not be lost; instead it is automatically dismissed from the group when the reverse operation occurs.
- VM snapshots taken at the source replication group will be lost upon reversal.

49 Handling Conflicts with Two Primary Replication Groups
Deleting a VM at the original source
[Diagram-only slide, using the same layout as the planned failover walkthrough: VM1 and VM2 registered at both sites, VM3 replicating back to the local site.]

50 Handling Conflicts with Two Primary Replication Groups
Adding a VM to the original source after fail-over
[Diagram-only slide: a new VM4 with VVols C4/D4 is added to the original source replication group at the local site after the failover.]

51 Planned Failover Workflow with VVols
In-conflict VVols are automatically removed from the group, but not destroyed
[Diagram-only slide: after the reverse operation, VM4's VVols (C4, D4) are removed from the replication group but remain on the local array.]

52 Benefits of VVols on Nimble Storage
VASA Provider Management
- No need to manage additional resources
- Highly available
- Embedded VASA Provider
- Automated registration of the VP
- Automatic creation of the PE and host access control management

Folders
- Manage VVols folder by folder
- Folders can grow and shrink dynamically

Backup and Recovery
- Replicate VMs using array-based replication
- Recover VMs using the Nimble vCenter Plugin

53 Nimble VVol Implementation
Storage Policy Based Management
- New policy-based framework from VMware
- Foundation of VMware's Software-Defined Data Center control plane
- Based on the VASA management protocol

Nimble Policy Based Management
- Built-in application abstractions
- Pick from a drop-down list – Exchange, SQL, VDI, Splunk, ESX, etc.
- Auto-selects optimal storage settings

Virtual Volumes
- Makes Nimble storage natively VM-aware
- Enables virtual-disk-level offload of data services
- Snapshots, replication, encryption

54 Registering Nimble VASA Provider
Done through the Nimble Web UI
Simply check the box

55 Additional SPBM Policy Options
Application Policy
Protection Schedule
QoS / Performance
Deduplication
Data Encryption
All-Flash

56 VVol Replication Objects
Config, Data, RO/RW snapshot and key/value VVols are replicated; SWAP VVols and storage policies are not replicated.
Names of Storage Containers and CPGs are shared between arrays.
[Diagram: a Storage Container and Replication Group of VVols on Array A (a.k.a. a fault domain) replicating to Array B.]

57 Point-in-Time Snapshot for Failover Test [3PAR]
[Diagram: for a failover test, a snapshot relationship is created against the VVol replication group holding VM1's and VM2's config (C), data (D) and read-only snapshot (RO) VVols; copies (C', D', RO') plus the data & key/value information for each VM are surfaced in the VVol datastore at the recovery site.]

58 3PAR VASA Replication Capabilities
Built on top of HPE 3PAR Remote Copy

rcopyTargetContainer
- Specifies the combined array sysName and Storage Container
- Multiple values possible for each target sysName, one for each Storage Container at that target
- Syntax: sysName:storageContainerName

rcopyMode
- Specifies the requested Remote Copy mode
- Periodic [only mode supported today for VVols]
- Remote Copy itself supports synchronous, streaming and multi-target synchronous/periodic (SLD)

rcopyRPO
- Specifies the periodic sync period
- Range: 5 minutes – 1 year
- Requires rcopyMode to be set to Periodic

An integral part of the VASA functionality is to advertise, enable and enforce capabilities. These are how vSphere administrators select the capabilities they desire for their VMs. Those capabilities affect a number of VASA APIs; for more details, see the VVol EKTs. New capabilities have been added to support replication.

59 3PAR VASA Replication Capabilities
Built on top of HPE 3PAR Remote Copy

rcopyRemoteCPG
- Specifies the provisioning group (a.k.a. CPG, a template for the volume's physical characteristics, such as RAID level and device type) to be used for Remote Copy at the target site for disk VVols
- If not specified, defaults to the CPG capability, if that is specified; otherwise the VASA Provider selects a CPG

rcopyRemoteSnapCPG
- Specifies the CPG to be used for Remote Copy at the target site for the base volumes' snapshot space
- One value selectable from the SPBM drop-down
- If not specified, defaults to the same value as the rcopyRemoteCPG capability

Where possible, other VVol capabilities are mirrored at the remote site. For example, if deduplication is selected at the local site, a replica VVol on the remote array will be created with deduplication, if the remote array and remote CPG allow for it.

60 Disaster Recovery with VVol-based VMs
Types of failover

Planned
Changes the responsibility for mastership of the VVol-based VMs from the Source/Primary site to the Target/Secondary site. Source and Target roles are reversed. Involves:
- Collecting replicated VM and group information at the Source and Target sites
- Powering down the VMs at the Source site
- Issuing a failover operation (Start-SpbmReplicationFailover in PowerCLI)
- Unregistering the VMs at the Source site
- Registering the VMs at the failover/Target site
- Applying a storage policy that allows replication back to the original Source site
- Issuing a reverse operation (Start-SpbmReplicationReverse in PowerCLI)
- Optionally powering up the failed-over VMs

61 Disaster Recovery with VVol-based VMs
Types of failover

Unplanned
Changes the responsibility for mastership of the VVol-based VMs from the Source/Primary site to the Target/Secondary site. Source and Target roles are reversed (eventually).
When the disaster occurs:
- Collecting replicated group information at the Target/Recovery site
- Issuing a failover operation (Start-SpbmReplicationFailover in PowerCLI)
- Registering the VMs at the failover/Target site
- Applying a storage policy that allows replication back to the original Source site
- Optionally powering up the failed-over VMs
After the original Source site has recovered from its disaster:
- Powering down the VMs at the Source site (if needed)
- Unregistering the VMs at the Source site
- Issuing a reverse operation (Start-SpbmReplicationReverse in PowerCLI)

There's no undo option. Once you issue a planned or unplanned failover, the only option is to continue forward: issue a reverse, and optionally another failover and reverse, to bring things back up at the original site.

62 Disaster Recovery with VVol-based VMs
Types of failover

Test
Allows testing of replicated VVol-based VMs at the secondary site by making copies of the replica VVols and exposing them to ESXi hosts for VM testing. Source and Target roles are NOT reversed. Involves:
- Collecting replicated VM and group information at the Source and Target sites
- Issuing a test-failover operation (Start-SpbmReplicationTestFailover in PowerCLI)
- Registering the new test VMs at the Target site
- Applying a storage policy that allows replication back to the original Source site (to help verify those replication policies are valid)
- Powering up, testing and powering down the in-test VMs
- Unregistering the test VMs
- Issuing a stop-test-failover operation (Stop-SpbmReplicationTestFailover in PowerCLI)
Once the test is stopped, all VMs created by the test are destroyed permanently. A PowerCLI sketch follows below.
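A hedged PowerCLI sketch of a complete test-failover cycle, following the same conventions as the planned-failover sketches; VM names and host selection are placeholders.

# Identify the target-site groups for the replicated VMs under test
$sourceGroups = Get-SpbmReplicationGroup -VM (Get-VM -Name VM1, VM2)
$targetGroups = Get-SpbmReplicationPair -Source $sourceGroups | Select-Object -ExpandProperty Target

# Create writable copies of the replicas; returns the .vmx paths of the test copies
$testVmx = Start-SpbmReplicationTestFailover -ReplicationGroup $targetGroups -Confirm:$false
$testVMs = foreach ($vmx in $testVmx) {
    New-VM -VMFilePath $vmx -VMHost (Get-VMHost | Select-Object -First 1)
}

# Power up, validate, then power down and unregister the test copies
Start-VM -VM $testVMs
Stop-VM  -VM $testVMs -Confirm:$false
Remove-VM -VM $testVMs -Confirm:$false

# End the test; the copies created for the test are destroyed permanently
Stop-SpbmReplicationTestFailover -ReplicationGroup $targetGroups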

63 Storage Administrator’s Responsibility after a Disaster [may remove this slide]
Clean-up of automatically-created replication groups is automatic, unless a true disaster occurs. tbd

