© 2014 Vicom Infinity Storage System High-Availability & Disaster Recovery Overview [638] John Wolfgang Enterprise Storage Architecture & Services

Presentation transcript:

© 2014 Vicom Infinity Storage System High-Availability & Disaster Recovery Overview [638] John Wolfgang Enterprise Storage Architecture & Services 26 June 2014

Agenda
• Background
• Application-Based Replication
• Storage-Based Replication
  – Tape Replication
  – Point-in-Time Replication
  – Synchronous Replication
  – Asynchronous Replication
• Automation
• Replication Examples
• Key Questions for Any Solution

My Background
• West Virginia University
  – BS Electrical Engineering
  – BS Computer Engineering
• Carnegie Mellon University
  – MS Electrical and Computer Engineering
  – Data Storage Systems Center
• Lockheed Martin & Raytheon
• IBM
  – Development (Tucson) - Software Engineer – 10 years
  – Data replication, disaster recovery, GMU, eRCMF, TPC for Replication
  – Global Support Manager (New York City) – Morgan Stanley – 2 years
  – IBM Master Inventor
• Vicom Infinity
  – Enterprise Storage Architecture and Services – 1+ years

Why are High Availability & Disaster Recovery Important?
• Information is your most important commodity – need to protect it
• What happens to your company if you don't have access to your production data for a minute? An hour? A day? A month?
• How much money does your company lose every minute?
  – Amazon.com loses $66,240 per minute (Forbes.com – 8/19/2013)
  – Ebay.com loses $120,000 per minute (ebay.com)
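The per-minute figures on this slide grow quickly over a longer outage. A minimal sketch of that arithmetic follows; it uses only the two rates quoted above, assumes the loss rate stays constant for the whole outage, and the function name is illustrative.

```python
def downtime_cost(loss_per_minute, minutes):
    """Revenue lost over an outage, assuming a constant loss rate."""
    return loss_per_minute * minutes

AMAZON_PER_MIN = 66_240   # figure quoted on the slide (Forbes.com, 8/19/2013)
EBAY_PER_MIN = 120_000    # figure quoted on the slide (ebay.com)

for name, rate in [("Amazon.com", AMAZON_PER_MIN), ("Ebay.com", EBAY_PER_MIN)]:
    per_hour = downtime_cost(rate, 60)
    per_day = downtime_cost(rate, 60 * 24)
    print(f"{name}: ${per_hour:,.0f} per hour, ${per_day:,.0f} per day")
```

Substituting your own per-minute estimate gives a first-order number to weigh against the cost of a replication solution.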

Lessons Learned from Previous Disasters
• Rolling disasters happen
• Distance is more important
• Redundancy may be smoke and mirrors
• If you have not successfully tested your exact DR plan, you do not have a DR plan
• Automate as much as possible
  – Increase dependency on automation and decrease dependency on people
  – Automation provides the ability to test over and over until perfect
  – Automation will not deviate from procedures
  – Automation will NOT make mistakes (even under pressure!)
  – Automation will not have trouble getting to the DR site
• Recovery site considerations
  – Site capacity (MIPS and TBs) needs to be sized to handle the production environment
  – What is the DR plan after successful recovery from disaster?
  – Disasters may cause multiple companies to recover at the same time, which puts stress on commercial business recovery services

Replication Beyond Disaster Recovery
• Availability Improvements
  – Backup Window
  – Tape Backup
  – Data Migration
  – Archival
• Disaster Recovery/Business Continuity
  – Minimize data loss
  – Minimize restart time
  – Increase distance
  – Enable automation
• Operational Efficiency
  – Data Mining
  – Content Distribution
  – Software Testing

Some Definitions
• Recovery Point Objective (RPO)
  – How much data can you tolerate losing during a disaster
• Recovery Time Objective (RTO)
  – How much time will it take to get your systems up and running again after a disaster
[Diagram: Replication Method. Labels: Point-in-Time, Continuous, Synchronous, Asynchronous, App-Based, Storage-Based]

7 Tiers of Business Recovery Options
Key Customer Objectives: RTO – Recovery Time Objective, RPO – Recovery Point Objective
• Tier 7 - RPO = Near Zero, RTO < 1 Hr.; Server/Workload/Network/Data Automatic Site Switch
• Tier 6 - RPO = Near Zero, RTO = Manual; Disk or Tape Data Mirroring
• Tier 5 - RPO > 15 min., RTO = Manual; PiT or SW Data Replication
• Tier 4 - Data Base Log Replication & Host Log Apply at Remote
• Tier 3 – Electronic Tape Vaulting
• Tier 2 – PTAM & Hot Site
• Tier 1 – PTAM*
*PTAM – Pickup Truck Access Method
[Chart: Cost of Ownership (Servers/Network Bandwidth/Storage) plotted against Time to Recover (how quickly is an application recovered after a disaster?). Time axis: 15 Min., 1-4 Hr., 4-8 Hr., 8-12 Hr., … Hr., 24 Hr., Days. The axis runs from Mission Critical Data to Less Critical Data. Regions: Active Secondary Site; Point-in-time; Backup to Tape. Annotations: RPO: 4+ hrs, RTO: 4+ hrs; RPO: 24+ hrs, RTO: Days]

Application-Based vs. Storage-Based Replication (1)
Application/File/Transaction Based
• Specific to application/file system/database
• Generally less data is transferred
  – Lower telecommunication costs
• No coordination across applications, file systems, databases, etc.
• Applications change, so replication may need to change
• May forget "other" related data necessary for recovery
• With many transfers occurring in a corporation, it may be difficult to determine what is where in a disaster. RTO/RPO may not be repeatable, and auditing may be difficult
• Many targets possible (e.g. millions of cell phones)

Application-Based Replication Examples
DB2 HADR
• High availability solution for both partial and complete site failures
• Log data is shipped and applied to standby database
  – One or more standby databases
• If primary database fails, applications are redirected to the standby database
• Standby database takes over in seconds
  – Avoids database restart upon a partial error
LVM Mirroring
• Create more than one copy of a physical partition to increase data availability
• Handled at the logical volume level
• If a disk fails, you can still access the data on an alternate disk
• Remote LVM mirroring enables use of disks located at multiple locations
  – Replication between multiple storage systems via a Storage Area Network (SAN)

Application-Based vs. Storage-Based Replication (2)
Storage-Based – Block Level Replication
• Independent of application, file systems, databases, etc.
• Common technique for the corporation
  – Managed by operations
• Generally more data transferred
  – Higher telecommunication costs
• Consistency groups yield cross volume/storage subsystem data integrity/consistency
• Independent of application changes
  – Mirror all pools of storage
• Consistent, repeatable RPO
• RTO depends on server/data/workload/network
• Generally a handful of targets
• Specific to data replication technique (tied to specific architecture & devices that support it)

What Does Data Consistency Really Mean?
• For storage-based replication, we are talking about "power fail" consistency
• Typical database transaction:
  1. Update log - database update is about to occur
  2. Update database
  3. Update log - database update complete
• Host is very careful to do each of these writes in order
  – This provides power fail data consistency
• BUT, these writes are likely done to different volumes, possibly on different control units
• Failure to be careful about write order results in loss of data consistency, and data may become unusable
• In order to ensure data consistency at the secondary site, dependent writes must be done in order
• How does a storage system know which writes are dependent?
  – It doesn't
  – What it does know is that writes that are done in parallel are not dependent
  – Any writes NOT done in parallel are assumed to be dependent
• This is exacerbated for asynchronous replication
[Diagram: writes 1 and 3 go to the LOG volume, write 2 goes to the DB volume]
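The dependent-write rule on this slide can be illustrated with a toy check. The sketch below is not how any storage system implements consistency; it only enumerates which subsets of the three writes a secondary might hold and flags those that could never result from a power failure on the host. All names are illustrative.

```python
from itertools import combinations

# The three dependent writes from the slide, in the order the host issues them.
WRITES = ["log: update about to occur", "db: update", "log: update complete"]

def is_power_fail_consistent(applied):
    """True if the held writes form a prefix of the host's write order,
    i.e. no write is present without every write that preceded it."""
    return set(applied) == set(range(len(applied)))

for n in range(len(WRITES) + 1):
    for subset in combinations(range(len(WRITES)), n):
        state = "consistent" if is_power_fail_consistent(subset) else "INCONSISTENT"
        held = [WRITES[i] for i in subset] or ["(nothing)"]
        print(f"{state:12} secondary holds: {held}")
```

The case that makes data unusable is the secondary holding writes 1 and 3 but not 2: the log claims the update completed while the database volume never received it.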

Storage-Based Replication Techniques
Tape
• Pickup Truck Access Method (PTAM)
• Virtual Tape Replication
Disk
• Point-in-Time Copy
• Synchronous Replication
• Asynchronous Replication
• Three-site Replication (Synchronous & Asynchronous)
Automation
• Hyperswap
• Tivoli Storage Productivity Center for Replication
• Geographically Dispersed Parallel Sysplex (GDPS)

Tape Replication - PTAM
• Backups created and dumped to physical tapes
  – Recovery Point Objectives quite high – 24 hours at best?
• Tapes are literally picked up by a truck and taken to another location
  – Hot site
  – Storage only
  – Recovery Time Objective fairly high in both cases
• Lower cost and simpler option than disk replication

Tape Replication – Virtual Tape Grid
• Virtual Tape Servers appear to hosts as standard tape volumes
  – May or may not actually contain tape drives and tapes
• Multiple clusters can be put together into a tape grid
• Tape volumes can be selectively replicated to one or more other clusters
• Tape volumes can be accessed through any cluster in the grid
  – Whether or not the tape volume physically resides on that cluster
• Certain virtual tape server models have physical tape libraries behind them that can offload volumes to actual tapes
• Hybrid with characteristics of both tape backup and replication
  – Recovery Point Objective much better than PTAM
[Diagram: clusters connected over a WAN. Labels: Cluster 0 TS7720, Cluster 1 TS7740, TS3500, Cluster 3 TS7720]

Point-in-Time vs. Continuous Replication
Point-in-Time
• Local copy of data
• Data "Frozen"
• Provides protection against logical corruption, user error
• Data is not the most current
Continuous Replication
• Remote copy of the data
• Provides protection against primary storage system or data center issue
• Continuously updated
• Data is always current (or close to it)
• Corruption/Errors on the primary site will be transferred to the secondary

Point-in-Time Copy
• Internal to Storage System
• New copy created and available immediately
• Possible to read & write to both volumes
• No-Copy
  – No data is copied to Target unless updated on the Source
• Copy on Write
  – Data must be copied to Target before being updated on Source
• Background Copy
  – All data from Source copied to Target
  – Relationship typically ends when copy is complete
• Incremental Copy
  – Full background copy is done the first time
  – Only changes copied subsequently
• Space Efficient/Thin Provisioned
  – Only allocate space as it is used
[Diagram: a Storage System containing Source and Target volumes, with a write to the Source]
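The No-Copy and Copy on Write behavior described on this slide can be sketched in a few lines. This is a toy in-memory model, not a vendor implementation: the class and method names are hypothetical, a volume is a Python list of blocks, and writes to the target are left out for brevity.

```python
class CopyOnWriteSnapshot:
    """Toy no-copy / copy-on-write point-in-time copy of a block volume."""

    def __init__(self, source):
        self.source = source   # live volume; keeps changing after the snapshot
        self.saved = {}        # block index -> contents at snapshot time

    def write_source(self, block, data):
        # Copy on write: preserve the old block before its first overwrite.
        if block not in self.saved:
            self.saved[block] = self.source[block]
        self.source[block] = data

    def read_target(self, block):
        # No-copy: blocks never updated on the source are read from the source.
        if block in self.saved:
            return self.saved[block]
        return self.source[block]


volume = ["a", "b", "c"]
snap = CopyOnWriteSnapshot(volume)       # "copy" is available immediately
snap.write_source(1, "B")                # production keeps writing
print(volume)                                   # current data
print([snap.read_target(i) for i in range(3)])  # frozen point-in-time view
print(len(snap.saved), "block(s) physically copied")
```

Only the overwritten block consumes space on the target, which is the idea behind the Space Efficient variant on the slide.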

Synchronous Replication Overview
[Diagram: Server Write to Primary; Write to Secondary; Write Acknowledged to Primary; Write Acknowledge to server]

Synchronous Replication
• Data on secondary storage system is always identical to primary
  – Recovery Point Objective of 0
• Standard implementation for many storage vendors
• There is an impact on application I/Os
  – Dependent on distance between primary and secondary
  – Distances up to 300 km
  – Bandwidth must be sufficient for peak
• Data Freeze technology keeps all pairs in consistency group consistent
  – Requires automation to guarantee consistency across multiple storage systems
[Diagram: H1 on the Primary Storage System, Synchronous Replication to H2 on the Secondary Storage System]
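The distance dependence mentioned on this slide can be estimated from propagation delay alone. The sketch below assumes light travels roughly 200,000 km/s in optical fiber and that each write waits for one round trip to the secondary; both are simplifying assumptions (real links add switch, protocol, and storage-system time), and the function name is illustrative.

```python
FIBER_KM_PER_S = 200_000  # assumed: approximate speed of light in optical fiber

def added_write_latency_ms(distance_km, round_trips=1):
    """Propagation delay added to each write by waiting for the remote site."""
    return round_trips * 2 * distance_km * 1000 / FIBER_KM_PER_S

for km in (10, 100, 300, 1000):
    print(f"{km:>5} km: +{added_write_latency_ms(km):.2f} ms per write")
```

Under these assumptions a 300 km link adds about 3 ms to every write before any equipment delay, which is why synchronous replication is limited to metro distances.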

Practice How you Recover and Recover How you Practice
• Proper Disaster Recovery Tests require time & effort & commitment
• If you haven't successfully tested your exact DR plan, you don't have a DR plan
• A DR test may require you to stop data replication temporarily
• Use Practice Volumes to test properly while continuing replication
• Practice Volumes can also be used for other activities
  – Development, testing, data analytics
• Make sure you always recover to the Practice Volumes – even in a real disaster

Synchronous Replication with Practice Volumes
• Standard synchronous replication as the basis
• Typical synchronous replication requires replication outage for DR testing
• Practice volumes provide capability to continue replication during DR testing
• Data is recovered to secondary storage system
• Point-in-Time copy created on secondary storage system
• Replication is restarted while access to H2 volume still available
• Should recover in actual disaster using the same method
[Diagram: H1 on the Primary Storage System, Synchronous Replication to H2 on the Secondary Storage System, PIT Copy to I2 for DR Testing]

Asynchronous Replication Overview
[Diagram: Server Write to Primary; Write Acknowledge to server; Write to Secondary; Write Acknowledged to Primary]

Asynchronous Replication – No Consistency
• Asynchronous transfer of data updates
• No distance limitation
• Little impact on application I/Os
• Secondary not guaranteed consistent
  – No write ordering
  – No consistent data sets
• Hosts/Applications must be shut down to provide consistency
• Most useful for migration
• Can transition to/from Synchronous replication
[Diagram: H1 on the Primary Storage System, Asynchronous Replication to H2 on the Secondary Storage System]

Asynchronous Replication – Two Volumes
• Asynchronous transfer of data updates
• Recovery Point Objective > 0
• No distance limitation
• Little impact on application I/Os
• Data consistency maintained via:
  – Write ordering
  – Consistent data sets
• If bandwidth is not sufficient for peak, data will back up on the primary
• Some vendors require extra cache
[Diagram: H1 on the Primary Storage System, Asynchronous Replication to H2 on the Secondary Storage System]

Asynchronous Replication – Three Volumes
• Asynchronous transfer of data updates
• Recovery Point Objective > 0
• No distance limitation
• Little impact on application I/Os
• Data consistency created using 3rd volume
• Consistency coordinated by primary storage system
• If bandwidth is not sufficient for peak, RPO will grow and "catch up" later
[Diagram: H1 on the Primary Storage System, Asynchronous Replication to H2 on the Secondary Storage System, PIT Copy to J2]
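The "RPO will grow and catch up later" behavior can be seen in a simple backlog simulation. This is a sketch under stated assumptions: one-second intervals, a fixed link rate, made-up workload numbers, and RPO approximated as the time the link needs to drain the backlog. None of the names or figures come from the presentation.

```python
def simulate_backlog(write_mb_per_s, link_mb_per_s):
    """Return the backlog (MB written but not yet replicated) after each second."""
    backlog = 0.0
    history = []
    for written in write_mb_per_s:
        # New writes add to the backlog; the link drains at most its capacity.
        backlog = max(0.0, backlog + written - link_mb_per_s)
        history.append(backlog)
    return history

workload = [50, 50, 200, 200, 50, 50, 50, 50]   # hypothetical MB/s, with a 2 s peak
link = 100                                       # hypothetical link rate in MB/s

for second, backlog in enumerate(simulate_backlog(workload, link), start=1):
    approx_rpo = backlog / link                  # seconds of link time to drain
    print(f"t={second}s backlog={backlog:6.1f} MB  approx. RPO={approx_rpo:.1f}s")
```

The link here is sized for the average rather than the peak, so the secondary falls behind during the burst and recovers once the workload drops below the link rate.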

Asynchronous Replication With Practice Volumes
• Standard asynchronous replication as the basis
  – Could be any of the consistent variants
• Typical asynchronous replication requires replication outage for DR testing
• Practice volumes provide capability to continue replication during DR testing
• Data is recovered to secondary storage system in typical manner
• Point-in-Time copy created on secondary storage system
• Replication is restarted while access to H2 volume still available
• Should recover in actual disaster using the same method
[Diagram: H1 on the Primary Storage System, Asynchronous Replication to H2 on the Secondary Storage System, PIT Copy volumes J2 and I2]

Asynchronous Replication – z/OS Interaction
• Asynchronous transfer of data updates
• Recovery Point Objective > 0 but very low (~seconds)
• No distance limitation
• Little impact on application I/Os
• Managed by System z
• Multiple Storage vendors
[Diagram: Source z Host issues Server Write and receives Write Acknowledge from H1 on the Primary Storage System; Asynchronous Replication to H2 on the Secondary Storage System is controlled by SDM on the Target System z Host]

Asynchronous Replication – Four Volumes/PiT Copies
• Asynchronous transfer of data updates
• Recovery Point Objective typically higher than previously discussed implementations
• RPO is 2x the "cycling period"
• Can tolerate lower network bandwidth
• Little impact on application I/Os
• Periodic consistent PiT copies are created from primary volumes
• PiT copies are replicated to secondary volumes
• Does not require a consistent replication mechanism
• After copy is complete, PiT copies are created from secondary volumes for protection
[Diagram: H1 with PiT Copy C1 on the Primary Storage System; Asynchronous Replication to H2 with PiT Copy C2 on the Secondary Storage System]
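One way to see where the "2x the cycling period" figure comes from is a small timeline model. It assumes a PiT copy is taken every cycle, that each copy needs one full cycle to finish replicating, and that the secondary starts out in sync at time 0. These are illustrative assumptions, not details from the presentation, and the function name is hypothetical.

```python
def data_loss_at(disaster_time, cycle):
    """Minutes of updates lost if the primary fails at disaster_time.

    A copy taken at k*cycle is only usable at the secondary at (k+1)*cycle,
    so the newest usable copy is one cycle older than the newest copy taken.
    """
    cycles_started = int(disaster_time // cycle)
    last_complete_copy_time = max(0, cycles_started - 1) * cycle
    return disaster_time - last_complete_copy_time

CYCLE = 15  # hypothetical cycling period in minutes
for t in (30, 37, 44):
    print(f"disaster at t={t} min: {data_loss_at(t, CYCLE)} min of updates lost")

worst = max(data_loss_at(t, CYCLE) for t in range(CYCLE, 600))
print(f"worst case over the run: {worst} min (just under 2 x {CYCLE})")
```

A failure just before a cycle boundary loses both the copy still in flight and everything written since it was taken, which approaches twice the cycling period.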

Three-Site Replication - Cascading
• Combination of Synchronous & Asynchronous replication techniques
• Synchronous replication to provide High Availability at metro distances
  – Protect against storage system & data center disasters
• Asynchronous replication to provide disaster recovery capability at global distances
  – Protect against regional disasters
• Ability to switch production between primary and secondary systems
• Incremental resynchronization between primary and tertiary if secondary lost
• Requires automation to handle the various transitions
[Diagram: H1 on the Primary Storage System, Synchronous Replication to H2 on the Secondary Storage System, Asynchronous Replication to H3 on the Tertiary Storage System, PiT Copy to J3]

Three-Site Replication – Multi-Target
• Combination of Synchronous & Asynchronous replication techniques
• Synchronous replication to provide High Availability at metro distances
  – Protect against storage system & data center disasters
• Asynchronous replication to provide disaster recovery capability at global distances
  – Protect against regional disasters
[Diagram: H1 on the Primary Storage System, Synchronous Replication to H2 on the Secondary Storage System and Asynchronous Replication to H3 on the Tertiary Storage System]

High Availability Hyperswap for Synchronous Replication Configurations
• Triggered when there is a problem writing or accessing the primary storage devices
• Swap from using primary storage devices to secondary storage devices
• Transparent to applications (brief pause on the order of seconds)
• Steps
  – Physically switch the secondary storage devices to be primary and allow access
  – Logically switch the OS internal pointers in the UCBs
  – Applications are not aware that they are now using the secondary devices
• No Shutdown, No Configuration Changes, No IPL
• Managed by Automation software (GDPS, TPC for Replication)
• Planned Hyperswap
  – Use for maintenance, production site move, migration
• Unplanned Hyperswap
  – Automated to protect against storage system failure
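The swap steps above can be pictured with a toy model in which the application writes to a logical device and an internal pointer decides which physical volume receives the I/O. This is only an illustration of the pointer-swap idea: it does not model UCBs, z/OS, or any automation product, and every class and attribute name is hypothetical.

```python
class Volume:
    """Stand-in for a physical storage device."""
    def __init__(self, name):
        self.name, self.blocks, self.failed = name, {}, False

    def write(self, block, data):
        if self.failed:
            raise OSError(f"{self.name} unavailable")
        self.blocks[block] = data


class MirroredDevice:
    """Logical device seen by the application; 'active' plays the OS pointer."""
    def __init__(self, primary, secondary):
        self.active, self.standby = primary, secondary
        self.mirroring = True

    def write(self, block, data):
        try:
            self.active.write(block, data)
        except OSError:
            self.swap()                      # unplanned swap on primary failure
            self.active.write(block, data)   # retry against the new primary
            return
        if self.mirroring:
            self.standby.write(block, data)  # synchronous mirror of the write

    def swap(self):
        # Switch the pointer; the application keeps using the same object.
        self.active, self.standby = self.standby, self.active
        self.mirroring = False               # old primary is gone, stop mirroring


h1, h2 = Volume("H1"), Volume("H2")
dev = MirroredDevice(h1, h2)
dev.write(0, "a")
h1.failed = True                             # simulate primary storage failure
dev.write(1, "b")                            # application call is unchanged
print(dev.active.name, dev.active.blocks)
```

Because synchronous replication kept H2 identical to H1, the retry after the swap lands on a volume that already holds every earlier write.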

Tivoli Storage Productivity Center for Replication
• Automate and simplify complex data replication tasks
• Control multiple replication types and storage systems from a single pane
  – Including CKD and FB volumes
• Added Error Protection
• Added Ease of Use
• Facilitates DR Testing and DR Recovery
• Enables Basic Hyperswap, Hyperswap, Open Hyperswap
• GUI-based
  – Operational control of replication environment via a GUI rather than DSCLI scripts or TSO commands
  – Also provides a CLI
• Linux, Windows, AIX, z/OS

Geographically Dispersed Parallel Sysplex (GDPS)
• Enables Business Recovery Tier 7 capability
• Manage all forms of replication
• Manage Hyperswap
• Drive down RTO through automation
• Scripting capability provides ability to automate the recovery process at the DR site
  – Enable CBU
  – Automate recovery of disk systems
  – Automate IPL of LPARs
  – Automate application startup

Data Replication Considerations
• Synchronous solutions do not work at distance
• Asynchronous solutions have data loss and potential problems managing consistency, particularly across different storage platforms
• Maximizing use of long distance link is critical for many customers
  – Smaller customers may want to purchase extended links which meet maximum transfer requirements for a shift, not their 15 second peak
• Being able to test, recover data at the recovery site, and replicate back to the production site after resolution is critical
  – If you have not successfully tested your DR procedures, you do NOT have DR procedures
  – Practice how you recover, and recover how you practice

Fit For Purpose – Two & Three Site Replication
• Tailor your solution to your needs (and budget)
• Synchronous replication for everything
• Three-site Synchronous/Asynchronous only for your most important data
[Diagram: Source Host; H1 volumes on the Primary Storage System with Synchronous replication to H2 volumes on the Secondary Storage System; Asynchronous replication to H3 on the Tertiary Storage System with PiT Copy J3]

Fit For Purpose – Asynchronous Replication
• Save money and reduce complexity by replicating some data consistently and other data with no consistency
• Make sure you understand the ramifications of these decisions
[Diagram: Source Host; H1 volumes on the Primary Storage System replicated to H2 volumes on the Secondary Storage System, one pair labeled No Consistency and one labeled Consistent Asynchronous with PiT Copy J3]

Low Cost Asynchronous with Consistency
• Create periodic consistent PiT copies of primary production volumes
• Use asynchronous replication with no consistency to copy all data to secondary
• When all the data is copied, the secondary volumes are consistent
• Lower network bandwidth requirements
• Avoids extra volume at secondary site
• Use Space Efficient PiT copy to conserve even more space
• Useful for testing, data analytics, or high RPO requirements
[Diagram: Source Host; H1 on the Primary Storage System with PiT Copy T1; Asynchronous Replication to H2 on the Secondary Storage System]

Four Site Replication
• Synchronous replication to provide high-availability
• Asynchronous cascaded replication to provide global distance DR capability
• Cascaded asynchronous leg to provide another copy of data
  – Used for development, testing, data analytics
  – Only consistent periodically
[Diagram: Source Host; H1 on the Primary Storage System, Synchronous to H2 on the Secondary Storage System, Asynchronous to H3 on the Tertiary Storage System with PiT Copy J3, Asynchronous – No Consistency to H4 on the Quaternary(?) Storage System]

PiT Copies for Tape Backup
• Create periodic PiT copies of primary production volumes
• Dump these PiT copies to tape for backups
• Avoids the tape backup software accessing production volumes
• Use minimum space by employing Space Efficient PiT copies
[Diagram: Source Host; H1 on the Primary Storage System with PiT Copy T1, which is dumped to the Tape System]

Key Questions for Any Potential Solution
• How does the solution provide cross volume/cross subsystem data integrity/data consistency?
• What is the impact to the primary application I/O?
• What happens if data replication fails or slows down?
• Interoperability with other data replication solutions?
• Cost of installing & maintaining solution?
• Do solutions provide "concurrent maintenance"?
• What flexibility does the solution provide?
• If I recover to the secondary site, how do I replicate back to the primary?
• If I use different "types" of disk subsystems, after recovery can I maintain my QoS to my users?

Questions? Thank you!
John Wolfgang
Enterprise Storage Architecture & Services
26 June 2014

About Vicom Infinity
For More Information please contact…
Len Santalucia, CTO & Business Development Manager
Vicom Infinity, Inc.
One Penn Plaza – Suite 2010
New York, NY
office mobile
About Vicom Infinity
• Account Presence Since Late 1990's
• IBM Premier Business Partner
• Reseller of IBM Hardware, Software, and Maintenance
• Vendor Source for the Last 8 Generations of Mainframes/IBM Storage
• Professional and IT Architectural Services
• Vicom Family of Companies Also Offer Leasing & Financing, Computer Services, and IT Staffing & IT Project Management