Cloud Data Center – Chicago Designed 2007 / Opened 2009 Generation 2 Deployment (SLA 99.999) Generation 3 Deployment (SLA 99.9) Physical Redundancy N+2,



Presentation transcript:

1

2

3 Cloud Data Center – Chicago
Designed 2007 / Opened 2009
Generation 2 Deployment (SLA 99.999)
Generation 3 Deployment (SLA 99.9)
Physical Redundancy N+2, Tier 3
Software Geo-Redundancy Active/Active nodes – geo-distributed

4

5 From the enterprise to the cloud
Attribute | Enterprise IT | Cloud-scale
Seats | 10,000 | 1,000,000,000
Talent | Custodians | Designers
Budget | Fixed Cost | Rates
Architectures | Many | Few
App Integration | Loose | Tight
Infrastructure | Overhead | Enabler
Reach | Regional | Global
Cost/Mb | $1.74M | $0.026M
Network $/server | >$200 | <$200
Hardware | Custom | Commodity
Availability | Infrastructure | Service
Operability | MTBF | MTTR
Reliability | Hardware | Software
Network Downtime | Impacting | Irrelevant
Network Availability | 99.9999% | 99.9%
Design | Primary/Backup | Active/Active
Speed | Performant
Deployment Time | Weeks | Minutes

6 SCRY
Microsoft’s SCRY measurement tool aligns actual resource use with the chargeback model.
From Allocating by Space… …To Allocating by Power
o Tracking Carbon
o Tracking Utilization
o Tracking Power
o Billing & Cost Allocation ($)

7
o Single architecture
o Limited configuration and customization options
o An initial deployment is still required to migrate data to Office 365
o AD cleanup and a network upgrade are often required
o Understand your internal security and privacy requirements
o Balance continuous innovation against minimizing change
o The customer controls IT policies but not feature availability

8

9 DEFINE THE FABRIC

10
1. Purchase OS
2. Install Role
3. Install App
4. Deploy Context
5. Configure Requests

11 OS Role App Context OS Role App Context Virtualization

12 OS Role OS Role OS Role OS Role OS Role OS Role OS Role OS Role OS Role OS Role OS Role OS Role OS Role OS Role Infrastructure Fabric

13 DEFINE THE FABRIC
System Center, Windows Server 2012, Workloads, Fast Track

Storage Consolidation
o Offloaded Data Transfer (ODX)
o Storage Spaces
o Thin Provisioning
o Deduplication
o Tiering

Server Virtualization
o High Performance & Shared Nothing Live Migration
o System Center multi-hypervisor support (Hyper-V, VMware, Xen)
o BitLocker Encryption
o Up to 64 TB Virtual Hard Disk (VHDX) size

o Software Defined Networking
o Virtual IP Address Management
o Datacenter Bridging

o Windows Server & Azure Active Directory
o Active Directory Federation Services

o PowerShell Automation, >3000 cmdlets
o Desired Configuration
o Windows Management Framework: WS-Management, REST, HTTP, PSRP

o Hyper-V Replica
o Windows Azure Hyper-V Recovery Manager

Microsoft Private Cloud Fast Track Guidance Set: http://technet.microsoft.com/en-us/jj572811

Microsoft Azure (App services, Data services, Infrastructure services)
Integration, HPC, Analytics, Web sites, Mobile services, Caching, Identity, Service bus, Media, Cloud services, SQL database, HDInsight, Table, Blob storage, Virtual machines, Virtual network, VPN, Traffic manager, CDN

14 DEFINE THE SERVICE

15 Microsoft Confidential – Internal Use Only

16

17

18

19

20

21 SCALE ^

22

23 Summary: Implement functional checks within an application that external tools can access through exposed endpoints at regular intervals. This pattern can help to verify that applications and services are performing correctly. http://aka.ms/Health-Endpoint-Monitoring-Pattern
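A minimal sketch of the health endpoint idea summarized above, using only the Python standard library. The `/health` path, the port, and the `check_storage` and `check_database` functions are hypothetical placeholders, not something the slide specifies; a real service would probe its actual dependencies.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def check_storage() -> bool:
    """Hypothetical functional check; replace with a real dependency probe."""
    return True


def check_database() -> bool:
    """Hypothetical functional check; replace with a real dependency probe."""
    return True


CHECKS = {"storage": check_storage, "database": check_database}


def run_checks(checks=CHECKS):
    """Run every check and return (http_status, report)."""
    report = {}
    for name, check in checks.items():
        try:
            report[name] = "ok" if check() else "failed"
        except Exception as exc:  # a crashing check counts as a failure
            report[name] = f"error: {exc}"
    healthy = all(value == "ok" for value in report.values())
    return (200 if healthy else 503), report


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the (assumed) /health path is served; everything else is 404.
        if self.path != "/health":
            self.send_error(404)
            return
        status, report = run_checks()
        body = json.dumps(report).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="127.0.0.1", port=8080):
    """Blocks forever; an external monitor polls http://host:port/health."""
    HTTPServer((host, port), HealthHandler).serve_forever()
```

Calling `serve()` starts the endpoint. An external monitoring tool would then request `/health` at regular intervals and treat any non-200 status as a failed check.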

24

25

26 Phases
o Document: Component interaction diagram
o Discover: Identify failure points; brainstorm failure modes using the DIAL categories (Discovery, Auth, Incorrectness, Limits, Component)
o Rate: Record failure effects; assess risk priority using Impact and Likelihood
o Act: Prioritize reliability work; remediate against effects and validate mitigations

27 Document - Component interaction diagram

28 Discover
Discovery
o Name resolution service health or configuration
o Caller configuration
Limits
o Timeouts and blocking
o Service unavailable or unhealthy, throttling
o Flooding, congestion, slow response times
Incorrectness
o Protocol and version mismatch
o Corruption, data fidelity, poison message
o Duplicate request, invalid state, timing errors
Auth
o Authentication service health or configuration
o Resource authorization configuration
Component
o Code or configuration changes
o Hangs, crashes, resource exhaustion
o Fault domains

29 Rate – Assessing Risk
Impact
o Effects: When this failure occurs, how deeply is the functionality impaired?
o Portion Affected: When this failure occurs, what portion of users or transactions are affected?
o Detection: How long does it take until an automated system or human is notified to take corrective measures?
o Resolution: How long does it take the automated system or human to restore functionality after the failure has been detected?
Likelihood
o Likelihood: How frequently is this failure likely to occur?

30 Act – Prioritize and Mitigate
Columns Effects through Resolution make up Impact; the last column is Likelihood.
ID | Component/Dependency Interactions | Failure Short Name | Failure Description | Consequences | Effects | Portion Affected | Detection | Resolution | Likelihood
3 | Storage Layer -> Azure Storage | Error 5xx from Azure Storage::Service | Azure Storage may respond with error | Return Error to caller. Service closed | Major impairment of core functionality | More than 50% | More than 15 min | More than 45 min | Multiple times a year
4 | Storage Layer -> Azure Storage | No Response from Azure Storage::Service | Azure Storage may fail to respond within the timeout period | No retry. Return Error to caller. | Major impairment of core functionality | More than 50% | More than 15 min | More than 45 min | Multiple times a year
5 | Storage Layer -> Azure Storage | Latency from Azure Storage::Service | Azure Storage component may | When memory pressure is sufficient, return Error to caller. | Major impairment of core functionality | More than 50% | More than 15 min | More than 45 min | Less than once a year
6 | Web Service -> Server API | Latency from Server API | The Server API may be slow to respond | Caller will timeout resulting in a client retry. | Major impairment of core functionality | Less than 2% | More than 15 min | More than 45 min | More than once a month
7 | Storage Layer -> Azure Storage | Error 5xx from Azure Storage::Service | Azure Storage may respond with error | Return Error to caller. Service closed | Major impairment of core functionality | More than 50% | More than 15 min | More than 45 min | Multiple times a year
8 | Storage Layer -> Azure Storage | No Response from Azure Storage::Service | Azure Storage may fail to respond within the timeout period | No retry. Return Error to caller. | Major impairment of core functionality | More than 50% | More than 15 min | More than 45 min | Multiple times a year
9 | Azure DNS | Azure DNS Failure::ClientAPI | The Azure DNS system may fail to respond | Error DNS not found returned to caller. | Major impairment of core functionality | Less than 2% | Between 5 min and 15 min | More than 45 min | Less than once a year
Rows ordered by Risk: 4, 7, 3, 5, 8, 6, 9
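The ranking step in the failure table above can be sketched as a small scoring routine. The slides give only the category labels, so the numeric scales below and the choice to multiply the scores are assumptions made for illustration. Effects and Resolution are left out because every row in the table shares the same value for them. The sketch is not expected to reproduce the exact ordering shown on the slide.

```python
from dataclasses import dataclass

# Hypothetical ordinal scales: the slides give the labels, not the numbers.
PORTION = {"Less than 2%": 1, "Less than 50%": 2, "More than 50%": 3}
DETECTION = {"Between 5 min and 15 min": 1, "More than 15 min": 2}
LIKELIHOOD = {
    "Less than once a year": 1,
    "Multiple times a year": 2,
    "More than once a month": 3,
}


@dataclass
class Failure:
    id: int
    short_name: str
    portion: str
    detection: str
    likelihood: str


def risk(failure: Failure) -> int:
    """Assumed formula: (partial) impact score multiplied by likelihood score."""
    impact = PORTION[failure.portion] * DETECTION[failure.detection]
    return impact * LIKELIHOOD[failure.likelihood]


def rank(failures):
    """Highest risk first; ties keep their original order."""
    return sorted(failures, key=lambda f: -risk(f))


rows = [
    Failure(3, "Error 5xx from Azure Storage", "More than 50%",
            "More than 15 min", "Multiple times a year"),
    Failure(5, "Latency from Azure Storage", "More than 50%",
            "More than 15 min", "Less than once a year"),
    Failure(6, "Latency from Server API", "Less than 2%",
            "More than 15 min", "More than once a month"),
    Failure(9, "Azure DNS Failure", "Less than 2%",
            "Between 5 min and 15 min", "Less than once a year"),
]

ranked = rank(rows)
```

Sorting by a single number keeps the prioritization repeatable: changing one rating for a failure moves it in the list without anyone re-arguing the whole order.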

31 Summary: Enable an application to handle anticipated, temporary failures when it attempts to connect to a service or network resource by transparently retrying an operation that has previously failed in the expectation that the cause of the failure is transient. This pattern can improve the stability of the application. http://aka.ms/Retry-Pattern
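A minimal sketch of the retry idea summarized above. The `TransientError` class, the default attempt count, and the exponential backoff with jitter are illustrative assumptions; the slide does not prescribe a specific backoff policy.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(); on TransientError wait and try again, up to `attempts` calls."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise  # out of attempts: surface the failure to the caller
            # Exponential backoff with jitter so retries do not arrive in lockstep.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay))
```

Only errors the caller has classified as transient are retried; anything else propagates immediately, which avoids hammering a service that is failing for a permanent reason.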

32 Summary: Handle faults that may take a variable amount of time to rectify when connecting to a remote service or resource. This pattern can improve the stability and resiliency of an application. http://aka.ms/Circuit-Breaker-Pattern
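A minimal sketch of a circuit breaker in the spirit of the summary above. The threshold, the timeout, and the class and method names are assumptions chosen for illustration. After a configurable number of consecutive failures the breaker fails fast, then lets a single trial call through once the timeout has elapsed.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the operation."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, operation):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, so one trial call is let through.
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Unlike a plain retry, the breaker stops calling a dependency that is likely to stay down for a while, which protects both the caller's threads and the struggling service.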

33 Summary: Control the consumption of resources used by an instance of an application, an individual tenant, or an entire service. This pattern can allow the system to continue to function and meet service level agreements, even when an increase in demand places an extreme load on resources. http://aka.ms/Throttling-Pattern
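One common way to implement the throttling described above is a token bucket per tenant. The sketch below is an assumption about how this could look, not the approach the slide prescribes; the class names, rate, and capacity are illustrative.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should reject the request, e.g. with HTTP 429


class TenantThrottle:
    """Keeps one bucket per tenant so a noisy tenant cannot starve the others."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self._args = (rate, capacity, clock)
        self._buckets = {}

    def allow(self, tenant_id):
        bucket = self._buckets.get(tenant_id)
        if bucket is None:
            bucket = self._buckets[tenant_id] = TokenBucket(*self._args)
        return bucket.allow()
```

A request handler would call `allow(tenant_id)` before doing any work and reject the request when it returns `False`, so load is shed before it reaches shared resources.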

34

35

36 Come Visit Us in the Microsoft Solutions Experience!
Look for Datacenter and Infrastructure Management
TechExpo Level 1 Hall CD
For More Information
o Windows Server 2012 R2: http://technet.microsoft.com/en-US/evalcenter/dn205286
o Microsoft Azure: http://azure.microsoft.com/en-us/
o System Center 2012 R2: http://technet.microsoft.com/en-US/evalcenter/dn205295
o Azure Pack: http://www.microsoft.com/en-us/server-cloud/products/windows-azure-pack

37 www.microsoft.com/learning http://microsoft.com/msdn http://microsoft.com/technet http://channel9.msdn.com/Events/TechEd

38

39

40

