Joan Wortman Architecting for the Cloud Bill Wilder An App in the Cloud is not a Cloud-Native App Boston Code Camp #19 08-Mar-2013 (2:50 – 4:00 PM EDT)

Questions for the end
What are the other options?
– VM for legacy app (can’t change or re-architect)
When to go with a Cloud Service vs. a Web Site?
– Factors: State, Customization, Scale, Latency, Perf, CDN, Geo-LB
– NOT FACTORS: high volume, auto-scaling, monitoring (both are PaaS)
How does all this make it more manageable?
– Self-healing (QCW + what Fabric Controller does), Auto-Scale, Fabric Controller helps, Log Gathering & Metrics Dashboard (could use some more work)
– Auto-Scaling Pattern enabled through WASABi, a 3rd-party service, or an Azure service in the Azure Store (?)

Answer inline TTM
Discover how you can successfully architect Windows Azure-based applications to avoid and mitigate performance and reliability issues with our live webinar.
Microsoft’s Windows Azure cloud offerings provide you with the ability to build and deliver a powerful cloud-based application in a fraction of the time and cost of traditional on-premises approaches. So what’s the problem? Tried-and-true traditional architectural concepts don’t apply when it comes to cloud-native applications. Building cloud-based applications must factor in answers to such questions as:
– How to scale?
– How to overcome failure? (QCW)
– How to build a manageable system? (Self-healing, Auto-Scale, Fabric Controller helps, PaaS: less for ME to do)
– How to minimize monthly bills from cloud vendors?
If you want to avoid long nights, help-desk calls, and frustrated business owners and end-users, then don’t miss this webinar or your chance to learn how to deliver highly scalable, high-performance cloud applications.

Who is Bill Wilder?

Roadmap for this talk…
1. Define relevant “cloud” types from software development point of view
2. App in the Cloud != Cloud App (or at least not a Cloud-Native App)
3. What could go wrong?
4. Consider UX factors

The term “cloud” is nebulous…

NIST Terminology
SaaS = Software as a Service (BYO users)
PaaS = Platform as a Service (BYO apps)
IaaS = Infrastructure as a Service (BYO VMs)
[Diagram labels: Simplicity, Complexity, Flexibility, Rigidity, Power?]

Public Cloud Rental Models: ___________________ as a Service
– Apps, $/user, Expertise, SLA
– App Services as OpEx, OS, DBMS, etc. with patching & upgrades, Environment Monitoring, Expertise, SLA
– Virtualized Hardware as OpEx, Networking, Automation, Elasticity, Price Transparency, Global Data Centers, Expertise, SLA
[Logo: AppHarbor]

“Bring Your Own” ____ as a Service

What is different about the cloud?

1/9th above water → TTM & Sleeping well =

MTBF MTTR multitenant services + commodity hardware = cost-efficient cloud

This bar is always open *and* has an API Pay by the Drink

∞ Resource allocation (scaling) is: – Horizontal – Bi-directional – Automatable The “illusion of infinite resources”

Cloud-Native Application Characteristics Application architecture is aligned with the cloud platform architecture – uses the platform in the most natural way – lets the platform do the heavy lifting

Tells: Traditional vs Cloud-Native
TELLS/CLUES
– Traditional: 2-tier; Single data center; Vertical scaling; Ignores failure; Hardware or IaaS
– Cloud-Native: 3- or N-tier, SOA; Multi-data center; Horizontal scaling; Expects failure; PaaS
CONSEQUENCES
– Traditional: Less flexible; More manual/attention; Less reliable (SPoF); Maintenance window; Less scalable
– Cloud-Native: Agile/faster TTM; Auto-scaling; Self-healing; HA; Geo-LB/FO
Which is “best” architecture? There is no “best” architecture – it is situational, depending on technical and business context. Not every application should be cloud-native. Traditional architectures are fine for many apps. Cloud-native popularity is growing in proportion to the shrinking cost and the competitive benefits.

Putting Cloud Services to work Putting the cloud to work

Simple idea, simple app
Two tiers: web tier (one server) + database
What’s the problem? But… what’s WRONG with this architecture?
Different ≠ WRONG. Use the right tool for the job. Some apps are simply not a good fit for the cloud.

Simple idea, simple app
Two tiers: web tier (one server) + database
What can go wrong? We’ll reexamine:
1. Scaling the web tier
2. Scaling the service tier
3. Scaling the data tier
4. Handling failure
5. Operational efficiency (scale the app, not the team!)

Horizontal Scaling Compute Pattern pattern 1 of 5

Common Terminology:
Scaling Up/Down → Vertical Scaling
Scaling Out/In → Horizontal “Scaling” → But really is Horizontal Resource Allocation
Architectural Decision
– Big decision… hard to change
Scale Up (and Scale Down??) vs. Horizontal Resourcing

Vertical Scaling (“Scaling Up”)
Resources that can be “Scaled Up”
– Memory: speed, amount
– CPU: speed, number of CPUs
– Disk: speed, size, multiple controllers
– Bandwidth: higher capacity pipe
– … and it sure is EASY
Downsides of Scaling Up
– Hard Upper Limit
– HIGH END HARDWARE → HIGH END CO$T
– Lower value than “commodity hardware”
– May have no other choice (architectural)

Scaling Horizontally: Adding Boxes
Autonomous nodes for scalability (stateless web servers, shared nothing DBs, your custom code in QCW)
*and* Homogeneous nodes for operational simplicity
*and* Anonymous nodes: don’t get emotionally involved!
This is how a [public] CLOUD PLATFORM works *and* this is how YOUR CLOUD-NATIVE app works

Example: Web Tier
[Diagram: Load Balancer (Cloud Service), Managed VMs (Cloud Service), “Web Role”]

Horizontal Scaling Considerations
1. Auto-Scale
– Bidirectional
2. Nodes can fail
– Auto-Scale is only one cause
– Handle shutdown signals
– Stateless (“like a taxi”) vs. Sticky Sessions
– Stateless nodes vs. Stateless apps
– N+1 rule vs. occasional downtime (UX)
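To make the “stateless nodes vs. stateless apps” distinction concrete, here is a minimal Python sketch rather than Azure code. The names `SessionStore`, `WebNode`, and `handle_request` are hypothetical, and the dict-backed store is an assumed stand-in for whatever shared storage a real application would use. The point is only that the application still has state, but no individual node holds it.

```python
import itertools


class SessionStore:
    """Stand-in for shared, durable storage that lives outside the web nodes."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id, {"cart": []})

    def put(self, session_id, state):
        self._data[session_id] = state


class WebNode:
    """A stateless web node: it keeps nothing between requests."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle_request(self, session_id, item):
        state = self.store.get(session_id)   # load state for this request
        state["cart"].append(item)
        self.store.put(session_id, state)    # save it back before replying
        return {"served_by": self.name, "cart": list(state["cart"])}


store = SessionStore()
nodes = [WebNode("web-0", store), WebNode("web-1", store), WebNode("web-2", store)]
round_robin = itertools.cycle(nodes)  # load balancer with no sticky sessions

replies = [next(round_robin).handle_request("user-42", item)
           for item in ["book", "pen", "lamp"]]

# Simulate losing a node (failure or scale-in): the session survives.
nodes.pop(0)
final = nodes[0].handle_request("user-42", "mug")
```

Because any node can serve any request, the load balancer needs no sticky sessions, and removing a node costs capacity but not user data.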

What’s the difference between performance and scale? ?

Do Performance and Scale Matter?
System Responsiveness*    User’s perception
0.1 seconds               feeling of instantaneous response
1 second                  user’s flow of thought seamless
10 seconds                start thinking about other things
> 3 seconds: 40% of visitors abandon**
* NNG  ** Kissmetrics

Bottom line for your business
[Chart labels: % Lost Revenue, Reduced Clicks, 00:00:02 Delay] * Kissmetrics

Elastic Scaling – Peak usage – Data analysis

During Super Bowl 2013 – Anticipated network spike – Scaled to 200 clusters – Millions of tags After – Scaled back

Aug 2012 Obama Ask Me Anything Spike in traffic crashed the site 2,987,307 page views 30 dedicated servers overwhelmed

Queue-Centric Workflow Pattern (QCW for short) pattern 2 of 5

Extend example into Service Tier QCW enables applications where the UI and back-end services are Loosely Coupled (Compare to CQRS at end if there is interest)

QCW Example: User Uploads Photo Web Server Compute Service Reliable Queue Reliable Storage

QCW WE NEED: Compute (VM) resources to run our code Reliable Queue to communicate Durable/Persistent Storage

Where does Windows Azure fit?

QCW [on Windows Azure] WE NEED:
– Compute (VM) resources to run our code → Web Roles (IIS) and Worker Roles (w/o IIS)
– Reliable Queue to communicate → Azure Storage Queues
– Durable/Persistent Storage → Azure Storage Blobs & Tables; WASD

QCW on Azure: User Uploads a Photo
[Diagram: Web Role (IIS), Azure Queue, Worker Role, Azure Blob]
UX implications: how does user know thumbnail is ready? push or pull

QCW enables Responsive UX
Response to interactive users is as fast as a work request can be persisted
Time consuming work done asynchronously
Comparable total resource consumption, arguably better subjective UX
UX challenge – how to express Async to users?
– Communicate Progress
– Display Final results
– Long Polling/Web Sockets (e.g., SignalR or Node.io)

QCW enables Scalable App
Decoupled front/back provides insulation
– Blocking is Bane of Scalability
– Order processing partner doing maintenance
– Twitter down
– Server unreachable
– Internet connectivity interruption
Loosely coupled, concern-independent scaling (see next slide)
– Get Scale Units right
– Key to optimizing operational CO$T$

General Case: Many Roles, Many Queues
[Diagram: Web Role (IIS), Web Role (Public), Web Role (Admin); Worker Role Type 1, Worker Role Type 2; Queue Type 1, Queue Type 2, Queue Type 3]
Scaling best when Investment ∝ Benefit
Optimize for CO$T EFFICIENCY
Logical vs. Physical Architecture depends on current scale

Reliable Queue & 2-step Delete
[Diagram: Web Role (IIS), Queue, Worker Role]
Web Role:
    var url = "…";
    queue.AddMessage( new CloudQueueMessage( url ) );
Worker Role:
    var invisibilityWindow = TimeSpan.FromSeconds( 10 );
    CloudQueueMessage msg = queue.GetMessage( invisibilityWindow );
    // … do some processing, then …
    queue.DeleteMessage( msg );
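The two-step delete can be hard to picture from the API calls alone, so here is a small in-memory model in Python. It is a sketch of the semantics, not the Azure Storage SDK: `ReliableQueue` and its method names are hypothetical, and time is passed in explicitly (`now`) so the example is deterministic. Getting a message only hides it for the invisibility window. If the worker never deletes it, the message becomes visible again for another worker.

```python
import itertools


class ReliableQueue:
    """In-memory model of a queue with an invisibility window (two-step delete)."""

    def __init__(self):
        self._messages = {}               # id -> message dict
        self._ids = itertools.count(1)

    def add_message(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = {"id": msg_id, "body": body,
                                  "visible_at": 0.0, "dequeue_count": 0}

    def get_message(self, invisibility_window, now):
        """Step 1: hand out a message and hide it; it is NOT removed yet."""
        for msg in self._messages.values():
            if msg["visible_at"] <= now:
                msg["visible_at"] = now + invisibility_window
                msg["dequeue_count"] += 1
                return dict(msg)
        return None

    def delete_message(self, msg):
        """Step 2: remove the message once processing has succeeded."""
        self._messages.pop(msg["id"], None)

    def __len__(self):
        return len(self._messages)


queue = ReliableQueue()
queue.add_message("photo-123")

# Worker A takes the message at t=0 and then crashes before deleting it.
first = queue.get_message(invisibility_window=10, now=0)
hidden = queue.get_message(invisibility_window=10, now=5)    # still invisible

# After the window expires, the message reappears for worker B.
second = queue.get_message(invisibility_window=10, now=11)
queue.delete_message(second)                                 # work succeeded
```

The same message was delivered twice here, which is why the processing code on the worker side has to tolerate repeats.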

QCW requires Idempotency
Perform an idempotent operation more than once, and the end result is the same as if we did it once
Example with Thumbnailing (easy case)
App-specific concerns dictate approaches
– Compensating action, Last write wins, etc.
PARTNERSHIP: division of responsibility between cloud platform & app
– Far cry from database transaction
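As an illustration of why thumbnailing is the “easy case”, the Python sketch below processes the same message twice and compares the results. Everything here is assumed for illustration: `blob_store` is a dict standing in for durable storage, slicing a string stands in for resizing, and `log_upload_non_idempotent` is a made-up counterexample.

```python
blob_store = {}  # stand-in for durable blob storage: name -> content


def make_thumbnail(photo_name):
    """Idempotent: the output name and content depend only on the input."""
    source = blob_store[photo_name]
    thumb_name = "thumb/" + photo_name          # deterministic key, no counters
    blob_store[thumb_name] = source[:8]         # pretend "resize": overwrite in place
    return thumb_name


def log_upload_non_idempotent(photo_name, counter):
    """Contrast: incrementing changes the result on every retry."""
    counter["uploads"] += 1
    return counter["uploads"]


blob_store["cat.jpg"] = "0123456789abcdef"

make_thumbnail("cat.jpg")
after_once = dict(blob_store)
make_thumbnail("cat.jpg")          # the same message is delivered a second time
after_twice = dict(blob_store)

counter = {"uploads": 0}
log_upload_non_idempotent("cat.jpg", counter)
log_upload_non_idempotent("cat.jpg", counter)
```

Reprocessing the thumbnail leaves storage unchanged, while the counter drifts with every duplicate delivery. Operations of the second kind are where compensating actions or last-write-wins rules come in.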

QCW expects Poison Messages
A Poison Message cannot be processed
– Error condition for non-transient reason
– Check CloudQueueMessage.DequeueCount property
Falling off the queue may kill your system
Determine a Max Retry policy per queue
– Delete, put on “bad” queue, alert human, …
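A max-retry policy based on the dequeue count could look roughly like the Python sketch below. It is a simulation, not SDK code: the threshold `MAX_DEQUEUE_COUNT = 3`, the `process` handler, and the list-based queues are all assumptions made for the example. The loop increments `dequeue_count` itself to mimic what the queue service does on each get.

```python
MAX_DEQUEUE_COUNT = 3   # assumed per-queue retry policy


def process(body):
    """Hypothetical handler: fails permanently on malformed input."""
    if body.startswith("corrupt"):
        raise ValueError("cannot decode " + body)
    return "thumbnail for " + body


def handle(msg, bad_queue, results):
    """Return True when the message should be deleted from the main queue."""
    if msg["dequeue_count"] > MAX_DEQUEUE_COUNT:
        bad_queue.append(msg["body"])   # park it for a human to inspect
        return True
    try:
        results.append(process(msg["body"]))
        return True
    except ValueError:
        return False                    # leave it; it reappears after the window


main_queue = [{"body": "cat.jpg", "dequeue_count": 0},
              {"body": "corrupt.jpg", "dequeue_count": 0}]
bad_queue, results = [], []

passes = 0
while main_queue:
    passes += 1
    for msg in list(main_queue):
        msg["dequeue_count"] += 1       # the queue service does this on each get
        if handle(msg, bad_queue, results):
            main_queue.remove(msg)
```

The good message is handled on the first pass. The poison message is retried three times and moved to the “bad” queue on the fourth delivery, so it cannot block the queue forever.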

QCW requires “Plan for Failure”
VM restarts will happen
– Hardware failure, O/S patching, crash (bug)
Bake in handling of restarts into our apps
– Restarts are routine: system “just keeps working”
– Idempotent mindset is key
– Event Sourcing (commonly seen with CQRS) may help
Not an exception case! Expect it!
Consider N+1 Rule

What’s Up? Reliability as EMERGENT PROPERTY
[Table] Columns: Typical Site | Any 1 Role Inst | Overall System
Rows: Operating System Upgrade; Application Code Update; Scale Up, Down, or In; Hardware Failure; Software Failure (Bug); Security Patch

Aside: Is QCW same as CQRS?
Short answer: “no”
CQRS – Command Query Responsibility Segregation
– Commands change state
– Queries ask for current state
– Any operation is one or the other
– Sometimes includes Event Sourcing
– Sometimes modeled using Domain Driven Design (DDD)

What about the Data?
You: Azure Web Roles and Azure Worker Roles
– Taking user input, dispatching work, doing work
– Follow a decoupled queue-in-the-middle pattern
– Stateless compute nodes
Cloud: “Hard Part”: persistent, scalable data
– Azure Queue & Blob Services
– Three copies of each byte
– Blobs are geo-replicated
– Busy Signal Pattern

What about the Users? No direct connection between user’s action and system’s reaction User Experience Challenge System Status Keep user informed about what’s going on Appropriate feedback in reasonable amount of time

LIE…in a good way Uploading video files to FB – Block users w/status indicator – Upload and conversion Stack Overflow – My post is cached – Delay for others

Badges and Notifications

Confirmations Amazon tells you your order was taken, but doesn’t mean you own it yet… – They recheck inventory – Send confirmation Credit card/Cell bills – Post next business day Airline reservations – Some will even tell you how many seats left

Polling

Database Sharding Pattern pattern 3 of 5

Extend example into Data Tier What happens when demands on data tier grow? The Database Sharding Pattern a little about reliability – a lot about scale and performance

Foursquare is a Social Network

Foursquare #Fail October 4, 2010 – trouble begins… After 17 hours of downtime over two days… “Oct. 5 10:28 p.m.: Running on pizza and Red Bull. Another long night.” WHAT WENT WRONG?

What is Sharding?
Problem: one database can’t handle all the data
– Too big, not performant, needs geo distribution, …
Solution: split data across multiple databases
– One Logical Database, multiple Physical Databases
Each Physical Database Node is a Shard
Most scalable is Shared Nothing design
– May require some denormalization (duplication)

All shards have the same schema

Sharding is Difficult
What defines a shard? (Where to put stuff?)
– Example – use country of origin: customer_us, customer_fr, customer_cn, customer_ie, …
– Use same approach to find records (can use lookup)
What happens if a shard gets too big?
– Rebalancing shards can get complex
– Foursquare case study is interesting
How to query / join / transact across shards
Cache coherence, connection pool management
– Roll-your-own challenge
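The country-of-origin example can be sketched in a few lines of Python to show where the roll-your-own work comes from. This is an illustration only: `ShardedCustomers` and its methods are hypothetical, each shard is a plain dict, and the catch-all `customer_other` shard is an assumption added so that unknown countries have somewhere to go.

```python
class ShardedCustomers:
    """One logical customer database split across several physical shards."""

    def __init__(self, shard_names, default_shard):
        # Every shard has the same "schema": customer_id -> record.
        self.shards = {name: {} for name in shard_names}
        self.default_shard = default_shard

    def shard_for(self, country):
        """Shard key lookup: country code -> shard name."""
        name = "customer_" + country.lower()
        return name if name in self.shards else self.default_shard

    def insert(self, customer_id, country, record):
        self.shards[self.shard_for(country)][customer_id] = record

    def find(self, customer_id, country):
        """Single-shard read: the shard key tells us where to look."""
        return self.shards[self.shard_for(country)].get(customer_id)

    def count_all(self):
        """Cross-shard query: must fan out to every shard and combine."""
        return sum(len(rows) for rows in self.shards.values())


db = ShardedCustomers(
    ["customer_us", "customer_fr", "customer_cn", "customer_ie", "customer_other"],
    default_shard="customer_other",
)
db.insert(1, "US", {"name": "Ann"})
db.insert(2, "FR", {"name": "Luc"})
db.insert(3, "BR", {"name": "Rui"})   # no dedicated shard, falls to the default
```

Reads and writes that carry the shard key stay on one shard. Anything that spans shards, such as `count_all`, has to visit all of them, and rebalancing would mean moving records and changing `shard_for` at the same time.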

Where does Windows Azure fit?

Windows Azure SQL Database (WASD) is SQL Server Except…
Common
– “Just change the connection string…”
SQL Server Specific (for now)
– Full Text Search
– Transparent Data Encryption (TDE)
– Many more…
WASD Specific
– Limitations: 150 GB size limit, Busy Signal Pattern
– Extra Capabilities: Managed Service, Highly Available, Rental model, Federations
Additional information on Differences:

Windows Azure SQL Database Federations for Sharding
Single “master” database
– “Query Fanout” makes partitions transparent
– Instead of customer_us, customer_fr, etc… we are back to customer database
Handles redistributing shards
Handles cache coherence
Simplifies connection pooling
No MERGE (yet); SPLIT only
Bonus feature for Multitenant Applications
USE FEDERATION myfed (myfedkey = 911) WITH FILTERING=ON, RESET
connectivity-model-for-federated-data.aspx

Foursquare #Fail Foursquare was implementing database sharding in the application layer. WASD Federations makes this unnecessary. WHAT WENT WRONG?

My database instance is limited to 150 GB. Does that mean the cloud doesn’t really offer the illusion of infinite resources?

Busy Signal Pattern pattern 4 of 5
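The slide gives only the pattern name. As a rough Python sketch, assuming a “busy signal” is a transient throttling failure from a shared cloud service that the client handles by retrying with a growing delay: `ServiceBusy`, `call_with_retry`, and `make_flaky_service` are hypothetical names, and the delays and attempt limit are arbitrary. No real sleeping happens unless a `sleep` function is passed in.

```python
class ServiceBusy(Exception):
    """Raised when the (simulated) cloud service asks the caller to back off."""


def call_with_retry(operation, max_attempts=4, base_delay=0.5, sleep=None):
    """Retry a throttled call, doubling the wait after each busy signal."""
    waits = []
    for attempt in range(1, max_attempts + 1):
        try:
            return operation(), waits
        except ServiceBusy:
            if attempt == max_attempts:
                raise                       # give up and surface the failure
            delay = base_delay * 2 ** (attempt - 1)
            waits.append(delay)
            if sleep is not None:
                sleep(delay)                # pass time.sleep in real use


def make_flaky_service(busy_times):
    """Hypothetical service that is busy for the first `busy_times` calls."""
    state = {"calls": 0}

    def operation():
        state["calls"] += 1
        if state["calls"] <= busy_times:
            raise ServiceBusy()
        return "ok after %d calls" % state["calls"]

    return operation


result, waits = call_with_retry(make_flaky_service(busy_times=2))
```

A call that stays busy past `max_attempts` still raises, so the caller can fall back to the user-facing handling discussed earlier instead of blocking indefinitely.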

Auto-Scaling Pattern pattern 5 of 5
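This slide is also a title only. One way to picture automatable, bi-directional scaling is a rule that derives the worker count from the queue backlog. The Python sketch below is an assumption-laden illustration, not WASABi or any Azure API: `target_instances` and all of its thresholds are made up for the example.

```python
import math


def target_instances(queue_length, msgs_per_instance=100,
                     min_instances=2, max_instances=20):
    """Rule: one worker per `msgs_per_instance` queued messages, within bounds.

    min_instances=2 reflects an N+1 style floor; max_instances caps the bill.
    """
    wanted = math.ceil(queue_length / msgs_per_instance)
    return max(min_instances, min(max_instances, wanted))


# Queue length sampled over time: quiet, spike, then quiet again.
samples = [50, 450, 2500, 900, 0]
plan = [target_instances(n) for n in samples]
```

The plan grows during the spike and shrinks afterwards, which is the bi-directional part. The floor keeps spare capacity for node failures, and the ceiling bounds the monthly bill.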

In Conclusion

Pre-Cloud vs. Cloud-Native
Lessons: being Cloud-Native
1:15,000 | Efficiency
Auto-Scaling via API | Dynamic/∞ Resources
Pay-As-You-Go | Variable/OpEx
Stateless, Autonomous | Horizontal Resourcing
N+1, Idempotent | Minimize MTTR
SQL, NoSQL, Blob | Scenario-specific Storage
VM, Storage, LB, DR | Managed Infrastructure

Know the rules “Know the rules well, so you can break them effectively.” - Dalai Lama XIV

Further Information Windows Azure Boston Azure User Group Cloud Architecture Patterns

Cloud Architecture Patterns book
Primer Chapters
1. Scalability
2. Eventual Consistency
3. Multitenancy and Commodity Hardware
4. Network Latency

Cloud Architecture Patterns book
Pattern Chapters
1. Horizontally Scaling Compute Pattern
2. Queue-Centric Workflow Pattern
3. Auto-Scaling Pattern
4. MapReduce Pattern
5. Database Sharding Pattern
6. Busy Signal Pattern
7. Node Failure Pattern
8. Colocate Pattern
9. Valet Key Pattern
10. CDN Pattern
11. Multisite Deployment Pattern

BostonAzure.org Boston Azure Cloud User Group Focused on Microsoft’s Public Cloud Platform Roles: Architect, Dev, IT Pro, DevOps (“WazOps”) Talks, Demos, Tools, Hands-on, special events, … Monthly, 6:00-8:30 PM in Boston area (free) Follow on More info or to join our Meetup.com group:

Joan Wortman User Experience Specialist 17 years experience

Business Card

My name is Bill Wilder
blog.codingoutloud.com

Questions? Comments? More information? ?

DONE