Cloud Computing Development. Shallow Introduction.

3 Introduction

What is cloud computing?

Is it computing while in flight? No.

Cloud computing is the consumption of computing resources without worrying about the specifics, together with the ability to add or remove resources according to demand.

It is similar to the power grid and the telephone network.

11 How does it work?
- The consumer signs up for the service (the same as getting a mobile phone plan).
- The consumer uses services according to their needs.
- The provider sends the bill at the end of the cycle.
- The consumer pays.

14 Provider Models
- Software as a Service (SaaS): email, CRM, office apps
- Platform as a Service (PaaS): application servers, databases, middleware
- Infrastructure as a Service (IaaS): bare hardware (sort of)

17 Providers
- Software as a Service (SaaS): Google (Gmail), Salesforce, Microsoft (Office Live)
- Platform as a Service (PaaS): Google App Engine, Heroku / Engine Yard (Rails), Windows Azure (.NET)
- Infrastructure as a Service (IaaS): Amazon AWS, Rackspace, GoGrid

18 Provider: Windows Azure
- Platform as a service
- Windows based
- Storage provided through blob storage, drives, and SQL Azure
- State is stored and propagated with Queues and Tables
- Integrated with Visual Studio
- Eclipse plug-in for PHP

20 Provider: Google App Engine
- Platform as a service
- Python or Java based
- Storage provided through BigTable
- Automatically scales web nodes

21 Provider: Rackspace
- Infrastructure as a service
- Very basic: just a few Linux or Windows images
- Provides storage with CloudFiles
- Very cheap
- Open source API
- Relatively new

22 Provider: Amazon AWS
- Oldest on the market
- Many services / images / third-party providers
- Provides computation through EC2 / EMR
- Provides state / storage through S3, SQS, RDS, SimpleDB
- Multiple APIs

23 Sample Prices

Provider    Compute (per VM-hour)   Storage (per GB-month)   Transfer (per GB)
Amazon      $0.10+                  $0.15+                   $0.15+
Rackspace   $0.02+                  $0.15+                   $0.22+
Microsoft   $0.12                   $0.15                    $0.15
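
The pay-per-use arithmetic behind these prices can be sketched as follows. This is only an illustration: the rates are the starting prices listed on the slide (the "+" entries are taken at their minimum), while the `Rates` class, the `monthly_bill` function, and the example workload are hypothetical names and numbers introduced here, not any provider's billing API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Starting prices in USD, copied from the Sample Prices slide."""
    compute_per_vm_hour: float
    storage_per_gb_month: float
    transfer_per_gb: float


# "+" prices on the slide are treated as their minimum value (an assumption).
RATES = {
    "Amazon": Rates(0.10, 0.15, 0.15),
    "Rackspace": Rates(0.02, 0.15, 0.22),
    "Microsoft": Rates(0.12, 0.15, 0.15),
}


def monthly_bill(rates, vm_hours, stored_gb, transferred_gb):
    """Estimate one billing cycle: compute + storage + data transfer."""
    total = (
        rates.compute_per_vm_hour * vm_hours
        + rates.storage_per_gb_month * stored_gb
        + rates.transfer_per_gb * transferred_gb
    )
    return round(total, 2)


if __name__ == "__main__":
    # Hypothetical workload: 2 VMs for a 720-hour month, 100 GB stored, 50 GB transferred.
    for provider, rates in RATES.items():
        print(provider, monthly_bill(rates, vm_hours=2 * 720, stored_gb=100, transferred_gb=50))
```

For the Amazon row this workload comes to 144.00 (compute) + 15.00 (storage) + 7.50 (transfer) = 166.50 USD. Real bills depend on instance size, region, and pricing tiers that the slide does not cover.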

24 Development

26 Practical Considerations
Cloud development is slightly different from the traditional in-house model.
- Everything is virtualized (most of the time)
- Everything is distributed
- Per-instance reliability is much lower
- Overall reliability is much higher

27 Cloud Programming Model

28 Compute and interface nodes are not reliable: they can crash and disappear at any time. Storage and state are reliable and heavily distributed. At any time we can start more compute or interface nodes, and shut them down again when demand subsides.
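
The "start more nodes, shut them down when demand subsides" idea can be sketched as a simple sizing rule. This is a minimal sketch under assumptions: the function name `desired_nodes`, the per-node capacity, and the minimum and maximum limits are all hypothetical, and actually launching or terminating nodes is left to whatever provider API is in use.

```python
import math


def desired_nodes(pending_requests, per_node_capacity, min_nodes=1, max_nodes=20):
    """Return how many compute nodes to run for the current backlog.

    pending_requests:  work items currently waiting (for example, queue length)
    per_node_capacity: items one node can handle per scaling interval (assumed known)
    min_nodes/max_nodes: hypothetical bounds to keep the service up and the bill capped
    """
    if per_node_capacity <= 0:
        raise ValueError("per_node_capacity must be positive")
    needed = math.ceil(pending_requests / per_node_capacity)
    return max(min_nodes, min(max_nodes, needed))


if __name__ == "__main__":
    for backlog in (0, 120, 5000):
        print(backlog, "pending ->", desired_nodes(backlog, per_node_capacity=50), "nodes")
```

Because nodes can disappear at any time, the rule only decides how many nodes should exist; the backlog itself has to live in reliable state or storage, not on the nodes.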

29 Cloud Programming Model on Azure
- Compute: Worker Nodes
- State: Tables / Queues / SQL
- Storage: SQL / Tables / Blobs / Drives
- Client Interface: Web Nodes

30 Cloud Programming Model on AWS
- Compute: EC2 Instances
- State: S3 / Queues / SimpleDB / RDS
- Storage: S3 / SimpleDB / RDS
- Client Interface: S3 / EC2 / CloudFront

31 AWS Details: S3
- S3 = Simple Storage Service
- Guaranteed to be reliable
- Simple {key, value} storage
- Keys are stored within buckets
- Values can be as large as 5 GB
- Default storage mechanism for AWS
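
The bucket / key / value model described on this slide can be pictured with a toy in-memory store. This is not the S3 API: `ToyStore` and its method names are hypothetical, nothing here talks to AWS, and the size check simply mirrors the 5 GB figure quoted above.

```python
MAX_OBJECT_BYTES = 5 * 1024 ** 3  # the 5 GB per-value limit quoted on the slide


class ToyStore:
    """In-memory stand-in for a {key, value} store with keys grouped into buckets."""

    def __init__(self):
        self._buckets = {}  # bucket name -> {key: bytes}

    def create_bucket(self, name):
        # Creating an existing bucket is a no-op in this toy model.
        self._buckets.setdefault(name, {})

    def put(self, bucket, key, value):
        if len(value) > MAX_OBJECT_BYTES:
            raise ValueError("value exceeds the 5 GB limit")
        self._buckets[bucket][key] = value

    def get(self, bucket, key):
        return self._buckets[bucket][key]

    def list_keys(self, bucket, prefix=""):
        # Keys are flat strings; "folders" are only a naming convention.
        return sorted(k for k in self._buckets[bucket] if k.startswith(prefix))


if __name__ == "__main__":
    store = ToyStore()
    store.create_bucket("photos")
    store.put("photos", "2009/cat.jpg", b"...jpeg bytes...")
    store.put("photos", "2009/dog.jpg", b"...jpeg bytes...")
    print(store.list_keys("photos", prefix="2009/"))
```

The point of the sketch is the shape of the data model: there are no directories or schemas, only a bucket name, a key, and an opaque value.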

32 AWS Details: SimpleDB
- Schemaless database
- The main storage unit is the domain (similar to a table)
- Each record can have many attributes; new attributes can be added at any time
- Similar to LISP / Scheme attributes
- A domain can be queried for records containing a particular attribute
- No joins / unions with other domains
- Eventual consistency
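
A schemaless domain of this kind can be imitated with plain dictionaries. The sketch below is a toy model only: `ToyDomain`, `put`, and `with_attribute` are hypothetical names, not SimpleDB calls, and eventual consistency is not simulated.

```python
class ToyDomain:
    """In-memory stand-in for a schemaless domain: records are free-form attribute maps."""

    def __init__(self):
        self._records = {}  # record name -> {attribute: value}

    def put(self, record_name, **attributes):
        # New attributes can be added to an existing record at any time.
        self._records.setdefault(record_name, {}).update(attributes)

    def with_attribute(self, attribute, value=None):
        """Names of records that contain `attribute` (optionally with a given value)."""
        return sorted(
            name
            for name, attrs in self._records.items()
            if attribute in attrs and (value is None or attrs[attribute] == value)
        )


if __name__ == "__main__":
    songs = ToyDomain()
    songs.put("song1", title="Alpha", artist="X")
    songs.put("song2", title="Beta")
    songs.put("song2", year="2009")  # attribute added later, no schema change needed
    print(songs.with_attribute("artist"))
    print(songs.with_attribute("year", "2009"))
```

Each query looks at one domain only, which matches the "no joins / unions" restriction: combining data from two domains has to happen in application code.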

33 AWS Details: RDS
- RDS = Relational Database Service
- MySQL in cluster mode
- Preferred to simply running a DB server within an instance (ask me why for details)

34 AWS Details: SQS
- SQS = Simple Queue Service
- Massively scalable
- Lets you put a message in the queue and retrieve it later
- Retrieving a message hides it from other users
- When a message has been processed, it is deleted from the queue
- If a message is not deleted before the timeout, it is returned to the queue
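
The hide-on-retrieve and return-on-timeout behaviour described above can be sketched with a toy in-memory queue. This is not the SQS API: `ToyQueue` and its methods are hypothetical, the 30-second timeout is an arbitrary example value, and time is passed in explicitly as `now` so the behaviour is easy to follow.

```python
import itertools


class ToyQueue:
    """In-memory model of a queue whose messages are hidden while being processed."""

    def __init__(self, visibility_timeout=30.0):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> [body, time at which it becomes visible]
        self._ids = itertools.count(1)

    def send(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = [body, 0.0]  # visible immediately
        return msg_id

    def receive(self, now):
        """Return (id, body) of the first visible message and hide it, or None."""
        for msg_id, entry in self._messages.items():
            body, visible_at = entry
            if visible_at <= now:
                entry[1] = now + self.visibility_timeout  # hide from other consumers
                return msg_id, body
        return None

    def delete(self, msg_id):
        """Called by the consumer once processing has succeeded."""
        self._messages.pop(msg_id, None)


if __name__ == "__main__":
    q = ToyQueue(visibility_timeout=30.0)
    job = q.send("resize image 42")
    print(q.receive(now=0))    # first worker takes the job
    print(q.receive(now=10))   # hidden: a second worker sees nothing
    print(q.receive(now=31))   # first worker crashed, so the job reappears
    q.delete(job)              # second worker finishes and deletes it
    print(q.receive(now=100))  # queue is now empty
```

This pattern is what lets unreliable compute nodes do reliable work: a node that crashes mid-job never calls `delete`, so the message becomes visible again and another node picks it up. The consequence is that a job may run more than once, so processing should be safe to repeat.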

35 AWS Details: EC2
- EC2 = Elastic Compute Cloud
- Lets you run arbitrary virtual machines, provided they are compatible with Amazon's modified Xen
- Kernels and startup disks are stored in S3
- Machines also have large local storage
- Machines are not exactly like physical machines
  - Local storage is not persistent: when a machine is shut down, all local data disappears
  - Hardware TCP [no packet layer / no broadcast]
- Can launch many copies of a machine at the same time
- Lots of preconfigured machines

36 AWS Details: Other Services
- EMR = Elastic MapReduce: lets you run Hadoop jobs on EC2
- CloudFront: content delivery network
- ELB = Elastic Load Balancing
- EBS = Elastic Block Store: S3-backed persistent storage
- Public Data Sets: lots of publicly available data, such as Census (1980, 1990, 2000), Wikipedia logs, Freebase dumps, and genetic and chemistry data

37 Starting Up
- Amazon account
- Credentials
  - KeyID : SecretKey
  - X.509 certificate

38 Helpful Tools
- S3Fox: Firefox extension for browsing S3
- Elasticfox: Firefox extension for operating EC2
- Transmit: Mac utility for S3 ($)
- RightScale: web-based platform for managing everything (free / $)

39 Libraries
- Official Amazon libraries (Java)
- Unofficial libraries: .NET / Ruby / Perl
- AWS4C: C / C++ / Objective-C
- Boto: very popular Python library (official Hadoop/EC2 library)

41 Running Hadoop on EC2

