Cloud Service Models and Performance Ang Li 09/13/2010
Roadmap Re-cap on cloud computing Classes of cloud providers Common services offered by cloud providers Cloud performance comparison
Re-cap on cloud computing What is cloud computing? Cloud application: software-as-a-service Cloud provider: hardware and software infrastructure that supports the cloud applications Benefits of cloud computing Pay-as-you-go On-demand scaling
Classes of cloud providers Platform as a Service (PaaS) Infrastructure as a Service (IaaS)
Infrastructure-as-a-Service providers Offer near bare-metal virtual machines You can ssh in, get root privileges, install new software, do whatever you want Can use the APIs provided by the OS Highly flexible and customizable Charge by machine-hour (how many machines you use) × (how long you use them) You can afford it!
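The machine-hour charge on this slide can be sketched as a one-line calculation. This is a minimal sketch, not any provider's billing code: the class and method names are hypothetical, the price is a made-up figure, and rounding partial hours up to whole hours is an assumption.

```java
final class MachineHourCost {
    // cost = (machines) x (billed hours) x (price per machine-hour).
    // Assumption: partial hours are billed as whole hours.
    static double cost(int machines, double hoursUsed, double pricePerMachineHour) {
        long billedHours = (long) Math.ceil(hoursUsed);
        return machines * billedHours * pricePerMachineHour;
    }

    public static void main(String[] args) {
        // 10 machines for 2.5 hours at a hypothetical $0.10 per machine-hour.
        System.out.println(cost(10, 2.5, 0.10));
    }
}
```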
Platform-as-a-Service providers Offer a sandbox environment Upload a program, they run it for you Can only use the APIs provided by the environment Charged by CPU utilization Pay only for the resources you use A few free hours per day!
Windows Azure: a combination of both Offers a sandbox environment for C# (PaaS) Charges by machine-hour (IaaS) Demo time!
Discussion What are the pros and cons of IaaS and PaaS? Which one do you prefer? Your homepage E-business application Video processing
Services offered by a cloud Elastic compute cluster Persistent storage Intra-cloud and wide-area network and others… MapReduce service CDN service
Elastic compute cluster Where your application is running VM or sandbox environment Why cluster? Multiple instances can be running your application simultaneously Why elastic? You can add new instances or remove existing ones with very short latency
Scaling the compute cluster Opaque scaling User can manually increase/decrease the number of instances Alternatively, she can set up a scaling policy IaaS providers (including Azure) Transparent scaling Scaling happens automatically (magically) PaaS providers
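A scaling policy of the kind a user might set up under opaque scaling can be sketched as a simple threshold rule. This is an illustration only: the class name, the CPU-utilization thresholds, and the one-instance step are assumptions, not any provider's policy format.

```java
final class ScalingPolicy {
    private final double scaleDownBelow; // average CPU utilization in [0, 1]
    private final double scaleUpAbove;
    private final int minInstances;
    private final int maxInstances;

    ScalingPolicy(double scaleDownBelow, double scaleUpAbove,
                  int minInstances, int maxInstances) {
        this.scaleDownBelow = scaleDownBelow;
        this.scaleUpAbove = scaleUpAbove;
        this.minInstances = minInstances;
        this.maxInstances = maxInstances;
    }

    // Returns the instance count the policy asks for, moving one instance
    // at a time and staying within [minInstances, maxInstances].
    int desiredInstances(int current, double avgCpuUtilization) {
        if (avgCpuUtilization > scaleUpAbove) {
            return Math.min(current + 1, maxInstances);
        }
        if (avgCpuUtilization < scaleDownBelow) {
            return Math.max(current - 1, minInstances);
        }
        return current;
    }
}
```

With transparent scaling, a rule like this would live inside the provider rather than in user configuration.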
Persistent storage Where to store your application's data Why persistent? Local storage (VM disk) is not reliable Different types of storage service Table, blob, queue, etc. Highly scalable and available
Access the storage services Storage APIs HTTP-based GET https://sdb.amazonaws.com/ ?Action=PutAttributes &Attribute.1.Name=data &Attribute.1.Value=haha &AWSAccessKeyId=[valid access key id] &DomainName=table &ItemName=k… Library call DatastoreService datastore = DatastoreServiceFactory.getDatastoreService(); Key k = KeyFactory.createKey(table, key); Entity e = new Entity(k); e.setProperty("data", "haha"); datastore.put(e);
Networking service Intra-cloud network: between two instances or between an instance and the storage service Wide-area network: between a cloud instance and the end user
Intra-cloud network Within a data center High bandwidth, low latency
Wide-area network Every provider has multiple locations to host cloud applications Application content can be served from the closest location Why is this a good thing?
Re-cap: cloud services Intra-cloud network Storage service Computation service Wide-area network Web application
CloudCmp: comparing cloud providers Motivation Provide shopping advice Analogy: should I buy an IBM or a Dell or a Mac? Identify performance problems Challenges What to compare? Each provider is unique in some way… How do we measure?
Methodology Identify the common services We just did this Pick a few metrics for each service relevant to application performance Develop benchmarking task for each metric Run the tasks on different providers and compare
Choose the providers to compare AWS, Rackspace, Azure, and AppEngine
Metrics: elastic compute cluster Benchmark finishing time Java-based benchmark tasks Cost per benchmark From per-hour price and billing API Scaling latency Periodically allocate new instances
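The first two compute metrics can be sketched as follows. This is a rough illustration, not the CloudCmp tooling: the class and method names are hypothetical, and prorating the per-hour price over the finishing time is an assumption. A provider that bills through an API would report the cost directly instead.

```java
final class BenchmarkCost {
    // Wall-clock finishing time of one benchmark task, in seconds.
    static double finishingTimeSeconds(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1e9;
    }

    // Assumption: cost is the per-hour instance price prorated over the
    // time the benchmark took to finish.
    static double costPerBenchmark(double finishSeconds, double pricePerHour) {
        return finishSeconds / 3600.0 * pricePerHour;
    }

    public static void main(String[] args) {
        double seconds = finishingTimeSeconds(() -> {
            long sum = 0;
            for (int i = 0; i < 50_000_000; i++) sum += i; // stand-in CPU task
            if (sum < 0) throw new IllegalStateException();
        });
        // Hypothetical price of $0.10 per instance-hour.
        System.out.println(seconds + " s, cost " + costPerBenchmark(seconds, 0.10));
    }
}
```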
Metrics: persistent storage Operation latency Client matters Cost per operation Time to consistency Read-after-write test
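A read-after-write test for time to consistency can be sketched like this: write a value, then read repeatedly until the new value comes back. Everything here is an assumption for illustration. `KeyValueStore` is a hypothetical stand-in for a cloud table service, and `LaggingStore` is a toy simulation that exposes a write only after a fixed number of reads.

```java
import java.util.HashMap;
import java.util.Map;

interface KeyValueStore { // hypothetical stand-in for a table service client
    void put(String key, String value);
    String get(String key);
}

// Toy eventually consistent store: a write becomes visible only after
// `lagReads` stale reads have been served.
final class LaggingStore implements KeyValueStore {
    private final Map<String, String> visible = new HashMap<>();
    private final Map<String, String> pending = new HashMap<>();
    private final int lagReads;
    private int readsSinceWrite = 0;

    LaggingStore(int lagReads) { this.lagReads = lagReads; }

    public void put(String key, String value) {
        pending.put(key, value);
        readsSinceWrite = 0;
    }

    public String get(String key) {
        if (readsSinceWrite++ >= lagReads) {
            visible.putAll(pending);
            pending.clear();
        }
        return visible.get(key);
    }
}

final class ConsistencyProbe {
    // Nanoseconds from the end of the write until a read first returns the
    // written value, or -1 if it is still stale after maxReads reads.
    static long timeToConsistencyNanos(KeyValueStore store, String key,
                                       String value, int maxReads) {
        store.put(key, value);
        long start = System.nanoTime();
        for (int i = 0; i < maxReads; i++) {
            if (value.equals(store.get(key))) {
                return System.nanoTime() - start;
            }
        }
        return -1;
    }
}
```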
Metrics: networking service Intra-cloud network Bandwidth (iperf) Latency (ping) Wide-area network Use PlanetLab nodes to simulate a diverse user base Let each node ping each data center of a cloud provider Optimal wide-area latency
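One way to read "optimal wide-area latency" is: for each vantage node, take the smallest round-trip time to any of the provider's data centers. The sketch below assumes that reading; the class name, matrix layout, and sample numbers are made up for illustration.

```java
import java.util.Arrays;

final class WideAreaLatency {
    // rttMs[n][d] is the measured ping RTT (ms) from vantage node n to
    // data center d. Returns, per node, the RTT to its closest data center.
    static double[] optimalLatency(double[][] rttMs) {
        double[] best = new double[rttMs.length];
        for (int n = 0; n < rttMs.length; n++) {
            best[n] = Arrays.stream(rttMs[n]).min().orElse(Double.NaN);
        }
        return best;
    }

    public static void main(String[] args) {
        // Two nodes, three data centers (illustrative values only).
        double[][] rtt = { {80, 20, 150}, {35, 90, 60} };
        System.out.println(Arrays.toString(optimalLatency(rtt)));
    }
}
```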
Result: computation performance What can we learn from the figure? How does performance differ across different clouds? Within a provider, how does performance differ across different instance types? Is this enough information to choose provider/instance? How about cost?
Result: computation cost What can we learn from this figure? Is some provider particularly cost-effective? Which instance type should I choose within a provider?
Result: scaling efficiency How is the scaling latency (high/low)? Linux vs Windows?
Result: storage (table) The storage services show high variation Median – 30ms, 90th percentile – 60ms Is it good or bad? Why?
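The median and 90th-percentile figures summarize a set of per-operation latency samples. A minimal sketch of how such percentiles can be computed, assuming the nearest-rank definition (the slides do not say which definition was used) and a hypothetical class name:

```java
import java.util.Arrays;

final class LatencyStats {
    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static double percentile(double[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        double[] latenciesMs = {30, 10, 50, 20, 40}; // illustrative samples
        System.out.println("median = " + percentile(latenciesMs, 50));
        System.out.println("p90    = " + percentile(latenciesMs, 90));
    }
}
```

A large gap between the median and the 90th percentile is what "high variation" refers to on this slide.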
Result: intra-cloud network Intra-cloud bandwidth varies across different providers From 200Mbps to 800Mbps What might be the cause?
Result: wide-area network Wide-area network latency also varies a lot C3 is much better than the others What might be the reason?
Result summary No provider stands out C1 has the highest network bandwidth, while its instance is not the most cost-effective C2 has the most powerful instances, while its network bandwidth is low C3 has the lowest wide-area network latency, while its storage is slower than others It is not trivial to shop for a cloud provider! Many research challenges in developing a sound mechanism to select the best provider for an application
Summary Two main classes of cloud providers IaaS: bare-metal virtual machines PaaS: sandbox environment Four common services Elastic compute cluster Persistent storage Intra-cloud network Wide-area network No provider has the best performance over all services