Amazon Web Services and Eucalyptus
Darshan R. Kapadia, Gregor von Laszewski
http://grid.rit.edu
What is Amazon Web Services?
- Amazon Web Services (AWS) is a collection of remote computing services (also called web services) offered over the Internet by Amazon.com.
- AWS provides companies of all sizes with an infrastructure web services platform in the cloud.
Sources: http://aws.amazon.com/what-is-aws/ and http://en.wikipedia.org/wiki/Amazon_Web_Services
What does AWS offer?
- With AWS you can requisition compute power, storage, and other services, gaining access to a suite of elastic IT infrastructure services as your business demands them.
- With AWS you have the flexibility to choose whichever development platform or programming model makes the most sense for the problems you’re trying to solve.
Source: http://aws.amazon.com/what-is-aws/
Advantages of AWS
- Cost-effective
- Dependable
- Flexible
- Comprehensive
Amazon Simple Storage Service (Amazon S3™)
Amazon S3 is storage for the Internet. It provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web.
AWS S3 Functionalities
- Write, read, and delete objects containing from 1 byte to 5 gigabytes of data each. The number of objects you can store is unlimited.
- Each object is stored in a bucket and retrieved via a unique, developer-assigned key.
- A bucket can be located in the United States or in Europe. All objects within the bucket will be stored in the bucket’s location, but the objects can be accessed from anywhere.
- Authentication mechanisms are provided to ensure that data is kept secure from unauthorized access. Objects can be made private or public, and rights can be granted to specific users.
- Uses standards-based REST and SOAP interfaces designed to work with any Internet-development toolkit.
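The bucket, key, and size rules above can be pictured with a small in-memory model. This is only an illustrative sketch and not the Amazon S3 API: the class `Bucket`, its methods, and the size constants are hypothetical names, and reading "5 gigabytes" as 5 * 1024**3 bytes is an assumption.

```python
# Illustrative in-memory model of buckets, keys, and objects.
# Not the real Amazon S3 API: every name here is hypothetical.

MIN_OBJECT_BYTES = 1                 # lower bound stated on the slide
MAX_OBJECT_BYTES = 5 * 1024 ** 3     # "5 gigabytes", read as GiB (assumption)


class Bucket:
    """A named container in one location that maps keys to objects."""

    def __init__(self, name, location="US"):
        if location not in ("US", "EU"):
            raise ValueError("the slide lists only the United States and Europe")
        self.name = name
        self.location = location
        self._objects = {}

    def put(self, key, data):
        """Write an object under a developer-assigned key."""
        if not MIN_OBJECT_BYTES <= len(data) <= MAX_OBJECT_BYTES:
            raise ValueError("object size outside the 1 byte to 5 GB range")
        self._objects[key] = bytes(data)

    def get(self, key):
        """Read an object back by its key (raises KeyError if absent)."""
        return self._objects[key]

    def delete(self, key):
        """Delete an object; deleting a missing key is a no-op in this sketch."""
        self._objects.pop(key, None)


bucket = Bucket("example-bucket", location="EU")
bucket.put("reports/2009/summary.txt", b"hello")
print(bucket.get("reports/2009/summary.txt"))  # b'hello'
```

The key is just a string chosen by the developer; the slash-separated form used here is a naming habit, not a directory structure.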
Properties of AWS S3
- Scalable: Amazon S3 can scale in terms of storage, request rate, and users to support an unlimited number of web-scale applications. It uses scale as an advantage: Adding nodes to the system increases, not decreases, its availability, speed, throughput, capacity, and robustness.
- Reliable: Store data durably, with 99.99% availability. There can be no single points of failure. All failures must be tolerated or repaired by the system without any downtime.
- Fast: Amazon S3 must be fast enough to support high-performance applications. Server-side latency must be insignificant relative to Internet latency. Any performance bottlenecks can be fixed by simply adding nodes to the system.
Properties of AWS S3 (contd.)
- Inexpensive: Amazon S3 is built from inexpensive commodity hardware components. As a result, frequent node failure is the norm and must not affect the overall system. It must be hardware-agnostic, so that savings can be captured as Amazon continues to drive down infrastructure costs.
- Simple: Building highly scalable, reliable, fast, and inexpensive storage is difficult. Doing so in a way that makes it easy to use for any application anywhere is more difficult. Amazon S3 must do both.
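To make the 99.99% availability figure listed under "Reliable" concrete, the short calculation below converts an availability percentage into the downtime it permits per year. It is a back-of-the-envelope sketch: the function name `allowed_downtime_minutes` is hypothetical, and a 365-day year is assumed.

```python
from decimal import Decimal

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent, period_minutes=MINUTES_PER_YEAR):
    """Minutes of unavailability permitted at the given availability level."""
    unavailable = (Decimal("100") - Decimal(availability_percent)) / Decimal("100")
    return unavailable * period_minutes


# 0.01% of 525,600 minutes
print(f"{allowed_downtime_minutes('99.99'):.2f} minutes per year")  # 52.56 minutes per year
```

In other words, 99.99% availability leaves a budget of a little under an hour of unavailability per year.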
Pricing (Amazon S3)
Storage (Linux Based):
- $0.150 per GB for the first 50 TB / month of storage used
- $0.140 per GB for the next 50 TB / month of storage used
- $0.130 per GB for the next 400 TB / month of storage used
- $0.120 per GB for storage used / month over 500 TB
Data Transfer:
- $0.170 per GB for the first 10 TB / month data transfer out
- $0.130 per GB for the next 40 TB / month data transfer out
- $0.110 per GB for the next 100 TB / month data transfer out
- $0.100 per GB for data transfer out / month over 150 TB
Requests:
- $0.01 per 1,000 PUT, COPY, POST, or LIST requests
- $0.01 per 10,000 GET and all other requests
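The tiered rates above mean that each slice of usage is billed at the rate of the tier it falls into, rather than the whole amount being billed at one rate. The sketch below works through that arithmetic with the figures from the slide. The function and constant names are hypothetical, 1 TB is assumed to be 1,024 GB, and request charges are assumed to be prorated rather than rounded up to whole blocks.

```python
from decimal import Decimal

GB_PER_TB = 1024  # assumption: 1 TB = 1,024 GB

# (tier size in TB, price per GB); None marks the open-ended last tier.
STORAGE_TIERS = [
    (50, Decimal("0.150")),
    (50, Decimal("0.140")),
    (400, Decimal("0.130")),
    (None, Decimal("0.120")),
]
TRANSFER_OUT_TIERS = [
    (10, Decimal("0.170")),
    (40, Decimal("0.130")),
    (100, Decimal("0.110")),
    (None, Decimal("0.100")),
]


def tiered_cost(gigabytes, tiers):
    """Charge each slice of monthly usage at the rate of its own tier."""
    remaining = Decimal(gigabytes)
    total = Decimal("0")
    for size_tb, price_per_gb in tiers:
        if remaining <= 0:
            break
        if size_tb is None:
            in_tier = remaining
        else:
            in_tier = min(remaining, Decimal(size_tb * GB_PER_TB))
        total += in_tier * price_per_gb
        remaining -= in_tier
    return total


def request_cost(put_like, get_like):
    """$0.01 per 1,000 PUT/COPY/POST/LIST and $0.01 per 10,000 other requests."""
    blocks = Decimal(put_like) / 1000 + Decimal(get_like) / 10000
    return blocks * Decimal("0.01")


# 60 TB stored: 50 TB at $0.150 plus 10 TB at $0.140
print(f"{tiered_cost(60 * GB_PER_TB, STORAGE_TIERS):.2f}")       # 9113.60
# 12 TB transferred out: 10 TB at $0.170 plus 2 TB at $0.130
print(f"{tiered_cost(12 * GB_PER_TB, TRANSFER_OUT_TIERS):.2f}")  # 2007.04
```

`Decimal` is used instead of floats so that the dollar amounts come out exact.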
Amazon CloudFront
Amazon CloudFront is a web service for content delivery. It integrates with other Amazon Web Services to give developers and businesses an easy way to distribute content to end users with low latency, high data transfer speeds, and no commitments.
How to use Amazon CloudFront?
1. Store the original versions of your files in an Amazon S3 bucket.
2. Create a distribution to register that bucket with Amazon CloudFront through a simple API call.
3. Use your distribution’s domain name in your web pages or application.
4. When end users request an object using this domain name, they are automatically routed to the nearest edge location for high performance delivery of your content.
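Step 3 above amounts to building object URLs on the distribution's domain name instead of on the bucket's. A minimal sketch of that substitution follows. The helper `distribution_url` and the domain `d1234example.cloudfront.net` are made up for illustration, and creating the distribution itself (step 2) is not shown.

```python
from urllib.parse import quote


def distribution_url(distribution_domain, object_key, scheme="http"):
    """Build the URL an end user would request for one stored object."""
    # quote() percent-encodes unsafe characters but leaves "/" in the key alone.
    return f"{scheme}://{distribution_domain}/{quote(object_key)}"


# Hypothetical distribution domain and object key.
print(distribution_url("d1234example.cloudfront.net", "images/logo 1.png"))
# http://d1234example.cloudfront.net/images/logo%201.png
```

Routing to the nearest edge location (step 4) happens on the service side when this URL is requested; nothing in the page or application has to choose a location.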
Amazon Elastic Compute Cloud (Amazon EC2™)
Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud.
Amazon EC2 Service Highlights
- Elastic: Amazon EC2 enables you to increase or decrease capacity within minutes, not hours or days. You can commission one, hundreds or even thousands of server instances simultaneously.
- Flexible: You have the choice of multiple instance types, operating systems, and software packages.
- Designed for use with other Amazon Web Services: Amazon EC2 works in conjunction with Amazon Simple Storage Service (Amazon S3), Amazon SimpleDB and Amazon Simple Queue Service (Amazon SQS) to provide a complete solution for computing, query processing and storage across a wide range of applications.
How to use EC2
1. Create an Amazon Machine Image (AMI) containing your applications, libraries, data and associated configuration settings.
2. Upload the AMI into Amazon S3. Amazon EC2 provides tools that make storing the AMI simple. Amazon S3 provides a safe, reliable and fast repository to store your images.
3. Use the Amazon EC2 web service to configure security and network access.
4. Choose which instance type(s) and operating system you want, then start, terminate, and monitor as many instances of your AMI as needed, using the web service APIs or the variety of management tools provided.
5. Determine whether you want to run in multiple locations, utilize static IP endpoints, or attach persistent block storage to your instances.
6. Pay only for the resources that you actually consume, like instance-hours or data transfer.
Operating Systems
- Red Hat Enterprise Linux
- Windows Server 2003
- Oracle Enterprise Linux
- OpenSUSE Linux
- Ubuntu Linux
- Fedora
- Gentoo Linux
- Debian
Software
Databases:
- IBM DB2
- IBM Informix Dynamic Server
- Microsoft SQL Server Standard 2005
- MySQL Enterprise
- Oracle 11g
Batch Processing:
- Hadoop
- Condor
- Open MPI
Software (contd.)
Web Hosting:
- Apache HTTP
- IIS/Asp.Net
- IBM Lotus Web Content Management
- IBM WebSphere Portal Server
Pricing (Amazon EC2)
Standard On-Demand Instances:
- Small (Default): $0.10 per hour
- Large: $0.40 per hour
- Extra Large: $0.80 per hour
Standard Reserved Instances
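With the on-demand rates listed above, the bill for a workload is the hourly rate multiplied by the number of instances and the hours each one runs. The sketch below shows that arithmetic. The dictionary keys and the function name are hypothetical, `hours` is assumed to be a whole number of instance-hours, and reserved-instance pricing is not modelled because the slide gives no rates for it.

```python
from decimal import Decimal

# Standard on-demand hourly rates from the slide, in US dollars.
ON_DEMAND_HOURLY = {
    "small": Decimal("0.10"),
    "large": Decimal("0.40"),
    "extra_large": Decimal("0.80"),
}


def on_demand_cost(instance_type, instances, hours):
    """Cost of running `instances` machines of one type for `hours` each."""
    return ON_DEMAND_HOURLY[instance_type] * instances * hours


# One small instance for a 30-day month (720 hours)
print(on_demand_cost("small", 1, 720))   # 72.00
# Ten large instances for one day
print(on_demand_cost("large", 10, 24))   # 96.00
```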
Creating an AMI
1. Select an AMI
2. Generate a Key Pair
3. Launch the Instance
4. Get Administrator Password
5. Authorize Network Access
6. Connect to the Instance
7. Load Software and Make Changes
SOAP and Query API
http://docs.amazonwebservices.com/AWSEC2/2007-08-29/DeveloperGuide/
Amazon Elastic MapReduce
Amazon Elastic MapReduce is a web service that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data. It utilizes a hosted Hadoop framework running on the web-scale infrastructure of Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3).
Amazon Elastic MapReduce (contd.)
1. Develop your data processing application in your choice of Java, Ruby, Perl, Python, PHP, R, or C++.
2. Upload your data and your processing application into Amazon S3. Amazon S3 provides reliable, scalable, easy-to-use storage for your input and output data.
3. Log in to the AWS Management Console to start an Amazon Elastic MapReduce “job flow.” Simply choose the number and type of Amazon EC2 instances you want, specify the location of your data and/or application on Amazon S3, and then click the “Create Job Flow” button.
4. Monitor the progress of your job flow(s) directly from the AWS Management Console, Command Line Tools or APIs. After the job flow is done, retrieve the output from Amazon S3.
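As an illustration of the kind of "data processing application" step 1 refers to, here is a word-count job written as a map step and a reduce step in Python. It is a local sketch only: the function names are hypothetical, `run_job` merely imitates the map, sort, and reduce phases in one process, and nothing here talks to Amazon Elastic MapReduce, Hadoop, or Amazon S3. On a real cluster the framework, not this script, would distribute the input and group the intermediate pairs.

```python
from itertools import groupby
from operator import itemgetter


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Reduce step: sum the counts collected for one word."""
    return word, sum(counts)


def run_job(lines):
    """Local stand-in for a job flow: map, sort by key (shuffle), reduce."""
    pairs = [pair for line in lines for pair in mapper(line)]
    pairs.sort(key=itemgetter(0))  # groupby needs equal keys to be adjacent
    return dict(
        reducer(word, (count for _, count in group))
        for word, group in groupby(pairs, key=itemgetter(0))
    )


print(run_job(["the cloud", "the elastic cloud"]))
# {'cloud': 2, 'elastic': 1, 'the': 2}
```

The split into `mapper` and `reducer` matters because each can then run on many machines at once: mappers see independent lines, and reducers see independent words.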
Eucalyptus
- Eucalyptus (Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems) is an open-source software infrastructure for implementing "cloud computing" on clusters.
- The current interface to Eucalyptus is compatible with Amazon's EC2, S3, and EBS interfaces, but the infrastructure is designed to support multiple client-side interfaces.
- Eucalyptus is implemented using commonly available Linux tools and basic Web-service technologies, making it easy to install and maintain.
Features of Eucalyptus
- Interface compatibility with EC2 (both Web service and Query interfaces)
- Simple installation and deployment using Rocks cluster-management tools
- Secure internal communication using SOAP with WS-Security
- Overlay functionality requiring no modification to the target Linux environment
- Basic "Cloud Administrator" tools for system management and user accounting
- The ability to configure multiple clusters, each with private internal network addresses, into a single Cloud