3. Then to Now
- 1961: John McCarthy predicts that “Computation may someday be organized as a public utility.”
- Mid-1990s: The term “Grid” is coined by the community.
- 2002: Ian Foster publishes a Grid checklist, “What is the Grid? A Three Point Checklist.”
- July 2002: Amazon launches AWS.
- Subsequent years: Large-scale federated systems such as TeraGrid and Open Science Grid are developed.
4. Definitions: Grid Computing
- The ability, using a set of open standards and protocols, to gain access to applications and data, processing power, storage capacity, and a vast array of other computing resources over the Internet.
- A grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of resources distributed across multiple administrative domains, based on the resources' availability, capacity, performance, and cost, and on users' quality-of-service requirements. [1]
1- IBM Solutions Grid for Business Partners: Helping IBM Business Partners to Grid-enable applications for the next phase of e-business on demand
5. Characteristics of a Grid
- Coordinates resources that are not subject to centralized control
- Uses standard, open, general-purpose protocols and interfaces
- Delivers non-trivial qualities of service
6. Definitions: Cloud Computing
A large-scale distributed computing paradigm that is driven by economies of scale, in which a pool of abstracted, virtualized, dynamically-scalable, managed computing power, storage, platforms, and services are delivered on demand to external customers over the Internet.
7. Characteristics of a Cloud
- Massively scalable
- Can be encapsulated as an abstract entity that delivers different levels of service to customers outside the Cloud
- Driven by economies of scale
- Can be dynamically configured (via virtualization or other approaches) and delivered on demand
8. The Big Question
Is Cloud Computing just a new name for Grid?
- YES: The vision is the same.
- BUT NO: The approach is different.
- NEVERTHELESS, YES: The problem spaces that the two paradigms attempt to address overlap.
9. Why Cloud Computing?
- Exponentially growing data sizes in scientific instrumentation and simulation, and in Internet publishing and archiving
- Widespread adoption of computing services and Web 2.0 applications
- Rapid decrease in hardware cost and increase in computing power and storage capacity, together with the advent of multi-core architectures and modern supercomputers consisting of hundreds of thousands of cores
10. Relationship between Clouds and the other domains they overlap with (figure slide)
11. Side-by-Side: Grids vs. Clouds
Six comparison metrics:
- Business Model
- Architecture
- Resource Management
- Programming Model
- Application Model
- Security Model
12. Business Model
Cloud:
- On-demand, pay-per-use model
- Services are usually charged per instance-hour, per GB-month of storage, per TB of data transferred each month, and so on
Grid:
- Project-oriented model
- Users pool resources that are shared by the community
- Each Grid site is responsible for maintaining its own set of resources
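The pay-per-use model above can be illustrated with a small billing sketch. The unit prices and the names `Rates` and `monthly_bill` are hypothetical, chosen only for illustration; they are not taken from any provider's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical unit prices in USD (illustrative values only)."""
    per_instance_hour: float
    per_gb_month_storage: float
    per_tb_transfer: float


def monthly_bill(rates: Rates, instance_hours: float,
                 storage_gb: float, transfer_tb: float) -> float:
    """Sum the three metered charges named on the slide."""
    compute = rates.per_instance_hour * instance_hours
    storage = rates.per_gb_month_storage * storage_gb
    transfer = rates.per_tb_transfer * transfer_tb
    return compute + storage + transfer


# Assumed usage: one instance running all month (720 h),
# 500 GB stored, 2 TB transferred.
example_rates = Rates(per_instance_hour=0.10,
                      per_gb_month_storage=0.02,
                      per_tb_transfer=90.0)
bill = monthly_bill(example_rates, instance_hours=720,
                    storage_gb=500, transfer_tb=2)
print(f"{bill:.2f}")  # prints 262.00
```

Under this model the customer pays only for metered usage, whereas a Grid participant contributes resources to a shared pool and draws on it as part of a project allocation.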
13. Architecture: Grid
Grids provide protocols and services at five different layers:
- Fabric Layer: Provides access to different resource types, such as compute, storage, and network resources, code repositories, etc.
- Connectivity Layer: Defines core communication and authentication protocols for easy and secure network transactions.
- Resource Layer: Defines protocols for the publication, discovery, negotiation, monitoring, accounting, and payment of sharing operations on individual resources.
- Collective Layer: Captures interactions across collections of resources.
- Application Layer: Comprises the user applications that are built on top of the above protocols and APIs and that operate in VO (Virtual Organization) environments.
14. Architecture: Cloud
Cloud architecture can be divided into four layers:
- Fabric Layer: Contains the raw hardware-level resources.
- Unified Resource Layer: Contains resources that have been abstracted or encapsulated (usually by virtualization) so that they can be exposed to the upper layers.
- Platform Layer: Adds a collection of specialized tools, middleware, and services on top of the unified resources.
- Application Layer: Contains the applications that run in the Cloud.
16. Architecture: Cloud
Clouds provide services at three different levels:
- IaaS: Provisions hardware, software, and equipment to deliver software application environments with a resource usage-based pricing model.
- PaaS: Offers a high-level integrated environment to build, test, and deploy custom applications.
- SaaS: Delivers special-purpose software that is remotely accessible by consumers through the Internet with a usage-based pricing model.
Together, the three are referred to as the Cloud Computing Onion.
18. Resource Management
Comparison based on the following metrics:
- Compute Model
- Data Model
- Data Locality
- Combining Compute and Data Management
- Virtualization
- Monitoring
- Provenance
19. Resource Management
Compute Model
- Cloud: Users share resources simultaneously and are serviced immediately.
- Grid: Queue-based, batch-scheduled model.
Data Model
- Cloud: The current data model houses the data inside the Cloud; this could change in the future.
- Grid: Specially designed Data Grids, along with metadata catalogs that keep track of the location of each piece of data.
Data Locality
- Cloud: Strong focus on storing and replicating data near the associated compute unit.
- Grid: Data is stored in shared file systems, where data locality cannot easily be applied. However, data-aware schedulers dramatically improve performance.
Combining Compute and Data Management
- Cloud: Initially, Clouds were not used much for data-intensive applications, so little work had been done on managing large amounts of data across compute resources. This has since changed substantially.
- Grid: Grids achieve this well because data-aware schedulers place jobs on or near the nodes that hold the data they need.
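The idea of a data-aware scheduler on top of a batch queue can be sketched as follows. This is a minimal greedy illustration, not the algorithm of any particular Grid scheduler; the function names, node names, and dataset labels are all hypothetical.

```python
from collections import deque


def schedule(jobs, node_data, data_aware=True):
    """Assign queued jobs to free nodes, one job per node.

    jobs:      list of (job_id, dataset) in arrival order
    node_data: dict mapping node name -> set of datasets stored on it
    Returns a dict mapping job_id -> node name.
    """
    queue = deque(jobs)
    free_nodes = list(node_data)  # all nodes start idle
    placement = {}
    while queue and free_nodes:
        job_id, dataset = queue.popleft()
        chosen = free_nodes[0]  # plain FIFO choice: first idle node
        if data_aware:
            # Prefer an idle node that already holds the job's input data.
            for node in free_nodes:
                if dataset in node_data[node]:
                    chosen = node
                    break
        free_nodes.remove(chosen)
        placement[job_id] = chosen
    return placement


def local_hits(jobs, node_data, placement):
    """Count jobs that were placed on a node holding their dataset."""
    return sum(1 for job_id, dataset in jobs
               if job_id in placement
               and dataset in node_data[placement[job_id]])


# Assumed toy cluster: each node stores one dataset.
node_data = {"n1": {"B"}, "n2": {"A"}, "n3": {"C"}}
jobs = [("j1", "A"), ("j2", "B"), ("j3", "C")]

fifo = schedule(jobs, node_data, data_aware=False)
aware = schedule(jobs, node_data, data_aware=True)
print(local_hits(jobs, node_data, fifo))   # prints 1
print(local_hits(jobs, node_data, aware))  # prints 3
```

In this toy case the data-aware variant places every job next to its input, while the plain queue order does so for only one job, which is the kind of gain the slide attributes to data-aware scheduling.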
20. Resource Management
Virtualization
- Cloud: Highly virtualized in order to meet service level agreements. Abstractions are provided at each layer to assist virtualization.
- Grid: Little or no support for virtualization.
Monitoring
- Cloud: Because the services provided are layer-specific, monitoring information is not provided for the entire system.
- Grid: Users have greater flexibility over the resources they are allocated and can therefore deploy fine-grained monitoring infrastructure.
Provenance
- Cloud: Still an under-explored area. However, because Clouds are increasingly used for e-science research, new provenance systems are emerging fast.
- Grid: Because Grids are project-oriented, provenance is essential, and Grids provide a lot of support for it.
21. Programming Model
Cloud:
- The most common parallel programming model is MapReduce
- Standard parallel programming models are also used
- Scripting is used in place of workflow management systems
- Clouds have generally adopted Web Services APIs for providing services over the web
Grid:
- The most commonly used parallel programming model is based on message passing
- Less commonly used models employ coordination languages that allow heterogeneous components to coordinate and interact
- In a recent effort to develop a service-oriented Grid programming model, the community has started using WSRF (Web Services Resource Framework)
- Workflow management systems are used when processing large sets of data involving complex tasks
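The MapReduce model named above can be illustrated with a single-process word count. This is a teaching sketch in plain Python, not the API of any MapReduce framework; the function names are hypothetical, and a real system would run the map and reduce phases in parallel across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce sequentially over the documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["grid cloud grid", "cloud utility"])
print(counts)  # prints {'grid': 2, 'cloud': 2, 'utility': 1}
```

Because each map call looks at only one document and each reduce call at only one key, both phases can be distributed without coordination between workers, which is why the model suits loosely coupled Cloud workloads. Message-passing programs on Grids, by contrast, have processes exchange data explicitly while they run.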
22. Application Model
Cloud:
- Cloud computing is still in its infancy, so the application space is not yet clearly understood. However, most applications can be characterized as loosely coupled.
- Each of the three service layers mentioned previously (IaaS, PaaS, SaaS) provides its own set of functionality, and applications and tools are being developed to exploit it.
Grid:
- Grids support a wide variety of applications, such as High Performance Computing (HPC) and High Throughput Computing (HTC) applications.
- Tightly coupled applications use the Message Passing Interface (MPI) for inter-process communication, whereas loosely coupled applications rely on workflow management systems.
- Another emerging class of applications is science gateways, which provide a large variety of services through a browser-based user interface.
23. Security Model
Cloud:
- Clouds mostly comprise dedicated data centers whose infrastructure is largely homogeneous. However, compatibility issues arise when interaction across data centers occurs.
- Clouds appear to have relatively simpler and less secure security models.
- Security is the biggest obstacle to the wide-scale adoption of cloud computing.
Grid:
- Grids are built on the assumption that the shared resources will be mostly heterogeneous, and are therefore better equipped for interoperability and compatibility challenges.
- Because each Grid site has operational autonomy, security is engineered into each site by its administrators.
- The security model is more complex than that of Clouds, and perhaps more time-consuming to use.
24. Conclusion
- Clouds and Grids share a common vision and have overlapping architectures, technologies, and application spaces.
- However, they take different approaches to providing scalable, distributed, on-demand computation, which has resulted in the evolution of two parallel infrastructures.
- Tomorrow’s distributed systems will need to have the centralized scale of today’s Cloud utilities and the distribution and interoperability of today’s Grid facilities.
25. Discussion Questions
- Is Cloud computing the same as Grid computing? Why or why not?
- What does the future hold for each of these paradigms? Can you see a unified distributed paradigm in the future? Why or why not?
- Will the community shift more and more towards Clouds and Grids, or do you envision a future where high-end desktop machines dominate the market?
- Do you think Clouds and Grids are reliable enough to be trusted with sensitive data and computation?
- Currently, Clouds cannot guarantee reliability, security, and control; however, they provide ease of access, portability, etc. at a very low cost. Do you think the benefits outweigh the costs?
26. References
1- Ian Foster, Yong Zhao, Ioan Raicu, Shiyong Lu. “Cloud Computing and Grid Computing 360-Degree Compared.” Grid Computing Environments Workshop, GCE '08.