GRID AND CLOUD COMPUTING
Introduction to Grid and Cloud Computing
Courtesy: Dr Gnanasekaran Thangavel
UNIT I: INTRODUCTION
Evolution of Distributed Computing:
– Scalable computing over the Internet
– Technologies for network-based systems
– Clusters of cooperative computers
– Grid computing infrastructures
– Cloud computing
Text Book: The Grid 2: Blueprint for a New Computing Infrastructure
Authors: Ian Foster, Carl Kesselman
Publisher: Elsevier, 2-Dec-2003, 748 pages
Text Book: Cloud Computing Bible
Author: Barrie Sosinsky
Publisher: John Wiley & Sons, 10-Dec-2010, 473 pages
Reference Book: Cloud Computing For Dummies
Authors: Judith Hurwitz, Robin Bloor, Marcia Kaufman, Fern Halper
Publisher: John Wiley & Sons, 2010, 339 pages
Distributed Computing
Definition: "A distributed system consists of multiple autonomous computers that communicate through a computer network."
"Distributed computing utilizes a network of many computers, each accomplishing a portion of an overall task, to achieve a computational result much more quickly than with a single computer."
"Distributed computing is a type of computing that involves multiple computers, remote from each other, each having a role in a computation problem or information processing."
Introduction
A distributed system is one in which hardware or software components located at networked computers communicate and coordinate their actions only by message passing. In the term distributed computing, the word distributed means spread out across space. Thus, distributed computing is an activity performed on a spatially distributed system. These networked computers may be in the same room, on the same campus, in the same country, or on different continents.
Introduction
[Figure: a distributed system seen through the Internet, connecting users and a large-scale application via resource management, subscription, distribution, agents, and job requests]

A distributed system consists of a collection of autonomous computers, connected through a network and distributed operating system software, which enables computers to coordinate their activities and to share the resources of the system - hardware, software and data - so that users perceive the system as a single, integrated computing facility.

A distributed system is a collection of independent computers that appears to its users as a single coherent system. This definition has several important aspects. The first is that a distributed system consists of components (i.e., computers) that are autonomous. The second is that users (be they people or programs) think they are dealing with a single system, which means that one way or the other the autonomous components need to collaborate. How to establish this collaboration lies at the heart of developing distributed systems. Note that no assumptions are made concerning the type of computers: in principle, even within a single system, they could range from high-performance mainframe computers to small nodes in sensor networks. Likewise, no assumptions are made about the way the computers are interconnected.
Motivation
Network connectivity is increasing, a combination of cheap processors is often more cost-effective than one expensive fast system, and there is a potential increase in reliability. The main motivations for moving to a distributed system are the following:

Inherently distributed applications. Distributed systems have come into existence in some very natural ways; e.g., in our society people are distributed and information should also be distributed. In a distributed database system, information is generated at different branch offices (sub-databases), so that local access can be done quickly. The system also provides a global view to support various global operations.

Performance/cost. The parallelism of distributed systems reduces processing bottlenecks and provides improved all-around performance, i.e., distributed systems offer a better price/performance ratio.

Resource sharing. Distributed systems can efficiently support information and resource (hardware and software) sharing for users at different locations.

Flexibility and extensibility. Distributed systems are capable of incremental growth and have the added advantage of facilitating modification or extension of a system to adapt to a changing environment without disrupting its operation.

Availability and fault tolerance. With the multiplicity of storage units and processing elements, distributed systems have the potential to continue operating in the presence of failures in the system.

Scalability. Distributed systems can easily be scaled to include additional resources (both hardware and software).
History
Parallel computing was favored in the early years, primarily vector-based at first; gradually more thread-based parallelism was introduced. The use of concurrent processes that communicate by message passing has its roots in operating system architectures studied in the 1960s. The first widespread distributed systems were local-area networks such as Ethernet, invented in the 1970s. ARPANET, the predecessor of the Internet, was introduced in the late 1960s; ARPANET e-mail was invented in the early 1970s, became the most successful application of ARPANET, and is probably the earliest example of a large-scale distributed application. In addition to ARPANET and its successor the Internet, other early worldwide computer networks included Usenet and FidoNet from the 1980s, both of which were used to support distributed discussion systems.

The study of distributed computing became its own branch of computer science in the late 1970s and early 1980s. The first conference in the field, the Symposium on Principles of Distributed Computing (PODC), dates back to 1982, and its European counterpart, the International Symposium on Distributed Computing (DISC), was first held in 1985.

The first distributed computing programs were a pair of programs called Creeper and Reaper, which made their way through the nodes of the ARPANET in the 1970s. Creeper came first and was a worm program that used the idle CPU cycles of processors in the ARPANET to copy itself onto the next system and then delete itself from the previous one. It was later modified to remain on all previous computers, and Reaper was created, which traveled through the same network and deleted all copies of Creeper. In this way Creeper and Reaper were the first infectious computer programs and are often thought of as the first network viruses. They did no damage to the computers they passed through, however, and were instrumental in exploring the possibility of making use of idle computational power.
History
Massively parallel architectures started rising, and message passing interfaces and other libraries were developed. Bandwidth was a big problem. The first Internet-based distributed computing project was started in 1988 by the DEC System Research Center. The project sent tasks to volunteers through e-mail; they would run these programs during idle time, send the results back to DEC, and get a new task. The project worked to factor large numbers and by 1990 had about 100 users.

The most prominent group, considered the first to actually use the Internet to distribute data for calculation and collect the results, was a project founded in 1997 called distributed.net. They used independently owned computers as DEC had, but allowed users to download the program that would use their idle CPU time instead of e-mailing it to them. Distributed.net completed several cryptology challenges from RSA Labs as well as other research facilities with the help of thousands of users.
History: 1995 - Today
Cluster/grid architecture became increasingly dominant. Special node machines were avoided in favor of COTS technologies, together with web-wide cluster software; Google takes this to the extreme (thousands of nodes per cluster).

Commercial, off-the-shelf (COTS) is a term for software or hardware, generally technology or computer products, that are ready-made and available for sale, lease, or license to the general public. They are often used as alternatives to in-house developments or one-off government-funded developments. The use of COTS is being mandated across many government and business programs, as COTS products may offer significant savings in procurement and maintenance. However, since COTS software specifications are written by external sources, government agencies are sometimes wary of these products because they fear that future changes to the product will not be under their control.

The project that truly popularized distributed computing and showed that it could work was SETI@home, an effort by the Search for Extraterrestrial Intelligence (SETI) group at the University of California at Berkeley. The project was started in May 1999 to analyze the radio signals being collected by the Arecibo Radio Telescope in Puerto Rico. It has gained over three million independent users who volunteer their idle computers to search for signals that may not have originated from Earth. This project brought the field to light, and other groups and companies quickly followed its lead.
Goal
- Making resources accessible (data sharing and device sharing)
- Distribution transparency (access, location, migration, relocation, replication, concurrency, failure)
- Communication (making human-to-human communication easier, e.g., electronic mail)
- Flexibility (spreading the workload over the available machines in the most cost-effective way)
- Coordinating the use of shared resources
- Solving large computational problems

In solving a large computational problem, many computers cooperate on one task. Alternatively, each computer may have its own user with individual needs, and the purpose of the distributed system is to coordinate the use of shared resources or provide communication services to the users.

From the Tanenbaum book:

Making resources accessible. The main goal of a distributed system is to make it easy for users (and applications) to access remote resources, and to share them in a controlled and efficient way. Resources can be just about anything; typical examples include printers, computers, storage facilities, data, files, web pages, and networks. There are many reasons for wanting to share resources. One obvious reason is economics: for example, it is cheaper to let a printer be shared by several users in a small office than to buy and maintain a separate printer for each user. Likewise, it makes economic sense to share costly resources such as supercomputers, high-performance storage systems, and other expensive peripherals.

Distribution transparency. An important goal of a distributed system is to hide the fact that its processes and resources are physically distributed across multiple computers. A distributed system that is able to present itself to users and applications as if it were only a single computer system is said to be transparent.

Openness. Another important goal of distributed systems is openness. An open distributed system offers services according to standard rules that describe the syntax and semantics of those services. For example, in computer networks, standard rules govern the format, contents, and meaning of messages sent and received; such rules are formalized in protocols. In distributed systems, services are generally specified through interfaces, which are often described in an Interface Definition Language (IDL). Interface definitions written in an IDL nearly always capture only the syntax of services: they specify precisely the names of the functions that are available, together with the types of the parameters, return values, possible exceptions that can be raised, and so on. The hard part is specifying precisely what those services do, that is, the semantics of interfaces. In practice, such specifications are always given informally, by means of natural language.

Scalability. Worldwide connectivity through the Internet is rapidly becoming as common as being able to send a postcard to anyone anywhere in the world. With this in mind, scalability is one of the most important design goals for developers of distributed systems. Scalability of a system can be measured along at least three different dimensions (Neuman, 1994). First, a system can be scalable with respect to its size, meaning that we can easily add more users and resources to the system. Second, a geographically scalable system is one in which the users and resources may lie far apart. Third, a system can be administratively scalable, meaning that it can still be easy to manage even if it spans many independent administrative organizations. Unfortunately, a system that is scalable in one or more of these dimensions often exhibits some loss of performance as the system scales up.
Characteristics
Resource sharing, openness, concurrency, scalability, fault tolerance, and transparency.

Resource sharing: Resource sharing is the ability to use any hardware, software, or data anywhere in the system. Resources in a distributed system, unlike in a centralized one, are physically encapsulated within one of the computers and can only be accessed from others by communication. It is the resource manager that offers a communication interface enabling the resource to be accessed, manipulated, and updated reliably and consistently. There are mainly two kinds of resource-manager model: the client/server model and the object-based model. The Object Management Group uses the latter in CORBA, in which any resource is treated as an object that encapsulates the resource by means of operations that users can invoke.

Openness: Openness is concerned with extensions and improvements of distributed systems. New components have to be integrated with existing components so that the added functionality becomes accessible from the distributed system as a whole. Hence, the static and dynamic properties of services provided by components have to be published in detailed interfaces.

Concurrency: Concurrency arises naturally in distributed systems from the separate activities of users, the independence of resources, and the location of server processes in separate computers. Components in distributed systems are executed in concurrent processes. These processes may access the same resource concurrently; thus the server process must coordinate their actions to ensure system integrity and data integrity.

Scalability: Scalability concerns the ease of increasing the scale of the system (e.g., the number of processors) so as to accommodate more users and/or improve the responsiveness of the system. Ideally, components should not need to be changed when the scale of the system increases.

Fault tolerance: Fault tolerance concerns the reliability of the system, so that in case of failure of hardware, software, or network, the system continues to operate properly without significantly degrading its performance. It may be achieved by recovery (software) and redundancy (both software and hardware).

Transparency: Transparency hides the complexity of the distributed system from users and application programmers, so they can perceive it as a whole rather than as a collection of cooperating components, reducing the difficulties in design and operation. This characteristic is orthogonal to the others. There are many aspects of transparency, including access, location, concurrency, replication, failure, migration, performance, and scaling transparency.
Distributed Computing Architecture
Client-server, 3-tier, n-tier, loose or tight coupling, peer-to-peer, and space-based architectures.

Client-server: Smart client code contacts the server for data, then formats and displays it to the user. Input at the client is committed back to the server when it represents a permanent change.

3-tier architecture: Three-tier systems move the client intelligence to a middle tier so that stateless clients can be used. This simplifies application deployment. Most web applications are 3-tier.

N-tier architecture: N-tier refers typically to web applications which further forward their requests to other enterprise services. This type of application is the one most responsible for the success of application servers.

Tightly coupled (clustered): Tightly coupled architecture refers typically to a cluster of machines that work closely together, running a shared process in parallel. The task is subdivided into parts that are handled individually by each machine and then put back together to produce the final result.

Peer-to-peer: Peer-to-peer is an architecture in which there is no special machine or set of machines that provides a service or manages the network resources. Instead, all responsibilities are uniformly divided among all machines, known as peers. Peers can serve both as clients and as servers.

Space-based: Space-based refers to an infrastructure that creates the illusion (virtualization) of a single address space. Data are transparently replicated according to application needs; decoupling in time, space, and reference is achieved.

Another basic aspect of distributed computing architecture is the method of communicating and coordinating work among concurrent processes. Through various message-passing protocols, processes may communicate directly with one another, typically in a master/slave relationship. Alternatively, a "database-centric" architecture can enable distributed computing to be done without any form of direct inter-process communication, by using a shared database. A minimal client-server sketch follows below.
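To make the client-server style above concrete, here is a minimal sketch in Java using plain TCP sockets. It is an illustration rather than material from the original slides; the port number and the trivial upper-casing "service" are arbitrary choices for the example.

```java
import java.io.*;
import java.net.*;

// Minimal client-server sketch: the server returns an upper-cased version
// of each line a client sends (a stand-in for "the server computes
// something on behalf of the client").
public class EchoDemo {
    static final int PORT = 5000; // arbitrary port chosen for the example

    // Server: accept one connection at a time and serve it.
    static void runServer() throws IOException {
        try (ServerSocket server = new ServerSocket(PORT)) {
            while (true) {
                try (Socket client = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(client.getInputStream()));
                     PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        out.println(line.toUpperCase()); // the "service"
                    }
                }
            }
        }
    }

    // Client: send one request and print the reply.
    static void runClient(String message) throws IOException {
        try (Socket socket = new Socket("localhost", PORT);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()))) {
            out.println(message);
            System.out.println("Server replied: " + in.readLine());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0 && args[0].equals("server")) runServer();
        else runClient("hello distributed world");
    }
}
```

Run one instance with the argument "server" and another with none; moving the client to a different machine (replacing "localhost" with the server's address) turns this into the two-machine client-server arrangement the slide describes.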
Application of Distributed Systems
Examples of commercial applications of distributed systems:
- Database management systems
- Distributed computing using mobile agents
- Local intranet
- Internet (World Wide Web)
- Java Remote Method Invocation (RMI)
Distributed Computing Using Mobile Agents
Mobile agents can wander around in a network, using free resources for their own computations.
Local Intranet
A portion of the Internet that is separately administered and supports internal sharing of resources (file/storage systems and printers) using Internet protocols is called a local intranet.
Internet
The Internet is a global system of interconnected computer networks that use the standardized Internet Protocol Suite (TCP/IP).
Java RMI
Java Remote Method Invocation (RMI) is embedded in the Java language: an object variant of remote procedure call. It adds naming compared with RPC (Remote Procedure Call), and it is restricted to Java environments.

Java RMI is a simple and powerful network object transport mechanism that provides a way for a Java program on one machine to communicate with objects residing in different address spaces. Some Java parallel computing environments, such as JavaParty, use RMI for communication. It is also the foundation of Jini technology. RMI is an implementation of the distributed object programming model, comparable with CORBA, but simpler and specialized to the Java language.

Goals: A primary goal for the RMI designers was to allow programmers to develop distributed Java programs with the same syntax and semantics used for non-distributed programs. To do this, they had to carefully map how Java classes and objects work in a single Java Virtual Machine (JVM) to a new model of how classes and objects would work in a distributed (multiple-JVM) computing environment.

Java RMI architecture: The design goal for the RMI architecture was to create a Java distributed object model that integrates naturally into the Java programming language and the local object model. The RMI architects succeeded, creating a system that extends the safety and robustness of the Java architecture to the distributed computing world. Important parts of the RMI architecture are the stub class, object serialization, and the server-side run-time system.

The stub class implements the remote interface and is responsible for marshaling and unmarshaling the data and for managing the network connection to a server. An instance of the stub class is needed on each client; a local method invocation on the stub is made whenever a client invokes a method on a remote object. A stub is a piece of code that takes the place of the real routine and simulates the activity of the missing component. Skeletons play the corresponding role on the server side: together, stubs and skeletons marshal and unmarshal the messages that are sent and received on the client and server sides.

Java has a general mechanism for converting objects into streams of bytes that can later be read back into an arbitrary JVM. This mechanism, called object serialization, is essential to Java's RMI implementation. It provides a standardized way to encode all the information into a byte stream suitable for streaming to some type of network or to a file system. To take part, an object must implement the Serializable interface.

The server-side run-time system is responsible for listening for invocation requests on a suitable IP port and dispatching them to the proper remote object on the server.

Since RMI is designed for web-based client-server applications over slow networks, it is not clear that it is suitable for high-performance distributed computing environments with low latency and high bandwidth. A better serialization mechanism would be needed, since Java's object serialization often takes 25% to 50% of the time needed for a remote invocation.
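To make the stub and registry machinery concrete, here is a minimal RMI sketch (an illustration, not from the original slides): a remote interface, its implementation, and a server that registers it. The service name "greeter" and the use of a single JVM for both sides are arbitrary example choices.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Remote interface: every remotely callable method must declare RemoteException.
interface Greeter extends Remote {
    String greet(String name) throws RemoteException;
}

// Server-side implementation of the remote interface.
class GreeterImpl implements Greeter {
    public String greet(String name) { return "Hello, " + name; }
}

public class RmiDemo {
    public static void main(String[] args) throws Exception {
        // Export the object: RMI generates a dynamic stub for it.
        Greeter stub = (Greeter) UnicastRemoteObject.exportObject(new GreeterImpl(), 0);

        // Start a registry on the default RMI port and register the stub.
        Registry registry = LocateRegistry.createRegistry(1099);
        registry.rebind("greeter", stub);

        // A client (here, the same JVM for brevity) looks up the stub
        // and invokes it as if it were a local object.
        Greeter remote = (Greeter) LocateRegistry.getRegistry("localhost", 1099)
                                                 .lookup("greeter");
        System.out.println(remote.greet("distributed world"));
    }
}
```

In a real deployment the client would run on a different machine and would share only the Greeter interface; the stub it receives from the registry handles the marshaling and network connection described above.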
Advantages
Economics: Computers harnessed together give a better price/performance ratio than mainframes.
Speed: A distributed system may have more total computing power than a mainframe.
Inherent distribution of applications: Some applications are inherently distributed, e.g., an ATM banking application.
Reliability: If one machine crashes, the system as a whole can still survive if you have multiple server machines and multiple storage devices (redundancy).
Extensibility and incremental growth: It is possible to gradually scale up (in terms of processing power and functionality) by adding more resources (both hardware and software), without disruption to the rest of the system.
Distributed custodianship: The National Spatial Data Infrastructure (NSDI) calls for a system of partnerships to produce a future national framework for data as a patchwork quilt of information collected at different scales and produced and maintained by different governments and agencies. NSDI will require novel arrangements for framework management, area integration, and data distribution. Research in this area examines the basic feasibility and likely effects of such distributed custodianship in the context of distributed computing architectures, and determines the institutional structures that must evolve to support it.
Data integration: Such research contributes to the integration of geographic information and GISs into the mainstream of future libraries, which are likely to have full digital capacity. The digital libraries of the future will offer services for manipulating and processing data as well as simple search and retrieval.
Missed opportunities: By anticipating the impact that a rapidly advancing technology will have on GISs, this research allows the GIS community to take better advantage of the opportunities the technology offers.
Disadvantages
Complexity: Lack of experience in designing and implementing distributed systems, e.g., which platform (hardware and OS) to use, which language to use, etc.
Network problems: If the network underlying a distributed system saturates or goes down, the distributed system is effectively disabled, negating most of its advantages.
Security: Security is a major hazard, since easy access to data means easy access to secret data as well.
Issues and Challenges
Heterogeneity of components: Variety and differences apply to computer hardware, networks, OSs, programming languages, and implementations by different developers. All differences in representation must be dealt with in order to exchange messages; for example, the calls for exchanging messages in UNIX differ from those in Windows. The Internet communication protocols mask the differences between networks, and middleware can deal with the other differences.
Openness: The system can be extended and re-implemented in various ways, which cannot be achieved unless the specification and documentation are made available to software developers. The first step is to publish the interfaces of the components, but the integration of components written by different programmers is a real challenge. The biggest challenge for designers is to tackle the complexity of distributed systems designed by different people.
Issues and Challenges (cont.)
Transparency: The aim is to make certain aspects of distribution invisible to the application programmer, so that they can focus on the design of their particular application and are not concerned with the locations of resources or the details of how the system operates, whether resources are replicated or migrated. Failures can be presented to application programmers in the form of exceptions, which must be handled.
Issues and Challenges (cont.)
Transparency can be summarized by the following kinds (shown in a figure in the original slides):
Location transparency: In a true distributed system, users cannot tell where hardware and software resources such as CPUs, printers, files, and databases are located.
Migration transparency: Resources must be free to move from one location to another without having their names change.
Replication transparency: The OS is free to make additional copies of files and other resources on its own without the users noticing. E.g., servers can decide by themselves to replicate any file on any or all servers, without the users having to know about it.
Concurrency transparency: Users will not notice the existence of other users.
Parallelism transparency: Can be regarded as the holy grail for distributed systems designers.
Issues and Challenges (cont.)
Security: Security for information resources in distributed systems has three components:
a. Confidentiality: protection against disclosure to unauthorized individuals.
b. Integrity: protection against alteration and corruption.
c. Availability: protection against interference with the means of accessing the resources.
The challenge is to send sensitive information over the Internet in a secure manner and to identify a remote user or other agent correctly. Encryption can be used to provide adequate protection for shared resources and to keep sensitive information secret when it is transmitted in messages over a network. Denial-of-service attacks remain a problem.
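As a small illustration of the encryption point (not part of the original slides), the following sketch uses the standard javax.crypto API to keep a message confidential in transit; AES-GCM also provides an integrity check. How the two parties agree on the key is out of scope here.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.security.SecureRandom;

// Confidentiality + integrity for a message in transit, using AES-GCM.
public class CryptoDemo {
    public static void main(String[] args) throws Exception {
        // Shared secret key (in practice, agreed on via a key-exchange protocol).
        SecretKey key = KeyGenerator.getInstance("AES").generateKey();

        // Fresh random nonce (IV) per message; sent alongside the ciphertext.
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);

        // Sender encrypts.
        Cipher enc = Cipher.getInstance("AES/GCM/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ciphertext = enc.doFinal("account balance: 42".getBytes("UTF-8"));

        // Receiver decrypts; any tampering makes doFinal throw AEADBadTagException,
        // which is the integrity guarantee in action.
        Cipher dec = Cipher.getInstance("AES/GCM/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
        System.out.println(new String(dec.doFinal(ciphertext), "UTF-8"));
    }
}
```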
Issues and Challenges (cont.)
Scalability: Distributed computing operates at many different scales, ranging from a small intranet to the Internet. A system is scalable if it remains effective when there is a significant increase in the number of resources and users. The challenges are:
a. controlling the cost of physical resources;
b. controlling the performance loss;
c. preventing software resources from running out;
d. avoiding performance bottlenecks.
Distributed computing is scalable if the cost of adding a user is a constant amount in terms of the resources that must be added. The algorithms used to access shared data should avoid performance bottlenecks, and data should be structured hierarchically to get the best access times.
Issues and Challenges (cont.)
Failure handling: Failures in a distributed system are partial: some components fail while others continue to function. That is why handling failures is difficult:
a. Detecting failures: some failures cannot be detected, only suspected.
b. Masking failures: hiding a failure is not guaranteed in the worst case.
Any process, computer, or network may fail independently of the others. Therefore each component needs to be aware of the possible ways in which the components it depends on may fail, and be designed to deal with each of those failures appropriately.
Concurrency: Where applications and services process requests concurrently, operations may conflict with one another and produce inconsistent results. Each resource must be designed to be safe in a concurrent environment.
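A common pattern behind "failures can only be suspected" is the timeout: if a remote call does not answer within a deadline, the caller treats the node as suspect and falls back, since a crashed node cannot be distinguished from a slow one. A minimal sketch (my illustration, not from the slides) using a Future with a deadline:

```java
import java.util.concurrent.*;

// Timeout-based failure suspicion: after the deadline we only *suspect*
// failure (the node may merely be slow), so we fail over rather than
// concluding the node is dead.
public class FailureSuspicionDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();

        // Stand-in for a remote call; here it sleeps longer than our deadline.
        Future<String> reply = pool.submit(() -> {
            Thread.sleep(5_000);
            return "result from remote node";
        });

        try {
            System.out.println(reply.get(1, TimeUnit.SECONDS)); // 1-second deadline
        } catch (TimeoutException e) {
            // Suspected failure: retry elsewhere, use a replica, or report an error.
            System.out.println("node suspected failed; failing over to a replica");
            reply.cancel(true);
        } finally {
            pool.shutdownNow();
        }
    }
}
```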
Conclusion
In this age of optimization, everybody is trying to get optimized output from limited resources, and distributed computing is an efficient way to achieve that. Distributed computing is everywhere: intranets, the Internet, and mobile ubiquitous computing (laptops, PDAs, pagers, smart watches, hi-fi systems). It deals with hardware and software systems that contain more than one processing or storage element and run concurrently. The main motivating factor is resource sharing, of files, printers, web pages, database records, and so on. In distributed computing, the actual task is modularized and distributed among various computer systems; this not only increases the efficiency of the task but also reduces the total time required to complete it. Grid computing and cloud computing are forms of distributed computing.

The advanced concept of distributed computing through mobile agents is setting a new landmark in this technology. A mobile agent is a process that can transport its state from one environment to another, with its data intact, and is capable of performing appropriately in the new environment.
Grid Computing
Grid computing is a form of distributed computing whereby a "super and virtual computer" is composed of a cluster of networked, loosely coupled computers acting in concert to perform very large tasks. Grid computing (Foster and Kesselman, 1999) is a growing technology that facilitates the execution of large-scale, resource-intensive applications on geographically distributed computing resources. It facilitates flexible, secure, coordinated large-scale resource sharing among dynamic collections of individuals, institutions, and resources, and it enables communities ("virtual organizations") to share geographically distributed resources as they pursue common goals.
Criteria for a Grid
- Coordinates resources that are not subject to centralized control.
- Uses standard, open, general-purpose protocols and interfaces.
- Delivers nontrivial qualities of service.

Benefits
- Exploit underutilized resources
- Resource load balancing
- Virtualize resources across an enterprise
- Data grids, compute grids
- Enable collaboration for virtual organizations
Grid Applications
Data- and computationally-intensive applications. This technology has been applied to computationally intensive scientific, mathematical, and academic problems like drug discovery, economic forecasting, and seismic analysis, and to back-office data processing in support of e-commerce. For example: a chemist may utilize hundreds of processors to screen thousands of compounds per hour; teams of engineers worldwide pool resources to analyze terabytes of structural data; meteorologists seek to visualize and analyze petabytes of climate data with enormous computational demands.

Resource sharing: computers, storage, sensors, networks, and more. Sharing is always conditional, raising issues of trust, policy, negotiation, and payment.
Coordinated problem solving: distributed data analysis, computation, collaboration, and more.
Grid Topologies
• Intragrid: a local grid within an organization; trust based on personal contracts.
• Extragrid: resources of a consortium of organizations connected through a (virtual) private network; trust based on business-to-business contracts.
• Intergrid: global sharing of resources through the Internet; trust based on certification.
Computational Grid
"A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities." ("The Grid: Blueprint for a New Computing Infrastructure", Kesselman & Foster)
Example: Science Grid (US Department of Energy)
Data Grid
A data grid is a grid computing system that deals with data: the controlled sharing and management of large amounts of distributed data. The data grid is the storage component of a grid environment. Scientific and engineering applications require access to large amounts of data, and often this data is widely distributed. A data grid provides seamless access to the local or remote data required to complete compute-intensive calculations.
Examples: the Biomedical Informatics Research Network (BIRN), the Southern California Earthquake Center (SCEC).
Methods of Grid Computing
- Distributed supercomputing
- High-throughput computing
- On-demand computing
- Data-intensive computing
- Collaborative computing
- Logistical networking
Distributed Supercomputing
Combining multiple high-capacity resources on a computational grid into a single, virtual distributed supercomputer, in order to tackle problems that cannot be solved on a single system.
High-Throughput Computing
Uses the grid to schedule large numbers of loosely coupled or independent tasks, with the goal of putting unused processor cycles to work.

On-Demand Computing
Uses grid capabilities to meet short-term requirements for resources that are not locally accessible. Models real-time computing demands.
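The high-throughput pattern (many independent tasks drawn from a queue by whatever workers are idle) can be sketched in a few lines. This single-process analogue is my illustration, not from the slides, with a thread pool standing in for grid nodes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

// High-throughput computing in miniature: a pool of workers (stand-ins for
// grid nodes) pulls independent tasks from a shared queue until all are done.
public class ThroughputDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService workers = Executors.newFixedThreadPool(4); // 4 "nodes"
        List<Future<Long>> results = new ArrayList<>();

        // 100 independent tasks; the order in which they run does not matter.
        for (int i = 0; i < 100; i++) {
            final long n = i;
            results.add(workers.submit(() -> n * n)); // trivial "work unit"
        }

        long sum = 0;
        for (Future<Long> f : results) sum += f.get(); // gather results
        System.out.println("sum of squares 0..99 = " + sum);

        workers.shutdown();
    }
}
```

The defining property, as in the slide, is that tasks are loosely coupled: no task waits on another, so idle capacity anywhere can be put to work on any pending task.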
Collaborative Computing
Concerned primarily with enabling and enhancing human-to-human interactions. Applications are often structured in terms of a virtual shared space.

Data-Intensive Computing
The focus is on synthesizing new information from data that is maintained in geographically distributed repositories, digital libraries, and databases. Particularly useful for distributed data mining.
Logistical Networking
Logistical networks focus on exposing storage resources inside networks by optimizing the global scheduling of data transport and data storage, providing high-level services for grid applications. This contrasts with traditional networking, which does not explicitly model storage resources in the network. It is called "logistical" because of the analogy it bears to systems of warehouses, depots, and distribution channels.
P2P Computing vs Grid Computing
They differ in their target communities. A grid system deals with a more complex, more powerful, more diverse, and more highly interconnected set of resources than P2P. P2P uses heterogeneous end-user devices for resource sharing to fulfill the application requirements; in P2P applications, business logic and data are distributed among the end-user nodes.
A Typical View of a Grid Environment
1. The Grid Information Service collects the details of the available grid resources and passes the information to the resource broker.
2. A user sends a computation- or data-intensive application to the global grid in order to speed up its execution.
3. The resource broker distributes the jobs in the application to the grid resources, based on the user's QoS requirements and the details of the available grid resources.
4. The grid resources (clusters, PCs, supercomputers, databases, instruments, etc.) in the global grid execute the user's jobs and return the computation results.

[Figure: user, resource broker, Grid Information Service, and grid resources exchanging the grid application, computational jobs, details of grid resources, processed jobs, and the computation result]
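The following toy sketch mirrors the four steps above. It is my illustration with invented class names, not part of any real grid middleware: an information service advertises resources, and a broker assigns each job to the least-loaded one.

```java
import java.util.*;

// Toy resource broker mirroring the slide's workflow: (1) the information
// service lists resources, (2) a user submits jobs, (3) the broker assigns
// each job to the least-loaded resource, (4) resources would execute them.
public class BrokerDemo {
    static class Resource {
        final String name;
        int queuedJobs = 0;
        Resource(String name) { this.name = name; }
    }

    public static void main(String[] args) {
        // (1) Information service: the known resources and their state.
        List<Resource> resources = List.of(
                new Resource("cluster-a"), new Resource("pc-lab"), new Resource("supercomp"));

        // (2) User submits an application consisting of independent jobs.
        List<String> jobs = List.of("job-1", "job-2", "job-3", "job-4", "job-5");

        // (3) Broker: pick the least-loaded resource for each job.
        for (String job : jobs) {
            Resource target = Collections.min(resources,
                    Comparator.comparingInt((Resource r) -> r.queuedJobs));
            target.queuedJobs++;
            // (4) The chosen resource would now execute the job.
            System.out.println(job + " -> " + target.name);
        }
    }
}
```

A real broker would also weigh the user's QoS requirements (deadline, cost, data locality) rather than load alone, which is the role the slide assigns to it.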
Grid Middleware
Grids are typically managed by gridware, a special type of middleware that enables sharing and manages grid components based on user requirements and resource attributes (e.g., capacity, performance). It is software that connects other software components or applications to provide the following functions:
- Run applications on suitable available resources (brokering, scheduling)
- Provide uniform, high-level access to resources (semantic interfaces, Web Services, service-oriented architectures)
- Address inter-domain issues of security, policy, etc. (federated identities)
- Provide application-level status monitoring and control
Middleware Examples
- Globus: University of Chicago
- Condor: University of Wisconsin; high-throughput computing
- Legion: University of Virginia; virtual workspaces, collaborative computing
- IBP (Internet Backplane Protocol): University of Tennessee; logistical networking
- NetSolve: solving scientific problems in heterogeneous environments; high-throughput and data-intensive computing
Two Key Grid Computing Groups
The Globus Alliance (globus.org)
- Composed of people from Argonne National Labs, the University of Chicago, the University of Southern California Information Sciences Institute, the University of Edinburgh, and others.
- The OGSA/OGSI standards were initially proposed by the Globus group.

The Global Grid Forum
- Heavy involvement of academic groups and industry (e.g., IBM Grid Computing, HP, United Devices, Oracle, the UK e-Science Programme, US DOE, US NSF, Indiana University, and many others).
- Process: meets three times annually; solicits involvement from industry, research groups, and academics.

The Open Grid Services Architecture (OGSA) is a set of standards defining the way in which information is shared among diverse components of large, heterogeneous grid systems.
Some Major Grid Projects
- EuroGrid, Grid Interoperability (GRIP); eurogrid.org; European Union. Focus: create technology for remote access to supercomputing resources and simulation codes; in GRIP, integrate with the Globus Toolkit™.
- Fusion Collaboratory; fusiongrid.org; DOE Office of Science. Focus: create a national computational collaboratory for fusion research.
- Globus Project™; globus.org; DARPA, DOE, NSF, NASA, Microsoft. Focus: research on grid technologies; development and support of the Globus Toolkit™; application and deployment.
- GridLab; gridlab.org. Focus: grid technologies and applications.
- GridPP; gridpp.ac.uk; UK eScience. Focus: create and apply an operational grid within the UK for particle physics research.
- Grid Research Integration Development & Support Center; grids-center.org; NSF. Focus: integration, deployment, and support of the NSF Middleware Infrastructure for research and education.
Cloud Computing
Cloud computing refers to applications and services that run on a distributed network using virtualized resources and are accessed by common Internet protocols and networking standards. It is distinguished by the notion that resources are virtual and limitless, and that the details of the physical systems on which software runs are abstracted from the user.
Cloud Computing
Cloud computing takes the technology, services, and applications that are similar to those on the Internet and turns them into a self-service utility. The use of the word "cloud" makes reference to two essential concepts:
Abstraction: Cloud computing abstracts the details of system implementation from users and developers. Applications run on physical systems that aren't specified, data is stored in locations that are unknown, administration of systems is outsourced to others, and access by users is ubiquitous.
Virtualization: Cloud computing virtualizes systems by pooling and sharing resources. Systems and storage can be provisioned as needed from a centralized infrastructure, costs are assessed on a metered basis, multi-tenancy is enabled, and resources are scalable with agility.
Cloud Computing
Cloud computing is an abstraction based on the notion of pooling physical resources and presenting them as a virtual resource. It is a new model for provisioning resources, for staging applications, and for platform-independent user access to services. To help clarify how cloud computing has changed the nature of commercial system deployment, consider these three examples:
Google: In the last decade, Google has built a worldwide network of datacenters to service its search engine. In doing so Google has captured a substantial portion of the world's advertising revenue. That revenue has enabled Google to offer free software to users based on that infrastructure, and it has changed the market for user-facing software. This is the classic Software as a Service case.
Azure Platform: By contrast, Microsoft created the Azure Platform. It enables .NET Framework applications to run over the Internet, as an alternative platform for Microsoft developer software running on desktops.
Amazon Web Services: One of the most successful cloud-based businesses is Amazon Web Services, an Infrastructure as a Service offering that lets you rent virtual computers on Amazon's own infrastructure.
Thank You
Questions and Comments?