GRID AND CLOUD COMPUTING

Open Grid Services Architecture (OGSA) Courtesy: Dr Gnanasekaran Thangavel

UNIT 3 GRID SERVICES Introduction to Open Grid Services Architecture (OGSA) – Motivation – Functionality Requirements – Practical & Detailed view of OGSA/OGSI – Data intensive grid service models – OGSA services – Grid Computing Software 11/12/2018

Grid Computing Generations

What is the OGSA Standard?
OGSA stands for Open Grid Services Architecture.
OGSA defines how the different components in a grid interact.
It is a set of standards defining the way in which information is shared among the diverse components of large, heterogeneous grid systems.
In this context, a grid system is a scalable Wide Area Network (WAN) that supports resource sharing and distribution.

Architecture of OGSA
OGSA comprises four main layers:
  OGSA Enabled Physical and Logical Resources Layer
  Web Services Layer
  OGSA Architected Services Layer
  Grid Applications Layer

OGSA Architecture

OGSA Architecture - Physical and Logical Resources Layer
Physical resources are servers, storage, and network.
Logical resources manage physical resources.
Examples of logical resources: database managers, workflow managers.

OGSA Architecture - Web Services Layer
A web service is software available online that can interact with other software using XML.
This layer contains the Open Grid Services Infrastructure (OGSI) sub-layer, which specifies grid services and provides a consistent way to interact with them.
OGSI also extends web service capabilities.
It consists of six interfaces:
  Life Cycle: manages grid service life cycles
  State Management: manages grid service states
  Service Groups: collections of indexed grid services
  Factory: provides a way to create new grid services
  Notification: manages notification between services and resources
  HandleMap: deals with service identity when Factories are used
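To make the roles of the Factory, HandleMap, Life Cycle, and State Management interfaces more concrete, here is a toy Python sketch. It is not the OGSI specification or Globus Toolkit code: every class and method name (ToyGridService, ToyHandleMap, ToyFactory, create_service, resolve) and the handle format are hypothetical, chosen only for illustration.

```python
import itertools
import time


class ToyGridService:
    """A service instance with a termination time (life cycle) and state data."""

    def __init__(self, name, lifetime_s):
        self.name = name
        self.state = {}  # stands in for the service's managed state
        self.termination_time = time.monotonic() + lifetime_s

    def is_alive(self, now=None):
        now = time.monotonic() if now is None else now
        return now < self.termination_time


class ToyHandleMap:
    """Maps a stable handle (the service identity) to the current instance."""

    def __init__(self):
        self._instances = {}

    def register(self, handle, service):
        self._instances[handle] = service

    def resolve(self, handle):
        service = self._instances.get(handle)
        if service is None or not service.is_alive():
            raise LookupError(f"no live service for handle {handle!r}")
        return service


class ToyFactory:
    """Creates new service instances and registers a handle for each one."""

    def __init__(self, handle_map):
        self._handle_map = handle_map
        self._counter = itertools.count(1)

    def create_service(self, name, lifetime_s=60.0):
        handle = f"toy-handle-{next(self._counter)}"  # hypothetical handle format
        self._handle_map.register(handle, ToyGridService(name, lifetime_s))
        return handle


handle_map = ToyHandleMap()
factory = ToyFactory(handle_map)
handle = factory.create_service("matrix-multiply", lifetime_s=30.0)
service = handle_map.resolve(handle)
service.state["progress"] = 0.5
print(handle, service.name, service.state)
```

The point of the sketch is the separation of concerns: clients hold only a handle, the factory decides how instances are created, and an expired instance can no longer be resolved.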

OGSA Architecture - Web Services Layer (OGSI)

Open Grid Services Infrastructure (OGSI)
OGSI gives a formal and technical specification of what a grid service is, covering the concepts described in OGSA.
It is an excruciatingly detailed specification of how grid services work.
Globus Toolkit 3 (GT3) includes a complete implementation of OGSI.
Other implementations are OGSI::Lite (Perl) and the UNICORE (Uniform Interface to Computing Resources) OGSA demonstrator from the EU GRIP (Grid Interoperability Project).
The OGSI specification defines grid services and builds upon web services.

OGSA Architecture – OGSA Architected Services Layer
OGSA Architected Services are classified into three service categories:
  Grid Core Services
  Grid Program Execution Services
  Grid Data Services


OGSA Architected Services – Grid Core Services
Grid Core Services are composed of four main types of services:
  Management Services: assist in installation, maintenance, and troubleshooting tasks in the grid system
  Communication Services: include functions that allow grid services to communicate
  Policy Services: provide a framework for the creation, administration, and management of policies for system operation
  Security Services: provide authentication and authorization mechanisms to ensure systems interoperate securely

OGSA Architected Services – Grid Program Execution Services
Grid Program Execution Services support grid systems in high-performance computing, collaboration, and parallelism.
They support virtualization of resource processing.

OGSA Architected Services – Grid Data Services
Grid Data Services support data virtualization.
They provide mechanisms for access to distributed resources such as databases and files.

OGSA Architecture – Grid Applications Layer
The Grid Applications Layer comprises applications that use the grid architected services.

Conclusion
Grid computing allows networked resources to be combined and used.
Grid computing offers great benefit to an organization.
OGSA is a comprehensive set of standards that governs grid computing.
There are three types of grids: Compute Grid, Scavenging Grid, and Data Grid.

Data intensive grid service models
Applications in the grid are normally grouped into two categories: computation-intensive and data-intensive.
Data-intensive applications deal with massive amounts of data.
The grid system must be specially designed to discover, transfer, and manipulate these massive data sets.
Transferring massive data sets is a time-consuming task.
Data replication, an access method also known as caching, is often applied to enhance data efficiency in a grid environment.
By replicating the same data block and scattering the copies across multiple regions of a grid, users can access the same data with locality of reference.

Data intensive grid service models
Replication strategies determine when and where to create a replica of the data.
Replication strategies can be classified as static or dynamic.
Static method
  The locations and number of replicas are determined in advance and are not modified.
  Replication operations require little overhead.
  Static strategies cannot adapt to changes in demand, bandwidth, and storage availability.
  Optimization is required to determine the location and number of data replicas.
Dynamic strategies
  Dynamic strategies can adjust the locations and number of data replicas according to changes in conditions.
  Frequent data-moving operations can result in much more overhead than in static strategies.
  Optimization may be determined based on whether the data replica is being created, deleted, or moved.
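The difference between the two strategies can be sketched in a few lines of Python. This is only an illustrative model, not an algorithm taken from the slides or the textbook: the threshold rule (create a replica at a site after a fixed number of remote requests), the class names, and the site names are all assumptions.

```python
from collections import Counter


class StaticPlacement:
    """Replica sites are fixed in advance and never change."""

    def __init__(self, replica_sites):
        self.replica_sites = frozenset(replica_sites)

    def record_access(self, site):
        return False  # a static strategy never reacts to demand


class DynamicPlacement:
    """Adds a replica at a site once it has requested the data often enough."""

    def __init__(self, replica_sites, threshold):
        self.replica_sites = set(replica_sites)
        self.threshold = threshold  # assumed rule: replicate after this many remote hits
        self.remote_accesses = Counter()

    def record_access(self, site):
        """Return True if this access triggered the creation of a new replica."""
        if site in self.replica_sites:
            return False  # served from the local replica
        self.remote_accesses[site] += 1
        if self.remote_accesses[site] >= self.threshold:
            self.replica_sites.add(site)  # this is the extra data-moving overhead
            del self.remote_accesses[site]
            return True
        return False


# Hypothetical request trace: which site asked for the data block.
requests = ["tokyo", "tokyo", "paris", "tokyo", "tokyo"]
static = StaticPlacement({"geneva"})
dynamic = DynamicPlacement({"geneva"}, threshold=3)
for site in requests:
    static.record_access(site)
    dynamic.record_access(site)

print(sorted(static.replica_sites))
print(sorted(dynamic.replica_sites))
```

With this trace the static placement stays at its single predetermined site, while the dynamic placement gains a replica at the site that kept asking for the data, at the cost of one extra transfer.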

Grid data Access models
In general, there are four access models for organizing a data grid:
  Monadic model
  Hierarchical model
  Federation model
  Hybrid model

Monadic model
This is a centralized data repository model: all data is saved in a central data repository.
When users want to access some data, they have to submit requests directly to the central repository.
No data is replicated to preserve data locality.
For a larger grid, this model is not efficient in terms of performance and reliability.
Data replication is permitted in this model only when fault tolerance is demanded.

Hierarchical model
It is suitable for building a large data grid that has only one large data access directory.
Data may be transferred from the source to a second-level center.
Then some data in the regional center is transferred to a third-level center.
After being forwarded several times, specific data objects are accessed directly by users.
A higher-level data center has a wider coverage area.
PKI security services are easier to implement in this hierarchical data access model.
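The forwarding behaviour of the hierarchical model can be sketched as a chain of data centers, each with a parent one level up. This is a simplified illustration under stated assumptions: the DataCenter class, the rule of caching a copy at every center on the way back down, and the names "source", "regional", "local", and "run-42" are all hypothetical.

```python
class DataCenter:
    """One level of a hierarchical data grid; `parent` is the next level up."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.store = {}

    def fetch(self, key):
        """Return (value, names of the centers visited), walking up the hierarchy."""
        visited = []
        center = self
        while center is not None:
            visited.append(center)
            if key in center.store:
                value = center.store[key]
                # Forward a copy to every lower-level center that missed.
                for lower in visited[:-1]:
                    lower.store[key] = value
                return value, [c.name for c in visited]
            center = center.parent
        raise KeyError(key)


source = DataCenter("source")
regional = DataCenter("regional", parent=source)
local = DataCenter("local", parent=regional)
source.store["run-42"] = b"detector data"

first_value, first_path = local.fetch("run-42")
second_value, second_path = local.fetch("run-42")
print(first_path)
print(second_path)
```

The first request travels all the way to the source; after the data has been forwarded down the hierarchy, the second request is served directly by the user's nearest center.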

Federation model
It is suited for designing a data grid with multiple sources of data supply.
It is also known as a mesh model.
Data is shared, but data items are still owned and controlled by their original owners.
Only authenticated users are authorized to request data from any data source.
This mesh model costs the most when the number of grid institutions becomes very large.

Hybrid model
This model combines the best features of the hierarchical and mesh models.
Traditional data transfer technology such as FTP applies to networks with lower bandwidth.
High-bandwidth networks are exploited by high-speed data transfer tools such as GridFTP, developed with the Globus library.
The cost of the hybrid model can be traded off between the two extreme models of hierarchical and mesh-connected grids.

Parallel versus Striped Data Transfers
Parallel data transfer opens multiple data streams for passing subdivided segments of a file simultaneously. Although the speed of each stream is the same as in sequential streaming, the total time to move the data over all streams can be significantly reduced compared to an FTP transfer.
In striped data transfer, a data object is partitioned into a number of sections, and each section is placed at an individual site in a data grid. When a user requests this piece of data, a data stream is created for each site, and all the sections of the data object are transferred simultaneously.
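The two transfer styles can be contrasted with a small in-memory Python sketch. It does not use GridFTP or any network code: each "stream" is simulated by a thread copying bytes, and the function names (split_ranges, parallel_transfer, striped_transfer) and site layout are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, parts):
    """Split [0, size) into `parts` contiguous (start, end) ranges."""
    base, extra = divmod(size, parts)
    ranges, start = [], 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def parallel_transfer(source, streams=4):
    """Copy `source` (bytes held at ONE site) using several concurrent streams."""
    target = bytearray(len(source))

    def copy_segment(bounds):
        start, end = bounds
        target[start:end] = source[start:end]  # stands in for one network stream

    with ThreadPoolExecutor(max_workers=streams) as pool:
        list(pool.map(copy_segment, split_ranges(len(source), streams)))
    return bytes(target)


def striped_transfer(sites):
    """Fetch one section from EACH site concurrently and join them in order."""
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        sections = list(pool.map(lambda site: site["section"], sites))
    return b"".join(sections)


data = bytes(range(256)) * 10
copied = parallel_transfer(data, streams=4)

# Striping: the object is stored as three sections at three hypothetical sites.
sites = [
    {"name": f"site-{i}", "section": data[start:end]}
    for i, (start, end) in enumerate(split_ranges(len(data), 3))
]
reassembled = striped_transfer(sites)
print(copied == data, reassembled == data)
```

In both cases the file is cut into segments that move at the same time; the difference is that parallel transfer reads all segments from one site, whereas striped transfer reads each section from the site where it is stored.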

Grid Services and OGSA
Facilitate use and management of resources across distributed, heterogeneous environments
Deliver seamless QoS
Define open, published interfaces in order to provide interoperability of diverse resources
Exploit industry-standard integration technologies
Develop standards that achieve interoperability
Integrate, virtualize, and manage services and resources in a distributed, heterogeneous environment
Deliver functionality as loosely coupled, interacting services aligned with industry-accepted web service standards

Grid Services and OGSA
OGSA services fall into seven broad areas, defined in terms of capabilities frequently required in a grid scenario. Figure shows the OGSA architecture. These services are summarized as follows:

OGSA services - seven broad areas
Infrastructure Services
  Refer to a set of common functionalities, such as naming, typically required by higher-level services.
Execution Management Services
  Concerned with issues such as starting and managing tasks, including placement, provisioning, and life-cycle management. Tasks may range from simple jobs to complex workflows or composite services.
Data Management Services
  Provide functionality to move data to where it is needed, maintain replicated copies, run queries and updates, and transform data into new formats. These services must handle issues such as data consistency, persistency, and integrity.
  An OGSA data service is a web service that implements one or more of the base data interfaces to enable access to, and management of, data resources in a distributed environment. The three base interfaces, Data Access, Data Factory, and Data Management, define basic operations for representing, accessing, creating, and managing data.

Resource Management Services
  Provide management capabilities for grid resources: management of the resources themselves, management of the resources as grid components, and management of the OGSA infrastructure. For example, resources can be monitored, reserved, deployed, and configured as needed to meet application QoS requirements. This also requires an information model (semantics) and a data model (representation) of the grid resources and services.
Security Services
  Facilitate the enforcement of security-related policies within a (virtual) organization, and support safe resource sharing. Authentication, authorization, and integrity assurance are essential functionalities provided by these services.

Information Services
  Provide efficient production of, and access to, information about the grid and its constituent resources. The term “information” refers to dynamic data or events used for status monitoring; relatively static data used for discovery; and any data that is logged. Troubleshooting is just one of the possible uses for information provided by these services.
Self-Management Services
  Support service-level attainment for a set of services (or resources), with as much automation as possible, to reduce the costs and complexity of managing the system. These services are essential in addressing the increasing complexity of owning and operating an IT infrastructure.

Grid Computing Software
Condor / HTCondor
NASA
Intel
BOINC (Berkeley Open Infrastructure for Network Computing)

https://research.cs.wisc.edu/htcondor/
HTCondor
HTCondor is an open-source high-throughput computing software framework for coarse-grained distributed parallelization of computationally intensive tasks. It can be used to manage workload on a dedicated cluster of computers, and/or to farm out work to idle desktop computers (so-called cycle scavenging).
HTCondor runs on Linux, Unix, Mac OS X, FreeBSD, and contemporary Windows operating systems. It can seamlessly integrate both dedicated resources (rack-mounted clusters) and non-dedicated desktop machines (cycle scavenging) into one computing environment.
HTCondor was formerly known as Condor; the name was changed in October 2012 to resolve a trademark lawsuit.
HTCondor is developed by the HTCondor team at the University of Wisconsin–Madison and is freely available for use. It follows an open-source philosophy and is licensed under the Apache License 2.0.
It can be downloaded from the HTCondor web site or by installing the Fedora Linux distribution. It is also available on other platforms, such as Ubuntu, from the repositories.

Downloading HTCondor
Download HTCondor from:

Installing HTCondor

Testing HTCondor After Install
Execute /etc/init.d/condor restart
To ensure that HTCondor is running, you can run:
ps -ef | egrep condor_
condor :26 ? :00:00 /usr/sbin/condor_master -f
root :26 ? :00:00 condor_procd -A /var/run/condor/procd_pipe -L /var/log/condor/ProcLog -R S 60 -C 132
condor :26 ? :00:00 condor_collector -f
condor :26 ? :00:00 condor_startd -f
condor :26 ? :00:00 condor_schedd -f
condor :26 ? :00:00 condor_negotiator -f
condor :27 ? :00:00 condor_kbdd
adeel :58 pts/ :00:00 grep -E --color=auto condor_
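The manual check above can be automated with a short Python sketch that scans `ps -ef` style output for the daemon names shown on this slide. The function name missing_daemons and the abbreviated sample text (user and command columns only) are assumptions; which daemons should be present also depends on the role of the machine, so the expected list here simply mirrors the single-machine output above.

```python
EXPECTED_DAEMONS = (
    "condor_master",
    "condor_collector",
    "condor_startd",
    "condor_schedd",
    "condor_negotiator",
)


def missing_daemons(ps_output, expected=EXPECTED_DAEMONS):
    """Return the expected daemon names that do not appear in `ps -ef` output."""
    running = set()
    for line in ps_output.splitlines():
        if "grep" in line:
            continue  # skip the grep process itself
        for token in line.split():
            name = token.rsplit("/", 1)[-1]  # "/usr/sbin/condor_master" -> "condor_master"
            if name in expected:
                running.add(name)
    return [daemon for daemon in expected if daemon not in running]


# Abbreviated, hypothetical sample modeled on the listing above.
SAMPLE_PS_OUTPUT = """\
condor /usr/sbin/condor_master -f
root   condor_procd -A /var/run/condor/procd_pipe -L /var/log/condor/ProcLog
condor condor_collector -f
condor condor_startd -f
condor condor_schedd -f
condor condor_negotiator -f
adeel  grep -E --color=auto condor_
"""

print(missing_daemons(SAMPLE_PS_OUTPUT))
```

On a real machine the text could be obtained with `subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout` and passed to the same function; an empty list means every expected daemon was found.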

Quick Starting HTCondor

Creating a script and a submit description file
Unix/Linux: shell script
Windows: batch file
Submit description file
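As a hedged illustration of what these two files might contain on Unix/Linux, the Python sketch below writes a small shell script and a matching submit description file into a temporary directory. The names sleep.sub and sleep.log appear later in this deck; the script name sleep.sh, the output and error file names, and the 10-second argument are assumptions. The submit commands used (executable, arguments, output, error, log, queue) are standard HTCondor submit description file commands.

```python
import stat
import tempfile
from pathlib import Path

# Hypothetical test job: wait for the number of seconds given as the first argument.
SCRIPT = """#!/bin/sh
echo "sleeping for $1 seconds"
sleep "$1"
echo "done"
"""

# Submit description file for the script above (file names other than
# sleep.sub and sleep.log are assumptions).
SUBMIT = """# sleep.sub
executable = sleep.sh
arguments  = 10
output     = sleep.out
error      = sleep.err
log        = sleep.log
queue
"""


def write_job_files(directory):
    """Write sleep.sh and sleep.sub into `directory` and return both paths."""
    directory = Path(directory)
    script = directory / "sleep.sh"
    submit = directory / "sleep.sub"
    script.write_text(SCRIPT)
    script.chmod(script.stat().st_mode | stat.S_IXUSR)  # make the script executable
    submit.write_text(SUBMIT)
    return script, submit


workdir = Path(tempfile.mkdtemp(prefix="htcondor-demo-"))
script_path, submit_path = write_job_files(workdir)
print(submit_path.read_text())
```

The final `queue` line asks HTCondor to place one instance of the job in the queue; on Windows the executable would be a batch file instead of a shell script.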

Typical Commands of HTCondor
To start or restart HTCondor, use: /etc/init.d/condor restart
To submit a job, use: condor_submit sleep.sub
To check the job queue, use: condor_q -submitter adeel
To remove a job, use: condor_rm 1.0
The log of execution is saved in sleep.log
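If these commands need to be driven from a script, a thin Python wrapper is one option. The sketch below only builds the argument lists for the three job commands listed on this slide and prints them; the helper names condor_command and run_if_available are hypothetical, and nothing is executed unless run_if_available is called explicitly on a machine where the HTCondor tools are on the PATH.

```python
import shutil
import subprocess


def condor_command(action, argument):
    """Build the argv list for one of the commands listed on this slide."""
    templates = {
        "submit": ["condor_submit", argument],         # argument: submit description file
        "queue": ["condor_q", "-submitter", argument],  # argument: user name
        "remove": ["condor_rm", argument],              # argument: job id such as 1.0
    }
    return templates[action]


def run_if_available(argv):
    """Run the command only if the tool is installed; otherwise return None."""
    if shutil.which(argv[0]) is None:
        return None
    return subprocess.run(argv, capture_output=True, text=True, check=False)


# Only print the commands here; nothing is submitted or removed.
for action, argument in [("submit", "sleep.sub"), ("queue", "adeel"), ("remove", "1.0")]:
    print(" ".join(condor_command(action, argument)))
```

Passing arguments as a list rather than a single shell string avoids quoting problems when file or user names contain spaces.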

https://boinc.berkeley.edu/
BOINC
The Berkeley Open Infrastructure for Network Computing (BOINC) is an open-source middleware system that supports volunteer and grid computing.
Originally developed to support the project, it became generalized as a platform for other distributed applications in areas as diverse as mathematics, linguistics, medicine, molecular biology, climatology, environmental science, and astrophysics, among others.
BOINC aims to enable researchers to tap into the enormous processing resources of multiple personal computers around the world.
BOINC development originated with a team based at the Space Sciences Laboratory (SSL) at the University of California, Berkeley and led by David Anderson, who also leads [...].
As a high-performance distributed computing platform, BOINC brings together about 311,742 active participants and 834,343 active computers (hosts) worldwide, processing on average [...] PetaFLOPS as of 13 January 2017.

Downloading BOINC

Selecting BOINC Projects

BOINC Projects

GridRepublic

References
Kai Hwang, Geoffrey C. Fox and Jack J. Dongarra, “Distributed and Cloud Computing: Clusters, Grids, Clouds and the Future of Internet”, First Edition, Morgan Kaufmann Publishers, an imprint of Elsevier, 2012.

Related Videos on YouTube
IBM Grid Computing Demo
Introduction to Grid Computing
Grid Computing: History and Evolution of Grid Computing with generations
Comparison of Cloud with Grid Computing

Assignment #2
Set up your BOINC project at [...] and submit the status report of the utilization of your grid node as Assignment #2 next week.
Differentiate between grid and cloud computing after watching the videos on slide 51.

Questions and Comments?
Thank You
