Grid Computing Overview and Research Issues Peter Kelly Adelaide University, Australia Supervisors: Paul Coddington, Andrew Wendelborn

What is grid computing?
Grid computing is many things to many people
At its core, it’s about:
– Sharing computing resources between organisations
– Enabling more complex and demanding applications by providing widespread access to powerful computers and storage
– Integrating existing systems together

What is grid computing?
In some respects it’s similar to cluster computing; however, each computer may:
– Be located in a different country
– Use a different CPU architecture
– Run a different operating system
– Be owned by a different organisation
– Have a different amount of memory, disk space, computing power, and network bandwidth
– Not be available all of the time
Thus grids are much more complex than clusters!

Why is it useful?
Demand for computing power is growing rapidly
– In industry, science, government, engineering, entertainment, defence… everywhere
Need ways to harness the large amount of computing power available around the world
Organisations often want to collaborate on projects and share resources with each other
Grids provide the infrastructure to integrate different applications that need to collaborate with each other to get useful work done

Types of grid computing
– Service Oriented Architecture (SOA)
– Job submission (supercomputer access)
– Cycle stealing

Service Oriented Architecture (SOA)
Applications are exposed as services, which provide a well-defined interface and are accessed through standard protocols
Clients use remote procedure calls to access these services
[Diagram: the client sends a request to the service, and the service returns a response]
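
The request/response exchange on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TemperatureService` class, the `to_fahrenheit` operation, and the JSON message layout are hypothetical and do not come from the talk, and a direct method call stands in for the network. The point is that the client depends only on the message format, not on how the service is implemented.

```python
import json


class TemperatureService:
    """Hypothetical service; clients see only the message interface below."""

    def handle(self, request_text: str) -> str:
        request = json.loads(request_text)
        if request["operation"] == "to_fahrenheit":
            celsius = request["arguments"]["celsius"]
            return json.dumps({"status": "ok", "result": celsius * 9 / 5 + 32})
        return json.dumps({"status": "error", "message": "unknown operation"})


def call(service, operation, **arguments):
    """Client-side stub: encode a request, send it, decode the response."""
    request_text = json.dumps({"operation": operation, "arguments": arguments})
    # The direct call below stands in for the network hop (an assumption).
    response = json.loads(service.handle(request_text))
    if response["status"] != "ok":
        raise RuntimeError(response["message"])
    return response["result"]


print(call(TemperatureService(), "to_fahrenheit", celsius=100))
```

Because both sides only exchange text messages, either one could be rewritten in another language without the other noticing.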

Benefits of SOA
SOA is platform agnostic
– Client doesn’t need to know how service is implemented
– Service doesn’t need to know how client is implemented
SOA is vendor independent
– Based on open standards – no “lock in”
– All SOA vendors support the same standards to enable interoperability
SOA is widely supported
– Many companies are getting behind it
– Being adopted widely in commercial and scientific organisations

Job submission
Many organisations have large supercomputers (SMP or clusters) that they want users to be able to submit jobs to
This can be achieved by installing middleware on each supercomputer which interfaces to the local job queue
– e.g. Globus GRAM - allows users to submit to job queues such as PBS, LSF, etc.
Users submit jobs to a superscheduler which manages a “higher level” queue and dispatches jobs to resources
The grid middleware handles tasks such as copying files to and from the execution node and monitoring job progress, and abstracts the details of these away from clients

Job submission
[Diagram: a client submits jobs to a superscheduler, which dispatches them to an SMP machine and a cluster]

Benefits of Job Submission Grids
Users do not have to worry about differences between job submission systems running on different resources
Superschedulers make it possible to automatically find resources that will execute the job more quickly
A user submits a job to a grid, it runs, and they get the results back later
Job submission can be implemented on top of SOA by providing a service with methods for submitting and monitoring jobs, as well as notifying clients of failures or completion
– e.g. Globus MMJFS – provides a web service interface to allow users to submit jobs
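
As a rough sketch of how a superscheduler might pick the resource that finishes a job soonest, the Python below keeps a queue length per resource and dispatches each job to the lowest estimated finish time. The `Resource` and `Superscheduler` classes, the `relative_speed` figure, and the cost model are all assumptions made for illustration; real middleware uses far richer information.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    relative_speed: float      # hypothetical: work units processed per time unit
    queued_work: float = 0.0   # work already waiting in the local job queue

    def estimated_finish(self, job_work: float) -> float:
        """Time until a new job of this size would complete on this resource."""
        return (self.queued_work + job_work) / self.relative_speed


class Superscheduler:
    """Higher-level queue that dispatches each job to the best resource."""

    def __init__(self, resources):
        self.resources = list(resources)

    def submit(self, job_work: float) -> str:
        best = min(self.resources, key=lambda r: r.estimated_finish(job_work))
        best.queued_work += job_work  # the job joins that resource's local queue
        return best.name


scheduler = Superscheduler([Resource("smp", 1.0), Resource("cluster", 4.0)])
print(scheduler.submit(8))
```

The user only calls `submit`; which machine runs the job, and what local queueing system it uses, stays hidden behind the scheduler.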

Cycle stealing
The use of large numbers of desktop PCs to run “embarrassingly parallel” applications
A master node coordinates execution and hands out tasks to workers
The worker process on each machine polls the master for work to do, and then executes the tasks as they become ready
The worker detects when the machine is being used by a user and suspends/aborts the active task
This model is inherently fault tolerant; if a machine dies or a task is aborted, the task can just be sent to another worker
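
The polling and re-queueing behaviour described on this slide can be simulated in a single process. The sketch below is an assumption-laden toy: the `Master` and `Worker` classes, the `busy_rounds` set that models "the owner is using the machine", and the squaring task are all hypothetical. It shows only why an aborted task is not lost.

```python
from collections import deque


class Master:
    """Coordinates execution: hands out tasks and collects results."""

    def __init__(self, tasks):
        self.pending = deque(tasks)
        self.results = {}

    def request_task(self):
        return self.pending.popleft() if self.pending else None

    def report_result(self, task, result):
        self.results[task] = result

    def report_abort(self, task):
        self.pending.append(task)  # aborted work goes back for another worker


class Worker:
    def __init__(self, master, busy_rounds=()):
        self.master = master
        # Hypothetical model: rounds in which the machine's owner is active.
        self.busy_rounds = set(busy_rounds)

    def poll(self, round_number):
        task = self.master.request_task()
        if task is None:
            return
        if round_number in self.busy_rounds:
            self.master.report_abort(task)             # owner is back: give it up
        else:
            self.master.report_result(task, task * task)  # toy computation


def run(master, workers, max_rounds=1000):
    """Let every worker poll once per round until no tasks remain."""
    for round_number in range(max_rounds):
        if not master.pending:
            break
        for worker in workers:
            worker.poll(round_number)
    return master.results
```

A possible use: `master = Master(range(5))` followed by `run(master, [Worker(master), Worker(master, busy_rounds={0, 1})])` still completes all five tasks, even though the second worker aborts twice.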

Cycle stealing
[Diagram: a master node connected to multiple workers]

Benefits of cycle stealing
Organisations can use their existing infrastructure to run computationally demanding applications
– No need to invest in large SMP systems or clusters
Large-scale internet projects can get free computing power
– …provided they can convince users to donate CPU time
– e.g. Cheap supercomputing
Generally easy to deploy

So what really is grid computing?
Not really one specific technology or concept
More of an umbrella term, like “networking” or “operating system”
Any (concrete) discussion of grid computing requires all parties involved to agree on a definition of what features they are focusing on
Very much dependent upon what you want to do – different types of organisations have different requirements
Sometimes the lines are blurred and numerous systems support multiple “types” of grid computing
Lots of hype – can be very confusing at first!
– it took me about a year to understand it enough to be able to figure out what I wanted to do in my project

Web services
Web services are a particular type of SOA
Based on standards from W3C and others:
– WSDL - language for defining service interfaces
– SOAP - format used for exchange of messages
– UDDI - directory mechanism for locating services
– XML - standard encoding mechanism used by WS protocols
– … and many more
Web services are supported by all major programming languages – either as part of built-in APIs or add-on libraries
Today web services are the most popular mechanism for integrating systems together in and between organisations

Web service composition
A programming model based on composing together functionality provided by multiple web services
Similar to the use of shared libraries/DLL files
– common functionality provided by shared entity (service)
– composition program builds additional functionality by making use of one or more services
Service composition programs can themselves be exposed as web services
– Can then be accessed by clients
– Or used as part of even higher-level service compositions
Most popular language at present is BPEL (Business Process Execution Language)

SOA programming vs. remote execution
Web services allow you to invoke programs already installed on a remote machine
Remote code execution allows you to execute arbitrary code on a remote machine
The latter is used for job submission and cycle stealing systems
Our research investigates a combination of these approaches
– Provides ability to invoke and expose web services
– Provides a distributed execution environment

Execution Environments
Problem: Need a standard way of executing arbitrary code remotely
SOA doesn’t give you this
– it only standardises the protocols for different applications to interact with each other
Job submission systems don’t give you this
– they only standardise the means of submitting and monitoring jobs
– but not how they are actually executed
Cycle stealing requires this
– existing cycle stealing systems these days typically specify Java or .NET, or use app-specific worker code
– but there is no standard which allows us to do this on an internet scale

What is an execution environment?
Instruction set
– e.g. x86, PPC, SPARC, Java bytecode, .NET bytecode
API library
– e.g. WIN32, POSIX, Java class libraries, .NET class libraries
Applications are always compiled for a specific execution environment
Can have different implementations of that environment
– x86 - AMD, Intel
– Java - Sun, IBM, various open source efforts
– .NET - Microsoft, Mono project
Applications compiled for a particular execution environment can run on any implementation of it

Virtual machines
Common way of implementing an execution environment
Abstracts away from underlying hardware/OS, providing platform independence
In a grid containing machines of different CPU architectures and operating systems this is necessary to provide seamless access
To enable code to be executed anywhere, each machine on the grid must provide the same execution environment
Currently popular virtual machines:
– Java Virtual Machine (JVM)
– .NET Common Language Runtime (CLR)
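
To make the idea of "an instruction set plus an implementation of it" concrete, here is a toy stack machine in Python. The three instructions (`push`, `add`, `mul`) and the tuple encoding are invented for this sketch and bear no relation to real JVM or CLR bytecode. The same `program` would run unchanged on any machine that provides an equivalent `run` function, which is the portability property the slide describes.

```python
def run(program):
    """Interpret a list of (instruction, operand...) tuples on a stack."""
    stack = []
    for instruction, *operands in program:
        if instruction == "push":
            stack.append(operands[0])
        elif instruction == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif instruction == "mul":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return stack.pop()


# Hypothetical "compiled" form of the expression 7 * 2 + 1.
program = [("push", 7), ("push", 2), ("mul",), ("push", 1), ("add",)]
print(run(program))
```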

A grid execution environment?
Problem: No standard execution environment supported by the popular grid middleware
Standardisation efforts (GGF) to date have focused only on service interfaces, not implementation
Each grid middleware system provides its own set of APIs, and is targeted at different VMs/OSs
Applications are not yet portable between different middleware systems
– At least not in the same sense that bytecode-compiled code is portable
– Compatibility exists only at the service interface level

Standardisation?
My belief: We won’t see the full potential of grid computing until we have agreement on a standard execution environment
Currently only SOA aspects are standardised
– But this goes only half way to solving the problem
This is very much an open research issue
Obvious candidates are Java and .NET
– But are they sufficient? Should they be extended? What about other alternatives?
– Much research already done into VM technology
– But not so much in the grid community
– IMHO a very important issue! More research needed here

Standardisation?
It’s just like the web
Early web pages were static, as there was no support for executing code in the browser; code only ran server-side
– In the grid world this corresponds to SOA
Then came early versions of JavaScript/DHTML
– Lack of standardisation, browsers were incompatible
Now we have a standard, widely supported, platform independent execution environment on 300+ million computers worldwide (JavaScript/ECMAScript)
– And look what happened… client side web apps, AJAX, Google maps, “Web 2.0” and the rest
I predict grid will go through the same evolution

Our current research
Investigating how to combine SOA and remote code execution programming models
Development of a new virtual machine + language implementation targeted at grid applications
GridXSLT: an implementation of the XSLT programming language
– Supports web service composition
– Executes programs across a grid in parallel
– Provides a natural way to deal with XML data

Why XSLT?
Ideal for manipulating XML data
Has a “semantic match” with many properties of web services
Is a functional language and can be automatically parallelised
W3C standard with a sizable existing user base
– We wanted to avoid the challenges of trying to design a new language and introduce it to the world
– Better to just develop a new implementation of an existing one which is already popular

Support for XML data
XSLT is specifically designed for dealing with XML data
All web services exchange data in XML format
Java, C#, C++ etc. are less suitable for manipulating XML because they are not designed for this (and in fact pre-date XML)
– XML data is a “second class citizen” in these languages and must be accessed through library functions or converted into objects
– APIs like DOM, SAX, etc. are less intuitive than built-in language constructs
– Conversion to objects carries significant overheads and risks losing information (e.g. element ordering)
We argue that XSLT is therefore a useful approach to developing composite web services

Pass by value semantics
Another mismatch between OO languages and web services is the way in which function arguments are handled
OO languages use pass by reference semantics - allowing a function to modify its arguments and the caller to see those changes
Web services use pass by value - where a new copy of each argument is made and a function can only transfer information to its caller through the return value
When using an OO language for WS development, the programmer must be aware of this difference, which can sometimes lead to mistakes
As a side effect-free functional language, XSLT uses pass by value, avoiding this problem
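
The mismatch can be seen in a short Python sketch. The `apply_discount` function and the `call_by_value` helper are hypothetical; `copy.deepcopy` stands in for the copy that serialising an argument into a web service message would make. The same function changes the caller's data when called locally, but not when its argument is copied first.

```python
import copy


def apply_discount(order):
    """Modifies its argument in place and also returns the new total."""
    order["total"] -= 10
    return order["total"]


def call_by_value(function, argument):
    """Stand-in for a remote call: the callee only ever sees a copy."""
    return function(copy.deepcopy(argument))


local_order = {"total": 100}
apply_discount(local_order)        # local call: the caller's object changes

remote_order = {"total": 100}
returned = call_by_value(apply_discount, remote_order)  # caller's object is untouched

print(local_order["total"], remote_order["total"], returned)
```

Code that relies on the in-place change works in the local case and silently does nothing in the remote case, which is the kind of mistake the slide warns about.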

Parallel execution
XSLT is a functional language
Functions and loops do not have side effects - there is no global state that can be modified
This enables automatic parallelisation
– All arguments to a function call can be evaluated in parallel
– All iterations of a loop can be evaluated in parallel
The programmer never needs to even know that their program will be run in parallel
– No dealing with threads, synchronisation, critical sections, message passing, race conditions etc…
– The underlying runtime system deals with all these issues
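
A minimal sketch of why side-effect-free loops are safe to parallelise, written in Python rather than XSLT. The `parallel_for_each` helper and the `word_length` body are assumptions for illustration, and the thread pool is just one possible runtime; it is not how GridXSLT itself schedules work. Since each iteration reads only its own input and touches no shared state, the iterations can run in any order and the results are still assembled in the original order.

```python
from concurrent.futures import ThreadPoolExecutor


def word_length(word):
    """Loop body with no side effects: the result depends only on the input."""
    return len(word)


def parallel_for_each(body, items):
    """Evaluate body(item) for every item, possibly concurrently.

    pool.map returns results in input order, so the caller cannot tell
    whether the iterations ran one after another or side by side.
    """
    with ThreadPoolExecutor() as pool:
        return list(pool.map(body, items))


print(parallel_for_each(word_length, ["grid", "xslt", "parallel"]))
```

The calling code contains no locks or thread management; that work is left entirely to the helper, mirroring the division of labour described on the slide.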

Implementing XSLT
We use a technique called graph reduction, a common way of implementing functional languages
A program is represented as a graph
Execution proceeds by performing a series of transformations on the graph

Graph reduction: Example
[Figure sequence: a graph built from application (@) nodes is reduced step by step; the subgraph 7 * 2 is replaced by its result, 14]
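
The reduction step in this example can be sketched in Python. This is a simplified illustration, not the implementation from the talk: it uses operator nodes directly instead of application (@) nodes, and the `Node` class and `reduce_node` function are hypothetical names. The key idea it keeps is that a reduced node is overwritten in place, so a subgraph shared by several parents is evaluated only once.

```python
import operator

OPERATORS = {"+": operator.add, "*": operator.mul}


class Node:
    """A graph node: either a literal value or an operator over two subgraphs."""

    def __init__(self, value=None, op=None, left=None, right=None):
        self.value, self.op, self.left, self.right = value, op, left, right


def reduce_node(node):
    """Reduce a node to a value, overwriting the node with its result."""
    if node.op is not None:
        result = OPERATORS[node.op](reduce_node(node.left), reduce_node(node.right))
        # Replace the operator node by its result so every other reference
        # to this shared node sees the already-reduced value.
        node.value, node.op, node.left, node.right = result, None, None, None
    return node.value


product = Node(op="*", left=Node(value=7), right=Node(value=2))
total = Node(op="+", left=product, right=product)  # 'product' is shared
print(reduce_node(total))
```

After the call, `product` itself holds the value 14 and no longer has an operator, which is the graph transformation the figures depict.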

Parallel graph reduction
Graph reduction permits the possibility of parallel execution by allowing multiple parts of the graph to be reduced in parallel
Each processor in a parallel computer or cluster can manipulate a separate portion of the graph

Parallel graph reduction
[Figure sequence: the graph for + (nprime 2000) (nprime 2001); processor 1 reduces (nprime 2000) while processor 2 reduces (nprime 2001), and the graph then reduces to 34782]
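
The example in these figures can be reproduced with a small Python sketch. It assumes that `nprime n` means "the n-th prime number", which is consistent with the result 34782 shown on the slide; the `nprime` implementation and the `reduce_in_parallel` helper are my own illustration, with two worker threads standing in for the two processors.

```python
import operator
from concurrent.futures import ThreadPoolExecutor


def nprime(n):
    """Return the n-th prime (1-based), testing candidates by trial division."""
    primes = []
    candidate = 2
    while len(primes) < n:
        is_prime = True
        for p in primes:
            if p * p > candidate:
                break
            if candidate % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(candidate)
        candidate += 1
    return primes[-1]


def reduce_in_parallel(op, subgraphs):
    """Reduce independent subgraphs on separate workers, then apply op."""
    with ThreadPoolExecutor(max_workers=len(subgraphs)) as pool:
        futures = [pool.submit(subgraph) for subgraph in subgraphs]
        return op(*[future.result() for future in futures])


# Each lambda is an unevaluated subgraph of: + (nprime 2000) (nprime 2001)
total = reduce_in_parallel(operator.add, [lambda: nprime(2000), lambda: nprime(2001)])
print(total)
```

The two subgraphs share no data, so neither worker needs to wait for the other until the final addition.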

Functional programming for grids?
It permits:
– Automatic, seamless parallelism
– Automatic, seamless fault tolerance
– Automatic, seamless distribution
But…
Some programs are based on state, which is in conflict with the pure functional programming model
– Although there are ways to get around this, e.g. monads
Different programming style to what most people are used to
– Involves a learning curve
– But might be worth it to get the above benefits
– …depending on your needs

Summary
Grid computing is a very diverse area
– Many different types of systems
– Many different requirements
– Useful in many areas
Different “types” of grid computing
– SOA, job submission, cycle stealing
– Others as well that I haven’t discussed here
Lots of challenges and open research questions
– e.g. defining a suitable execution environment for grid applications
– This is just one of many!

Summary
Our research project - GridXSLT
An attempt to combine different grid computing models
– SOA
– Remote code execution/cycle stealing
Aims to make the programmer’s job easier
– Parallelisation handled by the compiler
– Suited to dealing with XML data exchanged by web services and stored in XML databases
– High-level language which hides underlying details

Websites of interest
Global Grid Forum –
Grid Café (introduction to grid computing) –
IBM - grid computing –
GridXSLT –
Updates on my research –