An Evolutionary Approach to Realizing the Grid Vision

Marvin Theimer, Savas Parastatidis, Tony Hey (Microsoft Corporation)
Marty Humphrey (University of Virginia)
Geoffrey Fox (Indiana University)

OGSA and related working groups are already doing many things
- Broad architectural exploration
- Identification and work on horizontal areas
- Specific specs, such as BES and JSDL

Propose that they also start doing “vertical profiles”
- What’s the value of these?
  - Precise descriptions of how to interoperably implement specific use cases
  - “Close the design loop”: identify issues and holes in existing activities
- Suggest they should be done for all (major) use cases
- This talk covers a specific instance:
  - Basic HPC use case: remotely executing programs on compute resources managed by a batch job scheduling service

Approach Taken
- Start from “first principles”
  - Most likely to flush out issues
  - Not intended to ignore or denigrate existing work
  - Mapping onto existing work represents a follow-up step
- Evolutionary, incremental approach
  - Start with a minimalist design
    - Enables broadest level of interop among existing systems
    - Lowest level of entry cost
  - Incrementally add functionality in an evolutionary manner based on extensible, composable services
    - Enables specific system instances to only implement/export the functionality they need
    - Support for evolution is required in any case
- Avoid “silo” effect by
  - Publishing early and often
  - Public debates and cross-fertilization in the context of the overall OGSA architecture and related activities
  - Emphasis on composability

Key design principles
- Composable, service-oriented approach: design infrastructure to enable selective use of composable, extensible services
- Narrowly scoped interaction protocols: easier to define, implement, combine, and evolve
- Align with existing infrastructure and industry initiatives: don’t reinvent the wheel and don’t get run over
  - Avoid controversial solutions when possible
- Acknowledge and exploit boundaries: don’t need global standards (at least initially) for things that can be bounded to local scopes

Basic grid architecture for HPC
- Use case: batch job scheduling
  - Simplest base case:
    - Job scheduling service for compute resources within an organization
    - Specify what kinds of resources are needed; execute a program on resources obtained
  - Incremental extensions:
    - Provisioning (i.e. program installation & cleanup)
    - Suspension
    - Migration
    - …
- Needed functionality:
  - “Standard” distributed system infrastructure that is not Grid-specific
    - Distributed communications (WS-I)
    - Security
    - Systems management
  - Job scheduling support: i.e. a standard way of interacting with fungible resources
    - Resource reservation
    - Provisioning
    - Execution of programs
  - Data transfer support
    - May be large in volume
  - A means to discover available services

Leveraging existing standards
- Lots of non-Grid-specific things are already being pursued by the wider IT industry
  - Distributed communications
  - Security
  - …
- Should leverage these industry initiatives
  - Don’t reinvent the wheel
  - Don’t get run over
- An example of something not needed for the basic HPC use case: identity/naming support
  - Basic DNS-based names have been very successful and should be used
  - Higher-level naming is still an area in which there is not a single universally accepted solution
  - Avoid the controversy by deferring the issue
  - Expect it to be added as the HPC design evolves and expands

Job scheduling
- Principle of factorization: split resource reservation, provisioning, and execution into separate composable pieces. Examples of benefits gained:
  - QoS and SLAs via repeated use of reserved resources
  - Support for a variety of different provisioning scenarios: pre-installed, “file-copy”, “installer/uninstaller”, …
  - Amortization of provisioning overheads
  - Support for interactive and debugging scenarios
- Interop among existing schedulers:
  - Can’t expect existing schedulers to implement the superset of all their functionalities
- Principle of minimal, evolutionary design:
  - Identify common functional denominators
  - Define simplest possible “universal” scheduling base
  - “Extension” profiles for well-defined, composable functionalities that clients and schedulers selectively interoperate on
    - Two examples: “migration” and “suspension” of programs
  - Must be careful to make framework plus extensions properly composable
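The factorization principle can be illustrated with a minimal sketch, assuming three hypothetical in-process services. The class and method names (`ReservationService`, `ProvisioningService`, `ExecutionService`, and so on) are invented for illustration and do not come from BES, JSDL, or any other specification. The point is only that a client reserves and provisions once, then executes repeatedly against the same reservation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Reservation:
    """Handle on reserved resources (hypothetical; not from any spec)."""
    reservation_id: str
    cpus: int
    provisioned: List[str] = field(default_factory=list)


class ReservationService:
    """Piece 1: reserve resources, independent of what will run on them."""

    def __init__(self) -> None:
        self._next_id = 0

    def reserve(self, cpus: int) -> Reservation:
        self._next_id += 1
        return Reservation(f"res-{self._next_id}", cpus)


class ProvisioningService:
    """Piece 2: make a program available on an existing reservation."""

    def provision(self, reservation: Reservation, program: str) -> None:
        if program not in reservation.provisioned:
            reservation.provisioned.append(program)


class ExecutionService:
    """Piece 3: run an already-provisioned program on reserved resources."""

    def execute(self, reservation: Reservation, program: str, args: List[str]) -> str:
        if program not in reservation.provisioned:
            raise RuntimeError(
                f"{program} is not provisioned on {reservation.reservation_id}"
            )
        return (
            f"{program} {' '.join(args)} on "
            f"{reservation.reservation_id} ({reservation.cpus} cpus)"
        )


# Reserve and provision once, then execute several times on the same resources.
reservations = ReservationService()
provisioning = ProvisioningService()
execution = ExecutionService()

res = reservations.reserve(cpus=4)
provisioning.provision(res, "simulate")
runs = [execution.execute(res, "simulate", [f"--step={i}"]) for i in range(3)]
```

Because the three pieces are separate, the provisioning cost is paid once for all three runs, and a scheduler that only supports pre-installed programs could simply omit the provisioning piece.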

Data transfer
- Base case requires the following things:
  - Means of transferring data between two service endpoints
  - Means of specifying how the transferred data should be interpreted
    - Data format (e.g. files & directories vs. row-sets and tables)
    - Where in the receiving party’s storage name space the data should be placed and how programs can access it
  - These should be separate, composable capabilities
- Example of minimal, evolutionary design principle: no need for a global file system for the base case
  - Not strictly necessary
  - Has significant additional complexity; hence would reduce scope for interop
  - Could eventually become a part of an evolved, extensible HPC design
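One way to picture "separate, composable capabilities" is the sketch below. It assumes an in-memory stand-in for the receiving party's storage name space, and the function names, format labels (`"file"`, `"rowset"`), and paths are all hypothetical. Moving the bytes, interpreting them, and placing the result are kept as three independent steps.

```python
from typing import Callable, Dict

# Capability 1: move opaque bytes between two endpoints.
# Here the "transfer" is a plain copy; a real system would use a wire protocol.
def transfer(payload: bytes) -> bytes:
    return bytes(payload)


# Capability 2: interpret the bytes according to a declared data format.
# The format names are illustrative assumptions, not standard identifiers.
FORMATS: Dict[str, Callable[[bytes], object]] = {
    "file": lambda raw: raw,
    "rowset": lambda raw: [
        line.split(",") for line in raw.decode("utf-8").splitlines()
    ],
}


# Capability 3: place the interpreted data in the receiver's name space.
def place(namespace: Dict[str, object], path: str, value: object) -> None:
    namespace[path] = value


def receive(namespace: Dict[str, object], payload: bytes,
            data_format: str, path: str) -> None:
    """Compose the three capabilities for one incoming transfer."""
    if data_format not in FORMATS:
        raise ValueError(f"unsupported data format: {data_format}")
    place(namespace, path, FORMATS[data_format](transfer(payload)))


storage: Dict[str, object] = {}
receive(storage, b"a,1\nb,2", "rowset", "/jobs/42/input")
receive(storage, b"\x00\x01", "file", "/jobs/42/blob")
```

Nothing in this sketch requires a shared or global file system: the receiver decides where data lands in its own name space, which matches the base case described above.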

Discovering HPC services
- Not strictly necessary: plenty of existing job scheduling systems rely on clients simply “knowing” about them
  - Well-known DNS name(s)
  - Static config files
- Simple directory services enable significantly more powerful discovery paradigm: rendezvous point for dynamic discovery
  - Examples of existing infrastructures already in use: UDDI and LDAP
- Proposed approach:
  - Minimal base case should NOT include any discovery services
  - Incremental addition: add support for one (or more) of the existing directory service infrastructures
    - Reduced cost of entry
    - Increased scope for interop: people are already using (and have invested in) these
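The proposed layering can be sketched as follows, assuming a client that always works from statically configured endpoints and treats a directory as an optional add-on. `InMemoryDirectory` is a stand-in that speaks no real protocol (neither UDDI nor LDAP), and the endpoint names and the `"job-scheduler"` service type are hypothetical.

```python
from typing import Dict, List, Optional, Protocol


class Directory(Protocol):
    """Anything that can act as a rendezvous point for discovery."""

    def lookup(self, service_type: str) -> List[str]: ...


class InMemoryDirectory:
    """Stand-in for a directory service; no real directory protocol is used."""

    def __init__(self) -> None:
        self._entries: Dict[str, List[str]] = {}

    def register(self, service_type: str, endpoint: str) -> None:
        self._entries.setdefault(service_type, []).append(endpoint)

    def lookup(self, service_type: str) -> List[str]:
        return list(self._entries.get(service_type, []))


def find_schedulers(static_endpoints: List[str],
                    directory: Optional[Directory] = None) -> List[str]:
    """Base case: static configuration only. Extension: merge directory results."""
    found = list(static_endpoints)
    if directory is not None:
        for endpoint in directory.lookup("job-scheduler"):
            if endpoint not in found:
                found.append(endpoint)
    return found


# Base case: the client simply "knows" the scheduler's DNS name.
base = find_schedulers(["hpc.example.org"])

# Incremental addition: a directory contributes dynamically registered services.
directory = InMemoryDirectory()
directory.register("job-scheduler", "hpc2.example.org")
directory.register("job-scheduler", "hpc.example.org")
extended = find_schedulers(["hpc.example.org"], directory)
```

A client written against the base case keeps working unchanged when no directory is configured, which is the low cost of entry the slide argues for.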

System management
- Fundamental part of any distributed system, including Grid systems
- Simplest HPC base case:
  - Job scheduling may occur between organizations
  - System management restricted to scope of resources managed by a job scheduling service
    - Opaque to clients
    - Don’t need a system management standard
- Support for virtual organizations will require global system management standards
  - Virtual organizations should be part of an extended HPC use case
  - Avoids controversy by postponing system management questions until the wider industry settles on standards for this area
  - Enables wider scope for interop

Using Web Services Protocols
- This is the way the industry is going; therefore Grid must go the same way
- Not all core Web services specs have settled down
  - Still in the process of becoming standardized
  - Competing sets
- Non-controversial specs:
  - WS-I
  - WS-Addressing
  - …
- Controversial specs:
  - WSRF vs WS-Transfer et al.
  - WS-Notification vs WS-Eventing
  - WSDM vs WS-Management
- HPC design focus should identify the “abstract” operations needed; e.g. “QueryJob”
  - Defer the issue of how to map such operations to competing specifications
    - “WS-I profile”
  - Could also choose to do two separate “sub-profiles” that map to the relevant competing protocol specs
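The idea of defining abstract operations first and mapping them to competing specifications later can be sketched as an interface with interchangeable bindings. Everything here is an assumption for illustration: the two sub-profile classes and the dictionaries they return are placeholders and do not reproduce the message formats of WSRF, WS-Transfer, or any other real specification.

```python
from abc import ABC, abstractmethod
from typing import Dict


class JobQueryBinding(ABC):
    """Maps the abstract QueryJob operation onto one concrete sub-profile."""

    @abstractmethod
    def encode_query_job(self, job_id: str) -> Dict[str, str]:
        ...


class PropertyStyleBinding(JobQueryBinding):
    """Placeholder sub-profile: read one named property of a job resource."""

    def encode_query_job(self, job_id: str) -> Dict[str, str]:
        return {"style": "property-read", "target": job_id, "property": "status"}


class RepresentationStyleBinding(JobQueryBinding):
    """Placeholder sub-profile: fetch the whole representation of a job."""

    def encode_query_job(self, job_id: str) -> Dict[str, str]:
        return {"style": "representation-read", "target": job_id}


class SchedulerClient:
    """Client code depends only on the abstract operation, not the binding."""

    def __init__(self, binding: JobQueryBinding) -> None:
        self._binding = binding

    def query_job(self, job_id: str) -> Dict[str, str]:
        return self._binding.encode_query_job(job_id)


# The same abstract call, bound to two different sub-profiles.
request_a = SchedulerClient(PropertyStyleBinding()).query_job("job-7")
request_b = SchedulerClient(RepresentationStyleBinding()).query_job("job-7")
```

Client logic is written once against `query_job`; choosing or adding a sub-profile only means supplying a different binding.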

Conclusions
- Propose the definition of “vertical profiles” that map protocols and services onto use cases
- Advocate an evolutionary approach
  - Start with a minimal design
  - Incrementally expand via narrowly-scoped, extensible, composable service interaction protocols
- Advocate using only well-established Web services protocols and deferring use of controversial ones until the wider industry sorts them out
- Have illustrated our design approach with the specific instance of a basic HPC use case:
  - Use case: batch job scheduling
  - Outlined design:
    - Relies on existing industry standards, infrastructure, and initiatives
    - Introduces factored job scheduling and data transfer protocols/services
    - Bounds and defers the topic of system management
