http://www.grid2002.org Grid Technology Implications for ACES and SERVOGrid Brisbane Australia June 5 2003 Geoffrey Fox Marlon Pierce Community Grids.


1 Grid Technology Implications for ACES and SERVOGrid
Brisbane Australia June 5 2003
Geoffrey Fox Marlon Pierce
Community Grids Lab Indiana University

2 What is Grid Technology?
Grids support distributed collaboratories or virtual organizations integrating concepts from
  The Web
  Distributed Objects (CORBA, Java/Jini, COM)
  Globus, Legion, Condor, NetSolve, Ninf and other High Performance Computing activities
  Peer-to-peer Networks
With perhaps the Web being the most important for “Information Grids” and Globus for “Compute Grids”
Information Grids are the basis of SERVOGrid
Organization includes People, Computers, Observational Data and results of thought and data processing

3 Paradigms Protocols Platforms and Hosting
We can start from the Web view where the basic Grid paradigm is Meta-data rich Web Services communicating via messages
These have some basic support from some runtime such as .NET, Jini (pure Java), Apache Tomcat+Axis (Web Service toolkit), Enterprise JavaBeans, WebSphere (IBM) or GT3 (Globus Toolkit 3)
  These are the distributed equivalent of operating system functions as in UNIX Shell
  Called Hosting Environment or platform

4 Taxonomy of Grid Functionalities
Name of Grid Type: Description of Grid Functionality
Compute/File Grid: Run multiple jobs with distributed compute and data resources (Global “UNIX Shell”)
Desktop Grid: “Internet Computing” and “Cycle Scavenging” with secure sandbox on large numbers of untrusted computers
Information Grid: Grid service access to distributed information, data and knowledge repositories
Complexity or Hybrid Grid: Hybrid combination of Information and Compute/File Grid emphasizing integration of experimental data, filters and simulations
Campus Grid: Grid supporting University community computing
Enterprise Grid: Grid supporting a company’s enterprise infrastructure
Note: Term Data Grid not used consistently in community so avoided

5 SERVOGrid Caricature
[Figure labels: Repositories and Federated Databases; Database; Sensor Nets; Streaming Data; Loosely Coupled Filters; Closely Coupled Compute Nodes; Analysis and Visualization]

6 SERVOGrid (Complexity) Computing Model
This Type of Grid integrates with Parallel computing
Multiple HPC facilities but only use one at a time
Many simultaneous data sources and sinks
Distributed Filters massage data for simulation
[Figure labels: Data; Filter; OGSA-DAI Grid Services; Grid Data Assimilation; HPC Simulation; Analysis; Control; Visualize; Other Grid and Web Services]

7 Taxonomy of Grid Operational Style
Name of Grid Style: Description of Grid Operational or Architectural Style
Semantic Grid: Integration of Grid and Semantic Web meta-data and ontology technologies
Peer-to-peer Grid: Grid built with peer-to-peer mechanisms
Lightweight Grid: Grid designed for rapid deployment and minimum life-cycle support costs
Collaboration Grid: Grid supporting collaborative tools like the Access Grid, whiteboard and shared applications
R3 or Autonomic Grid: Fault tolerant and self-healing Grid (R3: Robust, Reliable, Resilient)

8 SERVOGrid Grid Requirements
Seamless Access to Data repositories and large scale computers
Integration of multiple data sources including sensors, databases, file systems with analysis system
  Including filtered OGSA-DAI
Rich meta-data generation and access with SERVOGrid specific Schema extending industry standards
Portals with component model for user interfaces and web control of all capabilities
Collaboration to support world-wide work
Basic Grid tools: workflow and notification

9 What is a Web Service I
A web service is a computer program running on either the local or remote machine with a set of well defined interfaces (ports) specified in XML (WSDL)
In principle, the computer program can be in any language (Fortran .. Java .. Perl .. Python) and the interfaces can be implemented in any way whatsoever
Interfaces can be method calls, Java RMI Messages, CGI Web invocations, or totally compiled away (inlining), but
The simplest implementations involve XML messages (SOAP) and programs written in net friendly languages like Java and Python
Web Services separate the meaning of a port (message) interface from its implementation
Enhances/Enables Re-usable component model of ANY electronic resource
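The separation of a port's meaning from its implementation can be illustrated with a small sketch. This is not SOAP or WSDL: the `ScalePort` interface, the two implementations and the `<request>`/`<response>` message layout are all hypothetical names chosen for illustration. The point is only that the caller exchanges XML messages and never depends on how the service computes its answer.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod


class ScalePort(ABC):
    """Hypothetical port: takes an XML request, returns an XML response."""

    @abstractmethod
    def invoke(self, request_xml):
        ...


def _read_values(xml_text):
    # Parse every <value> element of a message into a float.
    return [float(v.text) for v in ET.fromstring(xml_text).findall("value")]


def _write_values(tag, values):
    # Serialize a sequence of numbers as <tag><value>..</value>...</tag>.
    root = ET.Element(tag)
    for v in values:
        ET.SubElement(root, "value").text = repr(float(v))
    return ET.tostring(root, encoding="unicode")


class LoopScaler(ScalePort):
    """One implementation: doubles each value with an explicit loop."""

    def invoke(self, request_xml):
        doubled = []
        for v in _read_values(request_xml):
            doubled.append(2.0 * v)
        return _write_values("response", doubled)


class MapScaler(ScalePort):
    """Another implementation of the same port, written differently."""

    def invoke(self, request_xml):
        return _write_values("response", (v + v for v in _read_values(request_xml)))


def call_service(port, values):
    # The client only knows the port and the message format.
    reply = port.invoke(_write_values("request", values))
    return _read_values(reply)


print(call_service(LoopScaler(), [1, 2.5]))
print(call_service(MapScaler(), [1, 2.5]))
```

Either implementation can be swapped in behind the port without changing `call_service`, which is the re-usable component property the slide refers to.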

10 [Figure labels: Raw Resources; Raw Data; (Virtual) XML Data Interface; Web Service (WS); XML WS to WS Interfaces; (Virtual) XML Knowledge (User) Interface; (Virtual) XML Rendering Interface; Render to XML Display Format; Clients]

11 What are System and Application Services?
There are generic Grid system services: security, collaboration, workflow, notification
  OGSA (Open Grid Service Architecture) is implementing these as extended Web Services
An Application Web Service is a capability used either by another service or by a user
  It has input and output ports – data is from sensors or other services
Consider Satellite-based Sensor Operations as a Web Service
  Satellite management (with a web front end)
  Each tracking station is a service
  Image Processing is a pipeline of filters – which can be grouped into different services
  Data storage is an important system service
  Big services built hierarchically from “basic” services
Portals are the user (web browser) interfaces to Web services

12 Application Web Services
Note Service model integrates sensors, sensor analysis, simulations and people
An Application Web Service is a capability used either by another service or by a user
  It has input and output ports – data is from users, sensors or other services
Big services built hierarchically using workflow from “basic” services
[Figure: Workflow builds as multiple Filter Web Services (Filter1 WS, Filter2 WS, Filter3 WS) or as multiple interdisciplinary Programs (Prog1 WS, Prog2 WS; Data Analysis WS, Simulation WS, Visualization WS)]
[Figure labels: Sensor Data as a Web service (WS); Sensor Management WS; Data Analysis WS; Simulation WS; Visualization WS]

13 Grid Politics
There is a Global Grid Forum meeting 3 times per year with about 700 attendees per meeting
  Exchange information and define standards for “everything” not done in W3C and OASIS e.g. Grid Service, Security, What is a Job, Database, Computer, How to build portals ….
There is a large project called Globus developing software largely for “compute/file” Grids
There are some 50 Grid projects (mainly in Europe and USA) developing software and applications as well as installing infrastructure
  Some are “deployment”: EDG NMI VDT …..
There are related initiatives called CyberInfrastructure (NSF USA) and e-Science (UK)
There is a proposed OMII (Open Middleware Infrastructure Institute) – an international Alliance of separately funded projects with common coordination

14 OGSA/OGSI Top Level View
OGSA is the set of “core” Grid services
  Stuff you can’t live without
  If you built a Grid you would need to invent these things
[Figure labels: Network; Hosting Environment; Web Services and OGSI; Broadly applicable services: registry, authorization, monitoring, data access, etc.; More specialized services: data replication, workflow, etc.; Domain-specific services; Models for resources & other entities; Other models]

15 OGSI Open Grid Service Interface
It is a “component model” for web services.
It defines a set of behavior patterns that each OGSI service must exhibit.
Every “Grid Service” portType extends a common base type.
  Defines an introspection model for the service
  You can query it (in a standard way) to discover
    What methods/messages a port understands
    What other port types does the service provide?
    If the service is “stateful” what is the current state?
A set of standard portTypes for
  Message subscription and notification
  Service collections
Each service is identified by a URI called the “Grid Service Handle” (GSH)
GSHs are bound dynamically to Grid Service References (GSRs, typically WSDL docs)
  A GSR may be transient. GSHs are fixed.
  Handle map services translate GSHs into GSRs.
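The handle-to-reference indirection described above can be sketched as a lookup table. This is a toy model, not the OGSI HandleMap portType: the `HandleMap` class, its `bind`/`resolve` methods and the example.org URLs are hypothetical. It only shows that a client holds a fixed GSH while the GSR behind it may change.

```python
class HandleMap:
    """Toy handle map: fixed handles (GSH) to rebindable references (GSR)."""

    def __init__(self):
        self._bindings = {}

    def bind(self, gsh, gsr):
        # Binding an existing handle again replaces its (transient) reference.
        self._bindings[gsh] = gsr

    def resolve(self, gsh):
        try:
            return self._bindings[gsh]
        except KeyError:
            raise LookupError(f"no reference bound to handle {gsh!r}") from None


handle_map = HandleMap()
GSH = "http://example.org/handles/fault-db"  # hypothetical, never changes

handle_map.bind(GSH, "http://host-a.example.org/fault-db?wsdl")
first = handle_map.resolve(GSH)

# The service moves to another host; only the binding is updated.
handle_map.bind(GSH, "http://host-b.example.org/fault-db?wsdl")
second = handle_map.resolve(GSH)

print(first)
print(second)
```

A client that stored only `GSH` keeps working after the move, because it resolves the handle again instead of caching the old reference.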

16 Two-level Programming I
The paradigm implicitly assumes a two-level Programming Model
We make a Service (same as a “distributed object” or “computer program” running on a remote computer) using conventional technologies
  C++, Java or Fortran Monte Carlo module
  Data streaming from a sensor or Satellite
  Specialized (JDBC) database access
Such nuggets accept and produce data from users, files and databases
The Grid is built by coordinating such nuggets assuming we have solved the problem of programming the nugget
[Figure labels: Nugget; Data]

17 Two-level Programming II
The Grid is discussing the linkage and distribution of the nuggets, with the only addition being runtime interfaces to the Grid as opposed to UNIX data streams
Familiar from use of UNIX Shell, PERL or Python scripts to produce real applications from core programs
Such interpretative environments are the single processor analog of Grid Programming and this tends to be called workflow
Workflow is the composition of multiple services (programs) together to make a new service
  Includes “Software Bus”, “Application Integration”, “Co-ordination Languages” etc.
[Figure labels: Nugget1; Nugget2; Nugget3; Nugget4]
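The idea that workflow composes several services into a new service can be shown on a single processor, in the spirit of the shell-script analogy. In this sketch the three "nuggets" (`remove_missing`, `detrend`, `peak`) and the `compose` helper are hypothetical; real Grid workflow would link remote services rather than local functions.

```python
from functools import reduce


def compose(*services):
    """Chain services so the output of each feeds the next; the result is itself a service."""

    def workflow(data):
        return reduce(lambda acc, service: service(acc), services, data)

    return workflow


# Three hypothetical nuggets, each programmed independently.
def remove_missing(samples):
    return [s for s in samples if s is not None]


def detrend(samples):
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]


def peak(samples):
    return max(samples)


# The composed pipeline can be called, or composed again, like any nugget.
clean = compose(remove_missing, detrend)
pipeline = compose(clean, peak)

print(pipeline([1.0, None, 2.0, 6.0]))
```

Because `compose` returns something with the same calling convention as its inputs, big services can be built hierarchically from basic ones.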

18 Workflow
Workflow has at least 4 parts
  “Programming Environment” – typically GUI to drag and drop services and their linkages (familiar from AVS etc. which was workflow for visualization)
  Language – from XML to extended Python
  Compiler – converting Language into executable
  Runtime controlling flow of information and notification events
Can use Python, Mathematica, Matlab, JavaSpaces, IBM BPEL4WS, DoE CCA etc.
Don’t think current systems are very near “what we will want” but expect much progress over next 3 years and plenty of systems to work with
Metadata critical to tell you how to combine services in a sensible way – so workflow engines must interface with metadata service

19 e-Science and the Data Deluge
Particle Physics
2006/7: First pp collisions at TeV energies at the Large Hadron Collider at CERN in Geneva
ATLAS/CMS Experiments involve physicists from 200 organizations in US, EU, Asia
Need to store, access, process, analyse 10 Petabytes/yr with 200 Teraflop/s distributed computation
Building hierarchical Grid infrastructure to distribute data and computation
Many 10’s of million $ funding for global particle physics Grid – GriPhyN, PPDataGrid, iVDGL, EU DataGrid, EU DataTag, UK GridPP projects
Need Exabytes and Petaflop/s by 2015
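To give a feel for the 10 Petabytes/yr figure, it can be converted into an average sustained data rate. The sketch below assumes decimal units (1 PB = 10^15 bytes, 1 MB = 10^6 bytes) and a 365-day year; the slide does not state which convention it uses, and the helper name is made up for illustration.

```python
PETABYTE = 10**15                    # assumption: decimal petabyte
MEGABYTE = 10**6                     # assumption: decimal megabyte
SECONDS_PER_YEAR = 365 * 24 * 3600   # 31,536,000 s


def sustained_rate_mb_per_s(petabytes_per_year):
    """Average rate needed to move a yearly volume continuously."""
    return petabytes_per_year * PETABYTE / SECONDS_PER_YEAR / MEGABYTE


lhc_rate = sustained_rate_mb_per_s(10)
print(f"{lhc_rate:.0f} MB/s")  # about 317 MB/s
```

Under these assumptions the yearly volume corresponds to roughly 0.3 GB/s averaged around the clock, before any replication across the hierarchy of sites.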

20 Astronomy and its Data Deluge
Virtual Observatories – NVO, AVO, AstroGrid
  Store all wavelengths, need distributed joins
  NVO 500 TB/yr from 2004
Laser Interferometer Gravitational-Wave Observatory (LIGO)
  Search for direct evidence for gravitational waves
  LIGO 250 TB/yr, random streaming from 2002
VISTA Visible and IR Survey Telescope in 2004
  250 GB/night, 100 TB/yr, Petabytes in 10 yrs
New phase of astronomy, storing, searching and analysing Petabytes of data
[Figure: The total area of astronomical telescopes in m², and CCDs measured in Gigapixels, over the last 25 years. The number of pixels and the data double every year.]

21 Engineering, Chemistry, Environmental BioInformatics and Medical Applications
Real-Time Industrial Health Monitoring
  UK DAME project for Rolls Royce Aero Engines
  1 GB sensor data/flight, 100,000 engine hours/day
Combinatorial Chemistry – experiments on demand
Earth Observation
  ESA satellites generate 100 GB/day
  NASA 15 PB by 2007
Bioinformatics
  Tens of TB of high value curated data
Medical Images to Information
  100 MB/mammogram, UK 3M/yr, US 26M/yr
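The medical imaging numbers above imply yearly archive volumes that the slide leaves unstated. A short calculation makes them explicit, assuming decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and that every mammogram is stored at the quoted 100 MB; the function name is illustrative only.

```python
MB = 10**6   # assumption: decimal megabyte
TB = 10**12  # assumption: decimal terabyte


def yearly_volume_tb(bytes_per_image, images_per_year):
    """Total yearly volume in whole terabytes."""
    return bytes_per_image * images_per_year // TB


uk_tb = yearly_volume_tb(100 * MB, 3_000_000)    # UK: 3M mammograms/yr
us_tb = yearly_volume_tb(100 * MB, 26_000_000)   # US: 26M mammograms/yr

print(uk_tb, "TB/yr for the UK")   # 300 TB/yr
print(us_tb, "TB/yr for the US")   # 2600 TB/yr, i.e. 2.6 PB/yr
```

Under these assumptions a national mammography programme alone lands in the same hundreds-of-terabytes-per-year range quoted for the astronomy surveys.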

22 Importance of Metadata
Metadata is ‘data about data’ e.g. catalogues, indices, directory structures
Librarians work with books which have the same basic ‘schema’ e.g. title, author(s), publisher, date, etc.
Need for hierarchical, community-based approach to defining metadata and schemas e.g. CML, SERVOGridML ……..
Metadata important for interoperability of databases/federated archives, and for construction of intelligent search agents
Grid and Semantic Web communities should provide core infrastructure for generating, storing and accessing meta-data

23 Simulation Output as Digital Library
Digital Libraries usually for archiving of text, audio and video data
Scientific data require transformation, data-mining and visualisation tools
For distributed collaborations need simulation output to be available as new kind of digital library, complete with catalogues and finding aids as well as data itself

24 Emergence of a new research methodology?
Traditional scientific methodologies are theory and experiment
Last half of 20th century saw emergence of scientific simulation as a third methodology
This century will see emergence of a fourth methodology - collection-based research
  Scientists will reduce, mine and sift data ‘published’ in ways not possible with paper journals with their tables and graphs

25 OGSA-DAI (Malcolm Atkinson Edinburgh) UK e-Science Grid Core Programme
Development of Data Access and Integration Services for OGSA
  Access to XML Databases
  Access to Relational Databases
  Distributed Query Processing
  XML Schema Support for e-Science
Project details

26 DAI Key Services
GridDataService (GDS): Access to data & DB operations
GridDataServiceFactory (GDSF): Makes GDS & GDSF
GridDataServiceRegistry (GDSR): Discovery of GDS(F) & Data
GridDataTranslationService (GDTS): Translates or Transforms Data
GridDataTransportDepot (GDTD): Data transport with persistence
Integrated Structured Data Transport
Relational & XML models supported
Role-based Authorisation
Binary structured files (later)

27 Interface transparency: one GDS supports multiple database types
[Figure: several Clients access one Grid Data Service, which fronts a Relational database, an XML database and a Directory / File system]
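Interface transparency can be sketched with one service class in front of two different stores. This is not the OGSA-DAI API: the `GridDataService` class here, its `region_of` method, the backend classes and the station data are all hypothetical, with an in-memory SQLite table standing in for a relational database and an XML string for an XML database.

```python
import sqlite3
import xml.etree.ElementTree as ET


class RelationalBackend:
    """Stand-in relational store (in-memory SQLite)."""

    def __init__(self, rows):
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE stations (name TEXT PRIMARY KEY, region TEXT)")
        self._db.executemany("INSERT INTO stations VALUES (?, ?)", rows)

    def lookup(self, name):
        row = self._db.execute(
            "SELECT region FROM stations WHERE name = ?", (name,)
        ).fetchone()
        return row[0] if row else None


class XmlBackend:
    """Stand-in XML store (a parsed document)."""

    def __init__(self, document):
        self._root = ET.fromstring(document)

    def lookup(self, name):
        for station in self._root.findall("station"):
            if station.get("name") == name:
                return station.get("region")
        return None


class GridDataService:
    """Clients call the same method whatever the database type behind it."""

    def __init__(self, backend):
        self._backend = backend

    def region_of(self, name):
        return self._backend.lookup(name)


relational = GridDataService(RelationalBackend([("S1", "Queensland")]))
xml_based = GridDataService(XmlBackend('<stations><station name="S1" region="Queensland"/></stations>'))

print(relational.region_of("S1"), xml_based.region_of("S1"))
```

The client code is identical for both stores; only the object passed to the service constructor differs.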

28 Integration of Data and Filters
One has the OGSA-DAI Data repository interface combined with WSDL of the (Perl, Fortran, Python …) filter
User only sees WSDL not data syntax
Some non-trivial issues as to where the filtering compute power is
  Microsoft says filter next to data
[Figure labels: DB; Filter; WSDL of Filter; OGSA-DAI Interface]

29 OGSA OGSI & Hosting Environments
Start with Web Services in a hosting environment
Add OGSI to get a Grid service and a component model
Add OGSA to get Interoperable Grid “correcting” differences in base platform and adding key functionalities
[Figure labels: Network; Hosting Environment for WS; OGSI on Web Services; Broadly applicable services: registry, authorization, monitoring, data access, etc.; More specialized services: data replication, workflow, etc.; Domain-specific services; Models for resources & other entities; Other models; OGSA Environment; Possibly OGSA; Not OGSA; Given to us from on high]

30 Permeating Principles and Policies
Meta-data rich Message-linked Web Services as the permeating paradigm
“User” Component Model such as “Enterprise JavaBean (EJB)” or .NET.
Service Management framework including a possible Factory mechanism
High level Invocation Framework describing how you interact with system components. This could for example be used to allow the system to be built from either W3C or GGF style (OGSI) Web Services and to protect the user from changes in their specifications.
Security is a service but the need for fine grain selective authorization encourages
Policy context that sets the rules for each particular Grid. Currently OGSA supports policies for routing, security and resource use.
The Grid Fabric or set of resources needs mechanisms to manage them. This includes automatic recording of meta-data and configuration of software.
Quality of service (QoS) for the Network and this implies performance monitoring and bandwidth reservation services. Challenging as end-to-end and not just backbone QoS is needed.
Messaging systems like MQSeries from IBM provide robustness from asynchronous delivery and can abstract destination and allow customization of content such as converting between different interface specifications.
Messaging is built on transport mechanisms which can be used to support mechanisms to implement QoS and to virtualize ports

31 Virtualization
The Grid could and sometimes does virtualize various concepts
Location: URI (Universal Resource Identifier) virtualizes URL
Replica management (caching) virtualizes file location generalized by GriPhyN virtual data concept
Protocol: message transport and WSDL bindings virtualize transport protocol as a QoS request
P2P or Publish-subscribe messaging virtualizes matching of source and destination services
Semantic Grid virtualizes Knowledge as a meta-data query
Brokering virtualizes resource allocation
Virtualization implies references can be indirect
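Replica management is one concrete case of an indirect reference: a client asks for a logical file name and is handed whichever physical copy is usable. The sketch below is a hypothetical in-memory catalog, not the interface of any Globus or GriPhyN component; the class, method names and URLs are assumptions made for illustration.

```python
class ReplicaCatalog:
    """Toy catalog mapping a logical file name to its physical replicas."""

    def __init__(self):
        self._replicas = {}

    def register(self, logical_name, physical_url):
        self._replicas.setdefault(logical_name, []).append(physical_url)

    def resolve(self, logical_name, is_reachable):
        # Return the first registered replica that the caller can reach.
        for url in self._replicas.get(logical_name, []):
            if is_reachable(url):
                return url
        raise LookupError(f"no reachable replica for {logical_name!r}")


catalog = ReplicaCatalog()
catalog.register("lfn:gps-2003-06.dat", "gsiftp://site-a.example.org/data/gps-2003-06.dat")
catalog.register("lfn:gps-2003-06.dat", "gsiftp://site-b.example.org/cache/gps-2003-06.dat")

# Pretend site-a is down: the same logical name now resolves to site-b.
site_a_down = lambda url: "site-a" not in url
print(catalog.resolve("lfn:gps-2003-06.dat", site_a_down))
```

The caller never hard-codes a site, so copies can be added, moved or lost without changing client code.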

32 Interfaces and Functionality and Semantics I
The Grid platform tries to minimize detail in protocols and maximize detail in interfaces to enhance scaling
However rich meta-data and semantics are critical for correct and interesting operation
  Put as much semantic interpretation as you can into specific services
  Lack of Semantic interoperation is in fact main weakness of today’s Grids and Web services
Everything becomes a service whether system or application level
There are some very important “Global Services”
  Discovery (look up) and Registration of service metadata
  Workflow
  MetaSchedulers

33 Interfaces and Functionality and Semantics II
There are many other generally important services
  OGSA-DAI The Database Service
  Portal Service linked to by WSRP (Web Services for Remote Portlets)
  Notification of events
  Job submission
  Provenance – interpret meta-data about history of data
  File Interfaces
  Sensor service – satellites …
  Visualization
  Basic brokering/scheduling

34 Web Services as a Portlet
Each Web Service naturally has a user interface specified as “just another port”
  Customizable for universal access
This gives each Web Service a Portlet view specified (in XML as always) by WSRP (Web Services for Remote Portlets)
So component model for resources “automatically” gives a component model for user interfaces
  When you build your application, you define portlet at same time
[Figure labels: Application as a WS; Application or Content source; General Application Ports; Interface with other Web Services; WSDL; Web Service; WSRP; User Face of Web Service; WSRP Ports define WS as a Portlet; Web Services have other ports (Grid Service) to be OGSI compliant]

35 Online Knowledge Center built from Portlets
A set of UI Components
Web Services provide a component model for the middleware (see large “common component architecture” effort in Dept. of Energy)
Should match each WSDL component with a corresponding user interface component
Thus one “must use” a component model for the portal with again an XML specification (portalML) of portal component

36 Sample page with several portlets: proxy credential manager, submission, monitoring

37 Administer Grid Portal
Provide information about application and host parameters
Select application to edit

38 Categories of Worldwide Grid Services to be exploited by SERVOGrid
1) Types of Grid
  R3
  Lightweight
  P2P
  Federation and Interoperability
2) Core Infrastructure and Hosting Environment
  Service Management
  Component Model
  Service wrapper/Invocation
  Messaging
3) Security Services
  Certificate Authority
  Authentication
  Authorization
  Policy
4) Workflow Services and Programming Model
  Enactment Engines (Runtime)
  Languages and Programming
  Compiler
  Composition/Development
5) Notification Services
6) Metadata and Information Services
  Basic including Registry
  Semantically rich Services and meta-data
  Information Aggregation (events)
  Provenance
7) Information Grid Services
  OGSA-DAI/DAIT
  Integration with compute resources
  P2P and database models
8) Compute/File Grid Services
  Job Submission
  Job Planning Scheduling Management
  Access to Remote Files, Storage and Computers
  Replica (cache) Management
  Virtual Data
  Parallel Computing
9) Other services including
  Grid Shell
  Accounting
  Fabric Management
  Visualization
  Data-mining and Computational Steering
  Collaboration
10) Portals and Problem Solving Environments
11) Network Services
  Performance
  Reservation
  Operations

39 What should SERVOGrid do ?
Make use of Grid technologies and architecture from around the world
Coordinate with broad community through Global Grid Forum and OMII
Decide on domain specific standards SERVOGridML
Agree on particular approach within choices in international suite (use GT3 or not?, use portlets or not?, choose meta-data technology) and define SERVOGrid community practice
Develop software system infrastructure and applications specific to solid earth science
Worry about network interconnection between earthquake scientists and sensors

40 Proposed OMII Activities: Central Gaps: Gaps in Grid Styles and Execution Environment
Need for both robust (fault tolerant) and lightweight (suitable for small groups) Grid styles identified
Peer-to-peer style supports smaller decentralized virtual organizations
Note opportunities for modern middleware ideas to be used – lightweight, message-based
  Note that Enterprise JavaBeans not optimized for Science which has high volume dataflow
Federated Grid Architecture natural for integration of heterogeneous functionality, style and security
Bioinformatics and other fields require integration of Information and Compute/File Grids

41 Overlapping Heterogeneous Dynamic Grid Islands
[Figure labels: Dynamic light-weight Peer-to-peer Collaboration Training Grid; Enterprise Grid; Information Grid; Compute Grid; Campus Grid; Students; Teacher; R1; R2]

42 (a) Layered OGSA Grid and (b) Federated OGSA Grid
[Figure labels, (a) Layered OGSA Grid: Core Service; Application; OGSA Interface]
[Figure labels, (b) Federated OGSA Grid: Grid-1 and Grid-2, each with Core Service and Appl.; OGSA or non OGSA Interface-1; OGSA or non OGSA Interface-2; OGSA Mediation]

43 Many Gaps in Generic Services
Some gaps like Workflow and Notification are to make production versions of current projects
  Just in UK workflow from DAME, DiscoveryNet, EDG, Geodise, ICENI, myGrid, Unicore plus Cardiff, NEReSC ….
RGMA and Semantic Grid offer improved meta-data and Information services compared to UDDI and MDS (Globus)
  Need comprehensive federated Information service
Security requires architecture supporting dynamic fine-grain authorization
UK e-Science has pioneered Information Grids but gap is continuation of OGSA-DAI, integration with other services and P2P decentralized models
Functionality of Compute/File Grids quite advanced but services probably not robust enough for LCG or Campus Grids

44 Gaps in Other Grid services
Portals and User Interfaces – Noted gap that many not using Grid Computing Environment “best practice” with component based user-interfaces matching component-based middleware
Programming Models (using workflow runtime)
Fabric Management (should be integrated with central service management and Information system), Computational Steering, Visualization, Datamining, Accounting, Gridmake, Debugging, Semantic Grid tools (consistent with Information system), Collaboration, provenance
Application-specific services
Note new production central Infrastructure can support both research and production services of this type

