E-Business e-Science and the Grid Geoffrey Fox Professor of Computer Science, Informatics, Physics Pervasive Technology Laboratories Indiana University.


1 e-Business e-Science and the Grid Geoffrey Fox Professor of Computer Science, Informatics, Physics Pervasive Technology Laboratories Indiana University Bloomington IN 47401 Chief Technologist for Anabas Corporation gcf@indiana.edu http://www.infomall.org http://www.grid2002.org

2 Grid Computing: Making The Global Infrastructure a Reality Based on work done in preparing book edited with Fran Berman and Anthony J.G. Hey, ISBN: 0-470-85319-0 Hardcover 1080 Pages Published March 2003 http://www.grid2002.org

3 e-Business e-Science and the Grid e-Business captures an emerging view of corporations as dynamic virtual organizations linking employees, customers and stakeholders across the world. The growing use of outsourcing is one example. e-Science is the similar vision for scientific research with international participation in large accelerators, satellites or distributed gene analyses. The Grid integrates the best of the Web, traditional enterprise software, high performance computing and Peer-to-peer systems to provide the information technology infrastructure for e-moreorlessanything. A deluge of data of unprecedented and inevitable size must be managed and understood. People, computers, data and instruments must be linked. On demand assignment of experts, computers, networks and storage resources must be supported.

4 So what is a Grid? Supporting human decision making with a network of at least four large computers, perhaps six or eight small computers, and a great assortment of disc files and magnetic tape units - not to mention remote consoles and teletype stations - all churning away. (Licklider 1960) Coordinated resource sharing and problem solving in dynamic multi-institutional virtual organizations. Infrastructure that will provide us with the ability to dynamically link together resources as an ensemble to support the execution of large-scale, resource-intensive, and distributed applications. Realizing the thirty-year dream of science fiction writers who have spun yarns featuring worldwide networks of interconnected computers that behave as a single entity.

5 e-Science e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it. This is a major UK Program. e-Science reflects the growing importance of international laboratories, satellites and sensors and their integrated analysis by distributed teams. CyberInfrastructure is the analogous US initiative. Grid Technology supports e-Science and CyberInfrastructure.

6 Global Terabit Research Network The Grid software and resources run on top of high performance global networks

7 Resources-on-demand Computing-on-demand uses a dynamically assigned (shared) pool of resources to support excess demand in a flexible, cost-effective fashion. [Diagram: Static Assignment with redundancy: Program A on Computer 1 through Program Z on Computer 26, repeated as Spares with Program A on Computer 27 through Program Z on Computer 52. Dynamic on-demand Assignment: Program A through Program Z share Pool Computer 1 through Pool Computer N, with N < 52.]
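The contrast between static assignment with spares and a shared on-demand pool can be sketched in a few lines of Python. This is only an illustration: the `ComputerPool` class, the one-spare-per-program rule, and the pool size of 30 are assumptions chosen to match the 26-program, 52-computer picture on the slide, not anything specified in the talk.

```python
from dataclasses import dataclass, field


def static_computers_needed(programs, spares_per_program=1):
    """Static assignment: every program owns one computer plus its own spare(s)."""
    return len(programs) * (1 + spares_per_program)


@dataclass
class ComputerPool:
    """Hypothetical shared pool that leases computers to programs on demand."""
    size: int
    free: list = field(default_factory=list)
    leases: dict = field(default_factory=dict)

    def __post_init__(self):
        # Computers are numbered 1..size, as in the slide's diagram.
        self.free = list(range(1, self.size + 1))

    def acquire(self, program):
        if not self.free:
            raise RuntimeError("pool exhausted")
        computer = self.free.pop(0)
        self.leases.setdefault(program, []).append(computer)
        return computer

    def release(self, program, computer):
        self.leases[program].remove(computer)
        self.free.append(computer)


programs = [chr(ord("A") + i) for i in range(26)]   # Program A .. Program Z
static_total = static_computers_needed(programs)    # 26 programs x (1 + 1 spare)

pool = ComputerPool(size=30)                        # assumed N < 52
for name in programs:
    pool.acquire(name)                              # baseline: one computer each
extra = pool.acquire("A")                           # Program A hits excess demand
```

In this sketch the spare capacity is shared, so only the programs that actually see excess demand draw on it; a real scheduler would also need priorities, queuing and failure handling.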

8 e-Business and (Virtual) Organizations Enterprise Grid supports information system for an organization; includes “university computer center”, “(digital) library”, sales, marketing, manufacturing … Outsourcing Grid links different parts of an enterprise together (Gridsourcing): manufacturing plants with designers; animators with electronic game or film designers and producers; coaches with aspiring players (e-NCAA or e-NFL etc.). Customer Grid links businesses and their customers as in many web sites such as amazon.com. e-Multimedia can use secure peer-to-peer Grids to link creators, distributors and consumers of digital music, games and films while respecting rights. Distance education Grid links teacher at one place, students all over the place, mentors and graders; shared curriculum, homework, live classes …

9 e-Defense and e-Crisis Grids support Command and Control and provide Global Situational Awareness. Link commanders and frontline troops to themselves and to archival and real-time data; link to what-if simulations. Dynamic heterogeneous wired and wireless networks. Security and fault tolerance essential. System of Systems; Grid of Grids: the command and information infrastructure of each ship is a Grid; each fleet is linked together by a Grid; the President is informed by and informs the national defense Grid. Grids must be heterogeneous and federated. Crisis Management and Response enabled by a Grid linking sensors, disaster managers, and first responders with decision support.

10 Some Important Classes of Grids Computational Grids were the origin of the concepts and link computers across the globe; high latency stops this from being used as a parallel machine. Knowledge and Information Grids link sensors and information repositories as in Virtual Observatories or BioInformatics (more detail on next slide). Education Grids link teachers, learners and parents as a VO with learning tools, distant lectures etc. e-Science Grids link multidisciplinary researchers across laboratories and universities. Community Grids involve large numbers of peers rather than linking major resources; they link Grid and Peer-to-peer network concepts. Semantic Grid links the Grid and AI communities with Semantic Web (ontology/meta-data enriched resources) and Agent concepts.

11 Information/Knowledge Grids Distributed data sources (10’s to 1000’s): instruments, file systems, curated databases … Data Deluge: 1 (now) to 100’s of petabytes/year (2012); Moore’s law for Sensors. Possible filters assigned dynamically (on-demand): run image processing algorithm on telescope image; run gene sequencing algorithm on compiled data. Needs decision support front end with “what-if” simulations. Metadata (provenance) critical to annotate data. Integrate across experiments as in multi-wavelength astronomy. Data Deluge comes from pixels/year available.

12 2.4 Petabytes Today

13 SERVOGrid – Solid Earth Research Virtual Observatory will link Australia, Japan, USA …… [Diagram labels: Repositories; Federated Databases; Database; Sensor Nets; Streaming Data; Loosely Coupled Filters; Closely Coupled Compute Nodes; Analysis and Visualization]

14 DAME: Distributed Aircraft Maintenance Environment (Rolls Royce and UK e-Science Program). ~ Gigabyte per aircraft per Engine per transatlantic flight; ~5000 engines. [Diagram labels: In flight data; Ground Station; Global Network such as SITA; Airline Maintenance Centre; Engine Health (Data) Center; Internet, e-mail, pager]

15 NASA Aerospace Engineering Grid It takes a distributed virtual organization to design, simulate and build a complex system like an aircraft

16 Virtual Observatory Astronomy Grid Integrate Experiments [Image panels: Radio; Far-Infrared; Visible; Visible + X-ray; Dust Map; Galaxy Density Map]

17 e-Chemistry Laboratory Experiments-on-demand Grid Resources Grid-enabled Output Streams

18 CERN LHC Data Analysis Grid

19 Typical Grid Architecture [Diagram labels: Raw (HPC) Resources; Middleware; Database; Portal Services; System Services; Application Service; System Services; User Services; “Core” Grid]

20 SERVOGrid Requirements Seamless access to data repositories and large scale computers. Integration of multiple data sources including sensors, databases and file systems with analysis system, including filtered OGSA-DAI (Grid database access). Rich meta-data generation and access with SERVOGrid specific Schema extending openGIS (Geography as a Web service) standards and using Semantic Grid. Portals with component model for user interfaces and web control of all capabilities. Collaboration to support world-wide work. Basic Grid tools: workflow and notification.

21 Sources of Grid Technology Grids support distributed collaboratories or virtual organizations integrating concepts from: the Web; Agents; Distributed Objects (CORBA, Java/Jini, COM); Globus, Legion, Condor, NetSolve, Ninf and other High Performance Computing activities; Peer-to-peer Networks. The Web and P2P networks are perhaps the most important for “Information Grids” and Globus for “Compute Grids”.

22 The Essence of Grid Technology? We will start from the Web view and assert that the basic paradigm is meta-data rich Web Services communicating via messages. These have some basic support from some runtime such as .NET, Jini (pure Java), Apache Tomcat+Axis (Web Service toolkit), Enterprise JavaBeans, WebSphere (IBM) or GT3 (Globus Toolkit 3). These runtimes are the distributed equivalent of operating system functions as in the UNIX Shell, and are called the Hosting Environment or platform. W3C standard WSDL defines IDL (Interface standard) for Web Services.

23 A typical Web Service In principle, services can be in any language (Fortran, Java, Perl, Python) and the interfaces can be method calls, Java RMI messages, CGI Web invocations, or totally compiled away (inlining). The simplest implementations involve XML messages (SOAP) and programs written in net friendly languages like Java and Python. [Diagram labels: Portal Service; Security; Catalog; Payment; Credit Card; Warehouse; Shipping control; WSDL interfaces; Web Services]

24 Services and Distributed Objects A web service is a computer program running on either the local or remote machine with a set of well defined interfaces (ports) specified in XML (WSDL). Web Services (WS) have many similarities with Distributed Object (DO) technology but there are some (important) technical and religious points (not easy to distinguish). CORBA, Java and COM are typical DO technologies. Agents are typically SOA (Service Oriented Architecture). Both involve distributed entities but Web Services are more loosely coupled. WS interact with messages; DO with RPC (Remote Procedure Call). DO have “factories”; WS manage instances internally, and interaction-specific state is not exposed and hence need not be managed. DO have explicit state (stateful services); WS use context in the messages to link interactions (stateful interactions). Claim: DO’s do NOT scale; WS build on experience (with CORBA) and do scale.
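The idea of "stateful interactions" linked by context in the messages can be illustrated with a minimal Python sketch. Everything here is hypothetical (the `CartService` class, the `context` field, the `add`/`list` actions); it only shows a service that keeps instance state internal and lets callers continue an interaction by echoing a context token, rather than holding a reference to a remote object.

```python
import uuid


class CartService:
    """Hypothetical message-based service: state stays internal, keyed by context."""

    def __init__(self):
        self._sessions = {}  # context id -> list of items; never exposed directly

    def handle(self, message):
        # A message without a context starts a new interaction.
        context = message.get("context") or str(uuid.uuid4())
        items = self._sessions.setdefault(context, [])
        action = message["action"]
        if action == "add":
            items.append(message["item"])
        elif action != "list":
            return {"context": context, "error": "unknown action: " + action}
        # The reply carries the context so the caller can link the next message.
        return {"context": context, "items": list(items)}


service = CartService()
first = service.handle({"action": "add", "item": "book"})
second = service.handle(
    {"action": "add", "item": "pen", "context": first["context"]}
)
other = service.handle({"action": "list"})  # a separate, unrelated interaction
```

The caller only ever sees plain messages; whether the service stores sessions in memory, in a database, or on another machine is invisible to it, which is one way to read the slide's "more loosely coupled" claim.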

25 Details of Web Service Protocol Stack UDDI finds where programs are (not a great success); remote (distributed) programs are just Web Services. WSFL links programs together (under revision as BPEL4WS). WSDL defines interface (methods, parameters, data formats). SOAP defines structure of message including serialization of information. HTTP is negotiation/transport protocol. TCP/IP is layers 3-4 of OSI. Physical Network is layer 1 of OSI. [Stack diagram, top to bottom: UDDI or WSIL; WSFL; WSDL; SOAP or RMI; HTTP or SMTP or IIOP or RMTP; TCP/IP; Physical Network]
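To make the SOAP layer of this stack concrete, here is a small Python sketch that builds and parses a SOAP 1.1 envelope using only the standard library. The envelope namespace is the standard SOAP 1.1 one; the application namespace `http://example.org/shipping`, the `GetShippingStatus` operation and the `orderId` parameter are invented for illustration. In a real deployment the WSDL would define the operation and a transport such as HTTP would carry the message.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.org/shipping"                 # hypothetical service namespace


def build_request(operation, params):
    """Serialize an operation call as a SOAP envelope (no header, body only)."""
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    call = ET.SubElement(body, "{%s}%s" % (APP_NS, operation))
    for name, value in params.items():
        ET.SubElement(call, "{%s}%s" % (APP_NS, name)).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Recover the operation name and parameters from a SOAP envelope."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find("{%s}Body" % SOAP_NS)
    call = body[0]  # first child of Body is the operation element
    operation = call.tag.split("}", 1)[1]
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


request_xml = build_request("GetShippingStatus", {"orderId": 42})
operation, params = parse_request(request_xml)
```

Note that values come back as strings after the round trip; typing of parameters is the job of the interface description, not of the envelope itself.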

26 Education as a Web Service “Learning Object” XML standards already exist. Web Services for virtual university include: Registration; Performance (grading); Authoring of Curriculum; Online laboratories for real and virtual instruments; Homework submission; Quizzes of various types (multiple choice, random parameters); Assessment data access and analysis; Synchronous Delivery of Curricula including Audio/Video Conferencing and other synchronous collaborative tools as Web Services; Scheduling of courses and mentoring sessions; Asynchronous access, data-mining and knowledge discovery; Learning Plan agents to guide students and teachers.

27 Classic Grid Architecture [Diagram labels: Database; Netsolve; Computing; Security; Collaboration; Composition; Content Access; Resources; Clients; Users and Devices; Middle Tier Brokers; Service Providers] Middle Tier becomes Web Services

28 Some Observations “Traditional” Grids manage and share asynchronous resources in a rather centralized fashion. Peer-to-peer networks are “just like” Grids with different implementations of message-based services like registration and look-up. Collaboration systems like WebEx/Placeware (Application sharing) or Polycom (audio/video conferencing) can be viewed as Grids. Computers are fast and getting faster. One can afford many strategies that used to be unrealistic, including rich, usually XML-based, messaging. Web Services interact with messages. Everything (including applications like PowerPoint) will be a Web Service? Grids, P2P Networks and Collaborative Environments are (will be) managed message-linked Web Services.

29 Peer to Peer Grid: A democratic organization [Diagram labels: Database; Peers; User Facing Web Service Interfaces; Service Facing Web Service Interfaces; Event/Message Brokers]
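The event/message brokers in this picture can be thought of as publish/subscribe intermediaries between peers. The following Python sketch is a deliberately tiny in-memory version; the `MessageBroker` class, the topic string and the inbox lists are all assumptions for illustration, and a real broker network would add routing between brokers, persistence and security.

```python
class MessageBroker:
    """Hypothetical in-memory event/message broker (publish/subscribe)."""

    def __init__(self):
        self._subscribers = {}  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        # Deliver to every subscriber of the topic; return how many received it.
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = MessageBroker()
peer_a_inbox, peer_b_inbox = [], []
broker.subscribe("sensor/updates", peer_a_inbox.append)
broker.subscribe("sensor/updates", peer_b_inbox.append)

delivered = broker.publish("sensor/updates", {"reading": 7})
undelivered = broker.publish("unknown/topic", {"reading": 8})
```

The publisher never names its receivers, which is the sense in which peers are decoupled from each other and linked only through messages.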

30 System and Application Services? There are generic Grid system services: security, collaboration, persistent storage, universal access. OGSA (Open Grid Service Architecture) is implementing these as extended Web Services. An Application Web Service is a capability used either by another service or by a user. It has input and output ports; data is from sensors or other services. Consider Satellite-based Sensor Operations as a Web Service: satellite management (with a web front end); each tracking station is a service; image processing is a pipeline of filters, which can be grouped into different services; data storage is an important system service; big services are built hierarchically from “basic” services. Portals are the user (web browser) interfaces to Web services.
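The statement that image processing is a pipeline of filters that can be grouped into services, with big services built hierarchically from basic ones, maps naturally onto function composition. The Python sketch below uses made-up filters (`subtract_background`, `threshold`, `count_bright`) on a toy list-of-lists image; the grouping into a "calibration" and a "detection" service is an assumption for illustration only.

```python
from functools import reduce


def compose(*stages):
    """Chain stages so the output port of one feeds the input port of the next."""
    return lambda data: reduce(lambda acc, stage: stage(acc), stages, data)


# Basic filters (hypothetical): each takes an image and returns a result.
def subtract_background(image, level=10):
    return [[max(pixel - level, 0) for pixel in row] for row in image]


def threshold(image, cutoff=100):
    return [[1 if pixel >= cutoff else 0 for pixel in row] for row in image]


def count_bright(mask):
    return sum(sum(row) for row in mask)


# Filters grouped into services, then services grouped into a bigger service.
calibration_service = compose(subtract_background)
detection_service = compose(threshold, count_bright)
pipeline = compose(calibration_service, detection_service)

raw_image = [[5, 120, 200], [109, 110, 30]]
bright_pixels = pipeline(raw_image)
```

In a Grid setting each stage could be a separate Web Service exchanging messages rather than a local function call, but the hierarchical composition has the same shape.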

31 Satellite Science Grid Environment

32 What is Happening? Grid ideas are being developed in (at least) two communities: Web Service (W3C, OASIS) and Grid Forum (High Performance Computing, e-Science). Service Standards are being debated. Grid Operational Infrastructure is being deployed. Grid Architecture and core software being developed. Particular System Services are being developed “centrally” – OGSA framework for this in … Lots of fields are setting domain specific standards and building domain specific services. There is a lot of hype. Grids are viewed differently in different areas: largely “computing-on-demand” in industry (IBM, Oracle, HP, Sun); largely distributed collaboratories in academia.

33 OGSA OGSI & Hosting Environments Start with Web Services in a hosting environment. Add OGSI to get a Grid service and a component model. Add OGSA to get Interoperable Grid “correcting” differences in base platform and adding key functionalities. [Layer diagram labels: Network; Hosting Environment for WS; OGSI on Web Services; Broadly applicable services: registry, authorization, monitoring, data access, etc.; More specialized services: data replication, workflow, etc.; Domain-specific services; OGSA Environment; Possibly OGSA; Not OGSA; Given to us from on high]

34 Technical Activities of Note
Look at different styles of Grids such as Autonomic (Robust Reliable Resilient)
New Grid architectures hard due to investment required
Critical Services such as
–Security – build message based not connection based
–Notification – event services
–Metadata – Use Semantic Web, provenance
–Databases and repositories – instruments, sensors
–Computing – Submit job, scheduling, distributed file systems
–Visualization, Computational Steering
–Fabric and Service Management
–Network performance
Program the Grid – Workflow
Access the Grid – Portals, Grid Computing Environments

35 Issues and Types of Grid Services
1) Types of Grid
–R3
–Lightweight
–P2P
–Federation and Interoperability
2) Core Infrastructure and Hosting Environment
–Service Management
–Component Model
–Service wrapper/Invocation
–Messaging
3) Security Services
–Certificate Authority
–Authentication
–Authorization
–Policy
4) Workflow Services and Programming Model
–Enactment Engines (Runtime)
–Languages and Programming
–Compiler
–Composition/Development
5) Notification Services
6) Metadata and Information Services
–Basic including Registry
–Semantically rich Services and meta-data
–Information Aggregation (events)
–Provenance
7) Information Grid Services
–OGSA-DAI/DAIT
–Integration with compute resources
–P2P and database models
8) Compute/File Grid Services
–Job Submission
–Job Planning Scheduling Management
–Access to Remote Files, Storage and Computers
–Replica (cache) Management
–Virtual Data
–Parallel Computing
9) Other services including
–Grid Shell
–Accounting
–Fabric Management
–Visualization Data-mining and Computational Steering
–Collaboration
10) Portals and Problem Solving Environments
11) Network Services
–Performance
–Reservation
–Operations

36 Data Technology Components of (Services in) a Computing Grid [Diagram labels: 1: Job Management Service (Grid Service Interface to user or program client); 1: Plan Execution; 2: Schedule and control Execution; 3: Access to Remote Computers; 4: Job Submittal; 5: Data Transfer; 6: File and Storage Access; 7: Cache Data Replicas; 8: Virtual Data; 9: Grid MPI; 10: Job Status; Remote Grid Service; Data]

37 Conclusions Grids are inevitable and pervasive. Can expect Web Services and Grids to merge with a common set of general principles but different implementations with different scaling and functionality trade-offs. Enough is known that one can start today. We will be flooded with data, information and purported knowledge. One should be preparing Grid strategies, understanding relevant Web and Grid standards, and developing new domain specific standards. Note many existing (standards) efforts assume client-server and not a brokered service model; these will need to change!

38 Grid Computing: Making The Global Infrastructure a Reality Fran Berman, Anthony J.G. Hey, Geoffrey Fox ISBN: 0-470-85319-0 Hardcover 1080 Pages Published March 2003 http://www.grid2002.org

