Page 1 LAITS Laboratory for Advanced Information Technology and Standards Duh 7/10/03 Geospatial Service Workflow Concepts and Tools Liping Di Laboratory for Advanced Information Technology and Standards (LAITS) George Mason University email@example.com
Page 2 Contents
What are service-oriented architecture and Web services?
What is a workflow tool? What does it do? Why do we need one in the Grid?
What are some common workflow tools used by the Grid community and the Web service community?
Page 3 The Service-Oriented Architecture (SOA)
Services are the key component of the service-oriented architecture.
A service is a well-defined set of actions. It is self-contained, stateless, and does not depend on the state of other services.
Stateless means that each time a consumer interacts with a Web service, an action is performed. After the results of the service invocation have been returned, the action is finished. There is no assumption that subsequent invocations are associated with prior ones.
In the service-oriented architecture, the description of a service is essentially a description of the messages that are exchanged between the consumer and the service.
Standards-based individual services can be chained together to solve complex tasks.
The implementation of SOA in the Web environment is called Web services.
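The stateless property described on this slide can be illustrated with a minimal Python sketch. The function name `scale_service` and the message layout are hypothetical, chosen only for illustration; the point is that the reply depends on nothing but the request message.

```python
# Hypothetical stateless "service": the reply depends only on the request message.
def scale_service(request: dict) -> dict:
    """Scale a coordinate pair by a factor carried in the request itself."""
    x, y = request["point"]
    factor = request["factor"]
    return {"point": (x * factor, y * factor)}


# Two identical requests; neither invocation knows about the other.
first = scale_service({"point": (1.0, 2.0), "factor": 2.0})
second = scale_service({"point": (1.0, 2.0), "factor": 2.0})
```

Because nothing is remembered between calls, identical requests give identical replies, which is what makes such services easy to chain.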
Page 4 Web Services
Web services are self-contained, self-describing, modular applications that can be published, located, and dynamically invoked across the Web.
Web services perform functions, which can be anything from simple requests to complicated business processes.
Once a Web service is deployed, other applications (and other Web services) can discover and invoke the deployed service.
The real power of Web services comes from two properties:
– Anyone on the Internet can set up a Web service to serve anyone who wants it, so many services will be available.
– Standards-based services can be chained together dynamically to solve complicated tasks (just-in-time integration).
Page 5 Globus Toolkit 3.0 (GT3) -- OGSA, OGSI, and GT3
[Figure labels: the architect, the engineer, the workers]
Page 6 Difference between Web Service and Open Grid Service
Globus 3.0 implements the Open Grid Services Architecture (OGSA).
The fundamental concepts of services in the Grid are the same as those of Web services.
The differences between Grid and Web services include:
– A Web service can be invoked by any consumer over the Web, while a Grid service can be invoked only by consumers within the virtual organization, similar to the difference between the Internet and an intranet.
– Web services practice has been extended in the Grid to accommodate the additional requirements of Grid services:
  Stateful interactions between consumers and services
  Exposure of a Web service's publicly visible state
  Access to (possibly large amounts of) identifiable data
  Service lifetime management
Currently the Grid and Web communities are merging through the Web Services Resource Framework (WSRF).
Page 7 Service operations
Page 8 Service Operations
Publish – advertise (or remove) data and services to a broker (e.g., a registry, catalog, or clearinghouse).
Find – service requestors and service brokers collaborate to perform the find operation. Service requestors describe the kinds of services they're looking for to the broker, and the broker delivers the results that match the request.
Bind – a service requestor and a service provider negotiate as appropriate so the requestor can access and invoke services of the provider.
Chain – the chain operation binds a sequence of services.
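The four operations can be sketched in a few lines of Python. This is only an in-memory illustration: the `Broker` class, its method names, and the toy `subset` and `rescale` services are assumptions made for the example, not part of any registry standard or library.

```python
from typing import Callable, Dict, List

Service = Callable[[object], object]


class Broker:
    """Hypothetical in-memory registry standing in for a catalog or clearinghouse."""

    def __init__(self) -> None:
        self._entries: Dict[str, dict] = {}

    def publish(self, name: str, keywords: List[str], endpoint: Service) -> None:
        """Publish: advertise a service under some descriptive keywords."""
        self._entries[name] = {"keywords": set(keywords), "endpoint": endpoint}

    def unpublish(self, name: str) -> None:
        """Publish also covers removing an advertisement."""
        self._entries.pop(name, None)

    def find(self, keyword: str) -> List[str]:
        """Find: return the names of services matching the requestor's description."""
        return sorted(n for n, e in self._entries.items() if keyword in e["keywords"])

    def bind(self, name: str) -> Service:
        """Bind: hand the requestor something it can invoke."""
        return self._entries[name]["endpoint"]


def chain(services: List[Service]) -> Service:
    """Chain: feed each service's output to the next service in the sequence."""
    def run(data):
        for service in services:
            data = service(data)
        return data
    return run


broker = Broker()
broker.publish("subset", ["raster", "subset"], lambda pixels: pixels[:3])
broker.publish("rescale", ["raster", "rescale"], lambda pixels: [p / 10 for p in pixels])

names = broker.find("raster")
pipeline = chain([broker.bind("subset"), broker.bind("rescale")])
result = pipeline([10, 20, 30, 40])
```

In a real deployment the bind step would involve negotiating protocol and endpoint details; here it simply returns a callable.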
Page 9 Service Chaining
A service chain is defined as a sequence of services where, for each adjacent pair of services, occurrence of the first action is necessary for the occurrence of the second action.
When services are chained, they are combined in a dependent series to achieve larger tasks.
Three types of chaining are defined in ISO 19119 and by OGC:
– User-defined (transparent) – the human user defines and manages the chain.
– Workflow-managed (translucent) – the human user invokes a service that manages and controls the chain, where the user is aware of the individual services in the chain.
– Aggregate (opaque) – the human user invokes a service that carries out the chain, where the user has no awareness of the individual services in the chain.
Page 10 Construction of Service Chains
The first type of chaining allows users to construct a geospatial model to be run in the system.
– Requires domain knowledge; it is a way for experts to contribute their domain knowledge.
– The knowledge is kept in the geo-tree/service chain.
The second type of chaining uses an existing geo-tree to materialize a virtual object.
– Anyone can use this type of chaining to produce a virtual product on demand.
– It cannot produce a product whose geo-tree doesn't already exist in a data/information system.
The third type of chaining requires the system to be intelligent enough to form a geo-tree/service chain automatically by decomposing the user's query.
– Requires domain knowledge.
– Requires automated reasoning.
– Anyone can use it, and it can produce a new product automatically based on the user's query.
The first two types of chaining do not require significant machine intelligence.
– Current technology is sufficient to implement these chaining approaches.
The third type requires significant machine intelligence.
– Current technologies are not able to provide this kind of chaining.
– Significant research is needed.
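The second type of chaining, materializing a virtual product from an existing geo-tree, can be sketched as a recursive tree evaluation. The slides do not define the geo-tree data structure, so the `GeoTreeNode` class, the `materialize` function, and the toy NDVI product below are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class GeoTreeNode:
    """One node of a hypothetical geo-tree: a service plus the sub-trees that feed it."""
    name: str
    service: Callable[..., object]
    inputs: List["GeoTreeNode"] = field(default_factory=list)


def materialize(node: GeoTreeNode) -> object:
    """Depth-first evaluation: produce every input, then invoke this node's service."""
    args = [materialize(child) for child in node.inputs]
    return node.service(*args)


# Leaf nodes stand in for archived data; they take no inputs.
red = GeoTreeNode("red_band", lambda: [0.2, 0.1])
nir = GeoTreeNode("nir_band", lambda: [0.6, 0.5])


def ndvi(nir_vals, red_vals):
    """Toy processing service: normalized difference of two bands, per pixel."""
    return [round((n - r) / (n + r), 3) for n, r in zip(nir_vals, red_vals)]


# The stored geo-tree records how the virtual product is derived.
ndvi_product = GeoTreeNode("ndvi", ndvi, inputs=[nir, red])
product = materialize(ndvi_product)
```

The expert knowledge lives in how the tree is wired together (the first type of chaining); anyone can then call `materialize` on a stored tree without knowing how it was built.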
Page 11 Workflows and workflow tools
What we mean by a workflow:
– The executable scripts representing the service chains.
– The total composition and orchestration of an experimental run, including all the details of post-processing, data mining, and visualization.
– What the high-end user (scientist) needs to do in order to get the underlying computational code to produce accessible and usable results somewhere.
– What in the past was usually done through shell scripting, plus more (e.g., RPCs).
– Previous examples: not a single workflow, but a number of decoupled, cooperating, communicating workflows.
Workflows, in most cases, are encoded in BPEL4WS, an OASIS standard.
Any tools dealing with the creation, management, and execution of workflows are called workflow tools.
– The most significant ones are the workflow engines, which manage the execution of workflows.
Page 12 Steps from Geospatial process model to a user defined product (User geo-object)
[Figure. Stages: geospatial model, virtual geo-object, logical workflow, concrete workflow, execution, user geo-object. Phases: knowledge capture phase, user query phase, user retrieval phase.]
Page 13 Availability of Workflow Tools for Geospatial Services
Tools are needed for every step from the creation of geospatial models to the materialization of virtual geospatial products.
General workflow tools are being developed in both the Grid and Web service communities.
Most of the tools have not been tested in a geospatial environment.
Page 14 Workflow Tools built by GriPhyN
Use the Virtual Data Language (VDL) from the Globus team to encode both abstract and concrete workflows.
Build an abstract workflow based on VDL descriptions (Chimera).
Build an executable workflow based on the abstract workflow (Pegasus).
Execute the workflow (Condor's DAGMan).
These tools run under Globus 2.
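The abstract-to-concrete-to-execution pipeline can be sketched generically in Python. This is not VDL, Chimera, Pegasus, or DAGMan code; the step names, the `site_catalog` mapping, and the `plan` and `execute` functions are hypothetical stand-ins that only show the shape of the three stages, using the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Abstract workflow: logical steps mapped to the steps they depend on.
abstract_workflow = {
    "extract": set(),
    "reproject": {"extract"},
    "classify": {"reproject"},
    "summarize": {"reproject"},
    "report": {"classify", "summarize"},
}

# Hypothetical catalog saying where each logical step can actually run.
site_catalog = {
    "extract": "siteA",
    "reproject": "siteA",
    "classify": "siteB",
    "summarize": "siteB",
    "report": "siteA",
}


def plan(abstract, catalog):
    """Produce a concrete workflow: each logical step bound to an execution site."""
    return {step: {"site": catalog[step], "after": deps} for step, deps in abstract.items()}


def execute(concrete):
    """Visit steps in dependency order, as a DAG scheduler would, and log the order."""
    graph = {step: info["after"] for step, info in concrete.items()}
    order = []
    for step in TopologicalSorter(graph).static_order():
        order.append((step, concrete[step]["site"]))  # a real engine would submit a job here
    return order


run_log = execute(plan(abstract_workflow, site_catalog))
```

The order of `classify` and `summarize` relative to each other is not fixed, since neither depends on the other; a real scheduler could run them concurrently.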
Page 15 Alliance Science Portal Expedition Workflow Tools Development
Objective
– Provide a workflow tool (engine + interface) through which all of this can be accomplished without any knowledge of XML, Jython, Java, or any particular programming language.
– Provide a tool that is reusable in the sense of not being specific to any one scientific research domain.
Approach
1. Templated patterns (a repertoire of pre-defined, parameterized workflow scripts), just as with designing software systems in general.
   High-level patterns (Sequence, Branch/Merge, Parallel, ...).
   Extend these down several levels, e.g.:
   – STAGE = [ make dirs, get files, set permissions ]
2. An environment through which the high-level user can create and manipulate workflow scripts.
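The idea of templated patterns can be sketched as small Python combinators. The names `sequence`, `parallel`, `branch`, `record`, and `stage` are hypothetical and are not taken from the Alliance portal tools; the steps only append to a log rather than touching the file system.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Step = Callable[[dict], None]


def sequence(*steps: Step) -> Step:
    """Run steps one after another on a shared context."""
    def run(ctx: dict) -> None:
        for step in steps:
            step(ctx)
    return run


def parallel(*steps: Step) -> Step:
    """Run independent steps concurrently and wait for all of them to finish."""
    def run(ctx: dict) -> None:
        with ThreadPoolExecutor() as pool:
            list(pool.map(lambda step: step(ctx), steps))
    return run


def branch(condition: Callable[[dict], bool], then: Step, otherwise: Step) -> Step:
    """Choose one of two steps based on the context."""
    def run(ctx: dict) -> None:
        (then if condition(ctx) else otherwise)(ctx)
    return run


def record(message: str) -> Step:
    """Placeholder step: a real one would do work; this one only logs."""
    return lambda ctx: ctx["log"].append(message)


def stage(workdir: str, files: List[str]) -> Step:
    """Parameterized STAGE template: make dirs, get files, set permissions."""
    return sequence(
        record(f"make dirs {workdir}"),
        record(f"get files {', '.join(files)}"),
        record(f"set permissions {workdir}"),
    )


workflow = sequence(
    stage("/tmp/run1", ["a.dat", "b.dat"]),
    parallel(record("post-process"), record("data-mine")),
    branch(lambda ctx: ctx["visualize"], record("render plot"), record("skip plot")),
)
ctx = {"log": [], "visualize": True}
workflow(ctx)
```

A user of such a tool would fill in the parameters (`workdir`, `files`, the `visualize` flag) without writing the patterns themselves, which is the reuse the slide is after.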
Page 16 O.G.R.E.: An Extension to Apache Ant
O.G.R.E. = Open Grid Computing Environments Runtime Engine
What Ant lacked, but we needed:
1. Broader conditional execution (in Ant, conditions are based on write-once String properties).
2. A general loop structure for Task execution.
3. Data communication between Tasks (and with their containers).
4. Specialized tasks:
   1. File reading and writing
   2. Local and remote file management (gridftp)
   3. Web service related tasks
   4. Event- and process-monitoring tasks
Page 17 Workflow Execution Engines in Web Services
We are examining two workflow execution engines:
– IBM BPWS4J (http://www.alphaworks.ibm.com/tech/bpws4j)
– The Collaxa BPEL Server
IBM BPWS4J is free software, while the Collaxa BPEL Server is commercial software.
– Collaxa BPEL Server, Developer Edition: $2K per developer
– Collaxa BPEL Server, Enterprise Edition: $20K per CPU
Both engines work in a Web service environment.
Questions that need to be answered:
– Are the engines good enough for geospatial Grid/Web services?
– Can we make those engines work in a Grid environment?
– How will Grid workflow standards and execution engines evolve?
Page 18 BPWS4J -- The BPEL Engine for Execution
What is BPWS4J?
The IBM Business Process Execution Language for Web Services Java Runtime (BPWS4J) provides a platform upon which business processes written using BPEL4WS may execute. The BPWS4j-engine-2.0 version supports the BPEL4WS v1.1 specification.
How does it work?
For each process, the engine takes in:
– a BPEL4WS document, which describes the process;
– a WSDL document (without binding information), which describes the interface that the process will present to clients;
– WSDL documents (with binding information), which describe the services that the process may or will invoke during its execution.
After deployment, the process is made available to outside consumers through a SOAP interface.
The engine has been tested on WebSphere Application Server 5.0 and on Apache Tomcat under both Linux and Windows.
** Note: This and the next slide are from the BPWS4J documentation.
Page 19 Developing and Deploying a Process
Step 1: Create a BPEL4WS document and the corresponding WSDL document. The WSDL document describes the interface of the process that will be presented to the outside world. (This includes the description of all receive and onMessage elements.) The WSDL document should not contain any bindings; the SOAP binding will be added by the engine during deployment. One service element must be present within the WSDL file (the name of the process is taken from the name attribute on the service element).
Step 2: If the process invokes another Web service (i.e., if the process contains an invoke activity), then create or obtain the WSDL document(s) that describe the service to be invoked. These WSDL documents must have bindings and endpoint information that describe where and how the service may be invoked. The engine supports SOAP, EJB, JMS, and direct Java class bindings.
Step 3: Deploy the process to the engine. When deploying the process, you will need to specify the WSDL documents that fulfill the partner roles.
Step 4: Create the SOAP client. The client interaction with the service is defined by the process's WSDL document that you provided during deployment.
Additional notes:
– All imports within the WSDL documents must be absolute.
– If you are deploying on Tomcat and have WSDL documents that contain imports, you must make sure that you have defined the .wsdl extension and the text/xml MIME type to Tomcat; otherwise it will complain about not being able to resolve the imports. You can do so either by modifying conf/web.xml under Tomcat or by modifying the WEB-INF/web.xml file within your WAR file. See the web.xml file in the engine's WAR for an example.
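Step 4 can be sketched as building a SOAP 1.1 request envelope in Python with the standard library. The envelope namespace is the standard SOAP 1.1 one; everything else here is an assumption made for illustration: the `urn:example:geoprocess` namespace, the `runChain` operation, and its parameters would in practice come from the process's own WSDL document.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
PROCESS_NS = "urn:example:geoprocess"  # hypothetical; take the real one from your WSDL


def build_request(operation: str, params: dict) -> str:
    """Build a SOAP 1.1 envelope whose body holds one operation element."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    op = ET.SubElement(body, f"{{{PROCESS_NS}}}{operation}")
    for name, value in params.items():
        # Unqualified child elements; whether parts are qualified depends on the WSDL.
        ET.SubElement(op, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and parameters, for illustration only.
request_xml = build_request("runChain", {"region": "Fairfax", "year": 2003})
```

Sending the request is left out on purpose: the endpoint URL and any SOAPAction value depend on how and where the process was deployed.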
Page 20 The Collaxa BPEL Server
Native BPEL 1.1 implementation
Easy-to-use modeling tool
Rich and flexible binding framework (Web services, but also JCA, JMS, Email, EDI)
Unparalleled management and monitoring (in-flight instance management, auditing, debugging)
High performance and scalability (throughput, clustering, large XML documents)
Easy to deploy and non-intrusive (get up and running in less than 15 minutes)
Page 21 The Collaxa BPEL Server
[Architecture figure. Labels: Java Platform; BPEL Designer (Eclipse), Design; BPEL TaskService, Tasks, Portal; BPEL Console, Monitor; WSDL Binding Framework (JCA, JMS, Email), Connect; BPEL Server, Dehydrate.]