Design and Evaluation of an Autonomic Workflow Engine Thomas Heinis, Cesare Pautasso, Gustavo Alonso Dept. of Computer Science Swiss Federal Institute.

1 Design and Evaluation of an Autonomic Workflow Engine. Thomas Heinis, Cesare Pautasso, Gustavo Alonso, Dept. of Computer Science, Swiss Federal Institute of Technology (ETHZ). The 2nd IEEE International Conference on Autonomic Computing (ICAC-05). March 15th, 2008, Seo, Dongmahn

2 2/47 Contents: Introduction; System Background; System Architecture; Autonomic Capabilities; System Evaluation; Conclusion

4 4/47 Introduction: Motivation; Related Work; Contribution

5 5/47 Motivation: Workflow management systems: e-commerce, virtual laboratories, DNA sequencing, scientific computing, Grid computing; the idea of process-based Web service composition.

6 6/47 Motivation (cont.): Workflow engines operate in an open environment with an unknown workload, so it is difficult to choose between a centralized solution and a distributed implementation of the engine. A distributed implementation raises the problem of configuring the system in an optimal way. Having a system administrator in charge of manually monitoring and reconfiguring the system is NOT a feasible solution, considering the number of parameters involved and the variability of the workload.

7 7/47 Related Work: Decentralization of workflow process execution is an important area of research: it supports business processes and leads to higher scalability, but introduces several problems, namely the lack of a global view over the process and scalability and reliability problems per se. To address the problem: GOLIAT, autonomic computing techniques, self-optimizing computer systems; autonomic computing principles in the context of distributed workflow engines.

8 8/47 Contribution: Goal: self-tuning, self-configuration, and self-healing capabilities.

9 9/47 Contribution (cont.): System: an extension to the JOpera engine, a Java-based service composition tool that combines a workflow engine with an open architecture to provide support for Web service composition, Grid computing and specialized workflow engines. Flexible architecture built from components: key system modules can be replicated to handle large workloads. Other modules can be paired with a backup to achieve fault tolerance. The autonomic controller can be configured by selecting different reconfiguration strategies.

10 10/47 Contribution (cont.): The key contributions of the paper: (1) the novel system architecture: generic, can be adopted by many engines operating under different models and languages; (2) the resulting scalability and fault tolerance: flexible enough to support the very large loads present in computational applications and large-scale Web service composition; (3) the independence of the underlying workflow model: easily extensible to support many different kinds of services.

12 12/47 System Background: Requirements; Workload Assumptions; Deployment Environment

13 13/47 Requirements: To support autonomic behavior, the workflow execution engine must feature self-configuration, self-tuning and self-healing capabilities. Self-configuration: switching the system's configuration on the fly, without manual intervention and without disrupting the system; requires the workflow execution engine to support changing the configuration dynamically and efficiently.

14 14/47 Requirements (cont.): Self-tuning: reconfiguring the system to the optimal configuration given the current workload. The workflow engine must give access to its internal state, so that control algorithms can analyze current and past performance information to plan configuration changes in response to the current workload. Assumption: the characteristics of the workload affect the system's performance; the self-tuning algorithm can optimally adapt the system to the workload by monitoring key performance indicators.

15 15/47 Requirements (cont.): Self-healing: able to detect configuration changes due to external events (failures of nodes) and to take a recovery action; requires mechanisms for detecting failures and configuration changes of the cluster, and for querying the workflow execution state.

16 16/47 Workload Assumptions: The workload is assumed to be a collection of concurrent workflow processes (a worst-case scenario). The work does not deal with workload prediction issues; these are left as future work.

17 17/47 Deployment Environment: [Assumption] JOpera runs on a dedicated cluster of computers and can use these resources exclusively. Main goal of the autonomic features: to ensure the optimal configuration of the cluster, meaning efficient resource utilization and a good allocation of the available nodes to the different system components. The cluster configuration is NOT static. The system could be extended to use shared nodes that are also used for other purposes.

19 19/47 System Architecture: Workflow Execution; Distributed Workflow Execution; Scalable Workflow Execution

20 20/47 Workflow Execution: Workflow processes model interactions between different tasks by defining the data flow and control flow between them.
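
To make the distinction between control flow and data flow concrete, here is a minimal Python sketch. It is not JOpera's process model: the function `run_process`, the dictionary layout, and the two toy tasks are all hypothetical. Control flow says which tasks must finish before a task may start; data flow says where each task parameter gets its value from.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def run_process(tasks, control_flow, data_flow, inputs):
    """Run tasks in an order allowed by control_flow, wiring values via data_flow.

    tasks:        task name -> callable (hypothetical toy tasks)
    control_flow: task name -> set of tasks that must finish first
    data_flow:    task name -> {parameter name: source}, where a source is
                  either a process input key or the name of an earlier task
    """
    values = dict(inputs)
    for name in TopologicalSorter(control_flow).static_order():
        args = {param: values[src] for param, src in data_flow.get(name, {}).items()}
        values[name] = tasks[name](**args)
    return values

tasks = {
    "fetch": lambda url: f"data from {url}",
    "count": lambda text: len(text.split()),
}
control_flow = {"fetch": set(), "count": {"fetch"}}   # count runs after fetch
data_flow = {"fetch": {"url": "source"}, "count": {"text": "fetch"}}

result = run_process(tasks, control_flow, data_flow, {"source": "example"})
```

Here `count` both waits for `fetch` (control flow) and consumes its output (data flow); the two relations are kept in separate dictionaries because a process may order tasks that exchange no data.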

21 21/47 Distributed Workflow Execution

22 22/47 Scalable Workflow Execution: Scalability bottleneck: use several layers of caching between the tuple space and the threads producing and consuming tuples.
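
The effect of a caching layer in front of a shared tuple space can be sketched as follows. This is an illustrative write-back cache in Python, not the JOpera or T-Spaces API; the class names `GlobalTupleSpace` and `CachedSpace` and the key `process-1/state` are assumptions made for the example.

```python
class GlobalTupleSpace:
    """Stand-in for the shared tuple space; counts remote accesses."""
    def __init__(self):
        self.tuples = {}
        self.remote_calls = 0

    def write(self, key, value):
        self.remote_calls += 1
        self.tuples[key] = value

    def read(self, key):
        self.remote_calls += 1
        return self.tuples.get(key)


class CachedSpace:
    """Hypothetical per-thread cache: writes stay local until flush()."""
    def __init__(self, space):
        self.space = space
        self.cache = {}
        self.dirty = set()

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)

    def read(self, key):
        if key not in self.cache:          # only a cache miss goes remote
            self.cache[key] = self.space.read(key)
        return self.cache[key]

    def flush(self):
        for key in sorted(self.dirty):     # push the latest value of each dirty key
            self.space.write(key, self.cache[key])
        self.dirty.clear()


space = GlobalTupleSpace()
cache = CachedSpace(space)
for step in range(100):                    # 100 local state updates
    cache.write("process-1/state", step)
latest = cache.read("process-1/state")
calls_before_flush = space.remote_calls
cache.flush()
calls_after_flush = space.remote_calls
```

In this toy run, 100 updates reach the shared space as a single remote write. The price is that the shared space is stale until the flush, which is why a cache like this has to be flushed before the thread that owns it is stopped.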

24 24/47 Autonomic Capabilities: Self-Tuning (Information Strategy, Optimization Strategy, Selection Strategy); Self-Configuration (Reconfiguration Actions); Self-Healing

25 25/47 Self-Tuning: Information Strategy: detect imbalances in the system's configuration by sampling the current space size. Optimization Strategy: establish a configuration such that the number of navigator and dispatcher threads is balanced. Selection Strategy: prioritize nodes according to how well suited they are for a configuration change.
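
The three strategies on this slide can be read as three small functions chained together. The Python sketch below is one possible reading, not the paper's algorithm: the backlog names, the `tolerance` threshold, the rule "larger backlog gets more threads", and the use of node load for ranking are all assumptions introduced for illustration.

```python
def information_strategy(process_space, task_space):
    """Sample the current size of each space (assumed: one space per thread type)."""
    return {"navigator_backlog": len(process_space),
            "dispatcher_backlog": len(task_space)}

def optimization_strategy(sample, tolerance=10):
    """Decide which thread type needs more capacity (hypothetical threshold rule)."""
    diff = sample["navigator_backlog"] - sample["dispatcher_backlog"]
    if diff > tolerance:
        return "more_navigators"
    if diff < -tolerance:
        return "more_dispatchers"
    return "keep"

def selection_strategy(nodes, decision):
    """Rank nodes that could switch role, least loaded first (assumed criterion)."""
    if decision == "keep":
        return []
    donor_role = "dispatcher" if decision == "more_navigators" else "navigator"
    candidates = [n for n in nodes if n["role"] == donor_role]
    return sorted(candidates, key=lambda n: n["load"])

sample = information_strategy(process_space=["p"] * 120, task_space=["t"] * 30)
decision = optimization_strategy(sample)
nodes = [
    {"name": "n1", "role": "dispatcher", "load": 0.9},
    {"name": "n2", "role": "dispatcher", "load": 0.2},
    {"name": "n3", "role": "navigator", "load": 0.5},
]
ranked = [n["name"] for n in selection_strategy(nodes, decision)]
```

Keeping the strategies as separate functions mirrors the slide's point that each one can be exchanged independently.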

26 26/47 Self-Configuration: a closed feedback-loop controller. Reconfiguration Actions: Starting Threads: through the JOpera API. Stopping Navigator Threads: migrate the state of the processes the navigator thread is working on and redirect the associated events, by flushing the locally cached state into the global tuple space.
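
Stopping a navigator as described above involves two steps: publish its locally cached process state, then make sure later events no longer go to it. The Python sketch below illustrates that order with plain dictionaries. The function `stop_navigator`, the data layout, and the choice to redirect events by deleting routing entries are assumptions for the example, not the JOpera API.

```python
def stop_navigator(navigator, global_space, routing_table):
    """Hypothetical shutdown sequence for one navigator thread.

    navigator:     {"name": str, "cache": {process id: state}, "running": bool}
    global_space:  process id -> state (stand-in for the global tuple space)
    routing_table: process id -> name of the navigator that receives its events
    """
    # Step 1: flush the locally cached process state into the global space.
    global_space.update(navigator["cache"])
    navigator["cache"].clear()

    # Step 2: redirect events. Assumed mechanism: drop the routing entries that
    # point at this navigator so another navigator can pick the process up.
    for pid in [p for p, nav in routing_table.items() if nav == navigator["name"]]:
        del routing_table[pid]

    navigator["running"] = False

navigator = {"name": "nav-2", "cache": {"p7": "running", "p9": "waiting"}, "running": True}
global_space = {"p1": "done"}
routing_table = {"p1": "nav-1", "p7": "nav-2", "p9": "nav-2"}

stop_navigator(navigator, global_space, routing_table)
```

Flushing before touching the routing table matters: if events were redirected first, another navigator could read stale state from the global space.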

27 27/47 Self-Configuration (cont.): Stopping Dispatcher Threads: a more difficult task, since task execution may involve the invocation of a local application or the interaction with a remote service provider on the Web. Metadata: kill method: immediately stops all active task executions and ensures all task invocations will be repeated on a different dispatcher thread; stop method: immediately ceases to take tuples from the task space.
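
The difference between the kill and stop methods can be shown with a toy dispatcher. This Python sketch is an assumption-laden illustration: the `Dispatcher` class and its `take` method are hypothetical, and letting active tasks stay in place after `stop` is an interpretation of the slide, which only says that the dispatcher ceases to take tuples.

```python
from collections import deque

class Dispatcher:
    """Toy dispatcher that pulls task tuples from a shared task space."""
    def __init__(self, task_space):
        self.task_space = task_space   # shared deque of pending task tuples
        self.active = []               # tasks currently being executed
        self.accepting = True

    def take(self):
        """Pull one task from the task space, if still accepting work."""
        if self.accepting and self.task_space:
            self.active.append(self.task_space.popleft())

    def stop(self):
        """Cease taking tuples; active tasks are left untouched (assumed)."""
        self.accepting = False

    def kill(self):
        """Abort active tasks and put them back so another dispatcher repeats them."""
        self.accepting = False
        self.task_space.extend(self.active)
        self.active.clear()

# stop: no new work is taken, tasks in progress are kept
stopped = Dispatcher(deque(["t1", "t2", "t3"]))
stopped.take()
stopped.take()
stopped.stop()
stopped.take()                         # ignored after stop()

# kill: work in progress goes back to the task space
killed = Dispatcher(deque(["a", "b"]))
killed.take()
killed.kill()
```

After `kill`, task `a` is back in the task space behind `b`, so it will be invoked again elsewhere. That repetition is only safe if repeating a task invocation is acceptable for the task in question.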

28 28/47 Self-Healing: periodically monitors the nodes of the cluster. Handling Dispatcher Thread Failures: the tasks that were managed by the failed thread are lost and have to be restarted; very similar to the case where the self-configuration component kills a dispatcher. Handling Navigator Thread Failures: the state of the execution of the process is still available in the global process execution state space; recovery simply removes the entries in the tuple routing table which point to the failed navigator.
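
A minimal sketch of the monitoring and recovery steps, written in Python under stated assumptions: failure detection by heartbeat timeout is a common technique substituted here because the slide does not say how nodes are monitored, and the names `detect_failures`, `heal`, and `in_flight` are hypothetical.

```python
def detect_failures(last_heartbeat, now, timeout):
    """Return nodes whose last heartbeat is older than timeout (assumed mechanism)."""
    return sorted(node for node, seen in last_heartbeat.items() if now - seen > timeout)

def heal(failed, roles, task_space, in_flight, routing_table):
    """Apply the recovery action that matches the role of each failed node."""
    for node in failed:
        if roles[node] == "dispatcher":
            # Tasks managed by the failed dispatcher are lost: queue them again.
            task_space.extend(in_flight.pop(node, []))
        else:
            # Process state survives in the global space: only drop routing
            # entries that still point at the failed navigator.
            for pid in [p for p, nav in routing_table.items() if nav == node]:
                del routing_table[pid]

last_heartbeat = {"n1": 100, "n2": 70, "n3": 60}
roles = {"n1": "navigator", "n2": "dispatcher", "n3": "navigator"}
task_space = ["t9"]
in_flight = {"n2": ["t4", "t5"]}
routing_table = {"p1": "n1", "p2": "n3"}

failed = detect_failures(last_heartbeat, now=105, timeout=30)
heal(failed, roles, task_space, in_flight, routing_table)
```

The two branches show the asymmetry described on the slide: a dispatcher failure costs repeated work, while a navigator failure only requires a routing update because the process state was never held exclusively by the failed node.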

30 30/47 System Evaluation: Experimental Setup; Base Line; Autonomic Behavior (Self-Configuration, Reconfiguration Overhead); Self-Healing; Discussion

31 31/47 Experimental Setup: a cluster of up to 20 nodes, each a 1.0 GHz dual P-III with 1 GB of RAM, running Linux (Kernel version ) and Sun's Java Development Kit version. One additional node: the global tuple space server, IBM's T-Spaces v2.1.3.

32 32/47 Base Line: two different workloads: 1000 concurrent processes containing 10 parallel tasks with a duration of 0 seconds (workload 0), and 1000 processes containing 10 parallel tasks with a duration of 20 seconds (workload 20). 15 nodes in total, ranging from 14 navigators and 1 dispatcher up to 14 dispatchers and 1 navigator.
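
The size of the baseline experiment follows directly from the numbers on this slide. The short Python sketch below only does that arithmetic; the variable names are made up for the example and no measured results are implied.

```python
NODES = 15                  # total nodes in the baseline runs
PROCESSES = 1000            # processes per workload
TASKS_PER_PROCESS = 10      # parallel tasks in each process

# Every static split of the 15 nodes that keeps at least one navigator and
# one dispatcher, from (14 navigators, 1 dispatcher) to (1 navigator, 14 dispatchers).
configurations = [(nav, NODES - nav) for nav in range(NODES - 1, 0, -1)]

tasks_per_workload = PROCESSES * TASKS_PER_PROCESS
```

That gives 14 static configurations per workload, each of which has to execute 10,000 task invocations.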

33 33/47 Base Line (cont.)

34 34/47 Base Line (cont.)

35 35/47 Autonomic Behavior: Self-Configuration

36 36/47 Autonomic Behavior (cont.)

37 37/47 Autonomic Behavior (cont.)

38 38/47 Autonomic Behavior (cont.): Reconfiguration Overhead

39 39/47 Self-Healing: initially configured to use 15 nodes, and to replace 5 of the nodes assigned. The workload consists of four peaks of 500 processes occurring every 100 seconds; each of the processes consists of 10 parallel tasks of 10 seconds duration. Node changes: the cluster grows to 20 nodes at t=90, is reduced by 5 nodes at t=140, and again by 5 nodes at t=230.
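
The node schedule of this experiment can be written down as a step function. The Python sketch below derives the cluster size at any time from the three changes listed on the slide, assuming time is measured in seconds from t=0 and the initial size is 15; the names `CHANGES` and `nodes_at` are made up for the example.

```python
from bisect import bisect_right

# (time in seconds, cluster size from that time on), derived from the slide:
# start with 15, grow to 20 at t=90, shrink by 5 at t=140 and again at t=230.
CHANGES = [(0, 15), (90, 20), (140, 15), (230, 10)]

def nodes_at(t):
    """Cluster size at time t >= 0."""
    times = [time for time, _ in CHANGES]
    return CHANGES[bisect_right(times, t) - 1][1]

# Total task invocations: 4 peaks x 500 processes x 10 parallel tasks.
total_tasks = 4 * 500 * 10
```

By the end of the run the engine has to finish the same kind of workload with 10 nodes that it started with 15, for a total of 20,000 task invocations over the four peaks.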

40 40/47 Self-Healing (cont.)

41 41/47 Self-Healing (cont.)

42 42/47 Self-Healing (cont.)

43 43/47 Self-Healing (cont.)

44 44/47 Discussion: Finding an optimal static configuration for a given workload is very difficult: different characteristics lead to different optimal configurations. The autonomic controller was able to adapt the configuration of the workflow engine according to the variable characteristics of the workload. The self-healing experiment reflects a common situation in the lifetime of a cluster-based system.

46 46/47 Conclusion: Presented the design of an autonomic workflow engine, demonstrated its self-managing behavior, and evaluated its performance. Showed how to apply the autonomic computing paradigm to greatly simplify the deployment and the maintenance of such systems. The evaluation used a homogeneous workload; more complex characteristics are left as part of future work.
