Towards Intelligent Workflow Planning for Neuroimaging Analyses Irfan Habib, Ashiq Anjum, Peter Bloodsworth, Richard McClatchey Centre for Complex Cooperative Systems, BIT, University of the West of England, Bristol
Introduction: Recent progress in neuroimaging techniques and data formats has led to an explosive growth in neuroimaging data. Analysis of this data can facilitate research into neurodegenerative diseases.
Neuroimaging datasets are generally processed through Neuroimaging pipelines
CIVET produces 1100% more data than it consumes, and intermediate data usage is more than 4000%. Without optimisation, the runtime of a single workflow is 8 hours.
CIVET Pipeline: 85% of all tasks in CIVET execute in less than 512 seconds.
CIVET Pipeline: These 85% of tasks in CIVET perform just 8% of the computation.
Existing Approaches: State-of-the-art approaches for workflow planning include data-based methods (data elimination, data diffusion), task-based approaches (task clustering), and scheduling-based approaches.
Task Clustering: [Figure: normalised workflow turnaround time for CIVET, with respect to standard CIVET on an SGE cluster]
Task Clustering: [Figure: normalised cumulative data retrieval for CIVET, with respect to standard CIVET on an SGE cluster]
What are the issues? Different clustering strategies work for different types of workflows. A specific automated horizontal task clustering strategy created a computationally efficient workflow in this case.
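As a rough illustration of what horizontal task clustering means, the sketch below groups tasks that sit at the same depth of a workflow graph into clusters of a fixed size. It is not the strategy used for CIVET in the talk: the function names, the toy dependency graph, and the fixed `cluster_size` parameter are all assumptions made for the example.

```python
from collections import defaultdict


def task_levels(deps):
    """Return the depth of each task: 0 for tasks without parents."""
    levels = {}

    def depth(task):
        if task not in levels:
            parents = deps.get(task, [])
            levels[task] = 0 if not parents else 1 + max(depth(p) for p in parents)
        return levels[task]

    for task in deps:
        depth(task)
    return levels


def horizontal_clusters(deps, cluster_size):
    """Group tasks of the same depth into clusters of at most cluster_size."""
    by_level = defaultdict(list)
    for task, level in task_levels(deps).items():
        by_level[level].append(task)

    clusters = []
    for level in sorted(by_level):
        tasks = sorted(by_level[level])
        for i in range(0, len(tasks), cluster_size):
            clusters.append(tasks[i:i + cluster_size])
    return clusters


# Hypothetical toy workflow (task -> list of parent tasks), not CIVET itself.
deps = {
    "a": [],
    "b1": ["a"], "b2": ["a"], "b3": ["a"],
    "c": ["b1", "b2", "b3"],
}
clusters = horizontal_clusters(deps, cluster_size=2)
```

Each cluster would be submitted as a single job, so fewer, larger clusters reduce scheduling overhead at the cost of parallelism. That trade-off is why one cluster size does not suit every workflow.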
What are the issues? [Diagram: a spectrum from fine-grained tasks with a low level of data interdependencies to coarse-grained tasks with a high level of data interdependencies; labels: "Higher Data Affinity", "More Coarse-Grained Tasks"]
What are the issues? Creating an efficient workflow plan involves consideration of several trade-offs. Various parameters need to be optimised: data efficiency, scheduling latency, workflow turn-around time, and network latencies. Hence workflow planning is a multi-dimensional optimisation problem.
This paper proposes an initial single-objective, genetic-algorithm-based workflow planning approach.
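To make the idea concrete, here is a minimal sketch of a single-objective genetic algorithm that searches for a task-to-job assignment with a low estimated turnaround time. It is not the authors' planner. The task runtimes, the per-job scheduling latency, the limit on jobs, and the cost model in `turnaround` are all hypothetical. In a real planner such estimates would presumably come from provenance data.

```python
import random

RUNTIMES = [5, 7, 3, 40, 6, 4, 35, 8]  # hypothetical task runtimes (seconds)
LATENCY = 10                            # hypothetical scheduling latency per job
MAX_JOBS = 4                            # hypothetical upper bound on jobs


def turnaround(plan):
    """Toy cost model: each job pays a scheduling latency, then jobs run in parallel."""
    load = {}
    for task, job in enumerate(plan):
        load[job] = load.get(job, 0) + RUNTIMES[task]
    return LATENCY * len(load) + max(load.values())


def tournament(pop, rng, k=3):
    """Pick the plan with the lowest cost among k random candidates."""
    return min(rng.sample(pop, k), key=turnaround)


def crossover(a, b, rng):
    """One-point crossover of two plans."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(plan, rng, rate=0.1):
    """Reassign each task to a random job with probability rate."""
    return [rng.randrange(MAX_JOBS) if rng.random() < rate else g for g in plan]


def evolve(generations=50, pop_size=30, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randrange(MAX_JOBS) for _ in RUNTIMES] for _ in range(pop_size)]
    history = [min(map(turnaround, pop))]
    for _ in range(generations):
        elite = min(pop, key=turnaround)  # elitism: the best plan always survives
        children = [
            mutate(crossover(tournament(pop, rng), tournament(pop, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        pop = [elite] + children
        history.append(turnaround(min(pop, key=turnaround)))
    return min(pop, key=turnaround), history


best_plan, history = evolve()
```

Because the best plan of each generation is carried over unchanged, the best cost recorded in `history` never increases from one generation to the next.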
[Diagram: planner architecture with the components Provenance Data, Fitness Calculation, Selection, Genetic Operators, Pipeline Service, and Planner]
Implementation of the Approach: The workflow planning approach will first be simulated in SimGrid. Various parameters of the planning approach will be tuned and evaluated. Type of selection producing the quickest convergence towards efficiency. Extending fitness functions for multiple objectives.
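The slide does not say which selection types will be compared. As an illustration only, the sketch below shows two common candidates, tournament selection and roulette-wheel selection, written for a cost that should be minimised. The function names, the inverse-cost weighting, and the toy population are assumptions made for the example.

```python
import random


def tournament_select(population, cost, rng, k=3):
    """Return the lowest-cost individual among k randomly drawn candidates."""
    return min(rng.sample(population, k), key=cost)


def roulette_select(population, cost, rng):
    """Draw one individual with probability proportional to 1 / (1 + cost)."""
    weights = [1.0 / (1.0 + cost(p)) for p in population]
    return rng.choices(population, weights=weights, k=1)[0]


# Hypothetical population in which each individual is simply its own cost.
population = [3, 9, 1, 7, 5, 2]


def identity_cost(x):
    return x


rng = random.Random(42)
tournament_picks = [tournament_select(population, identity_cost, rng) for _ in range(1000)]
roulette_picks = [roulette_select(population, identity_cost, rng) for _ in range(1000)]

# Share of draws that returned the best individual (cost 1) under each scheme.
tournament_share = tournament_picks.count(1) / 1000
roulette_share = roulette_picks.count(1) / 1000
```

Comparing the two shares gives a feel for selection pressure: a scheme that picks the best individual more often tends to converge faster but risks losing diversity early.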
Conclusion: Several workflow planning techniques exist; however, prior knowledge about the nature of the workflow is required to select an appropriate technique. This paper proposes a single-objective evolutionary workflow planning approach to optimise workflow turn-around times. The approach will first be implemented in a SimGrid environment, and results will be shared in future publications.