Parameter Sweep and Resources Scaling Automation in Scalarm Data Farming Platform J. Liput, M. Paciorek, M. Wrona, M. Orzechowski, R. Slota, and J. Kitowski.

Presentation transcript:

Parameter Sweep and Resources Scaling Automation in Scalarm Data Farming Platform J. Liput, M. Paciorek, M. Wrona, M. Orzechowski, R. Slota, and J. Kitowski ACC Cyfronet AGH Department of Computer Science, AGH UST CGW’15 Krakow, Poland October 2015

Agenda: data processing in modern science; problem description; Scalarm overview; Scalarm approach to automation; results; conclusions and future work.

Data processing in modern science: Scientific research methods often rely on executing numerous simulations, each with different input parameter values. One such approach is called data farming. [Diagram labels: initial question; simulation idea; creation of simulation model; implementation of simulation; experiment plan; experiment creation by input space definition; running and monitoring of HPC resources (cloud, grid, private); results collection; analysis and visualisation; response or change; change of input space.]
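The idea of running one simulation per combination of input parameter values can be sketched as a full-factorial parameter sweep. This is only an illustration: the parameter names and values below are hypothetical, and the slides do not say how Scalarm represents an input space internally.

```python
from itertools import product


def parameter_sweep(space):
    """Yield one dict of parameter values per point of a full-factorial input space."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical input space, for illustration only.
input_space = {
    "temperature": [300, 350, 400],
    "pressure": [1.0, 2.0],
}

points = list(parameter_sweep(input_space))
print(len(points))  # 3 * 2 = 6 simulation runs
print(points[0])    # {'temperature': 300, 'pressure': 1.0}
```

Each generated point corresponds to one simulation run, so the number of runs grows multiplicatively with every added parameter or value.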


Problem description: Data science computation often requires adjusting the input space according to collected partial results. Resources must also be managed according to changing computational power requirements. [Diagram: the data farming process shown on the preceding slide.]

Scalarm overview: Scalarm is a platform for data farming that allows users to execute experiments in a convenient way. It provides unified management of heterogeneous resources, partial results analysis with numerous methods, and input space extension during an experiment. [Diagram labels: user; input space; extension; execution; computational resources; results; statistics; adjustments.]

Scalarm approach to automation: Two levels of experiment automation: input space adjustment and resources management. [Diagram labels: Level 1; Level 2; user; input space; extension; execution; computational resources; results; results analysis; waiting for results; extension needed; input space extension; statistics; adjustments; resources management loop.]

Input space adjustment: Input space management algorithm: input space extension; analysis of requested data; dedicated simulated annealing algorithm. [Diagram: the two-level automation scheme shown on the preceding slide.]
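To make the role of simulated annealing concrete, here is a minimal textbook sketch in which every proposed point becomes a new simulation, that is, an extension of the input space. It is not Scalarm's dedicated algorithm, whose details the slides do not give. The one-dimensional input, the function `run_simulation`, the cooling schedule, and all default values are assumptions chosen for illustration.

```python
import math
import random


def simulated_annealing(run_simulation, start, step,
                        iterations=200, t0=1.0, cooling=0.98, seed=0):
    """Minimise run_simulation(x) over a 1-D input; return (best, best_cost, evaluated)."""
    rng = random.Random(seed)
    current, current_cost = start, run_simulation(start)
    best, best_cost = current, current_cost
    evaluated = [start]          # every point here is one executed simulation
    temperature = t0
    for _ in range(iterations):
        # Extend the input space with a neighbour of the current point.
        candidate = current + rng.uniform(-step, step)
        cost = run_simulation(candidate)
        evaluated.append(candidate)
        delta = cost - current_cost
        # Metropolis criterion: always accept improvements, sometimes accept worse points.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, cost
        if current_cost < best_cost:
            best, best_cost = current, current_cost
        temperature *= cooling   # geometric cooling (an assumed schedule)
    return best, best_cost, evaluated


# Stand-in for a real simulation: a quadratic with its minimum at x = 3.
best, best_cost, evaluated = simulated_annealing(
    lambda x: (x - 3.0) ** 2, start=0.0, step=0.5)
print(len(evaluated))  # 201 simulations: the start point plus one per iteration
```

Because each step depends on the result of the previous one, the algorithm must wait for results before it can extend the input space further.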

Resources management: Resources management algorithm: pulling metrics about the current state; metrics analysis; increasing or decreasing the number of workers. [Diagram: the two-level automation scheme, with the resources management loop.]
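The three steps (pull metrics, analyse them, change the number of workers) can be sketched as a single decision function. This is a hypothetical rule, not the algorithm used in Scalarm: it assumes that throughput scales linearly with the number of workers, and the name `desired_workers` and the pool limits are invented for the example.

```python
import math


def desired_workers(workers, system_throughput, target_throughput,
                    min_workers=1, max_workers=100):
    """Return the worker count expected to reach the target throughput.

    Assumes throughput grows linearly with the number of workers.
    """
    if workers <= 0 or system_throughput <= 0:
        # No usable measurement yet: keep the current pool, but not below the minimum.
        return max(min_workers, workers)
    needed = math.ceil(target_throughput * workers / system_throughput)
    return max(min_workers, min(max_workers, needed))


# 10 workers deliver 0.5 simulations/s but 1.0 simulations/s is required: scale up.
print(desired_workers(10, 0.5, 1.0))  # 20
# 10 workers deliver 2.0 simulations/s but only 1.0 is required: scale down.
print(desired_workers(10, 2.0, 1.0))  # 5
```

A management loop would call such a function periodically and start or stop workers by the difference between the returned value and the current pool size.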

Resources management metrics: Metrics used during resources management: workers throughput; system throughput; target throughput; makespan [time].
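The slide names the metrics without defining them. One plausible set of definitions is sketched below. All formulas are assumptions: system throughput as completed simulations per second, workers throughput as the per-worker share of it, target throughput as the rate needed to finish the remaining simulations before a deadline, and makespan as the projected total run time at the current rate.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    worker_throughput: float   # simulations per second per worker
    system_throughput: float   # simulations per second, all workers together
    target_throughput: float   # rate needed to finish before the deadline
    makespan: float            # projected total run time in seconds


def compute_metrics(done, remaining, workers, elapsed_s, deadline_s):
    """Derive the four metrics from simple counters (assumed definitions)."""
    system = done / elapsed_s if elapsed_s > 0 else 0.0
    per_worker = system / workers if workers > 0 else 0.0
    time_left = deadline_s - elapsed_s
    target = remaining / time_left if time_left > 0 else float("inf")
    makespan = elapsed_s + remaining / system if system > 0 else float("inf")
    return Metrics(per_worker, system, target, makespan)


# Hypothetical snapshot: 150 simulations done and 150 left, 4 workers,
# 300 s elapsed out of a 600 s limit.
snapshot = compute_metrics(done=150, remaining=150, workers=4,
                           elapsed_s=300, deadline_s=600)
print(snapshot)
```

In this snapshot the system throughput equals the target throughput, so the projected makespan matches the time limit and no scaling would be needed.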

Evaluation.
Test 1: automated input space extension evaluation. Input space controlled by the simulated annealing algorithm; fixed simulation execution time of 10 seconds; fixed number of workers: 10.
Test 2: automated resources management evaluation. Resources controlled by our resources management algorithm; fixed simulation execution time of 20 seconds; manual input space extensions; experiment execution time constrained to 10 minutes.
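The fixed values in both tests allow a quick back-of-the-envelope check of ideal throughput and worker demand. The sketch assumes one simulation per worker at a time and no scheduling or start-up overhead. The figure of 300 simulations is hypothetical, since the slides do not state the size of the input space.

```python
import math


def ideal_throughput(workers, sim_time_s):
    """Upper bound on simulations per second with no overhead."""
    return workers / sim_time_s


def workers_needed(simulations, sim_time_s, time_limit_s):
    """Fewest workers that can finish all simulations within the time limit."""
    per_worker = time_limit_s // sim_time_s   # whole simulations one worker can finish
    return math.ceil(simulations / per_worker)


# Test 1 setup: 10 workers, 10 s per simulation.
print(ideal_throughput(10, 10))      # 1.0 simulation per second at best

# Test 2 setup: 20 s per simulation, 10-minute limit, so 30 simulations per worker.
print(workers_needed(300, 20, 600))  # 10 workers for a hypothetical 300 simulations
```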

Automated input space extensions

Automated resources management

Conclusions: There are two levels of automation: input space adjustments and resources management. The plugin-based architecture allows easy extension with new algorithms. Integrating these levels of automation is challenging: automated input space extension requires all simulations from a 'bundle' to be calculated before the next bundle is scheduled, and the resources management algorithm must take into account that the input space is extended by a bundle of simulations at a time.
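The bundle constraint can be illustrated with a small sketch in which the next bundle is proposed only after every simulation of the current bundle has finished. The thread pool stands in for Scalarm workers, and `simulate`, `next_bundle`, and the bundle contents are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_bundles(simulate, first_bundle, next_bundle, workers=4, max_bundles=5):
    """Run bundles one after another; return a list of (point, result) pairs."""
    results = []
    bundle = list(first_bundle)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(max_bundles):
            if not bundle:
                break
            # Barrier: list() blocks until the whole bundle has been simulated.
            outputs = list(pool.map(simulate, bundle))
            results.extend(zip(bundle, outputs))
            # Only now can the input space be extended with the next bundle.
            bundle = next_bundle(results)
    return results


def propose(results):
    """Hypothetical extension rule: shift the last bundle by 3, stop after 9 points."""
    if len(results) >= 9:
        return []
    return [point + 3 for point, _ in results[-3:]]


results = run_in_bundles(lambda x: x * x, [1, 2, 3], propose)
print(results[-1])  # (9, 81)
```

Near the end of each bundle some workers sit idle while the last simulations finish, which is one reason the resources management algorithm has to be aware of bundles.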

Future Work: a resources management algorithm better suited to data farming experiments; predicting the number of simulations yet to be scheduled, based on available data; extension of the metrics; additional dedicated input space management algorithms, e.g. a genetic algorithm.