SimCity Building Blocks at the DICE team


Marian Bubak, Bartosz Baliś, Marek Kasztelnik, Maciej Malawski, Piotr Nowakowski
{bubak, balis, malawski}@agh.edu.pl, {m.kasztelnik, p.nowakowski}@cyfronet.pl
Department of Computer Science and ACC Cyfronet AGH, Krakow, PL
dice.cyfronet.pl

Example application: Flood threat assessment
- Scenario: a levee-protected area is endangered by flooding due to high water levels
- The user selects an area for flood threat assessment

Flood threat assessment: our solution

Data sets
- GIS data (levees, sensor locations)
- Sensor data
- Simulated sensor data
- Computation results
- Lots of metadata

(Virtual) Experiments
Support for conducting virtual experiments:
- Flooding a reservoir protected by an artificial levee
- Observation of a levee during heavy rain
- Simulation of a flooding scenario
Experiment lifecycle:
- Creating a new experiment and defining its context
- Collecting information during the experiment
- Concluding the experiment
- Reusing experiment results in future experiments

HyperFlow: programming and execution of workflow-based scientific applications
- Innovative programming approach and enactment engine for scientific workflows
- Combines declarative workflow description with low-level programming in JavaScript / node.js for implementing workflow activities
- Simple and concise syntax + mainstream scripting language & runtime platform = increased programming productivity
- Based on a formal model of computation (Process Networks)
- Supports a rich set of complex workflow patterns

<workflow.json>
{
  "processes": [ {
    "name": "Sqr",
    "function": "sqr",
    "type": "dataflow",
    "parlevel": 0, // level of parallelism (unlimited)
    "ordering": true, // ordering of results
    "ins": [ "number" ],
    "outs": [ "square" ]
  }, {
    "name": "Sum",
    "function": "sum",
    "ins": [ "square:3" ],
    "outs": [ "sum" ]
  } ],
  "signals": [ {
    "name": "number",
    "data": [ 1, 2, 3, 4, 5, 6 ]
  }, {
    "name": "square"
  }, {
    "name": "sum"
  } ],
  "ins": [ "number" ],
  "outs": [ "sum" ]
}

<functions.js>
// Squares the single number received on the "number" input.
function sqr(ins, outs, config, cb) {
    var n = Number(ins.number.data[0]);
    outs.square.data = [n * n];
    cb(null, outs);
}

// Sums the squares received on the "square" input (three per firing).
function sum(ins, outs, config, cb) {
    var sum = 0.0;
    ins.square.data.forEach(function (n) { sum += n; });
    outs[0].data = [ sum ];
    cb(null, outs);
}

B. Baliś, Increasing Scientific Workflow Programming Productivity with HyperFlow. In Proceedings of the 9th Workshop on Workflows in Support of Large-Scale Science. 2015 (In Print).
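To make the dataflow semantics of the Sqr/Sum example more concrete, the sketch below drives two activity functions by hand in plain JavaScript. It is not the HyperFlow engine: the `runSqrSum` driver, the way the `ins`/`outs` objects are built, and addressing the output as `outs.sum` are assumptions made for illustration. Only the `(ins, outs, config, cb)` signature and the grouping of three squares per sum (`"square:3"`) are taken from the slide.

```javascript
// Activity functions with the (ins, outs, config, cb) signature shown on the slide.
function sqr(ins, outs, config, cb) {
  const n = Number(ins.number.data[0]);
  outs.square.data = [n * n];
  cb(null, outs);
}

function sum(ins, outs, config, cb) {
  let total = 0;
  ins.square.data.forEach((n) => { total += n; });
  outs.sum.data = [total];
  cb(null, outs);
}

// Hypothetical hand-written driver (NOT the HyperFlow engine): fires Sqr once
// per input number, then fires Sum once for every `groupSize` squares, in order.
function runSqrSum(numbers, groupSize) {
  const squares = [];
  numbers.forEach((n) => {
    sqr({ number: { data: [n] } }, { square: {} }, {}, (err, outs) => {
      if (err) throw err;
      squares.push(outs.square.data[0]);
    });
  });

  const sums = [];
  for (let i = 0; i + groupSize <= squares.length; i += groupSize) {
    const ins = { square: { data: squares.slice(i, i + groupSize) } };
    sum(ins, { sum: {} }, {}, (err, outs) => {
      if (err) throw err;
      sums.push(outs.sum.data[0]);
    });
  }
  return sums;
}

console.log(runSqrSum([1, 2, 3, 4, 5, 6], 3)); // [ 14, 77 ]
```

With the six input numbers from the slide, Sqr fires six times and Sum fires twice, once for 1 + 4 + 9 and once for 16 + 25 + 36.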

SimCity Requirements for “CIS”
- Start a simulation based on user input.
- Let an automated component start workflows within CIS with new parameter sets, and receive results asynchronously.
- The workflow generator is a kind of automated component that generates and starts workflows.
- Results are received asynchronously by the user interface.
- The user specifies “the what” (in this case, which area should be evaluated); “the how” is generated automatically by the system.
- All functions should be accessible from a user-friendly interface that is not concerned with how something is computed but with what is computed.
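The asynchronous pattern described in this requirement could look roughly like the sketch below. Everything in it is hypothetical: `startWorkflow` is a stand-in that fakes a result with a timer instead of contacting a workflow engine, and the parameter names `waterLevel` and `duration` are invented for the example.

```javascript
// Hypothetical stand-in for submitting a workflow. A real system would call
// the workflow engine here; this version resolves asynchronously with a fake result.
function startWorkflow(params) {
  return new Promise((resolve) => {
    setTimeout(() => resolve({ params, score: params.waterLevel * params.duration }), 0);
  });
}

// Workflow generator: starts one workflow per parameter set without waiting,
// and hands each result to `onResult` as soon as it arrives.
function generateWorkflows(parameterSets, onResult) {
  const pending = parameterSets.map((params) =>
    startWorkflow(params).then((result) => {
      onResult(result);
      return result;
    })
  );
  return Promise.all(pending); // resolves once every workflow has reported
}

const parameterSets = [
  { waterLevel: 2, duration: 6 },
  { waterLevel: 3, duration: 12 },
];

generateWorkflows(parameterSets, (r) => console.log('result received:', r.score));
```

The caller only states which parameter sets to evaluate; how each workflow runs stays hidden behind `startWorkflow`.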

SimCity Requirements for “CIS”
- Start an external component (for parameter exploration) via an external (custom) user interface.
- Ideally, parameter exploration is exposed as a web service, so different algorithms can be started from a custom web interface.
- The parameter exploration algorithm should be modifiable or at least selectable by the user.
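One possible way to make the exploration algorithm selectable is a small registry of strategies, sketched below. The two strategies (`gridSweep`, `oneAtATime`), the registry keys, and the parameter names are assumptions chosen for illustration; the slides do not name specific algorithms.

```javascript
// Full grid sweep: every combination of the listed values.
function gridSweep(space) {
  return Object.entries(space).reduce(
    (sets, [name, values]) =>
      sets.flatMap((set) => values.map((v) => ({ ...set, [name]: v }))),
    [{}]
  );
}

// One-at-a-time: vary each parameter while the others stay at their first value.
function oneAtATime(space) {
  const base = Object.fromEntries(
    Object.entries(space).map(([name, values]) => [name, values[0]])
  );
  const sets = [base];
  for (const [name, values] of Object.entries(space)) {
    values.slice(1).forEach((v) => sets.push({ ...base, [name]: v }));
  }
  return sets;
}

// Registry from which the user (or a web service request) selects an algorithm.
const explorers = { grid: gridSweep, 'one-at-a-time': oneAtATime };

function explore(algorithm, space) {
  const explorer = explorers[algorithm];
  if (!explorer) throw new Error(`Unknown exploration algorithm: ${algorithm}`);
  return explorer(space);
}

const space = { demand: [100, 200, 300], speedLimit: [30, 50] };
console.log(explore('grid', space).length);          // 6
console.log(explore('one-at-a-time', space).length); // 4
```

A web service front end would only need to pass the algorithm name and the parameter space to `explore`.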

SimCity Requirements for “CIS”
- Stage data to and from clusters or cloud infrastructure (virtual machines).
- For each job, the input and output files should end up at the correct places.
- Construct input files based on parameter sets.
- TRANSIMS, for example, works with control files to determine the parameters, but also with data files with parameters. Both would have to be editable from CIS.
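Constructing an input file from a parameter set and deciding where each job's files belong could be sketched as follows. The file layout (`jobs/<id>/control.ctl`, `results.out`) and the parameter keys are placeholders invented for this example; they are not actual TRANSIMS control keys or CIS paths.

```javascript
// Renders a parameter set as "KEY  value" lines, one per parameter.
// The keys used below are placeholders, not actual TRANSIMS control keys.
function renderControlFile(params) {
  return Object.keys(params)
    .sort()
    .map((key) => `${key.toUpperCase().padEnd(24)}${params[key]}`)
    .join('\n') + '\n';
}

// Plans where the generated input and the expected output of a job should live.
// The directory layout is a hypothetical convention.
function planStaging(jobId, params) {
  const workDir = `jobs/${jobId}`;
  return {
    inputs: [{ path: `${workDir}/control.ctl`, content: renderControlFile(params) }],
    outputs: [`${workDir}/results.out`],
  };
}

const plan = planStaging('run-001', { random_seed: 42, time_steps: 3600 });
console.log(plan.inputs[0].path);
console.log(plan.inputs[0].content);
```

Keeping the staging plan as plain data makes it possible to hand the same description to a cluster or a cloud back end.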

SimCity Requirements for “CIS”
- Automatically schedule workflows on cluster or cloud infrastructure.
- Preferably, the user should be able to select their own cluster.
- We have the Optimizer/Scheduler for this.

SimCity Requirements for “CIS”
- Provide feedback to a component based on user input. Parameter exploration may be guided by the user. In this case, ideally, a web service is provided for real-time interaction with the parameter exploration component.
- CIS gives feedback to a component based on sensor data, which may then start new workflows.
- Towards manual steering:
  - Workflow processes run continuously
  - Workflow processes expose a REST API which could be invoked from the User Interface
  - A message queue could be used for “steering” the component
- Towards automated steering:
  - We could add another component to perform the automated decision making
  - We’re planning something similar in ISMOP, e.g. automatically increasing the sensor data collection frequency based on current values
  - New workflows can certainly be generated and started automatically, without user intervention
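The two steering modes can be illustrated with a small sketch. It assumes a hypothetical rule that shortens the sensor sampling interval as the water level approaches the levee crest, and it uses a plain array in place of a real message queue. The thresholds, command names, and class name are all invented for the example.

```javascript
// Automated steering rule (illustrative thresholds): sample more often as the
// measured water level approaches the levee crest.
function nextSamplingIntervalSec(waterLevelM, crestM) {
  const ratio = waterLevelM / crestM;
  if (ratio >= 0.9) return 10;   // critical: sample every 10 s
  if (ratio >= 0.7) return 60;   // elevated: sample every minute
  return 600;                    // normal: sample every 10 minutes
}

// Manual steering: a running component drains a queue of user commands
// between processing steps. A plain array stands in for a message queue.
class SteerableComponent {
  constructor() {
    this.queue = [];
    this.intervalSec = 600;
    this.running = true;
  }

  steer(command) {
    this.queue.push(command);
  }

  step(waterLevelM, crestM) {
    // Automated decision first, then queued user commands override it.
    this.intervalSec = nextSamplingIntervalSec(waterLevelM, crestM);
    while (this.queue.length > 0) {
      const cmd = this.queue.shift();
      if (cmd.type === 'set-interval') this.intervalSec = cmd.seconds;
      if (cmd.type === 'stop') this.running = false;
    }
    return this.intervalSec;
  }
}

const component = new SteerableComponent();
console.log(component.step(3.0, 5.0)); // 600
console.log(component.step(4.6, 5.0)); // 10
component.steer({ type: 'set-interval', seconds: 5 });
console.log(component.step(3.0, 5.0)); // 5
```

In a deployed system the `steer` method would sit behind the REST API or message queue mentioned on the slide.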

Hybrid cloud as a means of provisioning computing power for virtual experiments – the Atmosphere framework
- Cloud Management Portlets on the GUI host (provisions end-user features and access options): provide GUI elements which enable service developers and end users to interact with the Atmosphere platform and create/deploy services on the available cloud resources
- Atmosphere Core Services Host: user accounts; Atmosphere Registry (AIR); available cloud sites; services and templates
- Atmosphere Core: secure RESTful API (Cloud Facade); authentication and authorization logic; communication with underlying computational clouds; launching and monitoring service instances; creating new service templates; billing and accounting; logging and administrative services
- OpenStack cloud site at ACC CYFRONET AGH: head node, worker nodes, image store; 96 CPU cores, 184 GB RAM, 4 TB storage, private IP space
- VPH-Share cloud site at UNIVIE: head node, worker nodes with large resource pools ("fat nodes"), image store; 128 CPU cores, 256 GB RAM, 4 TB storage, private IP space
- Amazon Elastic Compute Cloud (EC2), European availability zone: API host, worker nodes, image store; massive (functionally limitless) hardware resource pool, public IP space

Atmosphere platform interfaces
- End user: a full range of user-friendly GUIs is provided to enable service creation, instantiation and access. A comprehensive online user guide is also available. The GUIs work by invoking a secure RESTful API which is exposed by the Atmosphere host. We refer to this API as the Cloud Facade.
- Application or workflow environment: any operation which can be performed using the GUI may also be invoked programmatically by tools acting on behalf of the platform user. This includes standalone applications and workflow management environments (which VPH-Share also provides).
- Components shown in the diagram: Atmosphere Registry (AIR); Atmosphere Ruby on Rails controller layer (core Atmosphere logic); cloud sites.
- All operations on cloud hardware are abstracted by the Atmosphere platform, which exposes a unified RESTful API (with a suitable set of developer’s documentation available).
- For end users, the API is concealed by a layer of platform GUIs embedded in the VPH-Share portal, providing a user-friendly work environment for domain scientists and service developers alike.
- The API can also be directly invoked by external services as long as they possess the required security credentials (Atmosphere relies on the well-known OpenID authentication standard).

Shared and scalable services – smart utilization of hardware resources
- Published: published services become visible to non-developers and can be instantiated using the Generic Invoker. Developers are free to spawn "snapshot" images of their cloud services (e.g. for backup purposes) without exposing them to external users.
- Shared: a shared service is backed by a single virtual machine (a shared VM on a cloud worker node) which "mimics" multiple instances from the users’ point of view. Shared services greatly conserve hardware resources and can be instantiated quickly.
- Scalable: when a scalable service is overloaded with requests, Atmosphere can spawn additional instances (separate VMs) in the cloud to handle the additional load. The process is transparent from the user’s perspective.
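The scaling decision for a scalable service might be approximated as in the sketch below. This is not Atmosphere's actual policy: the load metric (requests per second), the per-instance capacity, and the instance cap are assumptions introduced for illustration.

```javascript
// Number of instances needed so each handles at most `capacityPerInstance`
// requests per second, clamped to [1, maxInstances]. All numbers are illustrative.
function desiredInstances(requestsPerSec, capacityPerInstance, maxInstances) {
  const needed = Math.ceil(requestsPerSec / capacityPerInstance);
  return Math.min(Math.max(needed, 1), maxInstances);
}

// Compares the current and desired instance counts and reports the action to take.
function scalingAction(current, requestsPerSec, capacityPerInstance, maxInstances) {
  const target = desiredInstances(requestsPerSec, capacityPerInstance, maxInstances);
  if (target > current) return { action: 'spawn', count: target - current };
  if (target < current) return { action: 'terminate', count: current - target };
  return { action: 'none', count: 0 };
}

console.log(scalingAction(1, 250, 100, 5)); // { action: 'spawn', count: 2 }
```

Because the decision is made by the platform, the user keeps addressing a single service while the number of VMs behind it changes.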

More information about the hybrid computational cloud platform
- A more detailed introduction to the Atmosphere cloud platform (including user manuals) can be found at https://vph.cyfronet.pl/tutorial
- The DIstributed Computing Environments (DICE) team homepage at http://dice.cyfronet.pl has information on projects which use Atmosphere for cloud resource provisioning

Cost optimization of applications on clouds
- Infrastructure model
  - Multiple compute and storage clouds
  - Heterogeneous instance types
- Application model
  - Bag of tasks
  - Multi-level workflows
- Modeling with AMPL (A Modeling Language for Mathematical Programming) and CMPL
- Cost optimization under deadline constraints
- Mixed integer programming
- Bonmin, Cplex solvers

M. Malawski, K. Figiela, J. Nabrzyski: Cost minimization for computational applications on hybrid cloud infrastructures, Future Generation Computer Systems, Volume 29, Issue 7, September 2013, Pages 1786-1794, ISSN 0167-739X, http://dx.doi.org/10.1016/j.future.2013.01.004
M. Malawski, K. Figiela, M. Bubak, E. Deelman, J. Nabrzyski: Cost Optimization of Execution of Multi-level Deadline-Constrained Scientific Workflows on Clouds. PPAM (1) 2013: 251-260, http://dx.doi.org/10.1007/978-3-642-55224-3_24
https://github.com/kfigiela/optimization-models
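The idea of minimizing cost under a deadline can be shown on a toy bag-of-tasks problem. The sketch below replaces the mixed integer programming models (AMPL/CMPL with Bonmin or Cplex) with a plain exhaustive search, and it assumes identical tasks, a single instance type per plan, and hourly billing. The instance names, prices, and speeds are made up.

```javascript
// Exhaustive search over instance type and VM count for a bag of identical tasks.
// This is a brute-force stand-in, not the mixed integer programming model.
function cheapestPlan(taskCount, deadlineHours, instanceTypes, maxVms) {
  let best = null;
  for (const type of instanceTypes) {
    for (let vms = 1; vms <= maxVms; vms++) {
      const tasksPerVm = Math.ceil(taskCount / vms);
      const makespan = tasksPerVm / type.tasksPerHour;            // hours
      if (makespan > deadlineHours) continue;                     // misses the deadline
      const cost = Math.ceil(makespan) * vms * type.pricePerHour; // hourly billing
      if (best === null || cost < best.cost) {
        best = { type: type.name, vms, makespan, cost };
      }
    }
  }
  return best; // null when no plan meets the deadline
}

// Hypothetical instance types: price per hour and throughput in tasks per hour.
const instanceTypes = [
  { name: 'small', pricePerHour: 1, tasksPerHour: 10 },
  { name: 'large', pricePerHour: 3, tasksPerHour: 40 },
];

console.log(cheapestPlan(100, 5, instanceTypes, 10));
// { type: 'large', vms: 1, makespan: 2.5, cost: 9 }
```

In this example a single faster instance is cheaper than any combination of slow ones, because hourly billing rounds every VM's runtime up to whole hours.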

Simulation and scheduling of large-scale scientific workflows on IaaS clouds
- Large-scale scientific workflows from the Pegasus workflow management system
- Workflows of 100,000 tasks
- Workflow ensembles: schedule as many workflows as possible within a budget and deadline
- Cloud infrastructure simulated using CloudSim

M. Malawski, G. Juve, E. Deelman, J. Nabrzyski: Cost- and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds. SC 2012: 22
https://github.com/malawski/cloudworkflowsimulator
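The ensemble goal of completing as many workflows as possible within a budget and deadline can be illustrated with a simple greedy admission rule. This is not one of the algorithms from the cited paper: it assumes each workflow has a known cost and runtime estimate and runs on its own resources, so only the budget is shared. The workflow names and numbers are invented.

```javascript
// Greedy admission: cheapest workflows first, skipping any whose estimated
// runtime exceeds the deadline, and stopping once the budget is exhausted.
function admitWorkflows(workflows, budget, deadlineHours) {
  const admitted = [];
  let remaining = budget;
  const byCost = [...workflows].sort((a, b) => a.cost - b.cost);
  for (const wf of byCost) {
    if (wf.runtimeHours > deadlineHours) continue; // cannot finish in time
    if (wf.cost > remaining) break;                // later ones cost at least as much
    admitted.push(wf.name);
    remaining -= wf.cost;
  }
  return { admitted, spent: budget - remaining };
}

// Hypothetical ensemble with per-workflow cost and runtime estimates.
const ensemble = [
  { name: 'wf-a', cost: 40, runtimeHours: 3 },
  { name: 'wf-b', cost: 10, runtimeHours: 2 },
  { name: 'wf-c', cost: 25, runtimeHours: 9 },
  { name: 'wf-d', cost: 30, runtimeHours: 4 },
];

console.log(admitWorkflows(ensemble, 60, 5));
// { admitted: [ 'wf-b', 'wf-d' ], spent: 40 }
```

Sorting by cost maximizes the number of admitted workflows under these simplifying assumptions; it ignores workflow priorities and shared resources.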

Cloud performance evaluation
- Performance of VM deployment times
- Virtualization overhead
- Evaluation of open source cloud stacks (Eucalyptus, OpenNebula, OpenStack)
- Survey of European public cloud providers
- Performance evaluation of top cloud providers (EC2, RackSpace, SoftLayer)
- A grant from Amazon has been obtained

M. Bubak, M. Kasztelnik, M. Malawski, J. Meizner, P. Nowakowski and S. Varma: Evaluation of Cloud Providers for VPH Applications, poster at CCGrid2013 - 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Delft, the Netherlands, May 13-16, 2013

DICE team - http://dice.cyfronet.pl
Main research interests:
- investigation of methods for building complex scientific collaborative applications and large-scale distributed computing infrastructures
- elaboration of environments and tools for e-Science
- development of a knowledge-based approach to services, components, and their semantic composition and integration

- CrossGrid (2002-2005): interactive compute- and data-intensive applications
- K-Wf Grid (2004-2007): knowledge-based composition of grid workflow applications
- CoreGRID (2004-2008): problem solving environments, programming models
- GREDIA (2006-2009): grid platform for media and banking applications
- ViroLab: GridSpace virtual laboratory
- PLGrid series (2009-2015): advanced virtual laboratory, DataNet
- gSLM (2010-2012): service level management for grids and clouds
- UrbanFlood (2009-2012): Common Information Space for Early Warning Systems
- MAPPER (2010-2013): computational strategies, software and services for distributed multiscale simulations
- VPH-Share (2011-2015): federating cloud resources for development and execution of VPH compute- and data-intensive applications
- Collage (2011-2013): Executable Papers; 1st award in the Elsevier Grand Challenge competition at ICCS2011
- ISMOP (2013-2016): cloud resource management and optimization, big data storage and analysis tools
- PaaSage: federating cloud resources, workflow composition, optimization of cloud resources, porting existing applications to cloud infrastructure