Environment for Management of Experiments on the Grid Master of Science Thesis AGH University of Science and Technology, Krakow, Poland Faculty of Electrical.

Presentation transcript:

Environment for Management of Experiments on the Grid Master of Science Thesis AGH University of Science and Technology, Krakow, Poland Faculty of Electrical Engineering, Automatics, Computer Science and Electronics Institute of Computer Science Paweł Charkowski Supervisor: dr inż. Marian Bubak Consultancy: dr inż. Maciej Malawski

Outline
Goals of the thesis
Introduction to the ViroLab
Overview of related works (DIANE, Nimrod, Askalon, Zenturio)
Experiment management system requirements
EMGE system architecture
Testing and integration of EMGE
Summary

Goals of the thesis
Analysis of the problem of managing grid experiments
Identification of available experiment management solutions
  – Research and discussion of related works to gain a better view of the problem
Design and development of the Environment for Management of Grid Experiments (EMGE), adapted to the ViroLab Virtual Laboratory
  – Design of an appropriate database model
  – Implementation of system modules
Proving the correctness and usefulness of the developed system
  – Unit tests, integration tests, execution of a sample test experiment

ViroLab Virtual Laboratory
Research project of the 6th EU Framework Program
  – Virtual laboratory for infectious disease treatment support (mainly HIV)
  – Experiment development is located at ACK Cyfronet AGH
The ViroLab Virtual Laboratory is an infrastructure for transparent data access, experiment execution, and collaboration support for distributed analysis
Works on grid infrastructure
The system designed in this thesis is located in the "interfaces" layer

Motivation for the Environment for Management of Experiments on the Grid
ViroLab lacks a management environment for complex experiments
Each single task has to be executed separately by the user (EPE, EMI)
  – Problem when the same operation has to be performed for several parameters: either the execution time is long (when using a loop), or the user has to schedule several task instances manually
Issues with tasks that have a long execution time:
  – When something fails, the whole task has to be rescheduled
  – Dividing such a task into several parts requires the user to manually manage the execution sequence and data passing
Creating an experiment management environment would:
  – solve the issues described above
  – be a more user-friendly solution
  – allow the system administrator to gain better knowledge about executed tasks (logs)

Overview of related works 1/2
DIANE
  – Master/worker architecture
  – Each worker agent must be started manually
  – Execution of parameter-study experiments
  – Application adapter package required to launch non-parameter-study experiments
  – Experiment scripts for DIANE must be written in Python
Nimrod
  – Manages execution of parameter-study experiments
  – Experiment description in a plan file: parameter section and tasks section
  – The experimenter must provide the console command used to launch a task during experiment scheduling: problem with ViroLab security credentials validation timeout
  – Tasks are executed as console commands: using the GSEngine client would require it to be installed on every host that Nimrod/G launches tasks on

Overview of related works 2/2
Askalon
  – Service-oriented architecture
  – UML-based graphical tool for workflow modeling
    - also used to monitor task status
    - launched on the user's host
  – Requires programming skills from users, which ViroLab users do not always have:
    - learning the AGWL language
    - knowing how to model in UML
Zenturio
  – Interacts with the user through a web portal
    - interface for submitting, monitoring, and controlling experiments and analyzing their results
  – ZEN: directive-based language used to specify application parameters
    - directives hidden in "comment" lines, independent of the programming language
    - script modification needed before each execution
    - requires the user to learn the ZEN language
  – Consists of several Grid services

Requirements
Functional requirements
  – Management of experiment execution in ViroLab, taking care of all aspects of scheduling, such as security credentials management, experiment failure recovery, and proper task sequence
  – Workflow composition support: enable end users to specify experiment workflows by defining task dependencies and passing task outputs as inputs to other tasks
  – Provide ViroLab users with a UI for monitoring experiment status and submitting new experiments
Non-functional requirements
  – Use the GSEngine for job execution
  – The provided user interface needs to be intuitive and easy to use
  – Resource usage minimization: minimize grid resource usage and database size
  – Easily configurable
  – Limited number of simultaneously scheduled user tasks, to prevent one user from overloading the system and blocking other users' task execution
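The last non-functional requirement (a cap on simultaneously scheduled tasks per user) can be illustrated with a small sketch. This is not EMGE's code: the slides do not show the implementation, and EMGE itself is written in Java. The class name `PerUserLimiter`, its methods, and the choice to queue excess tasks rather than reject them are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class PerUserLimiter:
    """Hypothetical helper: caps how many tasks each user has scheduled at once."""

    def __init__(self, max_per_user):
        self.max_per_user = max_per_user
        self.running = defaultdict(int)    # user -> tasks currently scheduled
        self.waiting = defaultdict(deque)  # user -> tasks held back by the cap

    def submit(self, user, task):
        """Return True if the task may be scheduled now, False if it was queued."""
        if self.running[user] < self.max_per_user:
            self.running[user] += 1
            return True
        self.waiting[user].append(task)
        return False

    def complete(self, user):
        """Record that one of the user's tasks finished.

        Returns the next queued task of that user (now counted as scheduled),
        or None when nothing is waiting.
        """
        self.running[user] -= 1
        if self.waiting[user]:
            self.running[user] += 1
            return self.waiting[user].popleft()
        return None
```

Because the counters are kept per user, one user reaching the cap does not delay another user's first task, which is the behaviour the requirement asks for.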

System concepts
Communication with users through a web portal
Independent modules
  – Experiment Scheduler and User Portal are independent of each other
Database-oriented architecture
  – experiment information stored in the database

Architecture – User Portal
Experiment Monitor
  - displays the structure of the user's experiments
  - shows current task status
  - displays task execution information
Experiment Creator
  - enables submitting new experiments

Architecture – Scheduling Manager
Task Scheduler
  – manages and schedules task execution
  – uses GSEngine for task execution
  – callbacks update the current task state in the database
Security Handle Provider
  – manages Shibboleth handles
  – requests new handles if necessary using IdpClient
SuperTask Completion Listener
  – listens for task execution completion
  – stores super task results using ResMan
  – spawns new tasks
  – input for new tasks is passed as rIDs (ResMan IDs)
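The completion listener's role (record a finished task's result, then spawn the tasks whose inputs are now available) might look roughly like the sketch below. It is an illustration under stated assumptions, not the thesis code: the `Workflow` class, its method names, and the plain strings standing in for ResMan rIDs are hypothetical, and no GSEngine or ResMan call is made.

```python
class Workflow:
    """Hypothetical in-memory model of task dependencies within one experiment."""

    def __init__(self, dependencies):
        # dependencies: task name -> list of task names whose results it consumes
        self.dependencies = {task: list(deps) for task, deps in dependencies.items()}
        self.results = {}     # finished task -> result id (stand-in for a ResMan rID)
        self.spawned = set()  # tasks already handed to the scheduler

    def spawn_ready(self):
        """Return (task, input result ids) for every unspawned task with all inputs ready."""
        spawned_now = []
        for task, deps in self.dependencies.items():
            if task in self.spawned:
                continue
            if all(dep in self.results for dep in deps):
                self.spawned.add(task)
                spawned_now.append((task, [self.results[dep] for dep in deps]))
        return spawned_now

    def on_task_completed(self, task, result_id):
        """Completion callback: store the result id, then spawn newly ready tasks."""
        self.results[task] = result_id
        return self.spawn_ready()
```

A three-step experiment would be declared as `Workflow({"prepare": [], "fold": ["prepare"], "report": ["prepare", "fold"]})`. Calling `spawn_ready()` starts `prepare`, and each later `on_task_completed` call returns only the tasks unblocked by that completion.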

Database model
Database model reflects the structure of experiments
Stores all information required for execution of a task
Each table has a corresponding bean class
Tables accessed through dedicated Data Access Objects
Object-relational mapping using Hibernate
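The bean-plus-DAO pattern named above can be sketched as follows. EMGE uses Java beans mapped with Hibernate. This analogue uses Python's standard `sqlite3` module and a dataclass instead, purely to show the shape of the pattern. The table name `task`, its columns, and the `TaskDao` methods are assumptions, since the slides do not give the actual schema.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    """Plays the role of a bean: one instance per row of the (assumed) task table."""
    id: Optional[int]
    experiment_id: int
    arguments: str
    status: str = "NEW"


class TaskDao:
    """Hypothetical Data Access Object: the only code that touches the task table."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS task ("
            "id INTEGER PRIMARY KEY, experiment_id INTEGER, "
            "arguments TEXT, status TEXT)"
        )

    def save(self, task):
        cur = self.conn.execute(
            "INSERT INTO task (experiment_id, arguments, status) VALUES (?, ?, ?)",
            (task.experiment_id, task.arguments, task.status),
        )
        task.id = cur.lastrowid  # database-assigned primary key
        return task

    def update_status(self, task_id, status):
        self.conn.execute("UPDATE task SET status = ? WHERE id = ?", (status, task_id))

    def find_by_experiment(self, experiment_id):
        rows = self.conn.execute(
            "SELECT id, experiment_id, arguments, status FROM task "
            "WHERE experiment_id = ? ORDER BY id",
            (experiment_id,),
        ).fetchall()
        return [Task(*row) for row in rows]
```

Keeping all SQL inside the DAO means the scheduler and the portal only ever handle `Task` objects, which fits the slide's statement that tables are accessed through dedicated Data Access Objects.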

EMGE Implementation
Implementation details
  – task input data read from a file: each line is used as the execution arguments of one task, so the number of tasks equals the number of lines in the file
  – script code uploaded from the user's host
  – super task results shown as ResMan links
  – task execution log available to the experiment owner
  – new tasks scheduled for execution periodically and on task completion notification
  – results passed between dependent tasks as ResMan links
Technologies used:
  – Core of EMGE: Java SE 6.0
  – User Portal implemented using Google Web Toolkit (GWT)
  – Database access using Hibernate 3.0
  – Apache Tomcat Web Server 6.0
  – Spring Framework IoC container used in the Scheduling Manager
  – EMGE tests: JUnit testing framework
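The first implementation detail (one task per line of the input file) can be illustrated with a short sketch. The function name `tasks_from_input` is hypothetical, and two behaviours are assumptions not stated on the slide: blank lines are skipped, and each line is split into arguments with shell-style quoting.

```python
import shlex


def tasks_from_input(text):
    """Turn each non-empty input line into the argument list of one task."""
    tasks = []
    for line in text.splitlines():
        if line.strip():  # assumption: blank lines do not create tasks
            tasks.append(shlex.split(line))
    return tasks
```

For an input file with the lines `1abc 300` and `2xyz 310`, this yields two tasks with the argument lists `["1abc", "300"]` and `["2xyz", "310"]`, so a parameter study over N parameter sets needs only an N-line file.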

Testing and Integration
Unit tests
  – All implemented classes are covered with unit tests
  – All unit tests passed
Integration
  – Integration with GSEngine, ResMan and IdpClient tested and works correctly
  – Internal component communication works correctly
Deployment
  – Application deployed and launched
  – Example protein folding experiment composed of over 1000 jobs successfully executed

Summary
The main goal of the thesis, providing an experiment management environment for ViroLab, has been successfully achieved.
Research into related works revealed the strong and weak points of the solutions they use.
The executed unit and integration tests demonstrated the correctness of the developed system.
EMGE has been successfully deployed on a web server and operates correctly for real experiments.

Future work
Drag-and-drop interface for workflow composition
  – A drag-and-drop mechanism is more user-friendly. It is also less error-prone than the current interface, as it will be easier for users to notice workflow composition errors on a block diagram.
Adaptation to experiments requiring input at runtime
  – Many existing experiment scripts available in ViroLab require user input at runtime. Such experiments are not supported by the current version of EMGE.

Web sites Visit the following web sites: