Workflow Systems for LQCD SciDAC LQCD Software meeting, Boston, Feb 2008 Fermilab, IIT, Vanderbilt.

Slides:



Advertisements
Similar presentations
A MapReduce Workflow System for Architecting Scientific Data Intensive Applications By Phuong Nguyen and Milton Halem phuong3 or 1.
Advertisements

Pegasus on the Virtual Grid: A Case Study of Workflow Planning over Captive Resources Yang-Suk Kee, Eun-Kyu Byun, Ewa Deelman, Kran Vahi, Jin-Soo Kim Oracle.
Ewa Deelman, Integrating Existing Scientific Workflow Systems: The Kepler/Pegasus Example Nandita Mangal,
1 Software & Grid Middleware for Tier 2 Centers Rob Gardner Indiana University DOE/NSF Review of U.S. ATLAS and CMS Computing Projects Brookhaven National.
Nadia Ranaldo - Eugenio Zimeo Department of Engineering University of Sannio – Benevento – Italy 2008 ProActive and GCM User Group Orchestrating.
Network Management Overview IACT 918 July 2004 Gene Awyzio SITACS University of Wollongong.
Software Factory Assembling Applications with Models, Patterns, Frameworks and Tools Anna Liu Senior Architect Advisor Microsoft Australia.
1 Scaling Up Classifiers to Cloud Computers Christopher Moretti, Karsten Steinhaeuser, Douglas Thain, Nitesh V. Chawla University of Notre Dame.
Cloud based Dynamic workflow with QOS for Mass Spectrometry Data Analysis Thesis Defense: Ashish Nagavaram Graduate student Computer Science and Engineering.
Asst.Prof.Dr.Ahmet Ünveren SPRING Computer Engineering Department Asst.Prof.Dr.Ahmet Ünveren SPRING Computer Engineering Department.
WORKFLOWS IN CLOUD COMPUTING. CLOUD COMPUTING  Delivering applications or services in on-demand environment  Hundreds of thousands of users / applications.
Project Proposal: Academic Job Market and Application Tracker Website Project designed by: Cengiz Gunay Client: Cengiz Gunay Audience: PhD candidates and.
CONDOR DAGMan and Pegasus Selim Kalayci Florida International University 07/28/2009 Note: Slides are compiled from various TeraGrid Documentations.
©Ian Sommerville 2006Software Engineering, 8th edition. Chapter 18 Slide 1 Software Reuse.
Software Engineering Muhammad Fahad Khan
Software Reuse Prof. Ian Sommerville
Using Globus to Scale an Application Case Study 4: Scientific Workflow for Computational Economics Tiberiu Stef-Praun, Gabriel Madeira, Ian Foster, Robert.
Assignment 3: A Team-based and Integrated Term Paper and Project Semester 1, 2012.
Christopher Jeffers August 2012
Database Design for DNN Developers Sebastian Leupold.
Promoting Open Source Software Through Cloud Deployment: Library à la Carte, Heroku, and OSU Michael B. Klein Digital Applications Librarian
Click to add text TWA Cloud Integration with Tivoli Service Automation Manager TWS Education.
A Metadata Catalog Service for Data Intensive Applications Presented by Chin-Yi Tsai.
Cluster Reliability Project ISIS Vanderbilt University.
Workflow Project Luciano Piccoli Illinois Institute of Technology.
INFSO-RI Enabling Grids for E-sciencE Logging and Bookkeeping and Job Provenance Services Ludek Matyska (CESNET) on behalf of the.
Kepler/pPOD: Scientific Workflow and Provenance Support for Assembling the Tree of Life UC DAVIS Department of Computer Science The Kepler/pPOD Team Shawn.
Scalable Systems Software Center Resource Management and Accounting Working Group Face-to-Face Meeting October 10-11, 2002.
Combining the strengths of UMIST and The Victoria University of Manchester Utility Driven Adaptive Workflow Execution Kevin Lee School of Computer Science,
Contents 1.Introduction, architecture 2.Live demonstration 3.Extensibility.
Scientific Workflow Scheduling in Computational Grids Report: Wei-Cheng Lee 8th Grid Computing Conference IEEE 2007 – Planning, Reservation,
Workflow Project Status Update Luciano Piccoli - Fermilab, IIT Nov
Chapter 6 Architectural Design.
James Akrigg Microsoft Ltd Integrating InfoPath Forms Into Workflow Solutions And Business Processes.
Ganga A quick tutorial Asterios Katsifodimos Trainer, University of Cyprus Nicosia, Feb 16, 2009.
Wrapping Scientific Applications As Web Services Using The Opal Toolkit Wrapping Scientific Applications As Web Services Using The Opal Toolkit Sriram.
LQCD Workflow Execution Framework: Models, Provenance, and Fault-Tolerance Luciano Piccoli 1,3, Abhishek Dubey 2, James N. Simone 3, James B. Kowalkowski.
 Apache Airavata Architecture Overview Shameera Rathnayaka Graduate Assistant Science Gateways Group Indiana University 07/27/2015.
CEN5011, Fall CEN5011 Software Engineering Dr. Yi Deng ECS359, (305)
9 Systems Analysis and Design in a Changing World, Fourth Edition.
9 Systems Analysis and Design in a Changing World, Fourth Edition.
ICCS WSES BOF Discussion. Possible Topics Scientific workflows and Grid infrastructure Utilization of computing resources in scientific workflows; Virtual.
What is SAM-Grid? Job Handling Data Handling Monitoring and Information.
The GriPhyN Planning Process All-Hands Meeting ISI 15 October 2001.
S. Shumilov – Zürich Analytical Visualization Framework - a visual data processing and knowledge discovery system Ivan Denisovich, Serge Shumilov Department.
Your name here SPA: Successes, Status, and Future Directions Terence Critchlow And many, many, others Scientific Process Automation PNNL.
The EDGeS project receives Community research funding 1 Porting Applications to the EDGeS Infrastructure A comparison of the available methods, APIs, and.
CASE (Computer-Aided Software Engineering) Tools Software that is used to support software process activities. Provides software process support by:- –
INFSO-RI Enabling Grids for E-sciencE Ganga 4 – The Ganga Evolution Andrew Maier.
INFSO-RI Enabling Grids for E-sciencE ARDA Experiment Dashboard Ricardo Rocha (ARDA – CERN) on behalf of the Dashboard Team.
LQCD Workflow Project L. Piccoli October 02, 2006.
Status Report on the Validation Framework S. Banerjee, D. Elvira, H. Wenzel, J. Yarba Fermilab 15th Geant4 Collaboration Workshop 10/06/
Tool Integration with Data and Computation Grid “Grid Wizard 2”
LSF Universus By Robert Stober Systems Engineer Platform Computing, Inc.
T EST T OOLS U NIT VI This unit contains the overview of the test tools. Also prerequisites for applying these tools, tools selection and implementation.
V7 Foundation Series Vignette Education Services.
Managing LIGO Workflows on OSG with Pegasus Karan Vahi USC Information Sciences Institute
CMS Experience with the Common Analysis Framework I. Fisk & M. Girone Experience in CMS with the Common Analysis Framework Ian Fisk & Maria Girone 1.
Esri UC 2014 | Technical Workshop | Address Maps and Apps for State and Local Government Allison Muise Nikki Golding Scott Oppmann.
Building on virtualization capabilities for ExTENCI Carol Song and Preston Smith Rosen Center for Advanced Computing Purdue University ExTENCI Kickoff.
Reliability and Workflow projects Jim Kowalkowski.
Al Lilianstrom and Dr. Olga Terlyga NLIT 2016 May 4 th, 2016 Under the Hood of Fermilab’s Identity Management Service.
U.S. ATLAS Grid Production Experience
Provisioning of RAC Database on configured Stack
CMPE419 Mobile Application Development
AIMS Equipment & Automation monitoring solution
Mats Rynge USC Information Sciences Institute
CMPE419 Mobile Application Development
GGF10 Workflow Workshop Summary
Presentation transcript:

Workflow Systems for LQCD SciDAC LQCD Software meeting, Boston, Feb 2008 Fermilab, IIT, Vanderbilt

What is a workflow? A workflow is a reliably repeatable pattern of activity enabled by a systematic organization of resources, defined roles and mass, energy and information flows, into a work process that can be documented and learned. Workflows are always designed to achieve processing intents of some sort, such as physical transformation, service provision, or information processing. Hmm... mass? … energy? Must be physics related! A complex LQCD workflow: an LQCD analysis campaign to compute hadronic 2- and 3-pt functions for each configuration of a gauge ensemble.

LQCD 2-pt workflow Foreach config in ensemble :

Scientific workflow management systems Choices: Swift, Askalon, Kepler, Pegasus, Karajan, Triana, Taverna, et al. "A Taxonomy of Workflow Management Systems for Grid Computing" JoG.pdf JoG.pdf Systems designed for Grid computing. Are systems easy to use? NO – Learning curve, added software complexity (and organization!). But I'm happy writing shell scripts, why change? –We envision compelling benefits in using a workflow system...

Key workflow system requirements Users can exploit capacity computing while minimizing the burden of manually monitoring hundreds of batch jobs. Efficient use of LQCD compute facilities by monitoring and managing resources at the workflow level rather than only at the batch job level. Make a workflow specification easy to comprehend, reuse and extend. Eliminate the need for re-running any completed work upon resuming a stopped (by plan or by exception) analysis campaign. Campaign execution histories; use to plan future campaigns. Record provenance of scientific results. LQCD workflow management system requirements document Static dump of our wiki:

Highlights of the past year (or so) Wrote a functional requirements document On-going survey workflow systems: –Kepler tutorial at SC06, video meeting with Kepler team. –Face-to-face meetings with Swift team (ANL/U Chicago) –Video and face-to-face meeting with Askalon team (Innsbruck) –Phone meeting with Pegasus team (USC) –Presentation by developer from Condor and DAGman team Selected two promising systems for a detailed evaluation –Swift and Askalon –With help from developers, installed on Fermilab clusters –Coded a 2-pt LQCD workflow with help from developers –Proposed a simpler collection of workflow “unit tests” –Wrote a report detailing the evaluation A toy model workflow system in python –Database schema to persist state and data provenance

From the Swift/Askalon evaluation executive summary None of the currently available workflow systems can be quickly adapted for production use by LQCD, as several key requirements of the LQCD projects are not addressed by any of those systems. Additional development of specialized modules needs to be performed in-house to adopt these generic workflow systems for production use by the LQCD project. The advantage of using a currently available workflow system is that we could skip the initial and costly framework design and coding phase, thus spending out coordinated efforts on developing extensions as opposed to a completely custom solution. In making use of any existing workflow system there is an associated learning curve, but it is critical that we understand each workflow system’s architecture so that we can be modify and/or effectively use those workflow system’s APIs.

Front- and back-end systems work Workflow composition and management –Management of workflow templates –Management of participants and their mapping to instances of runable code –Management of parameter sets (physics and system related) Scheduling and execution –Persistent workflow state –Fault tolerance (cluster reliability project) –Multiple workflows –Workflow and job (two-level) scheduling Workflow histories and data provenance –Management and user query facilities

Virtual Node(s) ‏ SwiftScript Abstract computation Virtual Data Catalog SwiftScript Compiler SpecificationExecution Worker Nodes Provenance data Provenance data Provenance collector launcher file1 file2 file3 App F1 App F2 Scheduling Execution Engine (Karajan w/ Swift Runtime) ‏ Swift runtime callouts C CCC Status reporting Swift Architecture Provisionin g Falkon Resource Provisioner Amazon EC2

Askalon architecture database

Features overview FeatureSwiftAskalon modeling languageswiftscript (karajan XML)UML graphs (AGWL) execution modeldata flowcontrol flow workflow schedulerinternal to swift / karajaninternal or ext. to GUI users per sessionsingle workflows / sessionone QoS / fault toleranceno / retry participants limited workflow restarts perform. prediction / retry limited workflow restarts execution historylog filesrelational DB data provenancedot filesDB?

HL 2-pt in swiftscript String[] confList=["000102","000108"]; String kappaQ = "0.127"; String cSW = "1.75"; String mass = "0.005,0.007,0.01,0.02,0.03"; String[] fn = ["m0.005_ m0.007_ m0.01_ m0.02_ m0.03_000102", "m0.005_ m0.007_ m0.01_ m0.02_ m0.03_000108"]; String[] sources = ["local,0,0,0,0", "wavefunction,0,1S", "wavefunction,0,2S"]; foreach config,i in conflist { Template template ; # gauge template name Gauge gauge = stageIn(template, config); Stag stags[] ; stags = stagSolve(gauge, mass, source); StagArchive stagTar ; stagTar = archiveStag(mass, stags); Quark[] q; QuarkArchive[] cvtArch; for i in [0:2] { Clover clover = cloverSolve(kappaQ, cSW, gauge, sources[i]); q[i] = cvt12x12(clover); cvtArch[i] = archive(q[i]); } Quark antiQ = q[0]; file HH2ptcorr = twoPtHH(gauge, antiQ, q[0], q[1], q[2]); foreach stag in stags { file SH2ptcorr = twoPtSH(gauge, stag, q[0], q[1], q[2]); }

HL 2-pt in Askalon

Live demo: Chroma regression test Code: