Condor Project Computer Sciences Department University of Wisconsin-Madison Running Interpreted Jobs.

Slides:



Advertisements
Similar presentations
Building a secure Condor ® pool in an open academic environment Bruce Beckles University of Cambridge Computing Service.
Advertisements

Overview of Wisconsin Campus Grid Dan Bradley Center for High-Throughput Computing.
Greg Thain Computer Sciences Department University of Wisconsin-Madison Condor Parallel Universe.
Dealing with real resources Wednesday Afternoon, 3:00 pm Derek Weitzel OSG Campus Grids University of Nebraska.
More HTCondor 2014 OSG User School, Monday, Lecture 2 Greg Thain University of Wisconsin-Madison.
1 Using Stork Barcelona, 2006 Condor Project Computer Sciences Department University of Wisconsin-Madison
Dr. David Wallom Use of Condor in our Campus Grid and the University September 2004.
Post ASP 2012: Protein Docking at the IU School of Medicine.
Open Science Grid: More compute power Alan De Smet
Intermediate HTCondor: Workflows Monday pm Greg Thain Center For High Throughput Computing University of Wisconsin-Madison.
CVMFS: Software Access Anywhere Dan Bradley Any data, Any time, Anywhere Project.
Jaeyoung Yoon Computer Sciences Department University of Wisconsin-Madison Virtual Machine Universe in.
Jaeyoung Yoon Computer Sciences Department University of Wisconsin-Madison Virtual Machines in Condor.
Zach Miller Condor Project Computer Sciences Department University of Wisconsin-Madison Flexible Data Placement Mechanisms in Condor.
Minerva Infrastructure Meeting – October 04, 2011.
Condor Project Computer Sciences Department University of Wisconsin-Madison Virtual Machines in Condor.
When and How to Use Large-Scale Computing: CHTC and HTCondor Lauren Michael, Research Computing Facilitator Center for High Throughput Computing STAT 692,
Zach Miller Computer Sciences Department University of Wisconsin-Madison What’s New in Condor.
Alain Roy Computer Sciences Department University of Wisconsin-Madison An Introduction To Condor International.
High Throughput Parallel Computing (HTPC) Dan Fraser, UChicago Greg Thain, Uwisc.
Workflow Management in Condor Gökay Gökçay. DAGMan Meta-Scheduler The Directed Acyclic Graph Manager (DAGMan) is a meta-scheduler for Condor jobs. DAGMan.
1 Integrating GPUs into Condor Timothy Blattner Marquette University Milwaukee, WI April 22, 2009.
National Alliance for Medical Image Computing Grid Computing with BatchMake Julien Jomier Kitware Inc.
High Throughput Computing with Condor at Purdue XSEDE ECSS Monthly Symposium Condor.
HTCondor workflows at Utility Supercomputing Scale: How? Ian D. Alderman Cycle Computing.
Building a Real Workflow Thursday morning, 9:00 am Lauren Michael Research Computing Facilitator University of Wisconsin - Madison.
HTCondor and BOINC. › Berkeley Open Infrastructure for Network Computing › Grew out of began in 2002 › Middleware system for volunteer computing.
Condor Tugba Taskaya-Temizel 6 March What is Condor Technology? Condor is a high-throughput distributed batch computing system that provides facilities.
Hao Wang Computer Sciences Department University of Wisconsin-Madison Security in Condor.
Peter Keller Computer Sciences Department University of Wisconsin-Madison Quill Tutorial Condor Week.
High Throughput Parallel Computing (HTPC) Dan Fraser, UChicago Greg Thain, UWisc Condor Week April 13, 2010.
Compiled Matlab on Condor: a recipe 30 th October 2007 Clare Giacomantonio.
Workflows: from Development to Production Thursday morning, 10:00 am Lauren Michael Research Computing Facilitator University of Wisconsin - Madison.
Parallel Optimization Tools for High Performance Design of Integrated Circuits WISCAD VLSI Design Automation Lab Azadeh Davoodi.
Greg Thain Computer Sciences Department University of Wisconsin-Madison cs.wisc.edu Interactive MPI on Demand.
1 The Roadmap to New Releases Todd Tannenbaum Department of Computer Sciences University of Wisconsin-Madison
Linux Operations and Administration
Condor Project Computer Sciences Department University of Wisconsin-Madison A Scientist’s Introduction.
Condor Project Computer Sciences Department University of Wisconsin-Madison Condor-G Operations.
Grid job submission using HTCondor Andrew Lahiff.
Building a Real Workflow Thursday morning, 9:00 am Greg Thain University of Wisconsin - Madison.
Workflows: from development to Production Thursday morning, 10:00 am Greg Thain University of Wisconsin - Madison.
Dealing with real resources Wednesday Afternoon, 3:00 pm Derek Weitzel OSG Campus Grids University of Nebraska.
Alain Roy Computer Sciences Department University of Wisconsin-Madison I/O Access in Condor and Grid.
Todd Tannenbaum Computer Sciences Department University of Wisconsin-Madison Condor RoadMap.
The Roadmap to New Releases Derek Wright Computer Sciences Department University of Wisconsin-Madison
Derek Wright Computer Sciences Department University of Wisconsin-Madison MPI Scheduling in Condor: An.
Building a Real Workflow Thursday morning, 9:00 am Lauren Michael Research Computing Facilitator University of Wisconsin - Madison.
Condor Project Computer Sciences Department University of Wisconsin-Madison Grids and Condor Barcelona,
Derek Wright Computer Sciences Department University of Wisconsin-Madison New Ways to Fetch Work The new hook infrastructure in Condor.
Error Scope on a Computational Grid Douglas Thain University of Wisconsin 4 March 2002.
Peter F. Couvares Computer Sciences Department University of Wisconsin-Madison Condor DAGMan: Managing Job.
Greg Thain Computer Sciences Department University of Wisconsin-Madison Configuring Quill Condor Week.
11 Computers, C#, XNA, and You Session 1.1. Session Overview  Find out what computers are all about ...and what makes a great programmer  Discover.
Dan Bradley Condor Project CS and Physics Departments University of Wisconsin-Madison CCB The Condor Connection Broker.
Condor Project Computer Sciences Department University of Wisconsin-Madison Condor Job Router.
Todd Tannenbaum Computer Sciences Department University of Wisconsin-Madison Condor NT Condor ported.
Jaime Frey Computer Sciences Department University of Wisconsin-Madison Condor and Virtual Machines.
CVMFS: Software Access Anywhere Dan Bradley Any data, Any time, Anywhere Project.
Building the International Data Placement Lab Greg Thain Center for High Throughput Computing.
1 Getting Started with OSG Connect ~ an Interactive Tutorial ~ Emelie Harstad, Mats Rynge, Lincoln Bryant, Suchandra Thapa, Balamurugan Desinghu, David.
Five todos when moving an application to distributed HTC.
Job submission overview Marco Mambelli – August OSG Summer Workshop TTU - Lubbock, TX THE UNIVERSITY OF CHICAGO.
Greg Thain Computer Sciences Department University of Wisconsin-Madison HTPC on the OSG.
Harnessing the Power of Condor for Human Genetics
Condor: Job Management
Building and Testing using Condor
Understanding Supernovae with Condor
The Condor JobRouter.
Presentation transcript:

Condor Project Computer Sciences Department University of Wisconsin-Madison Running Interpreted Jobs

Overview › Many folks running Matlab, R, etc. › Interpreters complicate Condor jobs › Let’s talk about best practices.

What’s R? #!/usr/bin/R X <- c(5, 7, 9) cat (X) What could possibly go wrong?

Submit file universe = vanilla executable = foo.r output = output_file error = error_file log = log queue

What’s so hard? #!/usr/bin/R What if /usr/bin/R isn’t there? #!/usr/bin/env R isn’t good enough -- Condor doesn’t set the PATH for a Condor job.

Pre-staging: One (not-so-good) solution If you control the site, pre-stage R #!/software/R/bin/R › Fragile!

Pre-staging: If you must… “test and advertise” Use a Daemon ClassAd hook like: STARTD_CRON_JOBLIST = R_INFO STARTD_CRON_R_INFO_PREFIX = STARTD_CRON_R_INFO_EXECUTABLE = \ $(STARTD_CRON_MODULES)/r_info STARTD_CRON_R_INFO_PERIOD = 1h STARTD_CRON_R_INFO_MODE = periodic STARTD_CRON_R_INFO_RECONFIG = false STARTD_CRON_R_INFO_KILL = true STARTD_CRON_R_INFO_ARGS =

#!/bin/sh if [[ -d /path/to/r/bin && /path/to/R/bin/R –version > /dev/null ]] then echo “has_r = true” fi What about multiple installations of R ? R_info script contents

Pre-staging is bad › Limits where your job can run › Must be an administrator to set up › Difficult to change  Pre-staged files can change unexpectedly –Upgrade, new system installation, disk problems, …

Solution: take it with you › Bundle up the whole runtime › Transfer the bundle with the job › Wrapper script unbundles and runs › Downsides:  Extra time overhead to unbundle  Not so good for short* jobs

Benefits › Can run anywhere*:  Flocked, Campus Grids, OSG, etc. › Each job can have own runtime version/configuration.

Revised submit file universe = vanilla executable = wrapper.sh output = output_file error = error_file transfer_input_files = runtime.tar.gz, foo.r should_transfer_files = true when_to_transfer_output = on_exit log = log queue

wrapper.sh #!/bin/sh tar xzf runtime.tar.gz./bin/R foo.r

Downside: Those Huge Runtimes › Full R, matlab runtime 100 Mb  Adds up when running thousands of jobs › Trivia: How long to transfer 100 Mb?  Is this really a problem?

Mitigating Huge Runtimes 1. Trim the bundle down (identify unneeded files with strace) 2. Second, perhaps > 1 task per job Finally, cache with Squid

Users, not admins

Using HTTP/Squid  Change wrapper to manually wget  Set env http_proxy to squid source OSG_SQUID_LOCATION in OSG Otherwise, set with Daemon ClassAd hooks and $$  Cut runtime.tar.gz from transfer_input_files, add wget –retry-connrefused –waitretry=10 your_http_server  To the wrapper script – note retries Don’t use curl!  Or set –H pragma

Matlab complications › Licensing…  Octave (?)  Matlab compiler! › Matlab parallel toolkit  HTPC

Cross Platform submit › Many grids > 1 platform:  Unix vs. Windows; 32 vs 64 bit › Huge benefit of High Level language:  Write once, run, … well… › Use Condor $$ to expand:

executable = wrapper.$$(OPSYS).bat › Condor will expand OPSYS to LINUX or WINNT › Write both wrappers, make sure to wget correct runtime

Summary Many folks running lots of interpreted jobs Transferring runtime along beneficial, but requires set up Cross platform submits can be huge win