Published by Zoe Ball. Modified over 9 years ago.
1
What’s New in Condor
Zach Miller
Computer Sciences Department
University of Wisconsin-Madison
zmiller@cs.wisc.edu
http://www.cs.wisc.edu/condor
2
www.cs.wisc.edu/condor
Overview
› Condor Development Process
    Stable vs. Development
› New Features in 6.6.0
› Significant improvements that are covered in other talks:
    What’s new in Condor-G, covered by Todd Tannenbaum
    Hawkeye, covered by Nick LeRoy
    COD (Computing On Demand), covered by Derek Wright
    Packaging and Testing, covered by Alain Roy
3
Condor Development Process
› We maintain two different releases at all times
    Stable Series
        Second digit is even: e.g. 6.2.2, 6.4.7, 6.6.0
    Development Series
        Second digit is odd: e.g. 6.3.1, 6.5.2
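The numbering rule above is easy to check mechanically. A minimal Python sketch, assuming version strings of the form major.minor.patch; the function names are illustrative only and are not part of Condor:

```python
def series_of(version: str) -> str:
    """Classify a version string such as '6.6.0' by its second digit."""
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected at least major.minor, got {version!r}")
    minor = int(parts[1])
    # Even second digit: stable series. Odd second digit: development series.
    return "stable" if minor % 2 == 0 else "development"


def same_series(a: str, b: str) -> bool:
    """True if two versions share major.minor, e.g. 6.4.1 and 6.4.7."""
    return a.split(".")[:2] == b.split(".")[:2]


if __name__ == "__main__":
    for v in ("6.2.2", "6.4.7", "6.6.0", "6.3.1", "6.5.2"):
        print(v, series_of(v))
```

The same_series helper only reports whether two releases belong to one series; whether they are compatible depends on the series, as the following slides explain.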
4
Stable Series
› Heavily tested
› Runs on our production pool of nearly 1,000 CPUs
› No new features, only bugfixes, are allowed into a stable series
› A given stable release is always compatible with other releases from the same series
    6.4.X is compatible with 6.4.Y
› Recommended for production pools
5
Development Series
› Less heavily tested
› Runs on our small(er) test pool
› New features and new technology are added frequently
› Versions from the same development series are not always compatible with each other
6
Overview of New Features
› Windows
› DAGMan
› Better Security
› Central Manager
› Improved Negotiation
› Black Holes
› New Utilities
› Smarter File Transfer
› Submit-time file staging
› New Installer
› ClassAd improvements
› And More!!
7
Improvements in Condor for Windows
› Ability to run SCHEDULER universe jobs
    DAGMan
    Any executable or batch file
› JAVA universe support
    JVM provided by execution site
    Better error management
    Ability to use CHIRP (Remote I/O)
8
Improvements in Condor for Windows (cont)
› New support for:
    Windows XP
    Foreign-language versions of Windows
    Legacy 16-bit applications
› Improved Windows-to-UNIX job submission, and vice versa
› BirdWatcher, a system tray icon that gives basic status and control of Condor
9
New Features in DAGMan
› DAGMan previously required that all jobs share one log file
› Each job can now have its own log file
› Understands XML userlogs
› Can produce .dot file graphs
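For readers unfamiliar with .dot files: they are plain-text graph descriptions read by Graphviz. The Python sketch below writes a generic DOT digraph from parent/child pairs. It only illustrates the file format and is not DAGMan's actual output; the node names are made up.

```python
def dag_to_dot(edges, name="dag"):
    """Render (parent, child) pairs as a Graphviz DOT digraph string."""
    lines = [f"digraph {name} {{"]
    for parent, child in edges:
        # One directed edge per dependency: the parent must finish before the child.
        lines.append(f'    "{parent}" -> "{child}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical diamond-shaped workflow: A, then B and C, then D.
    print(dag_to_dot([("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]))
```

Saving the printed text to a file and running Graphviz's dot tool on it (for example `dot -Tpng`) renders the graph as an image.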
10
Better Security
› GSI (X.509 certificates) implementation is more complete and customizable
    Each Condor daemon can have its own certificate
    You can run a “Personal Condor” with your user proxy
› Easier configuration
    If you already have Globus installed, very little additional configuration of Condor is necessary to start using X.509 certificates for authentication
› Improved error messages if something goes wrong
    Tells you whether the problem was network, authentication, or authorization related
11
Central Manager New Features
› Keeps statistics on missed updates
› Can use TCP instead of UDP, if you must
› Redundant central managers can be run using the SECONDARY_COLLECTOR_LIST parameter
    If the main central manager goes down, you may still run administrative commands
› Central Manager daemons can now run on any port
    COLLECTOR_HOST = condor.cs.wisc.edu:9019
    NEGOTIATOR_HOST = condor.cs.wisc.edu:9020
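The host:port form shown above can be read with a few lines of Python. This is a minimal sketch of parsing such settings for illustration, not Condor's own configuration parser; the helper names are hypothetical and it handles only simple "NAME = value" lines.

```python
def parse_settings(text):
    """Parse simple 'NAME = value' lines into a dict, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            settings[name.strip()] = value.strip()
    return settings


def split_host_port(value):
    """Split 'host:port' into (host, port); port is None when no port is given."""
    host, sep, port = value.strip().rpartition(":")
    if not sep:
        return value.strip(), None
    return host, int(port)


if __name__ == "__main__":
    config = """
    COLLECTOR_HOST = condor.cs.wisc.edu:9019
    NEGOTIATOR_HOST = condor.cs.wisc.edu:9020
    """
    for name, value in parse_settings(config).items():
        print(name, split_host_port(value))
```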
12
Improved Negotiation
› Allows the condor_schedd (the job queue manager) to send “classes” of jobs to the Negotiator for matching
› Previously, jobs were sent one at a time
› Now, 1000 copies of the same job take the same time to negotiate as 100, 10, or just one job
› Currently, job classes are defined in the condor_config file. Very soon, they will be automatically determined…
    “Buckets” will be needed
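The idea behind job classes can be sketched as grouping jobs by the attributes that matter for matching, so that the matching work is done once per group rather than once per job. The Python below is only an illustration of that grouping step, assuming jobs are plain dictionaries; the attribute names are hypothetical and this is not how the condor_schedd is implemented.

```python
from collections import defaultdict


def bucket_jobs(jobs, significant):
    """Group job ids by the values of the attributes listed in `significant`."""
    buckets = defaultdict(list)
    for job in jobs:
        # Jobs with identical significant attributes land in the same bucket.
        key = tuple(job.get(attr) for attr in significant)
        buckets[key].append(job["id"])
    return dict(buckets)


if __name__ == "__main__":
    # 1000 identical jobs plus one job with different needs (made-up attributes).
    jobs = [{"id": i, "Requirements": 'Arch == "INTEL"', "ImageSize": 100}
            for i in range(1000)]
    jobs.append({"id": 1000, "Requirements": 'Arch == "SUN4u"', "ImageSize": 100})
    buckets = bucket_jobs(jobs, ["Requirements", "ImageSize"])
    print(len(jobs), "jobs fall into", len(buckets), "classes")
```

With this grouping the cost of finding a match scales with the number of classes, which is why 1000 identical jobs need no more matching work than one.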
13
Avoiding Black Holes
› Condor can keep track of the last N resource matches
› This can be used to prefer the same machine if the job is restarted
› It can also be used to avoid a machine if the job is restarted, which is a first step towards avoiding “Black Holes”: machines that consume jobs but always fail to run them
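A bounded history of recent matches is enough to express both policies. The following Python sketch is an illustration of the concept only, assuming machines are identified by name; the class and method names are hypothetical and do not correspond to Condor attributes.

```python
from collections import deque


class MatchHistory:
    """Remember the last N machines a job was matched to."""

    def __init__(self, n):
        # A deque with maxlen drops the oldest entry automatically.
        self._recent = deque(maxlen=n)

    def record(self, machine):
        self._recent.append(machine)

    def recent(self):
        return list(self._recent)

    def score(self, machine, prefer=True):
        """Return 1 for a desirable machine and 0 otherwise.

        prefer=True favors recently used machines; prefer=False avoids them,
        which is the first step towards steering clear of black holes.
        """
        seen = machine in self._recent
        if prefer:
            return 1 if seen else 0
        return 0 if seen else 1


if __name__ == "__main__":
    history = MatchHistory(2)
    for name in ("node-a", "node-b", "node-c"):  # made-up machine names
        history.record(name)
    print(history.recent())
    print(history.score("node-c", prefer=False))
```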
14
New Utilities
› ‘condor_q -held’ gives you a list of held jobs and the reason they were put on hold
› ‘condor_config_val -config’ tells you where (file and line number) an attribute is defined
› ‘condor_rm -f’ will forcefully remove a job, which is particularly useful when the Globus jobmanager is not cooperating
› ‘condor_fetch_log’ will grab a log file from a remote machine:
    condor_fetch_log c2-15.cs.wisc.edu STARTD
15
Smarter File Transfer
› New file transfer mechanism:
    ShouldTransferFiles = YES | NO | IF_NEEDED
    YES: always transfer files to the execution site
    NO: rely on a shared filesystem
    IF_NEEDED: automatically transfer the files if the submit and execute machines are not in the same FileSystemDomain
› Very useful for cross-platform submitting and also for flocking
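The three settings reduce to a small decision rule. A minimal Python sketch of that rule as the slide describes it, assuming each machine's FileSystemDomain is available as a string; the function name and the example domain values are hypothetical.

```python
def must_transfer(should_transfer_files, submit_domain, execute_domain):
    """Decide whether files are transferred, following the three documented modes."""
    mode = should_transfer_files.strip().upper()
    if mode == "YES":
        return True   # always transfer to the execution site
    if mode == "NO":
        return False  # rely on a shared filesystem
    if mode == "IF_NEEDED":
        # Transfer only when the two machines do not share a FileSystemDomain.
        return submit_domain != execute_domain
    raise ValueError(f"unknown ShouldTransferFiles value: {should_transfer_files!r}")


if __name__ == "__main__":
    print(must_transfer("IF_NEEDED", "cs.wisc.edu", "cs.wisc.edu"))
    print(must_transfer("IF_NEEDED", "cs.wisc.edu", "example.org"))
```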
16
Submit-Time File Staging
› When submitting a job, you can tell Condor to create a “sandbox” of all necessary input files with ‘condor_submit -s’
› After completion, the job can stay in the queue with the ‘leave_in_queue’ expression
› Output files are then fetched manually
17
New Installer
› For Windows
    Based on MSI (Microsoft Software Installer)
    Batch Install option
› For UNIX
    Version 6.6.0 will be available in RPMs
    Command line options specify the installation parameters, and no questions are asked
    Easier to automate
18
ClassAds
› ClassAd attributes can be dynamically linked to external functions
    Example:
    [
        label = "uptime"
        value = some_func_that_calls_uptime()
    ]
19
Misc New Features
› Jobs can be submitted via GRAM (the Globus Gatekeeper)
› Daemons do not have to run as ‘root’ or ‘condor’ to have multiple different users submitting
› Rudimentary load balancing between checkpoint servers by picking one randomly from a list
› More job policy expressions
    PERIODIC_RELEASE
    GLOBUS_RESUBMIT
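Random selection from a list, as in the checkpoint server bullet above, spreads load evenly on average without any shared state. A minimal Python sketch of the idea, with made-up server names; it is not Condor's implementation.

```python
import random


def pick_checkpoint_server(servers, rng=random):
    """Pick one server uniformly at random from a non-empty list."""
    if not servers:
        raise ValueError("no checkpoint servers configured")
    return rng.choice(servers)


if __name__ == "__main__":
    # Hypothetical server names, for illustration only.
    servers = ["ckpt1.example.org", "ckpt2.example.org", "ckpt3.example.org"]
    print(pick_checkpoint_server(servers))
```

Passing a seeded random.Random instance as rng makes the choice reproducible, which is convenient in tests.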
20
Conclusion
› Todd Tannenbaum will tell you about the roadmap for future work
› Questions?