1
Open Science Grid
Discovering and understanding the site environment
Or, yet another site test kit
2
2007/07/26 OSG Users Meeting 2007: Chris Green, OSG User Support / FNAL
Motivation
What does an application owner need to know about a site that VORS won't tell him?
–Whether my jobs will be authorized and run (the dreaded GRAM errors 7, 47 & 22).
–Whether grid-ftp works from a worker node; SQUID? srmcp?
–Does MPI work? What is the compile command?
–Other application libraries and utilities: DB clients, Curl, Ruby?
–How many jobs are running right now? How many slots are free?
3
Motivation
–Is OSG_APP writable from the WN? OSG_DATA?
–Will a batch job of mine ever actually start?
Different discovery methods:
–Fork jobs;
–VORS;
–ReSS ClassAds;
–Batch jobs.
4
Motivation
How to collate and present / use information from these different sources?
How to tailor for multiple combinations of requirements for different applications?
5
Overview
Written in Perl:
–Extensible through module inheritance and dynamic code evaluation;
–Good chance of familiarity for VO admins;
–Fast development / test cycle.
Supports multiple discovery types:
–local commands including globus-job-run;
–batch jobs;
–VORS;
–ReSS ClassAd integration (still to be developed).
6
Overview
Human-readable HTML summary page with links to detailed test results; also CSV for machine-readability.
Obtains VO credentials (VOMS proxy) before running tests.
Manages batch job submission and monitoring; time-out facility.
Application owners can:
–add new tests;
–produce canned test list configurations for different applications.
7
Details: configuration
Test configuration file is parsed as a Perl array of anonymous hashes representing individual tests.
    (
      { command => "gridSiteTest::Ping" },            # shell command or module name
      { command => "gridSiteTest::Environment" },
      { command => "gridSiteTest::VORS" },
      { command => "gridSiteTest::VORS",
        attributes => {                               # module-specific control attributes
          results => [
            { "attribute-name" => "sponsor_vo",
              "column-title"   => "Sponsoring VO(s)" }
          ]
        }
      },
      { command => "gridSiteTest::ForkCommand",       # generic fork command
        args    => [ "CE user check",                 # specific test name
                     "/usr/bin/id" ] },               # command to execute
      …
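To illustrate what "parsed as a Perl array of anonymous hashes" can mean in practice, here is a minimal sketch that evaluates configuration text and walks the resulting test list. The slides do not show how the kit actually loads its configuration, so `load_tests` and `$config_text` are hypothetical names, and the use of string `eval` is an assumption suggested by the "dynamic code evaluation" point in the overview.

```perl
use strict;
use warnings;

# Stand-in for the contents of a test configuration file (assumed layout,
# trimmed from the example on the slide).
my $config_text = <<'END_CONFIG';
(
  { command => "gridSiteTest::Ping" },
  { command => "gridSiteTest::ForkCommand",
    args    => [ "CE user check", "/usr/bin/id" ] },
)
END_CONFIG

# Hypothetical loader: evaluate the text as Perl and return the list of
# anonymous hashes, one per test.
sub load_tests {
    my ($text) = @_;
    my @tests = eval $text;
    die "config error: $@" if $@;
    return @tests;
}

# Print each test's command followed by its arguments, if any.
for my $test (load_tests($config_text)) {
    my @args = @{ $test->{args} || [] };
    print join(" | ", $test->{command}, @args), "\n";
}
```

Because string `eval` runs whatever code the file contains, a loader like this is only appropriate for configuration files from a trusted source.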
8
Details: configuration
    { command => "gridSiteTest::CondorGTest",         # generic batch job test
      args    => [ "Job JDL test" ],                  # specific test name
      attributes => {
        jdl => <<'EOF',                               # JDL
    transfer_executable = false
    executable = /usr/bin/id
    EOF
        results => [                                  # define results columns
          { "column-title"   => "User",               # get user name from /usr/bin/id output
            "detail-key"     => 'Condor output file',
            "match-operator" => '=~',
            "match-regex"    => 'm&uid=\d+\(([^\)]+)&',
            "match-value"    => '$1' },
9
Details: configuration
          { "column-title"   => "Group",              # get group name from /usr/bin/id output
            "detail-key"     => 'Condor output file',
            "match-operator" => '=~',
            "match-regex"    => 'm&gid=\d+\(([^\)]+)&',
            "match-value"    => '$1' }
        ] # End result column definitions
      } # End attribute definitions
    } # End test definition
    ); # End test list
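The result-column definitions pair a match operator, a regex written as a string, and a value expression. One plausible way to apply such a definition to captured job output is sketched below. This is an assumption about the mechanism, not the kit's actual code: `extract_column` is a hypothetical helper, the sample `/usr/bin/id` output is invented, and only the `=~` operator is handled.

```perl
use strict;
use warnings;

# Hypothetical helper: apply one result-column definition to a block of text
# and return the extracted value, or undef if the regex does not match.
sub extract_column {
    my ($spec, $text) = @_;
    my $op = $spec->{"match-operator"};
    die "unsupported operator: $op" unless $op eq '=~';
    # Build and evaluate an expression such as:
    #   $text =~ m&uid=\d+\(([^\)]+)& ? "$1" : undef
    my $code = "\$text $op $spec->{'match-regex'} ? \"$spec->{'match-value'}\" : undef";
    my $value = eval $code;
    die "bad column definition: $@" if $@;
    return $value;
}

# Column definitions copied from the configuration example.
my %user_column = (
    "column-title"   => "User",
    "match-operator" => '=~',
    "match-regex"    => 'm&uid=\d+\(([^\)]+)&',
    "match-value"    => '$1',
);
my %group_column = (
    "column-title"   => "Group",
    "match-operator" => '=~',
    "match-regex"    => 'm&gid=\d+\(([^\)]+)&',
    "match-value"    => '$1',
);

# Invented sample of /usr/bin/id output.
my $id_output = "uid=1001(cgreen) gid=5063(osg) groups=5063(osg)";

for my $column (\%user_column, \%group_column) {
    my $value = extract_column($column, $id_output);
    print "$column->{'column-title'}: ", (defined $value ? $value : "(no match)"), "\n";
}
```

Keeping the regex and value as strings is what lets a configuration file define new columns without any new module code, at the cost of evaluating text from the configuration.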
10
Details: configuration
Remember this JDL from the previous example?
    jdl => <<'EOF'
    transfer_executable = false
    executable = /usr/bin/id
    EOF
Different JDL can specify a user script with anything you like in it to transfer to and run on a remote WN.
    jdl => <<'EOF'
    transfer_executable = true
    executable = my-test-script
    EOF
If there isn't a test you can configure to give you what you want: write your own; boilerplate is straightforward.
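As an example of what a transferred user script might contain, the sketch below checks whether the directories named by the OSG_APP and OSG_DATA environment variables are writable from the node it runs on, one of the questions raised in the motivation. The script is illustrative only: `dir_writable` is a hypothetical helper and the output format is an assumption, not what the kit or any existing test produces.

```perl
use strict;
use warnings;
use File::Temp ();

# Hypothetical helper: return 1 if a temporary file can be created in $dir,
# 0 otherwise (including when $dir is undefined or not a directory).
sub dir_writable {
    my ($dir) = @_;
    return 0 unless defined $dir && -d $dir;
    # File::Temp->new croaks on failure; the file is removed when the
    # object goes out of scope.
    my $ok = eval { File::Temp->new(DIR => $dir, UNLINK => 1); 1 };
    return $ok ? 1 : 0;
}

# Report on each area; the environment variable names come from the slides.
for my $var (qw(OSG_APP OSG_DATA)) {
    my $dir = $ENV{$var};
    printf "%s (%s) writable: %s\n",
        $var,
        defined $dir ? $dir : "unset",
        dir_writable($dir) ? "True" : "False";
}
```

A result-column definition with a suitable regex could then pick the True/False values out of the job's output file.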
11
Details: test results
    SITE: CIT_CMS_T2
    GATEKEEPER: cit-gatekeeper.ultralight.org:2119/jobmanager-condor
    TEST: gridSiteTest::CondorGTest "Engage worker node test"
    DESCRIPTION: Engage worker node test
    DATE: Wed Jun 6 11:59:06 CDT 2007
    EXIT CODE: 0
    ------------------------------------------------------------------------
    Test results
    ------------
    PASS?: OK (PASS)
    EngageCENetworkOutbound: True
    EngageOSGAPPWriteWorkNode: True
    EngageOSGDATAWriteWorkNode: True
    ------------------------------------------------------------------------
Detailed results file for each test:
–Basic test info
–Test results: what goes into the summary
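Since the detailed results file is made of "Name: value" lines, it is easy to read back into a hash for ad hoc checks. The sketch below assumes only the layout visible in the sample above; `parse_results` is a hypothetical helper, and for bulk processing the kit's own CSV output is probably the better starting point.

```perl
use strict;
use warnings;

# Hypothetical helper: collect "Name: value" lines into a hash reference.
# The split happens at the first colon, so values may contain colons.
# Headings and separator lines have no colon and are skipped.
sub parse_results {
    my ($text) = @_;
    my %field;
    for my $line (split /\n/, $text) {
        next unless $line =~ /^([^:]+):\s*(.*)$/;
        $field{$1} = $2;
    }
    return \%field;
}

# Excerpt of the sample results file shown on the slide.
my $sample = <<'END_RESULTS';
SITE: CIT_CMS_T2
GATEKEEPER: cit-gatekeeper.ultralight.org:2119/jobmanager-condor
EXIT CODE: 0
------------------------------------------------------------------------
Test results
------------
PASS?: OK (PASS)
EngageOSGAPPWriteWorkNode: True
END_RESULTS

my $result = parse_results($sample);
print "$result->{SITE}: $result->{'PASS?'}\n";
```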
12
Details: test results
    Test details
    ------------
    Output
    ------
    Submitting job(s).
    Logging submit event(s).
    1 job(s) submitted to cluster 818944.
    ------------------------------------
    …
–Test details: module-dependent detailed information, e.g. "Output", "Error", "JDL", "Condor output", "Condor error", etc.
13
Details: summary page
Here's one I prepared earlier: http://user-support.opensciencegrid.org/site_tests/Engage/latest/
14
Project status
Project close to ready for release; needs:
–A week's solid work on documentation;
–Intrepid users to give comments on infrastructure, documentation.
Still need to implement ReSS interface to get all information sources integrated.
Volunteers?