DUNE Software and Computing News and Announcements Tom Junk DUNE Software and Computing General Meeting February 2, 2016.

1 DUNE Software and Computing News and Announcements Tom Junk DUNE Software and Computing General Meeting February 2, 2016

2 New Web Sites
dune-data.fnal.gov
  - Monte Carlo: Challenge 5.0 and future MC; MC samples and tiers
  - Data files from the 35-ton prototype
  - File list, automatically updated from the file transfer script
  - samweb usage tips: tells you how to access files!
dune-young.hep.net
  - Content copied from lbne-young.hep.net (still not up to date)
lbne-dqm.fnal.gov
  - Online and nearline monitoring for 35-ton
02.02.16 Tom Junk | DUNE S&C General News 2

3 New Build Node
dunebuild01.fnal.gov
  - 16 cores! (AMD Opteron 6320)
  - 32 GB of RAM, 5 GB of swap
  - To be used for building code only (we’ll watch for misuse)
  - mrb i -j16 now gives you a big boost in speed
  - dCache disks are not mounted; /dune/data and /dune/data2, however, are still mounted.
  - /build/ has 2.8 TB in it. Not clear how to use this effectively. Let Tom know if you need something different on it.
16 cores was chosen based on Lynn Garren’s build speed test: https://indico.fnal.gov/contributionDisplay.py?contribId=9&confId=10257
With builds using BlueArc (/dune/app), more than 16 cores gives diminishing returns in speed due to disk I/O bottlenecks. Also, machines with more than 16 cores are even less available than the one we got with 16 cores.
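
A minimal sketch of choosing the `-j` value for `mrb i`, assuming the 16-job cap described on the slide above. The helper names `build_jobs` and `mrb_install_command` are hypothetical, and the command is only assembled as a list, never executed.

```python
import os

# Cap taken from the slide: beyond 16 cores, builds on BlueArc see diminishing returns.
MAX_USEFUL_JOBS = 16


def build_jobs(available_cores=None, cap=MAX_USEFUL_JOBS):
    """Return the number of parallel build jobs: all cores, but never more than the cap."""
    if available_cores is None:
        available_cores = os.cpu_count() or 1
    return max(1, min(available_cores, cap))


def mrb_install_command(available_cores=None):
    """Assemble 'mrb i -jN' as an argument list (hypothetical helper, nothing is run)."""
    return ["mrb", "i", f"-j{build_jobs(available_cores)}"]


if __name__ == "__main__":
    print(" ".join(mrb_install_command(16)))
```

On a machine with fewer than 16 cores the helper simply uses every core; on a larger one it stays at 16.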

4 New Redmine Sites
dunebsm: Exotic Physics with DUNE
dunefgt: Fine-Grained Tracker
dunelbl: Long-Baseline Physics WG
dunendk: Nucleon Decay
HighLAND Analysis Tool
WA105 Dual-Phase protoDUNE

5 CILogon Certificates
Replacing OSG Grid certificates: DUNE VO user entries with OSG Grid certificates have now been given entries for CILogon certificates.
Current OSG Grid certificates remain valid until they expire. There is no need to hurry to get a replacement CILogon certificate, but the next time a certificate is refreshed there will be a new procedure.
Eileen and Anne have contacted users who access the docdb’s with certificates and given them instructions for obtaining and using CILogon certificates with the docdb’s.
CILogon will replace KCA certificates too.
  - The jobsub client called kx509 to generate short-lived certificates using the user’s Kerberos ticket.
  - Other uses, like SAM, required the user to execute kx509 or get-cert.sh (which calls kx509) to get a certificate.
  - Jobsub use of CILogon is “to be transparent to the users”.

6 AFS at Fermilab is being shut down Feb. 25, 2016
Web sites at /afs/fnal.gov/files/expww are migrated to the NFS storage area /web/sites/. Available on FNALU and dunegpvm01 (but not other dunegpvm’s).
Home areas in /afs/fnal.gov/home/room[1,2,3]/username are being replaced with other networked storage.
I was never fond of our AFS home areas anyhow:
  - Very small quotas in the home area: 500 MB (!)
  - The authentication token, which expires after 26 hours, has caused user confusion.
  - It has its own syntax for managing. Want to know your quota? fs lq.
  - Not available on grid workers (wouldn’t want that anyhow for the replacement).
  - Backups in /afs/fnal.gov/files/backup/home
  - AFS@FNAL documentation (becoming irrelevant): https://computing.fnal.gov/unixatfermilab/html/afs.html

7 New Home Areas and Web sites
We used to have personal “professional” web areas in ~/public_html/index.html, for example, accessed via http://home.fnal.gov/~user/index.html
Directory listings over http are disabled without a special Service Desk request.
Now there are NFS web areas dunegpvm*:/publicweb/ / where is the first letter of your user ID (= kerberos principal)
Backups are in /publicweb/.snapshot in case you accidentally delete something.
Home area snapshots and backups in the post-AFS era are to be defined and documented.

8 lbnegpvm*.fnal.gov → dunegpvm*.fnal.gov
Users were in the lbne group → active users or recently active users were given new accounts in the dune group.
New dunegpvm11 spun up with new group and new user list.
No /lbne/data, /lbne/data2, /lbne/app mounts on the new dune machine. The same areas are mounted under /dune.
Still have /pnfs/lbne mounted (needed as some files are accessible only that way). Same with /scratch/lbne.
Current status: migrated lbnegpvm06 – lbnegpvm10 to dunegpvm machines. Gave back dunegpvm11. lbnegpvm01 through lbnegpvm05 (with dunegpvm convenience names) are being converted as I write this.
Finding missing things (like dCache mounts) and iterating with the Service Desk.

9 BlueArc Dismount on Grid Workers
Affects us in particular!
  - /lbne/data, /lbne/data2 are not mounted on the dunegpvm6-10 machines, but are still mounted on grid worker nodes.
  - /dune/data, /dune/data2 are not mounted on grid worker nodes (!). These mount points were made after the decision to migrate away from BlueArc on the grid was taken.
  - Two ways to store your data:
      ifdh cp it to dCache: /pnfs/dune/persistent/users and /pnfs/dune/scratch/users. Ask about tape-backed space! (We prefer SAM so the files won’t get lost.)
      ifdh cp the files to BlueArc (many people still do this). This too will be disabled! End of 2016 shutdown!
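
A rough sketch of assembling an `ifdh cp` call that targets the two dCache areas named on the slide above. The per-user subdirectory under `users`, the helper names, and the example file name are assumptions made for illustration; the command is built as a list and not executed.

```python
import posixpath

# dCache areas named on the slide.
DCACHE_AREAS = {
    "persistent": "/pnfs/dune/persistent/users",
    "scratch": "/pnfs/dune/scratch/users",
}


def dcache_destination(user, filename, area="scratch"):
    """Build a destination path; the per-user subdirectory is an assumption."""
    if area not in DCACHE_AREAS:
        raise ValueError(f"unknown dCache area: {area}")
    return posixpath.join(DCACHE_AREAS[area], user, posixpath.basename(filename))


def ifdh_cp_command(source, user, area="scratch"):
    """Assemble 'ifdh cp <source> <destination>' as an argument list (nothing is run)."""
    return ["ifdh", "cp", source, dcache_destination(user, source, area)]


if __name__ == "__main__":
    # 'alice' and 'out/hist.root' are made-up example values.
    print(" ".join(ifdh_cp_command("out/hist.root", "alice", area="persistent")))
```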

10 Metadata Changes
Existing data tiers: raw, simulated, detector-simulated, full-reconstructed
New data tier: sliced
The slicer/stitcher input source only works on raw data: there is a limited number of data products it has to know how to slice and stitch.
A new problem: the slicer/stitcher reformats events based on a software trigger definition. Do we need to store which trigger definition was used in the metadata? Tack it on the end of the detector type string?
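
One way the "tack it on the end" idea from the slide above could look, purely as a sketch. The `:` separator, the function names, and the example values `35t` and `trigA` are all hypothetical; the slide does not settle on a format.

```python
SEPARATOR = ":"  # hypothetical separator, not an agreed convention


def tag_detector_type(detector_type, trigger_def):
    """Append a trigger-definition label to the end of a detector type string."""
    if SEPARATOR in trigger_def:
        raise ValueError("trigger definition label must not contain the separator")
    return f"{detector_type}{SEPARATOR}{trigger_def}"


def split_detector_type(tagged):
    """Recover (detector_type, trigger_def); trigger_def is None when no label is present."""
    detector_type, sep, trigger_def = tagged.rpartition(SEPARATOR)
    if not sep:
        return tagged, None
    return detector_type, trigger_def


if __name__ == "__main__":
    tagged = tag_detector_type("35t", "trigA")  # made-up example values
    print(tagged, split_detector_type(tagged))
```

Splitting from the right keeps the scheme usable even if a detector type string itself ever contained the separator.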

11 A Good Run List Proposal
So far only 35-ton has data and thus needs a good-run list.
One person’s bad data is another person’s good data.
Alex Himmel suggested it would make SAM dataset queries simpler if good-run status were part of the metadata.
We can request a new good-run metadata field: an arbitrary string, so we can encode various kinds of goodness or badness.
CDF had good run lists that were distributed as ROOT trees and text files. It didn’t make sense to limit public datasets to a particular good-run set because runs would be re-classified and it takes a long time to reprocess everything.
The good run list needs curation. Who decides? Shift tool? A Data Quality Team is needed to make judgments.
For 35-ton, we probably want analyzers to be tightly coupled to the data taking.
Label special data runs for special analyses and record run numbers and ranges that are intended for subsequent analyses.
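
To illustrate how an arbitrary good-run string could serve different analyses, here is a small sketch. The comma-separated tag format, the tag names, and the run numbers are invented for the example; the proposal above only asks for an arbitrary string field.

```python
def parse_goodness(value):
    """Split a hypothetical comma-separated good-run string into a set of tags."""
    return {tag.strip() for tag in value.split(",") if tag.strip()}


def select_runs(run_metadata, required=(), excluded=()):
    """Return sorted run numbers whose tags include all 'required' and none of 'excluded'."""
    required, excluded = set(required), set(excluded)
    selected = []
    for run, value in sorted(run_metadata.items()):
        tags = parse_goodness(value)
        if required <= tags and not (tags & excluded):
            selected.append(run)
    return selected


if __name__ == "__main__":
    # Made-up runs and tags: one analysis tolerates noise, another does not.
    runs = {101: "tpc_good,photon_good", 102: "tpc_good,noisy", 103: ""}
    print(select_runs(runs, required=["tpc_good"]))
    print(select_runs(runs, required=["tpc_good"], excluded=["noisy"]))
```

Because the selection is made at query time, re-classifying a run only means changing its string, not reprocessing the dataset.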

12 FIFE News
Summer 2016 FIFE Workshop during the week of June 20
Fermilab GPGrid new features: partitionable slots, priority queueing instead of quotas: https://fermipoint.fnal.gov/organization/cs/scd/_layouts/15/WopiFrame.aspx?sourcedoc=/organization/cs/scd/CS%20Liaison%20Meetings%20Library/CSLiaison_01_13_16.pdf&action=default
Job Efficiency Links
http://web1.fnal.gov/scoreboard/daily_reports/fife-efficiency.daily.latest
http://web1.fnal.gov/scoreboard/weekly_reports/fife-efficiency.weekly.latest
http://web1.fnal.gov/scoreboard/monthly_reports/fife-efficiency.monthly.latest

13 Job Resource Limits Enforced on FNAL GPGrid
Last year the grid was more forgiving about going over
  - time limits (not CPU; wall-clock time is what counts)
  - virtual memory size
  - disk space used
But now these limits are enforced. See the page https://cdcvs.fnal.gov/redmine/projects/dune/wiki/Submitting_Jobs_at_Fermilab for examples of how to ask for resources and links to more documentation.
What happens if your job goes over the limit? It doesn’t get killed, but rather gets Held. To find out what went wrong: jobsub_q --held --user=
You can use fifemon.fnal.gov to monitor how many jobs you have in each state.
Policy may be different on non-FNAL OSG sites.
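
The enforcement rule above can be pictured with a toy model: a job that exceeds any requested limit is held rather than killed. This is only an illustration of the logic, not how GPGrid implements it; the `Resources` class, the field names, the units, and the `"ok"` state label are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """The three enforced quantities from the slide; field names and units are illustrative."""
    wall_seconds: float  # wall-clock time, not CPU time
    memory_mb: float     # virtual memory size
    disk_mb: float       # disk space used


def exceeded_limits(requested, used):
    """Return the names of every limit that the job's usage went over."""
    names = ("wall_seconds", "memory_mb", "disk_mb")
    return [name for name in names if getattr(used, name) > getattr(requested, name)]


def job_state(requested, used):
    """A job over any limit is held (not killed); otherwise it is reported as 'ok'."""
    return "held" if exceeded_limits(requested, used) else "ok"


if __name__ == "__main__":
    requested = Resources(wall_seconds=8 * 3600, memory_mb=2000, disk_mb=10000)
    used = Resources(wall_seconds=3 * 3600, memory_mb=2500, disk_mb=4000)
    print(job_state(requested, used), exceeded_limits(requested, used))
```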

14 Very minor...
Users in the LBNE VO are getting e-mails saying that their AUP (Acceptable Use Policy) signatures are expiring (1 year). Users can ignore these and use the DUNE VO instead.

15 /dune/app
Filled up briefly yesterday

16 Reminder: DAQ Workshop at CERN
Dates: Feb. 25-26 at CERN
https://indico.fnal.gov/conferenceDisplay.py?confId=11372
DAQ Hardware, Software, and Offline Computing Infrastructure
Ask Maxine (maxine@fnal.gov) about site access for non-CERN users.

