1 Coordination: Things to do
James N. Bellinger, University of Wisconsin-Madison
CMS Week, June 2010

2 Decide on Data Location
Computer cluster for central analysis
– Analysis can be done elsewhere (e.g. the Barrel fit), but we need a collection point
– We all need access
Disk location
Subdivided by project
– Input for subproject
– Output from subproject
Communications area
Final output area
– Inspection, debugging, etc.
Can we decide today? Who can make this available?
[Diagram labels: HOME; LINK, ENDCAP, BARREL; IN, OUT, WORK, CODE; CODE]

3 Understand DB Location
CERN DB for final results
– Barrel uses its own DB for all phases of processing
Grouped by project
– Input for subproject
– Output from subproject
Spell this out today.

4 Define Signaling needs
1. When is data available from/agreed on by all subgroups?
2. When is the Link fit finished?
3. When is the Z-calculator finished?
4. When is the Transfer Line fit finished?
5. When are the SLM fits finished?
6. When is the Barrel fit finished?
7. Do we need to iterate with the Barrel?
8. Do we have a complete collection?
9. Is the process complete?
Is this complete?

5 Signaling needs: Breakdown 1
When is data available from/agreed on by all subgroups?
– We don’t have this process automated, nor do we have a clear naming convention
– “Available” means the inputs to Cocoa are ready
– When done, start the processing
When is the Link fit finished?
– Link is fast and first
– Transfer Line and Z-calculator can begin immediately afterwards, using MAB info. Barrel too, though that’s a design question
When is the Z-calculator finished?
– This doesn’t exist yet (all by hand!)
– When done, info goes to SLM models

6 Signaling needs: Breakdown 2
When is the Transfer Line fit finished?
– This took only about 10 minutes
– When done, info about Transfer Plate positions has to migrate to the SLM model
– Barrel model could be revised to use MAB DCOPS constraints
When are the SLM fits finished?
– Takes about an hour
– After this, recover the fit CSC chamber positions and interpolate/fit the rest: part of our deliverables

7 Signaling needs: Breakdown 3
When is the Barrel fit finished?
– Takes over 24 hours
– Writes to local database: need to transfer info
Do we need to iterate with the Barrel?
– Design question. If the Barrel fit has large shifts or doesn’t agree with Transfer Line constraints, we may want to iterate: redo Transfer Line et seq.
Do we have a complete collection?
– We could have a tentative complete collection even while iterating
Is the process complete?
– Write to the DB and set up testing

8 Testing the fits
Compare with previous and reference
– Need estimates of range expected
– Count excursions, flag if above some level
– Time plots of selected fit quantities?
Human eyeballs needed at first
Data monitoring is a different animal
We need to spell out what each group is doing right now
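
The "count excursions, flag if above some level" check could look roughly like the sketch below. It assumes the fit results and the reference are available as plain name-to-value dictionaries; the function names, the tolerance, and the allowed excursion count are all hypothetical placeholders, since the expected ranges still have to be estimated.

```python
def find_excursions(current, reference, tolerance):
    """Return the names of fit quantities that moved more than `tolerance`
    away from the reference. Quantities missing from the reference are skipped."""
    return sorted(
        name
        for name, value in current.items()
        if name in reference and abs(value - reference[name]) > tolerance
    )


def should_flag(current, reference, tolerance, max_excursions):
    """Flag the fit when the number of excursions is above the allowed level."""
    return len(find_excursions(current, reference, tolerance)) > max_excursions
```

A single tolerance for every quantity is a simplification; positions and angles would probably need separate limits.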

9 Working Details
CMSSW versions are ephemeral
– Need automated “build me a new release” script
Work from different areas, different machines (cocoa files overwrite each other)
– Want robust inter-machine communication and file transfer
If all on the same AFS/NFS cluster, no problem: semaphore files
Don’t want to monkey with socket programming
– Mother process watches for semaphore, starts jobs
Can be tricky
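
A minimal sketch of the semaphore-file idea, assuming all jobs see the same shared directory. The function names and the polling interval are hypothetical. The rename trick keeps the mother process from seeing a half-written file on a local POSIX file system; whether AFS or NFS gives the same guarantee would need checking.

```python
import os
import tempfile
import time


def raise_semaphore(path):
    """Create the semaphore file: write a temporary file in the same
    directory, then rename it into place so it appears all at once."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(str(int(time.time())))  # record when the job finished
    os.replace(tmp, path)


def wait_for_semaphore(path, timeout_s=60.0, poll_s=0.5):
    """Poll until the semaphore file exists.
    Returns True if it appeared, False if the timeout ran out first."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll_s)
    return os.path.exists(path)
```

The mother process would call `wait_for_semaphore` on, say, the Link job's "done" file and start the next jobs only when it returns True.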

10 Processing cycle: Given an event time or range
1. Collect data from subsystems for event and massage to fit
2. Run Link job
3. Watch for done on Link
4. Rewrite HSLM models w/ Link fits, also Z-calculator and Transfer Line: provide info to Barrel
5. Run Z-calculator, Transfer Line, Barrel
6. Watch for done on Z-calculator and Transfer Line
7. Rewrite SLM models, write info for Barrel
8. Run SLM models
9. Watch for done on SLM
10. Fetch and interpolate chamber info from various models
11. Collect and present
12. Watch for done on Barrel
13. Fetch and interpolate chamber info
14. Collect everything and check against reference
At each step, be ready to abort if it fails or is inconsistent
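
The "be ready to abort at each step" rule can be sketched as a small driver loop. This is only an illustration: `run_cycle` and the step callables are hypothetical, and it runs the steps strictly one after another, whereas in the real cycle the Barrel fit would run in the background while the Endcap steps proceed.

```python
def run_cycle(steps):
    """Run (name, action) pairs in order. Each action returns True on success.
    Stop at the first step that fails or raises, and report what finished.
    Returns (completed_step_names, error_message_or_None)."""
    completed = []
    for name, action in steps:
        try:
            ok = action()
        except Exception as err:
            return completed, f"{name}: {err}"
        if not ok:
            return completed, f"{name}: reported failure"
        completed.append(name)
    return completed, None
```

The list of completed steps tells the cleanup code how far the cycle got before it stopped.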

11 Devil in the Details: Step 1: Collect data
Specify event by time interval?
– First event in the interval if more than one?
What naming convention for events?
– Start time? I like seconds in epoch…
– All event files, both temporary and permanent, should have the same timing ID somewhere in the name
Easy to get DCOPS if online; scripts exist to do all this stuff, but we probably want a program, unless “mother” is also a script
Not sure how to get Endcap analog. Some jitter: do we want to use an average?
Does Barrel read data directly from the DB?
Need “don’t use this” flags for known bad readings; each system with its own flags and definitions and lookup file or DB
Do we have good automatic sanity checks? I still eyeball plots.
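
One way the "seconds in epoch" timing ID could work is sketched below. The file name pattern `<subsystem>_<kind>_<seconds>.<ext>` and the function names are assumptions made for illustration; no naming convention has been agreed.

```python
import re
from datetime import datetime, timezone


def event_id(when):
    """Timing ID: whole seconds since the epoch for a timezone-aware datetime."""
    return int(when.timestamp())


def event_filename(subsystem, kind, event_seconds, ext="txt"):
    """Build a file name that carries the timing ID just before the extension."""
    return f"{subsystem}_{kind}_{event_seconds}.{ext}"


_TIMING_ID = re.compile(r"_(\d+)\.[^.]+$")


def timing_id(filename):
    """Recover the timing ID from a file name, or None if it has none."""
    match = _TIMING_ID.search(filename)
    return int(match.group(1)) if match else None
```

With this pattern, every temporary and permanent file for one event can be found by searching for the same number.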

12 Devil in the Details: Step 2: Run Link
Spawn a process? Needs to be in the right directory, with the right arguments. Can do it, but spawning needs careful monitoring. Using a script as the mother might be better.
Endcap and Transfer Lines want
– MAB fit X/Y/Z and rx/ry/rz; estimated errors would be nice
– LD fit positions
– What format is good? Simple text files are easy to read
Barrel wants
– MAB fit X/Y/Z and rx/ry/rz
– ?
– What format is good for Barrel? DB?

13 Devil in the Details: Step 4: Rewrite models
DCOPS uses text SDF files. We can create include files like MinusLD_1276014332.include and use a soft link from the current one to MinusLD.include
Thanks to the internal structure of the Transfer Line SDF, there would be a lot of these for the MABs.
Since the Z-calculator doesn’t exist yet, we can define any input we like. Simpler is better.
If I understood correctly, the Barrel looks to a database for input for everything, so the rewrite needs to write to a database also. Scripts can do this also.
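
The include-file-plus-soft-link scheme might be sketched like this. `publish_include` is a hypothetical helper; it assumes a POSIX file system where symbolic links are allowed, and the text written into the include file is whatever the rewrite step produces.

```python
import os


def publish_include(directory, stem, event_seconds, text):
    """Write <stem>_<event_seconds>.include, then repoint the soft link
    <stem>.include at it. Returns the name of the versioned file."""
    versioned = f"{stem}_{event_seconds}.include"
    with open(os.path.join(directory, versioned), "w") as handle:
        handle.write(text)

    link = os.path.join(directory, f"{stem}.include")
    tmp_link = link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)            # clear a leftover from an aborted run
    os.symlink(versioned, tmp_link)    # relative target, same directory
    os.replace(tmp_link, link)         # swap the link in one rename
    return versioned
```

For example, `publish_include(".", "MinusLD", 1276014332, text)` leaves `MinusLD.include` pointing at `MinusLD_1276014332.include`, while older versioned files stay on disk for inspection.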

14 Devil in the Details: Step 10: Interpolate data
Sometimes a simple fit matches the fit chamber positions (or angles) obviously well, and sometimes it doesn’t and I use an interpolation. Or, as when something doesn’t fit at all, I use the disk position and orientation and apply that to the photogrammetry.
– How does our automatic procedure know which to use?

15 Devil in the Details: Architecture
Need to check for failures at each step
Need to check for timeout failures (hangs, reboots, etc.)
Need to have appropriate cleanup procedures at each step
If writing to DB, may need to roll back changes on failure? Or at any rate flag the entries as BAD
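
Checking for both failures and hangs when spawning a job could be sketched with Python's standard `subprocess` module, as below. `run_job` and its three status strings are hypothetical; the command line and working directory would be whatever the real job needs, and the cleanup or DB flagging is left to the caller.

```python
import subprocess
import sys


def run_job(argv, cwd=None, timeout_s=3600):
    """Run one job in directory `cwd` and classify the outcome as
    'ok', 'failed' (non-zero exit code), or 'timeout' (job hung).
    subprocess.run kills the child itself when the timeout expires."""
    try:
        result = subprocess.run(argv, cwd=cwd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Stand-in job: a Python one-liner that exits cleanly.
    print(run_job([sys.executable, "-c", "pass"], timeout_s=30))
```

Anything other than 'ok' would trigger the cleanup procedure for that step and, if results were already written, flagging the DB entries as BAD.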

16 Documentation
1. DAQ
2. DCS general controls
3. Expert-only controls
4. Data handling
5. Fitting procedures
6. What to do with changes
All I have is #2 and #3 for DCOPS

17 Additional Material
Sample scripts
DAQ: still TODO

18 Sample scripts 1
getframeALL.awk
– http://www.hep.wisc.edu/~jnb/cms/tools/getframeALL.awk
– If report.out was generated using the correct flags and Samir’s code modification, this retrieves the positions and angles for each component in the coordinate system of each of its mother volumes in the hierarchy.
– CMS CMS/yep1/slm_p12 x dx y dy z dz rx drx ry dry rz drz
– Errors are only valid when using the immediate mother volume
– In my jargon the output is a “frame” file

19 Sample scripts 2
makeNewTPAwkFile.com
– http://www.hep.wisc.edu/~jnb/cms/tools/makeNewTPAwkFile.com
– If the report.out file was generated by a Transfer Line fit and you created a “frame” file using getframeALL.awk, then
– This generates a new awk script whose name incorporates the framework file name
– You can then use the new awk script to process an ideal SLM’s SDF and create one with Transfer Plate positions as found by the Transfer Line fit. I call this re-writing, but that’s misleading: you make a new file with certain parts changed

20 Sample scripts 3
unpdbloose.awk
– http://www.hep.wisc.edu/~jnb/cms/tools/unpdbloose.awk
– This takes a text file containing the row data from an event in the DCOPS database and creates a Cocoa text input data file from it
– Refitting using the root histograms gives quality info that is unfortunately lost in the summary stored in the database, but this works
– Since not all insanities are flagged, I edit the file to increase the errors on profiles I know to be bad but which pass the simple quality cuts. This script needs to be replaced by a program which reads in a “known-bad” list.
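
A replacement program that reads a “known-bad” list might start from something like this sketch. Everything here is assumed for illustration: the list format (one profile ID per line, `#` for comments), the `(profile_id, value, error)` tuples, and the inflated error value. It does not reproduce the actual DCOPS row layout or the Cocoa input format.

```python
def load_known_bad(lines):
    """Parse a known-bad list: one profile ID per line, '#' starts a comment."""
    bad = set()
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            bad.add(entry)
    return bad


def inflate_bad_errors(readings, known_bad, inflated_error=1000.0):
    """Return a copy of the (profile_id, value, error) readings in which
    every profile on the known-bad list gets a large error, so the fit
    effectively ignores it. Other readings pass through unchanged."""
    return [
        (pid, value, inflated_error if pid in known_bad else error)
        for pid, value, error in readings
    ]
```

This replaces the hand edit of the input file, and the same list could serve as the “bad profile” reference file for event selection.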

21 DCOPS DAQ TODO

22 DCOPS DAQ
Phoenix DAQ
DCS
Data Quality Monitoring
Data transfer to offline
Transforming selected event (not really DAQ)

23 Phoenix DAQ
Write out every 60th event as root?
– 1/day, 5MB/event
– Gives full plots, more details from fit if required
– Need to move root files offline to permanent area
Read Oracle password from protected file
Automatic start at boot time
– Can hack this with a cron job and avoid being tied to a single machine
Tools to remotely kill/restart?
– Not sure if ssh permissions allow this

24 DCOPS DCS
Fix fake error bug
Clean up user interface

25 Data Quality Monitoring
Not sure how to integrate into overall DQM
Simple job (cron?) can collect raw data for day/week/month and flag excursions in a temperature plot
1st question: is the DAQ still running?
Need database table (file at first) with the known bad readings flagged: 504 × 4 possible
Need tool for experts to manipulate aforementioned table
Need tool to make diagnostics available
Not keen on reinventing the wheel

26 DCOPS Data to Offline
Data put in Online DB, never finished job of moving it offline
Move root histogram files (if we want them)

27 DCOPS Event selection
Easy to create a database query and rewrite the results into a Cocoa-text file
– Pieces exist, combine
– Need a “bad profile” reference file
– Add communication details and locations and naming conventions
– Partition into different input files
HSLM files are special, using analog and Link data also
This has to be coordinated with the rest of the group

28 Link DAQ TODO
Barrel DAQ TODO
I haven’t a clue. You tell us.

