Enabling Grids for E-sciencE (INFSO-RI)

SA1 Operations Manual

P. Strange, RAL, CCLRC, UK
Enabling Grids for E-sciencE INFSO-RI ARM-7 Krakow – 15/16 May

What is it called?

What should this document be called? The documents in circulation use no common name for it:

SA1 Operations Procedure Manual
Operations Manual
Operations Procedure
etc.
What are things inside SA1 called?

Within the document (and other docs) there are inconsistencies:

CIC-on-duty, Operator-on-duty, grid operator-on-duty…
CIC-portal, operations portal, on-duty portal…
COD dashboard, operations dashboard, on-duty dashboard…
Escalation procedure

Step   Deadline   Escalation procedure (COD action label)
1      3          1st mail to site admin and ROC
2      3          2nd mail to ROC and site admin
3      <5         Final mail to ROC, followed up by a phone call notifying the ROC that this will go forward to the next weekly operations meeting for discussion
4                 Discuss at the next weekly operations meeting
5                 Ask ROC to suspend site
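The escalation steps above can be sketched as a small lookup table. This is only an illustration, not part of the SA1 tooling: the names `EscalationStep`, `ESCALATION_STEPS` and `next_step` are hypothetical, and the deadline values are copied verbatim as strings because the slide does not state their units.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EscalationStep:
    step: int
    deadline: Optional[str]  # verbatim from the table; units are not stated there
    action: str


# Hypothetical encoding of the five steps listed in the escalation table.
ESCALATION_STEPS = [
    EscalationStep(1, "3", "1st mail to site admin and ROC"),
    EscalationStep(2, "3", "2nd mail to ROC and site admin"),
    EscalationStep(3, "<5", "final mail to ROC, followed up by a phone call"),
    EscalationStep(4, None, "discuss at the next weekly operations meeting"),
    EscalationStep(5, None, "ask ROC to suspend site"),
]


def next_step(current: Optional[int]) -> Optional[EscalationStep]:
    """Return the step following `current`; None once the last step is reached."""
    if current is None:
        # No step taken yet: escalation starts with the first mail.
        return ESCALATION_STEPS[0]
    for candidate in ESCALATION_STEPS:
        if candidate.step == current + 1:
            return candidate
    return None
```

Calling `next_step(None)` gives the first mail, and `next_step(5)` returns `None` because no step follows the request to suspend the site.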
Suspension of sites

In the normal course of operations, a site's status is “production”. For unscheduled problems that neither the site nor the ROC addresses, the COD applies the “escalation step procedure” described in section 7.6 and section 8. The final step is suspension, which takes the site out of the grid resources.

For a site to be suspended, its ROC would have to leave several mails and phone calls unanswered over a period of more than two weeks and fail to join the weekly operations meeting when asked to. Before the site is suspended, the ROC should make contact with the senior people in the federation, the site and the COD. As soon as the site's status is modified, the ROC receives a further notification mail. Afterwards, the ROC has to re-certify the site before its status is set to “production” again.

It is well understood that suspending a site may be applied directly in emergency cases, e.g. security incidents. The escalation procedure is then bypassed entirely by either the ROC or the COD.

This procedure needs clarifying into simple points.
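As one possible way of reducing the suspension rules to simple points, the sketch below expresses them as two small functions. It is an assumption-laden illustration: the status label "suspended" and the names `SiteStatus`, `may_suspend` and `status_after_recertification` are hypothetical, since the slide names only the "production" status.

```python
from enum import Enum


class SiteStatus(Enum):
    PRODUCTION = "production"
    SUSPENDED = "suspended"  # assumed label; the slide only names "production"


FINAL_ESCALATION_STEP = 5  # "Ask ROC to suspend site" in the escalation table


def may_suspend(escalation_step: int, emergency: bool = False) -> bool:
    """Suspension is allowed at the final escalation step, or directly in an emergency."""
    return emergency or escalation_step >= FINAL_ESCALATION_STEP


def status_after_recertification(status: SiteStatus, recertified: bool) -> SiteStatus:
    """A suspended site returns to production only once the ROC has re-certified it."""
    if status is SiteStatus.SUSPENDED and recertified:
        return SiteStatus.PRODUCTION
    return status
```

The `emergency` flag models the bypass for cases such as security incidents, where either the ROC or the COD skips the escalation steps entirely.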