PSEN Server Balance EN/ICE Procedures Jean-Charles Tournier EN/ICE/SCD 09-September-2015.

1 PSEN Server Balance EN/ICE Procedures Jean-Charles Tournier EN/ICE/SCD 09-September-2015

2 Foreword
This procedure lists the actions EN/ICE must complete so that EN/EL/CO can update PSEN smoothly. It does not cover creating, deleting, or updating devices, which is the responsibility of EN/EL/CO and is covered by their procedures.

3 Outline
– Preparatory setup
– Procedures
– Common issues

4 Preparatory Setup
All operations on the data servers should be done as UNICRYO.
On cs-ccr-psen1 and cs-ccr-psen2, open:
– A terminal running the “top -d 1” command to monitor the processes
– The WinCC OA log viewer
From cs-ccr-psen1 (cs-ccr-psen2 also works), open the system overview panel to monitor the redundancy/split status and the progress of manager stops and starts.
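
Since every operation is meant to run as UNICRYO, a small guard at the top of a terminal session can catch a wrong login early. The sketch below is only an illustration: the helper name `require_user` is hypothetical, and the lowercase account name `unicryo` is an assumption inferred from the /opt/unicryo paths that appear in the logs later in this procedure.

```shell
#!/bin/sh
# Hypothetical helper: fail early when the session is not running as the
# expected account. The account name is passed in by the caller; "unicryo"
# in the usage line below is an assumption, not confirmed by the procedure.
require_user() {
    expected="$1"
    current="$(id -un)"
    if [ "$current" != "$expected" ]; then
        echo "Running as '$current', expected '$expected'" >&2
        return 1
    fi
    echo "OK: running as $expected"
}

# Usage on a data server (commented out so the sketch has no side effects):
# require_user unicryo && top -d 1
```

The check only compares the login name; it does not verify that the WinCC OA environment of that account is set up.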

5 PSEN SERVER BALANCE PROCEDURE

6 Procedure Overview
1. Notify EN/ICE standby service
2. Perform an online backup
3. Let EN/EL/CO import the device definition
4. Stop the project on cs-ccr-psen2
5. Backup the project on cs-ccr-psen2
6. Restart the project on cs-ccr-psen2
7. Let EN/EL/CO restart all the managers
8. From psen2, go back to redu mode
9. Restart all the managers on cs-ccr-psen1
10. Verify that the update is completed
11. Notify EN/ICE standby service

7 STEP 1 – Notify EN/ICE SBS
Send an email to the EN/ICE standby service asking them to ignore all PSEN alarms in Moon during the update. Example:
From: Jean-Charles Tournier <jean-charles.tournier@cern.ch>
Date: Thursday 2 July 2015 08:36
To: "en-dep-ice-piquet (Members of the EN-ICE Standby Service)" <en-dep-ice-piquet@cern.ch>
Subject: PSEN Update
Hello,
Please ignore all PSEN alarms in Moon this morning as we are starting the weekly update. I’ll send an email once the operation is over.
Thanks,
jc

8 STEP 2 – Online Backup from PSEN2 Overview
From cs-ccr-psen2, perform an online backup of the db folder.
Make sure to configure the backup for cs-ccr-psen2 in the online backup panel.
The online backup is to be located in:
– ~/PVSS_projects/PVSS_backup/psen/before_balance
The operation takes between 1 and 2 minutes to complete.

9 STEP 2 – Online Backup from PSEN2 Details
1 – Open System Management
2 – Open online backup
3 – Click on Host 2
4 – Make sure the target path is set correctly to “…./before_balance/”
5 – Click on “Start backup”

10 STEP 3 – Import of Devices
This step is performed by EN/EL/CO.
For a typical update, this step takes 1h30, but it can take longer if more devices need to be updated/created or if problems are encountered.
To get an idea of the update size, connect to the ENSDM production DB and run:
– select count(*) from items;
– An update is considered small when it has fewer than 10,000 items (representing 10,000 DPs to be updated/created in PSEN).
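
The size rule above can be written down as a tiny helper that takes the number returned by the count query. This is a sketch only: the function name `classify_update_size` and the label "not small" are made up for illustration (the slide only defines what counts as small), and how the count is fetched from the ENSDM DB is left out on purpose.

```shell
#!/bin/sh
# Threshold quoted in the procedure: fewer than 10,000 items is a small update.
SMALL_UPDATE_LIMIT=10000

# Hypothetical helper: classify an update from the item count.
# Prints "small" or "not small"; returns 2 when the input is not a number.
classify_update_size() {
    count="$1"
    case "$count" in
        ''|*[!0-9]*)
            echo "not a non-negative integer: '$count'" >&2
            return 2
            ;;
    esac
    if [ "$count" -lt "$SMALL_UPDATE_LIMIT" ]; then
        echo "small"
    else
        echo "not small"
    fi
}

# Usage, with the count typed in by hand after running the query:
# classify_update_size 8500
```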

11 STEP 4 – Stop the project on PSEN2
Once EN/EL/CO has informed you that they have completed the update, stop the project on PSEN2:
– Open the project administration console
– Stop the project psen
– Wait for the project to be completely stopped
“ps -eaf | grep WCC” should only give results for the log viewer and the project administration console.
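
The "only the log viewer and the console remain" check can be automated by filtering the ps listing. The sketch below assumes nothing about the real process names of the log viewer and the administration console: they are passed in as patterns, and the two patterns in the usage line are placeholders to be replaced with whatever ps actually shows on the servers. The function name `leftover_wcc` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a `ps -eaf` listing on stdin and print the WCC*
# processes that are NOT matched by any of the allowed patterns given as
# arguments. Returns 0 when nothing is left (project fully stopped),
# 1 when some manager is still running.
leftover_wcc() {
    # Keep WCC lines, dropping the grep process itself.
    listing="$(grep 'WCC' | grep -v 'grep' || true)"
    for allowed in "$@"; do
        listing="$(printf '%s\n' "$listing" | grep -v -e "$allowed" || true)"
    done
    if [ -n "$listing" ]; then
        printf '%s\n' "$listing"
        return 1
    fi
    return 0
}

# Usage (the two patterns are placeholders, not confirmed process names):
# ps -eaf | leftover_wcc 'LOG_VIEWER_PATTERN' 'CONSOLE_PATTERN' \
#     && echo "project psen is completely stopped"
```

Printing the leftover lines, rather than just a count, makes it easy to see which manager is slow to stop.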

12 STEP 5 – Backup of psen2
From cs-ccr-psen2:
– cd ~/PVSS_projects/PVSS_backup/psen
– ./psenBackup.sh
– Wait for the backup to complete (it takes about 5 minutes)
The script is also available at:
– https://svnweb.cern.ch/cern/wsvn/en-ice-svn/trunk/applications/PSEN/Development/PSEN/Misc/psenBackup.sh

13 STEP 6 – Restart the project on PSEN2
From cs-ccr-psen2, open the administration console and simply restart the project “psen”.
– Only the core managers will be restarted, i.e. data, event, redundancy, etc.
– It takes a bit less than 10 minutes for these managers to restart
Once the project is restarted, notify EN/EL/CO that they can complete the restart of the project on cs-ccr-psen2.

14 STEP 7 – Let EN/EL/CO complete the project restart
If you are asked to do it for them:
– Open a UI manager connected to cs-ccr-psen2 only:
WCCOAui -proj psen -m gedi -data cs-ccr-psen2 -event cs-ccr-psen2 -num 127
– Open the “Manager plan” panel and apply the nominal plan to cs-ccr-psen2 (it takes about 20 minutes for the plan to complete)

15 STEP 8 – Back to redundancy mode
Wait for a green light from EN/EL/CO: they need to export masked alarms before this step.
From cs-ccr-psen2, open a gedi:
– WCCOAui -proj psen -m gedi
– Note: it is really important to start the gedi from cs-ccr-psen2, otherwise newly created DPs may be lost
Open the system management panel.
Click on the redundancy button.
Make sure to select the right peer to keep as reference (normally you should select cs-ccr-psen2).
– If you are not able to access the system management panel (blank panel with error), or you are not offered the choice of which server to restart, check the common issues slides
cs-ccr-psen1 will be restarted and the event manager on both projects will be at 100% for a while (1-2 minutes on cs-ccr-psen2, more on cs-ccr-psen1).
Wait for the complete restart of cs-ccr-psen1 and for the message “recovery completed” in the log of psen1.
Copy the LASER configuration files from cs-ccr-psen2 to cs-ccr-psen1:
– From cs-ccr-psen1, run: scp cs-ccr-psen2:~/PVSS_projects/psen/psen/data/LASER_CONFIG_FILES/* ~/PVSS_projects/psen/psen/data/LASER_CONFIG_FILES
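
Waiting for the “recovery completed” message can be scripted instead of watching the log viewer. The following is a minimal sketch under stated assumptions: the function name `wait_for_message` is hypothetical, the log file location is not given in this procedure (the path in the usage line is a placeholder), and the exact wording and case of the message in the log are assumed, so the match is case-insensitive.

```shell
#!/bin/sh
# Hypothetical helper: wait_for_message FILE PATTERN TIMEOUT_SECONDS
# Polls FILE once per second until a line containing PATTERN (fixed string,
# case-insensitive) appears. Returns 0 when found, non-zero on timeout.
wait_for_message() {
    file="$1"
    pattern="$2"
    remaining="$3"
    while [ "$remaining" -gt 0 ]; do
        if grep -q -i -F -e "$pattern" "$file" 2>/dev/null; then
            return 0
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    # One last look after the final sleep.
    grep -q -i -F -e "$pattern" "$file" 2>/dev/null
}

# Usage on cs-ccr-psen1 (placeholder path, 30-minute timeout):
# wait_for_message /path/to/psen1/log/file "recovery completed" 1800 \
#     && echo "psen1 recovery completed, LASER files can be copied"
```

Note that the helper scans the whole file, so an old “recovery completed” line from a previous restart would match immediately; checking the timestamp of the matching line by eye remains necessary.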

16 STEP 9 – Put PSEN1 into nominal mode
From a Gedi opened on either cs-ccr-psen1 or cs-ccr-psen2, open the manager plan panel:
– Set it to cs-ccr-psen1
– Apply the nominal mode
This step takes about 20 minutes.

17 STEP 10 – Verify that the update is completed
Open a normal PSEN session, e.g. from the terminal servers:
– Make sure that the system integrity square is green
– Open the alarm screen and check that alarms are coming and leaving
– Open the DPE finder and search for all tags (about 230K tags should be found)
– Do a search from the event screen
– Verify with EN/EL/CO that an alarm can be masked and unmasked
If it cannot, stop the PSEN_MaskedAlarm manager on PSEN2, then on PSEN1, then restart it on PSEN1 and finally on PSEN2 (this order prevents PSEN2 from becoming the active peer while the manager is restarted).

18 STEP 11 – Notify EN/ICE Standby service
Send an email to en-dep-ice-piquet@cern.ch to let them know that the update is completed and that they should monitor PSEN in Moon again.

19 COMMON ISSUES

20 Error during import of tags for an RTU
Sometimes the import of tags for one or more RTUs fails with messages similar to:
WCCOAui (127), 2015.09.09 10:20:49.810, CTRL, WARNING, 5/ctrl, Location of the following log entry:
Module: Import DB
Panel: /opt/unicryo/PVSS_projects/psen/installed_components/panels/vision/PSEN/ENSDB/PSEN_LoadRtu.pnl []
Object: 6 named: "DBLoadButton" of type: PUSH_BUTTON
Script: Clicked
Library: /opt/unicryo/PVSS_projects/psen/installed_components/scripts/libs/fwConfigs/fwAlertConfig.ctl
Line: 1831
WCCOAui (127), 2015.09.09 10:20:49.780, PARAM,WARNING, 31, Config type change failed, DP: psen_1:psen-ETC03_slash_BE-EHI04_slash_BE-PSEN_tag_double-6144463.delay_value:_alert_hdl.._type, MAN: (SYS: 530 Ui -num 127 CONN: 2), USER: 37665
The problem is fixed by acknowledging any outstanding alarms for the DPE concerned and retrying the import.

21 DIP Manager stops/crashes
When going back from split mode to redu mode, the DIP manager on PSEN2 may stop.
– Simply restart it once the event manager of PSEN2 is back at a reasonable level (it is normally at 100% for some time during this phase)
Once PSEN1 is completely restarted and has become the active peer again, the DIP manager on PSEN1 may stop.
– Stop the DIP manager on PSEN2, then start DIP on PSEN1, and finally restart DIP on PSEN2

22 Alarms are not masked
Simply restart the PSEN_MaskedAlarm manager:
– Stop the manager on cs-ccr-psen2
– Stop the manager on cs-ccr-psen1
– Start the manager on cs-ccr-psen1
– Start the manager on cs-ccr-psen2 (as described in step 10)
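
The restart order matters because, as step 10 explains, PSEN2 must not become the active peer while the manager is down. The sketch below only prints that order as a checklist; it does not stop or start anything, since this procedure does the actual stop/start from the WinCC OA panels. The function name `restart_plan` and its argument order are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: restart_plan MANAGER ACTIVE_PEER PASSIVE_PEER
# Prints the stop/start sequence: the passive peer goes down first and
# comes back last, so it never takes over during the restart.
restart_plan() {
    manager="$1"
    active="$2"
    passive="$3"
    printf 'stop %s on %s\n' "$manager" "$passive"
    printf 'stop %s on %s\n' "$manager" "$active"
    printf 'start %s on %s\n' "$manager" "$active"
    printf 'start %s on %s\n' "$manager" "$passive"
}

# Dry run for the masked-alarm manager, with cs-ccr-psen1 as the active peer:
restart_plan PSEN_MaskedAlarm cs-ccr-psen1 cs-ccr-psen2
```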

23 Blank System Management Panel - 1/2

24 Blank System Management Panel - 2/2
This happens because the SystemNum DPEs of _DistManager and _DistManager_2 are not equal.
Clear both of them.
The dist manager is not required to run to open the system overview panel.

25 Cannot Choose Which Server to restart
If you get the top picture (instead of the bottom one), this is because Gedi has been started with the options “-data …. -event …” and connects to only one peer.
To get the bottom picture, Gedi needs to be started normally, with simply “WCCOAui -proj psen -m gedi”.
In the top-picture case, do not click OK, as you do not know which peer will be restarted.

