Grid Management Challenge - M. Jouvin


1 Grid Site Setup
Michel Jouvin, LAL, Orsay
Grid Administration Training, LAL, Orsay, September 15-19, 2008
Grid Management Challenge - M. Jouvin 06/04/2019

2 Agenda
- General site parameters
- Site BDII
- CE
- WNs
- SE

3 General Site Parameters
- Main documentation for configuring site parameters is available at eCustomization
- Edit template defining site network parameters
  cfg/sites/tutorial/site/pro_site_global_variables.tpl
- Edit template defining machine IP addresses and HW
  cfg/sites/tutorial/site/pro_site_databases.tpl
- Edit template defining per-machine OS version
  cfg/sites/tutorial/site/pro_os_version.tpl
- Edit gLite site parameters (review defaults)
  cfg/sites/tutorial/site/glite/config.tpl
- Create cluster glite-3.1 from examples, review cluster-specific templates
  cfg/clusters/glite-3.1/site/pro_site_cluster_info.tpl : cluster defaults

4 Site BDII Configuration
- Create a HW template for BDII box
  cfg/sites/tutorial/hardware/machine/…
- Create a profile in cluster glite-3.1
  cfg/clusters/glite-3.1/profiles/grid281.lal.in2p3.fr.tpl
  - Start with a similar profile from example cluster
- Compile and deploy
  - Don’t forget svn commit
- Configure initial installation of the machine
  aii-shellfe --configure grid281.lal.in2p3.fr
  aii-shellfe --install grid281.lal.in2p3.fr
- Start grid281…
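The installation step above can be scripted when several machines are set up the same way. The following is a minimal sketch, not part of the original slides: the helper names `aii_commands` and `prepare_install` are hypothetical, and the only command lines they build are the two `aii-shellfe` invocations shown on the slide. In dry-run mode (the default) nothing is executed, so it can be tried on a machine without Quattor AII.

```python
import subprocess


def aii_commands(host):
    """Return the two aii-shellfe invocations from the slide for one host."""
    return [
        ["aii-shellfe", "--configure", host],
        ["aii-shellfe", "--install", host],
    ]


def prepare_install(hosts, dry_run=True):
    """Print (dry run) or execute the AII commands for each host, in order."""
    handled = []
    for host in hosts:
        for cmd in aii_commands(host):
            if dry_run:
                print(" ".join(cmd))
            else:
                # check=True raises CalledProcessError on the first failure.
                subprocess.run(cmd, check=True)
            handled.append(cmd)
    return handled


if __name__ == "__main__":
    # Dry run only: prints the two commands for the BDII box.
    prepare_install(["grid281.lal.in2p3.fr"])
```

The same helper applies unchanged to the other node types, since only the host name differs between them.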

5 BDII Management
- Logs
  /opt/bdii/var/bdii.log
  /opt/bdii/var/tmp/*
- Restart : service bdii restart

6 CE Configuration
- Create a HW template for CE box
  cfg/sites/tutorial/hardware/machine/…
- Create a profile in cluster glite-3.1
  cfg/clusters/glite-3.1/profiles/grid282.lal.in2p3.fr.tpl
  - Start with a similar profile from example cluster
- Define MAUI configuration (if using it) and check gLite site parameters
  cfg/sites/tutorial/site/glite/maui.tpl
- Choose between shared and non-shared home directories
  - Default is shared home directories
- No user accounts other than those related to grid
  - By default accounts are locked (no interactive access)
- Compile and deploy
- Configure initial installation of the machine and start grid282…

7 CE Management
- Main Torque/MAUI commands
  - List of jobs : showq (MAUI)
  - List of WNs (-n) and queues (-c)… : diagnose (MAUI)
  - Detailed information about a job (MAUI) : checkjob jobid
  - Detailed information about a job (Torque) : qstat -f jobid
- Main log files
  /var/log/globus-gatekeeper.log
  /var/log/globus*marshall.log
- Torque/MAUI configuration and logs
  - Torque : /var/spool/pbs
  - MAUI : /var/spool/maui + /var/log/maui.log
- MAUI default configuration
  - Fairshare : takes into account VO usage in the previous days
  - 2 job slots per CPU : 1 dedicated to dteam/ops (tests) and short deadline jobs
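As a worked example of the default slot policy above, the arithmetic can be written out. This is a sketch that assumes the policy applies exactly as stated on the slide (two slots per CPU, one of them reserved for dteam/ops tests and short deadline jobs); the function name `job_slots` and the 8-CPU node are illustrative only.

```python
def job_slots(cpus):
    """Split the job slots of a WN under the default policy from the slide.

    2 slots per CPU: one for regular jobs, one reserved for dteam/ops
    test jobs and short deadline jobs.
    """
    if cpus < 0:
        raise ValueError("cpus must be non-negative")
    return {"total": 2 * cpus, "regular": cpus, "reserved": cpus}


if __name__ == "__main__":
    # Hypothetical worker node with 8 CPUs.
    print(job_slots(8))
```

For the hypothetical 8-CPU node this gives 16 slots in total, 8 for regular jobs and 8 reserved.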

8 WN Configuration
- Create a HW template for WN box
  cfg/sites/tutorial/hardware/machine/…
- Create a profile in cluster glite-3.1
  cfg/clusters/glite-3.1/profiles/grid283.lal.in2p3.fr.tpl
  - Start with a similar profile from example cluster
- Update WN list in gLite site parameters
  - WORKER_NODES and WN_CPUS in cfg/sites/tutorial/site/glite/config.tpl
- Compile and deploy
  - Don’t forget svn commit
- Configure initial installation of the machine
  aii-shellfe --configure grid283.lal.in2p3.fr
  aii-shellfe --install grid283.lal.in2p3.fr
- Start grid283…

9 WN Management
- No daemon, no logs
  - Except Torque client (pbs_mom) but never a problem…
- Source of information is job stdout/stderr

10 User Interface (UI)
- Same configuration procedure…
- Use 1 account per user
  - It is not possible to share .globus
  - Never share a certificate
- Unlock user accounts created by Quattor if you want to be able to log in
- No daemon, no logs
- With 64-bit OS, edit 4 scripts replacing ‘python2’ by ‘python’
  /opt/glite/bin/glite-wms-jobs-xxx

11 SE Configuration
- Create a HW template for SE DPM box
  cfg/sites/tutorial/hardware/machine/…
- Create a profile in cluster glite-3.1
  cfg/clusters/glite-3.1/profiles/grid284.lal.in2p3.fr.tpl
  - Start with a similar profile from example cluster
- Configuration : cfg/sites/tutorial/site/dpm/config.tpl
  - May define a specific VO list (different from CE) in a template referred to by the NODE_VO_CONFIG variable
- Compile and deploy
  - Don’t forget svn commit
- Configure initial installation of the machine
  aii-shellfe --configure grid284.lal.in2p3.fr
  aii-shellfe --install grid284.lal.in2p3.fr
- Start grid284…

12 SE Management
- Commands to display configuration
  - dpm-qryconf : legacy command, doesn’t display everything
  - dpm-listspaces : new command, more friendly
- Configuration commands
  - dpm-xxx : modify pools, file systems configuration
    - Requires environment variable DPM_HOST=dpm_host_name
  - dpns-xxx : management of DPM "namespace"
    - DPM ls, rm…
    - Very few reasons to use these commands
    - Requires environment variable DPNS_HOST=dpm_host_name
- Logs : 1 file per daemon (6)
  - /var/log/dpm/log : physical operations (main log file)
  - /var/log/srmv1|v2.2/log : SE access (through SRM)
  - /var/log/dpns/log : namespace operations
  - /var/log/rfio/log and /var/log/messages : RFIO + gridftp (transfers)
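Because the dpm-xxx and dpns-xxx commands depend on DPM_HOST and DPNS_HOST, a small wrapper can set both before each call. The sketch below is an illustration, not something from the slides: the helper names `dpm_env` and `run_dpm_command` are hypothetical, and the wrapper simply passes the given command line through without interpreting its arguments.

```python
import os
import subprocess


def dpm_env(dpm_host, base_env=None):
    """Return a copy of the environment with DPM_HOST and DPNS_HOST set."""
    env = dict(os.environ if base_env is None else base_env)
    env["DPM_HOST"] = dpm_host
    env["DPNS_HOST"] = dpm_host
    return env


def run_dpm_command(argv, dpm_host):
    """Run a DPM/DPNS command line, e.g. ["dpm-qryconf"], against dpm_host."""
    # check=True raises CalledProcessError if the command exits non-zero.
    return subprocess.run(argv, env=dpm_env(dpm_host), check=True)
```

On a machine with the DPM client tools installed, `run_dpm_command(["dpm-qryconf"], "grid284.lal.in2p3.fr")` would query the SE configured on the previous slide; the calling shell's own environment is left untouched.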

13 Checking Quattor Changes
- Before deployment
  - cp build build.saved in SCDB working copy
  - Compile changes
  - src/utils/profiles/compare_xml [-v]
- After deployment: Quattor client logs
  - /var/log/ncm-cdispd.log
  - In case of SPMA errors : /var/log/spma.log
  - If nothing happened, troubleshoot SCDB hook script :
  - Running a component manually: ncm-ncd --configure [component…|-all]
  - Checking client configuration ncm-query --component component
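The idea behind the pre-deployment check (save the previous build, recompile, compare) can be illustrated with a generic directory comparison. The sketch below is a stand-in and not the `compare_xml` utility itself: it only reports which files differ byte for byte, were added, or were removed between two directories. The function name `changed_profiles` is hypothetical, and which directories hold the compiled profiles is left to the caller.

```python
import filecmp
import os


def changed_profiles(saved_dir, new_dir):
    """Compare two build directories file by file.

    Returns three sorted lists of file names: (changed, added, removed).
    Only the top level of each directory is inspected.
    """
    saved = {n for n in os.listdir(saved_dir)
             if os.path.isfile(os.path.join(saved_dir, n))}
    new = {n for n in os.listdir(new_dir)
           if os.path.isfile(os.path.join(new_dir, n))}
    changed = sorted(
        n for n in saved & new
        # shallow=False forces a comparison of the file contents.
        if not filecmp.cmp(os.path.join(saved_dir, n),
                           os.path.join(new_dir, n), shallow=False)
    )
    return changed, sorted(new - saved), sorted(saved - new)
```

An empty `changed` list for a machine whose templates were not touched is a quick sanity check that an edit did not leak into unrelated profiles.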

