South African Grid Training WORKER NODE Albert van Eck UFS - ICTS 17 November 2009 Slides by GIUSEPPE PLATANIA.


1 South African Grid Training WORKER NODE Albert van Eck UFS - ICTS 17 November 2009 Slides by GIUSEPPE PLATANIA

2 18 Nov. 2009 – Cape Town South African Grid Training
OUTLINE
- OVERVIEW
- INSTALLATION & CONFIGURATION
- TESTING
- FIREWALL SETUP
- TROUBLESHOOTING

3 OVERVIEW
The Worker Node (WN) is the service on which the jobs run. Its main functions are to:
- execute the jobs
- report the status of the jobs to the Computing Element (CE)
It can run as a client of several kinds of batch systems:
- Torque
- LSF
- SGE
- Condor

4 TORQUE client
The Torque client is composed of:
- pbs_mom, which places the job into execution. It is also responsible for returning the job's output to the user.

5 Worker Node installation & configuration using YAIM

6 WHAT KIND OF WN?
There are several metapackages to choose from:
- ig_WN: "Generic" Worker Node.
- ig_WN_noafs: like ig_WN but without AFS.
- ig_WN_LSF: LSF Worker Node. IMPORTANT: provided for consistency; it does not install the LSF software, but it applies some fixes via ig_configure_node.
- ig_WN_LSF_noafs: like ig_WN_LSF but without AFS.
- ig_WN_torque: Torque Worker Node.
- ig_WN_torque_noafs: like ig_WN_torque but without AFS.

7 Repository settings
    REPOS="ca dag glite-wn ig jpackage glite-wn_torque gilda"
Download and save the repo files:
    for name in $REPOS; do
        wget http://grid018.ct.infn.it/mrepo/repos/$name.repo -O /etc/yum.repos.d/$name.repo
    done
Note: a Worker Node doesn't require a host certificate.
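
The download loop on this slide stops for nothing, so a failed download can go unnoticed. The sketch below is one possible way to report failures; it reuses the base URL and repository names from the slide, while the function names repo_url and fetch_repos are illustrative and not part of the training material. It assumes wget is installed.

```shell
#!/bin/sh
# Base URL and repository names come from the slide; the rest is illustrative.
REPO_BASE="http://grid018.ct.infn.it/mrepo/repos"
REPOS="ca dag glite-wn ig jpackage glite-wn_torque gilda"

# Print the URL of the .repo file for one repository name.
repo_url() {
    printf '%s/%s.repo\n' "$REPO_BASE" "$1"
}

# Download every repo file into the given directory (default: /etc/yum.repos.d).
# Returns non-zero if at least one download failed.
fetch_repos() {
    dest="${1:-/etc/yum.repos.d}"
    failed=0
    for name in $REPOS; do
        if ! wget -q "$(repo_url "$name")" -O "$dest/$name.repo"; then
            echo "failed to download $name.repo" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

Running fetch_repos as root with no argument writes into /etc/yum.repos.d, as on the slide; passing a scratch directory first is a cheap way to try it out.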

8 INSTALLATION
    yum remove jdk
    yum install xml-commons-resolver12
    yum install jdk java-1.6.0-sun-compat
    yum install lcg-CA
    yum install torque-mom-2.1.9-4cri.slc4
    yum install ig_WN_torque_noafs
Gilda rpms:
    yum install gilda_utils gilda_applications
In case you want to have AFS installed:
    yum install openafs openafs-client kernel-module-openafs-`uname -r`
    yum install ig_WN_torque

9 Customize ig-site-info.def
Copy the users and groups example files to /opt/glite/yaim/etc/gilda/:
    cp /opt/glite/yaim/examples/ig-groups.conf /opt/glite/yaim/etc/gilda/
    cp /opt/glite/yaim/examples/ig-users.conf /opt/glite/yaim/etc/gilda/
Append the gilda users and groups definitions to /opt/glite/yaim/etc/gilda/ig-users.conf and ig-groups.conf:
    cat /opt/glite/yaim/etc/gilda/gilda_ig-users.conf >> /opt/glite/yaim/etc/gilda/ig-users.conf
    cat /opt/glite/yaim/etc/gilda/gilda_ig-groups.conf >> /opt/glite/yaim/etc/gilda/ig-groups.conf

10 Customize ig-site-info.def
Copy the ig-site-info.def template file provided by ig_yaim into the gilda directory and customize it:
    cp /opt/glite/yaim/examples/siteinfo/ig-site-info.def /opt/glite/yaim/etc/gilda/
Open the copied site-info file in /opt/glite/yaim/etc/gilda/ with a text editor and set the following values according to your grid environment:
    CE_HOST=
    TORQUE_SERVER=$CE_HOST

11 Customize ig-site-info.def
    GROUPS_CONF=/opt/glite/yaim/etc/gilda/ig-groups.conf
    USERS_CONF=/opt/glite/yaim/etc/gilda/ig-users.conf
    JAVA_LOCATION="/usr/java/latest"
    JOB_MANAGER=lcgpbs
    BATCH_BIN_DIR=/usr/bin
    BATCH_VERSION=torque-2.1.9-4
    VOS="gilda"
    ALL_VOMS="gilda"

12 Customize ig-site-info.def
    QUEUES="short long infinite gilda"
    SHORT_GROUP_ENABLE=$VOS
    LONG_GROUP_ENABLE=$VOS
    INFINITE_GROUP_ENABLE=$VOS
To configure a queue for a single VO:
    QUEUES="short long infinite gilda"
    SHORT_GROUP_ENABLE=$VOS
    LONG_GROUP_ENABLE=$VOS
    INFINITE_GROUP_ENABLE=$VOS
    GILDA_GROUP_ENABLE="gilda"
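
Before running the configuration step it can help to confirm that the variables listed on these slides actually have values. The following is a minimal sketch, assuming the site-info file uses plain NAME=value lines as shown on the slides; the function name check_site_info is hypothetical, and the file path is passed in by the caller.

```shell
#!/bin/sh
# Print the name of every listed variable that is missing or empty in the file.
# Usage: check_site_info FILE VAR [VAR...]
# Returns non-zero if at least one variable was reported.
check_site_info() {
    file="$1"
    shift
    missing=0
    for var in "$@"; do
        # Take the value of the last assignment of this variable, if any.
        value=$(sed -n "s/^[[:space:]]*$var=//p" "$file" | tail -n 1)
        # Strip surrounding double quotes before testing for emptiness.
        value=$(printf '%s' "$value" | sed 's/^"//; s/"$//')
        if [ -z "$value" ]; then
            echo "$var"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_site_info with the site-info file followed by CE_HOST TORQUE_SERVER VOS QUEUES WN_LIST prints the names that still need a value. The check is purely textual: it does not expand references such as $CE_HOST.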

13 Customize ig-site-info.def
    WN_LIST=/opt/glite/yaim/etc/gilda/wn-list.conf
The file specified in WN_LIST must list the full hostnames of all your WNs.
WARNING: it is important to fill in the WN list file before you run the yaim configure command.
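
Since the WN list must contain full hostnames, a quick sanity check before configuring can catch short names. The sketch below assumes the file holds one hostname per line; the function name check_wn_list and the example.org hostnames in the usage note are illustrative only.

```shell
#!/bin/sh
# Print every non-blank line of the WN list that does not look like a full
# hostname (labels of letters, digits and hyphens, with at least one dot).
# Returns non-zero if any such line is found.
check_wn_list() {
    bad=$(grep -v '^[[:space:]]*$' "$1" |
        grep -Ev '^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?(\.[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?)+$')
    if [ -n "$bad" ]; then
        printf '%s\n' "$bad"
        return 1
    fi
    return 0
}
```

With a file containing wn1.example.org and wn2, the function prints wn2 because it has no domain part. It only checks the spelling of each entry, not whether the host resolves.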

14 WN Torque CONFIGURATION
Now we can configure the node:
    /opt/glite/yaim/bin/ig_yaim -c \
        -s /opt/glite/yaim/etc/gilda/ \
        -n ig_WN_torque_noafs

15 Worker Node testing

16 Testing
Verify that pbs_mom is running and that the node state is free:
    [root@wn root]# /etc/init.d/pbs_mom status
    pbs_mom (pid 3692) is running...
    [root@wn root]# pbsnodes -a
    wn.localdomain
         state = free
         np = 2
         properties = lcgpro
         ntype = cluster
         status = arch=linux,uname=Linux wn.localdomain 2.4.21-37.EL.cern 1 Tue Oct 4 16:45:05 CEST 2005 i686,sessions=5892 5910 563 1703 2649,3584,nsessions=6,nusers=1,idletime=1569,totmem=254024kb,availmem=69852kb,physmem=254024kb,ncpus=1,loadave=0.30,rectime=1159016111
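
With more than a few nodes, reading the pbsnodes listing by eye gets tedious. The sketch below filters pbsnodes -a style output and prints only the nodes whose state is not free. It assumes the layout shown on the slide (node name unindented, attributes indented as "name = value"); the function name nodes_not_free is hypothetical.

```shell
#!/bin/sh
# Read `pbsnodes -a` style output on stdin and print "node state" for every
# node whose state is not exactly "free".
nodes_not_free() {
    awk '
        # An unindented, non-empty line starts a new node record.
        /^[^[:space:]]/ { node = $1 }
        # Attribute lines look like: state = free
        $1 == "state" && $2 == "=" && $3 != "free" { print node, $3 }
    '
}
```

Typical use would be pbsnodes -a | nodes_not_free; no output means every listed node is free.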

17 Testing
First of all, check that a generic user on the WN can ssh to the CE without typing a password:
    [root@wn root]# su - gilda001
    [gilda001@wn gilda001] ssh ce
    [gilda001@ce gilda001]
The same test has to be run between the WNs in order to run MPI jobs:
    [gilda001@wn gilda001] ssh wn1
    [gilda001@wn1 gilda001]
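
The manual ssh test can be scripted so that it never waits at a password prompt. The sketch below relies on the OpenSSH options BatchMode and ConnectTimeout; the function names are illustrative, and the hosts to test are supplied by the caller (ce and wn1 in the slide's example).

```shell
#!/bin/sh
# Return 0 if the current user can ssh to the host without being prompted.
# BatchMode=yes makes ssh fail instead of asking for a password.
can_ssh_without_password() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Check each host given as an argument and report the result.
# Returns non-zero if any host failed.
check_hosts() {
    rc=0
    for host in "$@"; do
        if can_ssh_without_password "$host"; then
            echo "$host: OK"
        else
            echo "$host: password or connection problem"
            rc=1
        fi
    done
    return "$rc"
}
```

Run as the pool user, for instance check_hosts ce wn1 after su - gilda001. A failure only says that non-interactive login did not work; it does not distinguish a missing host entry from a network problem.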

18 FIREWALL Setup

19 /etc/sysconfig/iptables
    *filter
    :INPUT ACCEPT [0:0]
    :FORWARD ACCEPT [0:0]
    :OUTPUT ACCEPT [0:0]
    :RH-Firewall-1-INPUT - [0:0]
    -A INPUT -j RH-Firewall-1-INPUT
    -A FORWARD -j RH-Firewall-1-INPUT
    -A RH-Firewall-1-INPUT -i lo -j ACCEPT
    -A RH-Firewall-1-INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    -A RH-Firewall-1-INPUT -p tcp -s --dport 22 -j ACCEPT
    -A RH-Firewall-1-INPUT -p all -s -j ACCEPT
    -A RH-Firewall-1-INPUT -p tcp -m tcp --syn -j REJECT
    -A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
    COMMIT

20 IPTABLES STARTUP
    /sbin/chkconfig iptables on
    /etc/init.d/iptables start

21 Troubleshooting

22 Troubleshooting
    [root@wn root]# su - gilda001
    [gilda001@wn gilda001] ssh ce
    gilda001@ce's password:
If a password is requested, the WN hostname is probably missing from /etc/ssh/shosts.equiv, or its ssh keys were not created and stored in /etc/ssh/ssh_known_hosts on the CE.
Solution (run on the CE): ensure that the WN is in the pbs node list using:
    [root@ce root]# pbsnodes -a
And then:
    [root@ce root]# /opt/edg/sbin/edg-pbs-shostsequiv
    [root@ce root]# /opt/edg/sbin/edg-pbs-known-hosts
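
To see which WNs are affected before regenerating the files, the WN list can be compared against the equivalence file on the CE. This is a rough sketch, assuming both files hold one hostname per line; the function name missing_from_equiv is hypothetical, and the paths are passed in by the caller.

```shell
#!/bin/sh
# Print every host from the WN list file that does not appear as a whole
# line in the equivalence file.
# Usage: missing_from_equiv WN_LIST_FILE EQUIV_FILE
missing_from_equiv() {
    wn_list="$1"
    equiv="$2"
    while IFS= read -r host; do
        # Skip blank lines in the WN list.
        [ -n "$host" ] || continue
        # -x matches the whole line, -F treats the hostname as a fixed string.
        grep -qxF -- "$host" "$equiv" || echo "$host"
    done < "$wn_list"
}
```

On the CE this could be called with the WN list file and /etc/ssh/shosts.equiv; any hostname it prints is a candidate for the password prompt described on this slide.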

23 Troubleshooting
    [root@wn root]# pbsnodes -a
    wn.localdomain
         state = down
         np = 2
         properties = lcgpro
         ntype = cluster
Solution:
    [root@wn root]# /etc/init.d/pbs_mom restart
