(Exchange Programme to advance e-Infrastructure Know-How)


1 (Exchange Programme to advance e-Infrastructure Know-How)
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How)
CE + WN + site BDII installation and configuration
Giuseppe Platania, INFN Catania
Catania,

2 Computing Element Overview
The Computing Element (CE) provides some of the main services of a site.
Main functionalities:
- job management (job submission, job control)
- job status updates for the WMS
- publishing its information (queues, CPU availability and so on)
- authorization (LCAS) and mapping of grid credentials to local accounts (LCMAPS)
It can run several kinds of batch system:
- Torque + MAUI
- LSF
- SGE
- Condor
Algiers, Africa Joint EUMEDGRID-Support/EPIKH School for Grid Site Administrators

3 Torque + MAUI
Torque server service: pbs_server provides basic batch services such as receiving and creating batch jobs.
Torque client service: pbs_mom places jobs into execution. It is also responsible for returning the job's output to the user.
MAUI system service: job_scheduler holds the site's policy for deciding which job is executed and when.

4 Requirements
Two or more machines:
- one is used for the CE installation;
- the others are used for the WN installation.
Architecture: 64 bit
Operating system: Scientific Linux 5
At least one machine must have a public IP address, forward and reverse name resolution in the DNS, and an X.509 certificate.

5 CE Cream Installation (on Torque/PBS)

6 Repository set up (by CNAF repo)
Add the repositories specific to the middleware to the system repositories:
    cd /etc/yum.repos.d/
    mv dag.repo dag.repo.stop
    REPO="dag ig lcg-ca glite-cream_torque"
    for rep_name in $REPO; do
        wget
    done
Only for this tutorial:
    wget
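
The loop on this slide can be sketched as a dry run. The repository address is missing from the transcript, so REPO_BASE_URL below is a placeholder, and the "<name>.repo" file naming is an assumption too; the sketch only prints the wget commands so they can be checked before anything is downloaded.

```shell
#!/bin/sh
# Dry-run sketch of the repository set-up loop.
# REPO_BASE_URL is a placeholder (assumption): the slides do not show the
# real CNAF repository address.
REPO_BASE_URL="${REPO_BASE_URL:-http://repo.example.org/repos}"
REPO="dag ig lcg-ca glite-cream_torque"

# Print the download command for each repository file.
# The "<name>.repo" naming is also an assumption.
print_repo_commands() {
    for rep_name in $REPO; do
        echo "wget ${REPO_BASE_URL}/${rep_name}.repo"
    done
}

print_repo_commands
```

Once the real address is known, exporting REPO_BASE_URL and piping the output to sh (from inside /etc/yum.repos.d/) would perform the downloads.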

7 Which metapackages are we going to install?
There are several metapackages to install:
- lcg-CA: LHC Computing Grid RPM collection supporting the external Certification Authorities.
- ig_CREAM_torque: INFNGRID Computing Element with the CREAM and Torque services.
- ig_BDII_site: INFNGRID site BDII services.

8 Middleware component installation
Use yum to install the required packages:
    # yum install -y lcg-CA
    # yum install -y ig_CREAM_torque
Install the GILDA VOMS and CA certificates:
    # yum install -y gilda_utils

9 Before configuration
Some preliminary steps before configuration. Copy the host certificate and key to the default path and restrict their permissions:
    wget -O /etc/grid-security/hostcert.pem
    wget -O /etc/grid-security/hostkey.pem
    chmod 400 /etc/grid-security/hostkey.pem
    chmod 600 /etc/grid-security/hostcert.pem
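
A quick way to confirm the permissions set above is a small check like the following. It is only a sketch: check_mode is a hypothetical helper, it relies on GNU stat (as shipped with Scientific Linux), and the demonstration runs on throw-away files rather than on /etc/grid-security.

```shell
#!/bin/sh
# check_mode FILE EXPECTED: succeed only if FILE has exactly the octal
# permission bits EXPECTED. Uses GNU stat ("stat -c %a").
check_mode() {
    actual=$(stat -c '%a' "$1") || return 2
    if [ "$actual" = "$2" ]; then
        echo "OK   $1 ($actual)"
    else
        echo "FAIL $1 has mode $actual, expected $2"
        return 1
    fi
}

# Demonstration on temporary files; on a real CE pass the paths under
# /etc/grid-security instead.
demo_dir=$(mktemp -d)
touch "$demo_dir/hostcert.pem" "$demo_dir/hostkey.pem"
chmod 600 "$demo_dir/hostcert.pem"
chmod 400 "$demo_dir/hostkey.pem"
check_mode "$demo_dir/hostcert.pem" 600
check_mode "$demo_dir/hostkey.pem" 400
```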

10 Before configuration/2
Generate the configuration files from the YAIM templates:
    cd /opt/glite/yaim/examples/
    mkdir mysite-conf
    cp -r wn-list.conf ig-users.conf ig-groups.conf siteinfo/vo.d/ siteinfo/services/ siteinfo/ig-site-info.def mysite-conf/
    cd mysite-conf/
    mv ig-site-info.def my-ig-site-info.def

11 YAIM configuration
The main file to edit is my-ig-site-info.def, where you specify general settings and the parameters of the other components (CREAM CE).
Other files to edit are: wn-list.conf, ig-groups.conf, services/glite-creamce, services/ig-bdii_site.
Replace the example values with the correct ones.
    # vim services/glite-creamce
    CEMON_HOST=${CE_HOST}
    CREAM_DB_USER="cream_db_user"
    CREAM_DB_PASSWORD="cream_pass"
    BLPARSER_HOST=${CE_HOST}

12 YAIM configuration/3
    # vim wn-list.conf
Delete all the example values and insert the worker node hostnames, one per line:
    vmXX.ct.infn.it
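
For a site with many worker nodes the list can also be written by a script instead of by hand. The sketch below assumes the one-hostname-per-line layout; write_wn_list and the example.org hostnames are hypothetical, and the file is written to a temporary location rather than to the real wn-list.conf.

```shell
#!/bin/sh
# write_wn_list FILE HOST...: write one worker-node hostname per line,
# replacing whatever example values FILE contained before.
write_wn_list() {
    out=$1
    shift
    printf '%s\n' "$@" > "$out"
}

# Hypothetical hostnames, for illustration only.
tmp_conf=$(mktemp)
write_wn_list "$tmp_conf" wn01.example.org wn02.example.org
cat "$tmp_conf"
```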

13 YAIM configuration/7
In my-ig-site-info.def there are many variables to set:
    # vim my-ig-site-info.def
    WN_LIST=/opt/glite/yaim/examples/mysite-conf/wn-list.conf
    USERS_CONF=/opt/glite/yaim/examples/mysite-conf/ig-users.conf
    GROUPS_CONF=/opt/glite/yaim/examples/mysite-conf/ig-groups.conf
    MYSQL_PASSWORD=good_mysql_pass   # any password you want
    SITE_NAME=SITE-xx
    SITE_LAT=36.76
    SITE_LONG=3.00
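
Because the .def file is a list of shell variable assignments, a forgotten value can be caught before running YAIM. The following is a minimal sketch, not part of YAIM: check_site_info is a hypothetical helper, and the demonstration uses a temporary file with one value deliberately left empty.

```shell
#!/bin/sh
# check_site_info FILE VAR...: source FILE in a subshell and report every
# listed variable that is unset or empty. Returns non-zero if any is missing.
check_site_info() {
    def_file=$1
    shift
    (
        . "$def_file" || exit 2
        missing=0
        for var in "$@"; do
            eval "value=\${$var:-}"
            if [ -z "$value" ]; then
                echo "missing: $var"
                missing=1
            fi
        done
        exit "$missing"
    )
}

# Demonstration file: MYSQL_PASSWORD is left empty on purpose.
tmp_def=$(mktemp)
cat > "$tmp_def" <<'EOF'
SITE_NAME=SITE-xx
SITE_LAT=36.76
SITE_LONG=3.00
MYSQL_PASSWORD=
EOF

check_site_info "$tmp_def" SITE_NAME SITE_LAT SITE_LONG MYSQL_PASSWORD \
    || echo "fix the variables listed above before running YAIM"
```

Sourcing happens in a subshell so that the variables of the .def file do not leak into the calling shell.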

14 YAIM configuration/8
    # vim my-ig-site-info.def
    CE_HOST=vmXX.ct.infn.it
    CE_CPU_MODEL=XEON        # cat /proc/cpuinfo
    CE_CPU_VENDOR=Intel
    CE_CPU_SPEED=2230
    CE_OS=ScientificSL
    CE_OS_RELEASE=           # cat /etc/redhat-release
    CE_OS_VERSION="Boron"
    CE_OS_ARCH=x86_64
    CE_MINPHYSMEM=512        # cat /proc/meminfo on WN
    CE_MINVIRTMEM=512
    CE_PHYSCPU=              # total CPUs in the site (dual dual core)
    CE_LOGCPU=4
    CE_SMPSIZE=4
    CE_OUTBOUNDIP=TRUE
    CE_INBOUNDIP=FALSE

15 YAIM configuration/9
    # vim my-ig-site-info.def
    CE_RUNTIMEENV="
        GLITE-3_1_0
        GLITE-3_2_0
        R-GMA
        SI00MeanPerCPU_3800
        SF00MeanPerCPU_3800
    "
    CE_SI00=3800
    CE_SF00=3800

16 YAIM configuration/10
    # vim my-ig-site-info.def
    CE_CAPABILITY="CPUScalingReferenceSI00=3800"
    CE_OTHERDESCR="Cores=4,Benchmark=23.75-HEP-SPEC06"
    SE_MOUNT_INFO_LIST="${INT_HOST_SW_DIR}:/opt/exp_soft,/opt/exp_soft"

17 YAIM configuration/11
How do you set CE_SI00, CE_SF00, CE_CAPABILITY and CE_OTHERDESCR?
Look up the value for your CPU at this link: SPEC06 12
For example, for an Intel XEON GHz without Hyper-Threading the table at that link gives a value of 95. With a conversion factor of 1 HS06 = 40 this yields:
    CE_SI00=3800
    CE_SF00=3800
    CE_CAPABILITY="CPUScalingReferenceSI00=3800"
    CE_OTHERDESCR="Cores=4,Benchmark=23.75-HEP-SPEC06"
where 95 x 40 = 3800 and (3800/40)/4 = 23.75.
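
The arithmetic of this example can be reproduced with a few lines of shell, which is handy when the table value or the core count differs. This is only a sketch using the numbers quoted on the slide (95, 40 and 4); the variable names TABLE_VALUE, FACTOR, CORES and BENCHMARK are illustrative and are not YAIM variables.

```shell
#!/bin/sh
# Numbers quoted on the slide: table value 95, conversion factor 40,
# and 4 cores.
TABLE_VALUE=95
FACTOR=40
CORES=4

CE_SI00=$((TABLE_VALUE * FACTOR))
# Shell arithmetic is integer-only, so awk computes the per-core benchmark.
BENCHMARK=$(LC_ALL=C awk -v si00="$CE_SI00" -v f="$FACTOR" -v c="$CORES" \
    'BEGIN { printf "%.2f", (si00 / f) / c }')

# Print the lines in the form used in my-ig-site-info.def.
echo "CE_SI00=$CE_SI00"
echo "CE_SF00=$CE_SI00"
echo "CE_CAPABILITY=\"CPUScalingReferenceSI00=$CE_SI00\""
echo "CE_OTHERDESCR=\"Cores=$CORES,Benchmark=$BENCHMARK-HEP-SPEC06\""
```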

18 YAIM configuration/12
    # vim my-ig-site-info.def
    BATCH_SERVER=vmXX.ct.infn.it
    JOB_MANAGER=pbs
    CE_BATCH_SYS=pbs
    BATCH_LOG_DIR=/var/spool/pbs
    APEL_DB_PASSWORD="anything"
    DGAS_ACCT_DIR=/var/spool/pbs/server_priv/accounting
    VOS="infngrid ops dteam"
    QUEUES="cert infngrid"
    CERT_GROUP_ENABLE="ops dteam"
    INFNGRID_GROUP_ENABLE="infngrid"

19 YAIM configuration/14
After editing, launch the configuration commands:
    /opt/glite/yaim/bin/ig_yaim -c -s my-ig-site-info.def -n ig_CREAM_torque
    /opt/glite/yaim/bin/ig_yaim -c -s my-ig-site-info.def -n ig_BDII_site

20 Fixing errors
After configuration, check that Tomcat is running. If you get this message:
    # /etc/init.d/tomcat5 status
    /etc/init.d/tomcat5 is stopped
then, ONLY IF TOMCAT IS STOPPED, try this solution:
    # rm -fr /var/lib/tomcat5/common/lib/jakarta*
    # /etc/init.d/tomcat5 start
    Starting tomcat5: /usr/bin/rebuild-jar-repository: error: Could not find log4j Java extension for this JVM
    /usr/bin/rebuild-jar-repository: error: Some detected jars were not found for this jvm
    [ OK ]
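
Since the fix must be applied only when Tomcat is stopped, the decision can be guarded by a small test on the status text. The sketch below is an illustration: tomcat_is_stopped is a hypothetical helper, the status string is the sample from the slide rather than live output, and the fix commands are printed instead of executed.

```shell
#!/bin/sh
# tomcat_is_stopped STATUS_TEXT: succeed when the status output reports
# that Tomcat is stopped, which is the only case where the fix applies.
tomcat_is_stopped() {
    case "$1" in
        *"is stopped"*) return 0 ;;
        *) return 1 ;;
    esac
}

# On the CE the real output would be captured with:
#   status=$(/etc/init.d/tomcat5 status)
# Here the sample message from the slide is used so the sketch runs anywhere.
status="/etc/init.d/tomcat5 is stopped"

if tomcat_is_stopped "$status"; then
    # Printed rather than executed: review before running as root.
    echo "rm -fr /var/lib/tomcat5/common/lib/jakarta*"
    echo "/etc/init.d/tomcat5 start"
else
    echo "tomcat5 is running, nothing to do"
fi
```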

21 WN Cream Installation (on Torque/PBS)

22 Worker Node Element Overview
Worker nodes are the machines that actually execute your jobs.
Users can access their services only through a Computing Element.

23 WN - Repository set up (by CNAF repo)
Add the repositories specific to the middleware to the system repositories:
    cd /etc/yum.repos.d/
    mv dag.repo dag.repo.stop
    REPO="dag ig lcg-ca glite-wn_torque"
    for rep_name in $REPO; do
        wget
    done
Only for this tutorial:
    wget

24 Which metapackages are we going to install?
There are several metapackages to install:
- lcg-CA: LHC Computing Grid RPM collection supporting the external Certification Authorities.
- ig_WN_torque_noafs: INFNGRID Worker Node with the Torque client, needed to talk to the Torque server. We chose not to install the AFS file system. This metapackage is installed with the groupinstall option.

25 WN - Middleware component installation
Use yum to install the required packages:
    # yum clean all
    # yum install -y lcg-CA
    # yum groupinstall -y ig_WN_torque_noafs

26 WN - YAIM Configuration
You can use the same configuration files edited on the CE on every worker node of the site, so you do not need to re-edit anything.
Copy the files from the CE machine:
    # cd /opt/glite/yaim/examples/
    # scp -r .
    # cd mysite-conf
Now you are ready to configure:
    # /opt/glite/yaim/bin/ig_yaim -c -s my-ig-site-info.def -n ig_WN_torque_noafs

27 Testing installation

28 Tests on CE
SSH into the CE to check that the CE can see the WNs and that all the main services are up and running:
    # pbsnodes
    cerist45.grid.arn.dz
        state = free
        np =
        properties = lcgpro
        ntype = cluster
        status = opsys=linux,uname=Linux grid-test-63.trigrid.it el5 #1 [cut]
    # /etc/init.d/gLite status
    *** tomcat5:
    /opt/glite/etc/init.d/tomcat5 is already running (1514)
    *** glite-lb-locallogger:
    glite-lb-logd running
    glite-lb-interlogd running
    # /etc/init.d/globus-gridftp status
    globus-gridftp-server (pid 25452) is running...
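
With more than a handful of worker nodes, reading the pbsnodes listing by eye becomes tedious. A rough sketch of counting nodes by state follows; count_nodes_in_state is a hypothetical helper, and the sample listing uses invented hostnames so that the example runs without a batch system.

```shell
#!/bin/sh
# count_nodes_in_state STATE: read pbsnodes-style output on stdin and
# print how many nodes report exactly "state = STATE".
count_nodes_in_state() {
    awk -v want="$1" '
        $1 == "state" && $2 == "=" && $3 == want { n++ }
        END { print n + 0 }
    '
}

# Invented sample in the layout shown on the slide.
sample='wn01.example.org
     state = free
     properties = lcgpro
     ntype = cluster

wn02.example.org
     state = down
     properties = lcgpro
     ntype = cluster'

printf '%s\n' "$sample" | count_nodes_in_state free
```

On the CE the function would read the live listing instead, for example `pbsnodes -a | count_nodes_in_state free`. Nodes reporting a combined state such as "down,offline" are not matched by this exact comparison.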

29 Tests on CE
SSH into the CE and then become a GILDA user:
    # su - infngrid001
Create a file with the following content:
    $ vi test.sh
    #!/bin/sh
    sleep 20   # useful for watching the job status
    hostname
Set the right permissions to make it executable:
    $ chmod 700 test.sh

30 Tests on CE
Launch the job locally on the CE:
    $ qsub -q infngrid test.sh
Then check the list of jobs running on the CE:
    $ qstat -a
    ce.localdomain:
                                                  Req'd  Req'd   Elap
    Job ID  Username Queue Jobname SessID NDS TSK Memory Time  S Time
    wn.localdo gilda001 short test.sh :15 R --
If you want more information:
    $ qstat -f 3
If you want to abort a job:
    $ qdel 3   # 3 is the job ID

31 Tests on CE
If "qstat -a" produces no output, no jobs are running on the CE. This means your previous job has finished, so you can now list its output:
    $ ls
    test.sh.e3 test.sh.o3
    $ cat test.sh.e3   # error file
    $
    $ cat test.sh.o3   # output file
    wn.localdomain
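
The "wait until qstat prints nothing, then read the output files" routine can be scripted. The sketch below is hedged: wait_until_quiet and job_output_files are hypothetical helpers, the file naming simply follows the test.sh.o3 / test.sh.e3 pattern shown on the slide, and `true` stands in for qstat so the example runs without a batch system.

```shell
#!/bin/sh
# wait_until_quiet CMD...: re-run CMD every POLL_SECONDS until it prints
# nothing, mirroring "no output from qstat -a means no jobs are running".
POLL_SECONDS="${POLL_SECONDS:-5}"
wait_until_quiet() {
    while [ -n "$("$@")" ]; do
        sleep "$POLL_SECONDS"
    done
}

# job_output_files SCRIPT JOBID: print the output and error file names
# in the form shown on the slide (SCRIPT.oJOBID and SCRIPT.eJOBID).
job_output_files() {
    echo "$1.o$2"
    echo "$1.e$2"
}

# On the CE: wait_until_quiet qstat -a
# Here "true" (which prints nothing) stands in for qstat.
wait_until_quiet true
job_output_files test.sh 3
```

Note that this waits for the whole queue listing to become empty, not for one specific job.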

32 Troubleshooting
Which logs should you check if something goes wrong?
- /var/log/messages, for general errors
- /opt/glite/var/log (especially glite-ce-cream.log)
- /var/spool/pbs/server_priv/accounting/<date>, if even local submission to the batch system does not work.
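
When going through these logs, it often helps to pull out only the most recent error lines. A minimal sketch follows; show_recent_errors is a hypothetical helper, the search word "error" is an assumption about what is worth looking for, and the demonstration uses an invented temporary log.

```shell
#!/bin/sh
# show_recent_errors FILE [N]: print the last N lines (default 20) of FILE
# that contain "error", ignoring case. Unreadable files are reported on
# stderr and make the function fail.
show_recent_errors() {
    if [ ! -r "$1" ]; then
        echo "cannot read $1" >&2
        return 1
    fi
    grep -i 'error' "$1" | tail -n "${2:-20}"
}

# Invented log content, for illustration only.
demo_log=$(mktemp)
cat > "$demo_log" <<'EOF'
service started
ERROR: could not contact batch server
job accepted
error while parsing configuration
EOF

show_recent_errors "$demo_log" 5
```

On the CE the same call would take one of the paths listed above, for example /var/log/messages.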

33 Site BDII Installation

34 Site BDII*
By default it is installed on the CE.
It collects all the site GRISes** (for example SE, RB, LFC, etc.).
The service is named bdii.
Log file: /opt/bdii/var/bdii.log
*BDII = Berkeley Database Information Index
**GRIS = Grid Resource Information Service

35 Site BDII - Repository set up (by CNAF repo)
Add the repositories specific to the middleware to the system repositories:
    cd /etc/yum.repos.d/
    mv dag.repo dag.repo.stop
    REPO="dag ig lcg-ca glite-bdii_site"
    for rep_name in $REPO; do
        wget
    done
Only for this tutorial:
    wget

36 Which metapackages are we going to install?
- lcg-CA: LHC Computing Grid RPM collection supporting the external Certification Authorities.
- ig_BDII_site: INFNGRID site BDII.

37 Site BDII - Middleware component installation
Use yum to install the required packages:
    yum install -y lcg-CA
    yum groupinstall -y ig_BDII_site

38 YAIM Configuration
You can reuse the configuration files edited on the CE, so you do not need to re-edit anything.
Copy the files from the CE machine:
    # cd /opt/glite/yaim/examples/
    # scp -r .
    # cp siteinfo/services/glite-bdii_site mysite-conf/services

39 YAIM configuration
    # vim mysite-conf/services/glite-bdii_site
    SITE_DESC="SITE-xx Test site"
    SITE_LOC="Catania, Italy"
    SITE_WEB="
    SITE_OTHER_GRID="WLCG|EGEE|IGI"
    BDII_REGIONS="CE"
    BDII_CE_URL="ldap://$CE_HOST:2170/mds-vo-name=resource,o=grid"
    # vim mysite-conf/my-ig-site-info.def
    SITE_BDII_HOST=my-bdii.$MY_DOMAIN
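
The BDII_CE_URL value packs two things together: the LDAP endpoint of the CE and the search base. The sketch below splits such a URL and prints an ldapsearch command that could be used to check that the CE publishes data. split_ldap_url is a hypothetical helper and ce.example.org is an invented host; the command is printed rather than run, so no LDAP client is needed.

```shell
#!/bin/sh
# split_ldap_url URL: print the "ldap://host:port" part and the search
# base of a URL shaped like BDII_CE_URL, one per line.
split_ldap_url() {
    rest=${1#ldap://}          # host:port/base
    echo "ldap://${rest%%/*}"  # everything before the first "/"
    echo "${rest#*/}"          # everything after the first "/"
}

CE_HOST=ce.example.org   # invented host name, for illustration only
BDII_CE_URL="ldap://$CE_HOST:2170/mds-vo-name=resource,o=grid"

uri=$(split_ldap_url "$BDII_CE_URL" | head -n 1)
base=$(split_ldap_url "$BDII_CE_URL" | tail -n 1)

# Printed, not executed: -x selects simple authentication, -H the server
# URI and -b the search base.
echo "ldapsearch -x -H $uri -b $base"
```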

40 YAIM Configuration
Now you are ready to configure:
    # /opt/glite/yaim/bin/ig_yaim -c -s my-ig-site-info.def -n ig_BDII_site


42 Any questions? Thank you for your kind attention!

