
1 Kolkata, Asia 2 2011 - Joint CHAIN/EU-IndiaGrid2/EPIKH School for Grid Site Administrators, 03.02.2011
www.epikh.eu
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How)
CE+WN Installation and configuration
Riccardo Rotondo (riccardo.rotondo@ct.infn.it)
National Institute of Nuclear Physics

2 Outline
Computing Element overview
Worker Node overview
CE CREAM overview
gLite stack overview
gLite CE CREAM and site BDII
–Installation on CE and WN (wiki)
–Configuration on CE and WN (wiki)

3 gLite stack overview

4 gLite overview (diagram label: worker node)

5 gLite overview
User Interface (UI): the point of access for users to gLite grid services
WMS: the component that optimizes resource usage
CE: the machine that manages the worker nodes
WN: the machines that actually execute applications
SE: the machines where files are stored
LFC: used to "find" files on the grid
BDII: the service responsible for publishing all information about your site
Logging and Bookkeeping: as its name says, it logs job events and alerts the user when a job is finished

6 Computing Element Overview
The Computing Element provides some of the main services of a site.
Main functionalities:
–job management (job submission, job control)
–job status updates for the WMS
–Usually installed together with the site BDII service, which publishes all information regarding the Computing Element
It can run several kinds of batch system:
–Torque + MAUI
–LSF
–SGE
–Condor

7 Torque + MAUI
Torque server service:
–pbs_server provides basic batch services such as receiving/creating a batch job.
Torque client service:
–pbs_mom places jobs into execution. It is also responsible for returning the job's output to the user.
MAUI system service:
–job_scheduler contains the site's policy for deciding which job is going to be executed and when.

8 Site BDII*
By default it used to be installed on the CE, but now it is better to install it on a dedicated server, physical or virtual.
It collects information from all the site GRISes** (for example SE, RB, LFC, etc.)
The service is named bdii
Log file: /opt/bdii/var/bdii.log
*BDII = Berkeley Database Information Index
**GRIS = Grid Resource Information Service

9 Worker Node Element Overview
Worker Nodes are the machines that actually execute your jobs.
Users can access their services only through a Computing Element.
Their characteristics are collected by the Computing Element, which publishes all the information through the BDII services.

10 CE CREAM overview
CREAM: Computing Resource Execution And Management
It accepts job submission requests coming from a WMS, as well as other job management requests.
It exposes a web services interface.

11 Requirements
Three or more machines:
–One will be used for the CE installation;
–The others will be used for the WN installation.
Architecture: 64 bit
Operating System: Scientific Linux 5
The CE machine needs a public IP address, forward and reverse name resolution in the DNS, and an X.509 certificate.

12 CE CREAM and WN Installation & Configuration (on Torque/PBS)

13 Wiki
Follow the steps here for CE CREAM:
–https://grid.ct.infn.it/twiki/bin/view/EPIKH/CECreamEpikh
Follow the steps here for WN:
–https://grid.ct.infn.it/twiki/bin/view/EPIKH/WNEpikh

14 A few words on benchmark
How to set CE_SI00, CE_SF00, CE_CAPABILITY, CE_OTHERDESCR?
Try to look up your value at these links:
–http://www.italiangrid.org/grid_operations/site_manager/HEP-SPEC06
–https://hepix.caspur.it/benchmarks/doku.php?id=bench:results_sl5_x86_64_gcc_412
–https://hepix.caspur.it/processors/dokuwiki/doku.php?id=benchmarks:results
For example, if you have an Intel XEON 5520 2.23 GHz with no Hyper Threading, you will find in the table at the previous link a value of 95. With a conversion factor of 1 HS06 = 40, this gives:
    CE_SI00=3800
    CE_SF00=3800
    CE_CAPABILITY="CPUScalingReferenceSI00=3800"
    CE_OTHERDESCR="Cores=4,Benchmark=23.75-HEP-SPEC06"
Where 95 × 40 = 3800 and (3800/40)/4 = 23.75
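The arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration of the slide's own example: the function name `benchmark_settings` and its parameters are hypothetical, and the default factor of 40 is simply the conversion factor quoted on the slide, not a value taken from any official tool.

```python
def benchmark_settings(hs06_total, cores, si00_per_hs06=40):
    """Return the four benchmark variables as strings, following the slide's arithmetic."""
    # Whole-machine HS06 value times the conversion factor (95 * 40 = 3800 on the slide).
    si00 = round(hs06_total * si00_per_hs06)
    # Per-core HS06 value used in the Benchmark field (95 / 4 = 23.75 on the slide).
    hs06_per_core = hs06_total / cores
    return {
        "CE_SI00": str(si00),
        "CE_SF00": str(si00),
        "CE_CAPABILITY": f"CPUScalingReferenceSI00={si00}",
        "CE_OTHERDESCR": f"Cores={cores},Benchmark={hs06_per_core:g}-HEP-SPEC06",
    }


if __name__ == "__main__":
    # Print the values in the KEY="value" form used by the site configuration file.
    for name, value in benchmark_settings(95, 4).items():
        print(f'{name}="{value}"')
```

Calling `benchmark_settings(95, 4)` reproduces the four values shown on the slide; for a different processor, replace 95 and 4 with the HS06 value and core count found in the benchmark tables.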

15 Adding a VO
# vim my-ig-site-info.def
    VOS="euindia infngrid ops dteam"
    QUEUES="cert grid"
    CERT_GROUP_ENABLE="euindia ops dteam"
    GRID_GROUP_ENABLE="infngrid"

16 Adding a VO/2 (diagram labels: q1, q2, q3)

17 Adding a VO/3
Q1_GROUP_ENABLE
Q2_GROUP_ENABLE
Q3_GROUP_ENABLE
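The relation between QUEUES and the per-queue `<QUEUE>_GROUP_ENABLE` variables can be checked with a small script. The sketch below is illustrative only. It assumes the variable name is the queue name in upper case with "." and "-" replaced by "_", which is how YAIM is generally documented to derive it. It also assumes each entry is a plain VO name, although real configurations may list VOMS FQANs as well. The helper names `group_enable_var` and `queues_for_vo` are hypothetical.

```python
import re


def group_enable_var(queue):
    """Name of the variable listing the VOs allowed on a queue (assumed naming rule)."""
    return re.sub(r"[.-]", "_", queue).upper() + "_GROUP_ENABLE"


def queues_for_vo(vo, settings):
    """Return the queues whose *_GROUP_ENABLE list contains the given VO name."""
    result = []
    for queue in settings["QUEUES"].split():
        enabled = settings.get(group_enable_var(queue), "").split()
        if vo in enabled:
            result.append(queue)
    return result


# Values copied from the "Adding a VO" slide.
site = {
    "VOS": "euindia infngrid ops dteam",
    "QUEUES": "cert grid",
    "CERT_GROUP_ENABLE": "euindia ops dteam",
    "GRID_GROUP_ENABLE": "infngrid",
}

if __name__ == "__main__":
    for vo in site["VOS"].split():
        # A VO that maps to no queue would show an empty list here.
        print(vo, "->", queues_for_vo(vo, site))
```

A VO listed in VOS that ends up with an empty queue list is a likely sign that a `_GROUP_ENABLE` entry was forgotten.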

18 Adding a VO/4
Here are some settings to support the euindia VO:
# vim vo.d/euindia
    SW_DIR=$VO_SW_DIR/euindia
    DEFAULT_SE=$SE_HOST
    STORAGE_DIR=$CLASSIC_STORAGE_DIR/euindia
    VOMS_SERVERS="'vomss://voms.ct.infn.it:8443/voms/euindia?/euindia'"
    VOMSES="'euindia voms.ct.infn.it 15004 /C=IT/O=INFN/OU=Host/L=Catania/CN=voms.ct.infn.it euindia'"
    VOMS_CA_DN="'/C=IT/O=INFN/CN=INFN CA'"
Then install the VO VOMS certificates with:
    wget http://grid018.ct.infn.it/mrepo/cometa_sl4-i386/RPMS.app/cometa-vomscert-1.0-3.noarch.rpm
    rpm -ivh cometa-vomscert-1.0-3.noarch.rpm

19 Adding a VO/5
Now you have to provide a group and some users for the EUINDIA VO by modifying these two files:
–ig-groups.conf
–ig-users.conf
# vim ig-groups.conf
# Append the following lines to the end of the file
    "/euindia/ROLE=SoftwareManager":::sgm:
    "/euindia"::::

20 Adding a VO/6
# vim ig-users.conf
# Append these lines to the end of the file
    39001:euindia001:3900:euindia:euindia::
    39002:euindia002:3900:euindia:euindia::
    39003:euindia003:3900:euindia:euindia::
    39004:euindia004:3900:euindia:euindia::
    39005:euindia005:3900:euindia:euindia::
    39006:euindia006:3900:euindia:euindia::
    39007:euindia007:3900:euindia:euindia::
    39008:euindia008:3900:euindia:euindia::
    39009:euindia009:3900:euindia:euindia::
    39010:euindia010:3900:euindia:euindia::
    39011:euindia011:3900:euindia:euindia::
    39012:euindia012:3900:euindia:euindia::
    39013:euindia013:3900:euindia:euindia::
    39014:euindia014:3900:euindia:euindia::
    39015:euindia015:3900:euindia:euindia::
    39016:euindia016:3900:euindia:euindia::
    39017:euindia017:3900:euindia:euindia::
    39018:euindia018:3900:euindia:euindia::
    39019:euindia019:3900:euindia:euindia::
    39020:euindia020:3900:euindia:euindia::
    39101:sgmeuindia001:3910,3900:sgmeuindia,euindia:euindia:sgm:
    39102:sgmeuindia002:3910,3900:sgmeuindia,euindia:euindia:sgm:
    39103:sgmeuindia003:3910,3900:sgmeuindia,euindia:euindia:sgm:
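Typing the pool account lines by hand is error-prone, so they can be generated instead. The following is a minimal sketch that reproduces exactly the pattern visible on this slide (UID, login, GID list, group list, VO, flag). The function names `pool_lines` and `sgm_lines` are hypothetical, and the UID/GID ranges are simply the ones from the slide; pick ranges that are free at your own site.

```python
def pool_lines(vo, first_uid, gid, count):
    """Ordinary pool accounts: UID:LOGIN:GID:GROUP:VO:: (empty flag field)."""
    return [
        f"{first_uid + i}:{vo}{i + 1:03d}:{gid}:{vo}:{vo}::"
        for i in range(count)
    ]


def sgm_lines(vo, first_uid, sgm_gid, gid, count):
    """Software-manager accounts: two GIDs and groups, with the 'sgm' flag."""
    return [
        f"{first_uid + i}:sgm{vo}{i + 1:03d}:{sgm_gid},{gid}:sgm{vo},{vo}:{vo}:sgm:"
        for i in range(count)
    ]


if __name__ == "__main__":
    # Reproduces the 20 + 3 lines shown on the slide for the euindia VO.
    for line in pool_lines("euindia", 39001, 3900, 20):
        print(line)
    for line in sgm_lines("euindia", 39101, 3910, 3900, 3):
        print(line)
```

The printed lines can be appended to ig-users.conf after checking that the UIDs and GIDs do not clash with existing accounts.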

21 Testing installation

22 Tests on CE
SSH to the CE to check that the CE can see the WNs and that all the main services are up and running:
# pbsnodes
    Your-ip-hostname
         state = free
         np = 2
         properties = lcgpro
         ntype = cluster
         status = opsys=linux,uname=Linux grid-test-63.trigrid.it 2.6.18-164.6.1.el5 #1 [cut]
# /etc/init.d/gLite status
    *** tomcat5:
    /opt/glite/etc/init.d/tomcat5 is already running (1514)
    *** glite-lb-locallogger:
    glite-lb-logd running
    glite-lb-interlogd running
# /etc/init.d/globus-gridftp status
    globus-gridftp-server (pid 25452) is running...
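With more than a couple of worker nodes, reading the pbsnodes listing by eye gets tedious. The sketch below parses text in the layout shown above (a node name at the start of a line, followed by indented "key = value" attributes) and lists the nodes whose state contains down, offline or unknown. The host names in the sample are made up, the function names are hypothetical, and the parser is an assumption about the text layout rather than an official interface.

```python
def parse_pbsnodes(text):
    """Parse pbsnodes-style text into {node_name: {attribute: value}}."""
    nodes = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines separate node records
        if not line[0].isspace():
            # An unindented line starts a new node record.
            current = line.strip()
            nodes[current] = {}
        elif current is not None and " = " in line:
            # Split on the first " = " only, so values may themselves contain "=".
            key, _, value = line.strip().partition(" = ")
            nodes[current][key] = value
    return nodes


def unavailable_nodes(nodes):
    """Nodes whose comma-separated state list includes down, offline or unknown."""
    bad = {"down", "offline", "unknown"}
    return sorted(
        name
        for name, attrs in nodes.items()
        if bad & set(attrs.get("state", "unknown").split(","))
    )


# Hypothetical sample in the same layout as the slide's output.
SAMPLE = """\
wn01.example.org
     state = free
     np = 2
     properties = lcgpro
     ntype = cluster
     status = opsys=linux,uname=Linux wn01.example.org

wn02.example.org
     state = down,offline
     np = 2
"""

if __name__ == "__main__":
    print(unavailable_nodes(parse_pbsnodes(SAMPLE)))
```

On a real CE, the text would come from running pbsnodes (for example via `subprocess.run(["pbsnodes"], capture_output=True, text=True)`) instead of the embedded sample.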

23 Tests on CE
SSH to the CE and then become a pool user (euindia001 here):
    # su - euindia001
Create a file and add the following:
    $ vi test.sh
    #!/bin/sh
    sleep 20   # useful for observing the job status
    hostname
Set the right permissions to make it executable:
    $ chmod 700 test.sh

24 Tests on CE
Launch the job locally on the CE:
    $ qsub -q euindia test.sh
Then check the list of jobs in execution on the CE:
    $ qstat -a
    ce.localdomain:
                                                                Req'd  Req'd   Elap
    Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
    --------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
    3.wn.localdo    gilda001 short    test.sh      5839  --  --     -- 00:15 R   --
If you want to abort a job:
    $ qdel 3   # 3 is the job ID
If you want more information:
    $ qstat -f 3

25 Tests on CE
If the "qstat -a" command prints no output, no jobs are being executed on the CE. This means your previous job has terminated, so you can now list its output:
    $ ls
    test.sh.e3 test.sh.o3
    $ cat test.sh.e3   # error file
    $ cat test.sh.o3   # output file
    wn.localdomain

26 JDL example
$ vim hostname-cream.jdl
    Type = "Job";
    JobType = "Normal";
    Executable = "/bin/hostname";
    StdOutput = "hostname.out";
    StdError = "hostname.err";
    OutputSandbox = {"hostname.err","hostname.out"};
    Arguments = "-f";
    OutputSandboxBaseDestUri = "gsiftp://localhost";
    ShallowRetryCount = 3;
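When several test jobs are needed, the JDL file can be produced from a script rather than edited by hand. The sketch below renders a Python dictionary in the same "Name = value;" layout as the example above: strings are quoted, lists become brace-enclosed lists, and numbers are written bare. It is a simplified illustration, not a full JDL writer (it does not escape quotes inside strings, for instance), and the function names `jdl_value` and `render_jdl` are hypothetical.

```python
def jdl_value(value):
    """Format one Python value in the style used by the JDL example."""
    if isinstance(value, str):
        return '"' + value + '"'
    if isinstance(value, (list, tuple)):
        return "{" + ",".join(jdl_value(item) for item in value) + "}"
    return str(value)


def render_jdl(attributes):
    """Render a dict of attributes as 'Name = value;' lines, one per attribute."""
    lines = [f"{name} = {jdl_value(value)};" for name, value in attributes.items()]
    return "\n".join(lines) + "\n"


# The attributes of hostname-cream.jdl from the slide.
HOSTNAME_JOB = {
    "Type": "Job",
    "JobType": "Normal",
    "Executable": "/bin/hostname",
    "StdOutput": "hostname.out",
    "StdError": "hostname.err",
    "OutputSandbox": ["hostname.err", "hostname.out"],
    "Arguments": "-f",
    "OutputSandboxBaseDestUri": "gsiftp://localhost",
    "ShallowRetryCount": 3,
}

if __name__ == "__main__":
    print(render_jdl(HOSTNAME_JOB), end="")
```

Writing the returned string to hostname-cream.jdl yields a file with the same attributes as the one on the slide.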

27 Working test
SSH to the UI to check that the CE can receive and execute a simple job:
    $ ssh rotondo@genius.ct.infn.it   # password: XXXXXXX
    $ voms-proxy-init --voms euindia
    [cut]
    [rotondo@genius ~]$ glite-ce-delegate-proxy -e grid-test-33.trigrid.it riccardo
    2010-06-29 02:36:21,683 WARN - No configuration file suitable for loading. Using built-in configuration
    2010-06-29 02:36:26,389 NOTICE - Proxy with delegation id [riccardo] succesfully delegated to endpoint [https://grid-test-33.trigrid.it:8443//ce-cream/services/gridsite-delegation]
    [rotondo@genius ~]$ glite-ce-job-submit -r grid-test-33.trigrid.it:8443/cream-pbs-cert -D riccardo hostname-cream.jdl
    2010-06-29 02:39:06,444 WARN - No configuration file suitable for loading. Using built-in configuration
    https://grid-test-33.trigrid.it:8443/CREAM501920532
    $ glite-ce-job-status https://ceristXX.grid.arn.dz:8443/CREAM888739522
    ****** JobID=[https://ceristXX.grid.arn.dz:8443/CREAM888739522]
    Status = [DONE-OK]
    ExitCode = [0]
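To repeat this check from a script, the job status output can be parsed instead of being read by eye. The sketch below extracts the JobID, Status and ExitCode fields from text in the bracketed format shown on this slide and reports whether the job finished as DONE-OK with exit code 0. The sample text is copied from the slide; the function names are hypothetical, and the parser assumes only the three fields that appear here.

```python
import re

# Matches lines such as 'Status = [DONE-OK]' or 'JobID=[https://...]'.
FIELD = re.compile(r"(JobID|Status|ExitCode)\s*=\s*\[(.*?)\]")


def parse_job_status(output):
    """Return the bracketed fields of a job status report as a dict."""
    return {name: value for name, value in FIELD.findall(output)}


def finished_ok(fields):
    """True when the job reached DONE-OK with exit code 0."""
    return fields.get("Status") == "DONE-OK" and fields.get("ExitCode") == "0"


# Output copied from the slide.
SAMPLE_OUTPUT = """\
****** JobID=[https://ceristXX.grid.arn.dz:8443/CREAM888739522]
Status = [DONE-OK]
ExitCode = [0]
"""

if __name__ == "__main__":
    fields = parse_job_status(SAMPLE_OUTPUT)
    print(fields["JobID"], "OK" if finished_ok(fields) else "not finished or failed")
```

A job that is still running has a different Status value and no ExitCode yet, so `finished_ok` returns False until the job completes successfully.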

28 Troubleshooting
Which logs should you check if something goes wrong?
–/var/log/messages, for general errors
–/opt/glite/var/log (especially glite-ce-cream.log)
–/var/spool/pbs/server_priv/accounting/, if even local submission to the batch system doesn't work.

29 References
INFNGRID generic installation guide:
–http://igrelease.forge.cnaf.infn.it/doku.php?id=doc:guides:install-3_2
YAIM configuration variables:
–https://twiki.cern.ch/twiki/bin/view/LCG/Site-info_configuration_variables
CE CREAM installation guide:
–GLITE Cream CE 3.2 SL5 Installation Guide [INFNGRID Release Wiki]
YAIM system administrator guide:
–https://twiki.cern.ch/twiki/bin/view/LCG/YaimGuide400
How To Check And Test Your CREAM CE:
–http://grid.pd.infn.it/cream/field.php?n=Main.HowToCheckAndTestYourCREAMCE

30 Thank you for your kind attention! Any questions?

