
1 PNPI HEPD seminar, 4th November 2003. Andrey Shevel: Distributed computing in High Energy Physics with Grid Technologies (Grid tools at PHENIX)

2 Topics
- Grid/Globus
- HEP Grid projects
- PHENIX as the example
- Concepts and a scenario for a widely distributed multi-cluster computing environment
- Job submission and job monitoring
- Live demonstration

3 What is the Grid
“Dependable, consistent, pervasive access to [high-end] resources.”
- Dependable: can provide performance and functionality guarantees.
- Consistent: uniform interfaces to a wide variety of resources.
- Pervasive: ability to “plug in” from anywhere.

4 Another Grid description
Quote from the Information Power Grid (IPG) at NASA, http://www.ipg.nasa.gov/aboutipg/presentations/PDF_presentations/IPG.AvSafety.VG.1.1up.pdf
Grids are tools, middleware, and services for:
- providing a uniform look and feel to a wide variety of computing and data resources;
- supporting construction, management, and use of widely distributed application systems;
- facilitating human collaboration and remote access and operation of scientific and engineering instrumentation systems;
- managing and securing the computing and data infrastructure.

5 Basic HEP requirements in distributed computing
- Authentication/authorization/security
- Data transfer
- File/replica cataloging
- Matchmaking/job submission/job monitoring
- System monitoring

6 HEP Grid projects
- European DataGrid (EDG): www.edg.org
- Grid Physics Network (GriPhyN): www.griphyn.org
- Particle Physics Data Grid (PPDG): www.ppdg.net
- Many others.

7 Possible task
Here we are trying to gather computing power across many clusters. The clusters are located at different sites under different authorities. We use all local rules as they are:
- local schedulers, policies, priorities;
- other local circumstances.
One of many possible scenarios is discussed in this presentation.

8 General scheme: jobs are planned to go where the data are and to less loaded clusters
[Diagram: the user consults a file catalog; the main data repository is at RCF; a remote cluster holds a partial data replica.]

9 Base subsystems for PHENIX Grid
- GT 2.2.4.latest (Globus Toolkit)
- GridFTP (globus-url-copy)
- Globus job-manager/fork
- Package GSUNY
- Cataloging engine
- BOSS
- BODE
- User jobs
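To illustrate the GridFTP piece, a single file can be pulled from a remote gateway with globus-url-copy; this is a minimal sketch, and the host name and paths are hypothetical:

  # copy one file from a remote GridFTP server to the local disk (hypothetical host and paths)
  globus-url-copy gsiftp://ram3.chem.sunysb.edu/phenix/data/run123.root \
                  file:///scratch/shevel/run123.root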

10 Concepts
- Master job (script): submitted by the user
- Satellite job (script): submitted by the master job
- Major data sets: physics or simulated data
- Minor data sets: parameters, scripts, etc.
- Input/output sandbox(es)

11 The job submission scenario at a remote Grid cluster (a command-level sketch follows)
- Determine a qualified computing cluster: available disk space, installed software, etc.
- Copy/replicate the major data sets to the remote cluster.
- Copy the minor data sets (scripts, parameters, etc.) to the remote cluster.
- Start the master job (script), which will submit many jobs through the default batch system.
- Watch the jobs with the monitoring system BOSS/BODE.
- Copy the resulting data from the remote cluster to the target destination (desktop or RCF).
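A command-level sketch of this scenario using the gsuny commands described on the following slides; only the CopyMinorData and gsub calls appear verbatim in this talk (slide 21), the other argument forms and the paths are assumptions:

  gdemo                                     # look at the load of the known clusters
  gcopy local:/phenix/data/run123 unm:/data/phenix/run123   # replicate the major data sets (syntax assumed)
  CopyMinorData local:andrey.shevel unm:.   # copy scripts and parameters (as on slide 21)
  gsub TbossSuny                            # start the master script on the less loaded cluster (slide 21)
  gstat                                     # watch the jobs; details via BOSS/BODE
  gget                                      # fetch standard output when the jobs are done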

12 Master job-script
- The master script is submitted from your desktop and runs on the Globus gateway (possibly under a group account), using the monitoring tool (BOSS is assumed).
- The master script can expect to find the following information in environment variables:
  - CLUSTER_NAME: name of the cluster;
  - BATCH_SYSTEM: name of the batch system;
  - BATCH_SUBMIT: command for job submission through BATCH_SYSTEM.
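A minimal sketch of such a master script, assuming the gateway sets the variables above and that $BATCH_SUBMIT accepts a job script as its argument; the satellite script names are hypothetical:

  #!/bin/sh
  # master job: runs on the Globus gateway and submits satellite jobs via the local batch system
  echo "Cluster: $CLUSTER_NAME, batch system: $BATCH_SYSTEM"
  for seg in 1 2 3; do
    # one satellite job per data segment (sketch)
    $BATCH_SUBMIT ~/andrey.shevel/satellite.$seg.sh
  done
  # alternatively, wrap the satellite jobs with BOSS for monitoring, as on slide 21:
  # boss submit -jobtype ram3master -executable ~/andrey.shevel/satellite.1.sh \
  #      -stdout ~/andrey.shevel/satellite.1.out -stderr ~/andrey.shevel/satellite.1.err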

13 Job submission scenario
[Diagram: the MASTER job is submitted from the local desktop to the Globus gateway through globus-jobmanager/fork; the master job runs on the gateway and submits jobs to the remote cluster with the command $BATCH_SUBMIT.]
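In plain Globus Toolkit 2.x terms, the first step in the picture corresponds to something like the call below (gsub presumably wraps a call of this kind); the gateway is one of those mentioned on slide 22 and the script name is hypothetical:

  # submit the master script to the gateway's fork job manager (sketch)
  globus-job-submit rserver1.i2net.sunysb.edu/jobmanager-fork /bin/sh ~/TbossSuny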

14 Transfer of the major data sets
There are a number of ways to transfer major data sets:
- the utility bbftp (without use of GSI) can transfer the data between clusters;
- the utility gcopy (with use of GSI) can copy the data from one cluster to another;
- any third-party data transfer facility (e.g. HRM/SRM).
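A hypothetical bbftp invocation for one file; the user name, host and paths are made up, and the option details may differ between bbftp versions:

  # push one major data file to the remote cluster using 4 parallel streams (sketch)
  bbftp -u shevel -p 4 -e 'put /phenix/data/run123.root /data/phenix/run123.root' \
        loslobos.alliance.unm.edu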

15 Copy the minor data sets
There are at least two alternative ways to copy the minor data sets (scripts, parameters, constants, etc.):
- copy the data to /afs/rhic.bnl.gov/phenix/users/user_account/…
- copy the data with the utility CopyMinorData (part of the package gsuny).
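The CopyMinorData form actually used later in this talk (slide 21):

  # copy the minor data sets from the local host to the 'unm' cluster
  CopyMinorData local:andrey.shevel unm:.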

16 Package gsuny: list of scripts
General commands (ftp://ram3.chem.sunysb.edu/pub/suny-gt-2/gsuny.tar.gz):
- GPARAM: configuration description for a set of remote clusters;
- GlobusUserAccountCheck: check the Globus configuration for the local user account;
- gping: test the availability of the Globus gateways;
- gdemo: see the load of the remote clusters;
- gsub: submit a job to the less loaded cluster;
- gsub-data: submit a job where the data are;
- gstat, gget, gjobs: get the status of a job, its standard output, and detailed info about jobs.
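A few example invocations; only gsub TbossSuny appears verbatim in this talk (slide 21), the other argument forms are assumptions:

  GlobusUserAccountCheck   # check the Globus configuration of the local account
  gping                    # test which Globus gateways answer
  gdemo                    # show the current load of the remote clusters
  gsub TbossSuny           # submit the master script to the less loaded cluster (slide 21)
  gsub-data TbossSuny      # submit it to the cluster that holds the data (arguments assumed)
  gstat                    # status of the submitted jobs (arguments assumed)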

17 Package gsuny: data transfer
- gcopy: copy the data from one cluster (or the local host) to another;
- CopyMinorData: copy minor data sets from a cluster (or the local host) to a cluster.
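A hypothetical gcopy call, written by analogy with the CopyMinorData syntax shown on slide 21; the actual syntax is an assumption:

  # copy a major data set from the local cluster to the 'unm' cluster (syntax assumed)
  gcopy local:/phenix/data/run123 unm:/data/phenix/run123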

18 Job monitoring
After the initial description of the required monitoring tool was written (https://www.phenix.bnl.gov/phenix/WWW/p/draft/shevel/TechMeeting4Aug2003/jobsub.pdf), the following packages were found:
- Batch Object Submission System (BOSS) by Claudio Grandi, http://www.bo.infn.it/cms/computing/BOSS/
- Web interface BOSS DATABASE EXPLORER (BODE) by Alexei Filine, http://filine.home.cern.ch/filine/

19 Basic BOSS components
- boss executable: the BOSS interface to the user;
- MySQL database: where BOSS stores job information;
- jobExecutor executable: the BOSS wrapper around the user job;
- dbUpdator executable: the process that writes to the database while the job is running;
- interface to the local scheduler.

20 Basic job flow
[Diagram: on the local desktop the user runs "gsub master-script"; the master script is started on the Globus gateway of cluster N, where BOSS wraps the job, records it in the BOSS MySQL database, and passes it to the local scheduler for execution on nodes n and m; the user interacts with BOSS via "boss submit", "boss query" and "boss kill", and watches the jobs through BODE (the Web interface).]
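The three boss commands in the picture, in hedged form; the -jobtype/-executable options are those shown on slide 21, while the query and kill argument forms are assumptions:

  boss submit -jobtype ram3master -executable ./job.sh   # register the job in the BOSS DB and submit it
  boss query                                             # list the jobs known to the BOSS database
  boss kill <jobId>                                      # remove a running job (argument form assumed)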

21 gsub TbossSuny  # submit to the less loaded cluster

[shevel@ram3 shevel]$ cat TbossSuny
. /etc/profile
. ~/.bashrc
echo " This is master JOB"
printenv
boss submit -jobtype ram3master -executable ~/andrey.shevel/TestRemoteJobs.pl -stdout \
  ~/andrey.shevel/master.out -stderr ~/andrey.shevel/master.err

[shevel@ram3 shevel]$ CopyMinorData local:andrey.shevel unm:.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
YOU are copying THE minor DATA sets
             --FROM--                      --TO--
Gateway   =  'localhost'                   'loslobos.alliance.unm.edu'
Directory =  '/home/shevel/andrey.shevel'  '/users/shevel/.'
Transfer of the file '/tmp/andrey.shevel.tgz5558' was succeeded

22 Status of the PHENIX Grid
- Live info is available at http://ram3.chem.sunysb.edu/~shevel/phenix-grid.html
- The group account 'phenix' is now available at:
  - SUNYSB (rserver1.i2net.sunysb.edu)
  - UNM (loslobos.alliance.unm.edu)
  - IN2P3 (in progress now)

23 Live demo of BOSS job monitoring
http://ram3.chem.sunysb.edu/~magda/BODE (user: guest, password: Guest101)

24 Computing utility (instead of a conclusion)
- The natural outcome is a computing utility: a computing cluster built up for the concrete tasks of a collaboration.
- The computing utility can be implemented anywhere in the world.
- The computing utility can be used from anywhere (France, USA, Russia, etc.).
- The most important part of the computing utility is manpower.

