The BaBarGrid demonstrator
Tim Adye, Roger Barlow, Alessandra Forti, Andrew McNab, David Smith

What is BaBar?
BaBar is a High Energy Physics experiment expanding our understanding of antimatter. Its detector, running at an accelerator in Stanford, gathers data on the decays of the rare B particles.

Distribution
Data is distributed across many sites, and it may be duplicated. Processing must be spread across the available resources: we use dedicated CPU farms at 9 UK institutes, and also at Lyon, Karlsruhe and Bologna as well as Stanford. The CPU farms and disk arrays do not all ‘belong to BaBar’ but to the universities and research institutes, and are shared.

Step 1: Data Specification
A physicist selects events of a particular type that they are interested in.

The Data
We have now recorded over 100 million B decays. These are real events, not simulated ones (though we have those too!), and hundreds more arrive every minute when the experiment is running. Each decay provides a lot of data – about 20 MB. This data is studied by a team of over 500 physicists from many countries, so the data and the computers to process it are distributed across the world. There is a strong contingent from the UK.

Grid Technology
Grid Technology is clearly the way forward. It provides a future system in which the user can specify the data they want to analyse and how they want to analyse it, and a job submission system that will locate the relevant files (choosing between alternatives if there are multiple copies), run the analysis jobs on those files using locally available CPUs, retrieve the various outputs, and combine them seamlessly before returning them to the user. Grid authorisation and authentication tools can enable BaBar collaborators to use facilities at any BaBar site without bureaucratic hindrances.

The Demonstrator
The purpose of building the demonstrator was to see how much was possible today, using existing tools, as a first and hopefully useful step towards a future system.
The demonstrator runs through a web browser, presumably on the user’s desktop or laptop. No further software (such as Globus) is required on this platform, as such a requirement was felt to be too restrictive.

Step 2: Data Location
A physicist selects events of a particular type that they are interested in, and uses a web browser to ask which sites have files of this type. Sites are given in order of preference. The physicist will probably want to use data at their local site and go to remote sites only for files that don’t exist locally. The system handles this prioritisation.

Step 3: Job submission
They then press the ‘Go’ button to launch the job(s).
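
The prioritisation described in Step 2 can be sketched in a few lines of Python. This is only an illustration, not the demonstrator's own code: the function name `choose_sites`, the file names and the site names are all hypothetical, and it assumes that the catalogue lookup has already produced a mapping from each file to the sites holding a copy.

```python
def choose_sites(file_locations, site_preference):
    """Assign each file to the most-preferred site that holds a copy.

    file_locations: dict mapping file name -> list of sites holding it
    site_preference: list of sites, most preferred (e.g. local) first
    Returns (plan, missing): plan maps site -> files to process there,
    missing lists files not held at any site in the preference list.
    """
    rank = {site: i for i, site in enumerate(site_preference)}
    plan, missing = {}, []
    for filename, sites in file_locations.items():
        candidates = [s for s in sites if s in rank]
        if not candidates:
            missing.append(filename)
            continue
        best = min(candidates, key=lambda s: rank[s])
        plan.setdefault(best, []).append(filename)
    return plan, missing


# Hypothetical catalogue: the local site comes first in the preference list.
locations = {
    "events-001": ["stanford", "manchester"],
    "events-002": ["stanford"],
    "events-003": ["lyon", "stanford"],
}
plan, missing = choose_sites(locations, ["manchester", "lyon", "stanford"])
print(plan)
```

Each site in the resulting plan would then receive one job covering its list of files, so remote sites are used only for files that have no local copy.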
Step 5: Status is monitored
The status of each individual job can be monitored (using the URL captured when the job was submitted). When the jobs are finished, the output log file can be retrieved and inspected – though this is usually only necessary if something goes wrong.

Step 6: Output retrieval
The histograms produced by each job can be collected from all sites (by submitting jobs which tar the files and copy them to a single file accessible by HTTP). This automatically invokes the ROOT display program to run on the user’s platform and produce the physics output.

How it works 2: The BaBar VO
When a job is submitted to a remote site and has been authenticated, the grid name is checked against the site's map list to ensure that this person is authorised to use BaBar resources. This list of authorised BaBar users (the BaBar VO) is maintained semi-automatically, with minimal user action required. All BaBar users have an account on the central system at SLAC. Anyone intending to use the Grid has a certificate. If the user copies their grid certificate to their SLAC account space, a cron job detects it and checks that the user is on the ‘babar’ AFS access control list. If so, the details are copied to the central VO list, maintained at Manchester. Another cron job copies this list to the map files of the sites involved.

How it works 3: Dynamic accounts
When a job has been authorised and authenticated, the name on the grid certificate is compared with a list of known users and userids on that system. If a match is found, the job is submitted under that userid. If no match is found, the user is allocated a userid from a pool (babar01, babar02…). If the user has used this machine in the past, they will be allocated their previous account if possible. Otherwise the next one on the list is used.
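
The pool-account logic above can be illustrated with a minimal Python sketch. It is not the code used at the sites: the class name `AccountMapper`, its methods and the certificate names are hypothetical, and it makes the simplifying assumption that a pool account, once leased, is never recycled, so a returning user always gets their previous account back.

```python
class AccountMapper:
    """Map a grid certificate name to a local userid (illustrative only)."""

    def __init__(self, known_users, pool):
        self.known_users = dict(known_users)  # certificate name -> own userid
        self.pool = list(pool)                # e.g. ["babar01", "babar02"]
        self.leases = {}                      # certificate name -> pool userid

    def userid_for(self, cert_name):
        # 1. Users with a real account at this site run under it.
        if cert_name in self.known_users:
            return self.known_users[cert_name]
        # 2. Returning users keep the pool account they had before.
        if cert_name in self.leases:
            return self.leases[cert_name]
        # 3. Otherwise take the next unused account on the list.
        in_use = set(self.leases.values())
        for account in self.pool:
            if account not in in_use:
                self.leases[cert_name] = account
                return account
        raise RuntimeError("no free pool account")

    def owner_of(self, account):
        """Reverse lookup: which certificate holds this pool account?"""
        for cert_name, leased in self.leases.items():
            if leased == account:
                return cert_name
        return None


# Hypothetical certificate names, for illustration only.
mapper = AccountMapper({"/O=Grid/CN=Alice": "alice"}, ["babar01", "babar02"])
print(mapper.userid_for("/O=Grid/CN=Alice"))
print(mapper.userid_for("/O=Grid/CN=Bob"))
```

Keeping the lease table is what makes the reverse lookup possible: a pool userid can always be traced back to the certificate that was mapped onto it.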
Dynamic account allocation gives the user the ability to run jobs at any site in BaBarGrid (currently about 10, eventually 50+) without getting an individual account at each site (and 500+ users x 50+ sites = bureaucracy!). It provides the system manager with an audit trail: if a job run by a particular pool account misbehaves (inadvertently or maliciously), it can be linked to the real physical user.

Step 4: The jobs are spooled to the remote sites
The web engine submits the jobs to the remote sites using globus-job-submit. The necessary control files (of which there are many) are copied along with the job, once for each site. The URL pointing to the job output and status information is captured for later use.

How it works 1: Authorisation
The user must have run grid-proxy-init on their platform. They then upload the X509 certificate into the gridpp web engine. The web engine is then able to use this certificate to authenticate the job submission (done with globus-job-submit).
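
The submission loop of Step 4 might look roughly like the Python sketch below. It is not the web engine's actual code. It assumes the basic invocation form `globus-job-submit <contact> <executable> [arguments]`, which prints a job contact URL, and that `globus-job-status` accepts that URL; the helper names, host names and paths are hypothetical, and the copying of control files is left out.

```python
import subprocess


def submit_command(contact, executable, args=()):
    """Build the argument list for one submission (assumed basic form)."""
    return ["globus-job-submit", contact, executable, *args]


def status_command(job_url):
    """Build the argument list for a later status query on a captured URL."""
    return ["globus-job-status", job_url]


def run_and_capture(argv):
    """Run a command and return its standard output as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


def submit_to_sites(contacts, executable, args=(), run=run_and_capture):
    """Submit the same job once per site and keep each returned job URL."""
    job_urls = {}
    for contact in contacts:
        output = run(submit_command(contact, executable, args))
        job_urls[contact] = output.strip()  # URL used later for status/output
    return job_urls


# Dry run with a stand-in runner, so no grid software is needed here.
def fake_run(argv):
    return "https://" + argv[1] + ":0/job/\n"


urls = submit_to_sites(["farm.site-a.example", "farm.site-b.example"],
                       "/path/to/analysis", ["control.tcl"], run=fake_run)
print(urls)
```

Passing the runner as a parameter keeps the sketch testable without a grid installation; the URLs saved here are the ones that the status monitoring and output retrieval steps would use.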