The GRID – Hype or Reality? Roger Barlow Manchester and Stanford University Symposium in celebration of John Vincent Atanasoff Iowa State University, 31st October 2003 What is the Grid? Problems and Tools that solve them Progress Prospects
2 What is the GRID? 1: The Press The new Grid computing technologies are set to revolutionise the way scientists use the world's computing resources. The technology now being deployed for particle physics will ultimately change the way that science and business are undertaken in the years to come. This will have a profound effect on the way society uses information technology, much as the World Wide Web did. Grid technology will extend to fields like bioinformatics, digital archive and biodiversity informatics. We are very excited to be able to participate in such a revolutionary global collaboration.
3 What is the GRID? 2: For Politicians Computing coming out of a wall socket Power: 110V AC 60Hz Computing: ? ?
4 What is the GRID? 3: For Machine vendors Another name for a CPU cluster A very limited and unimaginative use. But it sells boxes!
5 What is the GRID? Here is a (transport) grid that works Get from anywhere* to anywhere* (home, school, office, theater, park, mall…) By any* means (car, truck, bus…) For any* purpose (work, pleasure, shopping…) A user's activities vary. * within reasonable limits
6 Aside: What is the WEB? Computers talk to computers They say: Give me that file.
7 What is the GRID? Computers talk to computers They say anything Find data like this… Add this data to that data Run this program on those data Collect results and plot them
physicists 9 countries 71 institutes Moving from a CENTRALISED To a DISPERSED Computing model
9 Particle Physics and building the Grid The Grid needs Particle Physics as trail-blazers… Demanding users with demanding problems Amateurs: Knowledgeable but non-expert Write their own programs. Badly. Lots of resources – different universities/countries COTS components - heterogeneous Software not mission-critical Particle Physics needs the Grid… Locate data – which may be in different sites Select events and process (take-the-job-to-the-data beats take-the-data-to-the-job) Collate results
10 Warning: Cyberspace is dangerous! Rogues and vagabonds: evil people and stupid people How can you trust anyone? How can anyone trust you? How can we build a grid when all the doors and gates are locked?
11 Old Solution: Passwords Not really secure Multiple passwords are a pain for users Does not scale for millions of users (N), tens of thousands of sites (M) SYSTEM Gatekeeper Gate
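12 Eureka! RSA Encryption (Rivest, Shamir, Adleman) Trivial Example: encode 123 as (123+5)*2 = 256, decode as (256-10)/2 = 123. Simple Example: PUBLIC key (3233,17), PRIVATE key (3233,2753). Encode: 123^17 mod 3233 = 855. Decode: 855^2753 mod 3233 = 123. Raising to the power 17 needs only the public key, so anyone can do this. Raising to the power 2753 needs the private key, so only the holder of the private key can do this. Can deduce private key from public key in principle but not in practice: 3233 = 53*61, 52*60 = 3120 = (2753*17-1)/15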
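The arithmetic on this slide can be checked directly. The following is a minimal sketch, assuming the message being encoded is 123 (the number used in the trivial example); the variable names are illustrative, and keys this small are only good for demonstration.

```python
# Toy RSA with the slide's keys; illustration only, far too small to be secure.
n, e, d = 3233, 17, 2753       # public key (n, e), private key (n, d)

message = 123                  # assumed message; must be smaller than n
cipher = pow(message, e, n)    # needs only the public key
plain = pow(cipher, d, n)      # needs the private key

# Deducing d from (n, e) means factoring n: easy here, infeasible for real keys.
p, q = 53, 61                  # 3233 = 53 * 61
phi = (p - 1) * (q - 1)        # 52 * 60 = 3120
recovered_d = pow(e, -1, phi)  # modular inverse of e (Python 3.8+)

print(cipher, plain, recovered_d)
```

The last three lines are the "in principle" attack: once the factors 53 and 61 are known, the private exponent follows from a single modular inverse.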
13 What you can do with it Suppose I trust Charles Adams Charles Adams has his own private key I have his public key (so do lots of people) If I get a coded message that makes sense when I decode it, I know CA must have written it If you bring me such a message saying "This guy is John Doe of Iowa State…" then I believe it because I believe CA. This is a CERTIFICATE. Your certificate. Signed by CA
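The certificate idea is a digital signature: the CA transforms a digest of the statement with its private key, and anyone holding the public key can check the result. Here is a minimal sketch reusing the toy key pair (3233,17) / (3233,2753) from the RSA slide. Reducing a SHA-256 digest modulo n is a simplification assumed here for illustration (real certificates use padded signatures and far larger keys), and the function names are hypothetical.

```python
import hashlib

N, E, D = 3233, 17, 2753  # toy key pair from the RSA slide; not secure


def digest(statement: str) -> int:
    """Reduce a SHA-256 digest to a number below N (toy simplification)."""
    raw = hashlib.sha256(statement.encode()).digest()
    return int.from_bytes(raw, "big") % N


def sign(statement: str) -> int:
    """Done by the CA, using its private exponent D."""
    return pow(digest(statement), D, N)


def verify(statement: str, signature: int) -> bool:
    """Done by anyone, using only the public exponent E."""
    return pow(signature, E, N) == digest(statement)


statement = "This guy is John Doe of Iowa State"
signature = sign(statement)
print(verify(statement, signature))
```

With a modulus this small, a forged statement has roughly a 1 in 3233 chance of matching by accident, which is one reason real keys are thousands of bits long.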
14 Authentication Solved There are several (not many) Certificate Authorities (DOE, UK, France, Germany…) Each CA should have a rigorous procedure for issuing certificates, generally involving personal contact though maybe multilevel A site chooses which it will recognise, on the basis of trustworthiness When I approach a site with my certificate, they know I am who I say I am ~N+M transactions
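The scaling argument can be made concrete. The talk says only "millions of users" and "tens of thousands of sites", so the exact figures in this small sketch are assumptions chosen for illustration.

```python
# Illustrative sizes only; the talk gives orders of magnitude, not exact numbers.
n_users = 1_000_000  # N, assumed
m_sites = 10_000     # M, assumed

# Passwords: in the worst case every user needs an arrangement with every site.
password_pairs = n_users * m_sites

# Certificates: each user obtains one certificate, and each site decides once
# which Certificate Authorities it recognises.
certificate_transactions = n_users + m_sites

print(password_pairs, certificate_transactions)
```

Under these assumed sizes the certificate scheme needs about ten thousand times fewer arrangements than per-site passwords.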
15 Next Hurdle: Authorisation Your ID gets you into the bar – but it doesn't buy you a beer Virtual Organisation – membership allows you to use resources People join a VO. Or several VOs (N transactions) Sites give resources to VO (M transactions) Generic accounts make it easy Please present your ticket and photo ID…
16 The BaBar VO Want to join? Just put your certificate name into a file.cert-id on your SLAC BaBar account. Cron jobs do the rest Simple AND secure
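The split between authentication and authorisation can be sketched as two lookup tables: one maintained as people join VOs, one maintained as sites grant resources to VOs. Everything in this sketch is hypothetical (the certificate names, the table layout, and the function name); it only illustrates the logic, not how any real grid middleware stores this information.

```python
# Hypothetical membership lists: users join VOs (about N transactions).
vo_members = {
    "babar": {"/C=US/O=Example/CN=John Doe"},
}

# Hypothetical site policies: sites give resources to VOs (about M transactions).
site_vos = {
    "MAN": {"babar"},
    "RAL": {"babar"},
}


def authorised(cert_name: str, site: str) -> bool:
    """A certificate holder may use a site if they belong to a VO it supports."""
    return any(
        cert_name in vo_members.get(vo, set())
        for vo in site_vos.get(site, set())
    )


print(authorised("/C=US/O=Example/CN=John Doe", "MAN"))
print(authorised("/C=US/O=Example/CN=Someone Else", "RAL"))
```

The certificate proves who you are; the tables decide what you may do. No site ever needs a per-user entry.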
17 Sandboxes You ship a whole load of files to the remote site. (May include the executable.) It runs and generates bytes to gigabytes of output files You get all these back Your job operates only within the input and output sandboxes, not in the rest of the remote system. So everyone is safe Issues: How do you know which files to send? How much can you assume standard libraries etc are present? (As little as possible is the right answer.) How do the files come back? Pull (inconvenience) or push (risky).
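The input/output sandbox pattern can be sketched locally: copy the shipped files into a scratch directory, run the job there, and hand back only the files it created. This is an illustration of the workflow under simplifying assumptions; changing the working directory does not by itself confine a job, so the isolation a real grid site provides is not enforced here. The function name `run_in_sandbox` is hypothetical.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_sandbox(input_files, command):
    """Copy inputs to a scratch directory, run the job there, return new files.

    Returns a dict mapping each output file name to its contents as bytes.
    """
    with tempfile.TemporaryDirectory() as scratch:
        scratch_path = Path(scratch)
        for src in input_files:
            shutil.copy(src, scratch_path / Path(src).name)
        shipped = {p.name for p in scratch_path.iterdir()}
        subprocess.run(command, cwd=scratch_path, check=True)
        return {
            p.name: p.read_bytes()
            for p in scratch_path.iterdir()
            if p.is_file() and p.name not in shipped
        }


# Demo: the "job" upper-cases its input file and writes out.txt.
with tempfile.TemporaryDirectory() as home:
    data = Path(home) / "in.txt"
    data.write_text("hello grid")
    job = [sys.executable, "-c",
           "open('out.txt', 'w').write(open('in.txt').read().upper())"]
    outputs = run_in_sandbox([data], job)

print(outputs)
```

Returning the outputs from the function corresponds to the "push" option on the slide; a "pull" design would leave them in place for the user to fetch later.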
18 AFS solves all our sandbox problems Run/test jobs locally in directory which is part of afs cell: /afs/ / / Files needed are in this directory or linked from it (except for the big raw data files.) Output files created in this directory At start of remote job, klog to /afs/ / / All the I/O (except raw data input) is done there Uses gsiklog (certificate gives token). Transportable
19 Stuff that would be nice Resource Brokering Job monitoring Fun to use GUI interfaces SRB for storage But we don't want to hang around till it's ready. Well, I don't. (and 50% of software projects fail)
20 BaBarGrid (Sorry, no demo today. Just screenshots) Some machines already in the system
21 Start by finding some data Use experiment-specific jargon to specify what sort of data you want Say where you want to run (Here MAN and RAL) Get a whole bunch of input files
22 One command runs the jobs Manchester, Rutherford Lab Job is standard user (badly) written analysis code: nothing special required
23 And get the answer Plot obtained by merging all the job outputs from all the sites
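Merging per-site outputs is, in the simplest case, a bin-by-bin sum of histograms. The sketch below uses invented counts purely for illustration; the site labels MAN and RAL are the ones used in the talk, but the data and the dictionary layout are assumptions, not BaBarGrid's actual output format.

```python
from collections import Counter

# Invented per-site outputs: histogram bin -> event count.
site_outputs = {
    "MAN": Counter({0: 12, 1: 30, 2: 7}),
    "RAL": Counter({1: 18, 2: 25, 3: 4}),
}

merged = Counter()
for histogram in site_outputs.values():
    merged.update(histogram)  # adds counts bin by bin

print(sorted(merged.items()))
```

Because addition is order-independent, results can be merged as they arrive from each site, without waiting for the slowest job.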
24 Coming Next: the LHC Large Hadron Collider – CERN. 14 TeV proton-proton collisions 4 Experiments – each an order of magnitude bigger than BaBar Massive data analysis requirements Switch on 2007 Grid computing built in from the start
25 LCG now rolling out First LHC Computing Grid – ~4,000 CPUs
26 The Grid It's real – despite the hype Being test-driven by Particle Physicists – Especially in Europe You will be using it sooner or later – opening possibilities you never thought possible Get your certificate and join the fun