Introduction to Linux Clusters and Grids


1 Introduction to Linux Clusters and Grids
Design and Basic Services of LCG Grid Middleware
SEE-GRID Infrastructure Overview
Antun Balaž
SCL, Institute of Physics

2 Linux Clusters
Commodity hardware became available in the last 10 years
Local network Mbps easily deployed
Linux mature and widely available
Software available and even standardized - MPI
SEE-GRID Banjaluka Training Session

3 Science and technology are team sports

4 Unifying concept: Grid
Resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations.

5 What types of problems is the Grid intended to address?
Too hard to keep track of authentication data (ID/password) across institutions
Too hard to monitor system and application status across institutions
Too many ways to submit jobs
Too many ways to store & access files/data
Too many ways to keep track of data
Too easy to leave “dangling” resources lying around (robustness)

6 Requirements
Security
Monitoring/Discovery
Computing/Processing Power
Moving and Managing Data
Managing Systems
System Packaging/Distribution
What do end users need?
- Secure, reliable, on-demand access to data, software, people, and other resources (ideally all via a Web Browser!)

7 Set of basic Grid services
Job submission/management
File transfer (individual, queued)
Database access
Data management (replication, metadata)
Monitoring/Indexing system information

8 Multi-institution issues
[Diagram: Domain A and Domain B, each with its own Certification Authority and Policy Authority. A Task spans Server X in Sub-Domain A1 and Server Y in Sub-Domain B1. Labels between the domains: No Cross-Domain Trust, Trust Mismatch, Mechanism Mismatch.]

9 Why Grid security is hard
Resources being used may be valuable & the problems being solved sensitive
- Both users and resources need to be careful
Dynamic formation and management of virtual organizations
- Large, dynamic, unpredictable… VO
Resources and users are often located in distinct administrative domains
- Can’t assume cross-organizational trust agreements
- Different mechanisms & credentials

10 Why Grid security is hard 2
Interactions are not just client/server, but service-to-service on behalf of the user
- Requires delegation of rights by user to service
- Services may be dynamically instantiated
Standardization of interfaces to allow for discovery, negotiation and use
Implementation must be broadly available & applicable
- Standard, well-tested, well-understood protocols; integrated with wide variety of tools
Policy from sites, VO, users need to be combined
- Varying formats
Want to hide as much as possible from applications!

11 Grid solution: use of VOs
[Diagram: Domain A (Sub-Domain A1, Server X) and Domain B (Sub-Domain B1, Server Y), each with its own Certification Authority and Policy Authority and still with No Cross-Domain Trust. The Task now runs inside a Virtual Organization Domain built with GSI and a Federation Service.]

12 Effective policy governing access within a collaboration

13 Use delegation to establish dynamic distributed system
[Diagram labels: Computing Center, Service, Rights, VO, Computing Center]

14 GSI implementation
[Diagram labels: SSL/WS-Security with Proxy Certificates; Services (running on user’s behalf); Authz Callout; Access; Compute Center; CAS or VOMS issuing SAML or X.509 ACs; Users; VO; Rights, Rights’, Rights’’; Local Policy on VO identity or attribute authority; MyProxy; KCA]

15 “Logging on” to the Grid
To run programs, authenticate to Grid:
voms-proxy-init --voms VONAME
Enter PEM pass phrase: ***************
Creates a temporary, local, short-lived proxy credential for use by our computations
Delegation = remote creation of a (second level) proxy credential, which allows a remote process to authenticate on behalf of the user

16 Middleware
LCG: Large Hadron Collider Computing Grid
LCG infrastructure running LCG-2 is “EGEE-0”
In parallel producing new web-service-oriented middleware (“gLite”), which will replace LCG-2 as production facility this year
[Diagram labels: Globus 2 based; Web services based; LCG-1, LCG-2, EGEE-1, EGEE-2]

17 User view of the Grid
[Diagram labels: User Interface, User Interface, Grid services]

18 What really happens
[Diagram: job flow among User interface, Resource Broker, Replica Catalogue, Information Service, Storage Element, Computing Element, and Logging & Book-keeping. Arrow labels: Job Submit Event, Input “sandbox”, DataSets info, Output “sandbox”, SE & CE info, Job Query, Publish, Auth. & Auth., Input “sandbox” + Broker Info, Job Status.]

19 Workload Management System (WMS)
Distributed scheduling
- multiple UIs where you can submit your job
- multiple RBs from where the job can be sent to a CE
- multiple CEs where the job can be put in a queuing system
Distributed resource management
- multiple information systems that monitor the state of the grid
- Information from SE, CE, sites

20 Authentication and Authorization
User obtains certificate from CA
Connects to UI by ssh
Downloads certificate
Invokes Proxy server
Single logon – to UI - then Secure Socket Layer with proxy identifies user to other nodes
Authorization - currently
- User joins Virtual Organisation
- VO negotiates access to Grid nodes and resources (CE, SE)
- Authorization tested by CE, SE: gridmapfile maps user to local account
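The last authorization step, where a gridmapfile maps an authenticated user to a local account, can be sketched as below. This is only an illustration: it assumes the common grid-mapfile layout of one quoted certificate subject (DN) followed by a local account name per line, and the DNs and account names in `SAMPLE` are made up.

```python
import shlex


def parse_gridmap(text):
    """Return {certificate DN: local account} from grid-mapfile text."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # shlex keeps the quoted DN together even though it contains spaces
        parts = shlex.split(line)
        if len(parts) != 2:
            continue  # ignore lines that do not look like: "DN" account
        dn, account = parts
        mapping[dn] = account
    return mapping


def authorize(dn, mapping):
    """Return the local account for an authenticated DN, or None if unmapped."""
    return mapping.get(dn)


# Hypothetical entries, for illustration only
SAMPLE = '''
# "certificate subject" local-account
"/C=XX/O=Example/OU=Physics/CN=Jane Doe" seegrid001
"/C=XX/O=Example/OU=Physics/CN=John Roe" seegrid002
'''

if __name__ == "__main__":
    gridmap = parse_gridmap(SAMPLE)
    print(authorize("/C=XX/O=Example/OU=Physics/CN=Jane Doe", gridmap))
```

A DN that is missing from the map yields `None`, which corresponds to the CE or SE refusing the request.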

21 User Interface (UI)
UI is the user’s interface to the Grid - Command-line interface to
Proxy server
Job operations
- To submit a job
- Monitor its status
- Retrieve output
Data operations
- Upload file to SE
- Create replica
- Discover replicas
Other grid services
To run a job user creates a JDL (Job Description Language) file

22 Computing Element (CE)
A CE is a grid batch queue with a “grid gate” front-end
[Diagram: a Job request arrives at the Grid gate node (Gatekeeper, gridmapfile, Info system, Logging), which hands it to the Local resource management system (Condor / PBS / LSF master) in front of a Homogeneous set of worker nodes. Other labels: I.S., Logging.]

23 Storage Element (SE)
Storage elements hold files: write once, read many
Replica files can be held on different SE: “close” to CE; share load on SE
Replica Catalogue - what replicas exist for a file?
Replica Location Service - where are they?
[Diagram labels: File transfer Requests, Logging, Event Logging, GridFTP, Gatekeeper, Info system, Local Info, gridmapfile, Disk arrays or tapes]
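The two questions on this slide (what replicas exist for a file, and which one is “close” to a CE) can be illustrated with a toy lookup. Everything here is a stand-in: the dictionaries, host names, and logical file name are hypothetical, and the real Replica Catalogue and Replica Location Service are network services, not in-memory tables.

```python
# Hypothetical stand-in for the Replica Catalogue: logical file name -> SEs holding a copy
REPLICA_CATALOGUE = {
    "lfn:/grid/seegrid/demo/input.dat": [
        "se01.site-a.example.org",
        "se02.site-b.example.org",
    ],
}

# Hypothetical stand-in for site layout: CE host -> SEs considered "close" to it
CLOSE_SE = {
    "ce01.site-a.example.org": ["se01.site-a.example.org"],
    "ce02.site-b.example.org": ["se02.site-b.example.org"],
}

LFN = "lfn:/grid/seegrid/demo/input.dat"


def list_replicas(lfn):
    """What replicas exist for a file?"""
    return list(REPLICA_CATALOGUE.get(lfn, []))


def pick_replica(lfn, ce):
    """Prefer a replica on an SE close to the given CE, else fall back to the first one."""
    replicas = list_replicas(lfn)
    if not replicas:
        return None
    for se in replicas:
        if se in CLOSE_SE.get(ce, []):
            return se
    return replicas[0]


if __name__ == "__main__":
    print(pick_replica(LFN, "ce02.site-b.example.org"))
```

Falling back to the first replica is an arbitrary choice made for the sketch; a real system could also spread requests to share load across SEs.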

24 Resource Broker
Run the Workload Management System
- To accept job submissions
- Dispatch jobs to appropriate Compute Element (CE)
- Allow users to get information about the status of their jobs
- Allow users to retrieve their output
A configuration file on each UI node determines which RB node(s) will be used
When a user submits a job, JDL options are to:
- Specify CE
- Allow RB to choose CE (using optional tags to define requirements)
- Specify SE (then RB finds “nearest” appropriate CE, after interrogating Replica Location Service)
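How an RB might choose a CE from requirements can be sketched as a filter followed by a ranking. This is not the actual WMS matchmaking algorithm: the `CE` fields, the ranking by free CPUs, and the sample sites are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CE:
    name: str
    free_cpus: int
    max_wall_minutes: int
    close_ses: tuple = ()


def match(ces, min_wall_minutes=0, required_se=None):
    """Return the CEs that satisfy the job's requirements, best-ranked first.

    Requirements: enough wall-clock time and, if an SE is specified,
    that SE must be close to the CE. Rank: more free CPUs is better.
    """
    candidates = [
        ce for ce in ces
        if ce.max_wall_minutes >= min_wall_minutes
        and (required_se is None or required_se in ce.close_ses)
    ]
    return sorted(candidates, key=lambda ce: ce.free_cpus, reverse=True)


# Hypothetical sites, for illustration only
SITES = [
    CE("ce-a", free_cpus=4, max_wall_minutes=720, close_ses=("se-a",)),
    CE("ce-b", free_cpus=16, max_wall_minutes=2880, close_ses=("se-b",)),
    CE("ce-c", free_cpus=8, max_wall_minutes=2880, close_ses=("se-a",)),
]

if __name__ == "__main__":
    for ce in match(SITES, min_wall_minutes=1000):
        print(ce.name, ce.free_cpus)
```

The three JDL options on the slide map onto this sketch as: naming a CE directly (no matching needed), calling `match` with requirements only, or calling it with `required_se` set.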

25 Logging and Bookkeeping
Who did what and when?
What’s happening to my job?
Usually runs on RB node
Information System
- Receives periodic (~5 min) updates from CE, SE
- Used by RB node to determine resources to be used by a job
- Currently BDII is used

26 What have we learned so far?
Grid structure is complicated but hidden from end-users, giving them all the comfort they need
Users just need to join the VO and obtain certificates: we already have the SEE-GRID VO!
Use of Grid is then just as easy as the use of a computer cluster

27 SEE-GRID Overview
SEE-GRID is an EU FP6 project, involving 11 partners from 11 European countries: Greece, Switzerland, Bulgaria, Romania, Turkey, Hungary, Albania, Bosnia and Herzegovina, FYR of Macedonia, Serbia and Montenegro, Croatia
Each partner collaborates with one or more 3rd parties
Project started in May 2004, lasts 2 years, SEE-GRID-2 on its way

28 SEE-GRID Objectives (1)
Human network in the area of grid computing, eScience and eInfrastructures
Integrate incubating and existing National Grid infrastructures in all SEE-GRID countries
Ease the digital divide and bring SEE Grid communities closer to the rest of the continent

29 SEE-GRID Objectives (2)
Establish a dialogue at the level of policy developments for research and education networking and provide input to the agenda of national governments and funding bodies
Promote awareness in the region regarding Grid developments through dissemination conferences, training material and demonstrations for hands-on experience
Migrate and test Grid middleware components and APIs developed by pan-European and national Grid efforts in the regional infrastructure

30 SEE-GRID Objectives (3)
Deploy (adapt if necessary) and test Grid applications developed by EGEE
Demonstrate an additional Grid application of regional interest
Integrate available pilot Resource Centres of Albania, Bosnia-Herzegovina, Croatia, FYR of Macedonia, Serbia-Montenegro and Turkey into the EGEE-compatible infrastructure
Expand the operations and support centre of the EGEE SE Europe Federation to cater for the operations in the above countries

31 SEE-GRID Infrastructure Overview (1)
At least one SEE-GRID site per country (currently 15+1!), each deploying CE, SE, MON, UI, and a number of WNs
SEE-GRID regional services:
- SEE-GRID CA (Greece)
- RB and BDII (Turkey + Serbia and Montenegro)
- VOMS (Croatia)
- R-GMA (Bulgaria)
- SFTs and GridICE (FYR of Macedonia)
- P-GRADE portal (Hungary)
- MyProxy (Greece + Serbia and Montenegro)
- LFC (Serbia and Montenegro)

32 SEE-GRID Infrastructure Overview (2)
SEE-GRID applications:
- SE4SEE (Turkey)
- VIVE (Serbia and Montenegro)
Technical Forum (Hungary)
SEE-GRID Web site and WIKI (Greece)
Infrastructure mailing list:
Strong human network

33 Hands-on Plan
Hands-on I: UI Installation and Configuration
Hands-on II: Certificates, Proxies, Test Jobs
TOMORROW:
Hands-on III: Composing the site-info.def file
Hands-on IV: UI/CE Installation and Configuration
Hands-on V: SE/MON Installation and Configuration
Hands-on VI: WNs Installation and Configuration
Hands-on VII: Testing and SEE-GRID Tuning

34 Hands-on II: Certificates, Proxies, Test Jobs

35 Grid Certificates
Each user must have a valid X.509 certificate issued by a recognized Certification Authority (CA)
Before doing any Grid operation, the user must log in to a User Interface (UI) machine and create a proxy certificate
A proxy certificate is a delegated user credential that authenticates the user in every secure interaction and has a limited lifetime; using it avoids exposing the user's own certificate, which could compromise its safety
voms-proxy-init --voms VONAME
voms-proxy-info; voms-proxy-destroy

36 Job Submission (1)
Users have to create a file describing the submitted job in Job Description Language (JDL)
User submits jobs to Resource Broker (RB)
JDL for simple test job:
antun]$ cat test.jdl
Executable = "/bin/hostname";
StdOutput = "std.out";
StdError = "std.err";
OutputSandbox = {"std.out","std.err"};
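For users who generate many similar jobs, the same JDL can be produced from a script. The helper below is a minimal sketch that only covers the string and list attributes used in the test job above; `make_jdl` and `jdl_value` are hypothetical names, not part of any grid toolkit.

```python
def jdl_value(value):
    """Render a Python value as a JDL string or a JDL list of strings."""
    if isinstance(value, (list, tuple)):
        return "{" + ",".join(jdl_value(v) for v in value) + "}"
    return '"' + str(value) + '"'


def make_jdl(attributes):
    """Render a dict of attributes as JDL text, one 'Name = value;' per line."""
    lines = [f"{name} = {jdl_value(value)};" for name, value in attributes.items()]
    return "\n".join(lines) + "\n"


# The simple test job from the slide
TEST_JOB = {
    "Executable": "/bin/hostname",
    "StdOutput": "std.out",
    "StdError": "std.err",
    "OutputSandbox": ["std.out", "std.err"],
}

if __name__ == "__main__":
    # Write the description to test.jdl, ready for submission
    with open("test.jdl", "w") as handle:
        handle.write(make_jdl(TEST_JOB))
```

The generated text matches the four attribute lines of test.jdl shown on the slide.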

37 Job Submission (2)
edg-job-list-match test.jdl
edg-job-submit test.jdl
edg-job-status JobID
edg-job-cancel JobID
edg-job-get-output JobID
edg-job-get-logging-info JobID
Bypassing RB: globus-job-run CE command
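Between edg-job-submit and edg-job-get-output a user typically checks the status repeatedly. A polling loop for that step might look like the sketch below. It makes two assumptions that are not on the slide: `get_status` is a hypothetical callable that would wrap `edg-job-status JobID` and return the state as a string, and the set of final state names is taken from general knowledge of the Logging and Bookkeeping service rather than from this presentation.

```python
import time

# Assumed names of final job states; adjust to what your middleware reports
TERMINAL_STATES = {"Done", "Aborted", "Cancelled", "Cleared"}


def wait_for_job(job_id, get_status, poll_seconds=60, max_polls=100, sleep=time.sleep):
    """Poll get_status(job_id) until a terminal state is seen or max_polls is reached.

    Returns the last state observed, so the caller can tell a finished job
    from one that was still running when polling gave up.
    """
    state = None
    for _ in range(max_polls):
        state = get_status(job_id)
        if state in TERMINAL_STATES:
            break
        sleep(poll_seconds)
    return state


if __name__ == "__main__":
    # Stand-in status source for a dry run; a real one would call edg-job-status
    fake_states = iter(["Submitted", "Running", "Done"])
    final = wait_for_job("JobID", lambda job: next(fake_states), sleep=lambda s: None)
    print(final)
```

Passing `sleep` as a parameter keeps the loop testable without waiting; once the state is final, the output can be fetched with edg-job-get-output.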

38 Using myproxy server
Myproxy server is used for
- Very long jobs (during which a normal proxy may expire)
- Getting a proxy on machines other than the UI (typical for portals)
myproxy-init -s MYPROXYSERVER
myproxy-get-delegation
myproxy-info
myproxy-destroy
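Whether a job counts as “very long” depends on the proxy lifetime. The check below is a rough sketch of that decision; the one-hour safety margin and the 12-hour lifetime in the example are assumptions, so use the remaining lifetime that voms-proxy-info reports for your own proxy.

```python
from datetime import timedelta


def needs_myproxy(expected_runtime, proxy_lifetime, safety_margin=timedelta(hours=1)):
    """Return True if the job (plus a margin for queueing) may outlive the proxy."""
    return expected_runtime + safety_margin > proxy_lifetime


if __name__ == "__main__":
    lifetime = timedelta(hours=12)  # assumed lifetime, for illustration only
    for hours in (2, 48):
        verdict = needs_myproxy(timedelta(hours=hours), lifetime)
        print(f"{hours} h job needs MyProxy renewal: {verdict}")
```

The margin exists because a job may wait in a CE queue before it starts running, and that waiting time also counts against the proxy lifetime.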

39 In a nutshell
voms-proxy-init --voms VONAME
edg-job-submit job.jdl
edg-job-status JobID
edg-job-get-output JobID

40 Monitoring, SEE-GRID SFTs and GridICE (1)
qstat, showq, pbsnodes on CE
ldapsearch of GIISes:
ldapsearch -x -h <CE_or_SE> -p b mds-vo-name=local,o=grid
ldapsearch -x -h <CE> -p b mds-vo-name=<site-giis-name>,o=grid
ldapsearch -x -h <BDII> -p b o=grid
Useful entries: GlueCEUniqueID, GlueSEUniqueID, GlueSEName, GlueCESEBindSEUniqueID
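The output of these ldapsearch queries is LDIF text, so the useful entries can be pulled out with a few lines of script. The function below is a minimal sketch that handles plain `attribute: value` lines and folded continuation lines, but not base64-encoded values; the host names in `SAMPLE` are invented for illustration.

```python
def ldif_values(ldif_text, attribute):
    """Collect all values of one attribute from ldapsearch (LDIF) output."""
    # LDIF folds long lines: a continuation line starts with a single space
    unfolded = []
    for line in ldif_text.splitlines():
        if line.startswith(" ") and unfolded:
            unfolded[-1] += line[1:]
        else:
            unfolded.append(line)

    prefix = attribute.lower() + ":"
    values = []
    for line in unfolded:
        # Attribute names are case-insensitive; "dn:" lines do not match the prefix
        if line.lower().startswith(prefix):
            values.append(line[len(prefix):].strip())
    return values


# Hypothetical output, for illustration only
SAMPLE = """\
dn: GlueCEUniqueID=ce01.example.org:2119/jobmanager-pbs-seegrid,mds-vo-name=local,o=grid
GlueCEUniqueID: ce01.example.org:2119/jobmanager-pbs-seegrid

dn: GlueSEUniqueID=se01.example.org,mds-vo-name=local,o=grid
GlueSEUniqueID: se01.example.org
GlueSEName: EXAMPLE:disk
"""

if __name__ == "__main__":
    print(ldif_values(SAMPLE, "GlueCEUniqueID"))
    print(ldif_values(SAMPLE, "GlueSEUniqueID"))
```

In practice the text would come from saving the output of one of the ldapsearch commands above to a file and reading it in.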

41 Monitoring, SEE-GRID SFTs and GridICE (2)
For some grid components there are custom checking tools, e.g. rgma-client-check
ps on all nodes – do not forget about the excellent ps!
Submitting test jobs
SEE-GRID GStat

42 Monitoring, SEE-GRID SFTs and GridICE (3)
SEE-GRID GridICE
Real Time Monitor

