Tutorial on Distributed High Performance Computing 14:30 – 19:00 (2:30 pm – 7:00 pm) Wednesday November 17, 2010 Jornadas Chilenas de Computación 2010.


1 Tutorial on Distributed High Performance Computing 14:30 – 19:00 (2:30 pm – 7:00 pm) Wednesday November 17, 2010 Jornadas Chilenas de Computación 2010 INFONOR-CHILE 2010 November 15th - 19th, 2010 Antofagasta, Chile Dr. Barry Wilkinson University of North Carolina Charlotte Nov 3, 2010 © Barry Wilkinson

2 Agenda Part 1 14:30 – 16:30 (2:30 pm – 4:30 pm) Review of distributed high performance computing landscape, cluster computing, grid computing, cloud computing, grid portals and toolkits, GPU computing for HPC, CUDA. 16:30 – 17:00 (4:30 pm – 5:00 pm) Coffee Break Part 2 17:00 – 19:00 (5:00 pm – 7:00 pm) Grid computing infrastructure design, portal design, higher-level interfaces, and the challenge of porting applications to a Grid. (Hands-on experience in Grid computing workshop.)

3 Part 2 17.00 – 19.00 (5:00 pm – 7:00 pm) Grid Computing Infrastructure Design

4 Grid computing infrastructure software Primary objective: to provide a seamless environment for users to access distributed resources.

5 Key aspects include: a secure envelope over all transactions; single sign-on (being able to access all available resources and run jobs without having to supply additional passwords or account information); data management tools; information services providing characteristics of resources and their status (including dynamic load); APIs and services that enable applications themselves to take advantage of the Grid platform; a convenient user interface.

6 Globus A “toolkit” of services and packages for creating a basic grid computing infrastructure. Higher-level tools need to be added to this infrastructure. Four versions developed from 1996 to the present. (Version 5 is being developed for 2010; we will not consider it.) Provides reference implementations of Grid computing standards. Version 4 is web-services based. Non-web-services code exists from earlier versions (legacy) or where web services are not appropriate (for efficiency, etc.). Note -- As cloud computing moves into HPC, expect to see cloud computing standards for interoperability.

7 Some Globus toolkit versions (approximate timeline) [Figure: timeline of Globus toolkit versions, up to Globus 5.0.]

8 Globus Toolkit Five major parts: Common runtime (libraries and services); Security (components to provide secure access); Execution management (executing, monitoring, and management of jobs); Data management (discovery, access, and transfer of data); Information (discovery and monitoring of resources and services).

9 Globus Open Source Grid Software (I. Foster) [Figure: Globus Toolkit components grouped by area (Security, Data Management, Execution Management, Information Services, Common Runtime), by kind (Web Services components, non-WS components), and by version (GT2, GT3, GT4). Components named: Pre-WS Authentication Authorization, WS Authentication Authorization, Community Authorization Service, Credential Management, Delegation Service, GridFTP, Reliable File Transfer, OGSA-DAI [Tech Preview], Replica Location Service, Grid Resource Allocation Mgmt (Pre-WS GRAM), Grid Resource Allocation Mgmt (WS GRAM), Community Scheduler Framework [contribution], Monitoring & Discovery System (MDS2), Monitoring & Discovery System (MDS4), C Common Libraries, XIO, Java WS Core, C WS Core, Python WS Core [contribution].]

10 Some basic Globus components GSI (Grid Security Infrastructure): provides the security envelope around Grid resources; uses public key cryptography. GRAM (Globus/Grid Resource Allocation Management): Globus’ basic execution management component; used to issue and manage jobs. MDS (Monitoring and Discovery System): used to discover resources and their status. GridFTP: for transferring files between resources.

11 Security Has to cross administrative domains. Need agreed mechanisms and standards. Focus on Internet security mechanisms, modified to handle the special needs of Grid computing. Distributed resources must be protected from unauthorized access.

12 GSI (Grid Security Infrastructure) Globus components for creating the security envelope. Authentication -- Requires each user to be authenticated (their identity proved). Uses public key cryptography (the basis of Internet security). Each user must possess a (digital) certificate, signed by a trusted certificate authority. Delegation -- Users also need to be able to give their authority to Grid components to act on their behalf. Achieved with so-called proxy certificates. Authorization -- Users generally also need authorization to use resources (generally accounts).

13 Resource Discovery Globus MDS (Monitoring and Discovery System) Users might access MDS to discover status of compute resources. In practice, users often know what resources are there but not dynamic load. MDS might be used by other Grid components such as schedulers.

14 Executing a Job Users typically want to submit jobs for execution. Basic Globus component for running a job is GRAM (Globus or Grid Resource Allocation Management).

15 GRAM Command-line Interface Grid computing environments are mostly Linux-based and were originally accessed through a command line. Once your security credentials are established, to run a simple job you might issue the GRAM command: globusrun-ws -submit -F torvalds.cis.uncw.edu -c prog1 where prog1 is the executable of the job. The executable needs to be present on the compute resource that is to execute it (in this case torvalds.cis.uncw.edu).
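As an illustration only, the GRAM command on the slide can be assembled programmatically before being handed to a process launcher. The sketch below builds the argument list and prints it; it does not run anything, so no Globus installation is needed. The function name gram_submit_command is hypothetical, and only the flags that appear on the slide (-submit, -F, -c) are used.

```python
import shlex


def gram_submit_command(host, executable):
    """Return the argument list for the globusrun-ws call shown on the slide.

    host is the compute resource (value of -F) and executable is the
    program that must already be present on that resource (value of -c).
    """
    return ["globusrun-ws", "-submit", "-F", host, "-c", executable]


cmd = gram_submit_command("torvalds.cis.uncw.edu", "prog1")
# Print the command as it would be typed at a shell prompt.
print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string avoids shell quoting problems if it is later passed to subprocess.run.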

16 Transferring files It may be necessary to transfer files to resources beforehand, and afterwards to transfer files to other locations, including back to the user. The user might use the data management component called GridFTP for that.

17 GridFTP command to transfer files globus-url-copy gsiftp://www.coit-grid02.uncc.edu/~abw/prog1out file:///home/abw/ The first argument is the source location and the second argument is the destination location. In the above case, the file www.coit-grid02.uncc.edu/~abw/prog1out is transferred to /home/abw/ on the local computer.
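A small sketch of the same transfer, again only building the command rather than executing it. The helper name gridftp_copy_command is hypothetical, and the scheme check is an assumption of this sketch: it accepts only the two URL schemes that appear on the slide (gsiftp and file), which is not a statement about what globus-url-copy itself supports.

```python
from urllib.parse import urlparse

# Schemes that appear on the slide; restricting to these is this sketch's choice.
SLIDE_SCHEMES = ("gsiftp", "file")


def gridftp_copy_command(source_url, dest_url):
    """Return the globus-url-copy argument list: source first, destination second."""
    for url in (source_url, dest_url):
        scheme = urlparse(url).scheme
        if scheme not in SLIDE_SCHEMES:
            raise ValueError(f"unexpected URL scheme {scheme!r} in {url!r}")
    return ["globus-url-copy", source_url, dest_url]


src = "gsiftp://www.coit-grid02.uncc.edu/~abw/prog1out"
dst = "file:///home/abw/"
copy_cmd = gridftp_copy_command(src, dst)
print(" ".join(copy_cmd))
print("remote host:", urlparse(src).hostname)
print("local path:", urlparse(dst).path)
```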

18 User employing Globus services and facilities

19 Grid portals A command-line interface is a very primitive way of interacting with Grid resources. A portal offers a higher-level, Web-based interface for accessing and controlling grid resources and for communicating with other members of a virtual organization. A popular portal toolkit is GridSphere.

20 Some GridSphere-based Grid portals


22 UNC-C course portal

23 “Credentials” and accounts Before users can log on, they need a user name and password for the portal. They must also have user “credentials” and accounts on the resources they wish to access. PURSe (Portal-based User Registration Service) is used in the UNC-Charlotte course portal to facilitate setup procedures. It is reached by selecting the “Register” tab. The user enters the required information, which is forwarded to the Grid system administrator to set up accounts and credentials.

24 PURSe registration portlet

25 Registration activities

26 Grid information tab Once logged into the Grid portal, the user will see a number of tabs across the top, which enable the user to perform many basic tasks.

27 Proxies To use many services, you are required to have a proxy certificate (a proxy), derived from your user certificate. Proxies enable resources to be accessed on the user’s behalf. Proxies are part of the Grid security infrastructure, discussed later. myProxy Credential Management Service: very convenient for holding proxies (and user certificates). Usually, GridSphere automatically obtains a proxy from the myProxy server for you when you log in.

28 Proxy management tab

29 File management tab

30 Batch job submission tab

31 (Globus) Grid Security Infrastructure (GSI) [Figure: the Globus Toolkit component diagram from slide 9 (I. Foster), repeated.]

32 Authentication and authorization Authentication -- Process of deciding whether a particular identity is who it claims to be (applies to humans and systems). Authorization -- Process of deciding whether a particular identity can access a particular resource. Assumes identity has been previously validated through authentication. Access control -- What type of access is allowed: a finer level of authorization rather than a blanket ability to make any type of access.

33 GSI Authentication Basically the same as regular PKI authentication. Users have credentials they use to prove their identity. Credentials consist of an X.509 certificate and a private key. The private key is kept secret by its owner (or held on the owner’s behalf at a secure repository) and is encrypted with a passphrase. The X.509 certificate is available to all.

34 Certificate Authorities for Grid Computing A Grid computing group (virtual organization) requires one or more certificate authorities to sign certificates. Generally, cannot use existing commercial certificate authorities because virtual organization wants to control who becomes a member of organization. Done by issuing certificates signed by a certificate authority of the virtual organization.

35 SimpleCA Simple implementation of a certificate authority Part of Globus toolkit Can be installed easily. Basically OpenSSL certificate authority configured to work with Globus.

36 UNC-C/UNC-W Grid Computing Course Certificate Authorities SimpleCA used in UNC-C/UNC-W Grid computing course. Currently, we have a CA at UNC-C and at UNC-W. Multiple certificate authorities possible -- One at each institution for signing certificates of students at that institution. Then arrange for Globus to accept certificates signed by each certificate authority in much the same way as a browser accepts multiple CAs.

37 Certificate Authorities for Grid Computing Projects Single centralized CA -- certainly simplifies management. Example: the UK e-Science national Grid has a centralized certificate authority but uses multiple registration authorities spread around the country for identity management. Registration authorities are staffed by individuals who will require positive proof of identity (photo ID). Multiple CAs -- Have multiple certificate authorities with bridge or hierarchical certificate authorities. More common for USA Grid computing projects.

38 Single national certificate authority with multiple registration authorities

39 Getting a Certificate using Globus Commands grid-cert-request Creates a public/private key pair and a request for a signed certificate, that is, an unsigned certificate containing the subject name and public key. The default distinguished name (certificate subject) is displayed for the user as part of the message. The command requires that you create a passphrase, which will be used to encrypt the private key and must be remembered.

40 Three files created by the grid-cert-request command in the user’s .globus directory: User request: usercert_request.pem (essentially an unsigned certificate containing subject name and public key). User’s private key: userkey.pem. Empty file: usercert.pem (placeholder for where the signed certificate will be put later).

41 Sending request usercert_request.pem file sent to certificate authority. Typically, this file sent by email to administrator of certificate authority. grid-cert-request command includes a message telling user what to do.

42 CA Administrator After receiving request, administrator will run command: grid-ca-sign -in usercert_request.pem -out signedcert.pem Needs passphrase used to encrypt/decrypt certificate authority’s private key. Command above generates signed certificate called signedcert.pem

43 CA Administrator Certificate authority administrator will return signed certificate to user, typically by email. It is not a major security concern. Why? User then replaces empty usercert.pem with this file (rename it to be usercert.pem). Other ways of getting signed certificate back to user, including letting administrator access user’s account to download file into user’s account.
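The last user-side step, replacing the empty usercert.pem with the signed certificate, can be sketched as follows. This is a minimal illustration under stated assumptions, not a Globus tool: the function names are hypothetical, the file names are those given on the slides, and the demo uses a temporary directory with dummy file contents instead of a real .globus directory.

```python
import shutil
import tempfile
from pathlib import Path


def has_complete_credentials(globus_dir):
    """True when both the private key and a non-empty certificate are present."""
    globus_dir = Path(globus_dir)
    key = globus_dir / "userkey.pem"
    cert = globus_dir / "usercert.pem"
    return key.is_file() and cert.is_file() and cert.stat().st_size > 0


def install_signed_certificate(signed_cert, globus_dir):
    """Copy the certificate returned by the CA over the empty placeholder."""
    globus_dir = Path(globus_dir)
    target = globus_dir / "usercert.pem"
    if not (globus_dir / "userkey.pem").is_file():
        raise FileNotFoundError("userkey.pem missing: run grid-cert-request first")
    if target.is_file() and target.stat().st_size > 0:
        raise FileExistsError("usercert.pem is not empty; refusing to overwrite")
    shutil.copyfile(signed_cert, target)
    return target


# Demo in a throwaway directory with dummy contents (not real PEM data).
with tempfile.TemporaryDirectory() as tmp:
    globus = Path(tmp) / ".globus"
    globus.mkdir()
    (globus / "userkey.pem").write_text("dummy encrypted private key")
    (globus / "usercert.pem").write_text("")  # empty placeholder
    signed = Path(tmp) / "signedcert.pem"
    signed.write_text("dummy signed certificate")

    before = has_complete_credentials(globus)
    install_signed_certificate(signed, globus)
    after = has_complete_credentials(globus)
    print("complete before:", before, "after:", after)
```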

44 Getting a signed certificate using Globus commands

45 User Credentials Finally, we have the complete set of user credentials: User’s private key: userkey.pem User’s signed certificate: usercert.pem

46 Certificates for Resources Computing resources also need their identity verified in a formalized manner when added to the Grid infrastructure. They need their own host certificate signed by a certificate authority trusted by the Grid. Only such machines will be allowed to participate in Grid activities. They might be used under certain access rights.

47 Certificate of Certificate Authority When a certificate authority is created, it will self-sign its own certificate (unless it is part of a certificate authority hierarchy). The certificate authority has two files: cert_hash.0 and cert_hash.signing_policy, where cert_hash is the hash code of the identity of the certificate authority (a 32-bit number, given in hexadecimal). cert_hash.0 is the actual certificate of the certificate authority. cert_hash.signing_policy defines the format of distinguished names of certificates signed by the certificate authority.

48 Configuring GT4 to Trust a Particular Certificate Authority Globus can be configured to accept certificates from multiple certificate authorities. It is just a matter of placing the two files of each certificate authority, cert_hash.0 and cert_hash.signing_policy (where cert_hash is the hash code of that certificate authority), in /etc/grid-security/certificates.
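The file-naming rule can be made concrete with a short sketch. It assumes, as the slides state, that the hash is a 32-bit number written in hexadecimal (so 8 hex digits) and that trust is established purely by the presence of the two files; the function names are hypothetical, and the trust directory is a parameter so the check can be tried on a scratch directory.

```python
import re
from pathlib import Path

# Directory named on the slide; passed as a default so it can be overridden.
TRUST_DIR = "/etc/grid-security/certificates"


def ca_trust_files(cert_hash, trust_dir=TRUST_DIR):
    """Return the paths of the certificate and signing-policy files for a CA."""
    if not re.fullmatch(r"[0-9a-f]{8}", cert_hash):
        raise ValueError("cert_hash must be 8 lowercase hexadecimal digits")
    base = Path(trust_dir)
    return base / f"{cert_hash}.0", base / f"{cert_hash}.signing_policy"


def is_ca_configured(cert_hash, trust_dir=TRUST_DIR):
    """True when both files for the CA are present in the trust directory."""
    return all(path.is_file() for path in ca_trust_files(cert_hash, trust_dir))


cert_path, policy_path = ca_trust_files("61de2736")
print(cert_path.as_posix())
print(policy_path.as_posix())
```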

49 Certificate Authorities Trusted One can see the certificate authorities recognized, and choose one to sign your certificate request, by issuing the command grid-cert-request -ca
Sample output:
nondefaultca=true
The available CA configurations installed on this host are:
1) 61de2736 - /O=Grid/OU=GlobusTest/OU=simpleCA-coit-grid02.uncc.edu/CN=Globus Simple CA
2) 76cc56e4 - /O=Grid/OU=GlobusTest/OU=simpleCA-coit-grid03.uncc.edu/CN=Globus Simple CA
3) c41c7188 - /O=UNCW /OU=Computer Science/CN=Certificate Authority
Enter the index number of the CA you want to sign your cert request:
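If the list of trusted CAs needs to be processed by a script, the sample output can be parsed line by line. The sketch below assumes the output has one CA per line in the form "index) hash - distinguished name", which is inferred from the sample on the slide rather than from a documented format; parse_ca_list is a hypothetical helper.

```python
import re

SAMPLE_OUTPUT = """\
nondefaultca=true
The available CA configurations installed on this host are:
1) 61de2736 - /O=Grid/OU=GlobusTest/OU=simpleCA-coit-grid02.uncc.edu/CN=Globus Simple CA
2) 76cc56e4 - /O=Grid/OU=GlobusTest/OU=simpleCA-coit-grid03.uncc.edu/CN=Globus Simple CA
3) c41c7188 - /O=UNCW /OU=Computer Science/CN=Certificate Authority
Enter the index number of the CA you want to sign your cert request:
"""

# index, 8-hex-digit hash, then the distinguished name up to end of line
CA_LINE = re.compile(r"^(\d+)\)\s+([0-9a-f]{8})\s+-\s+(/.+?)\s*$")


def parse_ca_list(output):
    """Return a list of (index, cert_hash, distinguished_name) tuples."""
    entries = []
    for line in output.splitlines():
        match = CA_LINE.match(line)
        if match:
            entries.append((int(match.group(1)), match.group(2), match.group(3)))
    return entries


cas = parse_ca_list(SAMPLE_OUTPUT)
for index, cert_hash, dn in cas:
    print(index, cert_hash, dn)
```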

50 GSI Authentication Protocol Originally based on the SSL protocol. B authenticating A: (1) A sends its certificate to B. (2) B gets A’s public key and name from the certificate, using the public key of the certificate authority that signed the certificate. (It must be a CA that B trusts.) (3) B creates a random number and sends it to A. (4) A encrypts the random number with its private key and sends it to B. (5) B decrypts the number with A’s public key and checks the number. If correct, B is certain of A’s identity. Mutual authentication -- the process is done both ways.
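The challenge-response part of this exchange (B sends a random number, A transforms it with its private key, B checks the result with A’s public key) can be sketched with textbook RSA. Everything here is a toy for illustration: the key is built from the tiny primes 61 and 53 and is completely insecure, the function names are hypothetical, and the certificate check against the CA’s public key is not modelled. Real GSI uses X.509 certificates and a proper TLS-style handshake.

```python
import secrets

# Toy RSA key pair for A. N = 61 * 53, and E * D = 1 (mod lcm-compatible phi 3120).
N = 61 * 53      # 3233, public modulus
E = 17           # public exponent (would come from A's certificate)
D = 2753         # private exponent (known only to A)


def make_challenge():
    """B: pick a random number in [2, N-1] to send to A."""
    return secrets.randbelow(N - 2) + 2


def respond(challenge, private_exponent=D):
    """A: transform the challenge with the private key."""
    return pow(challenge, private_exponent, N)


def verify(challenge, response, public_exponent=E):
    """B: undo the transformation with A's public key and compare."""
    return pow(response, public_exponent, N) == challenge


challenge = make_challenge()
response = respond(challenge)
print("A authenticated:", verify(challenge, response))
# A response that was not produced with A's private key is rejected.
print("forged response accepted:", verify(challenge, (response + 1) % N))
```

For mutual authentication the same exchange would be repeated with the roles of A and B swapped.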

51 GSI Authorization Authorization -- process of deciding whether a particular identity can access a particular resource and in what fashion. Currently in Globus toolkit proper, only basic access control facilities provided. (Other software packages aid process.)

52 Accounts Generally in a traditional Globus Grid environment, user accounts have to exist on each computer system that users wish to access. Setting up individual accounts time-consuming. Multiple system administrators involved. Sometimes, convenient to have a group account for virtual organization and virtual organization users share this account.

53 Accounts In our course, accounts are simply set up by hand (using a script). An automated mechanism for creating and managing these accounts is very desirable. Use a network-accessible (LDAP) database that lists users and their access privileges, and incorporates the distinguished name format found in X.509 certificates.

54 Mapping Distinguished Names to Account gridmap file Very basic Globus way of mapping user’s distinguished names to their account names Used to give access to accounts via their distinguished name found on user’s certificate. Each user entry in list takes form: Distinguished_name local_user_account_name

55 Example: "/O=Grid/OU=GlobusTest/OU=simpleCA-coit-grid02.uncc.edu/OU=uncc.edu/CN=student1" student1 The distinguished name is given in quotation marks to allow spaces. It must exactly match the way it appears in the user’s certificate. GSI uses the gridmap file to establish that the user may access the account.
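A minimal sketch of reading such entries, assuming only the two-field form shown on the slide (a quoted distinguished name followed by one local account name). Skipping blank lines and lines starting with # is an assumption of this sketch, and the function names are hypothetical. The lookup is an exact string match, mirroring the requirement that the name match the certificate exactly.

```python
import shlex


def parse_gridmap_line(line):
    """Split one entry into (distinguished_name, local_account)."""
    # shlex honours the double quotes, so spaces inside the DN are preserved.
    parts = shlex.split(line)
    if len(parts) != 2:
        raise ValueError(f"expected a quoted DN and one account name: {line!r}")
    return parts[0], parts[1]


def load_gridmap(text):
    """Build a DN -> account mapping from the text of a gridmap file."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # assumption: skip blanks/comments
            continue
        dn, account = parse_gridmap_line(line)
        mapping[dn] = account
    return mapping


GRIDMAP_TEXT = (
    '"/O=Grid/OU=GlobusTest/OU=simpleCA-coit-grid02.uncc.edu'
    '/OU=uncc.edu/CN=student1" student1\n'
)

gridmap = load_gridmap(GRIDMAP_TEXT)
user_dn = "/O=Grid/OU=GlobusTest/OU=simpleCA-coit-grid02.uncc.edu/OU=uncc.edu/CN=student1"
print(gridmap.get(user_dn))          # local account for a known DN
print(gridmap.get("/O=Grid/CN=x"))   # None: no entry, so no access
```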

56 Mapping accounts using gridmap files on distributed computers

57 Account Privileges Gridmap files often compared to access control lists, but they only provide account name mapping and blanket access. They do not provide specific types of access (levels of permissions, read/write/execute, group memberships, etc.) User access privileges will derive from local system access control list. Generally, need more powerful mechanism to control type of access, see next.

58 Break/Questions

