Carl Kesselman Information Sciences Institute University of Southern California Univa Corporation Grid MasterClass.


2 Carl Kesselman Information Sciences Institute University of Southern California Univa Corporation Grid MasterClass

3 2 Credits
- Globus Toolkit v4 is the work of many talented Globus Alliance members, at
  - Argonne Natl. Lab & U.Chicago
  - USC Information Sciences Institute
  - National Center for Supercomputing Applns
  - U. Edinburgh
  - Swedish PDC
  - Univa Corporation
  - Other contributors at other institutions
- Supported by DOE, NSF, UK EPSRC, and other sources

4 3 Acknowledgements l Ian Foster with whom I developed many of these slides l Bill Allcock, Lisa Childers, Kate Keahey, Jennifer Schopf, Frank Siebenlist, Mike Wilde @ ANL/UC l Ann Chervenak, Ewa Deelman, Laura Pearlman, Mike D’Arcy, Rob Schuler @ USC/ISI l Karl Czajkowski, Steve Tuecke @ Univa l Numerous other fine colleagues l NSF, DOE, IBM for research support

5 4 Context: System-Level Science Problems too large &/or complex to tackle alone …

6 5 Seismic Hazard Analysis (T. Jordan & SCEC) Seismic Hazard Model Seismicity Paleoseismology Local site effects Geologic structure Faults Stress transfer Crustal motion Crustal deformation Seismic velocity structure Rupture dynamics

7 6 SCEC Community Model
[Diagram: numbered pathways (1 to 5) linking the components of the SCEC Community Model]
- Pathways: Standardized Seismic Hazard Analysis; Ground motion simulation; Physics-based earthquake forecasting; Ground-motion inverse problem; Structural Simulation
- Components: Earthquake Forecast Model, Attenuation Relationship, Intensity Measures, Unified Structural Representation (Faults, Motions, Stresses, Anelastic model), AWM, SRM, RDM, FSM, Ground Motions, Other Data (Geology, Geodesy), Invert
- Abbreviations: AWP = Anelastic Wave Propagation; SRM = Site Response Model; FSM = Fault System Model; RDM = Rupture Dynamics Model

8 7 Science Takes a Village … l Teams organized around common goals u People, resource, software, data, instruments… l With diverse membership & capabilities u Expertise in multiple areas required l And geographic and political distribution u No location/organization possesses all required skills and resources l Must adapt as a function of the situation u Adjust membership, reallocate responsibilities, renegotiate resources

9 8 Virtual Organizations
- From organizational behavior/management:
  - "a group of people who interact through interdependent tasks guided by common purpose [that] works across space, time, and organizational boundaries with links strengthened by webs of communication technologies" (Lipnack & Stamps, 1997)
- The impact of cyberinfrastructure
  - People → computational agents & services
  - Communication technologies → IT infrastructure, i.e. Grid
“The Anatomy of the Grid”, Foster, Kesselman, Tuecke, 2001

10 9 Beyond Science Silos: Service-Oriented Architecture l Decompose across network l Clients integrate dynamically u Select & compose services u Select “best of breed” providers u Publish result as a new service l Decouple resource & service providers Function Resource Data Archives Analysis tools Discovery tools Users Fig: S. G. Djorgovski

11 10 Decomposition Enables Separation of Concerns & Roles User Service Provider “Provide access to data D at S1, S2, S3 with performance P” Resource Provider “Provide storage with performance P1, network with P2, …” D S1 S2 S3 D S1 S2 S3 Replica catalog, User-level multicast, … D S1 S2 S3

12 11 Forming & Operating (Scientific) Communities l Define VO membership and roles, & enforce laws and community standards u I.e., policy l Build, buy, operate, & share community infrastructure u Data, programs, services, computing, storage, instruments u Service-oriented architecture l Define and perform collaborative work u Use shared infrastructure, roles, & policy u Manage community workflow

14 13 Defining Community: Membership and Laws
- Identify VO participants and roles
  - For people and services
- Specify and control actions of members
  - Empower members → delegation
  - Enforce restrictions → federate policy

15 14 Security Services Objectives
- It’s all about “policy”
  - Define a VO’s operating rules
  - Security services facilitate the enforcement
- Policy facilitates “business objectives”
  - Related to goals/purpose of the VO
- Security policy is often a delicate balance
  - Legislation may mandate minimum security
  - More security → higher costs
  - Less security → higher exposure to loss
  - Risk versus rewards

16 15 Policy Challenges in VOs
- Restrict VO operations based on characteristics of requestor
  - VO dynamics create challenges
- Intra-VO
  - VO specific roles
  - Mechanisms to specify/enforce policy at VO level
- Inter-VO
  - Entities/roles in one VO not necessarily defined in another VO
[Diagram labels: Access granted by community to user; Site admission-control policies; Effective Access; Policy of site to community]
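One way to read the diagram labels on this slide is that a member's effective access is the overlap between what the community grants and what the site admits. The sketch below illustrates that reading only; the operation names are hypothetical and not taken from any Globus interface.

```python
# Hypothetical rights; names are illustrative, not from any Globus API.
community_grants = {"read_data", "submit_job", "write_data"}  # granted by community to user
site_admits = {"read_data", "submit_job", "use_gpu"}          # site admission-control policy

def effective_access(granted_by_vo, admitted_by_site):
    """Return only the operations allowed by BOTH the VO and the site."""
    return granted_by_vo & admitted_by_site

allowed = effective_access(community_grants, site_admits)
print(sorted(allowed))
```

Modeling both policies as sets keeps the point visible: neither party alone decides what the user can do.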

17 16 Core Security Mechanisms
- Attribute assertions
  - C asserts that S has attribute A with value V
- Authentication and digital signature
  - Allows signer to assert attributes
- Delegation
  - C asserts that S can perform O on behalf of C
- Attribute mapping
  - {A1, A2… An}vo1 → {A’1, A’2… A’m}vo2
- Policy
  - Entity with attributes A asserted by C may perform operation O on resource R
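The assertion and policy forms on this slide can be sketched as plain data plus one check. This is a minimal illustration under stated assumptions: the class and field names are invented here, and signature verification is deliberately omitted (a real system would verify the issuer's signature before trusting an assertion).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Assertion:
    issuer: str     # C: who makes the assertion
    subject: str    # S: who the assertion is about
    attribute: str  # A
    value: str      # V

@dataclass(frozen=True)
class Rule:
    trusted_issuer: str  # only assertions from this issuer count
    attribute: str
    value: str
    operation: str       # O
    resource: str        # R

def is_permitted(subject, operation, resource, assertions, rules):
    """True if a rule for (operation, resource) matches an assertion about subject."""
    for rule in rules:
        if rule.operation != operation or rule.resource != resource:
            continue
        for a in assertions:
            if (a.subject == subject and a.issuer == rule.trusted_issuer
                    and a.attribute == rule.attribute and a.value == rule.value):
                return True
    return False

# Hypothetical example data.
assertions = [Assertion("vo-ata", "alice", "role", "analyst")]
rules = [Rule("vo-ata", "role", "analyst", "read", "dataset-1")]
print(is_permitted("alice", "read", "dataset-1", assertions, rules))
```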

18 17 Trust in VOs
- Do I “believe” an attribute assertion?
  - Used to evaluate cost vs. benefit of performing an operation
    - E.g., perform untrusted operation with extra auditing
  - Look at attributes of assertion signer
- Rooting trust
  - Externally recognized source, e.g., CA
  - Dynamically via VO structure → delegation
  - Dynamically via alternative sources, e.g., reputation

19 18 Security Services for VO Policy
- Attribute Authority (ATA)
  - Issue signed attribute assertions (incl. identity, delegation & mapping)
- Authorization Authority (AZA)
  - Decisions based on assertions & policy
- Use with message/transport level security
[Diagram: VO A Service and VO B Service with a VO ATA, a VO AZA, and a Mapping ATA; labels: VO User A, VO User B, Resource Admin Attribute, VO Member Attribute, Delegation Assertion “User B can use Service A”, mapping VO-A Attr → VO-B Attr]

20 19 Forming & Operating Scientific Communities l Define VO membership and roles, & enforce laws and community standards u I.e., policy l Build, buy, operate, & share community infrastructure u Data, programs, services, computing, storage, instruments u Service-oriented architecture l Define and perform collaborative work u Use shared infrastructure, roles, & policy u Manage community workflow

21 20 Community Services Provider Content Services Capacity Bootstrapping a VO by Assembling Services 1) Integrate services from other sources u Virtualize external services as VO services 2) Coordinate & compose u Create new services from existing ones Capacity Provider “Service-Oriented Science”, Foster, 2005

22 21 Providing VO Services: (1) Integration from Other Sources l Negotiate service level agreements l Delegate and deploy capabilities/services l Provision to deliver defined capability l Configure environment l Host layered functions Community A Community Z …

23 22 Virtualizing Existing Services into a VO l Establish service agreement with service u E.g., WS-Agreement l Delegate use to VO user User A VO Admin User B VO User Existing Services

24 23 Deploying New Services Policy Client Environment Activity Allocate/provision Configure Initiate activity Monitor activity Control activity Interface Resource provider

25 24 Activities Can Be Nested Policy Client Environment Interface Resource provider Client

26 25 Providing VO Services: (2) Coordination & Composition l Take a set of provisioned services … … & compose to synthesize new behaviors l This is traditional service composition u But must also be concerned with emergent behaviors, autonomous interactions u See the work of the agent & PlanetLab communities “Brain vs. Brawn: Why Grids and Agents Need Each Other," Foster, Kesselman, Jennings, 2004.

27 26 The Globus-Based LIGO Data Grid
LIGO Gravitational Wave Observatory
- Replicating >1 Terabyte/day to 8 sites (map labels: Birmingham, Cardiff, AEI/Golm)
- >40 million replicas so far
- MTBF = 1 month
www.globus.org/solutions

28 27
- Pull “missing” files to a storage system
[Diagram: a list of required files feeds the Data Replication Service, which uses the Replica Location Index and Local Replica Catalogs (data location) and the Reliable File Transfer Service and GridFTP (data movement)]
“Design and Implementation of a Data Replication Service Based on the Lightweight Data Replicator System,” Chervenak et al., 2005
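The "pull missing files" idea can be sketched as a planning step: compare the required list with the local catalog, then look up a source for each missing file. This is a simplified illustration, not the DRS implementation; the function name, the dictionary-based catalogs, and the URLs are all assumptions.

```python
def plan_replication(required, local_catalog, replica_index):
    """Return (transfers, unresolved) for files that are missing locally.

    required:       iterable of logical file names
    local_catalog:  set of logical names already stored locally
    replica_index:  dict mapping logical name -> list of remote source URLs
    """
    transfers, unresolved = [], []
    for name in required:
        if name in local_catalog:
            continue                      # already present, nothing to pull
        sources = replica_index.get(name, [])
        if sources:
            transfers.append((name, sources[0]))  # naive choice: first known replica
        else:
            unresolved.append(name)       # no known replica anywhere
    return transfers, unresolved

# Hypothetical example data.
required = ["f1", "f2", "f3"]
local_catalog = {"f1"}
replica_index = {"f2": ["gsiftp://site-a.example.org/data/f2"]}
transfers, unresolved = plan_replication(required, local_catalog, replica_index)
print(transfers, unresolved)
```

In the architecture on the slide, the resulting transfer list would be handed to a reliable transfer service, and the local catalog updated once each file arrives.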

29 28 Hypervisor/OS Deploy hypervisor/OS Composing Resources … Composing Services Physical machine Procure hardware VM Deploy virtual machine Provisioning, management, and monitoring at all levels JVM Deploy container DRS Deploy service GridFTP LRC VO Services GridFTP

30 29 Community Commons l What capabilities are available to VO? u Membership changes, state changes l Require mechanisms to aggregate and update VO information l Monitoring and discovery VO-specific indexes S S SS Information A A A FRESH MORE The age of information

31 30 Forming & Operating Scientific Communities l Define VO membership and roles, & enforce laws and community standards u I.e., policy l Build, buy, operate, & share community infrastructure u Data, programs, services, computing, storage, instruments u Service-oriented architecture l Define and perform collaborative work u Use shared infrastructure, roles, & policy u Manage community workflow

32 31 Collaborative Work Executed Executing Executable Not yet executable Query Edit Schedule Execution environment What I Did What I Want to Do What I Am Doing … Time

33 32 Managing Collaborative Work l Process as “workflow,” at different scales, e.g.: u Run 3-stage pipeline u Process data flowing from expt over a year u Engage in interactive analysis l Need to keep track of: u What I want to do (will evolve with new knowledge) u What I am doing now (evolve with system config.) u What I did (persistent; a source of information)

34 33 Trident: The GriPhyN Virtual Data System Abstract workflow Local planner DAGman DAG Statically Partitioned DAG DAGman & Condor-G Dynamically Planned DAG VDL Program Virtual Data catalog Virtual Data Workflow Generator Job Planner Job Cleanup Workflow spec Create Execution Plan Grid Workflow Execution

35 34 Workflow Generation
- Given: desired result and constraints
  - desired result (high-level, metadata description)
  - application components
  - resources in the Grid (dynamic, distributed)
  - constraints & preferences on solution quality
- Find: an executable job workflow
  - A configuration that generates the desired result
  - A specification of resources to be used
  - Sequence of operations: create agreement, move data, request operation
- May create workflow incrementally as information becomes available
"Mapping Abstract Complex Workflows onto Grid Environments," Deelman, Blythe, Gil, Kesselman, Mehta, Vahi, Arbree, Cavanaugh, Blackburn, Lazzarini, Koranda, 2003.
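As a rough illustration of turning an abstract workflow into an executable sequence, the sketch below orders tasks by dependency and inserts a data-movement step whenever a task runs at a different site from the task it depends on. It is not the planner described in the cited paper; the task names, sites, and step tuples are hypothetical, and resource selection is assumed to have happened already.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def plan(abstract, site_of):
    """abstract: {task: set of tasks it depends on}; site_of: {task: site name}."""
    steps = []
    for task in TopologicalSorter(abstract).static_order():
        for dep in sorted(abstract.get(task, ())):
            if site_of[dep] != site_of[task]:
                # Output of dep lives elsewhere: stage it to where task will run.
                steps.append(("move_data", dep, site_of[dep], site_of[task]))
        steps.append(("run", task, site_of[task]))
    return steps

# Hypothetical three-stage pipeline.
abstract = {"extract": set(), "simulate": {"extract"}, "plot": {"simulate"}}
site_of = {"extract": "siteA", "simulate": "siteB", "plot": "siteB"}
steps = plan(abstract, site_of)
for step in steps:
    print(step)
```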

36 35 Seismic Hazard Curve
[Plot: annual frequency of exceedance versus ground motion (peak ground acceleration, axis ticks 0.1 to 0.6)]
- Frequency levels marked: exceeded every year; exceeded 1 time in 10 years; 1 time in 100 years; 1 time in 1000 years; 1 time in 10,000 years
- Annotations: ground motion that will be exceeded every year; ground motion that a person can expect to be exceeded during their lifetime; typical design for buildings; typical design for hospitals; typical design for nuclear power plant; 10% probability of exceedance in 50 years
- Also marked: Carl’s house during Northridge; minor damage; moderate damage
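The "10% probability of exceedance in 50 years" label can be related to the annual-frequency axis with a short calculation. The slide does not state a model; the sketch below assumes exceedances follow a Poisson process, so that P = 1 - exp(-rate × years).

```python
import math

def annual_rate(prob, years):
    """Annual exceedance rate implied by probability `prob` of at least one
    exceedance in `years`, ASSUMING a Poisson process (not stated on the slide)."""
    return -math.log(1.0 - prob) / years

rate = annual_rate(0.10, 50)
return_period = 1.0 / rate
print(f"annual rate ~ {rate:.5f}, return period ~ {return_period:.0f} years")
```

Under that assumption, the 10%-in-50-years level sits between the "1 time in 100 years" and "1 time in 1000 years" marks on the curve.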

37 36 SCEC Cybershake l Calculate hazard curves by generating synthetic seismograms from estimated rupture forecast Rupture Forecast Synthetic Seismogram Strain Green Tensor Hazard Curve Spectral Acceleration Hazard Map

38 37 Cybershake on the SCEC VO TeraGrid Compute TeraGrid Storage VO Scheduler Workflow Scheduler/Engine VO Service Catalog Provenance Catalog Data Catalog SCEC Storage

39 38 SCEC is using Pegasus among other tools to generate hazard maps for the LA area Some uses of Pegasus Runs on the TeraGrid in 2005

40 39 Summary (1): Community Services l Community roll, city hall, permits, licensing & police force u Assertions, policy, attribute & authorization services l Directories, maps u Information services l City services: power, water, sewer u Deployed services l Shops, businesses u Composed services l Day-to-day activities u Workflows, visualization l Tax board, fees, economic considerations u Barter, planned economy, eventually markets

41 40 Summary (2) l Community based science will be the norm u Requires collaborations across sciences— including computer science l Many different types of communities u Differ in coupling, membership, lifetime, size l Must think beyond science stovepipes u Increasingly the community infrastructure will become the scientific observatory l Scaling requires a separation of concerns u Providers of resources, services, content l Small set of fundamental mechanisms required to build communities

42 41 The Globus Toolkit l Background l Globus Toolkit l Future directions l Related tools l Opportunities for collaboration

43 42 Don’t take our word for it! Read the UK eScience Evaluation of GT4 www.nesc.ac.uk/technical_papers/UKeS-2005-03.pdf (Reachable from www.globus.org, under “News”)

44 43 The Role of the Globus Toolkit l A collection of solutions to problems that come up frequently when building collaborative distributed applications l Heterogeneity u A focus, in particular, on overcoming heterogeneity for application developers l Standards u We capitalize on and encourage use of existing standards (IETF, W3C, OASIS, GGF) u GT also includes reference implementations of new/proposed standards in these organizations

45 44 The Application-Infrastructure Gap Dynamic and/or Distributed Applications A 1 B 1 9 9 Shared Distributed Infrastructure

46 45 Provisioning Bridging the Gap: Grid Infrastructure l Service-oriented Grid infrastructure u Provision physical resources to support application workloads Appln Service Users Workflows Composition Invocation l Service-oriented applications u Wrap applications as services u Compose applications into workflows

47 46 Layers in the Grid

48 47 A Typical eScience Use of Globus: Network for Earthquake Eng. Simulation Links instruments, data, computers, people

49 48 Without the Globus Toolkit Web Browser Compute Server Data Catalog Data Viewer Tool Certificate authority Chat Tool Credential Repository Web Portal Compute Server Resources implement standard access & management interfaces Collective services aggregate &/or virtualize resources Users work with client applications Application services organize VOs & enable access to other services Database service Database service Database service Simulation Tool Camera Telepresence Monitor Registration Service A B C D E Application Developer 10 Off the Shelf 12 Globus Toolkit 0 Grid Community 0

50 49 With the Globus Toolkit Web Browser Compute Server Globus MCS/RLS Data Viewer Tool Certificate Authority CHEF Chat Teamlet MyProxy CHEF Compute Server Resources implement standard access & management interfaces Collective services aggregate &/or virtualize resources Users work with client applications Application services organize VOs & enable access to other services Database service Database service Database service Simulation Tool Camera Telepresence Monitor Globus Index Service Globus GRAM Globus DAI Application Developer 2 Off the Shelf 9 Globus Toolkit 4 Grid Community 4

51 50 The Globus Toolkit: “Standard Plumbing” for the Grid l Not turnkey solutions, but building blocks & tools for application developers & system integrators u Some components (e.g., file transfer) go farther than others (e.g., remote job submission) toward end-user relevance l Easier to reuse than to reinvent u Compatibility with other Grid systems comes for free l Today the majority of the GT public interfaces are usable by application developers and system integrators u Relatively few end-user interfaces u In general, not intended for direct use by end users (scientists, engineers, marketing specialists)

52 51 Globus is Grid Infrastructure l Software for Grid infrastructure u Service enable new & existing resources u E.g., GRAM on computer, GridFTP on storage system, custom application service u Uniform abstractions & mechanisms l Tools to build applications that exploit Grid infrastructure u Registries, security, data management, … l Open source & open standards u Each empowers the other l Enabler of a rich tool & service ecosystem

53 52 An eBusiness Use of Globus: SAP Demonstration @ GlobusWorld l 3 Globus-enabled applns: u CRM: Internet Pricing Configurator (IPC) u CRM: Workforce Management (WFM) u SCM: Advanced Planner & Optimizer (APO) l Applications modified to: u Adjust to varying demand & resources u Use Globus to discover & provision resources IPC Dispatcher IPC Server Request: Price Query Delegation of Request Response: Pricelist Depending on: - Time - Discount - Number of Items - … Web Browsers / Batch Processes (typically several thousand requests) IPC Server 1 2 2 3 SAP AG R/3 Internet Pricing & Configurator (IPC)

54 53 The Globus Toolkit is a Collection of Components l A set of loosely-coupled components, with: u Services and clients u Libraries u Development tools l GT components are used to build Grid- based applications and services u GT can be viewed as a Grid SDK l GT components can be categorized across two different dimensions u By broad domain area u By protocol support

55 54 Our Goals for GT4 l Usability, reliability, scalability, … u Web service components have quality equal or superior to pre-WS components u Documentation at acceptable quality level l Consistency with latest standards (WS-*, WSRF, WS-N, etc.) and Apache platform u WS-I Basic Profile compliant u WS-I Basic Security Profile compliant l New components, platforms, languages u And links to larger Globus ecosystem

56 55 GT Domain Areas l Core runtime u Infrastructure for building new services l Security u Apply uniform policy across distinct systems l Execution management u Provision, deploy, & manage services l Data management u Discover, transfer, & access large data l Monitoring u Discover & monitor dynamic services

57 56 Data Mgmt Security Common Runtime Execution Mgmt Info Services GridFTP Authentication Authorization Reliable File Transfer Data Access & Integration Grid Resource Allocation & Management Index Community Authorization Data Replication Community Scheduling Framework Delegation Replica Location Trigger Java Runtime C Runtime Python Runtime WebMDS Workspace Management Grid Telecontrol Protocol Globus Toolkit v4 www.globus.org Credential Mgmt Globus Toolkit: Open Source Grid Infrastructure

58 57 GT Protocols l Web service protocols u WSDL, SOAP u WS Addressing, WSRF, WSN u WS Security, SAML, XACML u WS-Interoperability profile l Non Web service protocols u Standards-based, such as GridFTP u Custom

59 58 “Stateless” vs. “Stateful” Services l Without state, how does client: u Determine what happened (success/failure)? u Find out how many files completed? u Receive updates when interesting events arise? u Terminate a request? l Few useful services are truly “stateless”, but WS interfaces alone do not provide built-in support for state Client FileTransfer Service move (A to B) move

60 59 FileTransferService (without WSRF)
- Developer reinvents the wheel for each new service
  - Custom management and identification of state: transferID
  - Custom operations to inspect state synchronously (whatHappen) and asynchronously (tellMeWhen)
  - Custom lifetime operation (cancel)
[Diagram: Client calls move (A to B) on the FileTransfer Service and receives a transferID; the service holds the move state and exposes whatHappen, tellMeWhen, and cancel]

61 60 WSRF in a Nutshell
- Service
- State representation
  - Resource
  - Resource Property
- State identification
  - Endpoint Reference
- State interfaces
  - GetRP, QueryRPs, GetMultipleRPs, SetRP
- Lifetime interfaces
  - SetTerminationTime
  - ImmediateDestruction
- Notification interfaces
  - Subscribe
  - Notify
- ServiceGroups
[Diagram: a Service fronting a Resource with RPs, addressed by an EPR, with operations GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, Destroy]

62 61 FileTransferService (w/ WSRF)
- Developer specifies a custom method to createResource and leaves the rest to WSRF standards:
  - State exposed as Resource + Resource Properties and identified by Endpoint Reference (EPR)
  - State inspected by standard interfaces (GetRP, QueryRPs)
  - Lifetime management by standard interfaces (Destroy)
[Diagram: Client calls createResource (A to B) on the FileTransferService and receives an EPR; the Transfer resource exposes RPs via getRP, queryRPs, and destroy]
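The division of labor described here, one service-specific create operation plus generic state and lifetime operations, can be mimicked in a few lines. This is an in-memory analogy only, not the WSRF wire protocol or the GT4 API: the class names, the UUID key standing in for an EPR, and the `sweep` helper are assumptions made for illustration.

```python
import time
import uuid

class TransferResource:
    """State of one transfer, exposed as named resource properties."""
    def __init__(self, source, dest):
        self.properties = {"source": source, "dest": dest, "filesCompleted": 0}
        self.termination_time = None  # None means no scheduled destruction

class ResourceHome:
    """Maps opaque keys (standing in for EPRs) to resources."""
    def __init__(self):
        self._resources = {}

    def create_resource(self, source, dest):     # the only service-specific operation
        key = str(uuid.uuid4())
        self._resources[key] = TransferResource(source, dest)
        return key

    def get_rp(self, key, name):                 # generic state inspection
        return self._resources[key].properties[name]

    def set_termination_time(self, key, when):   # generic scheduled destruction
        self._resources[key].termination_time = when

    def destroy(self, key):                      # generic immediate destruction
        del self._resources[key]

    def sweep(self, now=None):
        """Remove resources whose termination time has passed; return how many."""
        now = time.time() if now is None else now
        expired = [k for k, r in self._resources.items()
                   if r.termination_time is not None and r.termination_time <= now]
        for k in expired:
            del self._resources[k]
        return len(expired)

home = ResourceHome()
epr = home.create_resource("A", "B")
print(home.get_rp(epr, "filesCompleted"))
```

Everything except `create_resource` would look the same for any other stateful service, which is the reuse the slide is pointing at.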

63 62 Stateful Entities Registry Service requestor (e.g., user application) Factory service Create Stateful Entity State Address Resource allocation Register Stateful Entity Discovery Interactions standardized using WSDL and SOAP State inspection Lifetime mgmt Notifications Authentication & Authorization are applied to all requests Modeling State in Web Services

64 63 Data Mgmt Security Common Runtime Execution Mgmt Info Services GridFTP Authentication Authorization Reliable File Transfer Data Access & Integration Grid Resource Allocation & Management Index Community Authorization Data Replication Community Scheduling Framework Delegation Replica Location Trigger Java Runtime C Runtime Python Runtime WebMDS Workspace Management Grid Telecontrol Protocol Globus Toolkit v4 www.globus.org Credential Mgmt Globus Toolkit: Open Source Grid Infrastructure

65 64 GT4 Web Services Runtime l Supports both GT (GRAM, RFT, Delegation, etc.) & user-developed services l Redesign to enhance scalability, modularity, performance, usability l Leverages existing WS standards u WS-I Basic Profile: WSDL, SOAP, etc. u WS-Security, WS-Addressing l Adds support for emerging WS standards u WS-Resource Framework, WS-Notification l Java, Python, & C hosting environments u Java is standard Apache

66 65 GT4 WS Core in a Nutshell
[Diagram: a Service fronting a Resource with RPs, addressed by an EPR, with operations GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, Destroy]
- Implementation of WSRF: Resources, EndpointReferences, ResourceProperties
- Operation Providers: pre-built implementations of WSRF operations
- Notification implementation: Topics, TopicSet, embedded Notification Consumer service
- Implementations of Resources (ReflectionResource, PersistentReflectionResource) and ResourceProperties (SimpleResourceProperty, ReflectionResourceProperty)

67 66 GT4 WS Core in a Nutshell
[Diagram: a Service Container hosting several services, each with a ResourceHome, a Resource with RPs, an EPR, and the operations GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, Destroy]
- Service Container: hosts multiple services in one container; one JVM process
- More details: based on the AXIS service container, processes SOAP messages, ResourceContext extension

68 67 GT4 WS Core in a Nutshell
[Diagram: a Service Container hosting several services, each with a ResourceHome, a Resource with RPs, and an EPR, plus PIP and PDP components]
- Secure communication: Transport, Message, Conversation (Transport demonstrates best performance)
- Configurable security policies: Policy Information Points (PIPs) and Policy Decision Points (PDPs), chained
- Example authorization PDPs: GridMap, SAML implementations, XACML policies
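A chain of PIPs and PDPs can be pictured as functions applied in order: PIPs add attributes to the request, then PDPs each vote on it. The sketch below is an analogy, not the GT4 authorization API; the function names, the in-memory role table, and the "every PDP must permit" combining rule are assumptions.

```python
def role_pip(request):
    """PIP: enrich the request with attributes (hypothetical in-memory role table)."""
    roles = {"alice": "member", "mallory": "guest"}
    return {**request, "role": roles.get(request["subject"], "unknown")}

def gridmap_pdp(request):
    """PDP: permit only subjects listed in a hypothetical grid-map."""
    return request["subject"] in {"alice", "mallory"}

def role_pdp(request):
    """PDP: permit 'destroy' only for members; other operations pass."""
    return request["operation"] != "destroy" or request["role"] == "member"

def authorize(request, pips, pdps):
    """Run all PIPs, then require every PDP to permit (assumed combining rule)."""
    for pip in pips:
        request = pip(request)
    return all(pdp(request) for pdp in pdps)

pips, pdps = [role_pip], [gridmap_pdp, role_pdp]
print(authorize({"subject": "alice", "operation": "destroy"}, pips, pdps))
```

Because each stage is a plain function, swapping a grid-map check for a SAML or XACML decision point would only change the list passed to `authorize`.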

69 68 GT4 WS Core in a Nutshell
[Diagram: the Service Container, with its services, ResourceHomes, PIP, and PDP, shown inside Apache Tomcat alongside WorkManager, DB Conn Pool, and JNDI Directory]
- Deploy the Service Container “standalone” or within Apache Tomcat

70 69 Custom Web Services WS-Addressing, WSRF, WS-Notification Custom WSRF Web Services GT4 WSRF Web Services WSDL, SOAP, WS-Security User Applications Registry Administration GT4 Container GT4 Web Services Runtime

71 70 GetRP Test
Distributed client and service on same LAN (times in milliseconds)
[Bar chart comparing GT4 - Java, GT4 - C, pyGridWare, WSRF::Lite, and WSRF.NET under three settings. Plotted values as extracted: No Security 10.05, 2.34, 25.57, 17.1, 8.23; X509 Signing 181.96, 14.8, 140.5, 81.39, N/A; HTTPS 11.46, 2.85, 12.91, 55.6, 149.67]

72 71 GT4 WS Core Performance

(1) Message-level security (times in milliseconds)
Operation   GT4 Java   GT4 C   GT4 Python   WSRF.NET
GetRP       181.96     14.77   140.50       81.39
SetRP       182.04     14.99   142.21       82.48
CreateR     188.46     14.98   132.26       96.22
DestroyR    182.03     15.76   136.12       86.89
Notify      219.51     N/A     244.93       101.57

(2) Transport-level security (times in milliseconds)
Operation   GT4 Java   GT4 C   GT4 Python   WSRF.NET
getRP       11.46      2.85    149.67       12.91
setRP       11.47      2.86    150.79       12.3
createR     18.00      2.82    132.60       20.84
destroyR    14.92      2.71    149.21       16.05
Notify      29.26      9.67    169.07       45.0

“WSRF/WSNs Compared,” HPDC 2005.

73 72 Data Mgmt Security Common Runtime Execution Mgmt Info Services GridFTP Authentication Authorization Reliable File Transfer Data Access & Integration Grid Resource Allocation & Management Index Community Authorization Data Replication Community Scheduling Framework Delegation Replica Location Trigger Java Runtime C Runtime Python Runtime WebMDS Workspace Management Grid Telecontrol Protocol Globus Toolkit v4 www.globus.org Credential Mgmt Globus Toolkit: Open Source Grid Infrastructure

74 73 Globus Security l Control access to shared services u Address autonomous management, e.g., different policy in different work-groups l Support multi-user collaborations u Federate through mutually trusted services u Local policy authorities rule l Allow users and application communities to set up dynamic trust domains u Personal/VO collection of resources working together based on trust of user/VO

75 74 Virtual Organization (VO) Concept l VO for each application or workload l Carve out and configure resources for a particular use and set of users

76 75 GT4 Security VO Rights Users Rights’ Compute Center Access Services (running on user’s behalf) Rights Local policy on VO identity or attribute authority CAS or VOMS issuing SAML or X.509 ACs SSL/WS-Security with Proxy Certificates Authz Callout: SAML, XACML KCA MyProxy

77 76 GT4 Security
- Public-key-based authentication
- Extensible authorization framework based on Web services standards
  - SAML-based authorization callout
    - As specified in GGF OGSA-Authz WG
  - Integrated policy decision engine
    - XACML policy language, per-operation policies, pluggable
- Credential management service
  - MyProxy (one-time password support)
- Community Authorization Service
- Standalone delegation service

78 77 GT4’s Use of Security Standards
[Figure annotations for the three security options shown: Supported, but slow; Supported, but insecure; Fastest, so default]

79 78 GT-XACML Integration
- eXtensible Access Control Markup Language
  - OASIS standard, open source implementations
- XACML: sophisticated policy language
- Globus Toolkit ships with XACML runtime
  - Included in every client and server built on GT
  - Turned on through configuration
- … that can be called transparently from runtime and/or explicitly from application …
- … and we use the XACML “model” for our Authz Processing Framework

80 79 GT Authorization Framework

81 80 Other Security Services Include … l MyProxy u Simplified credential management u Web portal integration u Single-sign-on support l KCA & kx.509 u Bridging into/out-of Kerberos domains l SimpleCA u Online credential generation l PERMIS u Authorization service callout

82 81 Data Mgmt Security Common Runtime Execution Mgmt Info Services GridFTP Authentication Authorization Reliable File Transfer Data Access & Integration Grid Resource Allocation & Management Index Community Authorization Data Replication Community Scheduling Framework Delegation Replica Location Trigger Java Runtime C Runtime Python Runtime WebMDS Workspace Management Grid Telecontrol Protocol Globus Toolkit v4 www.globus.org Credential Mgmt Globus Toolkit: Open Source Grid Infrastructure

83 82 GT4 Data Management
- Stage/move large data to/from nodes
  - GridFTP, Reliable File Transfer (RFT)
  - Alone, and integrated with GRAM
- Locate data of interest
  - Replica Location Service (RLS)
- Replicate data for performance/reliability
  - Data Replication Service (DRS)
- Provide access to diverse data sources
  - File systems, parallel file systems, hierarchical storage: GridFTP
  - Databases: OGSA DAI

84 83 GridFTP in GT4
- 100% Globus code
  - No licensing issues
  - Stable, extensible
- IPv6 support
- XIO for different transports
- Striping → multi-Gb/sec wide area transport
  - 27 Gbit/s on 30 Gbit/s link
- Pluggable
  - Front-end: e.g., future WS control channel
  - Back-end: e.g., HPSS, cluster file systems
  - Transfer: e.g., UDP, NetBLT transport
[Plot: Disk-to-disk on TeraGrid]

85 84 Reliable File Transfer: Third Party Transfer
- Fire-and-forget transfer
- Web services interface
- Many files & directories
- Integrated failure recovery
- Has transferred 900K files
[Diagram: an RFT Client exchanges SOAP messages and optional notifications with the RFT Service, which drives two GridFTP servers; each server has a Protocol Interpreter, Master DSI, Slave DSI, IPC Link, IPC Receiver, and Data Channel, with data flowing between the servers' data channels]

86 85 Replica Location Service
- Identify location of files via logical to physical name map
- Distributed indexing of names, fault tolerant update protocols
- GT4 version scalable & stable
- Managing ~40 million files across ~10 sites
[Diagram: Local DB sends updates to Index]

Local DB   Update send (secs)   Bloom filter (secs)   Bloom filter (bits)
10K        <1                   2                     1 M
1 M        2                    24                    10 M
5 M        7                    175                   50 M
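The Bloom filter columns on this slide refer to sending an index a compact summary of the names in a local catalog instead of the full list. The sketch below is a generic Bloom filter, not the RLS implementation; the hash construction, the three hash functions, and the sizing of 10 bits per name are illustrative choices.

```python
import hashlib

class BloomFilter:
    """Compact set summary: no false negatives, some false positives."""
    def __init__(self, num_bits, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, name):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{name}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, name):
        for pos in self._positions(name):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, name):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(name))

# Hypothetical logical file names held by one local catalog.
names = [f"lfn-{i:05d}" for i in range(1000)]
summary = BloomFilter(num_bits=10 * len(names))
for n in names:
    summary.add(n)
print("lfn-00042" in summary, len(summary.bits), "bytes")
```

An index holding such summaries can answer "which sites might have this file?" cheaply, at the cost of occasionally asking a site that turns out not to have it.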

87 86 Data Mgmt Security Common Runtime Execution Mgmt Info Services GridFTP Authentication Authorization Reliable File Transfer Data Access & Integration Grid Resource Allocation & Management Index Community Authorization Data Replication Community Scheduling Framework Delegation Replica Location Trigger Java Runtime C Runtime Python Runtime WebMDS Workspace Management Grid Telecontrol Protocol Globus Toolkit v4 www.globus.org Credential Mgmt Globus Toolkit: Open Source Grid Infrastructure

88 87 Execution Management (GRAM) l Common WS interface to schedulers u Unix, Condor, LSF, PBS, SGE, … l More generally: interface for process execution management u Lay down execution environment u Stage data u Monitor & manage lifecycle u Kill it, clean up l A basis for application-driven provisioning

89 88 GT4 WS GRAM l 2nd-generation WS implementation optimized for performance, flexibility, stability, scalability l Streamlined critical path u Use only what you need l Flexible credential management u Credential cache & delegation service l GridFTP & RFT used for data operations u Data staging & streaming output u Eliminates redundant GASS code

90 89 GRAM services GT4 Java Container GRAM services Delegation RFT File Transfer request GridFTP Remote storage element(s) Local scheduler User job Compute element GridFTP sudo GRAM adapter FTP control Local job control Delegate FTP data Client Job functions Delegate Service host(s) and compute element(s) GT4 WS GRAM Architecture SEG Job events

91 90 WS GRAM Performance l Time to submit a basic GRAM job u Pre-WS GRAM: < 1 second u WS GRAM: 2 seconds l Concurrent jobs u Pre-WS GRAM: 300 jobs u WS GRAM: 32,000 jobs l Various studies are underway to test latest software

92 91 www.opensciencegrid.org Jobs (2004) Open Science Grid l 50 sites (15,000 CPUs) & growing l 400 to >1000 concurrent jobs l Many applications + CS experiments; includes long-running production operations l Up since October 2003; few FTEs central ops

93 92 VO User Embedded Resource Management Cluster Resource Manager GRAM Cluster Resource Manager GRAM VO admin delegates credentials to be used by downstream VO services. VO admin starts the required services. VO jobs come in directly from the upstream VO Users VO job gets forwarded to the appropriate resource using the VO credentials Computational job started for VO Client-side VO Scheduler Other Services VO Admin... Monitoring and control Headnode Resource Manager GRAM Deleg VO User VO Job

94 93 Data Mgmt Security Common Runtime Execution Mgmt Info Services GridFTP Authentication Authorization Reliable File Transfer Data Access & Integration Grid Resource Allocation & Management Index Community Authorization Data Replication Community Scheduling Framework Delegation Replica Location Trigger Java Runtime C Runtime Python Runtime WebMDS Workspace Management Grid Telecontrol Protocol Globus Toolkit v4 www.globus.org Credential Mgmt Globus Toolkit: Open Source Grid Infrastructure

95 94 Monitoring and Discovery l “Every service should be monitorable and discoverable using common mechanisms” u WSRF/WSN provides those mechanisms l A common aggregator framework for collecting information from services, thus: u MDS-Index: Xpath queries, with caching u MDS-Trigger: perform action on condition u (MDS-Archiver: Xpath on historical data) l Deep integration with Globus containers & services: every GT4 service is discoverable u GRAM, RFT, GridFTP, CAS, …
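
As a rough illustration of the aggregator idea (an index queried with XPath, plus a trigger that acts on a condition), the following sketch uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath. The `Index` and `Trigger` classes, the `Entry` element, and the `FreeNodes` property are hypothetical names, not the MDS schema, and caching is omitted.

```python
import xml.etree.ElementTree as ET


class Index:
    """Aggregates per-service entries into one queryable XML document."""

    def __init__(self):
        self.root = ET.Element("Index")

    def register(self, service, host, **properties):
        # Each registered service becomes an <Entry> with child property elements.
        entry = ET.SubElement(self.root, "Entry",
                              {"service": service, "host": host})
        for name, value in properties.items():
            ET.SubElement(entry, name).text = str(value)
        return entry

    def query(self, xpath):
        # ElementTree implements a limited XPath subset, enough for this sketch.
        return self.root.findall(xpath)


class Trigger:
    """Runs an action for every entry that matches a query and a condition."""

    def __init__(self, xpath, condition, action):
        self.xpath = xpath
        self.condition = condition
        self.action = action

    def evaluate(self, index):
        fired = 0
        for entry in index.query(self.xpath):
            if self.condition(entry):
                self.action(entry)
                fired += 1
        return fired


if __name__ == "__main__":
    index = Index()
    index.register("GRAM", "cluster-a.example.org", FreeNodes=0)
    index.register("GRAM", "cluster-b.example.org", FreeNodes=12)
    index.register("RFT", "data.example.org", ActiveTransfers=3)

    alerts = []
    trigger = Trigger("./Entry[@service='GRAM']",
                      lambda e: e.findtext("FreeNodes") == "0",
                      lambda e: alerts.append(e.get("host")))
    trigger.evaluate(index)
    print(alerts)
```

The same index serves both uses on the slide: ad hoc discovery queries and condition-driven actions.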

96 95 GT4 Container GT4 Monitoring & Discovery GRAMUser MDS- Index GT4 Cont. RFT MDS- Index GT4 Container MDS- Index GridFTP adapter Registration & WSRF/WSN Access Custom protocols for non-WSRF entities Clients (e.g., WebMDS) Automated registration in container WS-ServiceGroup

97 96 Information Providers l GT4 information providers collect information from some system and make it accessible as WSRF resource properties l Growing number of information providers u Nagios, SGE, LSF, PBS l Many opportunities to build additional ones u E.g., network monitoring, storage systems, various sensors

98 97 Putting it Together: Data Replication Service l New capabilities can be built by: u Creating new services using GT4 containers u Composing/combining existing services

99 98 Cardiff AEI/Golm Birmingham Reliable Wide Area Data Replication Replicating >1 Terabyte/day to 8 sites >30 million replicas so far MTBF = 1 month LIGO Gravitational Wave Observatory www.globus.org/solutions

100 99 The Data Replication Service l Tech Preview in GT4.0 l Based on the publication component of the Lightweight Data Replicator system u Developed by Scott Koranda from U. Wisconsin at Milwaukee l Function to locally replicate a set of files u User identifies a set of desired files u DRS uses RLS to discover file locations u DRS uses RFT to create local replicas u DRS registers new replicas in RLS
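
The discover, transfer, register sequence listed on this slide can be sketched as a single function. This is only an in-memory model under stated assumptions: the catalog is a plain dictionary standing in for RLS, `transfer` is a caller-supplied callable standing in for an RFT request, and the `replicate` function and its result strings are hypothetical.

```python
def replicate(logical_names, catalog, transfer, local_prefix):
    """Create local replicas for each logical name and record them.

    catalog      -- dict mapping logical name -> set of physical URLs (RLS stand-in)
    transfer     -- callable(source_url, destination_url) -> bool (RFT stand-in)
    local_prefix -- URL prefix under which local copies are created
    """
    results = {}
    for lfn in logical_names:
        # Step 1: discover existing replicas of the requested file.
        sources = sorted(catalog.get(lfn, ()))
        if not sources:
            results[lfn] = "failed: no known replica"
            continue

        # Step 2: copy one replica to local storage.
        destination = local_prefix + lfn.rsplit("/", 1)[-1]
        if not transfer(sources[0], destination):
            results[lfn] = "failed: transfer error"
            continue

        # Step 3: register the new replica so that others can find it.
        catalog[lfn].add(destination)
        results[lfn] = "finished"
    return results


if __name__ == "__main__":
    catalog = {"lfn://ligo/frame-001": {"gsiftp://remote.example.org/f/frame-001"}}
    copied = []

    def fake_transfer(src, dst):
        copied.append((src, dst))
        return True

    print(replicate(["lfn://ligo/frame-001", "lfn://ligo/unknown"],
                    catalog, fake_transfer, "gsiftp://local.example.org/cache/"))
```

Returning a per-file status rather than raising on the first failure mirrors the idea that a replication request tracks the outcome of each file separately.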

101 100 Motivation for DRS l Need for higher-level data management services that integrate lower-level Grid functionality u Efficient data transfer (GridFTP, RFT) u Replica registration and discovery (RLS) u Eventually validation of replicas, etc. l Goal is to generalize the custom data management systems developed by several application communities l Eventually to provide a suite of configurable high-level data management services l DRS is the first of these services

102 101 Relationship to Other Globus Services At requesting site, deploy: l WS-RF Services u Data Replication Service u Delegation Service u Reliable File Transfer Service l Pre WS-RF Components u Replica Location Service (Local Replica Catalog and Replica Location Index) u GridFTP Server

103 102 Service Composition Overview Client Delegation Data Rep. RFT Replica Index Replica Catalog GridFTP Server GridFTP Server Replica Catalog Replica Catalog Replica Catalog Service Container MDS WSRF Services: Data Replication Reliable File Transfer Information Services Delegation Pre-WS Services: Replica Index Replica Catalog GridFTP Server

104 103 Service Container Data Replication Service Client Delegation Data Rep. RFT Replica Index Replica Catalog GridFTP Server GridFTP Server Replica Catalog Replica Catalog Replica Catalog MDS Replicator RP Data Replication: Coordinates replication by discovery, transfer, registration of replicas Replicator Resource: Represents state of replication request. Detailed state info for per- replica status Resource Properties: Status Stage Result Error Msg Count

105 104 Service Container Reliable File Transfer Service Client Delegation Data Rep. RFT Replica Index Replica Catalog GridFTP Server GridFTP Server Replica Catalog Replica Catalog Replica Catalog MDS Transfer RP Reliable File Transfer: Coordinates transfer of files using GridFTP servers. Transfer Resource: Represents state of transfer request. Detailed state info for per-file- transfer status Resource Properties: RequestStatus OverallStatus TotalBytes TotalTime

106 105 Service Container Delegation Service Client Delegation Data Rep. RFT Replica Index Replica Catalog GridFTP Server GridFTP Server Replica Catalog Replica Catalog Replica Catalog MDS Credential RP Delegation: Manages delegated credentials from client/user. Credential: Represents the user’s delegated credential as a stateful entity. Resource Properties: None except standard lifetime RPs (CurrentTime, TerminationTime)

107 106 Service Container Information Services (MDS) Client Delegation Data Rep. RFT Replica Index Replica Catalog GridFTP Server GridFTP Server Replica Catalog Replica Catalog Replica Catalog MDS Index RP MDS Information Services: Collects information from other Resources, may trigger events based on conditions. Index: an index of monitored resources. Trigger: an event / stimulus trigger (not pictured) Resource Properties: ResourceCount Entry: EPR + data from other Resources

108 107 Service Container Pre-WS Services Client Delegation Data Rep. RFT Replica Index Replica Catalog GridFTP Server GridFTP Server Replica Catalog Replica Catalog Replica Catalog MDS Replica Index: Index of replica information across catalogs Replica Catalog: Catalog of replica information, per-SE GridFTP Server: Transfer service based on GridFTP protocol

109 108 Service Container Create Delegated Credential Client Delegation Data Rep. RFT Replica Index Replica Catalog GridFTP Server GridFTP Server Replica Catalog Replica Catalog Replica Catalog MDS Credential RP proxy Initialize user proxy Create delegated credential resource Set termination time Credential EPR returned EPR

110 109 Service Container Create Replicator Resource Client Delegation Data Rep. RFT Replica Index Replica Catalog GridFTP Server GridFTP Server Replica Catalog Replica Catalog Replica Catalog MDS Credential RP Create Replicator resource Pass delegated credential EPR Set termination time Replicator EPR returned EPR Replicator RP Access delegated credential resource

111 110 Service Container Monitor Replicator Resource Client Delegation Data Rep. RFT Replica Index Replica Catalog GridFTP Server GridFTP Server Replica Catalog Replica Catalog Replica Catalog MDS Credential RP Replicator RP Periodically polls Replicator RP via GetRP or GetMultRP Add Replicator resource to MDS Information service Index Index RP Subscribe to ResourceProperty changes for “Status” RP and “Stage” RP Conditions may trigger alerts or other actions (Trigger service not pictured) EPR
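
The two monitoring styles on this slide, periodic polling of resource properties and subscription to property changes, can be contrasted in a few lines. The sketch below is a local, in-process model only: `StatefulResource` and its method names are hypothetical and merely echo the GetRP, GetMultRP, and subscribe operations named above; no WSRF/WSN messaging is involved.

```python
class StatefulResource:
    """Holds named resource properties and notifies subscribers on change."""

    def __init__(self, **properties):
        self._properties = dict(properties)
        self._subscribers = {}

    def get_rp(self, name):
        # Polling style: the client asks for one property value.
        return self._properties[name]

    def get_multiple_rps(self, names):
        # Polling style: the client asks for several values in one call.
        return {name: self._properties[name] for name in names}

    def subscribe(self, name, callback):
        # Notification style: the client registers interest in one property.
        self._subscribers.setdefault(name, []).append(callback)

    def set_rp(self, name, value):
        old = self._properties.get(name)
        self._properties[name] = value
        if old != value:
            # Only real changes produce notifications.
            for callback in self._subscribers.get(name, []):
                callback(name, old, value)


if __name__ == "__main__":
    replicator = StatefulResource(Status="Pending", Stage="discover")
    events = []
    replicator.subscribe("Stage", lambda n, old, new: events.append((old, new)))
    replicator.set_rp("Stage", "transfer")
    replicator.set_rp("Stage", "register")
    print(replicator.get_multiple_rps(["Status", "Stage"]), events)
```

Polling is simpler for a client behind a firewall, while subscription avoids repeated requests when state changes are rare.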

112 111 Service Container Query Replica Information Client Delegation Data Rep. RFT Replica Index Replica Catalog GridFTP Server GridFTP Server Replica Catalog Replica Catalog Replica Catalog MDS Credential RP Replicator RP Index RP Notification of “Stage” RP value changed to “discover” Replicator queries RLS Replica Index to find catalogs that contain desired replica information Replicator queries RLS Replica Catalog(s) to retrieve mappings from logical name to target name (URL)

113 112 Service Container Transfer Data Client Delegation Data Rep. RFT Replica Index Replica Catalog GridFTP Server GridFTP Server Replica Catalog Replica Catalog Replica Catalog MDS Credential RP Replicator RP Index RP Notification of “Stage” RP value changed to “transfer” Create Transfer resource Pass credential EPR Set Termination Time Transfer resource EPR returned Transfer RP EPR Access delegated credential resource Setup GridFTP Server transfer of file(s) Data transfer between GridFTP Server sites Periodically poll “ResultStatus” RP via GetRP When “Done”, get state information for each file transfer

114 113 Service Container Register Replica Information Client Delegation Data Rep. RFT Replica Index Replica Catalog GridFTP Server GridFTP Server Replica Catalog Replica Catalog Replica Catalog MDS Credential RP Replicator RP Index RP Notification of “Stage” RP value changed to “register” RLS Replica Catalog sends update of new replica mappings to the Replica Index Transfer RP Replicator registers new file mappings in RLS Replica Catalog

115 114 Service Container Client Inspection of State Client Delegation Data Rep. RFT Replica Index Replica Catalog GridFTP Server GridFTP Server Replica Catalog Replica Catalog Replica Catalog MDS Credential RP Replicator RP Index RP Notification of “Status” RP value changed to “Finished” Transfer RP Client inspects Replicator state information for each replication in the request

116 115 Service Container Resource Termination Client Delegation Data Rep. RFT Replica Index Replica Catalog GridFTP Server GridFTP Server Replica Catalog Replica Catalog Replica Catalog MDS Credential RP Replicator RP Index RP Termination time (set by client) expires eventually Transfer RP Resources destroyed (Credential, Transfer, Replicator) TIME
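
Lifetime management by termination time, as shown on this slide, amounts to recording an expiry per resource and destroying whatever has expired. The following sketch assumes a hypothetical `ResourceHome` class and uses plain numbers for time so that the behaviour is deterministic; it is not the container's actual mechanism.

```python
class ResourceHome:
    """Tracks resources by key and destroys them once their time is up."""

    def __init__(self):
        self._termination_times = {}

    def create(self, key, termination_time):
        self._termination_times[key] = termination_time

    def set_termination_time(self, key, termination_time):
        # A client can extend (or shorten) the lifetime of a live resource.
        if key not in self._termination_times:
            raise KeyError(key)
        self._termination_times[key] = termination_time

    def sweep(self, current_time):
        # Destroy every resource whose termination time has passed.
        expired = sorted(key for key, when in self._termination_times.items()
                         if when <= current_time)
        for key in expired:
            del self._termination_times[key]
        return expired

    def alive(self):
        return sorted(self._termination_times)


if __name__ == "__main__":
    home = ResourceHome()
    home.create("credential", termination_time=100)
    home.create("replicator", termination_time=100)
    home.create("transfer", termination_time=60)
    home.set_termination_time("credential", 200)   # client extends the lifetime
    print(home.sweep(current_time=120), home.alive())
```

Because expiry is the default outcome, a client that disappears without cleaning up does not leave state behind indefinitely.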

117 116 DRS: WSDL (PortType) … … … … … … … … … … … …

118 117 DRS: WSDL (RPs) … … … … … … … … … … … … … …

119 118 DRS: Stubs PortTypeMessagesTypes

120 119 DRS: Implementation WS Core DRS

121 120 Development and Deployment Process 1. Define Interface (WSDL, XSD) 2. WSDLPP (adds standard operations) 3. Generate Stubs 4. Develop service, resource, custom logic 5. Describe deployment (WSDD) 6. Configure JNDI properties 7. Build, package as GAR, and deploy to container 8. Deploy to Service Container Define Interface (WSDL) Insert Std. Operations (WSDLPP) Generate Stubs Develop Service, Resource, etc. Describe Deployment (WSDD) Configure (JNDI Props) Build & Package (GAR) Deploy to Service Container

122 121 The Globus Ecosystem l Globus components address core issues relating to resource access, monitoring, discovery, security, data movement, etc. u GT4 being the latest version l A larger Globus ecosystem of open source and proprietary components provides complementary functionality u A growing list of components l These components can be combined to produce solutions to Grid problems u We’re building a list of such solutions

123 122 Many Tools Build on, or Can Contribute to, GT4-Based Grids l Condor-G, DAGman l MPICH-G2 l GRMS l Nimrod-G l Ninf-G l Open Grid Computing Env. l Commodity Grid Toolkit l GriPhyN Virtual Data System l Virtual Data Toolkit l GridXpert Synergy l Platform Globus Toolkit l VOMS l PERMIS l GT4IDE l Sun Grid Engine l PBS scheduler l LSF scheduler l GridBus l TeraGrid CTSS l NEES l IBM Grid Toolbox l …

124 123 Example Solutions l Portal-based User Reg. System (PURSE) l VO Management Registration Service l Service Monitoring Service l TeraGrid TGCP Tool l Lightweight Data Replicator l GriPhyN Virtual Data System

125 124 The Globus Commitment to Open Source l Globus was first established as an open source project in 1996 l The Globus Toolkit is open source to: u allow for inspection l for consideration in standardization processes u encourage adoption l in pursuit of ubiquity and interoperability u encourage contributions l harness the expertise of the community l The Globus Toolkit is distributed under the (BSD-style) Apache License version 2

126 125 The Future: Structure l NSF Community Driven Improvement of Globus Software (CDIGS) project u 5 years of funding for GT enhancement u Regular Globus roadmaps outlining plans l GlobDev http://dev.globus.org u Apache-like community development site u Community governance of components u “Globus Toolkit” & other related software u Open for business early 2006 u “Globus Alliance” = “GlobDev committers”

127 126 The Future: Content l We now have a solid and extremely powerful Web services base l Next, we will build an expanded open source Grid infrastructure u Virtualization u New services for provisioning, data management, security, VO management u End-user tools for application development u Etc., etc. l And of course responding to user requests for other short-term needs

128 127 What to Expect from the Globus Alliance in the Coming Months l Support for users of GT4 u Working to make sure the toolkit meets user needs u Answering questions on the mailing lists u Further improving documentation l Normal evolution of performance, scalability and feature enhancements l Further development of tools and services in support of VOs l Expanding contributions to Globus

129 128 GlobDev l The current set of Globus components will be organized into several “Globus Projects” u Projects release products l Each project will have its own group of “Committers” u committers are responsible for governance on matters relating to their products l The “Globus Management Committee” will u provide overall guidance and conflict resolution u approve the creation of new Globus Projects

130 129 Opportunities for Collaboration l Use of Globus software u Feedback & involvement in design l Development of new Globus components u E.g., new information providers to enable use of GT to manage an entire Grid u Examples and documentation l Globalization and localization of software l New applications and tools u E.g., Grid operations, emergency response, ecogrid, bioinformatics, …

131 130 For More Information l Globus Alliance u www.globus.org l NMI and GRIDS Center u www.nsf-middleware.org u www.grids-center.org l Infrastructure u www.opensciencegrid.org u www.teragrid.org 2nd Edition www.mkp.com/grid2

132 131 For More Information l GT4 Programming

