OOI CI R2 Life Cycle Objectives Review Aug 30 - Sep
Ocean Observatories Initiative
OOI CI Release 2 Life Cycle Objectives Review
Common Execution Infrastructure (CEI) Subsystem
OOI CI System Architecture Team
R1 PRR and R2 LCO Review
La Jolla, CA
Aug-30 to Sep

Agenda
  – Introduction to the Subsystem
  – Release 1 Accomplishments
  – Release 2 Scope
  – Use Case Overview
  – Architecture and Design
  – Technologies
  – CEI Inception Accomplishments
  – Preliminary Elaboration Plan

Subsystem Purpose
  – Allow OOI applications and systems to scale to demand
  – Provide Highly Available (HA) services
  – Enact OOI deployment policies in an elastic environment
  – Provide a deployment foundation for OOI CI

Subsystem Team
Name                | Organization | Role             | Area of Focus
Michael Meisinger   | UCSD         | Architect        | System interfaces
Stephen Henrie      | UCSD         | Architect        | System interfaces
Kate Keahey (pt)    | U of Chicago | Designer         | CEI Design
Tim Freeman (pt)    | U of Chicago | Senior Developer | CEI
John Bresnahan (pt) | U of Chicago | Developer        | CEI
David LaBissoniere  | U of Chicago | Developer        | CEI
Patrick Armstrong   | U of Chicago | Developer        | CEI
Susanne Jul         | ACL          | UX Designer      | UX liaison

Core System Structure: Service Layers

Release 1 Accomplishments

Realizing High Availability and Scalability
[Figure: Elastic Processing Unit (EPU) Design. Components: Provisioner, DTRS, Decision Engine, EPU Management, IaaS, CC instance, EE type A instances with ou-agent, HA-App-v1. Labels: queue length, per-node status, health, create instance. Legend: AMQP, Other.]
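As a rough illustration of how a Decision Engine might turn the queue-length sensor into "create instance" requests, the Python sketch below maps an observed queue length to a target instance count. The `ScalingPolicy` fields, their default values, and the function names are assumptions made for this example; they are not taken from the CEI implementation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical policy knobs; not taken from the OOI CEI code base."""
    messages_per_instance: int = 10  # assumed queue depth one worker can absorb
    min_instances: int = 1           # floor that keeps the service available
    max_instances: int = 8           # ceiling that caps IaaS usage


def desired_instances(queue_length: int, policy: ScalingPolicy) -> int:
    """Map the observed queue length to a target instance count."""
    needed = math.ceil(queue_length / policy.messages_per_instance)
    return max(policy.min_instances, min(policy.max_instances, needed))


def decide(queue_length: int, healthy_instances: int, policy: ScalingPolicy) -> int:
    """Return how many instances to create (positive) or retire (negative)."""
    return desired_instances(queue_length, policy) - healthy_instances
```

With these assumed defaults, `decide(35, 2, ScalingPolicy())` asks for two more instances, while an empty queue with three healthy instances yields a request to retire two. Counting only healthy instances means a failed node shows up as a shortfall, so the same rule covers both scaling and replacement.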

Reliable Management of Complex Deployments
cloudinit.d: repeatable deployment and monitoring of complex systems
  – Write a launch plan once, deploy many times
  – Coordination of interdependent launches
  – User-defined launch tests
  – Test-based monitoring and repair
  – Lightweight and easy to use
Users: “one click” deployment
Developers: “stem cell” approach
[Figure: Launch plan containing a database and a Web Server.]
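The idea of coordinating interdependent launches and checking each one with a user-defined test can be sketched in Python as follows. This is not cloudinit.d's configuration format or API; the `LAUNCH_PLAN` dictionary, `launch_order`, and `deploy` are hypothetical names chosen for illustration, and the example relies on `graphlib` from the standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical launch plan: each service maps to the services it depends on.
LAUNCH_PLAN = {
    "database": set(),
    "web_server": {"database"},
}


def launch_order(plan):
    """Group services into levels; each level depends only on earlier levels."""
    sorter = TopologicalSorter(plan)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    levels = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        levels.append(ready)
        sorter.done(*ready)
    return levels


def deploy(plan, launch, test):
    """Launch level by level; return the first service whose test fails, else None."""
    for level in launch_order(plan):
        for service in level:
            launch(service)
            if not test(service):
                return service
    return None
```

Because the plan is plain data, the same `LAUNCH_PLAN` can be deployed repeatedly, and the `test` callback can be rerun later as a monitoring check.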

Capabilities
Node monitoring
  – State model and introspection
  – Status view of the system
Node repair and replacement
  – Automatic replacement of problematic or failed nodes
  – Manual replacement using the EPU “reconfigure” operation, e.g., to update nodes with new versions
Queue length monitoring as a scalability sensor
Contextualization/recipes that bring the ION services up with many different configurations
System state persistence (system restart)
Fast restart mechanism via supervisord
Significant participation in integration activities
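A minimal Python sketch of the monitoring and replacement capability, assuming a simple four-state node model and heartbeat-style status reports. The `NodeState` values, the `plan_repairs` function, and the 60-second timeout are illustrative assumptions; the slides do not spell out the actual EPU state model.

```python
from enum import Enum


class NodeState(Enum):
    """Hypothetical node states; the real EPU state model is not shown here."""
    PENDING = "pending"
    RUNNING = "running"
    FAILED = "failed"
    TERMINATED = "terminated"


def plan_repairs(nodes, last_heartbeat, now, timeout=60.0):
    """Return the ids of nodes that should be replaced.

    nodes: dict of node id -> NodeState
    last_heartbeat: dict of node id -> time (seconds) of the last status report
    """
    to_replace = []
    for node_id, state in sorted(nodes.items()):
        if state is NodeState.FAILED:
            # Explicitly failed nodes are always replaced.
            to_replace.append(node_id)
        elif state is NodeState.RUNNING:
            # Running nodes that have gone silent are treated as problematic.
            silent_for = now - last_heartbeat.get(node_id, float("-inf"))
            if silent_for > timeout:
                to_replace.append(node_id)
    return to_replace
```

Pending and terminated nodes are deliberately left alone in this sketch, so a node that is still booting is not replaced before it has had a chance to report.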

Non-functional Considerations
Enforcing HA policies
Scalability and reliability
  – Zero to 400 VMs on EC2 in < 2 minutes
  – Keeping the VMs up despite failures
Reducing boot time
  – Individual VMs: ~3 orders of magnitude (using caching and precompiled elements)
  – Overall system: boot time reduced by a factor of 2

Array Network Facility (ANF) Demo
Using elastic processing to operate on seismic data
Developed in collaboration with ASA/DM

Release 2 Scope

Release 2: Data Distribution and Stream Processing

R2 Scope
Elastic Computing Services
  – Extend and refine the elastic computing services to provide a dynamically configurable Decision Engine that is central/global to an administrative domain.
Execution Engine Catalog & Repository Services
  – Work with operations and ITV to develop and refine tools to upload and sync the different deployable type representations adapted to each site.
Resource Management Services
  – Establish standard models for the operational management (monitor & control) of stateful and taskable resources.
Process Management Services
  – Provide the validation and management services for policy-based process execution within specified deployable types intended to support the data distribution services; as such, the processes are sequential and primarily require a process-to-resource match.
Process Catalog & Repository Services
  – Maintain process definitions as well as lists of active processes.
Integration with the National Computing Infrastructure
  – Provide the capability to deploy OOI processing on the Amazon cloud services for this release.

R2 Use Case Overview
Service areas: Elastic Computing Services; Execution Engine Catalog and Repository Services; Resource Management Services
Use cases:
  – UC.R2.44 Define Service Type During Runtime
  – UC.R2.45 Instantiate Service Anywhere During Runtime
  – UC.R2.50 Define Scaling Policy
  – UC.R2.51 Define Execution Engine: Add new type of process execution engine to ION
  – UC.R2.46 Operate Integrated Observatory Network: Manage ION system and respond to requests
Use case summaries:
  – Allocate services where need is greatest
  – Configure service once, deploy many times
  – Use UI to define policy to effect EPU scaling

R2 Use Case Overview (cont'd)
Service areas: Process Catalog and Repository Services; Integration with National Computing Infrastructure; Process Management Services
Use cases:
  – UC.R2.47 Define Executable Process: Define, register, and execute a source-code-derived process
  – UC.R2.48 Schedule Process for Execution: Schedule a data stream process for execution
  – UC.R2.49 Deploy Distributed Processes: Define service and process instantiation location and scheduling
  – UC.R2.52 Manage ION Processes: Monitor and control all system processes and environments

Architecture and Design

R1 EPU Design Overview
[Figure: R1 EPU design. Components: Provisioner, DTRS, Decision Engine, EPU Management, IaaS, CC instance, EE type A instance with ou-agent, HA-App-v1. Labels: queue length, per-node status, health, create instance. Legend: AMQP, Other.]

R1 EPU Design Scope
Static payload per VM (no process management)
  – Limits to scaling, performance
Support for ION-based processes only
One EPU per policy per Execution Engine
  – The EPU is not EPU-ified
  – No “global” policy scope
  – Potential for non-uniform behavior

R2 Elastic Processing Unit
[Figure: R2 EPU. Components: EPU Management, Decision Engine, Provisioner/DTRS, IaaS, and three execution engines (EE ioncore 1.2, EE matlab 6.1, EE matlab 7.2), each with a context-agent and an ou-agent. Labels: create instance. Legend: AMQP, Other.]

Making the EPU HA
[Figure: Components: EPU Worker with ou-agent, Bootstrap EPU, Dedicated DE, Provisioner/DTRS, IaaS. Labels: create instance. Legend: AMQP, Other.]

Creating a Process I
[Figure: Components: Process Definition Registry, Process Dispatcher, Process Instance Registry, Decision Engine, EE type A instance with ee-agent. Labels: request to activate process X, lookup, launch, enter. Legend: AMQP, Other.]

Creating a Process II
[Figure: Components: Process Definition Registry, Process Dispatcher, Process Instance Registry, Decision Engine, EPU Management, Provisioner/DTRS, IaaS, EE type A instance with ee-agent. Labels: request to activate process X, lookup, launch, enter, request instance, create instance. Legend: AMQP, Other.]
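One possible reading of the two process-creation diagrams, sketched in Python: the dispatcher looks up the process definition, launches the process on an execution engine with free capacity, enters it in the instance registry, and requests a new instance only when no suitable engine has room. The class name, method names, slot counting, and data layout are all assumptions made for this example and do not reflect the Process Dispatcher design itself.

```python
class ProcessDispatcher:
    """Toy dispatcher; class and method names are illustrative assumptions."""

    def __init__(self, definitions, engines, request_instance):
        self.definitions = definitions            # process name -> required EE type
        self.engines = engines                    # EE id -> {"type": str, "free_slots": int}
        self.request_instance = request_instance  # callback standing in for EPU Management
        self.instances = {}                       # process id -> EE id (instance registry)
        self._next_id = 0

    def activate(self, name):
        ee_type = self.definitions[name]          # "lookup" in the definition registry
        engine_id = self._find_engine(ee_type)
        if engine_id is None:
            # No capacity left: "request instance" (the Creating a Process II case).
            engine_id = self.request_instance(ee_type)
            self.engines[engine_id] = {"type": ee_type, "free_slots": 1}
        self.engines[engine_id]["free_slots"] -= 1   # "launch" on the chosen EE
        self._next_id += 1
        process_id = f"{name}-{self._next_id}"
        self.instances[process_id] = engine_id    # "enter" in the instance registry
        return process_id

    def _find_engine(self, ee_type):
        """Return the first engine of the right type with a free slot, if any."""
        for engine_id, info in sorted(self.engines.items()):
            if info["type"] == ee_type and info["free_slots"] > 0:
                return engine_id
        return None
```

In this sketch the first activation of process X lands on an existing engine, and a second activation triggers the `request_instance` callback once that engine is full.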

Adapting Applications Using CEI
[Figure: Components: HAService-Controllers for datastore, pubsub, and AIS; services datastore, pubsub, AIS; Operator; Webapp user (R3); P&P (R3). Labels: process needs, request to activate process X, queuelen, service, subscribe, monitor. Legend: AMQP, Other.]

Inside an Execution Engine
[Figure: Components: CC instance, EE type A instance, context-agent, ee-agent, ou-agent, supervisord, Matlab script, process (adapter) 1, Process Dispatcher, EPU Management, Package Server. Labels: datastream subscription, result; operation codes on connections: C, C, M, CMR, CMK, CMKO. Legend: AMQP, Other; C = create, M = monitor, R = restart, K = kill, O = I/O.]

Formalized in OOI Architecture

Technologies

Technology Choices
  – ION: Integrated Observatory Network
  – boto
  – Nimboss
  – Context Broker
  – Fabric
  – gevent

Inception Accomplishments Overview

Inception Accomplishments Overview
Accomplishments
  – CEI R2 design
  – CEI/COI Workshop
  – CEI/DM Workshop (to be cont'd)
Prototypes
  – Matlab prototype

Matlab Prototype
Use case:
  – A scientist from institution A wants to run a Matlab process on EC2. Through her appointment at A she has a license to use Matlab, but the license server is behind a firewall. How can she use her Matlab license from A on VMs deployed at EC2?
Goals:
  – Understand how to deal with the license acquisition issues
  – Stimulate discussion on data stream processing in R2, with execution engines, process management, and governance interactions
  – Start thinking about combining execution resources (VMs, licenses, execution engines, etc.)
Results:
  – Design of a TCP tunneling service for cloud and execution engine environments
  – Discussion of the Licit framework for license negotiation

Proposed Elaboration Plan

R2E1: Evaluation and Design
Tasks
  – Design EPU worker concurrency, evaluate solutions
  – Detailed design of EPU Management
  – Detailed design of Process Dispatcher
  – Detailed design for ee-agent
  – Detailed design for taskable resource framework
  – Prototype a simple process I (ION process)
  – DM workshop
  – Technology evaluation: monitoring and clouds
  – Code refactoring
  – Lightweight provisioner (integration)
  – Initial chef recipes for new container (integration)
  – Begin revising EPU implementation
  – Scalability testing I

R2E2: Development (Proposed)
Tasks
  – Initial implementation of EPU
  – Initial implementation of PD
  – Prototype a simple process II (non-ION process)
  – Logging and monitoring (continuation of R1 work)
  – Provisioner improvements based on R1
  – TRF: initial implementation
  – TRF: integration with other subsystems
  – Initial support to the integration team to make a process-dispatched system launch (proof of concept)
  – Scalability testing II
Integration
  – Transition to the new COI container
  – Other integration points under discussion

R2E3: Integration and Testing (Proposed)
Tasks
  – System information in registries
  – PD support for EC2 for dynamic processes
  – Adapter for HA example (application-specific)
  – Example of elastic scaling in action
  – Example of multi-site deployment (two CyberPOPs as well as one CyberPOP and EC2)
  – Help make full system launch a reality
  – Integration with other subsystems
  – Scalability testing III

Questions?