R3 Kickoff Meeting Ocean Observatories Initiative Common Execution Infrastructure (CEI) Subsystem OOI CI System Architecture Team: 1.


1 R3 Kickoff Meeting Ocean Observatories Initiative Common Execution Infrastructure (CEI) Subsystem OOI CI System Architecture Team: 1

2 R3 Kickoff Meeting CEI Developers (4/14/2015)
CEI Developer: John Bresnahan, Argonne National Lab (part-time)
CEI Developer: Patrick Armstrong, University of Chicago
CEI Developer: Pierre Riteau, University of Chicago (part-time)
CEI Senior Developer: Pierre Riteau, University of Chicago

3 R3 Kickoff Meeting Subsystem Purpose
Allow OOI applications and the system to
–Provide Highly Available (HA) services
–Scale to demand
Enact OOI deployment policies in an elastic environment
Provide a deployment foundation for OOI CI

4 R3 Kickoff Meeting Core System Structure: Service Layers

5 R3 Kickoff Meeting CEI Scope
Elastic Computing Services
–Implement elastic computing services to provide on-demand scaling and high availability.
Execution Engine Catalog & Repository Services
–Work with operations and ITV to develop and refine tools to upload and sync the different deployable type representations adapted to each site.
Process Management Services
–Provide the management services for policy-based process execution within specified deployable types intended to support the data distribution services; as such, the processes are sequential and primarily require a process-to-resource match.
Process Catalog & Repository Services
–Maintain process definitions as well as lists of active processes.
Integration with the National Computing Infrastructure
–Provide the capability to deploy OOI processing on the Amazon cloud services as well as academic clouds.

6 R3 Kickoff Meeting High Availability and Scaling
High Availability
–Towards an always-on service model
–Failures in outsourced resources
–Providing a pool of replenishable compute resources
Autoscaling
–Provide resources for peaks in demand
–Ensure good utilization during “valleys” in demand
–Flexible resource mix

7 R3 Kickoff Meeting Resources for HA and Scaling
EPU Management
Monitor and regulate set properties based on system-specific and application-specific metrics
–Cloud resources are available on-demand, but any particular resource may fail at any time
–Applications/processes can absorb new resources
–Applications/processes can tolerate failures
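
The "monitor and regulate" idea on this slide can be illustrated with a minimal sketch of a scaling decision. It is not taken from the CEI code base: the `PoolState` fields, the `decide` function, and the policy parameters (`min_workers`, `max_workers`, `tasks_per_worker`) are all hypothetical names chosen for illustration. The sketch assumes a queue-length metric and a fixed floor of workers kept alive for availability.

```python
from dataclasses import dataclass


@dataclass
class PoolState:
    """Snapshot of one worker pool (all fields are hypothetical)."""
    healthy_workers: int  # workers currently reporting as healthy
    queued_tasks: int     # work items waiting to be processed


def decide(state: PoolState, min_workers: int = 2, max_workers: int = 10,
           tasks_per_worker: int = 5) -> int:
    """Return the change in worker count: positive to launch, negative to terminate."""
    # Ceiling division: workers needed to drain the queue at the assumed rate.
    needed = -(-state.queued_tasks // tasks_per_worker)
    # Clamp to the policy bounds; the floor keeps spare capacity for HA.
    target = max(min_workers, min(max_workers, needed))
    return target - state.healthy_workers


if __name__ == "__main__":
    print(decide(PoolState(healthy_workers=3, queued_tasks=40)))  # peak in demand
    print(decide(PoolState(healthy_workers=1, queued_tasks=0)))   # a worker has failed
    print(decide(PoolState(healthy_workers=6, queued_tasks=5)))   # valley in demand
```

The same function covers both concerns on the slide: a failed worker lowers `healthy_workers` and triggers a replacement, while a growing queue raises the target up to the policy ceiling.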

8 R3 Kickoff Meeting Managing Resources

9 R3 Kickoff Meeting Elastic Processing Unit (EPU) Management
[Diagram. Labels: EPU Management; Decision Engine; Provisioner; DTRS; CB; IaaS; create instance; EE ioncore 1.3; EE ioncore 1.2; EE matlab 6.1; context-agent and ou-agent (three pairs). Legend: AMQP, Other.]

10 R3 Kickoff Meeting Making the EPU HA
[Diagram. Labels: ou-agent; EPU Worker; Bootstrap EPU; Dedicated DE; Provisioner/DTRS; IaaS; create instance; cloudinit.d. Legend: AMQP, Other.]

11 R3 Kickoff Meeting Managing Processes

12 R3 Kickoff Meeting Creating a Process I
[Diagram. Components: Process Definition Registry; Process Dispatcher; Process Instance Registry; Decision Engine; EE type A instance; ee-agent. Arrows: request to activate process X; lookup; launch; enter. Legend: AMQP, Other.]

13 R3 Kickoff Meeting Creating a Process II
[Diagram. Components: Process Definition Registry; Process Dispatcher; Process Instance Registry; Decision Engine; EPU Management; Provisioner/DTRS; IaaS; EE type A instance; ee-agent. Arrows: request to activate process X; lookup; launch; enter; request instance; create instance. Legend: AMQP, Other.]
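
The two "Creating a Process" diagrams can be read as one flow: look up the process definition, launch it on an existing execution engine (EE) if one has room, otherwise request a new instance first, then enter the process in the instance registry. The sketch below is one possible reading under that assumption, not the CEI implementation. The classes `EngineInstance` and `ProcessDispatcher`, the `slots` capacity model, and the `request_instance` callback are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EngineInstance:
    """Hypothetical stand-in for an 'EE type A instance'."""
    ee_type: str
    slots: int                      # assumed capacity: max processes per instance
    processes: list = field(default_factory=list)

    def has_room(self) -> bool:
        return len(self.processes) < self.slots


class ProcessDispatcher:
    def __init__(self, definitions, request_instance):
        self.definitions = definitions            # process name -> required EE type
        self.request_instance = request_instance  # callable(ee_type) -> EngineInstance
        self.engines = []                         # instances known to the dispatcher
        self.instance_registry = {}               # process name -> hosting instance

    def activate(self, name):
        # "lookup": find which EE type the process definition requires.
        ee_type = self.definitions[name]
        # Reuse a running instance of that type if it has spare capacity.
        engine = next((e for e in self.engines
                       if e.ee_type == ee_type and e.has_room()), None)
        if engine is None:
            # "request instance": ask resource management for a new EE.
            engine = self.request_instance(ee_type)
            self.engines.append(engine)
        # "launch": hand the process to the chosen instance.
        engine.processes.append(name)
        # "enter": record the running process in the instance registry.
        self.instance_registry[name] = engine
        return engine


if __name__ == "__main__":
    dispatcher = ProcessDispatcher(
        {"X": "A", "Y": "A", "Z": "A"},
        request_instance=lambda ee_type: EngineInstance(ee_type, slots=2),
    )
    for proc in ("X", "Y", "Z"):
        dispatcher.activate(proc)
    print(len(dispatcher.engines))  # number of instances that had to be created
```

In this reading, slide I corresponds to the branch where a suitable instance already exists, and slide II to the branch that goes through `request_instance`.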

14 R3 Kickoff Meeting Inside an Execution Engine
[Diagram. Labels: CC instance; EE type A instance; context-agent; ee-agent; ou-agent; supervisord; Matlab script; process (adapter) 1; datastream subscription; result; Process Dispatcher; EPU Management; Package Server. Edge annotations: C, C, M, CMR, CMK, CMKO. Legend: C – create, M – monitor, R – restart, K – kill, O – I/O; AMQP, Other.]

15 R3 Kickoff Meeting Adventures in Availability (4/14/2015)
A = MTBF / (MTBF + MTTR), where MTBF is the mean time between failures and MTTR is the mean time to repair
Time to repair (TTR)
–Diagnosis
–Time to scale (TTS)
PENDING (request), STARTED (deployment), RUNNING (contextualization)
[Figure: TTS, preliminary results for 2,000 VMs provisioned on AWS EC2]
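
A short worked example of the availability formula on this slide. The function names and the sample numbers are illustrative only; they are not measurements from the presentation. The sketch also assumes, as one reading of the slide, that repair time is the sum of diagnosis time and time to scale.

```python
def repair_time(diagnosis_hours: float, time_to_scale_hours: float) -> float:
    """Assumed model: time to repair = diagnosis + time to scale."""
    return diagnosis_hours + time_to_scale_hours


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """A = MTBF / (MTBF + MTTR), as a fraction between 0 and 1."""
    if mtbf_hours < 0 or mttr_hours < 0 or mtbf_hours + mttr_hours == 0:
        raise ValueError("MTBF and MTTR must be non-negative and not both zero")
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Illustrative numbers: a failure every 99 hours on average.
    slow = repair_time(diagnosis_hours=0.5, time_to_scale_hours=0.5)
    fast = repair_time(diagnosis_hours=0.5, time_to_scale_hours=0.0)
    print(f"{availability(99.0, slow):.4f}")
    print(f"{availability(99.0, fast):.4f}")
```

With the failure rate held fixed, shortening the time to scale is the only term in this model that raises availability, which is why the slide pairs the formula with TTS measurements.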

16 R3 Kickoff Meeting R3 Scope
Process management
–Activation and validation
–New execution site registration
Integration with National Infrastructure
–Framework for integration of academic cloud providers, TeraGrid and OSG
–Integration with Microsoft cloud

17 R3 Kickoff Meeting R3 Activities
Refine/change scope to achieve a complete and maintainable system
Decide on specific solutions for R3 scope

18 R3 Kickoff Meeting Questions?

