
1 Addressing Complexity in Emerging Cyber-Ecosystems – Exploring the Role of Autonomics in E-Science
Manish Parashar, Center for Autonomic Computing, The Applied Software Systems Laboratory, Rutgers, The State University of New Jersey & Office of Cyberinfrastructure, National Science Foundation

2 5th EGEE User Forum – 04/13/10 Outline of My Presentation
–Computational ecosystems: unprecedented opportunities, challenges
–Autonomic computing: a pragmatic approach for addressing complexity!
–Experiments with autonomics for science and engineering
–Concluding remarks

3 5th EGEE User Forum – 04/13/10 The Cyberinfrastructure Vision
“Cyberinfrastructure integrates hardware for computing, data and networks, digitally-enabled sensors, observatories and experimental facilities, and an interoperable suite of software and middleware services and tools…” – NSF’s Cyberinfrastructure Vision for 21st Century Discovery
A global phenomenon; several LARGE deployments
–EGEE, European Grid Infrastructure (EGI), TeraGrid, Open Science Grid (OSG), Cybera, etc.
New capabilities for computational science and engineering
–seamless access to resources, services, data, information, expertise, …
–seamless aggregation
–seamless (opportunistic) interactions/couplings

4 Cyberinfrastructure => Cyber-Ecosystems
21st Century Science and Engineering: New Paradigms & Practices Transformed by CI
–End-to-end: seamless access, aggregation, interactions
–Fundamentally collaborative & data-driven/data-intensive
Unprecedented opportunities; new requirements, challenges
New thinking in/approaches to computational science
–How can it benefit current applications?
–How can it enable new thinking in science?

5 5th EGEE User Forum – 04/13/10 Unprecedented Opportunities for Science/Engineering
Knowledge-based, information/data-driven, context/content-aware, computationally intensive, pervasive applications
–Crisis management, monitoring and predicting natural phenomena, monitoring and managing engineered systems, optimizing business processes
Addressing applications in an end-to-end manner!
–Opportunistically combine computations, experiments, observations, and data to manage, control, predict, adapt, optimize, …
New paradigms and practices in science and engineering?
–How can it benefit current applications?
–How can it enable new thinking in science?

6 5th EGEE User Forum – 04/13/10 The Instrumented Oil Field (with UT-CSM, UT-IG, OSU, UMD, ANL)
–Detect and track changes in data during production
–Invert data for reservoir properties
–Detect and track reservoir changes
–Assimilate data & reservoir properties into the evolving reservoir model
–Use simulation and optimization to guide future production
(Figure: the data-driven and model-driven halves of the loop.)

7 5th EGEE User Forum – 04/13/10 Many Application Areas…
Hazard prevention, mitigation and response
–Earthquakes, hurricanes, tornadoes, wildfires, floods, landslides, tsunamis, terrorist attacks
Critical infrastructure systems
–Condition monitoring and prediction of future capability
Transportation of humans and goods
–Safe, speedy, and cost-effective transportation networks and vehicles (air, ground, space)
Energy and environment
–Safe and efficient power grids; safe and efficient operation of regional collections of buildings
Health
–Reliable and cost-effective health care systems with improved outcomes
Enterprise-wide decision making
–Coordination of dynamic distributed decisions for supply chains under uncertainty
Next-generation communication systems
–Reliable wireless networks for homes and businesses
Report of the Workshop on Dynamic Data Driven Applications Systems, F. Darema et al., March 2006, www.dddas.org. Source: M. Rotea, NSF

8 5th EGEE User Forum – 04/13/10 The Challenge: Managing Complexity, Uncertainty
System
–Very large scales
–Disruptive trends: many/multi-cores, accelerators, clouds
–Heterogeneity: capability, connectivity, reliability, guarantees, QoS
–Dynamics: ad hoc structures, failure
–Distributed system! Lack of guarantees, common/complete knowledge, …
–Emerging concerns: power, resilience, …
Data and Information
–Scale, heterogeneity
–Availability, resolution, quality
–Semantics, metadata, data models, provenance
–Trust in data, …
Application
–Compositions
–Dynamic behaviors
–Dynamic and complex couplings
–Software/systems engineering issues: emergent rather than by design

9 5th EGEE User Forum – 04/13/10 The Challenge: Managing Complexity, Uncertainty (I)
Increasing application, data/information, and system complexity
–Scale, heterogeneity, dynamism, unreliability, …, disruptive trends, …
New application formulations, practices
–Data-intensive and data-driven, coupled, multiple physics/scales/resolutions, adaptive, compositional, workflows, etc.
Complexity/uncertainty must be simultaneously addressed at multiple levels
–Algorithms/application formulations: asynchronous/chaotic, failure-tolerant, …
–Abstractions/programming systems: adaptive, application/system-aware, proactive, …
–Infrastructure/systems: decoupled, self-managing, resilient, …

10 5th EGEE User Forum – 04/13/10 The Challenge: Managing Complexity, Uncertainty (II)
The ability of scientists to realize the potential of computational ecosystems is being severely hampered by the increased complexity and dynamism of applications and computing environments. To be productive, scientists often have to comprehend and manage complex computing configurations, software tools and libraries, as well as application parameters and behaviors.
Can autonomics and self-* help? (With the “plumbing”, for starters…)

11 5th EGEE User Forum – 04/13/10 Outline of My Presentation
–Computational ecosystems: unprecedented opportunities, challenges
–Autonomic computing: a pragmatic approach for addressing complexity!
–Experiments with autonomics for science and engineering
–Concluding remarks

12 5th EGEE User Forum – 04/13/10 The Autonomic Computing Metaphor
Current paradigms, mechanisms, and management tools are inadequate to handle the scale, complexity, dynamism, and heterogeneity of emerging systems and applications.
Nature has evolved to cope with scale, complexity, heterogeneity, dynamism, unpredictability, and lack of guarantees
–self-configuring, self-adapting, self-optimizing, self-healing, self-protecting, highly decentralized, heterogeneous architectures that work!
The goal of autonomic computing is to enable self-managing systems/applications that address these challenges using high-level guidance
–Unlike AI, duplication of human thought is not the ultimate goal!
“Autonomic Computing: An Overview,” M. Parashar and S. Hariri, Hot Topics, Lecture Notes in Computer Science, Springer Verlag, Vol. 3566, pp. 247–259, 2005.

13 5th EGEE User Forum – 04/13/10 Motivations for Autonomic Computing
Key challenge: current levels of scale, complexity, and dynamism make it infeasible for humans to effectively manage and control systems and applications.
–8/12/07: 20K people + 60 planes held at LAX after a computer failure prevented customs from screening arrivals
–8/3/07: (EPA) datacenter energy use by 2011 will cost $7.4B – 15 power plants, 15 GW peak
–2/27/07: Dow fell 546 points; since the worst plunge took place after 2:30 pm, trading limits were not activated
–8/1/06: UK NHS hit with massive computer outage; 72 primary care + 8 acute hospital trusts affected
Sources: IDC, 2006; http://www.almaden.ibm.com/almaden/talks/Morris_AC_10-02.pdf

14 5th EGEE User Forum – 04/13/10 Autonomic Computing – A Pragmatic Approach
Separation + Integration + Automation!
–Separation of knowledge, policies, and mechanisms for adaptation
–Integration of self-configuration, self-healing, self-protection, self-optimization, …
–Self-* behaviors build on automation concepts and mechanisms: increased productivity, reduced operational costs, timely and effective response
System/application self-management is more than the sum of the self-management of its individual components.
M. Parashar and S. Hariri, Autonomic Computing: Concepts, Infrastructure, and Applications, CRC Press, Taylor & Francis Group, ISBN 0-8493-9367-1, 2007.
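To make the separation concrete, here is a minimal sketch (hypothetical names and rule format, not the Accord/AutoMate API) in which adaptation policies are plain data, mechanisms are registered callables, and an element manager orchestrates them at runtime:

```python
# Minimal sketch of policy/mechanism separation -- hypothetical names,
# not the actual Accord/AutoMate interfaces.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Rule:
    """A declarative policy: when `condition` holds, trigger `mechanism`."""
    condition: Callable[[dict], bool]
    mechanism: str

class ElementManager:
    """Orchestrates registered mechanisms according to high-level rules."""
    def __init__(self) -> None:
        self.mechanisms: Dict[str, Callable[[dict], None]] = {}
        self.rules: List[Rule] = []

    def register_mechanism(self, name: str, fn: Callable[[dict], None]) -> None:
        self.mechanisms[name] = fn

    def add_rule(self, rule: Rule) -> None:
        self.rules.append(rule)

    def step(self, state: dict) -> None:
        # Monitor -> analyze -> execute: evaluate each policy against the
        # observed state and fire the mechanism it names.
        for rule in self.rules:
            if rule.condition(state):
                self.mechanisms[rule.mechanism](state)

# A self-optimization policy expressed without touching mechanism code:
mgr = ElementManager()
mgr.register_mechanism("repartition", lambda s: print("repartitioning load"))
mgr.add_rule(Rule(condition=lambda s: s["imbalance"] > 0.2,
                  mechanism="repartition"))
mgr.step({"imbalance": 0.35})   # fires the repartition mechanism
```

Because the policy is decoupled from the mechanism, a rule can be added, removed, or changed at runtime without modifying the mechanisms it orchestrates.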

15 5th EGEE User Forum – 04/13/10 Autonomic Computing Theory
Integrates and advances several fields:
–Distributed computing: algorithms and architectures
–Artificial intelligence: models to characterize, predict, and mine data and behaviors
–Security and reliability: designs and models of robust systems
–Systems and software architecture: designs and models of components at different IT layers
–Control theory: feedback-based control and estimation
–Systems and signal processing theory: system and data models and optimization methods
Requires experimental validation. (From S. Dobson et al., ACM Transactions on Autonomous & Adaptive Systems, Vol. 1, No. 2, Dec. 2006.)

16 5th EGEE User Forum – 04/13/10 Some Information Sources
“Autonomic Computing: Concepts, Infrastructure and Applications,” M. Parashar and S. Hariri (Eds.), CRC Press, ISBN 0-8493-9367-1 (available at http://www.crcpress.com/)
NSF Center on Autonomic Computing
–http://nsfcac.rutgers.edu
–http://www.nsfcac.org
Autonomic Computing Portal
–http://www.autnomiccomputing.org
IEEE International Conference on Autonomic Computing
–http://www.autonomic-conference.org
IEEE Task Force on Autonomous and Autonomic Systems
–http://tab.computer.org/aas/

17 5th EGEE User Forum – 04/13/10 Autonomics for Science and Engineering?
Autonomic computing aims at developing systems and applications that can manage and optimize themselves using only high-level guidance or intervention from users
–dynamically adapt to changes in accordance with business policies and objectives, and take care of routine elements of management
Separation of management and optimization policies from enabling mechanisms
–allows a repertoire of mechanisms to be automatically orchestrated at runtime to respond to heterogeneity, dynamics, etc.
–e.g., develop strategies capable of identifying and characterizing patterns at design time and at runtime and, using relevant (dynamically defined) policies, managing and optimizing those patterns
Applies at the application, middleware, and infrastructure levels.
Manage application/information/system complexity, not just hide it! Enable new thinking and formulations: how do I think about/formalize my problem differently?

18 5th EGEE User Forum – 04/13/10 Autonomics for Science and Engineering?
Manage application/information/system complexity, not just hide it!
Enable new thinking and formulations: how do I think about/formalize my problem differently?

19 5th EGEE User Forum – 04/13/10 A Conceptual Framework for ACS (GMAC 07, with S. Jha and O. Rana)
(Figure: a hierarchical framework, with autonomics within and across levels…)

20 5th EGEE User Forum – 04/13/10 Cross-layer Autonomics

21 5th EGEE User Forum – 04/13/10 Existing Autonomic Practices in Computational Science (GMAC 09, SOAR 09, with S. Jha and O. Rana)
–Autonomic tuning by the application
–Autonomic tuning of the application

22 5th EGEE User Forum – 04/13/10 Spatial, Temporal and Computational Heterogeneity and Dynamics in SAMR
Simulation of combustion based on SAMR (H2–air mixture; ignition via 3 hot spots).
(Figures: temperature and OH profiles, illustrating temporal and spatial heterogeneity.) Courtesy: Sandia National Lab

23 5th EGEE User Forum – 04/13/10 Autonomics in SAMR
Tuning by the application
–Application level: when and where to refine
–Runtime/middleware level: when, where, and how to partition and load-balance
–Resource level: allocate/de-allocate resources
Tuning of the application, runtime (see the sketch below)
–When/where to refine
–Latency-aware ghost synchronization
–Heterogeneity/load-aware partitioning and load-balancing
–Checkpoint frequency
–Asynchronous formulations
–…
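As one example of tuning of the application, checkpoint frequency can be adapted to the observed failure rate. A minimal sketch using Young's classic approximation, a standard textbook result; the policy actually used in the SAMR runtime may differ:

```python
# Sketch: re-tune the checkpoint interval from the observed MTBF using
# Young's approximation T_opt = sqrt(2 * C * MTBF), where C is the time
# to write one checkpoint. Illustrative only, not the SAMR runtime policy.
import math

def optimal_checkpoint_interval(checkpoint_cost_s: float, mtbf_s: float) -> float:
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)

# As the runtime observes failures, it adjusts the interval accordingly:
print(optimal_checkpoint_interval(60.0, 24 * 3600.0))  # ~3220 s on a stable system
print(optimal_checkpoint_interval(60.0, 2 * 3600.0))   # ~930 s when failures are frequent
```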

24 5th EGEE User Forum – 04/13/10 Outline of My Presentation
–Computational ecosystems: unprecedented opportunities, challenges
–Autonomic computing: a pragmatic approach for addressing complexity!
–Experiments with autonomics for science and engineering
–Concluding remarks

25 5th EGEE User Forum – 04/13/10 Autonomics for Science and Engineering – Application-Level Examples
Autonomics to address complexity in science and engineering; autonomics as a paradigm for science and engineering.
Some examples:
–Autonomic runtime management – multiphysics, adaptive mesh refinement
–Autonomic data streaming and in-network data processing – coupled simulations
–Autonomic deployment/scheduling – HPC Grid/Cloud integration
–Autonomic workflows – simulation-based optimization
(Many system-level examples not presented here…)

26 5th EGEE User Forum – 04/13/10 Adaptive Methods in Science and Engineering
–Multi-block grid structure and oil concentration contours (IPARS, M. Peszynska, UT Austin)
–Blast wave in the presence of a uniform magnetic field – 3 levels of refinement (Zeus + GrACE + Cactus, P. Li, NCSA, UCSD)
–Mixture of H2 and air in stoichiometric proportions with a non-uniform temperature field (GrACE + CCA, Jaideep Ray, SNL, Livermore)
–Richtmyer-Meshkov – detonation in a deforming tube – 3 levels; z=0 plane visualized on the right (VTF + GrACE, R. Samtaney, CIT)

27 5th EGEE User Forum – 04/13/10 Autonomic (Physics/Model/System Driven) Runtime Management
(Architecture diagram: monitoring and context-aware services – application and resource monitoring of CPU, memory, bandwidth, availability, and access policy; characterization of system state, application state and dynamics, and the nature of adaptation; a decision-making engine with deductive engine, policy repository, knowledge base, and self-learning; an application runtime manager that partitions/composes virtual computation units and handles mapping, distribution, and redistribution of computation/communication.)
“Hybrid Runtime Management of Space-Time Heterogeneity for Dynamic SAMR Applications,” X. Li and M. Parashar, IEEE TPDS 18(8), pp. 1202–1214, August 2007.

28 5th EGEE User Forum – 04/13/10 Cross-layer Adaptations for SAMR: Efficiency, Performance, Survivability
–ALP: trade space (resource) for time (performance) when resources are under-utilized
–ALOC: trade time (performance) for space (resource) when resources are scarce
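A minimal sketch of the mode switch this slide describes, with hypothetical utilization thresholds (the actual GrACE/SAMR policy interface is not shown here):

```python
# Sketch of the ALP/ALOC cross-layer switch -- hypothetical thresholds.

def select_adaptation(mem_utilization: float, cpu_utilization: float) -> str:
    """Pick a strategy from the observed resource state."""
    if mem_utilization > 0.9:
        # Resources scarce: ALOC trades time (performance) for space
        # (resource), e.g. by shrinking buffers or working out-of-core.
        return "ALOC"
    if cpu_utilization < 0.5 and mem_utilization < 0.6:
        # Resources under-utilized: ALP trades space (resource) for time
        # (performance), e.g. by replicating data to cut communication.
        return "ALP"
    return "NO_CHANGE"

print(select_adaptation(mem_utilization=0.95, cpu_utilization=0.7))  # ALOC
print(select_adaptation(mem_utilization=0.4, cpu_utilization=0.3))   # ALP
```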

29 5th EGEE User Forum – 04/13/10 Experimental Results – ALP
Experiment setup: IBM SP4 cluster (DataStar at the San Diego Supercomputing Center, 1632 processors in total); SP4 (p655) node: 8 processors (1.5 GHz), 16 GB memory, 6.0 GFlops.
Performance gain of up to 40% on 512 processors.

30 5th EGEE User Forum – 04/13/10 Effects of Finite Memory – ALOC
Intel Pentium 4 CPU, 1.70 GHz, Linux 2.4 kernel; cache size: 256 KB, physical memory: 512 MB, swap space: 1 GB.

31 5th EGEE User Forum – 04/13/10 Experimental Results – ALOC
Beowulf cluster (Frea at Rutgers, 64 processors); Intel Pentium 4 CPU, 1.70 GHz, Linux 2.4 kernel; cache size: 256 KB, physical memory: 512 MB, swap space: 1 GB.

32 5th EGEE User Forum – 04/13/10 Coupled Fusion Simulations: A Data-Intensive Workflow

33 5th EGEE User Forum – 04/13/10 Autonomic Data Streaming and In-Transit Processing for Data-Intensive Workflows
Large-scale distributed environments and data-intensive workflows
–Application entities separated in space and time
–Seamless interactions and couplings across entities
Distributed application entities need to interact at runtime
–Data processing, interactive data monitoring, online data analysis, visualization, data/service/VM migration, data archiving, collaboration, etc.
Large data volumes and rates, heterogeneous data types
–Must be streamed efficiently and effectively between distributed application components
–Application-specific manipulations need to be applied in transit
“A Self-Managing Wide-Area Data Streaming Service,” V. Bhat, M. Parashar, H. Liu, M. Khandekar, N. Kandasamy, S. Klasky, and S. Abdelwahed, Cluster Computing: The Journal of Networks, Software Tools, and Applications, Vol. 10, Issue 7, pp. 365–383, December 2007.

34 5th EGEE User Forum – 04/13/10 Autonomic Data Streaming and In-Transit Processing for Data-Intensive Workflows
Workflow with coupled simulation codes – the edge-turbulence particle-in-cell (PIC) code (GTC) and the microscopic MHD code (M3D) – run simultaneously on separate HPC resources.
Data is streamed and processed en route – e.g., data from the PIC code is filtered through “noise detection” processes before it can be coupled with the MHD code.
Data streaming between live simulations must be efficient, arriving just in time – if it arrives too early, time and resources are wasted buffering the data; if it arrives too late, the application wastes resources waiting for the data to come in.
Opportunistic use of in-transit resources.

35 5th EGEE User Forum – 04/13/10 Autonomic Data Streaming & In-Transit Processing
Application level: “proactive” management
–Proactive QoS management strategies using a model-based LLC controller
–Constraints for in-transit processing captured using a slack metric
In-transit level: “reactive” management
–Opportunistic data processing using a dynamic in-transit resource overlay
–Adaptive runtime management at in-transit nodes based on the slack metric generated at the application level
–Adaptive buffer management and forwarding
(Diagram: the simulation’s LLC controller and slack-metric generator at the application level feed metric updates to slack-metric correctors and budget estimation at the in-transit nodes, along the data flow to the sink.)
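A minimal sketch of a slack metric of this kind – the spare time a data item can absorb for in-transit processing before just-in-time delivery is violated; the exact formulation in the cited paper may differ:

```python
# Sketch of an application-level slack metric for in-transit processing.
# Positive slack -> the item can absorb processing en route; negative
# slack -> forward immediately (or shed processing). Illustrative only.

def slack(deadline_s: float, elapsed_s: float, est_remaining_transfer_s: float) -> float:
    return deadline_s - elapsed_s - est_remaining_transfer_s

budget = slack(deadline_s=30.0, elapsed_s=12.0, est_remaining_transfer_s=10.0)
if budget > 0:
    print(f"{budget:.1f} s available for in-transit processing (e.g. filtering)")
else:
    print("no slack: buffer and forward only")
```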

36 5th EGEE User Forum – 04/13/10 Autonomics for Coupled Fusion Simulation Workflows

37 5th EGEE User Forum – 04/13/10 Autonomic Streaming: Implementation/Deployment
Simulation workflow services:
–SS = Simulation Service (GTC)
–ADSS = Autonomic Data Streaming Service
–CBMS = LLC-controller-based Buffer Management Service
–DTS = Data Transfer Service
–DAS = Data Analysis Service
–SLAMS = Slack Manager Service
–PS = Processing Service
–BMS = Buffer Management Service
–ArchS = Archiving Service (at the sink)
Simulations execute on leadership-class machines at ORNL and NERSC; in-transit nodes are located at PPPL and Rutgers.
(Deployment diagram: data producers (SS/ADSS) at ORNL and NERSC stream through in-transit nodes at PPPL and Rutgers, where data is sorted, scaled, filtered (FFT), and analyzed, on to the data consumers and archival sink at Rutgers.)

38 5th EGEE User Forum – 04/13/10 Adaptive Data Transfer
No congestion in intervals 1–9
–Data is transferred over the WAN
Congestion in intervals 9–19
–The controller recognizes the congestion and advises the Element Manager, which in turn adapts the DTS to transfer data to local storage (LAN); adaptation continues until the network is no longer congested
–Data sent to local storage by the DTS falls to zero at the 19th controller interval

39 5th EGEE User Forum – 04/13/10 Adaptation of the Workflow
Create multiple instances of the Autonomic Data Streaming Service (ADSS) when the effective network transfer rate dips below a threshold (in our case, around 100 Mb/s); a sketch of this rule follows below.
(Diagram: the simulation’s data transfer buffers fan out across ADSS-0, ADSS-1, and ADSS-2. % network throughput is the difference between the maximum and current network transfer rate.)
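A minimal sketch of that scale-out rule, with illustrative names and an instance cap (the ~100 Mb/s threshold is from the slide; everything else is assumed):

```python
# Sketch of the ADSS scale-out rule: add a streaming-service instance
# when the effective transfer rate drops below the threshold.
THRESHOLD_MBPS = 100.0
MAX_INSTANCES = 3

def required_instances(effective_rate_mbps: float, current: int) -> int:
    if effective_rate_mbps < THRESHOLD_MBPS and current < MAX_INSTANCES:
        return current + 1          # spawn another ADSS instance to share load
    return current

n = 1
for rate in [140.0, 95.0, 80.0, 120.0]:
    n = required_instances(rate, n)
    print(f"rate={rate:.0f} Mb/s -> {n} ADSS instance(s)")
```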

40 5th EGEE User Forum – 04/13/10 % Buffer Occupancy at In-Transit Nodes, With & Without Coupling
Buffer occupancy at in-transit nodes before congestion is around 50%.
During congestion, the application-level controller throttles data items
–Buffer occupancy at in-transit nodes reduces from 80% without coupling to 60.8% with coupling
Higher buffer occupancies at in-transit nodes lead to failures and loss of data.

41 5th EGEE User Forum – 04/13/10 Exploring Hybrid HPC-Grid/Cloud Usage Modes [eScience’09]
Production computational infrastructures will be (are) hybrid, integrating HPC Grids and Clouds. What are appropriate usage modes for hybrid infrastructure?
–Acceleration: Clouds used as accelerators to improve the application time-to-completion – alleviate the impact of queue wait times, “strategically offload” appropriate tasks to Cloud resources, all whilst respecting budget constraints
–Conservation: Clouds used to conserve HPC Grid allocations, given appropriate runtime and budget constraints
–Resilience: Clouds used to handle both general conditions (response to dynamic execution environments) and specific events (unanticipated HPC Grid downtime, inadequate allocations, or unexpected queue delays/QoS changes)

42 5th EGEE User Forum – 04/13/10 Reservoir Characterization: EnKF-Based History Matching (with S. Jha)
–Black-oil reservoir simulator: simulates the movement of oil and gas in subsurface formations
–Ensemble Kalman Filter: computes the Kalman gain matrix and updates the model parameters of the ensembles
–Heterogeneous, dynamic workflows; based on Cactus and PETSc
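For reference, the analysis step of a textbook EnKF matches this description (the specific variant used in the study may differ): each forecast member x_i^f is updated with a Kalman gain K built from the ensemble forecast covariance P^f, the observation operator H, the observation-error covariance R, and perturbed observations d_i:

```latex
% Standard EnKF analysis step (textbook form).
\begin{align}
  K     &= P^f H^{\top}\bigl(H P^f H^{\top} + R\bigr)^{-1} \\
  x_i^a &= x_i^f + K\bigl(d_i - H x_i^f\bigr), \qquad i = 1, \dots, N
\end{align}
```

Computing K and applying the update is the tightly coupled stage; the N forward simulations are the naturally parallel stage that the hybrid Grid/Cloud experiments below distribute as tasks.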

43 5th EGEE User Forum – 04/13/10 Exploring Hybrid HPC-Grid/Cloud Usage Modes using CometCloud
(Architecture: the EnKF application runs over CometCloud; a workflow manager, runtime estimator, and autonomic scheduler drive execution; a Grid agent pushes tasks to the HPC Grid while a Cloud agent pulls tasks; an adaptivity manager monitors, analyzes, and adapts at both the application and infrastructure levels.)
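The pull-based coordination in the figure can be sketched with a toy stand-in: cloud agents pull ensemble-member tasks from a shared task space, so faster agents naturally take more work. This mimics the coordination model only; it is not the CometCloud API:

```python
# Toy stand-in for pull-based task consumption from a shared task space.
import queue
import threading
import time

task_space: "queue.Queue[tuple]" = queue.Queue()
for member in range(8):                      # 8 ensemble-member tasks
    task_space.put(("run_ensemble_member", member))

def cloud_agent(agent_id: int) -> None:
    while True:
        try:
            op, member = task_space.get_nowait()   # pull; nothing is pushed
        except queue.Empty:
            return
        time.sleep(0.01)                     # stand-in for the reservoir run
        print(f"agent {agent_id} completed {op}({member})")

workers = [threading.Thread(target=cloud_agent, args=(i,)) for i in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```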

44 5th EGEE User Forum – 04/13/10 Experiment Background and Set-Up (1/2)
Single-stage EnKF workflow with 128 ensemble members with heterogeneous computational requirements.
(Plot: distribution of runtimes of ensemble members on 1 node (16 processors) of a TG compute system (Ranger) and on one EC2 VM (type m1).)

45 5th EGEE User Forum – 04/13/10 Establishing Baseline Performance
Baseline TTC for EC2 and TG for a 1-stage, 128-ensemble-member EnKF run. The first 4 bars represent the TTC as the number of EC2 VMs increases; the next 4 bars represent the TTC as the number of TG CPUs (nodes) used increases.

46 5th EGEE User Forum – 04/13/10 Experiment Background and Set-Up (2/2)
Key metrics
–Total Time to Completion (TTC)
–Total Cost of Completion (TCC)
Basic assumptions
–TG gives the best performance but is a relatively more restricted resource
–EC2 is relatively more freely available but not as capable
Note that the motivation of our experiments is to understand each of the usage scenarios and their feasibility, behaviors, and benefits, not to optimize the performance of any one scenario.

47 5th EGEE User Forum – 04/13/10 Objective I: Using Clouds as Accelerators for HPC Grids (1/2)
Explore how Clouds (EC2) can be used as accelerators for HPC Grid (TG) workloads
–16 TG CPUs (1 node on Ranger)
–Average queuing time for TG set to 5 and 10 minutes
–Number of EC2 nodes varied from 20 to 100 in steps of 20
–VM start-up time was about 160 seconds

48 5th EGEE User Forum – 04/13/10 Objective I: Using Clouds as Accelerators for HPC Grids (2/2)
The TTC and TCC for Objective I with 16 TG CPUs and queuing times set to 5 and 10 minutes. As expected, the more VMs made available, the greater the acceleration, i.e., the lower the TTC. The reduction in TTC is roughly linear, but not perfectly so, because of a complex interplay between the tasks in the workload and resource availability.

49 5th EGEE User Forum – 04/13/10 Objective I: Using Clouds as Accelerators for HPC Grids (with Adaptation)
Experiment with adaptivity applied to both infrastructure and application. The TTC is reduced further than with application or infrastructure adaptivity on its own. The cost is similar to that with infrastructure adaptivity alone for durations of less than one hour, since EC2 usage is billed hourly with a one-hour minimum.
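The billing effect is easy to see with a toy cost model (the hourly rate is illustrative, not the 2010 EC2 price list):

```python
# Toy cost model: EC2 bills per VM-hour with a one-hour minimum, so runs
# shorter than an hour cost the same as a full hour.
import math

def ec2_cost(num_vms: int, runtime_s: float, rate_per_vm_hour: float = 0.10) -> float:
    billed_hours = max(1, math.ceil(runtime_s / 3600.0))
    return num_vms * billed_hours * rate_per_vm_hour

print(ec2_cost(20, 1500.0))   # 25 min still bills one hour per VM -> 2.0
print(ec2_cost(20, 4500.0))   # 75 min bills two hours per VM -> 4.0
```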

50 5th EGEE User Forum – 04/13/10 Objective II: Using Clouds for Conserving CPU-Time on the TeraGrid
Explore how to conserve a fixed allocation of CPU hours by offloading tasks that perhaps don’t need the specialized capabilities of the HPC Grid.
(Plot: distribution of tasks across EC2 and TG, with TTC and TCC, as the CPU-minute allocation on the TG is increased.)

51 5th EGEE User Forum – 04/13/10 Objective III: Response to Changing Operating Conditions (Resilience) (1/4)
Explore the situation where resources that were initially planned for become unavailable at runtime, either in part or in entirety
–How can Cloud services be used to address these situations and allow the system/application to respond to a dynamic change in the availability of resources?
Initially, 16 TG CPUs are allocated for 800 minutes. After about 50 minutes of execution (i.e., 3 tasks completed on the TG), the available CPU time is changed so that only 20 CPU-minutes remain.

52 5th EGEE User Forum – 04/13/10 Objective III: Response to Changing Operating Conditions (Resilience) (2/4)
Allocation of tasks to TG CPUs and EC2 nodes for usage mode III. As the 16 allocated TG CPUs become unavailable after only 70 minutes rather than the planned 800 minutes, the bulk of the tasks are completed by EC2 nodes.

53 5th EGEE User Forum – 04/13/10 Objective III: Response to Changing Operating Conditions (Resilience) (3/4)
Number of TG cores and EC2 nodes as a function of time for usage mode III. Note that the TG CPU allocation goes to zero after about 70 minutes, causing the autonomic scheduler to increase the EC2 nodes by 8.

54 5th EGEE User Forum – 04/13/10 Objective III: Response to Changing Operating Conditions (Resilience) (4/4)
Overheads of resilience on TTC and TCC.

55 5th EGEE User Forum – 04/13/10 The Instrumented Oil Field
Production of oil and gas can take advantage of installed sensors that monitor the reservoir’s state as fluids are extracted. Knowledge of the reservoir’s state during production can result in better engineering decisions
–economic evaluation; physical characteristics (bypassed oil, high-pressure zones); production techniques for safe operating conditions in complex and difficult areas
–Detect and track changes in data during production
–Invert data for reservoir properties
–Detect and track reservoir changes
–Assimilate data & reservoir properties into the evolving reservoir model
–Use simulation and optimization to guide future production and future data-acquisition strategy
“Application of Grid-Enabled Technologies for Solving Optimization Problems in Data-Driven Reservoir Studies,” M. Parashar, H. Klie, U. Catalyurek, T. Kurc, V. Matossian, J. Saltz and M. Wheeler, FGCS: The International Journal of Grid Computing: Theory, Methods and Applications, Elsevier Science Publishers, Vol. 21, Issue 1, pp. 19–26, 2005.

56 5th EGEE User Forum – 04/13/10 Effective Oil Reservoir Management: Well Placement/Configuration
Why is it important?
–Better utilization/cost-effectiveness of existing reservoirs
–Minimizing adverse effects on the environment
(Figures: better management leaves less bypassed oil; bad management leaves much bypassed oil.)

57 5th EGEE User Forum – 04/13/10 Autonomic Reservoir Management: “Closing the Loop” using Optimization
(Flowchart of the dynamic decision system: starting from the present subsurface knowledge and numerical model, optimize economic revenue and environmental hazard; make a management decision; plan optimal data acquisition and acquire remote-sensing data; assimilate data (dynamic data-driven assimilation, subsurface characterization, experimental design) to update knowledge of the model and reduce uncertainty; improve the numerical model; repeat. Supported by autonomic Grid middleware for data management and processing.)

58 5th EGEE User Forum – 04/13/10 Autonomic Formulations/Programming

59 5th EGEE User Forum – 04/13/10 LLC-Based Self-Management in Accord
Element/Service Managers are augmented with LLC controllers
–monitor the state/execution context of elements
–enforce adaptation actions determined by the controller
–augment human-defined rules
(Diagram: a self-managing element couples a computational element with an element manager and LLC controller; the controller uses a model of the element, its internal and contextual state, and an optimization function to advise the manager.)
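A minimal sketch of a limited look-ahead controller of this kind: at each step it simulates every candidate action over a short horizon using a model of the element and picks the cheapest one. The model, action set, and cost function are placeholders, not the Accord internals:

```python
# Toy limited look-ahead control (LLC): choose the action whose predicted
# trajectory over the horizon minimizes cost.

def predict(state: float, action: float, horizon: int) -> list:
    """Toy model: backlog evolves with fixed arrivals minus service rate."""
    states, s = [], state
    for _ in range(horizon):
        s = max(0.0, s + 5.0 - action)   # arrivals = 5/step; action = service
        states.append(s)
    return states

def llc_step(state: float, actions=(2.0, 5.0, 8.0), horizon: int = 3) -> float:
    def cost(a: float) -> float:
        # Penalize backlog (missed QoS) plus provisioned capacity.
        return sum(predict(state, a, horizon)) + 0.5 * a * horizon
    return min(actions, key=cost)

print(llc_step(state=20.0))   # picks the fastest service rate to drain backlog
print(llc_step(state=0.0))    # picks a cheaper rate once the backlog is gone
```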

60 5th EGEE User Forum – 04/13/10 An Autonomic Well Placement/Configuration Workflow
(Diagram: the workflow runs on the AutoMate programming system/Grid middleware, drawing on history/archived data, sensor/context data, and external inputs such as oil prices, weather, etc.)

61 5th EGEE User Forum – 04/13/10 Autonomic Oil Well Placement/Configuration
(Figures: permeability field and pressure contours for 3 wells, 2D profile; contours of NEval(y,z,500). An exhaustive search requires NY×NZ (450) evaluations, with the minimum visible in the contour plot; the VFSA “walk” finds the solution after 20 (81) evaluations.)
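A simplified sketch of a VFSA-style search over a 2D well position, using Ingber's temperature-dependent generating function with Metropolis acceptance; a toy quadratic stands in for the reservoir-simulator objective:

```python
# Simplified VFSA (very fast simulated annealing) sketch for well placement.
import math
import random

def vfsa_move(x: float, lo: float, hi: float, T: float) -> float:
    """Ingber's generating function: Cauchy-like steps that shrink as T falls."""
    u = random.random()
    y = math.copysign(T * ((1.0 + 1.0 / T) ** abs(2.0 * u - 1.0) - 1.0), u - 0.5)
    return min(hi, max(lo, x + y * (hi - lo)))

def objective(pos) -> float:
    """Toy placeholder for the simulator-based objective NEval(y, z)."""
    y, z = pos
    return (y - 300.0) ** 2 + (z - 120.0) ** 2

random.seed(1)
pos = [100.0, 50.0]
cost = objective(pos)
for k in range(1, 21):                       # ~20 evaluations, as on the slide
    T = math.exp(-3.0 * math.sqrt(k))        # VFSA exponential cooling
    cand = [vfsa_move(pos[0], 0.0, 450.0, T), vfsa_move(pos[1], 0.0, 450.0, T)]
    cand_cost = objective(cand)
    # Metropolis acceptance, with temperature scaled to the cost magnitude:
    if cand_cost < cost or random.random() < math.exp((cost - cand_cost) / (1e4 * T)):
        pos, cost = cand, cand_cost
print(pos, cost)   # walks toward the minimum in ~20 simulator calls
```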

62 5th EGEE User Forum – 04/13/10 Autonomic Oil Well Placement/Configuration (VFSA)
“An Autonomic Reservoir Framework for the Stochastic Optimization of Well Placement,” V. Matossian, M. Parashar, W. Bangerth, H. Klie, M.F. Wheeler, Cluster Computing: The Journal of Networks, Software Tools, and Applications, Kluwer Academic Publishers, Vol. 8, No. 4, pp. 255–269, 2005.
“Autonomic Oil Reservoir Optimization on the Grid,” V. Matossian, V. Bhat, M. Parashar, M. Peszynska, M. Sen, P. Stoffa and M.F. Wheeler, Concurrency and Computation: Practice and Experience, John Wiley and Sons, Vol. 17, Issue 1, pp. 1–26, 2005.

63 5th EGEE User Forum – 04/13/10 Summary
CI and emerging computational ecosystems
–Unprecedented opportunity: new thinking and practices in science and engineering
–Unprecedented research challenges: scale, complexity, heterogeneity, dynamism, reliability, uncertainty, …
Autonomic computing can address complexity and uncertainty
–Separation + Integration + Automation
Experiments with autonomics for science and engineering
–Autonomic data streaming and in-transit data manipulation, autonomic workflows, autonomic runtime management, …
However, there are implications
–Added uncertainty
–Correctness, predictability, repeatability
–Validation
–New formulations necessary…

64 5th EGEE User Forum – 04/13/10 Thank You! Email: parashar@rutgers.edu

