1 Kheiron: Runtime Adaptation of Native-C and Bytecode Applications Rean Griffith, Gail Kaiser Programming Systems Lab (PSL) Columbia University June 14.


1 Kheiron: Runtime Adaptation of Native-C and Bytecode Applications
Rean Griffith, Gail Kaiser
Programming Systems Lab (PSL), Columbia University
June 14, 2006
Presented by Rean Griffith, rg2023@cs.columbia.edu

2 Overview
- Introduction
- Problem
- Solution
- System Operation
- Feasibility Experiments
- Supported Adaptations
- Conclusions & Future Work

3 Introduction
- Self-healing systems are supposed to reduce the cost and complexity of system management.
- Extra facilities for problem detection, diagnosis and remediation help end-users and administrators.
- Sounds great. Where do I get one?

4 Problem
- Existing/legacy systems don’t have all the self-healing mechanisms they’ll ever need.
- Tomorrow’s systems won’t have all of them either.
- It’s impractical, costly and time-consuming to re-design, re-build and re-deploy new self-healing versions.
- What happens when we need a new self-healing facility?

5 6 Questions
- Can we retro-fit self-healing mechanisms onto existing systems as a form of system adaptation?
- How could we do it?
- Can we do it on-the-fly?
- Can we do things in a general way rather than ad-hoc one-time fixes?
- Sounds risky. If we can do it, can we give any guarantees?
- What kinds of self-healing mechanisms can we add?

6 3.5 Quick Answers
- Can we retro-fit self-healing mechanisms onto existing systems? Yes
- How could we do it? …
- Can we do it on the fly? Yes
- Can we do it in a general way, rather than ad-hoc one-time fixes? Yes
- If we can do it, can we give guarantees? Some
- What kinds of self-healing mechanisms can we add? …

7 How can we do it?
- Observation: All software systems run in a software execution environment (EE). Use it as the lowest common denominator for adapting live systems.
- Hypotheses:
  - The execution environment is a feasible target for efficiently and transparently effecting adaptations in the applications it hosts.
  - Existing facilities in unmodified execution environments can be used to effect runtime adaptations.
  - Any guarantees we give are a function of the execution environment and its operation.

8 Solution Considerations
- Two kinds of execution environments:
  - Un-managed/native [Processor + OS, e.g. x86 + Linux]
  - Managed [JVM/CLR]
- What do we need from the EE?
  - Facility for tracing program execution.
  - Facility for controlling program execution.
  - Access to metadata about the units of execution.
  - Facility for adding/editing metadata.

9 Comparing Execution Environments

| | Unmanaged Execution Environment: ELF Binaries | Managed Execution Environment: JVM 5.x | Managed Execution Environment: CLR 1.1 |
|---|---|---|---|
| Program tracing | ptrace, /proc | JVMTI callbacks + API | ICorProfilerInfo, ICorProfilerCallback |
| Program control | Trampolines + Dyninst | Bytecode rewriting | MSIL rewriting |
| Execution unit metadata | .symtab, .debug sections | Classfile constant pool + bytecode | Assembly, type & method metadata + MSIL |
| Metadata augmentation | N/A for compiled C-programs | Custom classfile parsing & editing APIs + JVMTI RedefineClasses | IMetaDataImport, IMetaDataEmit APIs |

10 System Architecture from 10,000 ft
[Figure: system architecture diagram; not recoverable from the transcript.]

11 How Kheiron Works
- Attaches to programs while they run or when they load.
- Interacts with programs while they run at various points of their execution.
- Augments type definitions and/or executable code.
- Needs metadata – rich metadata is better.
- Interposes at method granularity, inserting new functionality via method prologues and epilogues.
- Control can be transferred into/out of adaptation library logic.
- Control-flow changes can be done/un-done dynamically.
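The interposition idea on this slide can be sketched in plain C. This is only an analogy under stated assumptions: every name here (`method_slot`, `slot_attach`, `slot_detach`, `slot_invoke`) is hypothetical, and calls are routed through a function-pointer slot rather than through the trampolines or bytecode rewriting that Kheiron uses. It shows prologue/epilogue hooks at method granularity that can be attached and removed while the program runs.

```c
#include <stddef.h>

/* Hypothetical types for illustration; not part of Kheiron or Dyninst. */
typedef int  (*target_fn)(int);
typedef void (*hook_fn)(const char *method_name);

typedef struct {
    const char *name;      /* method being interposed on */
    target_fn   original;  /* unmodified method body */
    hook_fn     prologue;  /* NULL when no adaptation is active */
    hook_fn     epilogue;  /* NULL when no adaptation is active */
} method_slot;

/* All calls go through the slot, so hooks run at method entry and exit. */
int slot_invoke(const method_slot *slot, int arg)
{
    if (slot->prologue) slot->prologue(slot->name);
    int result = slot->original(arg);
    if (slot->epilogue) slot->epilogue(slot->name);
    return result;
}

/* "Do" the control-flow change: route entry/exit through adaptation logic. */
void slot_attach(method_slot *slot, hook_fn prologue, hook_fn epilogue)
{
    slot->prologue = prologue;
    slot->epilogue = epilogue;
}

/* "Un-do" the change: the method runs with no hooks again. */
void slot_detach(method_slot *slot)
{
    slot->prologue = NULL;
    slot->epilogue = NULL;
}

/* Example adaptation logic and target method (both made up). */
int hook_calls = 0;
void count_hook(const char *method_name) { (void)method_name; hook_calls++; }
int twice(int x) { return 2 * x; }
```

With a slot initialised as `{ "twice", twice, NULL, NULL }`, `slot_invoke` behaves like a direct call until `slot_attach` installs hooks, and returns to that behaviour after `slot_detach`.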

12 System Operation

| Time period/execution event | Unmanaged/Native Applications (C-Programs) | Managed Applications: JVM 5.x | Managed Applications: CLR 1.1 |
|---|---|---|---|
| Application start | Attach Kheiron, augment methods | Load Kheiron/JVM | Load Kheiron/CLR |
| Module load | No real metadata to manipulate | Augment type definition, augment module metadata, bytecode rewrite | Augment type definition, augment module metadata |
| Method invoke/entry | Transfer control to adaptation logic | Transfer control to adaptation logic | Transfer control to adaptation logic |
| Method JIT | n/a | No explicit notifications | Augment module metadata, MSIL rewrite, force re-jit |
| Method exit | Transfer control to adaptation logic | Transfer control to adaptation logic | Transfer control to adaptation logic |

13 Kheiron/C Operation
[Figure: Kheiron/C architecture diagram. Labels: Mutator, Kheiron/C, Dyninst API, Dyninst Code, ptrace/procfs, Application, void foo( int x, int y) { int z = 0; }, Points, Snippets, C/C++ Runtime Library.]

14 Kheiron/JVM Operation
[Figure: shadow-method creation in three panels.]
- A: SampleMethod with its bytecode method body.
- B (Prepare Shadow): SampleMethod with its bytecode method body, plus _SampleMethod.
- C (Create Shadow): SampleMethod with a new bytecode method body that calls _SampleMethod; _SampleMethod with the bytecode method body.
New body of SampleMethod:
    SampleMethod( args ) [throws NullPointerException]
    push args
    call _SampleMethod( args ) [throws NullPointerException]
    { try{…} catch (IOException ioe){…} } // Source view of _SampleMethod’s body
    return value/void

15 Experiments
- Goal: Measure the feasibility of our approach.
- Look at the impact on execution when no repairs/adaptations are active.
- Selected compute-intensive applications as test subjects (SciMark and Linpack).
- Unmanaged experiments
  - P4 2.4 GHz processor, 1GB RAM, SUSE 9.2, 2.6.8x kernel, Dyninst 4.2.1.
- Managed experiments
  - P3 Mobile 1.2 GHz processor, 1GB RAM, Windows XP SP2, Java HotspotVM v1.5 update 04.

16 Kheiron/C – Results

| | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Avg | Std |
|---|---|---|---|---|---|---|---|
| Instrumentation time (ms) | 689.33 | 691.01 | 675.87 | 678.78 | 689.79 | 684.96 | 7.0686 |
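The summary columns can be checked from the five run times. The sketch below recomputes the mean and the sample (n - 1) standard deviation; that the slide used the sample form is an inference from the numbers (it gives about 7.07 ms, while the population form gives about 6.32 ms). The function names are made up for this example, and a small Newton iteration stands in for `sqrt` only so that the block does not depend on linking the math library.

```c
#include <stddef.h>

/* Square root by Newton's method; a stand-in for sqrt() from <math.h>. */
static double newton_sqrt(double v)
{
    if (v <= 0.0) return 0.0;
    double guess = v;
    for (int i = 0; i < 60; i++)
        guess = 0.5 * (guess + v / guess);
    return guess;
}

/* Arithmetic mean of n samples. */
double mean_of(const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) sum += x[i];
    return sum / (double)n;
}

/* Sample standard deviation (divides by n - 1); requires n >= 2. */
double sample_std_of(const double *x, size_t n)
{
    double m = mean_of(x, n);
    double sq = 0.0;
    for (size_t i = 0; i < n; i++) sq += (x[i] - m) * (x[i] - m);
    return newton_sqrt(sq / (double)(n - 1));
}

/* Instrumentation times in ms, Run 1 to Run 5, as listed on the slide. */
const double runs_ms[5] = { 689.33, 691.01, 675.87, 678.78, 689.79 };
```

Calling `mean_of(runs_ms, 5)` gives roughly 684.96 ms, which matches the Avg column.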

17 Kheiron/JVM – Results
- Instrumentation time: sub-millisecond, since all instrumentation is done at load time as in-memory operations on the classfile byte array.

18 What did we learn from our experiments?
- Our approach is feasible, with roughly 1% to 5% runtime overhead when no repairs are active.
- Kheiron is transparent to both the application and the unmodified execution environment.
- More/rich metadata makes things “easier”.
- It is easier to navigate and make changes in managed execution environments than in their un-managed counterparts.
- We can perform and undo our changes on the fly, allowing us to manage the performance impact.
- We use a general approach where we can hook/interpose at method granularity in a variety of execution environments.

19 Unmanaged Execution Environment Metadata
- Not enough information to support type discovery and/or type relationships.
- No APIs for metadata manipulation.
- In the managed world, units of execution are self-describing.

20 Adaptation Guarantees
- Managed execution environments give guarantees about:
  - Valid executables – bytecode verification.
  - Security attributes – security sandboxes and permissions/policies.
- These guarantees are encoded in metadata in the units of execution.
- Any inserted adaptations are bound by the same rules as the original application.
- Un-managed execution environments don’t give the same guarantees.

21 Supported Adaptations
- Instrumentation insertion/removal.
- Component/structure instance-caching.
- Periodic/on-demand consistency checks on cached components or sub-system interfaces.
- Hot component swaps.
- Function-input filters.
- Residual testing.
- Ghost Transactions – (POST for software).
- Selective Emulation (compiled C-binaries).

22 Selective Emulation Using STEM + Dyninst
- STEM – an instruction-level x86 emulator developed by another group at Columbia (Locasto et al.).
- Dyninst – a toolkit for instrumenting running C-applications.

23 How it works
- Running an application in an emulator/sandbox isn’t a new idea.
  - Security benefits
  - Isolation benefits
- High overheads associated with whole-program execution – Valgrind, Bochs, original STEM.
- Idea: Vary, at runtime, the portions of the application which run inside the STEM emulator to manage the performance impact.
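A toy model of the idea on this slide, in C: a per-function flag that can be flipped at runtime decides whether a call runs natively or is handed to an emulator. Everything here is an assumption made for illustration. `region`, `set_emulation`, `call_region` and the two sample functions are hypothetical names, and `run_in_emulator` is only a stand-in that counts calls; it does not emulate instructions the way STEM does.

```c
#include <stddef.h>
#include <string.h>

typedef int (*work_fn)(int);

/* One entry per function whose execution mode can be varied. */
typedef struct {
    const char *name;
    work_fn     fn;
    int         emulate;  /* 1: run via the emulator stand-in, 0: run natively */
} region;

int emulated_calls = 0;

/* Stand-in for an emulator: counts the call, then runs the function. */
static int run_in_emulator(work_fn fn, int arg)
{
    emulated_calls++;
    return fn(arg);
}

/* Flip the mode of one named function; returns -1 if it is not in the table. */
int set_emulation(region *table, size_t n, const char *name, int on)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0) {
            table[i].emulate = on;
            return 0;
        }
    }
    return -1;
}

/* Dispatch according to the current mode of the region. */
int call_region(const region *r, int arg)
{
    return r->emulate ? run_in_emulator(r->fn, arg) : r->fn(arg);
}

/* Two made-up application functions. */
int parse_input(int x) { return x + 1; }
int compute(int x) { return x * x; }
```

Only the functions currently flagged pay the emulation cost, which is the lever the slide describes for managing the performance impact.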

24 Background on STEM
- Original STEM works at the source level:

    void foo() {
        int i = 0;
        // save cpu registers macro
        emulate_init();
        // begin emulation function call
        emulate_begin();
        i = i + 10;
        // end emulation function call
        emulate_end();
        // commit/restore cpu registers macro
        emulate_term();
    }

25 Using un-modified Dyninst 4.2.1

    void foo() {
        int i = 0;
        // save cpu registers macro
        emulate_init();   // Oops…can’t inject macros with Dyninst
        // begin emulation function call
        emulate_begin();  // OK to inject function calls with Dyninst
        i = i + 10;
        // end emulation function call
        emulate_end();    // OK to inject function calls with Dyninst
        // commit/restore cpu registers macro
        emulate_term();   // Oops…can’t inject macros with Dyninst
    }

26 Modified STEM + Dyninst
- Modify the Dyninst trampoline to save CPU state to a memory address (rather than the stack) before the method call.
- Use the Dyninst API to allocate memory areas in the target process address space for a register storage area and a code storage area.
- Save the instructions relocated by the trampoline in the code storage area, to prime STEM’s instruction pipeline.
- Use the Dyninst API to insert calls to our RegisterSave and EmulatorPrime functions, which configure STEM.
- Use the Dyninst API to insert calls to STEM’s emulate_begin().
- Modify STEM to keep track of its stack depth (initially set to 0); emulation ends when a ret/leave instruction is encountered at stack depth 0. The search for emulate_end goes away.
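The stack-depth rule in the last bullet can be illustrated with a small C model. It is a simplification under stated assumptions: instructions are reduced to three hypothetical kinds (`OP_CALL`, `OP_RET`, `OP_OTHER`), with `OP_RET` standing in for the slide's ret/leave case, and `emulation_end_index` is a made-up helper rather than a STEM function. Depth starts at 0, each call raises it, each matching return lowers it, and emulation ends at the first return seen while the depth is 0.

```c
#include <stddef.h>

/* Coarse instruction classes; real x86 decoding is out of scope here. */
typedef enum { OP_OTHER, OP_CALL, OP_RET } op_kind;

/*
 * Returns the index of the instruction at which emulation ends, i.e. the
 * first OP_RET seen at stack depth 0, or n if the stream ends first.
 */
size_t emulation_end_index(const op_kind *ops, size_t n)
{
    int depth = 0;  /* initially 0, as on the slide */
    for (size_t i = 0; i < n; i++) {
        switch (ops[i]) {
        case OP_CALL:
            depth++;            /* entering a nested call */
            break;
        case OP_RET:
            if (depth == 0)
                return i;       /* return of the emulated function itself */
            depth--;            /* return from a nested call */
            break;
        default:
            break;
        }
    }
    return n;
}
```

Because the end of emulation is derived from the depth counter, no explicit end marker has to be placed in the instruction stream, which is why the search for emulate_end is no longer needed.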

27 Conclusions – 6 Answers
- Kheiron can be used to efficiently and transparently retro-fit self-healing mechanisms onto existing systems as a form of adaptation.
- Kheiron uses facilities and characteristics of the unmodified execution environment to adapt running programs.
- Changes can be done/un-done at runtime to manage the performance impact as well as give flexibility in evolving the system.
- Based on metadata, and its verification/validation rules, we can extend existing systems in a general way.
- Guarantees on application properties are a function of the execution environment.
- Kheiron supports a wide range of adaptations.

28 Future Work
- Kheiron can be used for disturbance/fault injection.
- Working on a methodology for benchmarking self-healing systems with respect to the efficacy of their self-healing mechanisms (extensions to work done by Aaron Brown et al.).
- Actively looking for systems to field-test/refine/reject ideas about our proposed benchmarking methodology for my thesis.

29 Questions, Comments, Queries?
Thank you for your time and attention.
Contact: Rean Griffith rg2023@cs.columbia.edu [reanG@us.ibm.com]

