Douglas Thain and Miron Livny Computer Sciences Department University of Wisconsin-Madison


2 Douglas Thain and Miron Livny Computer Sciences Department University of Wisconsin-Madison {thain|miron}@cs.wisc.edu http://www.cs.wisc.edu/condor/bypass Multiple Bypass: Interposition agents for distributed computing

3 www.cs.wisc.edu/condor Overview › Good news and bad news. › Our solution: Bypass › Three simple (but useful) examples › Problems: • Impedance Matching • Composition › Related and Future Work

4 www.cs.wisc.edu/condor Good News! New distributed systems give you access to untold computing resources around the world.

5 www.cs.wisc.edu/condor Bad News Your programs won’t run on them.

6 www.cs.wisc.edu/condor home machine remote machine HELP! core dumped

7 www.cs.wisc.edu/condor Why not? › Interface mismatch: • open() != OpenFile() • open() != super_duper_open() › Resource mismatch: • open("datafile") -> "doesn't exist!" • open("output") -> "no space for you!" • getpwnam("thain") -> "who is that?"

8 www.cs.wisc.edu/condor Just rewrite your programs! › Not possible: • Commercial application • Don’t know how! • Unwilling to spend time/money to achieve uncertain benefits. • N programs * M systems = not scalable

9 www.cs.wisc.edu/condor Solution: Interposition Agent › An agent can solve an interface mismatch by converting the application’s operations into those provided by the available system. › An agent can solve a resource mismatch by sending the application’s operations to be executed elsewhere: split execution.

10 www.cs.wisc.edu/condor Solution to Interface Mismatch Application Agent open() Super-Duper Library super_duper_name_lookup() super_duper_open()
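
As a rough illustration of the interface-mismatch case, the agent is little more than an adapter: it accepts the call the application makes and re-expresses it in the calls the local system offers. The sketch below is only that, a sketch. The names super_duper_name_lookup() and super_duper_open() come from the slide, but their signatures, the sd_handle type, and the stub behavior are all assumptions made up for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the "Super-Duper Library" on the slide.
 * Signatures and behavior are assumptions made for illustration. */
typedef int sd_handle;

static int super_duper_name_lookup(const char *path, sd_handle *out)
{
    if (path == NULL || path[0] == '\0')
        return -1;                      /* name not known to the system */
    *out = (sd_handle)strlen(path);     /* pretend handle */
    return 0;
}

static int super_duper_open(sd_handle handle, int flags)
{
    (void)flags;
    return 100 + handle;                /* pretend descriptor */
}

/* The agent: what the application believes is open(). It converts one
 * POSIX-style operation into the two operations the system provides. */
int agent_open(const char *path, int flags)
{
    sd_handle handle;

    if (super_duper_name_lookup(path, &handle) != 0) {
        errno = ENOENT;                 /* report failure in POSIX terms */
        return -1;
    }
    return super_duper_open(handle, flags);
}
```

The point of the adapter is that failures are also translated: the application sees errno values it already understands rather than the foreign library's own error scheme.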

11 www.cs.wisc.edu/condor Solution to Resource Mismatch Kernel Home Machine Kernel Remote Machine Application Agent Shadow Standard Lib Via RPC

12 www.cs.wisc.edu/condor home machine remote machine Just like home!

13 www.cs.wisc.edu/condor Interposition Agents are an Open Research Topic › Several systems have been built, each with various strengths and weaknesses. › What is the appropriate mechanism? › What are the semantics of stacking? › Interesting problems result when we do “impedance matching”.

14 www.cs.wisc.edu/condor Split Execution is an Open Research Topic › We want to explore many possibilities: • Remote machine has some needed resources, but not all. • Data may be buffered and cached at both the agent and the shadow. • Which procedure calls to trap depends on the application and the services needed. • Some procedure calls could be routed to third parties such as file servers. • …

15 www.cs.wisc.edu/condor Split Execution is Hard › One example of many: Trapping stat() • Different data types: struct stat, struct stat64. Depending on the system, integer elements are 2 to 8 bytes. • Multiple entry points: stat, _stat, __libc_stat • Surprises: #define stat(a,b) _fxstat(VERSION,a,b)

16 www.cs.wisc.edu/condor Is this new? › Several previous systems have built gigantic and ambitious agents to virtualize the entire UNIX interface: • Condor • MOSIX • GLUnix • Legion

17 www.cs.wisc.edu/condor These systems work, but... › They never cover all of the features. • e.g. memory-mapped files › They combine unrelated features. • e.g. checkpointing and remote file access › They are difficult to customize to new classes of applications. • e.g. ORCA needs network access, remote stdio, but not full remote file access.

18 www.cs.wisc.edu/condor Our Vision: › We want... • to create agents in a language independent of their (ugly) implementation. • to create simple agents that are small enough to be understood and debugged. • to compose simple agents together into larger agents that do no more (and no less) than what is needed.

19 www.cs.wisc.edu/condor Overview › Good news and bad news. › Our solution: Bypass › Three simple (but useful) examples › Problems: • Impedance Matching • Composition › Related and Future Work

20 www.cs.wisc.edu/condor Our Solution: Bypass › Bypass takes a specification of a split execution system and produces a matched shadow and agent. › Building only an agent is a subset of this ability. › Bypass hides all of the ugly details of trapping, type conversion, and RPCs.

21 www.cs.wisc.edu/condor Bypass allows you to... ›...split any dynamically-linked application. ›...transparently use heterogeneous systems. ›...trap calls with minimal overhead. ›...control execution paths with plain C. ›...combine small agents in interesting ways.

22 www.cs.wisc.edu/condor Bypass Language › Declare which procedures to trap in C++. › Annotate pointer types with data flow. • Direction: in, out, or in out • Binary data: give an expression yielding the number of bytes to send/receive. › Give two function bodies: • agent_action • shadow_action

23 www.cs.wisc.edu/condor
    ssize_t write( int fd, in "length" const void *data, size_t length )
    agent_action {{
        if( fd<3 ) {
            return bypass_shadow_write(fd,data,length);
        } else {
            return write(fd,data,length);
        }
    }}
    shadow_action {{
        return write(fd,data,length);
    }};
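
The in "length" annotation in the specification above is what lets a generated stub know how many bytes behind the untyped pointer must travel to the shadow. The following is a minimal sketch of that idea in plain C, not Bypass's generated code: the function name pack_write_request and the wire layout (two 4-byte big-endian integers followed by the raw bytes) are assumptions chosen for illustration.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack a write(fd, data, length) request into buf as
 *   [fd: 4 bytes][length: 4 bytes][data: length bytes]
 * with integers in network byte order. This layout is hypothetical.
 * Returns the number of bytes used, or 0 if the request does not fit. */
size_t pack_write_request(unsigned char *buf, size_t cap,
                          int fd, const void *data, size_t length)
{
    uint32_t wire_fd;
    uint32_t wire_len;

    if (length > UINT32_MAX || cap < 8 || length > cap - 8)
        return 0;

    wire_fd = htonl((uint32_t)fd);
    wire_len = htonl((uint32_t)length);

    memcpy(buf, &wire_fd, 4);
    memcpy(buf + 4, &wire_len, 4);
    /* "in" data flows from agent to shadow, so the bytes are sent;
     * the annotation's expression ("length") says how many. */
    memcpy(buf + 8, data, length);
    return 8 + length;
}
```

For an "out" parameter such as the buffer of read(), the direction reverses: only the integers would be sent, and the data bytes would come back in the reply.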

24 www.cs.wisc.edu/condor Agent Action › Any arbitrary C++ code. › When the program invokes write(), the agent_action is executed at the foreign machine, inside the application's process. › Within the agent_action: • write() - Invoke the original write() at the foreign machine. • bypass_shadow_write() - Invoke the shadow_action via RPC.

25 www.cs.wisc.edu/condor Shadow Action › Any arbitrary C++ code. › If the agent decides to invoke the RPC to the shadow, the shadow_action is executed at the home machine. › Within the shadow_action: • write() - Invoke write() at the home machine.

26 www.cs.wisc.edu/condor Using Bypass › Run "bypass" to read the specification and produce C++ source code: % bypass -agent -shadow simple.bypass › The shadow is compiled into a plain executable. › The agent is compiled into a shared library.

27 www.cs.wisc.edu/condor Using Bypass › The dynamic linker is used to force the agent into an executable at run-time: setenv LD_PRELOAD simple_agent.so › Procedure calls are “trapped” merely by putting the agent first in the link list. › This method can be used on any dynamically-linked program: tcsh, netscape, emacs…
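
To make the preload mechanism concrete, here is a minimal hand-written sketch of a trapping library in C. It is a generic illustration of the technique, not the code Bypass generates. It assumes Linux with glibc, where the C library can be opened by the name "libc.so.6"; the counter trapped_write_calls and the helper original_write() are names invented for the example.

```c
#include <dlfcn.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

typedef ssize_t (*write_fn)(int, const void *, size_t);

/* Number of write() calls seen by this agent (illustrative only). */
static unsigned long trapped_write_calls = 0;

/* Find the C library's own write() once and remember it.
 * Assumption: the C library is named "libc.so.6" (Linux, glibc). */
static write_fn original_write(void)
{
    static write_fn fn = NULL;

    if (fn == NULL) {
        void *libc = dlopen("libc.so.6", RTLD_LAZY);
        if (libc != NULL)
            fn = (write_fn)dlsym(libc, "write");
    }
    return fn;
}

/* Same name and signature as the library routine. Because a preloaded
 * library is searched first, the application's calls land here. */
ssize_t write(int fd, const void *data, size_t length)
{
    write_fn fn = original_write();

    trapped_write_calls++;
    if (fn == NULL) {
        errno = ENOSYS;     /* could not locate the original entry point */
        return -1;
    }
    return fn(fd, data, length);    /* forward to the original */
}
```

With GCC on Linux, a file like this (say agent.c, a hypothetical name) would typically be built with cc -shared -fPIC -o simple_agent.so agent.c -ldl and then activated with the setenv LD_PRELOAD line from the slide.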

28 www.cs.wisc.edu/condor Shadow Features › Multiple configurations: • One shadow, one agent • New process per incoming agent • New thread per incoming agent › Tracing of calls actually executed › Authentication: • Trivial: Hostname • Secure: Globus GAA, X.509 identities

29 www.cs.wisc.edu/condor Bypass can be used by Real Users! › Bypass works on unmodified executables. • (Real Users are not willing/able to rewrite/recompile their programs.) › Bypass requires no special privileges. • (Real Users do not have the root password.) › Thus, Bypass allows a Real User to make good use of a remote machine without begging the administrator to configure it to his/her needs.

30 www.cs.wisc.edu/condor Performance › Overhead of trapping a system call is very small: 1-9 µs • The "trapping mechanism" simply interposes a few extra function calls. • Small compared to the expense of a real system call (about 10-70 µs) › Remote procedure calls are, as expected, much slower: about 1 ms under the best conditions.

31 www.cs.wisc.edu/condor Overview › Good news and bad news. › Our solution: Bypass › Three simple (but useful) examples › Problems: • Impedance Matching • Composition › Related and Future Work

32 www.cs.wisc.edu/condor Example One: Remote Console › Trap only read and write, and send operations on standard files back to a single shadow process.
    int read( int fd, out opaque "length" void *data, int length )
    agent_action {{
        if( fd<3 ) {
            return bypass_shadow_read( fd, data, length );
        } else {
            return read(fd,data,length);
        }
    }}
    shadow_action {{
        return read(fd,data,length);
    }};

33 www.cs.wisc.edu/condor Remote Console [Figure: three foreign machines, each running an application with an agent above the standard library and kernel, send standard I/O reads and writes to a single shadow running above the standard library and kernel on the home machine.]

34 www.cs.wisc.edu/condor Example Two: Attach New Filesystem › Trap standard I/O calls and replace them with calls to a user-level filesystem library, such as Globus GASS.
    int open( in string const char *path, int flags, int mode )
    agent_action {{
        return globus_gass_open( path, flags, mode );
    }};
    int close( int fd )
    agent_action {{
        return globus_gass_close( fd );
    }};

35 www.cs.wisc.edu/condor [Figure: the application attempts a plain POSIX open(). The POSIX to GASS agent traps open and close. Globus GASS then makes a variety of system calls (open, read, write, close) to the standard library layer to perform strong authentication, remote file access, caching, etc.]

36 www.cs.wisc.edu/condor Example Three: Instrumentation
    agent_prologue {{
        static int bytes_read=0;
        static int bytes_written=0;
    }};
    int read( int fd, out opaque "length" void *data, int length )
    agent_action {{
        int result;
        result = read( fd, data, length );
        if(result>0) bytes_read += result;
        return result;
    }};
    /** Definition for write is very similar **/

37 www.cs.wisc.edu/condor Example Three: Instrumentation Cont.
    void exit( int status )
    agent_action {{
        printf( "NOTICE: %d bytes read, %d bytes written", bytes_read, bytes_written );
        exit(status);
    }};

38 www.cs.wisc.edu/condor [Figure: layers from top to bottom: Application; Measurement Agent (write, read, exit); Standard Library Layer (write, read).]

39 www.cs.wisc.edu/condor Overview › Good news and bad news. › Our solution: Bypass › Three simple (but useful) examples › Problems: • Impedance Matching • Composition › Related and Future Work

40 www.cs.wisc.edu/condor Problem One: Impedance Matching › An agent may not be able to transform operations from a layer above to a layer below. › Example: Globus GASS provides an equivalent for open() and close(), but not for stat().

41 www.cs.wisc.edu/condor Possible Solutions: › Be honest. • Make stat() fail: “not supported” › Be evasive. • Find some way to serve the request indirectly. › Be dishonest. • Conjure up a complete lie about the file.

42 www.cs.wisc.edu/condor What to do? › We need not come up with a universally applicable solution: we are building small, interchangeable software. › Consider why the application uses stat: • to see if the file exists. • to test permission to access it. • to find out the best block size. • to get its size before creating a buffer. • to report meta-data to the user.

43 www.cs.wisc.edu/condor Should I be honest? › Cause stat() to fail: “not supported” › Occasionally works! › If the application only needs a hint such as block size, it might fall back on a default. › Example: Sometimes a big malloc() calls mmap() to get a new segment. If that fails, fall back on brk(). › Fails in many contexts: “not supported” is often interpreted as “permission denied.”

44 www.cs.wisc.edu/condor Should I be evasive? › Open the file, fstat() it, then close it. › Almost always preserves the correct semantics. › May break application’s assumptions. • stat() is assumed to be quite cheap. • open() through GASS or other storage system may incur huge delays as the entire file is pulled in. › In this example, GASS caches recently used files. This solution is good if the application only stat()s files it intends to read anyway.

45 www.cs.wisc.edu/condor Should I be dishonest? › Return very permissive information: • read/write/execute by anyone • block size is 4 KB • owned by you • file size is 4 GB › Almost always works! › (Not sufficient to implement “ls -l”)

46 www.cs.wisc.edu/condor Why is dishonesty the best policy? › The results from stat are (almost) universally used as hints. • First check permissions, then open. • First check size, then read data. › In both cases, the situation may change, so the application must check for error conditions anyway.

47 www.cs.wisc.edu/condor Problem Two: Composition › Bypass allows agents to be composed together: simply preload them all together. › How do procedure calls bind to procedure definitions? › Previous agent systems have proposed such rules, but do not explore their ramifications.

48 www.cs.wisc.edu/condor Rules of Composition › 1: The process maintains a pointer to an active layer. The topmost layer is the initial active layer. › 2: A call to a trapped procedure resolves to the highest definition found below the active layer. › 3: After resolving, but before invoking, the active layer is lowered to that of the callee. Before returning, the active layer is restored to that of the caller. › 4: Calls to untrapped procedures do not consult or change the active layer.

49 www.cs.wisc.edu/condor Practical Interpretation › A layer is only capable of invoking those below it. › A layer can only be invoked by those above. › Why? • Strict layering creates order from chaos. • Without it, measurement is not possible.

50 www.cs.wisc.edu/condor Example: Measure above GASS › Notice: calls only propagate down › Measurement layer only traps those operations actually attempted by the application. [Figure: layers from top to bottom: Application Layer; Measurement Layer (read, write, exit); POSIX to GASS Layer (open, close); Standard Library Layer (open, read, write, close).]

51 www.cs.wisc.edu/condor Example: GASS above Measure › Again: calls only propagate down › Measurement layer catches the resources consumed by both layers together. [Figure: layers from top to bottom: Application Layer; POSIX to GASS Layer (open, close); Measurement Layer (read, write, exit); Standard Library Layer (open, read, write, close, exit).]
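
The two orderings above can be traced with a small simulation of composition rules 1 to 3. This is a toy model, not part of Bypass: the layer table, the call() helper, and the assumption that the GASS layer's open performs one read underneath itself are all invented for illustration. Rule 4 (untrapped procedures) is not modeled.

```c
#include <stddef.h>

enum proc { P_OPEN, P_READ, P_WRITE, P_CLOSE, P_COUNT };

typedef void (*action_fn)(void);

struct layer {
    const char *name;
    action_fn action[P_COUNT];  /* NULL means "not trapped by this layer" */
};

static struct layer stack[4];   /* index 0 is the topmost layer */
static int n_layers = 0;
static int active = 0;          /* rule 1: pointer to the active layer */
static int measured_reads = 0;

/* Rules 2 and 3: bind to the highest definition below the active layer,
 * lower the active layer for the duration of the call, then restore it. */
static void call(enum proc p)
{
    int caller = active;

    for (int i = active + 1; i < n_layers; i++) {
        if (stack[i].action[p] != NULL) {
            active = i;
            stack[i].action[p]();
            active = caller;
            return;
        }
    }
}

static void lib_noop(void) { }
static void measure_read(void) { measured_reads++; call(P_READ); }
/* Assumption: opening through GASS reads some data underneath itself. */
static void gass_open(void) { call(P_OPEN); call(P_READ); }
static void gass_close(void) { call(P_CLOSE); }

/* The application does open, read, close; returns reads seen by Measurement. */
int run_application(int measure_above_gass)
{
    struct layer app = { "Application", { NULL, NULL, NULL, NULL } };
    struct layer measure = { "Measurement", { NULL, measure_read, NULL, NULL } };
    struct layer gass = { "POSIX to GASS", { gass_open, NULL, NULL, gass_close } };
    struct layer lib = { "Standard Library",
                         { lib_noop, lib_noop, lib_noop, lib_noop } };

    stack[0] = app;
    stack[1] = measure_above_gass ? measure : gass;
    stack[2] = measure_above_gass ? gass : measure;
    stack[3] = lib;
    n_layers = 4;
    active = 0;
    measured_reads = 0;

    call(P_OPEN);
    call(P_READ);
    call(P_CLOSE);
    return measured_reads;
}
```

With Measurement on top, only the application's own read is counted. With GASS on top, the read issued inside the GASS open is counted as well, which is the non-commuting behavior the slides describe.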

52 www.cs.wisc.edu/condor Example: Third Party Function › printf is a third party function: it is not trapped by a layer. › It contains a write, so where does it bind? › It binds to the layer below that of the caller. Application Layer Standard Library Layer printf Agent Layer write

53 www.cs.wisc.edu/condor Others Have Chosen Different Rules › Mediating Connectors: • Layer may invoke either the layer below, or start again at the topmost. • Disjoint layers may commute. › We disagree: • If you can re-invoke at top, it is not possible to build a sensible measuring agent. • Careful with “disjoint”: GASS and measurement layers appear to be disjoint, but they do not commute.

54 www.cs.wisc.edu/condor A Layered Remote Execution System Kernel Home Machine Kernel Remote Machine Application Measurement Shadow Standard Lib Via RPC POSIX to GASS Remote I/O Measurement

55 www.cs.wisc.edu/condor Overview › Good news and bad news. › Our solution: Bypass › Three simple (but useful) examples › Problems: • Impedance Matching • Composition › Related and Future Work

56 www.cs.wisc.edu/condor Related Work › “Classic” RPC and XDR: • Define standard integer sizes, endianness, etc. • Start by defining an external protocol, then produce a programming interface, which is not always convenient: struct read_results * read_1( int fd, int length );

57 www.cs.wisc.edu/condor Related Work › Bypass: • We are stuck with existing interfaces, so annotate them to produce a protocol: int read( int fd, out opaque "length" void *data, int length ); • Do “best effort” conversion to/from external data format: off_t is 4 bytes on some platforms, 8 bytes on others. A conversion might fail! • Define canonical values for source-level symbols: O_CREAT has different values on Linux and Solaris!
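
Both conversions mentioned above can be sketched in a few lines of C. The canonical wire values below are arbitrary numbers picked for the example, and the function names are hypothetical; the only real requirement is that agent and shadow agree on the table. The narrowing helper shows how a "best effort" conversion can report failure instead of silently truncating.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical canonical (wire) values for open() flags. */
enum { WIRE_WRONLY = 0x10, WIRE_RDWR = 0x20 };

struct flag_pair { int local; int wire; };

static const struct flag_pair flag_table[] = {
    { O_CREAT,  0x01 },
    { O_TRUNC,  0x02 },
    { O_APPEND, 0x04 },
    { O_EXCL,   0x08 },
};
#define N_FLAGS (sizeof flag_table / sizeof flag_table[0])

/* Translate this platform's flag bits into the canonical ones. */
int flags_to_wire(int local)
{
    int wire = 0;

    for (size_t i = 0; i < N_FLAGS; i++)
        if (local & flag_table[i].local)
            wire |= flag_table[i].wire;

    /* The access mode is a small enumeration, not a bit mask. */
    switch (local & O_ACCMODE) {
    case O_WRONLY: wire |= WIRE_WRONLY; break;
    case O_RDWR:   wire |= WIRE_RDWR;   break;
    default:       break;               /* O_RDONLY: no bit set */
    }
    return wire;
}

/* Translate canonical flag bits back into this platform's values. */
int flags_from_wire(int wire)
{
    int local = 0;

    for (size_t i = 0; i < N_FLAGS; i++)
        if (wire & flag_table[i].wire)
            local |= flag_table[i].local;

    if (wire & WIRE_WRONLY)
        local |= O_WRONLY;
    else if (wire & WIRE_RDWR)
        local |= O_RDWR;
    else
        local |= O_RDONLY;
    return local;
}

/* Best-effort narrowing of an 8-byte offset to a 4-byte one.
 * Returns 0 on success, -1 if the value does not fit. */
int offset_to_32(int64_t wide, int32_t *narrow)
{
    if (wide < INT32_MIN || wide > INT32_MAX)
        return -1;
    *narrow = (int32_t)wide;
    return 0;
}
```

Because each side translates through its own copy of the table, the numeric value of O_CREAT on one platform never reaches the other.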

58 www.cs.wisc.edu/condor Related Work › Hunt and Brubacher, “Detours” • Trap library calls on NT using binary rewriting; this can be applied to any executable. • Make original procedure available through special “trampoline” call. • Bypass leaves the original entry point intact, so subroutines need not be re-written to use the trampoline.

59 www.cs.wisc.edu/condor Related Work › Alexandrov, et al., “UFO” • Use a kernel-level facility to trap all of a process’s system calls and translate some of them into WWW operations. • The kernel mechanism is secure and can be applied to any process. • But… it has a high (7x) trapping overhead and cannot be applied to procedures that are not true system calls.

60 www.cs.wisc.edu/condor Related Work › Bypass: • Trapping overhead is very small and can be performed on procedures that are not necessarily system calls. • But… can only be applied to dynamically-linked executables, and is not suitable as a security mechanism.

61 www.cs.wisc.edu/condor Related/Future Work › A complete remote execution system needs both methods: • The program owner provides a lightweight mechanism for creating a correct split execution environment. • The machine owner provides a heavyweight mechanism to defend itself from a (possibly) malicious program.

62 www.cs.wisc.edu/condor Complete System Kernel Home Machine Kernel Remote Machine Shadow Standard Lib Application Agent Standard Lib Via RPC Sandbox

63 www.cs.wisc.edu/condor Our Contributions › A language for writing agents • Independent of implementation mechanism. • Correct mechanism depends on purpose. › Implicit binding: • Agents name procedures, not other agents. • Original procedure entry point preserved. › Composition rules • Strict layering makes order from chaos.

64 www.cs.wisc.edu/condor Future Work › Interaction of sandbox and utility agents • A utility agent modifies the application’s operations to make them acceptable to the sandbox. • Should they negotiate on permitted operations? › Signal handling • How to specify? (Many relevant functions) • Flow of control is backwards › Other implementations • Binary rewriting. • Build specialized linker that understands multiple definitions of symbols.

65 www.cs.wisc.edu/condor Further Questions? › Douglas Thain • thain@cs.wisc.edu › Miron Livny • miron@cs.wisc.edu › Bypass Web Page • http://www.cs.wisc.edu/condor/bypass › Questions now?

