PSUCS322 HM 1 Languages and Compiler Design II Runtime System Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU Spring.


1 PSUCS322 HM 1 Languages and Compiler Design II Runtime System Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU Spring 2010 rev.: 4/27/2010

2 PSUCS322 HM 2 Agenda: Runtime Storage Organization; Static Storage; Runtime Stack; System Heap; Functions and Activations; Activation Records; Function Call; Register Saving; Scopes; Function Parameters

3 PSUCS322 HM 3 Runtime Storage Organization Multiple memory uses on a computer: OS memory needs (e.g. ½ for Windows); program code; user program data; function invocations; temporaries; I/O buffers; etc. Different requirements, caused by differences in lifetime, size, and access rights. Result: static space, stack, and heap. [Figure: memory layout with the regions Stack, Heap, Static Data, Code, Reserved Space]

4 PSUCS322 HM 4 Static Storage Space for static data objects is allocated in a fixed location for the whole lifetime of a program. Possible when the sizes of the objects are known at compile time. Static objects can be bound to absolute addresses; not necessarily desirable. Static allocation requires no runtime management, hence simple to handle. Space is wasted if objects are not needed for the complete program lifetime. Mostly used for global variables, code, and constants. Early Fortran and Cobol were designed to use only static storage. Such languages support neither recursive functions nor dynamic arrays.

5 PSUCS322 HM 5 Runtime Stack Stack needed for data that are pushed and popped dynamically, following a last-in, first-out pattern. Space is needed at the moment of a function call and freed at the moment of return. Allocation and de-allocation can be implemented cheaply by adjusting the stack pointer, though “old” data remain in memory. More efficient use of space than static allocation. Most newer imperative languages use stack storage for data associated with activations; this became popular with Algol60.

6 PSUCS322 HM 6 System Heap Space for heap data objects can be allocated and freed at any time during program execution. The heap is the most flexible, most expensive, and most dangerous method of storage allocation (e.g. memory leaks). Typical heap operations are: Allocation: acquire free storage for the program; typically triggered by explicit or implicit user commands, e.g. (C) struct node *root = (struct node *) malloc( sizeof( struct node ) ); (Java) TreeNode root = new TreeNode( val ); De-allocation: reclaim (AKA free) no-longer-needed storage for reuse; languages such as C and Pascal provide operations for reclaiming storage, e.g. free( root ). Compaction: construct larger blocks of free storage from smaller pieces; can be triggered by a failed allocation request, and is often performed as part of garbage collection. Lisp, ML, and some interpreted languages need the heap for activation records.
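The allocation and de-allocation steps above can be sketched in C as follows. This is only an illustration: the `node` layout and the helper names `node_new`, `tree_free`, and `heap_demo` are assumptions, not part of the original slides.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical binary-tree node; the field names are assumptions. */
struct node {
    int val;
    struct node *left, *right;
};

/* Allocation: acquire heap storage for one node; returns NULL on failure. */
struct node *node_new(int val)
{
    struct node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->val = val;
        n->left = n->right = NULL;
    }
    return n;
}

/* De-allocation: free a whole tree, children first; returns nodes released. */
int tree_free(struct node *root)
{
    if (root == NULL)
        return 0;
    int freed = tree_free(root->left) + tree_free(root->right);
    free(root);
    return freed + 1;
}

/* Build a three-node tree, then release it; returns the number of nodes freed. */
int heap_demo(void)
{
    struct node *root = node_new(1);
    assert(root != NULL);                /* sketch only: no recovery path */
    root->left = node_new(2);
    root->right = node_new(3);
    assert(root->left != NULL && root->right != NULL);
    return tree_free(root);
}
```

Forgetting the `tree_free` call would leak all three nodes, which is the kind of danger the slide refers to.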

7 PSUCS322 HM 7 Functions and Activations Functions, procedures (methods), and classes constitute a form of programming abstraction; the focus here is on functions, not classes. They allow a program to be divided into named components with hidden internals, returning a result to the place of invocation. Permits code re-use. Each function invocation at runtime is called an activation. Each activation has its own data: formals => actuals, and locals. Storage for these data is called an Activation Record (AR), AKA Stack Frame. Many activations of the same function can exist at one moment in time, due to recursive calls. Data associated with one activation are independent from all others. Normally, an activation record is created when a function is invoked and is destroyed when the function returns.
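A small C sketch of the point that recursive calls create several live activations, each with its own data. The function `sum_to` and the counters `live` and `max_live` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>

int live = 0;       /* activations of sum_to currently on the stack */
int max_live = 0;   /* largest value `live` has reached */

/* Sums 0..n recursively; every call is a separate activation. */
int sum_to(int n)
{
    assert(n >= 0);
    int local = n;                  /* private to this activation */
    live++;
    if (live > max_live)
        max_live = live;
    /* Inner activations have their own `local` and do not disturb this one. */
    int rest = (n > 0) ? sum_to(n - 1) : 0;
    live--;                         /* this activation is about to end */
    return local + rest;
}
```

Calling `sum_to(4)` creates five activations (for n = 4, 3, 2, 1, 0) that are all live at the deepest point of the recursion, and `live` drops back to 0 once they have all returned.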

8 PSUCS322 HM 8 Activation Records An activation record typically contains the following entries: Return address: the address of the instruction after the call. Formal parameters: sequence of parameters passed to the function by the caller (at the call: actuals; inside the function: formals; actuals are bound to formals). Return value: a place for storing the function return value. Local data: storage for local variables. Access link: AKA static link, a pointer to the next activation record in the chain for accessing non-local data, e.g. the lexically enclosing function’s AR, as in Algol, Ada, Pascal, PL/I. Control link: AKA dynamic link, a pointer to the caller’s activation record. Saved machine status: holds info about the machine (i.e. registers’ values) just before the function is called. Temporaries: storage for compiler-allocated temporary objects (e.g. dynamic arrays).

9 PSUCS322 HM 9 Where Are Activation Records Stored? Static Allocation: the number of ARs and the size of each AR must be known at compile time. No runtime management needed. Multiple invocations of the same function reuse the same AR. Can’t handle recursive functions and dynamic data. Only early Fortran uses this approach. Stack Allocation: ARs are pushed on and popped off the stack. Works for block-structured languages: a function must return before its own caller returns. Very efficient, hence the default choice of most programming languages. Can’t handle “first-class” functions. Heap Allocation: an AR can be created and destroyed at any time. Needed for implementing functional languages. High overhead.

10 PSUCS322 HM 10 Stack-Based AR Allocation In most languages, if function f is declared inside function g, then f can only be invoked within the scope of g. This nesting property of function calls makes it possible to allocate ARs on a stack. It guarantees that non-local variables exist when needed. Stack implementation is very efficient. [Figure: AR layout along the direction of stack growth: return value, actual 1 to actual N, control link, access link, saved reg 1 to saved reg N, local 1 to local N, temp 1 to temp N, with sp and bp marked; a second AR with the fields return value, actuals, control link, access link, locals, and saved registers is also shown]

11 PSUCS322 HM 11 Function Call When a Method is activated (= a Function is Called): The Caller: allocates [part of] an activation record for the callee; evaluates the actual parameters and stores them into the AR; stores a return address [or return slot] into the AR; if needed, saves (some) register values into the AR; stores the current AR pointer (AKA bp, for base pointer; or fp, for frame pointer) and updates it to point to the callee’s AR (but to which place in the AR?); transfers control to the callee. The Callee: saves (some) register values and other machine status info; allocates and initializes its local data and begins execution; allocates temps, if needed.

12 PSUCS322 HM 12 Function Call, Cont’d Upon Returning The Callee: places the return value at a place the caller can access; restores the caller’s AR pointer and other registers, using saved info in the stack marker; returns control to the caller. The Caller: can copy the returned value into its own AR; on some architectures, frees the space for actuals.

13 PSUCS322 HM 13 Register Saving The contents of live registers must be saved in memory before the registers can be used for a new purpose in the callee. The register-saving task can be done by the caller alone, by the callee alone, or split between the two. Caller Saving: the caller needs to save the registers that hold live data, regardless of whether the callee is actually going to use any of these registers. This may end up being unnecessary work. Callee Saving: the callee needs to save the registers that it is going to use, regardless of whether they contain any live contents. It may also end up doing unnecessary work. Split Saving: designate a set of registers as caller-save registers, and the rest as callee-save registers. The caller may use any callee-save register without saving, while the callee may use any caller-save register without saving.

14 PSUCS322 HM 14 Scopes Def: Scope is a region of program text over which a name is known; e.g. a variable binding is effective. Scopes are typically introduced by function declarations as well as program blocks, like { … } blocks in C++, Java.

    #include <stdio.h>

    int main(void)
    { //B0
        int a = 0, b = 0;
        { //B1
            int b = 1;
            { //B2
                int a = 2;
                printf("%d %d\n", a, b);   // prints 2 1
            } //end B2
            { //B3
                int b = 3;
                printf("%d %d\n", a, b);   // prints 0 3
            } //end B3
            printf("%d %d\n", a, b);       // prints 0 1
        } //end B1
        printf("%d %d\n", a, b);           // prints 0 0
        return 0;
    } //end B0

    Declaration    Scope Name
    int a = 0      B0, B1, B3
    int b = 0      B0
    int b = 1      B1, B2
    int a = 2      B2
    int b = 3      B3

15 PSUCS322 HM 15 Lexical Scope Rule Under lexical scope rules, variables are identified by looking backwards through the program text to find the nearest enclosing declaration. Nearly all programming languages use lexical scope. For the program below, when f is executed, it needs to look up a value for a, which is a free variable of f. The nearest enclosing declaration in this case is the global declaration. At the time f executes, the global a has the value 5, so f returns 5+10, and 15 is printed by the program.

    program main;
      var a : int := 0;
      function f( b : int ) : int is
        return a + b;
      end f;
      function g( c : int ) : int is
        var a : int := 1;
        a := a + 2;
        return f( c );
      end g;
    begin
      a := 5;
      print( g( 10 ) );
    end main;
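The same program can be transcribed into C, which is also lexically scoped. This is a sketch for illustration; the wrapper name `run_main` is an assumption that stands in for the body of the main program.

```c
#include <assert.h>

int a = 0;                 /* the global `a` of program main */

/* `a` is free in f; under lexical scoping it refers to the global. */
int f(int b)
{
    return a + b;
}

int g(int c)
{
    int a = 1;             /* local `a`: hides the global only inside g */
    a = a + 2;
    assert(a == 3);        /* g's own `a` changed, the global did not */
    return f(c);           /* f still sees the global `a` */
}

/* Mirrors the body of main: set the global, then evaluate g(10). */
int run_main(void)
{
    a = 5;
    return g(10);          /* 5 + 10 = 15 */
}
```

Under dynamic scoping, f would instead find g's local a (value 3) and the result would be 13.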

16 PSUCS322 HM 16 Nested Scopes Environments Associated with a Function: Definition Environment: the environment in which the function is defined. Needed if lexical scope is used. Invocation Environment: the environment in which the function is invoked. Needed if dynamic scope is used. Passing Environment: the environment in which the function is passed as a parameter. No direct use.

17 PSUCS322 HM 17 Needed Environment Set up an access link in the AR to point to the AR of the function’s def-env or invoc-env: either another AR on the stack or the global env. For static-scoped languages: the access link should point to the function’s def-env, which can be derived from the caller’s access link (see next slide). In the case of a nest of scopes, a chain of access links can be followed to reach every enclosing environment of an inner function. For dynamic-scoped languages: the access link should point to the function’s invoc-env, which is simply the caller’s AR(!). Since the control link already points to the caller’s AR, there is no need to set up a separate access link.

18 PSUCS322 HM 18 Setting Up Access Links Assume f calls g, and f and g are defined at scope-levels m and n, respectively. Further assume that f’s access link is already set up. If m > n: for g to be visible to f, g’s definition environment must be one of the scopes that enclose f. Traverse f’s access links; the AR at scope-level n − 1 is the target for g’s access link. If m = n: f and g are defined in the same scope. Simply use f’s access link as g’s access link. If m < n: f must be the definition environment of g. Let g’s access link point to f’s AR.
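The three cases can be folded into one rule: starting at the caller's AR, follow m − n + 1 access links. The C sketch below simulates this with a stripped-down AR. The `struct ar` layout, the function names, and the level numbering (main program at level 0, functions declared directly inside it at level 1) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified activation record: only the fields needed for access links. */
struct ar {
    int level;               /* scope level at which the function is defined */
    struct ar *access_link;  /* AR of the function's definition environment */
};

/*
 * Access link for a callee defined at level n, called from `caller`
 * (a function defined at level m = caller->level).
 *   m <  n (n == m + 1): 0 hops, the caller's own AR
 *   m == n             : 1 hop, the caller's access link
 *   m >  n             : m - n + 1 hops, ending at the AR of level n - 1
 */
struct ar *callee_access_link(struct ar *caller, int n)
{
    int hops = caller->level - n + 1;
    assert(hops >= 0);       /* otherwise the callee is not visible */
    struct ar *target = caller;
    while (hops-- > 0) {
        assert(target != NULL);
        target = target->access_link;
    }
    return target;
}

/* Replays main -> count -> do_intlist -> check_int; returns 1 if links match. */
int access_link_demo(void)
{
    struct ar main_ar = { 0, NULL };
    struct ar count_ar = { 1, callee_access_link(&main_ar, 1) };
    struct ar do_intlist_ar = { 2, callee_access_link(&count_ar, 2) };
    struct ar check_int_ar = { 2, callee_access_link(&do_intlist_ar, 2) };
    /* Hypothetical extra call: check_int calling a level-1 function. */
    struct ar *level1_link = callee_access_link(&check_int_ar, 1);
    return count_ar.access_link == &main_ar
        && do_intlist_ar.access_link == &count_ar
        && check_int_ar.access_link == &count_ar
        && level1_link == &main_ar;
}
```

The demo covers all three cases: m < n for the calls into count and do_intlist, m = n for do_intlist calling check_int, and m > n for the hypothetical call out of check_int.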

19 PSUCS322 HM 19 Sample: Scope

    program main;
    | function count( i : integer; a: Intlist): integer;
    | | var sum: integer := 0;
    | | procedure check_int( j : integer );
    | | | begin -- check_int
    | | | if j = i then sum := sum + 1; end if;
    | | | end check_int;
    | | procedure do_intlist( a: Intlist );
    | | | begin -- do_intlist
    | | | while (a) loop
    | | | check_int(a^.x); a := a^.next;
    | | | end loop;
    | | | end do_intlist;
    | | begin -- count
    | | | do_intlist( a ); count := sum;
    | | end count;
    | procedure print_int( i: integer);
    | | begin -- print_int
    | | | writeln( i );
    | | end print_int;
    | begin -- main
    | var a: Intlist;
    | print_int( count( 1, a ) );
    end main;

20 PSUCS322 HM 20 Execution Scenario main calls count; count calls do_intlist; do_intlist calls check_int; · · · main calls print_int. (A snapshot of ARs on the stack is shown on the right.) When check_int is passed as a parameter to do_intlist, its access link can be computed, since it is defined in this scope.

21 PSUCS322 HM 21 Function Parameters – e.g. Pascal

    program main;
      procedure do_intlist(a: Intlist; procedure f(i: integer));
      begin ... f(a^.x); ... end;
      function count(i: integer; a: Intlist): integer;
        var sum: integer := 0;
        procedure check_int(j: integer);
        begin if j = i then sum := sum + 1; end;
      begin do_intlist(a, check_int); count := sum; end;
    begin
      var a: Intlist;
      print_int(count(2,a));
    end.

Here check_int is passed as a parameter to do_intlist, and gets invoked there; it references two non-local variables, i and sum, which are not global variables either. Cannot be directly expressed in C or C++.

22 PSUCS322 HM 22 Functions as Parameters The caller-callee relationship discussed previously does not hold for languages with nested procedural scopes, like Pascal, Algol, Ada: check_int’s definition environment may have nothing to do with do_intlist. How can we set up the access link for check_int’s AR in this case? Solution: the routine that passes f as a parameter to g has information about f’s definition and can set up the access link for f. It can then pass f’s access link together with f. Effectively, when passing a function as a parameter, we should pass a closure (function pointer plus its environment) instead of just a function pointer.
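C has no nested functions, but the closure idea can be simulated by passing a code pointer together with an explicit environment pointer. The sketch below re-creates the count / do_intlist / check_int example this way. All type names (`intlist`, `closure`, `count_env`) and the helper `closure_demo` are assumptions for illustration, and the environment here contains only the two variables check_int needs rather than a full AR.

```c
#include <assert.h>
#include <stddef.h>

/* Linked list of ints, standing in for the slides' Intlist. */
struct intlist {
    int x;
    struct intlist *next;
};

/* A closure: code pointer plus a pointer to the environment it needs. */
struct closure {
    void (*code)(void *env, int j);
    void *env;
};

/* The non-local variables of check_int, i.e. the relevant part of count's AR. */
struct count_env {
    int i;
    int sum;
};

/* check_int reaches i and sum through its environment, not through globals. */
static void check_int(void *env, int j)
{
    struct count_env *e = env;
    if (j == e->i)
        e->sum = e->sum + 1;
}

/* do_intlist knows nothing about count; it only invokes the closure. */
static void do_intlist(const struct intlist *a, struct closure f)
{
    assert(f.code != NULL);
    while (a != NULL) {
        f.code(f.env, a->x);
        a = a->next;
    }
}

/* count builds the closure from check_int and its own locals. */
int count(int i, const struct intlist *a)
{
    struct count_env env = { i, 0 };         /* lives in count's activation */
    struct closure c = { check_int, &env };
    do_intlist(a, c);
    return env.sum;
}

/* Counts the occurrences of 2 in the list 2 -> 5 -> 2. */
int closure_demo(void)
{
    struct intlist n3 = { 2, NULL }, n2 = { 5, &n3 }, n1 = { 2, &n2 };
    return count(2, &n1);
}
```

The environment pointer plays the role of the access link: it is fixed by count, the routine that passes the function, and not by do_intlist, the routine that calls it.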

23 PSUCS322 HM 23 Passing Global Functions Global functions have a unique feature: the definition environment is the global scope. There is no need to set up an access link for a global function’s invocation, since any non-local variable of such a function must be a global variable, which can be accessed directly. Example: in C, all functions are defined at global scope, hence there is no need to use closures to handle function parameters. Side Note: gcc extends C with nested function definitions, but it does not use full closures to handle function parameters; calling a nested function through its address after the enclosing function has returned is not guaranteed to work(!).

24 PSUCS322 HM 24 Functions as Return Values Going one step further, suppose that function values are treated like other values, e.g., they can be returned as function results or stored into variables (the following example is in ML):

    type counter = int list -> int
    fun make_counter( i : int ) : counter =
      let
        fun count( a : int list ) =
          let
            val sum = ref 0
            fun check_int( j : int ) = if j = i then sum := !sum + 1 else ()
          in
            do_intlist( a, check_int ); !sum
          end
      in
        count
      end
    val g : counter = make_counter(2);
    val c : int list = ...;
    val c2 : int = g(c);

A scenario: main calls make_counter, which returns count; main calls count; count calls do_intlist; do_intlist calls check_int.

25 PSUCS322 HM 25 Functions as Return Values The scenario: main calls make_counter, which returns count; main calls count; count calls do_intlist; do_intlist calls check_int. Problems: check_int requires the value of the non-local variable i, which is the parameter of make_counter, but the activation of make_counter is no longer live when check_int is called! If i is stored in the activation record for make_counter and that activation record is stack-allocated, it will be gone at the point where check_int needs it! Solution: store activation records in the heap. Special Case: again, if a global function is returned as a return value, there is no problem executing it later, since all its non-local variables are global variables.
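A C sketch of the heap-based solution: the value of i is stored in a heap-allocated closure, so it survives after make_counter returns. This is a simplification under stated assumptions. The `struct counter` type and the names `count_code` and `counter_demo` are hypothetical, the list is an array, and the inner check_int / do_intlist pair is flattened into a loop.

```c
#include <assert.h>
#include <stdlib.h>

/* Heap-allocated closure for the function returned by make_counter. */
struct counter {
    int (*code)(const struct counter *self, const int *a, size_t len);
    int i;   /* captured parameter of make_counter */
};

/* Body of count: how many elements of a[0..len) equal the captured i. */
static int count_code(const struct counter *self, const int *a, size_t len)
{
    int sum = 0;
    for (size_t k = 0; k < len; k++)
        if (a[k] == self->i)
            sum = sum + 1;
    return sum;
}

/* Returns a closure whose environment is on the heap; the caller frees it. */
struct counter *make_counter(int i)
{
    struct counter *c = malloc(sizeof *c);
    if (c != NULL) {
        c->code = count_code;
        c->i = i;
    }
    return c;
}

/* Mirrors the ML fragment: g = make_counter(2); c2 = g(c). */
int counter_demo(void)
{
    const int c_list[] = { 2, 7, 2, 2 };
    struct counter *g = make_counter(2);  /* make_counter has returned here */
    assert(g != NULL);                    /* sketch only: no recovery path */
    int c2 = g->code(g, c_list, 4);       /* i is still reachable through g */
    free(g);
    return c2;
}
```

Had i been kept only in make_counter's stack frame, the call through g would read storage that no longer belongs to that activation. A language such as ML reclaims these heap records automatically; in this sketch the caller has to call free.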

26 PSUCS322 HM 26 Handling Program Blocks Nested program blocks can have their own local variables. E.g. if (i>j) { int x;... } else { double y[100];... } Where should these variables be stored? Solution 1: consider a block as an in-line function without parameters, and create an AR for it. Advantage: efficient use of storage. Downside: high runtime overhead. Solution 2: use ARs only for true functions. If there are blocks within a function, statically collect storage requirement information from each block; then compute the maximum amount of storage needed for handling all blocks, and allocate that in the function’s AR. Advantage: no runtime overhead. Downside: may waste storage space.
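The "maximum over all blocks" computation in Solution 2 can be pictured with a C union: the two branches are never live at the same time, so their variables may share one region of the AR. This is only an analogy; the names `block_storage` and `block_area_size` are hypothetical, and a real compiler computes the overlay internally.

```c
#include <assert.h>
#include <stddef.h>

/* Storage for the two sibling blocks, overlaid in one region. */
union block_storage {
    int x;            /* then-branch: int x */
    double y[100];    /* else-branch: double y[100] */
};

/* Space the function's AR would reserve for its blocks under Solution 2. */
size_t block_area_size(void)
{
    /* The region must at least hold the largest block. */
    assert(sizeof(union block_storage) >= sizeof(double[100]));
    return sizeof(union block_storage);
}
```

The reserved region is as large as the biggest block, not the sum of all blocks, but it is still allocated in full even when only the branch with the single int runs, which is the wasted space the slide mentions.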

