3-1 © 2004, D.A. Watt, University of Glasgow 3 Variables and Storage  A simple storage model.  Simple and composite variables.  Copy semantics vs reference.

3-2 Variables and Storage (cont’d)  Commands.  Expressions with side effects.  Implementation notes.

3-3 An abstract model of storage (1)  In functional/logic PLs (as in mathematics), a “variable” stands for a fixed but unknown value.  In imperative/OO PLs, a variable is a container for a value, which may be inspected and updated as often as desired.  Such a variable can be used to model a real-world object whose state changes over time.

3-4 An abstract model of storage (2)  To understand such variables, assume a simple abstract model of storage: A store is a collection of storage cells. Each storage cell has a unique address. Each storage cell is either allocated or unallocated. Each allocated storage cell contains either a simple value or undefined. [Figure: a store whose allocated cells contain true, 3.14, 7, ‘X’, and undefined (shown as ?), interspersed with unallocated cells.]

3-5 Simple vs composite variables  A simple value is one that can be stored in a single storage cell (typically a primitive value or a pointer).  A simple variable occupies a single allocated storage cell.  A composite variable occupies a group of allocated storage cells.

3-6 Simple variables  When a simple variable is declared, a storage cell is allocated for it.  Assignment to the simple variable updates that storage cell.  At the end of the block, that storage cell is deallocated.  Animation (Ada): declare n: Integer; begin n := 0; n := n+1; end; [Figure: the cell for n contains ? (undefined) after the declaration, 0 after the first assignment, and 1 after the second.]
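
The abstract store and the life of a simple variable can be sketched in Java. This is only an illustrative model under stated assumptions: the class StoreModel and its methods allocate, deallocate, update, and inspect are hypothetical names, not part of the slides, and null is used to stand for "undefined".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the abstract store; all names are illustrative only.
class StoreModel {
    // An allocated cell maps its address to a value; null stands for "undefined".
    // An address that is absent from the map is unallocated.
    private final Map<Integer, Object> cells = new HashMap<>();
    private int nextAddress = 0;

    // Allocate a fresh cell; its content is initially undefined.
    int allocate() {
        int address = nextAddress++;
        cells.put(address, null);
        return address;
    }

    void deallocate(int address) { cells.remove(address); }

    boolean isAllocated(int address) { return cells.containsKey(address); }

    // Update the content of an allocated cell.
    void update(int address, Object value) {
        if (!isAllocated(address)) throw new IllegalStateException("unallocated cell");
        cells.put(address, value);
    }

    // Inspect an allocated cell; null means the cell is still undefined.
    Object inspect(int address) {
        if (!isAllocated(address)) throw new IllegalStateException("unallocated cell");
        return cells.get(address);
    }

    public static void main(String[] args) {
        StoreModel store = new StoreModel();
        int n = store.allocate();                         // declare n: Integer;
        store.update(n, 0);                               // n := 0;
        store.update(n, (Integer) store.inspect(n) + 1);  // n := n+1;
        System.out.println(store.inspect(n));
        store.deallocate(n);                              // end of the block
    }
}
```

The main method follows the Ada animation step by step: one cell is allocated for n, updated twice, and deallocated at the end.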

3-7 Composite variables (1)  A variable of a composite type has the same structure as a value of that type. For instance: A record variable is a tuple of component variables. An array variable is a mapping from an index range to a group of component variables.  The component variables can be inspected and updated either totally or selectively.

3-8 Composite variables (2)  Animation (Ada): declare type Date is record y: Year_Number; m: Month; d: Day_Number; end record; xmas, today: Date; begin xmas.d := 25; xmas.m := dec; xmas.y := 2004; today := xmas; end; [Figure: xmas and today each occupy three cells, all initially undefined (?). The three selective updates fill in xmas one component at a time, giving (dec, 25, 2004); the total update then copies all three components into today.]

3-9 Total vs selective update  Total update of a composite variable means updating it with a new (composite) value in a single step: today := xmas;  Selective update of a composite variable means updating a single component: today.y := 2004;

3-10 Static vs dynamic vs flexible arrays  A static array is an array variable whose index range is fixed by the program code.  A dynamic array is an array variable whose index range is fixed at the time when the array variable is created. In Ada, the definition of an array type must fix the index type, but need not fix the index range. Only when an array variable is created must its index range be fixed. Ada arrays are therefore dynamic.  A flexible array is an array variable whose index range is not fixed at all, but may change whenever a new array value is assigned.

3-11 Example: C static arrays  Array variable declarations: float v1[] = {2.0, 3.0, 5.0, 7.0}; /* index range is {0, …, 3} */ float v2[10]; /* index range is {0, …, 9} */  A C array doesn’t know its own length, so the length must be passed separately.  Function: void print_vector (float v[], int n) { /* Print the array v[0], …, v[n-1] in the form “[… … …]”. */ int i; printf("[%8.2f", v[0]); for (i = 1; i < n; i++) printf(" %8.2f", v[i]); printf("]"); } … print_vector(v1, 4); print_vector(v2, 10);

3-12 Example: Ada dynamic arrays  Array type and variable declarations: type Vector is array (Integer range <>) of Float; v1: Vector(1.. 4) := (1.0, 0.5, 5.0, 3.5); v2: Vector(0.. m) := (0.. m => 0.0);  Procedure: procedure print_vector (v: in Vector) is -- Print the array v in the form “[… … …]”. begin put('['); put(v(v'first)); for i in v'first + 1.. v'last loop put(' '); put(v(i)); end loop; put(']'); end; … print_vector(v1); print_vector(v2);

3-13 Example: Java flexible arrays  Array variable declarations: float[] v1 = {1.0f, 0.5f, 5.0f, 3.5f}; /* index range is {0, …, 3} */ float[] v2 = {0.0f, 0.0f, 0.0f}; /* index range is {0, …, 2} */ … v1 = v2; /* v1’s index range is now {0, …, 2} */  Method: static void printVector (float[] v) { /* Print the array v in the form “[… … …]”. */ System.out.print("[" + v[0]); for (int i = 1; i < v.length; i++) System.out.print(" " + v[i]); System.out.print("]"); } … printVector(v1); printVector(v2);

3-14 Copy semantics vs reference semantics  What exactly happens when a composite value is assigned to a variable of the same type?  Copy semantics: All components of the composite value are copied into the corresponding components of the composite variable.  Reference semantics: The composite variable is made to contain a pointer (or reference) to the composite value.  C and Ada adopt copy semantics.  Java adopts copy semantics for primitive values, but reference semantics for objects.

3-15 Example: Ada copy semantics (1)  Declarations: type Date is record y: Year_Number; m: Month; d: Day_Number; end record; dateA: Date := (2004, jan, 1); dateB: Date;  Effect of copy semantics: dateB := dateA; dateB.y := 2005; [Figure: initially dateA is (jan, 1, 2004) and dateB is undefined. After the assignment, dateB is a separate copy (jan, 1, 2004). After the selective update, dateB is (jan, 1, 2005) while dateA is still (jan, 1, 2004).]

3-16 Example: Java reference semantics (1)  Declarations: class Date { int y, m, d; public Date (int y, int m, int d) { … } } Date dateR = new Date(2004, 1, 1); Date dateS = new Date(2004, 12, 25);  Effect of reference semantics: dateS = dateR; dateR.y = 2005; [Figure: initially dateR refers to the object (1, 1, 2004) and dateS to the object (12, 25, 2004). After the assignment, dateR and dateS both refer to the object (1, 1, 2004). After the update, both see (1, 1, 2005).]

3-17 Example: Ada copy semantics (2)  We can achieve the effect of reference semantics in Ada by using explicit pointers: type Date_Pointer is access Date; dateP: Date_Pointer := new Date; dateQ: Date_Pointer := new Date; … dateP.all := dateA; dateQ := dateP;

3-18 Example: Java reference semantics (2)  We can achieve the effect of copy semantics in Java by cloning (assuming Date implements Cloneable and overrides clone to be public): Date dateR = new Date(2004, 4, 1); Date dateT = (Date) dateR.clone();
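
A runnable sketch of the contrast between reference semantics and cloning follows. The nested Date class here is a hypothetical stand-in for the slides' Date; it is assumed to implement Cloneable and to override clone as public, which the slides do not show.

```java
class CloneDemo {
    // Hypothetical Date class; the fields follow the slides' Date.
    static class Date implements Cloneable {
        int y, m, d;

        Date(int y, int m, int d) { this.y = y; this.m = m; this.d = d; }

        // Object.clone makes a field-by-field copy; it is overridden here to be public.
        @Override
        public Date clone() {
            try {
                return (Date) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen: Date is Cloneable
            }
        }
    }

    public static void main(String[] args) {
        Date dateR = new Date(2004, 4, 1);
        Date dateS = dateR;          // reference semantics: dateS and dateR share one object
        Date dateT = dateR.clone();  // copy semantics: dateT refers to a separate object
        dateR.y = 2005;
        System.out.println(dateS.y); // sees the update made through dateR
        System.out.println(dateT.y); // unaffected by the update
    }
}
```

Because Date has only primitive fields, the field-by-field copy made by Object.clone is a complete copy; a class with object-valued fields would need to clone those components too.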

3-19 Lifetime (1)  Every variable is created (or allocated) at some definite time, and destroyed (or deallocated) at some later time when it is no longer needed.  A variable’s lifetime is the interval between its creation and destruction.  A variable occupies storage cells only during its lifetime. When the variable is destroyed, the storage cells that it occupied may be deallocated (and subsequently allocated for some other purpose).

3-20 Lifetime (2)  A global variable’s lifetime is the program’s run-time. It is created by a global declaration.  A local variable’s lifetime is an activation of a block. It is created by a declaration within that block, and destroyed on exit from that block.  A heap variable’s lifetime is arbitrary, but bounded by the program’s run-time. It can be created at any time, by a command or expression, and may be destroyed at any later time. It is accessed through a pointer.

3-21 Example: Ada global and local variables (1)  Outline of Ada program: procedure main is g1: Integer; g2: Float; begin … P; … Q; … end; procedure P is p1: Float; p2: Integer; begin … Q; … end; procedure Q is q: Integer; begin … end;

3-22 Example: Ada global and local variables (2)  Lifetimes of global and local variables: [Timeline: start, call P, call Q, return from Q, return from P, call Q, return from Q, stop. g1 and g2 live from start to stop; p1 and p2 live from call P to return from P; q lives during each activation of Q.]  Global and local variables’ lifetimes are nested.

3-23 Example: Ada local variables of recursive procedure (1)  Outline of Ada program: procedure main is g: Integer; begin … R; … end; procedure R is r: Integer; begin … R; … end;

3-24 Example: Ada local variables of recursive procedure (2)  Lifetimes of global and local variables (assuming 3-deep recursive activation of R): [Timeline: start, call R, call R, call R, return from R, return from R, return from R, stop. g lives from start to stop; each activation of R has its own r, whose lifetime is nested within that of the calling activation’s r.]
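
The nesting of lifetimes under recursion can be observed in a small Java sketch. The names RecursionDemo, g, r, and local are illustrative assumptions; Java has no nested procedures, so a static field plays the role of the global variable.

```java
class RecursionDemo {
    static int g = 0;   // plays the role of the global variable g

    // Each activation of r has its own local variable 'local'.
    static int r(int depth) {
        int local = depth;               // created when this activation starts
        g++;                             // the single global is shared by all activations
        if (depth < 3) r(depth + 1);     // nested activations create their own 'local'
        return local;                    // unchanged by the nested activations
    }

    public static void main(String[] args) {
        System.out.println(r(1) + " " + g);
    }
}
```

The outermost activation still sees its own value of local after the two nested activations have come and gone, while all three activations updated the same g.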

3-25 Example: Ada heap variables (1)  Outline of Ada program: procedure main is type IntNode; type IntList is access IntNode; type IntNode is record elem: Integer; succ: IntList; end record; odds, primes: IntList := null; function cons (h: Integer; t: IntList) return IntList is begin return new IntNode'(h, t); end;

3-26 Example: Ada heap variables (2)  Outline of Ada program (continued): procedure A is begin odds := cons(3, cons(5, cons(7, null))); primes := cons(2, odds); end; procedure B is begin odds.succ := odds.succ.succ; end; begin … A; … B; … end;

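3-27 Example: Ada heap variables (3)  After call and return from A: primes points to the list 2, 3, 5, 7 and odds points to the 3-node of that same list (heap variables are shared).  After call and return from B: the 3-node’s successor is now the 7-node, so primes points to 2, 3, 7 and odds to 3, 7; the 5-node is unreachable.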

3-28 Example: Ada heap variables (4)  Lifetimes of global and heap variables: [Timeline: start, call A, return from A, call B, return from B, stop. primes and odds live from start to stop; the 2-node, 3-node, 5-node, and 7-node are created during A; the 5-node’s lifetime ends during B, when it becomes unreachable; the other nodes live until stop.]  Heap variables’ lifetimes have no particular pattern.

3-29 Allocators and deallocators  An allocator is an operation that creates a heap variable, yielding a pointer to that heap variable. Ada and Java’s allocator is an expression of the form “ new …”. C’s allocator is a library function, malloc.  A deallocator is an operation that explicitly destroys a designated heap variable. Ada’s deallocator is a library (generic) procedure, unchecked_deallocation. C’s deallocator is a library function, free. Java has no deallocator at all.

3-30 Reachability  A heap variable remains reachable as long as it can be accessed by following pointers from a global or local variable.  A heap variable’s lifetime extends from its creation until: it is destroyed by a deallocator, or it becomes unreachable, or the program stops.
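
The Ada heap-variable example can be transcribed into Java to watch a node become unreachable. This is a sketch under assumptions: IntNode, cons, a, and b mirror the Ada names, while show is a hypothetical helper added for printing.

```java
class HeapListDemo {
    // Hypothetical Java counterpart of the Ada IntNode record.
    static class IntNode {
        int elem;
        IntNode succ;
        IntNode(int elem, IntNode succ) { this.elem = elem; this.succ = succ; }
    }

    static IntNode odds = null, primes = null;   // global variables

    // The allocator "new" creates a heap variable and yields a pointer to it.
    static IntNode cons(int h, IntNode t) { return new IntNode(h, t); }

    static void a() {
        odds = cons(3, cons(5, cons(7, null)));
        primes = cons(2, odds);
    }

    static void b() {
        odds.succ = odds.succ.succ;   // after this, no pointer leads to the 5-node
    }

    // List the elements reachable from a list head.
    static String show(IntNode list) {
        StringBuilder sb = new StringBuilder();
        for (IntNode n = list; n != null; n = n.succ) sb.append(n.elem).append(' ');
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        a();
        System.out.println(show(primes));
        b();
        System.out.println(show(primes));
    }
}
```

Since Java has no deallocator, the 5-node simply stays in the heap until the garbage collector reclaims it.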

3-31 Pointers (1)  A pointer is a reference to a particular variable. (In fact, pointers are sometimes called references.)  A pointer’s referent is the variable to which it refers.  A null pointer is a special pointer value that has no referent.  In terms of our abstract model of storage, a pointer is essentially the address of its referent in the store. However, each pointer also has a type, and the type of a pointer allows us to infer the type of its referent.

3-32 Pointers (2)  Pointers and heap variables can be used to represent recursive values such as lists and trees.  But the pointer itself is a low-level concept. Manipulation of pointers is notoriously error-prone and hard to understand.  For example, the assignment “ p.succ := q; ” appears to manipulate a list, but which list? Also: Does it delete nodes from the list? Does it stitch together parts of two different lists? Does it introduce a cycle?

3-33 Dangling pointers (1)  A dangling pointer is a pointer to a variable that has been destroyed.  Dangling pointers arise from the following situations: where a pointer to a heap variable still exists after the heap variable is destroyed by a deallocator; where a pointer to a local variable still exists at exit from the block in which the local variable was declared.  A deallocator immediately destroys a heap variable; all existing pointers to that heap variable then become dangling pointers. Thus deallocators are inherently unsafe.

3-34 Dangling pointers (2)  C is highly unsafe: After a heap variable is destroyed, pointers to it might still exist. At exit from a block, pointers to its local variables might still exist (e.g., stored in global variables).  Ada is safer: After a heap variable is destroyed, pointers to it might still exist. But pointers to local variables may not be stored in global variables.  Java is very safe: It has no deallocator. Pointers to local variables cannot be obtained.

3-35 Example: C dangling pointers  Consider this C code: typedef struct {int y, m, d;} Date; Date* dateP; Date* dateQ; dateP = (Date*)malloc(sizeof(Date)); /* allocates a new heap variable */ dateP->y = 2004; dateP->m = 1; dateP->d = 1; dateQ = dateP; /* makes dateQ point to the same heap variable as dateP */ free(dateQ); /* deallocates that heap variable (dateP and dateQ are now dangling pointers) */ printf("%4d", dateP->y); dateP->y = 2005; /* both fail: the behaviour is undefined */

3-36 Commands  A command (or statement) is a PL construct that will be executed to update variables.  Commands are characteristic of imperative and OO (but not functional) PLs.  Forms of commands: skips, assignments, procedure calls, sequential commands, conditional commands, iterative commands.

3-37 Skips  A skip is a command with no effect.  Typical forms: “ ; ” in C and Java “ null; ” in Ada.  Skips are useful mainly within conditional commands.

3-38 Assignments  An assignment stores a value in a variable.  Single assignment: “V = E;” in C and Java, “V := E;” in Ada – the value of expression E is stored in variable V.  Multiple assignment: “V1 = … = Vn = E;” in C and Java – the value of E is stored in each of V1, …, Vn.  Assignment combined with binary operator: “V ⊕= E;” in C and Java means the same as “V = V ⊕ E;”, where ⊕ stands for a binary operator such as + or *.

3-39 Procedure calls  A procedure call achieves its effect by applying a procedure to some arguments.  Typical form: P(E1, …, En); Here P determines the procedure to be applied, and E1, …, En are evaluated to determine the arguments. Each argument may be either a value or (sometimes) a reference to a variable.  The net effect of the procedure call is to update variables. The procedure achieves this effect by updating variables passed by reference, and/or by updating global variables. (But updating its local variables has no net effect.)

3-40 Sequential commands  Sequential, conditional, and iterative commands (found in all imperative/OO PLs) are ways of composing commands to achieve different control flows. Control flow matters because commands update variables, so the order in which they are executed makes a difference.  A sequential command specifies that two (or more) commands are to be executed in sequence. Typical form: C1 C2 – command C1 is executed before command C2.

3-41 Conditional commands  A conditional command chooses one of its subcommands to execute, depending on a condition.  An if-command chooses from two subcommands, using a boolean condition.  A case-command chooses from several subcommands.

3-42 If-commands (1)  Typical forms: in Ada, “if E then C1 else C2 end if;”; in C/Java, “if (E) C1 else C2” – if E yields true, C1 is executed; otherwise C2 is executed. E must be of type Boolean.  Common abbreviation (Ada): “if E then C1 end if;” means the same as “if E then C1 else null; end if;”.

3-43 If-commands (2)  Generalisation to multiple conditions (in Ada): if E1 then C1 elsif E2 then C2 … elsif En then Cn else C0 end if; – if E1, …, E(i-1) all yield false but Ei yields true, then Ci is executed; otherwise C0 is executed. E1, …, En must be of type Boolean.

3-44 Case-commands (1)  In Ada: case E is when v1 => C1 … when vn => Cn when others => C0 end case; – if the value of E equals some vi, then Ci is executed; otherwise C0 is executed. E must be of some primitive type other than Float; v1, …, vn must be distinct values of that type.

3-45 Case-commands (2)  In C and Java: switch (E) { case v1: C1 … case vn: Cn default: C0 } – if the value of E equals some vi, then Ci, …, Cn, C0 are all executed (unless a break intervenes); otherwise only C0 is executed. E must be of integer type; v1, …, vn must be distinct integer constants.

3-46 Example: Ada case-command  Code: today: Date; … case today.m is when jan => put("JAN"); when feb => put("FEB"); … when nov => put("NOV"); when dec => put("DEC"); end case;

3-47 Example: Java switch-command  Code: Date today; … switch (today.m) { case 1: System.out.print("JAN"); break; case 2: System.out.print("FEB"); break; … case 11: System.out.print("NOV"); break; case 12: System.out.print("DEC"); }  The breaks are essential.
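
Why the breaks matter can be seen by running the same switch with and without them. This is a minimal sketch; SwitchDemo and its two methods are hypothetical names, and only two months plus a default are shown.

```java
class SwitchDemo {
    // Without breaks, control falls through into every later subcommand.
    static String withoutBreaks(int m) {
        StringBuilder out = new StringBuilder();
        switch (m) {
            case 1: out.append("JAN ");
            case 2: out.append("FEB ");
            default: out.append("OTHER");
        }
        return out.toString();
    }

    // With breaks, exactly one subcommand is executed.
    static String withBreaks(int m) {
        StringBuilder out = new StringBuilder();
        switch (m) {
            case 1: out.append("JAN"); break;
            case 2: out.append("FEB"); break;
            default: out.append("OTHER");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(withoutBreaks(1)); // prints "JAN FEB OTHER"
        System.out.println(withBreaks(1));    // prints "JAN"
    }
}
```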

3-48 Iterative commands  An iterative command (or loop) repeatedly executes a subcommand, which is called the loop body.  Each execution of the loop body is called an iteration.  Classification of iterative commands: Indefinite iteration: the number of iterations is not predetermined. Definite iteration: the number of iterations is predetermined.

3-49 Indefinite iteration (1)  Indefinite iteration is most commonly supported by the while-command. Typical forms: in Ada, “while E loop C end loop;”; in C/Java, “while (E) C”.  Meaning (defined recursively): “while E loop C end loop;” means the same as “if E then C while E loop C end loop; end if;”.

3-50 Indefinite iteration (2)  Indefinite iteration is also supported in some PLs by the do-while-command. Typical form (C/Java): do C while (E);  Meaning: “do C while (E);” means the same as “C if (E) { do C while (E); }”, and also the same as “C while (E) C”.

3-51 Definite iteration (1)  Definite iteration is characterized by a control sequence, a predetermined sequence of values that are successively assigned (or bound) to a control variable.  Ada for-command: for V in R loop C end loop; – the control sequence consists of all values in the range R, in ascending order. R must be of some primitive type other than Float.

3-52 Definite iteration (2)  Java 1.5’s new-style for-command can iterate over an array, list, or set: for (T V : E) C – the control sequence consists of all component values of the array/list/set yielded by E.  NB: Java’s old-style for-command is just an abbreviation for a while-command (indefinite iteration): “for (C1; E; C2) C3” means the same as “C1 while (E) { C3 C2 }”.
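
The loop equivalences above can be checked on a concrete case. In this sketch (all names are illustrative), sumFor and sumWhile are the same loop written in the two forms, and countDoWhile shows that a do-while body runs at least once.

```java
class LoopDemo {
    // Sum 0..n-1 with an old-style for-command.
    static int sumFor(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) sum += i;
        return sum;
    }

    // The same loop rewritten as C1 while (E) { C3 C2 }.
    static int sumWhile(int n) {
        int sum = 0;
        int i = 0;               // C1
        while (i < n) {          // E
            sum += i;            // C3, the loop body
            i++;                 // C2
        }
        return sum;
    }

    // A do-while-command executes its body once before E is first tested.
    static int countDoWhile(int n) {
        int count = 0;
        do { count++; } while (count < n);
        return count;
    }

    public static void main(String[] args) {
        System.out.println(sumFor(5) + " " + sumWhile(5) + " " + countDoWhile(0));
    }
}
```

The rewriting is exact only for bodies that contain no continue command, since continue in a for-command still executes C2.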

3-53 Example: definite iteration over arrays  In Ada: dates: array ( … ) of Date; … for i in dates'range loop put(dates(i)); end loop;  In Java: Date[] dates; … for (int i = 0; i < dates.length; i++) System.out.println(dates[i]); /* old-style */ for (Date dat : dates) System.out.println(dat); /* new-style */

3-54 Expressions with side effects  The primary purpose of evaluating an expression is to yield a value.  But in many imperative/OO PLs, evaluating an expression can also update variables – side effects.  In Ada, C, and Java, the body of a function is a command. If that command updates global variables, calling the function has side effects.  In C and Java, assignments are in fact expressions with side effects: “V = E” stores the value of E in V as well as yielding that value. Similarly “V ⊕= E”, where ⊕ stands for a binary operator.

3-55 Example: side effects in C  The C function getc(f) reads a character and updates the file variable that f points to.  The following code is correct and concise: int ch; while ((ch = getc(f)) != EOF) putchar(ch);  The following code is incorrect (why?): enum Gender {female, male}; enum Gender g; if (getc(f) == 'F') g = female; else if (getc(f) == 'M') g = male; else …
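
One way to see what goes wrong in the incorrect fragment is to mimic it in Java. Everything here is an assumption for illustration: CharSource is a hypothetical stand-in for the C file variable, and the UNKNOWN value fills the branch that the slide leaves elided.

```java
class SideEffectDemo {
    enum Gender { FEMALE, MALE, UNKNOWN }

    // Hypothetical stand-in for a C file variable: next() has a side effect,
    // because it advances the position as well as yielding a character.
    static class CharSource {
        private final String text;
        private int pos = 0;
        CharSource(String text) { this.text = text; }
        int next() { return pos < text.length() ? text.charAt(pos++) : -1; }
    }

    // Incorrect: each call to next() consumes a character, so the second
    // test inspects the character after the one that failed the first test.
    static Gender readTwice(CharSource f) {
        if (f.next() == 'F') return Gender.FEMALE;
        else if (f.next() == 'M') return Gender.MALE;
        else return Gender.UNKNOWN;
    }

    // Correct: read once, then test the saved value as often as needed.
    static Gender readOnce(CharSource f) {
        int ch = f.next();
        if (ch == 'F') return Gender.FEMALE;
        else if (ch == 'M') return Gender.MALE;
        else return Gender.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(readOnce(new CharSource("M")));   // prints "MALE"
        System.out.println(readTwice(new CharSource("M")));  // prints "UNKNOWN"
    }
}
```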

3-56 Implementation notes  Each variable occupies storage space throughout its lifetime. That storage space must be allocated at the start of the variable’s lifetime (or before), and deallocated at the end of the variable’s lifetime (or later).  The amount of storage space occupied by each variable depends on its type.  Assume that the PL is statically typed: all variables’ types are declared explicitly, or the compiler can infer them.

3-57 Storage for global and local variables (1)  A global variable’s lifetime is the program’s entire run-time. So the compiler can allocate a fixed storage space to each global variable.  A local variable’s lifetime is an activation of the block in which the variable is declared. The lifetimes of local variables are nested. So the compiler allocates storage space to local variables on a stack.

3-58 Storage for global and local variables (2)  At any given time, the stack contains several activation frames.  Each activation frame contains enough space for the local variables of a particular procedure. [Figure: an activation frame, consisting of housekeeping data followed by local variables.]  An activation frame is: pushed on to the stack when a procedure is called; popped off the stack when the procedure returns.  Storage can be allocated to local variables of recursive procedures in exactly the same way.

3-59 Example: storage for global and local variables (1)  Outline of Ada program: procedure main is g1: Integer; g2: Float; begin … P; … Q; … end; procedure P is p1: Float; p2: Integer; begin … Q; … end; procedure Q is q: Integer; begin … end;

3-60 Example: storage for global and local variables (2)  Storage layout as the program runs (bottom of storage first): at start: g1, g2; after call P: g1, g2, p1, p2; after call Q: g1, g2, p1, p2, q; after return from Q: g1, g2, p1, p2; after return from P: g1, g2; after call Q: g1, g2, q.
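
The pushing and popping of activation frames in this example can be simulated with an explicit stack. This is a rough sketch, not how a compiler lays out storage: FrameStackDemo, call, ret, and snapshot are hypothetical names, each frame is reduced to a string naming its local variables, and housekeeping data is omitted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

class FrameStackDemo {
    static final Deque<String> stack = new ArrayDeque<>();
    static final List<String> trace = new ArrayList<>();

    // Record the current storage layout, bottom entry first.
    static void snapshot() {
        List<String> frames = new ArrayList<>(stack); // iterates from top to bottom
        Collections.reverse(frames);
        trace.add(String.join(" | ", frames));
    }

    static void call(String locals) { stack.push(locals); snapshot(); } // push a frame
    static void ret() { stack.pop(); snapshot(); }                      // pop a frame

    static void q() { call("q"); /* body of Q */ ret(); }
    static void p() { call("p1 p2"); q(); ret(); }

    static List<String> run() {
        stack.clear();
        trace.clear();
        stack.push("g1 g2");   // globals, drawn at the bottom as on the slide
        snapshot();
        p();                   // main calls P, which calls Q
        q();                   // main then calls Q directly
        return trace;
    }

    public static void main(String[] args) {
        for (String line : run()) System.out.println(line);
    }
}
```

Running it lists seven layouts, from the globals alone through the deepest point (globals, P's frame, Q's frame) and back.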

3-61 Storage for heap variables (1)  A heap variable’s lifetime starts when the heap variable is created and ends when it is destroyed or becomes unreachable. There is no pattern in their lifetimes.  Heap variables occupy a storage region called the heap. At any given time, the heap contains all currently-live heap variables, interspersed with unallocated storage space. When a new heap variable is to be created, some unallocated storage space is allocated to it. When a heap variable is to be destroyed, its storage space reverts to being unallocated.

3-62 Storage for heap variables (2)  A heap manager (part of the run-time system) keeps track of allocated and unallocated storage space.  If the programming language has no explicit deallocator, the heap manager must be able to find any unreachable heap variables. (Otherwise heap storage will eventually be exhausted.) This is called garbage collection.  A garbage collector must visit all heap variables in order to find the unreachable ones. This is time-consuming.  But garbage collection eliminates some common errors: omitting to destroy unreachable heap variables; destroying heap variables that are still reachable.

3-63 Example: storage for heap variables (1)  Consider the Ada program on slides 3-25 and 3-26.  Storage layout as the program runs:

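3-64 Example: storage for heap variables (2) [Figure: the heap is initially unallocated. After call and return from A, it contains the nodes 2, 3, 5, 7, with primes pointing to the 2-node and odds to the 3-node. After call and return from B, the nodes 2, 3, 5, 7 are all still allocated, but the 5-node is unreachable. After garbage is collected, only the nodes 2, 3, 7 remain allocated.]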

3-65 Mark-scan garbage collection algorithm  To collect garbage: 1. For each variable v in the heap: 1.1. Mark v as unreachable. 2. For each pointer p in the stack: 2.1. Scan all variables that can be reached from p. 3. For each variable v in the heap: 3.1. If v is marked as unreachable: 3.1.1. Deallocate v.  To scan all variables that can be reached from p: 1. Let variable v be the referent of p. 2. If v is marked as unreachable: 2.1. Mark v as reachable. 2.2. For each pointer q in v: 2.2.1. Scan all variables that can be reached from q.
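
The two procedures above can be sketched over a toy heap in Java. This is a simulation under assumptions, not a real collector: Node, collect, and scan are hypothetical names, the heap is modelled as a list of nodes, the stack's pointers are passed in as a list of roots, and "deallocating" a variable just means leaving it out of the returned list. A null check is added because a null pointer has no referent.

```java
import java.util.ArrayList;
import java.util.List;

class MarkScanDemo {
    // Hypothetical heap variable: a node holding pointers to other heap variables.
    static class Node {
        final String name;
        final List<Node> pointers = new ArrayList<>();
        boolean reachable;
        Node(String name) { this.name = name; }
    }

    // To scan all variables that can be reached from p.
    static void scan(Node p) {
        if (p == null) return;                    // a null pointer has no referent
        if (!p.reachable) {                       // step 2: still marked as unreachable
            p.reachable = true;                   // step 2.1
            for (Node q : p.pointers) scan(q);    // step 2.2
        }
    }

    // To collect garbage: returns the heap variables that survive.
    static List<Node> collect(List<Node> heap, List<Node> roots) {
        for (Node v : heap) v.reachable = false;  // step 1
        for (Node p : roots) scan(p);             // step 2
        List<Node> live = new ArrayList<>();
        for (Node v : heap) {                     // step 3: drop what stayed unmarked
            if (v.reachable) live.add(v);
        }
        return live;
    }

    public static void main(String[] args) {
        Node n7 = new Node("7");
        Node n5 = new Node("5"); n5.pointers.add(n7);
        Node n3 = new Node("3"); n3.pointers.add(n7);   // state after procedure B
        Node n2 = new Node("2"); n2.pointers.add(n3);
        List<Node> heap = List.of(n2, n3, n5, n7);
        for (Node v : collect(heap, List.of(n2, n3))) System.out.println(v.name);
    }
}
```

Marking a node before scanning its pointers is what keeps the recursion from looping forever on cyclic structures.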

3-66 Representation of dynamic/flexible arrays (1)  The array indexing operation will behave unpredictably if the index value is out-of-range. To avoid this, in general, we need a run-time range check on the index value.  A static array’s index range is known at compile-time. So the compiler can easily generate object code to perform the necessary range check.  However, a dynamic/flexible array’s index range is known only at run-time. So it must be stored as part of the array’s representation: If the lower bound is fixed, only the length need be stored. Otherwise, both lower and upper bounds must be stored.

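3-67 Representation of dynamic/flexible arrays (2)  Example (Ada): type Vector is array (Integer range <>) of Float; v1: Vector(1 .. 4); v2: Vector(0 .. 2); [Figure: v1 is represented by lower = 1 and upper = 4, followed by the components 1.0, 0.5, 5.0, 3.5 at indices 1 to 4; v2 is represented by lower = 0 and upper = 2, followed by its components at indices 0 to 2.]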
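
A representation that stores both bounds alongside the components, and uses them for the run-time range check, might look like the following Java sketch. BoundedVector and its methods are hypothetical names chosen for illustration; Java's own arrays only store a length because their lower bound is fixed at 0.

```java
class BoundedVector {
    private final int lower, upper;     // stored as part of the representation
    private final float[] components;

    BoundedVector(int lower, int upper) {
        this.lower = lower;
        this.upper = upper;
        this.components = new float[upper - lower + 1];
    }

    // Run-time range check against the stored bounds, then index translation.
    private int offset(int index) {
        if (index < lower || index > upper) {
            throw new IndexOutOfBoundsException(
                "index " + index + " not in " + lower + ".." + upper);
        }
        return index - lower;
    }

    float get(int index) { return components[offset(index)]; }
    void set(int index, float value) { components[offset(index)] = value; }
    int first() { return lower; }   // like Ada's v'first
    int last() { return upper; }    // like Ada's v'last

    public static void main(String[] args) {
        BoundedVector v1 = new BoundedVector(1, 4);
        v1.set(1, 1.0f);
        v1.set(4, 3.5f);
        System.out.println(v1.get(4));
        System.out.println(v1.first() + ".." + v1.last());
    }
}
```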

3-68 Representation of dynamic/flexible arrays (3)  Example (Java): float[] v1 = new float[4]; float[] v2 = new float[3]; [Figure: each array is represented by a tag (float[]) and a length (4 for v1, 3 for v2), followed by its components indexed from 0.]
