Names, Bindings, Type Checking, and Scopes

Names, Bindings, Type Checking, and Scopes
Chapter 5 Names, Bindings, Type Checking, and Scopes

Chapter 5 Topics Introduction Names Variables The Concept of Binding
Type Checking Strong Typing Type Compatibility Scope and Lifetime Referencing Environments Named Constants

5.1 Introduction Imperative languages are abstractions of von Neumann architecture Memory Processor Variables characterized by attributes “Type”, most important To design, must consider scope, lifetime, type checking, initialization, and type compatibility

5.2 Names Associated with labels, subprograms, formal parameters, etc.
Design issues for names: Maximum length? Are connector characters allowed? Are names case sensitive? Are special words reserved words or keywords?

Names (continued) Length If too short, they cannot be connotative
Language examples: FORTRAN I: maximum 6 COBOL: maximum 30 FORTRAN 90 and ANSI C: maximum 31 Ada and Java: no limit, and all are significant C++: no limit, but implementers often impose one

Names (continued) Forms
A letter followed by a string consisting of letters, digits, and underscore characters(_). In C-based languages, the underscore is replaced by the “camel” notation (e.g. “myStack”).

Names (continued) Case sensitivity
C, C++, and Java names are case sensitive. The names in other languages are not. e.g. in C++, rose, ROSE, Rose are different Disadvantage: readability (names that look alike are different) worse in C++ and Java because predefined names are mixed case (e.g. IndexOutOfBoundsException)

Names (continued) Special words An aid to readability
A keyword is a word that is special only in certain contexts, e.g., in Fortran Real Apple (Real is a data type followed with a name, therefore Real is a keyword) Real = 3.4 (Real is a variable) A reserved word is a special word that cannot be used as a user-defined name E.g. in Java – string, int, main, public, void … Predefined names – names defined in other program units, such as Java packages, and C and C++ libraries. Can be made visible to a program only if explicitly imported. Once imported, they cannot be redefined.

5.3 Variables A variable is an abstraction of a memory cell
Variables can be characterized as a sextuple of attributes: Name Address Value Type Lifetime Scope

Variables Attributes Name - not all variables have them, e.g. a pointer pointing to an integer Address - the memory address with which it is associated Variables with the same name may have different addresses at different times during execution e.g. void sub(){ int count; … while (…) { count++; }

Variables Attributes If two variable names can be used to access the same memory location, they are called aliases E.g. x = max(a, b); … int max(int m, int n){ } Aliases are created via pointers, reference variables, C and C++ unions Two pointer variables are aliases when they point to the same memory location Aliasing allows a variable to have its value changed by an assignment to a different variable. It is harmful to readability (program readers must remember all of them)

Variables Attributes (continued)
Type - determines the range of values of variables and the set of operations that are defined for values of that type; E.g. Java int: to Value - the contents of the memory cells with which the variable is associated Memory cells – abstract memory cells E.g. a floating point value may occupy 4 physical bytes, we think of it as occupying a single abstract memory cell

5.4 The Concept of Binding The l-value of a variable is its address, because that is required when a variable appears in the left side of an assignment statement. The r-value of a variable is its value To access the r-value, the l-value must be determined first A binding is an association, such as between an attribute and an entity, or between an operation and a symbol Binding time is the time at which a binding takes place.

Possible Binding Times
Language design time -- bind operator symbols to operations, e.g. * means multiplication Language implementation time-- bind floating point type to a representation Compile time -- bind a variable to a type in C or Java Load time -- bind a FORTRAN 77 variable to a memory cell (or a C static variable) Runtime -- bind a nonstatic local variable to a memory cell

Possible Binding Times
count = count + 5; The type of count is bound at ________ time The set of possible values of count is bound at ________________time The meaning of the operator symbol + is bound at ___________time, when the type of its operands have been determined The internal representation of the literal 5 is bound at ________________time The value of count is bound at ___________time with this statement compile compiler design compile compiler design execution

Static and Dynamic Binding
A binding is static if it first occurs before run time and remains unchanged throughout program execution. A binding is dynamic if it first occurs during execution or can change during execution of the program

Type Binding Before a variable can be referenced, it must be bound to a data type. How is a type specified? When does the binding take place?

Variable Declarations
explicit declaration: a program statement that lists variable names and specifies that they are a particular type implicit declaration: a default mechanism for specifying types of variables (the first appearance of the variable in the program) Most languages require explicit declaration FORTRAN, PL/I, BASIC, JavaScript, vbScript, and Perl provide implicit declarations

e.g. vbScript dim a, b a = 10 b = “Hello!” … function m(){ m = result } Advantage: writability, convenient in programming Disadvantage: reliability e.g. variables that are accidentally left undeclared by the programmer are given default types and unexpected attributes

Less problem with Perl If a name begins with $: it is a scalar, which can store either a string or a numeric value @: it is an array %: it is an hash structure

Dynamic Type Binding A variable is bound to a type when it is assigned a value in an assignment statement (JavaScript, vbScript and PHP) Specified through an assignment statement e.g., JavaScript list = [2, 4.33, 6, 8]; list = 17.3; Advantage: flexibility (generic program units) Disadvantages: High cost (dynamic type checking and interpretation) Type error detection by the compiler is difficult Has to be implemented using pure interpreters

Type Inference Rather than by assignment statement, types are determined from the context of the reference (ML, Miranda, and Haskell) E.g. fun circumf(r) = *r*r; - r and result are floating point fun time10(x) = 10 * x - x and result are int

Storage Bindings & Lifetime
Allocation - getting a cell from some pool of available cells De-allocation - putting a cell back into the pool The lifetime of a variable is the time during which it is bound to a particular memory cell Scalar (unstructured) variables can be separated into four categories: static, stack-dynamic, explicit heap-dynamic and implicit heap-dynamic Storage allocation -

Categories of Variables by Lifetimes
Static bound to memory cells before execution begins and remains bound to the same memory cell throughout execution, e.g., all FORTRAN 77 variables, C static variables, global variables Advantages: efficiency (direct addressing, see supplementary material- memory addressing “ history-sensitive subprogram support Disadvantage: lack of flexibility (no recursion), storage cannot be shared among variables

C and C++ static variables (see supplementary material “Using the Static Keyword-

Stack-dynamic Storage bindings are created for variables when their declaration statements are elaborated (gets executed at run time). Types are statically bound Does not matter where the declarations occur Allocated from run-time stack E.g. local variables In Java, C++ and C#, variables defined in methods are by default stack-dynamic Advantage: allows recursion; conserves storage Disadvantages: Overhead of allocation and de-allocation Subprograms cannot be history-sensitive Inefficient references (indirect addressing)

Explicit heap-dynamic Allocated and de-allocated by explicit directives, specified by the programmer, which take effect during execution Heap is a collection of storage cells whose organization is highly disorganized because of the unpredictability of its use Referenced only through pointers or references variables, e.g. dynamic objects in C++ (via new and delete), all objects in Java Has two variables associated with it: a pointer or reference variable through which the heap-dynamic variable can be accessed, and the heap-dynamic variable itself e.g. int *intnode; //create a pointer that points to an interger intnode = new int; //create the heap-dynamic variable delete intnode; //de-allocate the heap-dynamic variable

Java: all data except the primitive scalars are objects. Java objects are explicitly heap-dynamic and are accessed through reference variables. Implicit garbage collection is used C#: has both explicit heap-dynamic and stack-dynamic objects, all of which are implicitly de-allocated. Also supports C++ style pointers to interoperate with C and C++ components. Advantage: provides for dynamic storage management Disadvantage: inefficient and unreliable

Implicit heap-dynamic Bound to heap storage only when they are assigned values all variables in APL; all strings and arrays in Perl and JavaScript E.g. list = [3.5, 4.1] list = 47 Advantage: flexibility Disadvantages: Inefficient, because all attributes are dynamic Loss of error detection

5.5 Type Checking Type checking is the activity of ensuring that the operands of an operator are of compatible types Generalize the concept of operands and operators to include subprograms and assignments E.g. int a, b, x; a = 4; b = 5; x = max(4,5); … float max(float m, float n { …} string result; int a, b; a = 4; b = 5; result = a + b;

Type Checking A compatible type is one that is either legal for the operator, or is allowed under language rules to be implicitly converted, by compiler- generated code, to a legal type This automatic conversion is called a coercion. e.g. int a = 2; float b = 1.5 b = a + b; a = a + b; A type error is the application of an operator to an operand of an inappropriate type

Type Checking Static type checking – consistency can be checked before the program is run. All variables must be given a type Dynamic type checking – type checking at run time. Dynamic type binding. If all type bindings are static, nearly all type checking can be static If type bindings are dynamic, type checking must be dynamic (e.g. JavaScript) Advantage of static type checking – potential problems can be identified earlier Advantage of dynamic type checking - flexibility

5.6 Strong Typing A programming language is strongly typed if type errors are always detected, either at compile time or run time Fortran 95: not strongly typed. The use of “Equivalence” allows a variable of one type to refer to a value of different type Ada: nearly strong typed. Allows programmers to suspend type checking for a particular type conversion C, C++: not strongly typed. “Union” types are not type checked Java, C#: strongly typed Languages with a great deal of coercion, like Fortran, C and C++ are significantly less reliable than those with little coercion, such as Ada, Java and C#.

5.7 Type Compatibility Name type compatibility means the two variables have compatible types if they are in either the same declaration or in declarations that use the same type name e.g. int a, b; and int a; int b;

Type Compatibility Easy to implement but highly restrictive:
Subranges of integer types are not compatible with integer types e.g. Formal parameters must be the same type as their corresponding actual parameters. If a structured type is passed among subprograms through parameters, such a type must be defined only once, globally. A subprogram cannot state the type of such formal parameters in local terms type Indextype is count: Integer index: Indextype

Type Compatibility Structure type compatibility means that two variables have compatible types if their types have identical structures e.g. More flexible, but harder to implement. Entire structures of the two types must be compared. The comparison is not always simple Apple { string color; float weight; string type; } Pear { string color; float weight; string type; } Apple a = new Apple(); Pear p = new Pear();

Type Compatibility Consider the problem of two structured types:
Are two record types compatible if they are structurally the same but use different field names? Are two array types compatible if they are the same except that the subscripts are different? (e.g. [1..10] and [0..9]) Are two enumeration types compatible if their components are spelled differently? With structural type compatibility, you cannot differentiate between types of the same structure e.g. type celsius is new Float; type fahrenheit is new Float; Although compatible in structure, but shouldn’t be.

Type Compatibility Ada: uses name type compatibility, but provides two type constructs, subtypes and derived types. A derived type is a new type that is based on some previously defined type with which it is incompatible, although it may have identical structure e.g. type celsius is new Float; type fahrenheit is new Float; derived from Float, incompatible with each other and with Float A subtype is a possibly range-constrained version of an existing type. Compatible with its parent type. e.g. subtype Small_type is Integer range 0..99; variables of Small_type is compatible with Integer

Type Compatibility C: uses structure type compatibility for all types except structures and unions C++ uses name type compatibility

5.8 Scope The scope of a variable is the range of statements over which it is visible A variable is visible in a statement if it can be referenced in that statement The nonlocal variables of a program unit are those that are visible but not declared there

Scope Static Scope Binding names to nonlocal variables
The scope of a variable can be determined prior to execution Two categories of static-scoped languages Subprograms can be nested (Ada, JavaScript, PHP) Subprograms cannot be nested (C-based languages)

Scope To decide the scope, the reader must find the declaration of the variable Search process: search declarations, first locally, then in increasingly larger enclosing scopes, until one is found for the given name, e.g. find the scope of X in Sub1 Procedure Big is X: Integer; procedure Sub1 is begin … X … end; procedure Sub2 is X: Integer; begin … end; begin … end

Scope Enclosing static scopes (to a specific scope) are called its static ancestors; the nearest static ancestor is called a static parent Variables can be hidden from a unit by having a "closer" variable with the same name. e.g. Legal in C and C++, illegal in Java and C# void sub(){ int count; while (…){ int count: count++; } }

Scope In Ada, hidden variables from ancestor scopes can be accessed with selective references. E.g. Big.X C-based languages do not allow subprograms to be nested inside other subprogram definition, but they have global variables. These variables are declared outside any subprogram definition. Local variables can hide these globals. In C++, such hidden variables can be accessed using the scope operator ::, e.g. class_name::variableName

Blocks A method of creating a section of code to have its own local variables whose scope is minimized. Such a section of code is called a block. Variables are stack dynamic, so they have their storage allocated when the section is entered and de-allocated when the section is exited. C-based languages allow any compound statements (a statement sequence surrounded by matched braces) to have declarations and thus define a new scope. Blocks are treated exactly like those created by subprograms. References to variables in a block that are not declared there are connected to declarations by searching enclosing scopes.

Blocks C and C++: for (...) { int index; ... }
Examples: C and C++: for (...) { int index; ... } Ada: declare LCL : FLOAT; begin end C++ allows variable definitions to appear anywhere in functions. When a definition appears at a position other than the beginning of a function, but not within a block, that variable’s scope is from its definition to the end of the function. In C, all data declarations in a function but not in blocks within the function must appear at the beginning of the function. In classes of C++, Java and C#, instance variables’ scope is the whole class. The scope of a variable defined in a method starts at the definition.

Evaluation of Static Scoping
Static scoping provides nonlocal access, but also has problems. Assume MAIN calls A and B A calls C and D B calls A and E MAIN E A C D B

Static Scope Example Desired Calls Potential Calls MAIN MAIN A B A B C

Static Scope (continued)
Suppose the spec is changed so that E must now access some data in D Solutions: Put E in D (but then E can no longer access B) Move the data from D that E needs to MAIN (but then all procedures can access them - possibility of incorrect access) Overall: static scoping often encourages many global variables

Dynamic Scope Based on calling sequences of program units, not their textual layout Determined at run time References to variables are connected to declarations by searching back through the chain of subprogram calls that forced execution to this point

Scope Example MAIN calls SUB1 SUB1 calls SUB2 SUB2 uses x Main
- declaration of x SUB1 - declaration of x - ... call SUB2 SUB2 - reference to x - call SUB1 … MAIN calls SUB1 SUB1 calls SUB2 SUB2 uses x

Scope Example Static scoping Dynamic scoping
Reference to x is to MAIN's x Dynamic scoping Reference to x is to SUB1's x

Evaluation of Dynamic Scoping
Problems From when subprogram begins till the execution ends, the local variables of the subprogram are all visible to any other executing subprogram. There is not way to protect local variables. Less reliable than static scoping. Inability to statically type check references to nonlocals Poor readability since it’s based on calling sequence Longer access time Merit No need to pass parameters

Scope and Lifetime Scope and lifetime are sometimes closely related, but are different concepts Consider a static variable in a C or C++ function A static variable declared within a function Scope: function Lifetime: entire execution of the program Another example: void printheader(){…} void compute(){ int sum; … printheader(); } Scope of sum: compute() Lifetime: compute() + printheader()

Referencing Environments
The referencing environment of a statement is the collection of all names that are visible in the statement In a static-scoped language, it is the local variables plus all of the visible variables in all of the enclosing scopes

Procedure Example is A. B: Integer; … procedure Sub1 is X, Y : Integer begin …  point 1 end; procedure Sub2 is X : Integer procedure Sub3 is …  point 2 …  point 3 Referencing environments Point 1: X and Y of Sub1, A and B of Example Point 2: X of Sub3, (X of Sub2 is hidden), A and B of Example Point 3: X of Sub2, A and B of

A subprogram is active if its execution has begun but has not yet terminated In a dynamic-scoped language, the referencing environment is the local variables plus all visible variables in all active subprograms

Main calls sub2, which calls sub1 void sub1(){ int a, b; … < point 1 } void sub2(){ int b, c; … < point 2 sub1(); void main(){ int c, d; … < point 3 sub2(); Referencing environments Point 1: a and b of sub1, c of sub2, d of main (c of main and b of sub2 are hidden) Point 2: b and c of sub2, d of main (c of main is hidden) Point 3: c and d of main

Named Constants A named constant is a variable that is bound to a value only when it is bound to storage Named constants vs. static variable Static variable: initiated once, value can change Named constants: initiated once, value cannot change Advantages: readability and modifiability use of PI instead of the constant used to parameterize programs

Named Constants void example (){ int [] intList = new int [100];
String[] strList = new String[100]; … for (index = 0; index < 100; index++){…} average = sum/100; … } final int len = 100; int [] intList = new int [len]; String[] strList = new String[len]; for (index = 0; index < len; index++){…} average = sum/ len;

Named Constants The binding of values to named constants can be either static (called manifest constants) or dynamic FORTRAN 90: constant-valued expressions (manifest constants) Ada, C++, and Java: expressions of any kind E.g. const int result = 2 * width + 1;

Variable Initialization
The binding of a variable to a value at the time it is bound to storage is called initialization Initialization is often done on the declaration statement, e.g., in Java int sum = 0;

Names, Bindings, Type Checking, and Scopes

Similar presentations

Presentation on theme: "Names, Bindings, Type Checking, and Scopes"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Names, Bindings, Type Checking, and Scopes

Similar presentations

Presentation on theme: "Names, Bindings, Type Checking, and Scopes"— Presentation transcript:

Similar presentations

About project

Feedback