Topic 3 -Binding Time and Symbol Tables Dr. William A. Maniatty Assistant Prof. Dept. of Computer Science University At Albany CSI 511 Programming Languages and Systems Concepts Fall 2002 Monday Wednesday 2:30-3:50 LI 99
Introduction to Binding Binding refers to associating an entity with a value, such as Variable name with address Result of expression with ephemeral storage Constant with its value Separately compiled function with address
Binding Time Binding time refers to when the association between an entity and its value is made.
Design Binding Times There are extra binding times available to programming language designers. Language Design Time - Choose fundamental primitives, reserved words, etc. Compiler/Interpreter Implementation Time -How to internally represent language constructs. Programming Time - Language users pick the algorithms and data structures.
Object - What does it mean? The word Object has many meanings in programming languages. Object Module - A compiled (but not linked) module of a program. Object (OOP sense) - An instance of a class in Object Oriented Programming. Object (Programming Language Sense) - The entities which are bound to values. Use the programming language sense for now.
Binding Time Design Issues Late binding of objects is typical of: Interpreters Dynamic Type Systems Care needs to be taken to avoid ambiguity when binding. Name Space Collisions Polymorphism (Overloading)
Object Attributes Objects have many attributes Lifetime (Persistence) Type Scope Value/Address Language should: Precisely specify attributes Be Orthogonal -Separate Controls
Object Persistence vs. Lifetime Persistence - Persistent objects last longer than the process that created them. Examples - Files, databases. Memory for nonpersistent objects is called volatile (you lose data if powered down). Lifetime - When is the storage allocated to an object available?
Events Impacting Object Lifetime Lifetime has several aspects. Creation of objects Creation of bindings References to variables/subroutines/types/etc. (Re)activation and Deactivation of bindings Destruction of bindings Destruction of Objects
Allocation and Object Lifetime How can objects be allocated? Statically - Exist during Program's Lifetime Stack - Used for ephemeral objects. Heap Objects - Have controlled lifetimes Deallocation: How is it indicated? Explicitly - Destructors/free/delete Implicitly - Garbage collection Initialization - Separate (Constructors)
Static Allocation Done at compile time Literals (and constants) bound to values Variables bound to addresses Compiler notes undefined symbols Library functions Global Variables and System Constants Linker (and loader if DLLs used) resolve undefined references.
Stack Based Allocation Stack Layout determined at compile time Variables bound to offsets from top of stack. Layout called stack frame or activation record Compilers use registers Function parameters and results need consistent treatment across modules C/C++ use prototypes Eiffel/Java/Oberon use single definition
Parameter Passing Conventions Actual Parameters -at the call site Formal Parameters - at the subroutine declaration Address - a memory location, data objects containing addresses can be called: Pointer - use explicit dereferencing operation. Reference - use implicit dereferencing.
Parameter Passing Conventions Call by value - Copy to the function Call by reference - Pass reference Call by address - Pass address to function Call by result - Pass result back to caller Call by value result - Copy inputs to the function and copy results to caller. Parameters can be on stack or in registers.
Call Site Code Generation for Stack Allocation Call Setup Push Register Values on stack (if caller saves) Push parameters on stack (or load into registers) Call Function Push Return Address on stack Goto Function's Start Address Call Cleanup (if caller saves)
Subroutine Code Generation for Stack Allocation Prologue -Push Registers that will be overwritten on stack (if callee saves) Body of function Call Cleanup (if caller saves) Copy results (if any) Pop Parameters off stack. Pop registers Return
Stack and Frame Layout Stack here grows toward low addresses.
Heap Allocation Heap provides dynamic memory management. Not to be confused with binary heap or binomial heap data structures. Under the hood, may periodically need to request additional memory from the O/S. Requests large regions (requests are expensive). Done using a library (e.g. C) Or as part of the language (C++, Java, Lisp).
Heap Data Structures Must track allocated/Free Memory. Metadata is added (pointers, size, etc).
Memory Management Holes can form where memory is freed. Coalesce adjacent holes Small holes fragment the memory. Suppose you allocate a smaller chunk, which hole do we take it from? First fit - The first hole found that it fits into Best fit - The smallest segment it fits into Worst fit - The largest segment it fits into
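The three hole-selection policies can be sketched as follows. This is a minimal illustration, not an allocator: the function name choose_hole and the representation of holes as a plain list of sizes are assumptions made for the example.

```python
def choose_hole(holes, request, policy):
    """Return the index of the hole chosen for `request` units, or None.

    `holes` is a list of free-hole sizes in address order (a hypothetical
    representation; a real allocator would also track addresses).
    """
    fits = [(size, i) for i, size in enumerate(holes) if size >= request]
    if not fits:
        return None
    if policy == "first":
        return min(fits, key=lambda f: f[1])[1]   # lowest index that fits
    if policy == "best":
        return min(fits)[1]                        # smallest hole that fits
    if policy == "worst":
        return max(fits, key=lambda f: (f[0], -f[1]))[1]  # largest hole
    raise ValueError(f"unknown policy: {policy}")


holes = [120, 40, 300, 64]
for policy in ("first", "best", "worst"):
    print(policy, choose_hole(holes, 50, policy))
# first 0
# best 3
# worst 2
```

Every policy here scans the whole list, which is the cost that organizing the free list by block size tries to avoid.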
When to Free Memory Depends on language. Explicit deallocation -needed for library approaches (e.g. C). Implicit Deallocation - aka garbage collection Garbage is unreferenced memory. Compaction moves allocated memory to contiguous addresses (coalescing all holes). Can cause timing variations (care is needed in real time systems).
Speeding Up Searching for a Free Block Recall all fitting schemes require finding sufficiently large blocks. Idea: Organize Free List according to block size. Fibonacci Heap - Use Fibonacci numbers for block sizes. Buddy System - Use block sizes of 2^k
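With power-of-two block sizes, two calculations do most of the work: rounding a request up to the next power of two, and finding the neighbouring block (the "buddy") that a freed block can merge with. A minimal sketch, assuming offsets are measured from the start of the managed region; the names block_order and buddy_of are made up for the example.

```python
def block_order(request):
    """Smallest k such that 2**k >= request (request must be positive)."""
    if request <= 0:
        raise ValueError("request must be positive")
    return (request - 1).bit_length()


def buddy_of(offset, k):
    """Offset of the buddy of the 2**k-sized block starting at `offset`.

    Offsets are relative to the start of the managed region and are
    assumed to be multiples of 2**k.
    """
    return offset ^ (1 << k)   # flip the bit that distinguishes the pair


k = block_order(100)
print(k, 2 ** k)          # 7 128
print(buddy_of(0, k))     # 128
print(buddy_of(384, k))   # 256
```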
Introduction to Scope Scope refers to the region of a program during which a binding is active. Consider the following code segment, what should the output be?
Scope Rules Two popular answers to the problem. Static (lexical) scope -Use compile time analysis. Normally in block structured languages, the containing scope is preferred, output is 1 in this case. Dynamic Scope -Value found at run time by resolving to nearest stack frame in which the value is defined, output is 2 in this case. Lexical scope is more popular.
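The difference between the two rules can be shown with a small hypothetical program (it is not the code segment from the slide, which is not reproduced in this transcript). Python itself is lexically scoped, so dynamic scope is simulated here with an explicit stack of binding frames; all names are illustrative.

```python
# Static (lexical) scope: Python resolves `x` in show_static by looking at
# the scopes that textually enclose it, so caller_static's local x is ignored.
x = 1


def show_static():
    return x


def caller_static():
    x = 2  # local to caller_static; invisible to show_static
    return show_static()


# Dynamic scope, simulated with an explicit stack of binding frames.
frames = [{"x": 1}]  # the global frame sits at the bottom


def lookup(name):
    for frame in reversed(frames):  # most recently entered frame first
        if name in frame:
            return frame[name]
    raise NameError(name)


def show_dynamic():
    frames.append({})  # show_dynamic declares no variables of its own
    try:
        return lookup("x")
    finally:
        frames.pop()


def caller_dynamic():
    frames.append({"x": 2})  # caller's x shadows the global one while active
    try:
        return show_dynamic()
    finally:
        frames.pop()


print(caller_static())   # 1
print(caller_dynamic())  # 2
```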
Variants of Static Scope Single Global Scope (BASIC) - simplest Global and Local (Fortran) Fortran Common Blocks Supports separate compilation Gives base address of region Each program specifies (possibly different) layout Block Structured (Pascal)
Modules and Separate Compilation Modules support encapsulation (much like classes). Found in Modula 2, Euclid, Oberon and Ada. For separate compilation define interfaces (data and subroutines) Export statements - published interfaces Import statements - uses published interfaces Classes are extensions of modules
More Notation Fundamental question: Does the scope need to be explicitly imported to be visible? Yes - Referred to as closed scope. No - Referred to as open scope. Aliasing -having more than one way to refer to the same object.
Classes and Scope Classes provide encapsulation in object oriented programming (OOP). Supports aggregating heterogeneous data and operations together. Interfaces are published C++ public section in classes Internals can be hidden (ala private section in C++) Constructors and destructors supported.
OOP Features I think of OOP as providing Encapsulation -groups data with operations Inheritance -permits extension of more general base classes (and overriding behaviors) Polymorphism (overloading) - allows operators/subroutines to have behaviors dependent on the types of arguments and results expected.
Dynamic Scope Dynamic scoping prefers the instance defined in the most recently invoked function. Not very popular currently (hard to debug) Found in interpreted languages (APL, older Lisp dialects, e.g. Emacs Lisp). Fans claim that it makes customizing subroutines easier.
Symbol Table Design Criteria Symbol tables require: Fast insertion Fast lookup Occasional deletion (should be fast). Which motivates the use of hash tables. But ordinary hash tables are not good with nesting (ala classes/records/subroutines)
Operations on Symbol Tables (Static Scope) A Symbol Table should support: Entering Scope Leaving Scope Inserting a symbol (with scope information) Looking up a symbol (with scope information) It is often useful to store symbol table in object/executables e.g. For debugging or source level analysis
LeBlanc-Cook Symbol Table Lookup 1/5 Each Scope is assigned a serial number Elements are never deleted from the table A Scope Counter is maintained The first scope is 0 Every new scope encountered increments the counter To track nesting, a scope stack is maintained. Push to enter scope, pop when leaving scope
LeBlanc-Cook Symbol Table Lookup 2/5 Put all symbols in a single hash table. Keywords not inserted (can use another hash). Entries indexed using both name and scope. To lookup a name Look in the hash table for (name,scope) pair. If not found: Parent scope is found using stack Test if parent scope is open or exports symbol
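A simplified sketch of this lookup, using a Python dict as the single hash table keyed by (name, scope). It assumes every scope is open, so the export test is omitted; the class and method names are invented for the example and are not from LeBlanc and Cook.

```python
class ScopedSymbolTable:
    def __init__(self):
        self.table = {}          # (name, scope serial number) -> info
        self.scope_stack = [0]   # the first scope is 0
        self.next_scope = 1      # scope counter

    def enter_scope(self):
        self.scope_stack.append(self.next_scope)
        self.next_scope += 1
        return self.scope_stack[-1]

    def leave_scope(self):
        self.scope_stack.pop()   # entries are never deleted from the table

    def insert(self, name, info):
        self.table[(name, self.scope_stack[-1])] = info

    def lookup(self, name):
        # Try the current scope, then each enclosing scope on the stack.
        for scope in reversed(self.scope_stack):
            if (name, scope) in self.table:
                return self.table[(name, scope)]
        return None


symbols = ScopedSymbolTable()
symbols.insert("x", "global integer")
symbols.enter_scope()
symbols.insert("x", "local real")
print(symbols.lookup("x"))   # local real
symbols.leave_scope()
print(symbols.lookup("x"))   # global integer
```

Because scope 1 is no longer on the stack after leave_scope, its entry for x is skipped even though it is still stored in the table.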
LeBlanc-Cook Symbol Table Lookup 3/5 About Hashing and Hash Functions: Is the universe of keys known in advance? Yes - perfect minimal hashing may be possible. No - must handle collisions e.g. Quadratic Rehash or Chaining Symbol Table Algorithm has to handle collisions if hashing is used.
Dynamic Scope and Symbol Table Management Dynamic scope has different symbol table management needs than static scope Needs insert, lookup, enter scope, leave scope. Just like static scope Competing Approaches: simplicity vs. speed Association Lists - Simple, fast scope entry/exit. Central Reference Table - Like LeBlanc-Cook sans reference stack. Faster Lookup (common case?), slower scope entry/exit.
Association Lists Association Lists (A-Lists) combine list and stack treatment. When a new scope is entered Push its symbols on the stack Use a unidirectional linked list to implement stack. To find an item Scan stack starting at top of stack. When leaving a scope Pop all symbols in scope from the stack.
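A minimal A-list sketch along these lines, using a singly linked list as the stack. The class names, and the choice to tag each node with its scope depth so that leaving a scope knows where to stop popping, are assumptions for the example.

```python
class Node:
    def __init__(self, name, value, scope, next_node):
        self.name = name
        self.value = value
        self.scope = scope      # depth of the scope that declared the symbol
        self.next = next_node   # unidirectional link toward older bindings


class AList:
    def __init__(self):
        self.top = None
        self.depth = 0

    def enter_scope(self, bindings):
        """Push every symbol declared by the new scope."""
        self.depth += 1
        for name, value in bindings.items():
            self.top = Node(name, value, self.depth, self.top)

    def lookup(self, name):
        node = self.top
        while node is not None:   # scan from the top of the stack
            if node.name == name:
                return node.value
            node = node.next
        raise KeyError(name)

    def leave_scope(self):
        """Pop all symbols belonging to the scope being left."""
        while self.top is not None and self.top.scope == self.depth:
            self.top = self.top.next
        self.depth -= 1


env = AList()
env.enter_scope({"x": 1, "y": 10})
env.enter_scope({"x": 2})
print(env.lookup("x"), env.lookup("y"))   # 2 10
env.leave_scope()
print(env.lookup("x"))                    # 1
```

Lookup cost grows with the number of live bindings, which is the trade-off against the central reference table.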
Central Reference Tables (1) Central Reference Tables use hashing Elements are keyed by symbol Each element is a stack So we have one stack per symbol Newest Scope is on top Use a unidirectional linked list to implement stack.
Central Reference Tables (2) To insert a symbol/scope Hash on symbol, push symbol/scope on stack. To find a symbol in a scope Hash to symbol's stack Use scope at top of stack. When leaving scope Pop all symbols in that scope from top of their respective stacks.
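The operations above can be sketched as follows. For brevity the per-symbol stacks are Python lists rather than linked lists, and each scope keeps a list of the symbols it declared so that leaving the scope knows which stacks to pop; both choices, and all names, are assumptions for the example.

```python
class CentralReferenceTable:
    def __init__(self):
        self.stacks = {}     # symbol -> stack of (scope, value), newest on top
        self.scopes = [[]]   # per-scope list of declared symbols; index 0 is global

    def enter_scope(self):
        self.scopes.append([])

    def insert(self, name, value):
        scope = len(self.scopes) - 1
        self.stacks.setdefault(name, []).append((scope, value))
        self.scopes[-1].append(name)

    def lookup(self, name):
        stack = self.stacks.get(name)
        if not stack:
            raise KeyError(name)
        return stack[-1][1]   # binding from the newest scope

    def leave_scope(self):
        # Pop every symbol this scope declared from its own stack.
        for name in self.scopes.pop():
            self.stacks[name].pop()


crt = CentralReferenceTable()
crt.insert("x", 1)
crt.enter_scope()
crt.insert("x", 2)
print(crt.lookup("x"))   # 2
crt.leave_scope()
print(crt.lookup("x"))   # 1
```

Lookup is a single hash probe, while leaving a scope costs one pop per symbol the scope declared.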
Resolving Static Scope at Run Time Consider a function F containing G, i.e. F and G are nested functions. Suppose G uses an identifier in F's scope. How can G find F's frame pointer at run time? If G is always invoked by F, just do base + offset Called static chaining - offset computed at compile time. But what if G is separated from F by recursive invocations? Use pointer jumping (exploit transitivity and associativity) Called dynamic chaining - requires run time support
Subroutine Closures Consider when a function, F, is passed as an argument to another function, G E.g. Comparison Operators for sorting When G invokes F, how can we determine the scope? Subroutine closures describe a function's scope and instruction space address
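A small illustration in Python, where the built-in sorted plays the role of G and a comparison key function plays the role of F. The helper make_distance_key is a made-up name for the example.

```python
def make_distance_key(target):
    def key(value):
        return abs(value - target)   # `target` comes from the enclosing scope
    return key


key = make_distance_key(6)
closest_first = sorted([1, 8, 5, 12], key=key)
print(closest_first)                      # [5, 8, 1, 12]

# The closure carries its scope with it: the captured binding is stored
# alongside the function's code.
print(key.__closure__[0].cell_contents)   # 6
```

When sorted invokes key, the reference to target is resolved through the closure rather than through sorted's own frame.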
Overloading Defined An overloaded function or operator selects its semantics based on the types of its parameters and result Implicit overloading - provided by language e.g. addition in Pascal can handle real or integers Write and Writeln in Pascal Explicit overloading - programmers resolve actions e.g. Overloaded operators and methods in C++
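Python does not resolve overloads at compile time, but the standard library's functools.singledispatch gives a rough run-time analogue of explicit overloading: one function name, with the implementation selected by the type of the first argument only. The function describe is a made-up example.

```python
from functools import singledispatch


@singledispatch
def describe(value):
    return f"object {value!r}"    # fallback for types without an overload


@describe.register
def _(value: int):
    return f"integer {value}"


@describe.register
def _(value: float):
    return f"real {value}"


print(describe(3))      # integer 3
print(describe(2.5))    # real 2.5
print(describe("hi"))   # object 'hi'
```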
Some thoughts on Overloading Should user defined overloading of operators be permitted? Pro: Permits consistent interface e.g. A = B * C; good for integer, real, complex... Cons: You may need to read the entire program to understand a single line of code. e.g. A = B * C; What if B and C are objects? Inheritance? What to do with ephemeral objects? e.g. A * B * C
More Thoughts Meyer's Eiffel overloads A(i) Single parameter function Single index array Because functions and arrays are often interchangeable! Operator vs. function overloading Operator - Syntactic Sugar Function - Programmers know to read code
Challenges of Overloading Compiler needs to be smart about types Separate compilation hard e.g. Unix Linker - Predates C++ Name Space Mangling Can break system tools (profilers/debuggers) Compiler creates a unique name based on operator/function name and parameter/result types. No standard defined Hard to link code compiled by different compilers
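To make the idea of name mangling concrete, here is a deliberately simple, hypothetical scheme. It is not the encoding used by any real compiler; it only shows how a function name and its parameter types can be folded into one unique linker-visible symbol.

```python
def mangle(name, param_types):
    """Hypothetical scheme: name, two underscores, length-prefixed type names.

    A parameterless function gets the suffix 'v' (for "void").
    """
    encoded = "".join(f"{len(t)}{t}" for t in param_types)
    return f"{name}__{encoded}" if encoded else f"{name}__v"


print(mangle("max", ["int", "int"]))         # max__3int3int
print(mangle("max", ["double", "double"]))   # max__6double6double
print(mangle("reset", []))                   # reset__v
```

Two compilers that chose different encodings would produce different symbols for the same function, which is why linking their object files together is hard.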
Templates Templates in C++ are used for container classes. The base type describes elements in the container. The base type is a parameter to the template passed when instantiated (or in a typedef). Makes separate compilation hard Typically interface needs to be compiled by both publisher and user (header files)
Templates Pros and Cons Templates promote code reuse But also promote compiled code bloat Recovering from syntax errors is hard! Make a small STL error, get pages of errors And the error messages are not helpful! Vandevoorde's Xroma - Have template developer give compiler hints (also for code generation).
Summary Binding associates names and values Scope rules govern which name binds to which value in the event that a name is reused. Naming combined with type information permits overloading (promoting code reuse).