Chap. 8, Declaration Processing and Symbol Tables J. H. Wang Dec. 13, 2011.

Outline
  – Constructing a Symbol Table
  – Block-Structured Languages and Scopes
  – Basic Implementation Techniques
  – Advanced Features
  – Declaration Processing Fundamentals
  – Variable and Type Declarations
  – Class and Method Declarations
  – An Introduction to Type Checking

Constructing a Symbol Table
We walk (make a pass over) the AST for two purposes
  – To process symbol declarations
  – To connect each symbol reference with its declaration
Each AST node that names a symbol is enriched with a reference to the name’s entry in the symbol table

Static Scoping
Static scope: the scope of an identifier includes its defining block as well as any contained blocks that do not themselves contain a declaration for the identifier
Global scope: a name space shared by all compilation units
Scopes might be opened and closed by braces ({ } as in C and Java) or by reserved keywords (begin and end as in Ada and Algol)

A Symbol Table Interface
Methods
  – OpenScope()
  – CloseScope()
  – EnterSymbol(name, type)
  – RetrieveSymbol(name)
  – DeclaredLocally(name)
Ex.
  – (Fig. 8.2) Code to build the symbol table for the AST in Fig. 8.1
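The interface can be exercised with a small sketch. This is not the code of Fig. 8.2: the tuple-encoded AST, the `build` walker, and the snake_case method names are assumptions made for illustration, and the table is simply a stack of dictionaries.

```python
class SymbolTable:
    """Minimal stack-of-dicts table exposing the five interface methods."""

    def __init__(self):
        self.scopes = [{}]                    # scopes[-1] is the current scope

    def open_scope(self):
        self.scopes.append({})

    def close_scope(self):
        self.scopes.pop()

    def declared_locally(self, name):
        return name in self.scopes[-1]

    def enter_symbol(self, name, type_):
        if self.declared_locally(name):
            raise NameError(f"duplicate declaration of {name}")
        self.scopes[-1][name] = {"name": name, "type": type_}

    def retrieve_symbol(self, name):
        for scope in reversed(self.scopes):   # innermost open scope first
            if name in scope:
                return scope[name]
        return None


def build(node, table, refs):
    """Walk a hypothetical AST: ("block", [children]), ("decl", name, type), ("ref", name)."""
    kind = node[0]
    if kind == "block":
        table.open_scope()
        for child in node[1]:
            build(child, table, refs)
        table.close_scope()
    elif kind == "decl":
        table.enter_symbol(node[1], node[2])
    elif kind == "ref":
        # Connect the reference with the entry of its declaration.
        refs.append((node[1], table.retrieve_symbol(node[1])))


ast = ("block", [("decl", "x", "float"),
                 ("block", [("decl", "x", "int"), ("ref", "x")]),
                 ("ref", "x")])
refs = []
build(ast, SymbolTable(), refs)
```

The inner reference to x resolves to the inner declaration, and the outer reference resolves to the outer one once the inner scope has been closed.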

Block-Structured Languages and Scopes
Block-structured languages: languages that allow nested name scopes
  – Concepts introduced by Algol 60
Handling scopes
  – Current scope: the innermost context
  – Open scopes (or currently active scopes): the current scope and its surrounding scopes
  – Closed scopes: all other scopes

Some common visibility rules
  – Accessible names are those in the current scope and in all other open scopes
  – If a name is declared in more than one scope, a reference to the name is resolved to the innermost declaration
  – New declarations can be made only in the current scope
Global scope
  – extern: in C
  – public static: in Java

Compilation-unit scope: in C and C++
  – Names declared outside of all methods
Package-level scope: in Java
In C, every function definition is available in the global scope unless it has the static attribute
In C++ and Java, names declared within a class are available to all methods in the class
  – Protected members are available to the class’s subclasses
Names declared within a statement block are available to all contained blocks, unless the name is redeclared in an inner scope

One Symbol Table or Many?
Two common approaches to implementing block-structured symbol tables
  – A symbol table associated with each scope
  – Or a single, global table

An Individual Table for Each Scope
Because name scopes are opened and closed in a last-in, first-out (LIFO) manner, a stack is an appropriate data structure for organizing the search
  – The innermost scope appears at the top of the stack
  – OpenScope(): pushes a new symbol table
  – CloseScope(): pops the top symbol table
  – (Fig. 8.3)
Disadvantage
  – A name may have to be searched for in a number of symbol tables
  – The cost depends on the number of nonlocal references and the depth of nesting
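To make the disadvantage concrete, the sketch below counts how many per-scope tables a lookup probes. The `retrieve` helper and the scope contents are hypothetical; each scope is modeled as a plain dictionary.

```python
def retrieve(scope_stack, name):
    """Return (entry, tables_probed), searching from the top of the stack down."""
    probed = 0
    for table in reversed(scope_stack):
        probed += 1
        if name in table:
            return table[name], probed
    return None, probed


scope_stack = [{"g": "int"}]              # global scope at the bottom of the stack
for depth in range(1, 5):                 # open four nested scopes
    scope_stack.append({f"local{depth}": "int"})
```

A reference to local4 is found in the first table probed, whereas a reference to the global g has to pass through all five tables.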

One Symbol Table
All names are stored in the same table
  – Each name is uniquely identified by its scope name or depth
RetrieveSymbol() need not chain through scope tables to locate a name
More details in Sec. 8.3.3
  – (Fig. 8.8)

Basic Implementation Techniques
  – Entering and Finding Names
  – The Name Space
  – An Efficient Symbol Table Implementation

Entering and Finding Names
Examine the time needed to insert symbols, retrieve symbols, and maintain scopes
  – In particular, we pay attention to the cost of retrieving symbols
  – A name can be declared no more than once in each scope, but is typically referenced many times
Various approaches
  – Unordered list
  – Ordered list
  – Binary search trees
  – Balanced trees
  – Hash tables

Unordered List
Simplest approach
  – Array
  – Linked list or resizable array
All symbols in a given scope appear adjacently
  – Insertion: fast
  – Retrieval: linear scan
Impractically slow once the table grows large

Ordered List
Binary search: O(log n)
  – How to organize the ordered lists for a name in multiple scopes?
An ordered list of stacks (Fig. 8.4)
  – RetrieveSymbol() locates stacks using binary search
  – CloseScope() examines each stack and pops those stacks whose top symbol is declared in the abandoned scope
    To avoid such checking, we maintain a separate linking of symbol table entries that are declared at the same scope level (Sec. 8.3.3)
Fast retrieval, but expensive insertion
  – Advantageous when the space of symbols is known in advance, e.g., reserved keywords
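The ordered list of stacks can be sketched as follows. This is an illustration rather than the structure of Fig. 8.4: the class name, the parallel `names`/`stacks` lists, and the (depth, type) tuples are assumptions, and the standard `bisect` module supplies the binary search.

```python
from bisect import bisect_left


class OrderedListTable:
    def __init__(self):
        self.names = []     # identifier names, kept sorted
        self.stacks = []    # stacks[i] holds (depth, type) entries for names[i]
        self.depth = 0

    def _find(self, name):
        i = bisect_left(self.names, name)          # O(log n) binary search
        return i, i < len(self.names) and self.names[i] == name

    def open_scope(self):
        self.depth += 1

    def close_scope(self):
        for stack in self.stacks:                  # every stack must be examined
            if stack and stack[-1][0] == self.depth:
                stack.pop()
        self.depth -= 1

    def enter_symbol(self, name, type_):
        i, found = self._find(name)
        if not found:                              # O(n) shift keeps the list ordered
            self.names.insert(i, name)
            self.stacks.insert(i, [])
        self.stacks[i].append((self.depth, type_))

    def retrieve_symbol(self, name):
        i, found = self._find(name)
        if found and self.stacks[i]:
            return self.stacks[i][-1][1]           # innermost declaration is on top
        return None
```

Retrieval costs one binary search, while `enter_symbol` may shift list elements and `close_scope` visits every stack, which is the checking that the separate same-level linking is meant to avoid.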

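Binary Search Trees
Insert, search: O(log n), given random inputs
  – Average-case performance does not necessarily hold for symbol tables
  – Programmers do not choose identifiers at random!
  – In the worst case (e.g., names entered in sorted order) the tree degenerates into a list and operations take O(n)
Advantage
  – Simple, widely known implementation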
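A short sketch shows how insertion order affects an unbalanced binary search tree. The `Node`, `insert`, `height`, and `build` helpers and the sample names are hypothetical.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into an unbalanced BST and return the (possibly new) root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root                      # duplicates are ignored


def height(root):
    return 0 if root is None else 1 + max(height(root.left), height(root.right))


def build(names):
    root = None
    for name in names:
        root = insert(root, name)
    return root


mixed = ["m", "f", "t", "c", "h", "p", "w"]
balanced_height = height(build(mixed))            # names arrive in a favorable order
degenerate_height = height(build(sorted(mixed)))  # names arrive in alphabetical order
```

The same seven names give a tree of height 3 in the first order and a chain of height 7 when entered alphabetically.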

Balanced Trees
The worst-case scenario for binary search trees can be avoided if the tree is kept balanced
  – E.g.: red-black trees, splay trees
  – Insert, search: O(log n) (amortized for splay trees)

Hash Tables
Most common, due to their excellent performance
  – Insert, search: expected O(1), given
    A sufficiently large table
    A good hash function
    Appropriate collision-handling techniques
  – (Sec. 8.3.3)

The Name Space
Properties to consider
  – The name of a symbol does not change during compilation
  – Symbol names persist throughout compilation
  – Identifier names vary greatly in length
  – Unless an ordered list is maintained, comparisons of symbol names involve only equality and inequality
These properties favor one logical name space (Fig. 8.5)

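Names are inserted into the name space, but never deleted
Each name is referenced by two fields
  – Origin: where the name starts in the name space
  – Length: the number of characters in the name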
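A single logical name space with origin and length fields might look like the sketch below. The `NameSpace` class and its methods are assumptions; the dictionary stands in for whatever index the compiler uses to find an already-entered name.

```python
class NameSpace:
    def __init__(self):
        self.chars = []     # one growing buffer shared by all names
        self.index = {}     # name -> (origin, length); stand-in for the compiler's index

    def intern(self, name):
        """Enter name once and return its (origin, length) pair."""
        if name not in self.index:
            self.index[name] = (len(self.chars), len(name))
            self.chars.extend(name)          # names are appended, never deleted
        return self.index[name]

    def text(self, origin, length):
        return "".join(self.chars[origin:origin + length])


ns = NameSpace()
i_ref = ns.intern("i")
pos_ref = ns.intern("position")
```

Because names vary greatly in length, storing only two integers per symbol table entry keeps the entries uniform in size.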

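An Efficient Symbol Table Implementation
A symbol table entry contains the fields
  – Name: reference to the symbol’s name in the name space
  – Type: reference to the symbol’s type information
  – Hash: link to other symbols whose names hash to the same value
  – Var: reference to the next outer declaration of the same name
  – Level: link to other symbols declared in the same scope
  – Depth: nesting depth of the scope in which the symbol is declared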

Two index structures
  – Hash table
  – Scope display
    Symbols at the same level
  – (Fig. 8.7) & (Fig. 8.8)
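The combination of a hash table and a scope display can be sketched as below. This is not the code of Fig. 8.7 or Fig. 8.8: a Python dict replaces the hand-built hash table (so no explicit hash-chain field is needed), and the interpretation of `var` as the hidden outer declaration and `level` as the same-scope link is an assumption of this sketch.

```python
class Entry:
    def __init__(self, name, type_, depth, var, level):
        self.name, self.type, self.depth = name, type_, depth
        self.var = var        # outer declaration of the same name that this one hides
        self.level = level    # next entry declared in the same scope


class SingleTable:
    def __init__(self):
        self.table = {}               # name -> innermost visible Entry
        self.scope_display = [None]   # scope_display[d] heads the entry list for depth d
        self.depth = 0

    def open_scope(self):
        self.depth += 1
        self.scope_display.append(None)

    def close_scope(self):
        entry = self.scope_display.pop()
        while entry is not None:      # only symbols of the closing scope are visited
            if entry.var is not None:
                self.table[entry.name] = entry.var   # the outer declaration is visible again
            else:
                del self.table[entry.name]
            entry = entry.level
        self.depth -= 1

    def declared_locally(self, name):
        entry = self.table.get(name)
        return entry is not None and entry.depth == self.depth

    def enter_symbol(self, name, type_):
        if self.declared_locally(name):
            raise NameError(f"duplicate declaration of {name}")
        entry = Entry(name, type_, self.depth,
                      self.table.get(name), self.scope_display[self.depth])
        self.scope_display[self.depth] = entry
        self.table[name] = entry

    def retrieve_symbol(self, name):
        return self.table.get(name)   # one lookup, no chaining through scopes
```

Retrieval is a single table lookup, and closing a scope touches only the symbols linked from that scope's display slot.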

Advanced Features
Extensions of the simple symbol table framework to accommodate advanced features of modern programming languages
  – Name augmentation (overloading)
  – Name hiding and promotion
  – Modification of search rules

Records and Typenames
Aggregate data structures
  – struct, record
E.g. a.b.c.d
  – C, Ada, Pascal: the containers and the field must be completely specified
  – COBOL, PL/I: intermediate containers can be omitted if the reference can be unambiguously resolved
    » a.c, c.d: more difficult to read
Can be nested arbitrarily deeply
  – Tree
typedef: alias for a type
  – Symbol table

Overloading and Type Hierarchies
Method overloading is allowed in object-oriented languages such as C++ and Java
  – If each definition has a unique type signature
    Number and types of the parameters and return type (note that Java and C++ themselves require overloads to differ in their parameter lists)
  – E.g.: print(int), print(String)
  – To view a method definition not only in terms of its name but also its type signature
To encode the type signature of a method along with its name
  – E.g.: M(int): void
Or to record a method along with a list of its overloaded definitions
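Both representations can be combined in a small sketch: each name maps to a list of overloaded definitions, and a definition can be rendered in the encoded form M(int): void. The `mangle`, `declare`, and `resolve` helpers are hypothetical, and resolution here requires an exact match on parameter types.

```python
overloads = {}          # method name -> list of (param_types, return_type)


def mangle(name, param_types, return_type):
    """Encode a type signature along with the method name."""
    return f"{name}({', '.join(param_types)}): {return_type}"


def declare(name, param_types, return_type):
    definitions = overloads.setdefault(name, [])
    if any(params == tuple(param_types) for params, _ in definitions):
        raise NameError(f"{name} is already declared with these parameter types")
    definitions.append((tuple(param_types), return_type))


def resolve(name, arg_types):
    """Return the encoded signature of the definition matching arg_types exactly."""
    for params, return_type in overloads.get(name, []):
        if params == tuple(arg_types):
            return mangle(name, params, return_type)
    return None


declare("print", ["int"], "void")
declare("print", ["String"], "void")
```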

Operator overloading: allowed in C++ and Ada
Ada allows literals to be overloaded
  – E.g.: diamond in two different enumeration types: as a playing card, and as a gem
Pascal and Fortran allow the same symbol to represent the invocation of a method and the value of the method’s result
  – Two entries in the symbol table
C: the same name can be used as a local variable, a struct name, and a label

  – E.g.: (in Ex. 17)
      int main(void) {
          struct xxx { int a, b; } c;   /* xxx as a struct name */
          int xxx;                      /* xxx as a local variable */
      xxx: c.a = 1;                     /* xxx as a label */
      }
Type extension through subclassing is allowed in Java and C++
  – resize(Shape) vs. resize(Rectangle)
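One way to let xxx coexist as a struct name, a variable, and a label is to key the symbol table on the pair (name space, name). The sketch below assumes three hypothetical name-space tags, "tag", "ordinary", and "label", loosely following the separate name spaces that C defines for these kinds of identifiers.

```python
table = {}


def enter(space, name, info):
    """Enter name in the given name space; clashes are detected per name space."""
    key = (space, name)
    if key in table:
        raise NameError(f"{name} is already declared in name space {space}")
    table[key] = info


def retrieve(space, name):
    return table.get((space, name))


enter("tag", "xxx", "struct { int a, b; }")
enter("ordinary", "xxx", "int")
enter("label", "xxx", "statement label")
```

The syntactic context of a reference (after the keyword struct, as a goto target, or elsewhere) selects which name space to search.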

Implicit Declarations
In some languages, the appearance of a name in a certain context serves to declare the name as well
  – E.g.: labels in C
  – In Fortran: the type is inferred from the identifier’s first letter
  – In Ada: a loop index is implicitly declared to be of the same type as the range specifier
  – A new scope is opened for the loop so that the loop index cannot clash with an existing variable
    E.g. for (int i=1; i<10; i++) { … }
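The Fortran rule can be sketched as a lookup that declares a name on first use. The default convention applied here is that names beginning with I through N are INTEGER and all others are REAL; the `retrieve_or_declare` helper and the flat dictionary table are hypothetical.

```python
def implicit_type(name):
    """Fortran default typing: I through N means INTEGER, anything else REAL."""
    return "INTEGER" if name[0].upper() in "IJKLMN" else "REAL"


def retrieve_or_declare(table, name):
    """Return the type of name, entering an implicit declaration on first use."""
    if name not in table:
        table[name] = implicit_type(name)
    return table[name]


table = {"count": "INTEGER"}          # an explicit declaration overrides the default
```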

Export and Import Directives
Export: some local scope names are to become visible outside that scope
  – Typically associated with modularization features such as Ada packages, C++ classes, C compilation units, and Java classes
    Java: public attribute, e.g., the String class in the java.lang package
    C: all functions are known outside their compilation unit unless the static attribute is specified
Import: in a large software system, the space of global names can become polluted and disorganized
  – C, C++: header files
  – Java: import
  – Ada: use

Altered Search Rules
To alter the way in which symbols are found in the symbol table
  – In Pascal: with R do S
    First try to resolve an identifier as a field of the record R
    Advantageous if R is a complex name
    Can usually generate efficient code
Forward references in recursive data structures or methods
  – A portion of the program references a definition that has not yet been processed
  – Must be announced in some languages
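The altered search rule for a Pascal with statement can be sketched as a lookup that consults record fields before the ordinary scopes. The `resolve` helper, the list of active with-records, and the dictionary scopes are all hypothetical.

```python
def resolve(name, with_records, scopes):
    """Try the fields of active with-records first, then the open scopes."""
    for fields in reversed(with_records):      # innermost with statement first
        if name in fields:
            return ("field", fields[name])
    for scope in reversed(scopes):             # then the usual innermost-first search
        if name in scope:
            return ("variable", scope[name])
    return None


scopes = [{"x": "real", "r": "record"}]
record_r = {"x": "integer", "y": "integer"}    # fields of the record R
```

Inside with R do S, the name x denotes the field R.x. Outside, the same lookup with an empty with-record list finds the variable x.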

Symbol Table Summary
The symbol table organization in this chapter efficiently represents scope-declared symbols in a block-structured language
Most languages include rules for symbol promotion to a global scope
Issues such as inheritance, overloading, and aggregate data types must be considered

Declaration Processing Fundamentals
Attributes in the symbol table
  – Internal representations of declarations
  – Identifiers are used in many different ways in a modern programming language
    Variables, constants, types, procedures, classes, and fields
    Not every identifier will have the same set of attributes
  – We need a data structure to store the variety of information
    Using a struct that contains a tag, and a union with one variant for each possible value of the tag
    Using an object-based approach: an Attributes class and appropriate subclasses
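The object-based approach might be sketched with a small class hierarchy. Only the name Attributes comes from the slide; the subclasses, their fields, and the `describe` helper are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Attributes:
    name: str                      # every kind of identifier has a name


@dataclass
class VariableAttributes(Attributes):
    type_name: str                 # hypothetical: the variable's declared type


@dataclass
class TypeAttributes(Attributes):
    size_in_bytes: int             # hypothetical: storage needed by values of the type


@dataclass
class ProcedureAttributes(Attributes):
    param_types: tuple
    return_type: str


def describe(attr):
    """The subclass plays the role that the tag plays in the struct-and-union design."""
    if isinstance(attr, VariableAttributes):
        return f"variable {attr.name}: {attr.type_name}"
    if isinstance(attr, TypeAttributes):
        return f"type {attr.name} ({attr.size_in_bytes} bytes)"
    if isinstance(attr, ProcedureAttributes):
        return f"procedure {attr.name}({', '.join(attr.param_types)}): {attr.return_type}"
    return f"identifier {attr.name}"
```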

Type Descriptor Structures

Type Checking Using an Abstract Syntax Tree
Using the visitor pattern (in Chap. 7)
  – SemanticsVisitor: a subclass of Visitor
    The top-level visitor for processing declarations and doing semantic checking on the AST nodes
  – TopDeclVisitor
    A specialized visitor invoked by SemanticsVisitor for processing declarations
  – TypeVisitor
    A specialized visitor used to handle an identifier that represents a type or a syntactic form that defines a type (such as an array)

Variable and Type Declarations
Simple variable declarations
  – A type name and a list of identifiers (Fig. 8.12)
Visitor actions: (Fig. 8.13)
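Processing a simple variable declaration can be sketched as follows. This is not the code of Fig. 8.13: the method names, the "errorType" marker, the error messages, and the single flat dictionary used as the symbol table are assumptions; only the visitor class names come from the slides.

```python
class TypeVisitor:
    """Resolves an identifier that is supposed to name a type."""

    def __init__(self, table):
        self.table = table

    def visit_type_name(self, name):
        entry = self.table.get(name)
        if entry is None or entry["kind"] != "type":
            return "errorType"
        return entry["type"]


class TopDeclVisitor:
    """Processes declarations and enters the declared names in the table."""

    def __init__(self, table):
        self.table = table
        self.errors = []

    def visit_variable_declaration(self, type_name, identifiers):
        declared_type = TypeVisitor(self.table).visit_type_name(type_name)
        if declared_type == "errorType":
            self.errors.append(f"{type_name} is not a type name")
        for ident in identifiers:
            if ident in self.table:
                self.errors.append(f"{ident} is already declared")
            else:
                self.table[ident] = {"kind": "variable", "type": declared_type}


table = {"int": {"kind": "type", "type": "int"}}
visitor = TopDeclVisitor(table)
visitor.visit_variable_declaration("int", ["a", "b", "a"])
```

Entering the variable even when its type name is invalid is a deliberate choice in this sketch: later references to the name are then not reported as undeclared as well.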

Visit Method

Handling Type Names

Type Declarations
A name and a description of the type to be associated with it
  – (Fig. 8.15)
  – Visit method: (Fig. 8.16)

Thanks for Your Attention!
