Types and Programming Languages Lecture 5 Simon Gay Department of Computing Science University of Glasgow 2006/07.


A Practical Interlude
We want to understand how to convert the formal specification of a type system into an implemented typechecker. We will build typecheckers for the simple expression language (SEL) and the simple functional language (SFL), and use them as the basis for implementations of more complex type systems later.
The process is fairly straightforward, but we need to take care with some details:
- correct handling of variables and environments
- production of useful error messages.
We will use Java for our implementations. You might find it interesting to look at Pierce's implementations in OCaml.

Implementing Typechecking
Typechecking consists of traversing the AST, checking that the typing rules are obeyed. This requires establishing the type of each expression.
Later stages of compilation require type information, so the type of each expression must be stored.
Variables must be matched with declarations, so scoping rules are also checked.
The process is often called contextual analysis. Perhaps this means more than typechecking, but we'll just refer to typechecking.
The process of establishing the type of every expression is sometimes called elaboration.

Example: Typechecking a Triangle Program
The following simple program is written in Triangle, a Pascal-like language defined by Watt and Brown in their book Programming Language Processors in Java.
    let
      var n : Integer;
      var c : Char
    in begin
      c := '&';
      n := n + 1
    end

Example: Typechecking a Triangle Program
Its abstract syntax tree:
[Figure: abstract syntax tree of the program. Node labels: Program, LetCommand, SequentialDeclaration, SequentialCommand, VarDec, AssignCommand, SimpleT, SimpleV, VnameExpr, CharExpr, IntExpr, BinaryExpression, Ident, CharLit, IntLit, Op. Leaves: n, Integer, c, Char, c, '&', n, n, +, 1.]

Example: Typechecking a Triangle Program
Traversal:
[Figure: the same abstract syntax tree, showing the order in which the nodes are traversed.]

Example: Typechecking a Triangle Program
Checking:
[Figure: the same abstract syntax tree, with expression nodes annotated with the types ": char" and ": int".]

Implementing Typechecking
The details depend on the representation of ASTs, which in turn depends partly on the implementation language. For example, in a functional language we define a datatype corresponding to the abstract syntax of the language. In ML the datatype for SEL might look like this:
    datatype expr = IntLit of int
                  | BoolLit of bool
                  | Eq of expr * expr
                  | Plus of expr * expr
                  | And of expr * expr
                  | Cond of expr * expr * expr

Implementing Typechecking (ML example)
The typechecker is a function from expr to … what?
We can define another datatype for elaborated ASTs. In general this must represent the type of every expression, but in SEL the only expression whose type is not obvious is the conditional:
    datatype ty = Int | Bool
    datatype typed_expr = IntLit of int
                        | BoolLit of bool
                        | Eq of typed_expr * typed_expr
                        | Plus of typed_expr * typed_expr
                        | And of typed_expr * typed_expr
                        | Cond of typed_expr * typed_expr * typed_expr * ty
Compare this with Pierce's approach.

Implementing Typechecking (ML example)
The typechecker is a function from expr to typed_expr * ty.
    fun check (IntLit n) = (IntLit n, Int)
      | check (BoolLit b) = (BoolLit b, Bool)
      | check (Eq(e,f)) =
          let val (e,t) = check e
              val (f,u) = check f
          in if (t = Int) andalso (u = Int)
             then (Eq(e,f), Bool)
             else error
          end
      | ...
      | check (Cond(c,e,f)) =
          let val (c,t) = check c
              val (e,u) = check e
              val (f,v) = check f
          in if (t = Bool) andalso (u = v)
             then (Cond(c,e,f,u), u)
             else error
          end
What do we do at the points marked error? Also, we need to consider the different error cases.

Implementing Typechecking
We're going to use Java, which means that
- we use an object-oriented representation of ASTs
- we don't need to rebuild the elaborated AST, because we can store type information by updating the original AST or another data structure
- we have more choice about how to implement AST traversal.
The representation of ASTs uses a natural OO style:
- define an abstract class for each kind of phrase
- define a class for each specific way of constructing a phrase.
Watt and Brown's book describes this in detail.

Classes for the Simple Expression Language
    abstract class Expr { }
    class IntLitExpr extends Expr {
      int value;
    }
    class BoolLitExpr extends Expr {
      boolean value;
    }
    class EqExpr extends Expr {
      Expr left, right;
    }

Classes for the Simple Expression Language
    class PlusExpr extends Expr {
      Expr left, right;
    }
    class AndExpr extends Expr {
      Expr left, right;
    }
    class CondExpr extends Expr {
      Expr cond, then_br, else_br;
    }

Implementing Tree Traversal: instanceof
One possibility is to copy the functional language approach and implement a case-analysis on the class of an Expr object.
    Type check(Expr e) {
      if (e instanceof IntLitExpr)
        return representation of type int
      else if (e instanceof BoolLitExpr)
        return representation of type bool
      else if (e instanceof EqExpr) {
        Type t = check(((EqExpr)e).left);
        Type u = check(((EqExpr)e).right);
        if (t == representation of type int && u == representation of type int)
          return representation of type bool
        ...

Implementing Tree Traversal: instanceof
This approach leads to a messy nested if, which can't be converted into a switch because Java has no mechanism for switching on the class of an object.
Also this technique is not very object-oriented: instead of explicitly using instanceof, we prefer to arrange for analysis of an object's class to be done via the built-in mechanisms of overloading and dynamic method dispatch.
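As a minimal sketch of the case analysis described above, the following completes the instanceof-based checker for all six SEL expression forms. The Type enum, the constructors, the wrapper class InstanceofExample and the use of IllegalArgumentException to report errors are assumptions made for illustration; they are not taken from the course implementation.

```java
// Illustrative sketch only: every name other than the Expr classes is an assumption.
final class InstanceofExample {
    enum Type { INT, BOOL }

    static abstract class Expr { }
    static class IntLitExpr extends Expr {
        int value;
        IntLitExpr(int value) { this.value = value; }
    }
    static class BoolLitExpr extends Expr {
        boolean value;
        BoolLitExpr(boolean value) { this.value = value; }
    }
    static class EqExpr extends Expr {
        Expr left, right;
        EqExpr(Expr l, Expr r) { left = l; right = r; }
    }
    static class PlusExpr extends Expr {
        Expr left, right;
        PlusExpr(Expr l, Expr r) { left = l; right = r; }
    }
    static class AndExpr extends Expr {
        Expr left, right;
        AndExpr(Expr l, Expr r) { left = l; right = r; }
    }
    static class CondExpr extends Expr {
        Expr cond, then_br, else_br;
        CondExpr(Expr c, Expr t, Expr e) { cond = c; then_br = t; else_br = e; }
    }

    // Returns the type of e, or throws IllegalArgumentException if e is ill-typed.
    static Type check(Expr e) {
        if (e instanceof IntLitExpr) {
            return Type.INT;
        } else if (e instanceof BoolLitExpr) {
            return Type.BOOL;
        } else if (e instanceof EqExpr) {
            EqExpr eq = (EqExpr) e;
            requireBoth(Type.INT, check(eq.left), check(eq.right), "==");
            return Type.BOOL;
        } else if (e instanceof PlusExpr) {
            PlusExpr plus = (PlusExpr) e;
            requireBoth(Type.INT, check(plus.left), check(plus.right), "+");
            return Type.INT;
        } else if (e instanceof AndExpr) {
            AndExpr and = (AndExpr) e;
            requireBoth(Type.BOOL, check(and.left), check(and.right), "&");
            return Type.BOOL;
        } else if (e instanceof CondExpr) {
            CondExpr c = (CondExpr) e;
            if (check(c.cond) != Type.BOOL) {
                throw new IllegalArgumentException("condition must be bool");
            }
            Type t = check(c.then_br);
            Type u = check(c.else_br);
            if (t != u) {
                throw new IllegalArgumentException("branches must have the same type");
            }
            return t;
        }
        throw new IllegalArgumentException("unknown kind of expression");
    }

    // Both operand types must equal the wanted type.
    static void requireBoth(Type want, Type l, Type r, String op) {
        if (l != want || r != want) {
            throw new IllegalArgumentException("operands of " + op + " must be " + want);
        }
    }

    public static void main(String[] args) {
        Expr e = new CondExpr(
            new EqExpr(new IntLitExpr(1), new IntLitExpr(2)),
            new IntLitExpr(3),
            new PlusExpr(new IntLitExpr(4), new IntLitExpr(5)));
        System.out.println(check(e)); // prints INT
    }
}
```

Throwing on the first error keeps the sketch short, but it stops at the first problem; a checker that reports several errors needs somewhere to collect them.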

Implementing Tree Traversal: Visitor Pattern
A more object-oriented approach is to use the visitor design pattern. (See Watt and Brown for more details.)
A visitor class implements the Visitor interface, and therefore contains a method for each kind of expression:
    interface Visitor {
      void visitIntLitExpr(IntLitExpr e);
      void visitBoolLitExpr(BoolLitExpr e);
      void visitEqExpr(EqExpr e);
      void visitPlusExpr(PlusExpr e);
      void visitAndExpr(AndExpr e);
      void visitCondExpr(CondExpr e);
    }

Implementing Tree Traversal: Visitor Pattern
The abstract class Expr contains a visit method:
    abstract class Expr {
      abstract void visit(Visitor v);
    }
and each class defines visit so that the appropriate method from the Visitor object is called:
    class EqExpr extends Expr {
      Expr left, right;
      void visit(Visitor v) {
        v.visitEqExpr(this);
      }
    }

Implementing Tree Traversal: Visitor Pattern
The typechecker is defined as a class which implements the Visitor interface. (Interface methods are implicitly public, so the implementing methods must be declared public.)
    class Checker implements Visitor {
      public void visitIntLitExpr(IntLitExpr e) {
        // store the type Int in association with e
      }
      ...
      public void visitCondExpr(CondExpr e) {
        e.cond.visit(this);
        e.then_br.visit(this);
        e.else_br.visit(this);
        // inspect the types of cond, then_br, else_br, and store type of e
      }
    }
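The outline above leaves open how a type is stored "in association with" a node. One possible answer, sketched here under assumptions, is a side table keyed by node identity. The Type enum, the types and errors fields, the constructors and the wrapper class VisitorExample are hypothetical, and only four of the six expression forms are covered to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: the side table and error list are assumptions.
final class VisitorExample {
    enum Type { INT, BOOL }

    interface Visitor {
        void visitIntLitExpr(IntLitExpr e);
        void visitBoolLitExpr(BoolLitExpr e);
        void visitEqExpr(EqExpr e);
        void visitCondExpr(CondExpr e);
    }

    static abstract class Expr {
        abstract void visit(Visitor v);
    }
    static class IntLitExpr extends Expr {
        int value;
        IntLitExpr(int value) { this.value = value; }
        void visit(Visitor v) { v.visitIntLitExpr(this); }
    }
    static class BoolLitExpr extends Expr {
        boolean value;
        BoolLitExpr(boolean value) { this.value = value; }
        void visit(Visitor v) { v.visitBoolLitExpr(this); }
    }
    static class EqExpr extends Expr {
        Expr left, right;
        EqExpr(Expr l, Expr r) { left = l; right = r; }
        void visit(Visitor v) { v.visitEqExpr(this); }
    }
    static class CondExpr extends Expr {
        Expr cond, then_br, else_br;
        CondExpr(Expr c, Expr t, Expr e) { cond = c; then_br = t; else_br = e; }
        void visit(Visitor v) { v.visitCondExpr(this); }
    }

    static class Checker implements Visitor {
        // Maps each well-typed node (by identity) to its type; no entry means ill-typed.
        final Map<Expr, Type> types = new IdentityHashMap<Expr, Type>();
        final List<String> errors = new ArrayList<String>();

        public void visitIntLitExpr(IntLitExpr e) { types.put(e, Type.INT); }
        public void visitBoolLitExpr(BoolLitExpr e) { types.put(e, Type.BOOL); }

        public void visitEqExpr(EqExpr e) {
            e.left.visit(this);
            e.right.visit(this);
            Type l = types.get(e.left);
            Type r = types.get(e.right);
            if (l == null || r == null) return; // a subexpression already failed
            if (l == Type.INT && r == Type.INT) types.put(e, Type.BOOL);
            else errors.add("operands of == must be int");
        }

        public void visitCondExpr(CondExpr e) {
            e.cond.visit(this);
            e.then_br.visit(this);
            e.else_br.visit(this);
            Type c = types.get(e.cond);
            Type t = types.get(e.then_br);
            Type u = types.get(e.else_br);
            if (c == null || t == null || u == null) return; // already reported
            if (c == Type.BOOL && t == u) types.put(e, t);
            else errors.add("ill-typed conditional");
        }
    }

    public static void main(String[] args) {
        Expr e = new CondExpr(new BoolLitExpr(true), new IntLitExpr(1), new IntLitExpr(2));
        Checker checker = new Checker();
        e.visit(checker);
        System.out.println(checker.types.get(e) + " " + checker.errors); // prints INT []
    }
}
```

Because the visit methods return void, the result of checking a subexpression has to travel through the table rather than through a return value, which is why each method reads its children's entries after visiting them.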

Implementing Typechecking: Tools
If we want to implement a typechecker (for SEL or SFL, say) then we also need a parser. It is convenient to use an automated tool to generate as much as possible of the front-end machinery.
We will use SableCC, a compiler construction tool developed at McGill University in Canada.
SableCC is given an annotated grammar, and generates
- Java class definitions to represent syntax trees, supporting the use of visitors
- a parser
- a more flexible (in some ways) version of the visitor pattern.

A SableCC Grammar for SEL
A grammar for SEL, suitable for SableCC, begins with a specification of tokens:
    Package sel;
    Helpers
      digit = ['0' .. '9'];
      tab = 9; cr = 13; lf = 10;
      space = ' ';
      graphic = [[ ] + tab];
    Tokens
      blank = (space | tab | cr | lf)* ;
      comment = '//' graphic* (cr | lf);
      int = digit digit*;
      plus = '+'; and = '&'; eq = '==';
      if = 'if'; then = 'then'; else = 'else';
      true = 'true'; false = 'false';
      lparen = '('; rparen = ')';
    Ignored Tokens
      blank, comment;

A SableCC Grammar for SEL
Followed by the productions:
    Productions
      expression = {term} term
                 | {plus} [left]:term plus [right]:term
                 | {and} [left]:term and [right]:term
                 | {eq} [left]:term eq [right]:term
                 | {cond} if [cond]:expression
                          then [then_branch]:expression
                          else [else_branch]:expression;
      term = {int_lit} int
           | {bool_lit} bool
           | {exp} lparen expression rparen;
      bool = {true} true
           | {false} false;

A SableCC Grammar for SEL
Exercise: Draw a parse tree for the expression 1 + (2 + 3). Why are the brackets necessary and why has the grammar been defined in a way that makes them necessary?

Syntax Tree Classes for SEL
For each non-terminal in the grammar, SableCC generates an abstract class, for example:
    abstract class PExpression extends Node {}
where Node is a pre-defined class of syntax tree nodes which provides some general functionality.
Similarly we get abstract classes PTerm and PBool.
The names of these classes are systematically generated from the names of the non-terminals.

Syntax Tree Classes for SEL
For each production, SableCC generates a class, for example:
    class APlusExpression extends PExpression {
      PTerm _left_;
      PTerm _right_;
      public void apply(Switch sw) {
        ((Analysis) sw).caseAPlusExpression(this);
      }
    }
There are also set and get methods for _left_ and _right_, constructors, and other housekeeping methods which we won't use.

Using SableCC's Visitor Pattern
The main way of using SableCC's visitor pattern is to define a class which extends DepthFirstAdapter. By overriding the methods inAPlusExpression or outAPlusExpression etc. we can specify code to be executed when entering or leaving each node during a depth-first traversal of the syntax tree.
If we want to modify the order of traversal then we can override caseAPlusExpression etc., but this is often not necessary.
The in and out methods return void, but the class provides
    Hashtable in, out;
which we can use to store types of expressions.

Typechecking SEL
We define class Checker extends DepthFirstAdapter and override the out methods.
We use the out Hashtable to store and retrieve the type of each expression, using methods setOut and getOut.
We represent types by means of an abstract class Type with subclasses IntType and BoolType.
Errors are added to an ErrorTable by creating an object of the right error class. At the end of typechecking, errors are reported.

Typechecking SEL: PlusExpression
    public void outAPlusExpression(APlusExpression node) {
      Type leftType = (Type)getOut(node.getLeft());
      Type rightType = (Type)getOut(node.getRight());
      if (leftType != null) {
        if (!(leftType instanceof IntType)) {
          errorTable.add(node.getPlus().getLine(),
                         new PlusLeftError(leftType.name()));
        }
      }
      if (rightType != null) {
        if (!(rightType instanceof IntType)) {
          errorTable.add(node.getPlus().getLine(),
                         new PlusRightError(rightType.name()));
        }
      }
      if ((leftType instanceof IntType) && (rightType instanceof IntType)) {
        setOut(node, new IntType());
      }
    }
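To experiment with this style without running SableCC, the following hand-written miniature imitates the idea: a base adapter fixes the depth-first order and offers empty out hooks plus a getOut/setOut table, and the checker overrides only the hooks. This is not SableCC's generated code. The classes Node, IntLit, BoolLit, Plus, Adapter and OutTableExample, the Type enum and the list of error strings are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;

// Hand-written stand-in for the generated classes; all names are hypothetical.
final class OutTableExample {
    enum Type { INT, BOOL }

    static abstract class Node {
        abstract void apply(Adapter a);
    }
    static class IntLit extends Node {
        final int value;
        IntLit(int value) { this.value = value; }
        void apply(Adapter a) { a.caseIntLit(this); }
    }
    static class BoolLit extends Node {
        final boolean value;
        BoolLit(boolean value) { this.value = value; }
        void apply(Adapter a) { a.caseBoolLit(this); }
    }
    static class Plus extends Node {
        final Node left, right;
        final int line; // source line of the + token
        Plus(Node left, Node right, int line) { this.left = left; this.right = right; this.line = line; }
        void apply(Adapter a) { a.casePlus(this); }
    }

    // The case methods fix the traversal order; the out methods are empty hooks.
    static class Adapter {
        private final Hashtable<Node, Object> out = new Hashtable<Node, Object>();
        Object getOut(Node n) { return out.get(n); }
        void setOut(Node n, Object o) { out.put(n, o); }

        void caseIntLit(IntLit n) { outIntLit(n); }
        void caseBoolLit(BoolLit n) { outBoolLit(n); }
        void casePlus(Plus n) {
            n.left.apply(this);
            n.right.apply(this);
            outPlus(n); // runs after both children have been visited
        }
        void outIntLit(IntLit n) { }
        void outBoolLit(BoolLit n) { }
        void outPlus(Plus n) { }
    }

    static class Checker extends Adapter {
        final List<String> errors = new ArrayList<String>();

        @Override void outIntLit(IntLit n) { setOut(n, Type.INT); }
        @Override void outBoolLit(BoolLit n) { setOut(n, Type.BOOL); }
        @Override void outPlus(Plus n) {
            Type l = (Type) getOut(n.left);
            Type r = (Type) getOut(n.right);
            // A null type means the child already failed, so no second error is reported.
            if (l != null && l != Type.INT) errors.add("line " + n.line + ": left operand of + has type " + l);
            if (r != null && r != Type.INT) errors.add("line " + n.line + ": right operand of + has type " + r);
            if (l == Type.INT && r == Type.INT) setOut(n, Type.INT);
        }
    }

    public static void main(String[] args) {
        Node inner = new Plus(new IntLit(1), new BoolLit(true), 1);
        Node outer = new Plus(inner, new IntLit(2), 1);
        Checker checker = new Checker();
        outer.apply(checker);
        System.out.println(checker.errors.size()); // prints 1
    }
}
```

The null checks are what prevent one mistake from producing a chain of follow-on errors: the outer addition sees no type for its left operand and stays silent.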

The SEL Typechecker
An implementation of a typechecker for SEL can be found on the course web page. You should study the implementation, the accompanying notes, and Worksheet 3.
Any questions about the implementation of the typechecker can be dealt with in a future tutorial.

Implementing an SFL Typechecker
An implementation of a typechecker for the Simple Functional Language can be found on the course web page and is described in the accompanying notes. You should study them in comparison with the SEL typechecker.
The typechecker is based on the SEL typechecker, with two main differences:
- expressions are typechecked with respect to an environment, so we need an implementation of environments
- function definitions and function applications must be checked, and type information for functions must be stored in the environment.
There are of course some changes to the grammar, including the fact that there is now syntax for the types int and bool.

Implementing Environments
An environment is essentially a lookup table, indexed by strings (identifier names) and containing two kinds of entry:
- variable with type
- function name with parameter types and result type.
We can use a Hashtable.

Nested Scopes
We must deal with nesting of scopes. Even though SFL does not have nested functions, there is still a global scope (containing type information for all functions) and a local scope within each function.
The class Env implements a stack of Hashtables. To look up a variable or function name, first look in the Hashtable on top of the stack. If it is not there, keep looking down the stack.
We will be able to use the same Env class for environments in languages with full scope nesting.

Example: Nested Scopes
    { int x; bool b;
      { float x; int y;
        code…x…y…b…
      }
      code…x…b…
    }
Stack of Hashtables inside the inner block (search this way, from the top down):
    x : float, y : int     (top)
    x : int, b : bool
Operations:
- openScope( ) creates a new Hashtable on the stack
- closeScope( ) removes the top Hashtable
- put(String n, EnvEntry e), get(String n)
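The four operations listed above can be sketched as follows. This is a minimal sketch, not the course's Env class: the EnvEntry shown here holds only a type name, the wrapper class EnvExample is hypothetical, and put assumes that at least one scope has been opened.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Hashtable;

// Illustrative sketch only; the real Env and EnvEntry classes may differ.
final class EnvExample {
    // Hypothetical entry: just the name of a variable's type.
    static class EnvEntry {
        final String type;
        EnvEntry(String type) { this.type = type; }
    }

    static class Env {
        // The head of the deque is the top of the stack (the innermost scope).
        private final Deque<Hashtable<String, EnvEntry>> scopes =
            new ArrayDeque<Hashtable<String, EnvEntry>>();

        void openScope() { scopes.push(new Hashtable<String, EnvEntry>()); }
        void closeScope() { scopes.pop(); }

        // Adds the entry to the innermost scope; assumes a scope is open.
        void put(String n, EnvEntry e) { scopes.peek().put(n, e); }

        // Searches from the innermost scope outwards; returns null if n is not declared.
        EnvEntry get(String n) {
            for (Hashtable<String, EnvEntry> scope : scopes) {
                EnvEntry e = scope.get(n);
                if (e != null) return e;
            }
            return null;
        }
    }

    public static void main(String[] args) {
        Env env = new Env();
        env.openScope();
        env.put("x", new EnvEntry("int"));
        env.put("b", new EnvEntry("bool"));
        env.openScope();
        env.put("x", new EnvEntry("float"));
        env.put("y", new EnvEntry("int"));
        System.out.println(env.get("x").type + " " + env.get("b").type); // prints float bool
        env.closeScope();
        System.out.println(env.get("x").type + " " + env.get("y")); // prints int null
    }
}
```

The inner declaration of x shadows the outer one only while the inner scope is open; after closeScope the outer entry is visible again and y is no longer found.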

Mutual Recursion
The SEL typechecker makes a single traversal of the syntax tree. If we want to typecheck SFL in a single pass, then in order to support mutually recursive functions we need to follow Pascal:
    function f(x:int):int; forward;
    function g(x:int):int begin g := f(x); end;
    function f(x:int):int begin f := g(x); end;
or Standard ML:
    fun f(x:int) = g(x)
    and g(x:int) = f(x)
Instead, to stick closely to the formal definition of SFL, we use two passes: the first just looks at function definitions and builds an initial environment containing their type information. The second pass then typechecks the function bodies in that environment.
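The two-pass idea can be illustrated on a deliberately tiny model, which is an assumption and not SFL itself: each function has one parameter, and its body is a single call that applies another function to that parameter. FunDef, FunEntry, check and the wrapper class TwoPassExample are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: a toy model of function definitions, not SFL.
final class TwoPassExample {
    enum Type { INT, BOOL }

    // One parameter; the body is the call callee(parameter).
    static class FunDef {
        final String name, callee;
        final Type paramType, resultType;
        FunDef(String name, Type paramType, Type resultType, String callee) {
            this.name = name;
            this.paramType = paramType;
            this.resultType = resultType;
            this.callee = callee;
        }
    }

    // Environment entry recording a function's signature.
    static class FunEntry {
        final Type paramType, resultType;
        FunEntry(Type paramType, Type resultType) {
            this.paramType = paramType;
            this.resultType = resultType;
        }
    }

    static List<String> check(List<FunDef> program) {
        // Pass 1: record every function's signature before looking at any body.
        Map<String, FunEntry> env = new HashMap<String, FunEntry>();
        for (FunDef f : program) {
            env.put(f.name, new FunEntry(f.paramType, f.resultType));
        }
        // Pass 2: check each body against the complete environment.
        List<String> errors = new ArrayList<String>();
        for (FunDef f : program) {
            FunEntry callee = env.get(f.callee);
            if (callee == null) {
                errors.add(f.name + ": undefined function " + f.callee);
            } else if (callee.paramType != f.paramType) {
                errors.add(f.name + ": argument of " + f.callee + " has the wrong type");
            } else if (callee.resultType != f.resultType) {
                errors.add(f.name + ": body has the wrong result type");
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // f calls g and g calls f; g is defined after its first use.
        List<FunDef> program = Arrays.asList(
            new FunDef("f", Type.INT, Type.INT, "g"),
            new FunDef("g", Type.INT, Type.INT, "f"));
        System.out.println(check(program)); // prints []
    }
}
```

A single pass over this program would reach the call to g inside f before g's signature had been recorded, which is exactly the problem that forward declarations or the and keyword solve in the one-pass designs.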

Making SFL More Powerful
We have a formal definition of the syntax, operational semantics and type system of SFL, and we have proved that the type system is sound.
Our design of the language itself was rather ad hoc, and we have seen that functions in SFL lack flexibility.
To make SFL look more like a real functional language, we need to build on a suitable theoretical foundation: the lambda calculus (λ-calculus).
When we have seen how to introduce functions properly, we'll go on to look at structured data types (e.g. records).

Exercise for Tutorial
The aim of next week's tutorial is to ensure that you understand the SEL typechecker. Please work through the exercises, which have the following main tasks, in advance.
- Using SableCC to build the syntax tree classes for SEL, then compiling and testing the typechecker.
- Understanding the structure of directories and files containing the SableCC-generated classes, the Checker class, and the error-reporting mechanism.
- Adding a new operator to the SEL grammar, using SableCC to rebuild the generated classes, extending Checker and defining appropriate new error classes.
In the tutorial we will discuss these exercises and any further details of the SEL example.