Semantic Analysis (Generating An AST) CS 471 September 26, 2007.



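CS 471 – Fall Semantic Analysis
[Figure: compiler pipeline. Source code goes into Lexical Analysis, which passes tokens to Parsing, which passes an AST to Semantic Analysis; valid programs come out as a decorated AST. The three phases report lexical errors, syntax errors, and semantic errors respectively.]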

Goals of a Semantic Analyzer
The compiler must do more than recognize whether a sentence belongs to the language…
Find all possible remaining errors that would make the program invalid
– undefined variables, types
– type errors that can be caught statically
Figure out useful information for later phases
– types of all expressions
– data layout

Semantic Actions
Can do useful things with the parsed phrases
– Each terminal and nonterminal may be associated with a type, e.g. for exp: INT the type is int
– For a rule A → B C D:
    the type of the action's result must match the type of A
    the value can be built from the values of B, C, and D

Semantic Actions
A semantic action is executed when a grammar production is reduced
– Recursive-descent parser: semantic code interspersed with control flow
– Yacc: fragments of C code attached to a grammar production
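The recursive-descent case can be sketched in a few lines of C. This is only an illustration, not course code: the function names (`rd_eval`, `parse_exp`, `parse_term`, `parse_factor`) are hypothetical, integer literals stand in for identifiers, and the grammar is layered into exp/term/factor so that `*` binds tighter than `+` and `-`. The semantic code (the arithmetic) sits directly inside the parsing control flow.

```c
#include <assert.h>
#include <ctype.h>

static const char *cur;   /* next unread input character */

static void skip_spaces(void) { while (*cur == ' ') cur++; }

/* factor: '-' factor | INT */
static int parse_factor(void) {
    skip_spaces();
    if (*cur == '-') { cur++; return -parse_factor(); }   /* semantic action: negate */
    assert(isdigit((unsigned char)*cur));                 /* sketch: no error recovery */
    int v = 0;
    while (isdigit((unsigned char)*cur)) v = v * 10 + (*cur++ - '0');
    return v;
}

/* term: factor { '*' factor } */
static int parse_term(void) {
    int v = parse_factor();
    for (;;) {
        skip_spaces();
        if (*cur != '*') return v;
        cur++;
        v *= parse_factor();                              /* semantic action: multiply */
    }
}

/* exp: term { ('+' | '-') term } */
static int parse_exp(void) {
    int v = parse_term();
    for (;;) {
        skip_spaces();
        if (*cur == '+')      { cur++; v += parse_term(); }
        else if (*cur == '-') { cur++; v -= parse_term(); }
        else return v;
    }
}

/* Parse and evaluate a whole input string. */
int rd_eval(const char *input) {
    cur = input;
    int v = parse_exp();
    skip_spaces();
    assert(*cur == '\0');   /* all input must be consumed */
    return v;
}
```

Calling `rd_eval("1 + 2 * 3")` parses and evaluates in a single pass; no tree is ever built.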

Interpreter
Could develop an interpreter that executes the program as part of the semantic actions!
Example grammar:
    E → id
    E → E + E
    E → E - E
    E → E * E
    E → -E

Unions in Yacc
%union allows us to declare a union datatype used to package the types/attributes of symbols
    %union {
        int pos;
        int ival;
        string sval;
        struct {
            int intval;
            enum Types valtype;
        } constantval;
        A_exp exp;
    }
Exported as YYSTYPE

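Types in Yacc
Using the members of the %union, tell Yacc the types of the symbols (a type is written as a %union member name in angle brackets: <member>)
Terminals (use %token)
    %token ID STRING
    %token INT
    %token COMMA SEMI LBRACE RBRACE …
And nonterminals (use %type)
    %type expression program
On the %type line, the bracketed member gives the type and the names after it are LHS symbols of productions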

Symbols in Yacc
The symbol $n (n > 0) refers to the attribute of the nth symbol on the RHS
The symbol $$ refers to the attribute of the LHS
The symbol $n (n ≤ 0) refers to contextual information
Note: an action in the middle of a rule counts as a symbol when numbering!
Example: in expr: expr1 PLUS expr2, $$ is the attribute of expr, $1 of expr1, and $3 of expr2 (PLUS is $2)

Interpreter in Yacc
    %{
    declarations of yylex and yyerror
    %}
    %union {int num; string id;}
    %token INT
    %token ID
    %type exp
    %start exp
    %left PLUS MINUS
    %left TIMES
    %left UMINUS
    %%
    [please fill in solution]
Grammar to implement:
    E → id
    E → E + E
    E → E - E
    E → E * E
    E → -E
Recall: in expr: expr1 PLUS expr2, $$ is the attribute of expr, $1 of expr1, and $3 of expr2

Internally: A Semantic Stack
Implemented using a stack parallel to the state stack

    Stack                               Input          Action
    (empty)                             1 + 2 * 3 $    shift
    INT: 1                              + 2 * 3 $      reduce
    exp: 1                              + 2 * 3 $      shift
    exp: 1  +:                          2 * 3 $        shift
    exp: 1  +:  INT: 2                  * 3 $          reduce
    exp: 1  +:  exp: 2                  * 3 $          shift
    exp: 1  +:  exp: 2  *:              3 $            shift
    exp: 1  +:  exp: 2  *:  INT: 3      $              reduce
    exp: 1  +:  exp: 2  *:  exp: 3      $              reduce
    exp: 1  +:  exp: 6                  $              reduce
    exp: 7                              $              accept
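The trace above can be reproduced with a small hand-written value stack. The sketch below is not how Yacc is implemented: it replaces the LALR state stack with a simple precedence comparison, handles only `+` and `*` over non-negative integer literals, and assumes well-formed input. All names (`SemStack`, `Slot`, `sem_eval`) are hypothetical. What it does show is the semantic stack itself: each reduction pops `$1 $2 $3` and pushes `$$`.

```c
#include <assert.h>
#include <ctype.h>

#define MAX_DEPTH 64

/* One semantic-stack slot: an operator token or the value of an exp. */
typedef struct {
    char op;     /* '+' or '*' for an operator slot, 0 for an exp slot */
    int  value;  /* attribute of the exp; unused for operator slots */
} Slot;

typedef struct {
    Slot slots[MAX_DEPTH];
    int  top;    /* number of slots in use */
} SemStack;

static void push(SemStack *s, char op, int value) {
    assert(s->top < MAX_DEPTH);
    s->slots[s->top].op = op;
    s->slots[s->top].value = value;
    s->top++;
}

/* Higher number binds tighter; end of input (or anything else) is 0. */
static int prec(char op) { return op == '*' ? 2 : op == '+' ? 1 : 0; }

/* Reduce "exp op exp" on top of the stack to a single exp. */
static void reduce(SemStack *s) {
    assert(s->top >= 3);
    int  rhs = s->slots[s->top - 1].value;   /* $3 */
    char op  = s->slots[s->top - 2].op;      /* $2 */
    int  lhs = s->slots[s->top - 3].value;   /* $1 */
    s->top -= 3;
    push(s, 0, op == '+' ? lhs + rhs : lhs * rhs);   /* $$ */
}

/* Evaluate a well-formed expression such as "1 + 2 * 3". */
int sem_eval(const char *input) {
    SemStack s = { .top = 0 };
    const char *p = input;
    for (;;) {
        while (*p == ' ') p++;
        if (isdigit((unsigned char)*p)) {
            /* shift INT and reduce exp: INT in one step */
            int v = 0;
            while (isdigit((unsigned char)*p)) v = v * 10 + (*p++ - '0');
            push(&s, 0, v);
            continue;
        }
        /* Operator or end of input: reduce while the operator already on
           the stack binds at least as tightly (left associativity). */
        while (s.top >= 3 && prec(s.slots[s.top - 2].op) >= prec(*p))
            reduce(&s);
        if (*p == '\0') break;
        assert(*p == '+' || *p == '*');   /* sketch: only + and * */
        push(&s, *p++, 0);                /* shift the operator */
    }
    assert(s.top == 1);                   /* accept: a single exp remains */
    return s.slots[0].value;
}
```

For the input `1 + 2 * 3`, the two reductions at end of input produce `exp: 6` and then `exp: 7`, matching the last rows of the trace.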

Inlined TypeChecker and CodeGen
You can even type check and generate code:
    expr : expr PLUS expr
        { if ($1.type == $3.type &&
              ($1.type == IntType || $1.type == RealType))
              $$.type = $1.type;
          else
              error("+ applied on wrong type!");
          GenerateAdd($1, $3, $$);
        }
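The type rule buried in that action can also be stated on its own as plain C, which makes it easier to read and test. This is a minimal sketch: `IntType` and `RealType` come from the slide, while `StringType`, `ErrorType`, and the function name `check_plus` are assumptions added for illustration.

```c
#include <assert.h>

/* IntType and RealType appear on the slide; StringType and ErrorType
   are hypothetical additions for this sketch. */
typedef enum { IntType, RealType, StringType, ErrorType } Type;

/* Type rule for "expr PLUS expr": both operands must have the same
   numeric type. Returns the result type, or ErrorType on a mismatch. */
Type check_plus(Type left, Type right) {
    if (left == right && (left == IntType || left == RealType))
        return left;
    return ErrorType;
}

/* A few example calls showing the intended behaviour. */
void check_plus_examples(void) {
    assert(check_plus(IntType, IntType) == IntType);
    assert(check_plus(RealType, RealType) == RealType);
    assert(check_plus(IntType, RealType) == ErrorType);     /* no implicit conversion */
    assert(check_plus(StringType, StringType) == ErrorType); /* not numeric */
}
```

Returning an error type instead of calling `error` directly lets the caller decide how to report the problem and whether to skip code generation.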

Problems
– Difficult to read
– Difficult to maintain
– Compiler must analyze the program in the order it is parsed
Instead … we split up tasks

Compiler ‘main program’
    void Compile() {
        TokenStream l = Lexer(input);
        AST tree = Parser(l);
        if (TypeCheck(tree)) {
            /* generate and emit code only for programs that type check */
            IR ir = genIntermediateCode(tree);
            emitCode(ir);
        }
    }

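Thread of control
[Figure: compile calls parse on the Parser, which calls getToken on the Lexer, which calls readStream on the Input Stream. Data flows back the other way: characters from the input stream to the lexer, tokens from the lexer to the parser, and the AST from the parser to compile.]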

Producing the Parse Tree
Separates issues of syntax (parsing) from issues of semantics (type checking, translation to machine code)
– One leaf for every token
– One internal node for every reduction during parsing
Concrete parse tree represents concrete syntax
But … parse tree has problems
– Punctuation tokens are redundant
– The structure of the tree already conveys this info
Enter the Abstract Syntax Tree

AST
Abstract Syntax Tree is a tree representation of the program. Used for
– semantic analysis (type checking)
– some optimization (e.g. constant folding)
– intermediate code generation (sometimes intermediate code = AST with a somewhat different set of nodes)
Compiler phases = recursive tree traversals
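To make "compiler phases = recursive tree traversals" concrete, here is a minimal constant-folding pass over a toy AST. The node layout and every name (`Exp`, `IntExp`, `VarExp`, `OpExp`, `fold`) are hypothetical and much smaller than the Tiger abstract syntax; the point is only the shape of the traversal: recurse into the children, then rewrite the current node.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { EXP_INT, EXP_VAR, EXP_OP } ExpKind;
typedef enum { OP_PLUS, OP_TIMES } Oper;

typedef struct Exp_ *Exp;
struct Exp_ {
    ExpKind kind;
    union {
        int intval;                               /* EXP_INT */
        const char *name;                         /* EXP_VAR */
        struct { Oper oper; Exp left, right; } op; /* EXP_OP  */
    } u;
};

static Exp new_exp(ExpKind kind) {
    Exp e = malloc(sizeof *e);
    assert(e != NULL);   /* sketch: no out-of-memory recovery */
    e->kind = kind;
    return e;
}

Exp IntExp(int v)            { Exp e = new_exp(EXP_INT); e->u.intval = v; return e; }
Exp VarExp(const char *name) { Exp e = new_exp(EXP_VAR); e->u.name = name; return e; }
Exp OpExp(Oper oper, Exp left, Exp right) {
    Exp e = new_exp(EXP_OP);
    e->u.op.oper = oper;
    e->u.op.left = left;
    e->u.op.right = right;
    return e;
}

/* Constant folding: replace every operator node whose operands are both
   integer constants with a single constant node. Rewrites in place and
   returns the (possibly changed) node. */
Exp fold(Exp e) {
    if (e->kind != EXP_OP)
        return e;                       /* leaves are already folded */
    e->u.op.left  = fold(e->u.op.left); /* traverse children first */
    e->u.op.right = fold(e->u.op.right);
    Exp l = e->u.op.left, r = e->u.op.right;
    if (l->kind == EXP_INT && r->kind == EXP_INT) {
        int v = (e->u.op.oper == OP_PLUS) ? l->u.intval + r->u.intval
                                          : l->u.intval * r->u.intval;
        free(l);
        free(r);
        e->kind = EXP_INT;              /* this node becomes the constant */
        e->u.intval = v;
    }
    return e;
}
```

Folding `a + 2 * 3` leaves the `+` node in place, because `a` is not a constant, but turns its right child into the constant 6. A type checker or code generator over the same tree would have the same recursive structure.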

Do We Need An AST?
Old-style compilers: semantic actions generate code during parsing
Problems:
– hard to maintain
– limits language features
– not modular!
    expr ::= expr PLUS expr {: emitCode(add); :}
[Figure: input feeds the parser, which keeps a stack and emits code directly]

Interesting Detour
Old compilers didn’t create ASTs … not enough memory to store the entire program
Can also see the reason C requires forward declarations: it avoids an extra compilation pass
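The forward-declaration point can be seen in a small C example. In this sketch (the function names are made up for illustration) `is_even` calls `is_odd` before the body of `is_odd` has been seen, so a compiler working in a single pass needs the prototype to know the callee's type at that point.

```c
#include <assert.h>
#include <stdbool.h>

/* Forward declaration: gives the compiler the type of is_odd before
   its definition appears later in the file. */
bool is_odd(unsigned n);

bool is_even(unsigned n) {
    return n == 0 ? true : is_odd(n - 1);   /* uses the prototype above */
}

bool is_odd(unsigned n) {
    return n == 0 ? false : is_even(n - 1);
}

/* Example calls. */
void parity_examples(void) {
    assert(is_even(10));
    assert(is_odd(7));
    assert(!is_even(3));
}
```

A compiler that first builds an AST for the whole file could collect all function signatures in one traversal and check calls in a second, which is why such a language need not require the prototype.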

Positions
In a one-pass compiler, errors are reported using the position of the lexer as an approximation (global var)
Abstract syntax data structures must have pos fields
– Line number
– Char number
Line number is unambiguous
Char number is a matter of style

Abstract Syntax for Tiger
    /* absyn.h */
    typedef struct A_var_ *A_var;
    struct A_var_ {
        enum {A_simpleVar, A_fieldVar, A_subscriptVar} kind;
        A_pos pos;
        union {
            S_symbol simple;
            struct {A_var var; S_symbol sym;} field;
            struct {A_var var; A_exp exp;} subscript;
        } u;
    };

More Syntax (Constructors…p.98)
    A_var A_SimpleVar(A_pos pos, S_symbol sym);
    …
    A_exp A_WhileExp(A_pos pos, A_exp test, A_exp body);
    …
    A_expList A_ExpList(A_exp head, A_expList tail);
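A constructor of this kind typically just allocates a node, sets the `kind` tag and position, and fills in the matching union member. The sketch below shows that pattern for two variable constructors. It is not the course's absyn.c: `A_pos` and `S_symbol` are replaced by simple stand-in typedefs, `new_var` is a hypothetical helper, and the `A_FieldVar` signature is assumed by analogy with `A_SimpleVar`.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins (assumptions) for types the real absyn.h gets elsewhere. */
typedef int A_pos;              /* source position */
typedef const char *S_symbol;   /* hypothetical: a symbol is just a name here */
typedef struct A_exp_ *A_exp;   /* expression nodes, left opaque in this sketch */

typedef struct A_var_ *A_var;
struct A_var_ {
    enum {A_simpleVar, A_fieldVar, A_subscriptVar} kind;
    A_pos pos;
    union {
        S_symbol simple;
        struct {A_var var; S_symbol sym;} field;
        struct {A_var var; A_exp exp;} subscript;
    } u;
};

/* Hypothetical helper: allocate a node and record its position. */
static A_var new_var(A_pos pos) {
    A_var v = malloc(sizeof *v);
    if (v == NULL)
        abort();                /* sketch: give up on out-of-memory */
    v->pos = pos;
    return v;
}

/* Variable reference by name, e.g. a */
A_var A_SimpleVar(A_pos pos, S_symbol sym) {
    A_var v = new_var(pos);
    v->kind = A_simpleVar;      /* tag says which union member is valid */
    v->u.simple = sym;
    return v;
}

/* Field selection, e.g. var.sym (signature assumed for illustration) */
A_var A_FieldVar(A_pos pos, A_var var, S_symbol sym) {
    assert(var != NULL);        /* a field access needs a base variable */
    A_var v = new_var(pos);
    v->kind = A_fieldVar;
    v->u.field.var = var;
    v->u.field.sym = sym;
    return v;
}
```

Code that later walks the tree switches on `kind` and reads only the union member that the constructor filled in.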

Tiger Program
(a := 5; a+1) translates to:
    A_SeqExp(2,
      A_ExpList(A_AssignExp(4, A_SimpleVar(2, S_Symbol("a")),
                               A_IntExp(7,5)),
      A_ExpList((A_OpExp(11, A_plusOp,
                         A_VarExp(A_SimpleVar(10, S_Symbol("a"))),
                         A_IntExp(12,1))),
      NULL)))
AssignExp: choose the column of ":=" for pos
OpExp: choose the column of "+" for pos

Some Odd Tiger Features
Tiger allows mutually recursive declarations:
    let
        var a := 5
        function f() : int = g(a)
        function g(i: int) = f()
    in f()
    end
Thus: the FunctionDec constructor takes a list of functions

Correlation to Yacc (and your project)
(Demo)
Checklist
1. Detailed look at the Tiger AST (absyn.h)
2. Edit tiger.grm
3. The Tiger Language Manual
PA3 and PA4 make heavy use of it
Follow the structure to generate your yacc file