A SIMPLE COMPILER. Chuen-Liang Chen, Department of Computer Science and Information Engineering, National Taiwan University.

Presentation transcript:

A SIMPLE COMPILER
Chuen-Liang Chen
Department of Computer Science and Information Engineering
National Taiwan University
Taipei, TAIWAN
© Chuen-Liang Chen, NTUCS&IE

Structures of compilers (2/3)

calling tree (1 pass)
[Figure: calling tree of a one-pass compiler. Component labels: main, parser, scanner, semantic routines, optimizer, code generator, symbol table, attribute table. Data labels: source code, token, SS, machine code. Scope label: pass 1.]
SS: syntactic structure (parse tree)

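Language specification

grammar
 1. <program>        → begin <statement list> end
 2. <statement list> → <statement> { <statement> }
 3. <statement>      → ID := <expression> ;
 4. <statement>      → read ( <id list> ) ;
 5. <statement>      → write ( <expr list> ) ;
 6. <id list>        → ID { , ID }
 7. <expr list>      → <expression> { , <expression> }
 8. <expression>     → <primary> { <add op> <primary> }
 9. <primary>        → ( <expression> )
10. <primary>        → ID
11. <primary>        → INTLITERAL
12. <system goal>    → <program> SCANEOF
- Backus-Naur form (BNF)

ID          letter { letter | digit | underline }*
INTLITERAL  digit digit*
comment     -- anything EOL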
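The two token definitions above can be checked mechanically. The following is a minimal sketch, not code from the slides: the function names is_id and is_intliteral are made up for illustration, and each takes a complete, NUL-terminated candidate string.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* ID = letter { letter | digit | underline }*   (hypothetical helper) */
bool is_id(const char *s)
{
    assert(s != NULL);
    if (!isalpha((unsigned char)s[0]))
        return false;                 /* must start with a letter */
    for (const char *p = s + 1; *p != '\0'; p++)
        if (!isalnum((unsigned char)*p) && *p != '_')
            return false;
    return true;
}

/* INTLITERAL = digit digit*   (hypothetical helper) */
bool is_intliteral(const char *s)
{
    assert(s != NULL);
    if (!isdigit((unsigned char)s[0]))
        return false;                 /* at least one digit is required */
    for (const char *p = s + 1; *p != '\0'; p++)
        if (!isdigit((unsigned char)*p))
            return false;
    return true;
}
```

Under these definitions a name such as Temp&1 is not an ID, so a compiler-generated name containing & cannot collide with an identifier written by the programmer.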

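Tokens

- sequence of characters having a collective meaning
- example
 1. <program>        → begin <statement list> end
 2. <statement list> → <statement> { <statement> }
 3. <statement>      → ID := <expression> ;
 4. <statement>      → read ( <id list> ) ;
 5. <statement>      → write ( <expr list> ) ;
 6. <id list>        → ID { , ID }
 7. <expr list>      → <expression> { , <expression> }
 8. <expression>     → <primary> { <add op> <primary> }
 9. <primary>        → ( <expression> )
10. <primary>        → ID
11. <primary>        → INTLITERAL
12. <system goal>    → <program> SCANEOF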

Scanner (1/3)

- called by the parser, usually to group input characters into tokens
- types of tokens -- begin end read write identifier integer ( ) ; , + - :=
    - excluding -- comment, blank, tab, ...   QUIZ: benefit?
    - including -- End-Of-File   QUIZ: if EOF is excluded, then ...?
- key issues
    - do not read too many characters
    - how to distinguish different identifiers (integers)?
    - how to recognize begin end read write from identifiers?
- comments
    - ungetc() -- for lookahead
    - buffer_char() -- save in_char into the token buffer
    - check_reserved() -- check whether the token in the buffer is a reserved word and return BEGIN, END, READ, WRITE, or ID (token code)
        - BEGIN, END, READ, WRITE and ID are usually integer constants
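One way to separate reserved words from identifiers is a table lookup after the whole word has been buffered. The sketch below is an assumption-laden illustration: the name check_reserved_in and its buffer parameter are hypothetical (the slides' check_reserved() reads a global token buffer), the enum values are arbitrary, and reserved words are assumed to be lower-case and case-sensitive.

```c
#include <assert.h>
#include <string.h>

/* Token codes; the slides only say these are usually integer constants. */
typedef enum { BEGIN, END, READ, WRITE, ID } token;

/* Hypothetical variant of check_reserved() that takes the buffer as a
   parameter instead of reading a global token buffer. */
token check_reserved_in(const char *buffer)
{
    static const struct { const char *text; token code; } reserved[] = {
        { "begin", BEGIN }, { "end", END }, { "read", READ }, { "write", WRITE },
    };

    assert(buffer != NULL);
    for (size_t i = 0; i < sizeof reserved / sizeof reserved[0]; i++)
        if (strcmp(buffer, reserved[i].text) == 0)
            return reserved[i].code;
    return ID;   /* not a reserved word, so an ordinary identifier */
}
```

Because the comparison is made on the complete buffered word, a name like beginner is classified as ID rather than as BEGIN followed by something else.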

Scanner (2/3)

    #include <stdio.h>
    /* character classification macros */
    #include <ctype.h>

    extern char token_buffer[];

    token scanner(void)
    {
        int in_char, c;

        clear_buffer();
        if (feof(stdin))
            return SCANEOF;

        while ((in_char = getchar()) != EOF) {
            if (isspace(in_char))
                continue;   /* do nothing */
            else if ( ??? ) {
                ???
            } else
                lexical_error(in_char);
        }

    /* one of the ??? branches: identifiers and reserved words */
            else if (isalpha(in_char)) {
                /*
                 * ID ::= LETTER | ID LETTER
                 *               | ID DIGIT
                 *               | ID UNDERSCORE
                 */
                buffer_char(in_char);
                for (c = getchar(); isalnum(c) || c == '_'; c = getchar())
                    buffer_char(c);
                ungetc(c, stdin);
                return check_reserved();
            }

Scanner (3/3)

            else if (isdigit(in_char)) {
                /*
                 * INTLITERAL ::= DIGIT |
                 *                INTLITERAL DIGIT
                 */
                buffer_char(in_char);
                for (c = getchar(); isdigit(c); c = getchar())
                    buffer_char(c);
                ungetc(c, stdin);
                return INTLITERAL;
            } else if (in_char == '(')
                return LPAREN;
            else if (in_char == ')')
                return RPAREN;
            else if (in_char == ';')
                return SEMICOLON;
            else if (in_char == ',')
                return COMMA;
            else if (in_char == '+')
                return PLUSOP;
            else if (in_char == ':') {
                /* looking for ":=" */
                c = getchar();
                if (c == '=')
                    return ASSIGNOP;
                else {
                    ungetc(c, stdin);
                    lexical_error(in_char);
                }
            } else if (in_char == '-') {
                /* is it --, comment start */
                c = getchar();
                if (c == '-') {
                    /* skip the comment; also stop at EOF so an unterminated
                       last line cannot loop forever */
                    do
                        in_char = getchar();
                    while (in_char != '\n' && in_char != EOF);
                } else {
                    ungetc(c, stdin);
                    return MINUSOP;
                }
            }
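The scanner fragments above can be assembled into one compilable routine. The version below is a sketch under several assumptions, not the author's code: it reads from a FILE * parameter rather than stdin, it returns a made-up LEXERROR token where the slides call lexical_error(), and the helper names collect, reserved_or_id, is_id_char and is_digit_char are invented here.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

typedef enum { BEGIN, END, READ, WRITE, ID, INTLITERAL, LPAREN, RPAREN,
               SEMICOLON, COMMA, ASSIGNOP, PLUSOP, MINUSOP, SCANEOF,
               LEXERROR /* hypothetical: stands in for lexical_error() */ } token;

char token_buffer[64];   /* text of the most recent identifier or literal */

static int is_id_char(int c)    { return isalnum(c) || c == '_'; }
static int is_digit_char(int c) { return isdigit(c); }

/* Buffer `first` plus every following character accepted by `more`,
   then push the one character read too far back onto the stream. */
static void collect(FILE *in, int first, int (*more)(int))
{
    size_t n = 0;
    int c = first;
    do {
        if (n < sizeof token_buffer - 1)
            token_buffer[n++] = (char)c;
        c = getc(in);
    } while (c != EOF && more(c));
    if (c != EOF)
        ungetc(c, in);
    token_buffer[n] = '\0';
}

static token reserved_or_id(void)
{
    if (strcmp(token_buffer, "begin") == 0) return BEGIN;
    if (strcmp(token_buffer, "end") == 0)   return END;
    if (strcmp(token_buffer, "read") == 0)  return READ;
    if (strcmp(token_buffer, "write") == 0) return WRITE;
    return ID;
}

token scan(FILE *in)
{
    int c;
    assert(in != NULL);
    while ((c = getc(in)) != EOF) {
        if (isspace(c))
            continue;
        if (isalpha(c)) { collect(in, c, is_id_char); return reserved_or_id(); }
        if (isdigit(c)) { collect(in, c, is_digit_char); return INTLITERAL; }
        switch (c) {
        case '(': return LPAREN;
        case ')': return RPAREN;
        case ';': return SEMICOLON;
        case ',': return COMMA;
        case '+': return PLUSOP;
        case ':':                          /* looking for ":=" */
            if ((c = getc(in)) == '=') return ASSIGNOP;
            if (c != EOF) ungetc(c, in);
            return LEXERROR;
        case '-':                          /* "--" starts a comment */
            if ((c = getc(in)) != '-') {
                if (c != EOF) ungetc(c, in);
                return MINUSOP;
            }
            while ((c = getc(in)) != EOF && c != '\n')
                ;                          /* skip to end of line */
            break;                         /* then keep scanning */
        default:
            return LEXERROR;
        }
    }
    return SCANEOF;
}
```

Each place that reads one character too many pushes it back with ungetc(), which is the "do not read too many" issue in practice; only one character of pushback is ever pending, which is all the C standard guarantees.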

Parser (1/5)

- main program of a compiler (analysis part, at least)
- checks program structure against a context-free grammar
- recursive descent parsing
    - left-hand side
        - one nonterminal → one routine
    - right-hand side
        - one nonterminal → one routine call
        - one terminal → one "match"
    - does not work for all context-free grammars
- comments
    - match() -- call scanner; if match: OK, skip this token; else error handling
    - next_token() -- just see the next token, do not skip it (lookahead)
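The division of labour between next_token() and match() can be sketched with a one-token lookahead buffer. Everything below other than the two routine names is an assumption: the stand-in scanner() that replays a fixed token script, the error counter used in place of real error handling, and the choice to leave the offending token in place on a mismatch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { BEGIN, END, ID, SEMICOLON, SCANEOF } token;

/* Stand-in scanner (hypothetical): replays a fixed token sequence. */
static const token script[] = { BEGIN, ID, SEMICOLON, END, SCANEOF };
static size_t script_pos;

static token scanner(void)
{
    assert(script_pos < sizeof script / sizeof script[0]);
    token t = script[script_pos];
    if (t != SCANEOF)
        script_pos++;             /* SCANEOF is returned forever */
    return t;
}

static token lookahead;           /* the one buffered token */
static bool have_lookahead;
static int error_count;           /* stand-in for real error handling */

/* Look at the next token without consuming it. */
token next_token(void)
{
    if (!have_lookahead) {
        lookahead = scanner();
        have_lookahead = true;
    }
    return lookahead;
}

/* Consume the next token if it is the expected one; otherwise record an
   error and leave the token where it is. */
void match(token expected)
{
    if (next_token() == expected)
        have_lookahead = false;   /* consumed: the next call scans afresh */
    else
        error_count++;
}
```

Calling next_token() any number of times returns the same token; only a successful match() moves the input forward.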

Parser (2/5)

    void system_goal(void)
    {
        /* <system goal> ::= <program> SCANEOF */
        program();
        match(SCANEOF);
    }

    void program(void)
    {
        /* <program> ::= BEGIN <statement list> END */
        match(BEGIN);
        statement_list();
        match(END);
    }

    void statement_list(void)
    {
        /* <statement list> ::= <statement> { <statement> } */
        statement();
        while (TRUE) {
            switch (next_token()) {
            case ID:
            case READ:
            case WRITE:
                statement();
                break;
            default:
                return;
            }
        }
    }

QUIZ: Why ID, READ, WRITE?

Parser (3/5)

    void statement(void)
    {
        token tok = next_token();

        switch (tok) {
        case ID:
            /* <statement> ::= ID := <expression> ; */
            match(ID);
            match(ASSIGNOP);
            expression();
            match(SEMICOLON);
            break;
        case READ:
            /* <statement> ::= READ ( <id list> ) ; */
            match(READ);
            match(LPAREN);
            id_list();
            match(RPAREN);
            match(SEMICOLON);
            break;
        case WRITE:
            /* <statement> ::= WRITE ( <expr list> ) ; */
            match(WRITE);
            match(LPAREN);
            expr_list();
            match(RPAREN);
            match(SEMICOLON);
            break;
        default:
            syntax_error(tok);
            break;
        }
    }

Parser (4/5)

    void id_list(void)
    {
        /* <id list> ::= ID { , ID } */
        match(ID);
        while (next_token() == COMMA) {
            match(COMMA);
            match(ID);
        }
    }

    void expression(void)
    {
        /* <expression> ::= <primary> { <add op> <primary> } */
        token t;

        primary();
        for (t = next_token(); t == PLUSOP || t == MINUSOP; t = next_token()) {
            add_op();
            primary();
        }
    }

    void expr_list(void)
    {
        /* <expr list> ::= <expression> { , <expression> } */
        expression();
        while (next_token() == COMMA) {
            match(COMMA);
            expression();
        }
    }

    void add_op(void)
    {
        /* <add op> ::= PLUSOP | MINUSOP */
        token tok = next_token();

        if (tok == PLUSOP || tok == MINUSOP)
            match(tok);
        else
            syntax_error(tok);
    }

Parser (5/5)

    void primary(void)
    {
        token tok = next_token();

        switch (tok) {
        case LPAREN:
            /* <primary> ::= ( <expression> ) */
            match(LPAREN);
            expression();
            match(RPAREN);
            break;
        case ID:
            /* <primary> ::= ID */
            match(ID);
            break;
        case INTLITERAL:
            /* <primary> ::= INTLITERAL */
            match(INTLITERAL);
            break;
        default:
            syntax_error(tok);
            break;
        }
    }
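Put together, the parsing routines form a complete recognizer for the grammar. The sketch below condenses them into one self-contained file so that it can be compiled and tried; the assumptions are that tokens come from a pre-scanned array ending in SCANEOF rather than from a scanner, that syntax_error() merely counts errors, and that the entry point is a hypothetical parse() playing the role of system_goal().

```c
#include <assert.h>
#include <stddef.h>

typedef enum { BEGIN, END, READ, WRITE, ID, INTLITERAL, LPAREN, RPAREN,
               SEMICOLON, COMMA, ASSIGNOP, PLUSOP, MINUSOP, SCANEOF } token;

static const token *input;   /* hypothetical pre-scanned token array */
static size_t pos;
static int errors;

static token next_token(void) { return input[pos]; }
static void syntax_error(void) { errors++; }

static void match(token t)
{
    if (input[pos] != t) { syntax_error(); return; }
    if (t != SCANEOF) pos++;              /* never step past the end marker */
}

static void expression(void);

static void primary(void)
{
    switch (next_token()) {
    case LPAREN:     match(LPAREN); expression(); match(RPAREN); break;
    case ID:         match(ID); break;
    case INTLITERAL: match(INTLITERAL); break;
    default:         syntax_error(); break;
    }
}

static void expression(void)
{
    primary();
    while (next_token() == PLUSOP || next_token() == MINUSOP) {
        match(next_token());              /* <add op> */
        primary();
    }
}

static void id_list(void)
{
    match(ID);
    while (next_token() == COMMA) { match(COMMA); match(ID); }
}

static void expr_list(void)
{
    expression();
    while (next_token() == COMMA) { match(COMMA); expression(); }
}

static void statement(void)
{
    switch (next_token()) {
    case ID:
        match(ID); match(ASSIGNOP); expression(); match(SEMICOLON); break;
    case READ:
        match(READ); match(LPAREN); id_list(); match(RPAREN); match(SEMICOLON); break;
    case WRITE:
        match(WRITE); match(LPAREN); expr_list(); match(RPAREN); match(SEMICOLON); break;
    default:
        syntax_error(); break;
    }
}

static void statement_list(void)
{
    statement();
    while (next_token() == ID || next_token() == READ || next_token() == WRITE)
        statement();
}

/* Returns the number of syntax errors; 0 means the program is accepted. */
int parse(const token *tokens)
{
    assert(tokens != NULL);
    input = tokens; pos = 0; errors = 0;
    match(BEGIN); statement_list(); match(END);   /* <program> */
    match(SCANEOF);                               /* <system goal> */
    return errors;
}
```

Every loop consumes at least one token per iteration, so the recognizer terminates even on erroneous input; it does no error recovery beyond counting.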

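Action symbols

- to determine when to call semantic routines
 1. <program>        → #start begin <statement list> end
 2. <statement list> → <statement> { <statement> }
 3. <statement>      → <ident> := <expression> #assign ;
 4. <statement>      → read ( <id list> ) ;
 5. <statement>      → write ( <expr list> ) ;
 6. <id list>        → <ident> #read_id { , <ident> #read_id }
 7. <expr list>      → <expression> #write_expr { , <expression> #write_expr }
 8. <expression>     → <primary> { <add op> <primary> #gen_infix }
 9. <primary>        → ( <expression> )
10. <primary>        → <ident>
11. <primary>        → INTLITERAL #process_literal
12. <add op>         → + #process_op
13. <add op>         → - #process_op
14. <ident>          → ID #process_id
15. <system goal>    → <program> SCANEOF #finish
- possibly, with some modifications to the grammar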

Semantic record

- to keep semantic information associated with a grammar symbol

    #define MAXIDLEN 33
    typedef char string[MAXIDLEN];

    /* for operators */
    typedef struct operator {
        enum op { PLUS, MINUS } operator;
    } op_rec;

    /* for <primary> and <expression> */
    enum expr { IDEXPR, LITERALEXPR, TEMPEXPR };

    typedef struct expression {
        enum expr kind;
        union {
            string name;   /* for IDEXPR, TEMPEXPR */
            int val;       /* for LITERALEXPR */
        };
    } expr_rec;
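The kind field tells a reader of the record which union member is valid. As a sketch of how such a record might be turned into operand text, the hypothetical extract_expr() below writes into a caller-supplied buffer; the name, the signature, and the buffer convention are assumptions made for this illustration. The anonymous union requires C11 or later.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXIDLEN 33
typedef char string[MAXIDLEN];

enum expr { IDEXPR, LITERALEXPR, TEMPEXPR };

typedef struct expression {
    enum expr kind;
    union {
        string name;   /* for IDEXPR, TEMPEXPR */
        int val;       /* for LITERALEXPR */
    };
} expr_rec;

/* Hypothetical helper: render the operand text of a record into `out`.
   Literals are printed in decimal; identifiers and temporaries by name. */
const char *extract_expr(const expr_rec *e, char *out, size_t cap)
{
    assert(e != NULL && out != NULL && cap > 0);
    if (e->kind == LITERALEXPR)
        snprintf(out, cap, "%d", e->val);    /* only val is valid here */
    else
        snprintf(out, cap, "%s", e->name);   /* only name is valid here */
    return out;
}
```

Reading the union member that does not correspond to kind would yield garbage, which is why every consumer of expr_rec is expected to switch on kind first.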

Temporary

- using Temp&1, Temp&2, ...

    char *get_temp(void)
    {
        /* max temporary allocated so far */
        static int max_temp = 0;
        static char tempname[MAXIDLEN];

        max_temp++;
        sprintf(tempname, "Temp&%d", max_temp);
        check_id(tempname);
        return tempname;
    }

Parser + semantic routines

    /* parser only */
    void expression(void)
    {
        token t;

        /* <expression> ::= <primary> { <add op> <primary> } */
        primary();
        for (t = next_token(); t == PLUSOP || t == MINUSOP; t = next_token()) {
            add_op();
            primary();
        }
    }

    /* parser + semantic routines */
    void expression(expr_rec *result)
    {
        expr_rec left_operand, right_operand;
        op_rec op;

        /* <expression> ::= <primary> { <add op> <primary> #gen_infix } */
        primary(&left_operand);
        while (next_token() == PLUSOP || next_token() == MINUSOP) {
            add_op(&op);
            primary(&right_operand);
            left_operand = gen_infix(left_operand, op, right_operand);
        }
        *result = left_operand;
    }

QUIZ: where is the syntactic structure?

Semantic routines (1/3)

- to produce the target language (quadruple intermediate file)
- comments
    - generate() -- produce output
    - extract() -- get semantic information

    void start(void)
    {
        /* Semantic initializations, none needed. */
    }

    void finish(void)
    {
        /* Generate code to finish program. */
        generate("Halt", "", "", "");
    }

Semantic routines (2/3)

    expr_rec process_id(void)
    {
        /* Declare ID and build a corresponding semantic record. */
        expr_rec t;

        check_id(token_buffer);
        t.kind = IDEXPR;
        strcpy(t.name, token_buffer);
        return t;
    }

    void read_id(expr_rec in_var)
    {
        /* Generate code for read. */
        generate("Read", in_var.name, "Integer", "");
    }

    expr_rec process_literal(void)
    {
        /* Convert literal to a numeric representation
           and build semantic record. */
        expr_rec t;

        t.kind = LITERALEXPR;
        (void) sscanf(token_buffer, "%d", &t.val);
        return t;
    }

    op_rec process_op(void)
    {
        /* Produce operator descriptor. */
        op_rec o;

        if (current_token == PLUSOP)
            o.operator = PLUS;
        else
            o.operator = MINUS;
        return o;
    }

Semantic routines (3/3)

    expr_rec gen_infix(expr_rec e1, op_rec op, expr_rec e2)
    {
        /*
         * Generate code for infix operation.
         * Get result temp and set up semantic
         * record for result.
         */
        expr_rec e_rec;

        /* An expr_rec with temp variant set. */
        e_rec.kind = TEMPEXPR;
        strcpy(e_rec.name, get_temp());
        generate(extract(op), extract(e1), extract(e2), e_rec.name);
        return e_rec;
    }

    void write_expr(expr_rec out_expr)
    {
        /* Generate code for write. */
        generate("Write", extract(out_expr), "Integer", "");
    }

    void assign(expr_rec target, expr_rec source)
    {
        /* Generate code for assignment. */
        generate("Store", extract(source), target.name, "");
    }
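The slides do not show generate() itself. Judging only from the generated code in the tracing example (for instance Sub BB,314,Temp&1 and Halt), a quadruple appears to be printed as the operator followed by its non-empty operands separated by commas. The sketch below formats one quadruple that way into a caller-supplied buffer; the name format_quad, its signature, and the exact output layout are assumptions inferred from that trace.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical formatter: writes "Op a,b,c" into `out`, leaving out empty
   operands. Returns the number of characters the full text needs. */
int format_quad(char *out, size_t cap, const char *op,
                const char *a, const char *b, const char *c)
{
    assert(out != NULL && cap > 0 && op != NULL);
    assert(a != NULL && b != NULL && c != NULL);

    const char *args[] = { a, b, c };
    int printed = 0;
    int n = snprintf(out, cap, "%s", op);

    for (int i = 0; i < 3 && n >= 0 && (size_t)n < cap; i++) {
        if (args[i][0] == '\0')
            continue;                          /* "" means operand unused */
        n += snprintf(out + n, cap - (size_t)n, "%s%s",
                      printed++ ? "," : " ", args[i]);
    }
    return n;
}
```

A generate() built on this would simply write the formatted line to the intermediate file; passing "" for unused operands matches the calls shown in the semantic routines.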

Symbol table

- just for space allocation

    /* Is s in the symbol table? */
    extern int lookup(string s);

    /* Put s unconditionally into symbol table. */
    extern void enter(string s);

    void check_id(string s)
    {
        if (! lookup(s)) {
            enter(s);
            generate("Declare", s, "Integer", "");
        }
    }
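lookup() and enter() are only declared above. One simple way to back them, sketched here under the assumption of a small fixed-capacity table searched linearly, is shown below. MAXSYMBOLS is a made-up limit, and this check_id() returns 1 when the name is new instead of calling generate(), so that the example stays self-contained.

```c
#include <assert.h>
#include <string.h>

#define MAXIDLEN 33
#define MAXSYMBOLS 128            /* hypothetical capacity */
typedef char string[MAXIDLEN];

static string table[MAXSYMBOLS];
static int count;

/* Is s in the symbol table? */
int lookup(string s)
{
    for (int i = 0; i < count; i++)
        if (strcmp(table[i], s) == 0)
            return 1;
    return 0;
}

/* Put s unconditionally into symbol table. */
void enter(string s)
{
    assert(count < MAXSYMBOLS && strlen(s) < MAXIDLEN);
    strcpy(table[count++], s);
}

/* Returns 1 if s was new, i.e. the point at which the slides' version
   would emit a Declare quadruple; returns 0 if s was already known. */
int check_id(string s)
{
    if (lookup(s))
        return 0;
    enter(s);
    return 1;
}
```

A linear search is adequate for a teaching compiler; a hash table would be the usual replacement once programs grow.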

Tracing example (1/2)

Input: begin A:=BB-314+A; end SCANEOF

Step  Parser Action                       Remaining Input                 Generated Code
(1)   Call system_goal()                  begin A:=BB-314+A; end SCANEOF
(2)   Call program()                      begin A:=BB-314+A; end SCANEOF
(3)   Semantic Action: start()            begin A:=BB-314+A; end SCANEOF
(4)   match(BEGIN)                        A:=BB-314+A; end SCANEOF
(5)   Call statement_list()               A:=BB-314+A; end SCANEOF
(6)   Call statement()                    A:=BB-314+A; end SCANEOF
(7)   Call ident()                        A:=BB-314+A; end SCANEOF
(8)   match(ID)                           :=BB-314+A; end SCANEOF
(9)   Semantic Action: process_id()       :=BB-314+A; end SCANEOF         Declare A,Integer
(10)  match(ASSIGNOP)                     BB-314+A; end SCANEOF
(11)  Call expression()                   BB-314+A; end SCANEOF
(12)  Call primary()                      BB-314+A; end SCANEOF
(13)  Call ident()                        BB-314+A; end SCANEOF
(14)  match(ID)                           -314+A; end SCANEOF
(15)  Semantic Action: process_id()       -314+A; end SCANEOF             Declare BB,Integer
(16)  Call add_op()                       -314+A; end SCANEOF
(17)  match(MINUSOP)                      314+A; end SCANEOF
(18)  Semantic Action: process_op()       314+A; end SCANEOF

Tracing example (2/2)

Step  Parser Action                       Remaining Input                 Generated Code
(19)  Call primary()                      314+A; end SCANEOF
(20)  match(INTLITERAL)                   +A; end SCANEOF
(21)  Semantic Action: process_literal()  +A; end SCANEOF
(22)  Semantic Action: gen_infix()        +A; end SCANEOF                 Declare Temp&1,Integer
                                                                          Sub BB,314,Temp&1
(23)  Call add_op()                       +A; end SCANEOF
(24)  match(PLUSOP)                       A; end SCANEOF
(25)  Semantic Action: process_op()       A; end SCANEOF
(26)  Call primary()                      A; end SCANEOF
(27)  Call ident()                        A; end SCANEOF
(28)  match(ID)                           ; end SCANEOF
(29)  Semantic Action: process_id()       ; end SCANEOF                   (declaration is unnecessary)
(30)  Semantic Action: gen_infix()        ; end SCANEOF                   Declare Temp&2,Integer
                                                                          Add Temp&1,A,Temp&2
(31)  Semantic Action: assign()           ; end SCANEOF                   Store Temp&2,A
(32)  match(SEMICOLON)                    end SCANEOF
(33)  match(END)                          SCANEOF
(34)  match(SCANEOF)
(35)  Semantic Action: finish()                                           Halt
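The code generated in the trace for the assignment statement can be reproduced with a much smaller program than the full compiler. The sketch below is a deliberately reduced illustration, not the compiler from the slides: it handles a single statement of the form ID := expression ; read from a string, supports only ID and INTLITERAL primaries (no parentheses, no -- comments), uses assert() in place of syntax error handling, and all of its function names except primary and expression are invented here. It does not emit Halt, since that belongs to the end of the whole program.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define NAMELEN 33

static const char *src;              /* remaining input */
static char out[512];                /* generated code, one quadruple per line */
static char symbols[32][NAMELEN];    /* names declared so far */
static int nsym, ntemp;

static void emit(const char *op, const char *a, const char *b, const char *c)
{
    size_t n = strlen(out);
    if (c != NULL)
        snprintf(out + n, sizeof out - n, "%s %s,%s,%s\n", op, a, b, c);
    else
        snprintf(out + n, sizeof out - n, "%s %s,%s\n", op, a, b);
}

/* Same idea as check_id(): declare a name the first time it is seen. */
static void declare(const char *name)
{
    for (int i = 0; i < nsym; i++)
        if (strcmp(symbols[i], name) == 0)
            return;
    assert(nsym < 32);
    strcpy(symbols[nsym++], name);
    emit("Declare", name, "Integer", NULL);
}

static void skip_blanks(void)
{
    while (isspace((unsigned char)*src))
        src++;
}

/* <primary>, limited here to ID or INTLITERAL */
static void primary(char *text)
{
    size_t n = 0;
    skip_blanks();
    int is_id = isalpha((unsigned char)*src) != 0;
    assert(is_id || isdigit((unsigned char)*src));
    while (n < NAMELEN - 1 &&
           (is_id ? (isalnum((unsigned char)*src) || *src == '_')
                  : isdigit((unsigned char)*src)))
        text[n++] = *src++;
    text[n] = '\0';
    if (is_id)
        declare(text);
}

/* <expression> ::= <primary> { <add op> <primary> #gen_infix } */
static void expression(char *result)
{
    char right[NAMELEN], temp[NAMELEN];
    primary(result);
    for (skip_blanks(); *src == '+' || *src == '-'; skip_blanks()) {
        const char *op = (*src++ == '+') ? "Add" : "Sub";
        primary(right);
        snprintf(temp, sizeof temp, "Temp&%d", ++ntemp);
        declare(temp);
        emit(op, result, right, temp);
        strcpy(result, temp);        /* the temporary is the new left operand */
    }
}

/* Translate one statement of the form: ID := expression ; */
const char *translate_assignment(const char *text)
{
    char target[NAMELEN], value[NAMELEN];
    assert(text != NULL);
    src = text; out[0] = '\0'; nsym = ntemp = 0;
    primary(target);
    assert(isalpha((unsigned char)target[0]));   /* target must be an ID */
    skip_blanks();
    assert(src[0] == ':' && src[1] == '=');
    src += 2;
    expression(value);
    skip_blanks();
    assert(*src == ';');
    emit("Store", value, target, NULL);
    return out;
}
```

For the input A:=BB-314+A; the returned text contains the same seven lines as the Generated Code column of the trace, from Declare A,Integer through Store Temp&2,A, in the same order, because each temporary is declared just before the quadruple that defines it.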