

Parsing

Language
A language is a set of strings over an alphabet. The set may be empty, finite or infinite. A language is defined by a grammar; we also say that a grammar generates a language.

Basic Definitions

Symbol: an entity that has no meaning by itself (for example, a character).

Alphabet: a finite set of symbols. Examples:
    B = {0, 1}
    C = {a, b, c}
    E = {a,b,…,z,A,B,…,Z,1,2,…,9,…}

String (Sentence): a finite sequence of symbols from an alphabet. Examples:
    0, 1, 111, abc, cbba, bbc, aaabbbccc
    English, Mary, school, John plays football.

Grammar
A grammar G is defined as a four-tuple (V, T, P, S), where:
    V is a set of symbols called variables (S, A, B, …)
    T is a set of symbols called terminals (0, 1, a, b, …)
    P is a set of grammar rules
    S is the start variable (an element of V)

An example grammar:
    V = {S}
    T = {a, b}
    P: S -> aSb
       S -> ab

This grammar generates the strings that consist of one or more a's followed by the same number of b's. To generate the string aaaaabbbbb:
    S => aSb => aaSbb => aaaSbbb => aaaaSbbbb => aaaaabbbbb
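The two rules of the example grammar can be checked mechanically. The following is a minimal sketch, assuming a Prolog system that provides append/3 (list concatenation) and assuming strings are represented as lists of the atoms a and b. The predicate name s/1 is chosen here only to mirror the start variable S; it is not part of the original slides.

```prolog
:- use_module(library(lists)).   % append/3 (autoloaded in SWI-Prolog)

% s(+List): List can be derived from the start variable S.

% Rule S -> ab
s([a, b]).

% Rule S -> aSb: the list starts with a, ends with b,
% and the part in between is itself derivable from S.
s([a|Rest]) :-
    append(Middle, [b], Rest),
    s(Middle).
```

With these clauses, the query ?- s([a,a,a,a,a,b,b,b,b,b]). should succeed, while ?- s([a,b,b]). should fail because the numbers of a's and b's differ.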

Phases of compilation
    Source: x+(y*x)
    Scanning (Lexical Analysis) -> Tokens ([x, +, '(', y, *, x, ')'])
    Parsing (Syntax Analysis) -> Parse tree
    Intermediate Code Gen. (Semantic Analysis) -> Intermediate code
    Optimization -> Optimized code
    Code Generation -> Object code
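The scanning step for the example input can be sketched in a few lines. This is only an illustration under strong assumptions: every token is a single character, spaces are simply dropped, and the input is given as a Prolog atom. The names scan/2 and drop_spaces/2 are hypothetical helpers, not part of the original slides.

```prolog
% scan(+Atom, -Tokens): split Atom into single-character tokens,
% skipping spaces.
scan(Atom, Tokens) :-
    atom_chars(Atom, Chars),
    drop_spaces(Chars, Tokens).

% drop_spaces(+Chars, -Tokens): copy Chars, leaving out space characters.
drop_spaces([], []).
drop_spaces([' '|Cs], Ts) :-
    !,
    drop_spaces(Cs, Ts).
drop_spaces([C|Cs], [C|Ts]) :-
    drop_spaces(Cs, Ts).
```

For example, ?- scan('x+(y*x)', Tokens). should bind Tokens to the token list shown for the scanning phase. A real scanner would also have to group multi-character tokens such as identifiers and numbers.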

A parser is a program which embodies the rules of a grammar. Parsing is the task of showing whether or not a string of symbols conforms to the syntactic rules (grammar) of a language. If the string conforms to the rules, we say that the grammar generates the string.

    Input: while A >= B Do
    Scanner -> [while, A, >=, B, Do]
    Parser -> Yes, this conforms to the syntactic rules

Parsing Expressions

Arithmetic expressions:
    x
    x+y
    x+(y*x)

A simple grammar for expressions:
    expr   ::= term | term addop expr
    term   ::= factor | factor multop term
    factor ::= 'x' | 'y' | lbr expr rbr
    addop  ::= '+' | '-'
    multop ::= '*' | '/'
    lbr    ::= '('
    rbr    ::= ')'

The same rules in abbreviated form:
    E -> T
    E -> T A E
    T -> F
    T -> F M T
    …

Representing this grammar in Prolog
We should write a relation expr, such that expr(E) succeeds if E can be parsed as an expression. Suppose we represent an expression as a list of symbols as follows:
    [x, +, '(', y, *, x, ')']
How can we parse this as an expression? The grammar tells us that there are two ways that a list of symbols can represent an expression:
    expr ::= term | term addop expr
The first clause is simple:
    expr(E) :- term(E).    % E is an expression if it is a term

The second clause says that a list of symbols is an expression if it starts with a term, followed by an addop, and ends with an expression. So we must write code to find the first part of the list which can be parsed as a term. We will use append(L1, L2, L3) to split the list up and find all the possible first parts of the list:

    ?- append([a,b],[c,d],L).
    L = [a, b, c, d]

    ?- append(L1,L2,[a,b,c,d]).
    L1 = []            L2 = [a, b, c, d] ;
    L1 = [a]           L2 = [b, c, d] ;
    L1 = [a, b]        L2 = [c, d] ;
    L1 = [a, b, c]     L2 = [d] ;
    L1 = [a, b, c, d]  L2 = [] ;
    No

Splitting the list [x, +, '(', y, *, x, ')']:

    ?- append(Start, Finish, [x, +, '(', y, *, x, ')']).
    Start = [],                         Finish = [x, +, '(', y, *, x, ')'] ;
    Start = [x],                        Finish = [+, '(', y, *, x, ')'] ;
    Start = [x, +],                     Finish = ['(', y, *, x, ')'] ;
    Start = [x, +, '('],                Finish = [y, *, x, ')'] ;
    Start = [x, +, '(', y],             Finish = [*, x, ')'] ;
    Start = [x, +, '(', y, *],          Finish = [x, ')'] ;
    Start = [x, +, '(', y, *, x],       Finish = [')'] ;
    Start = [x, +, '(', y, *, x, ')'],  Finish = [] ;
    No

The second clause for expr
    expr ::= term | term addop expr

    expr(L) :-
        append(First, Rest, L),    % split the list up
        term(First),               % find a term at the start
        append(L1, L2, Rest),      % split the Rest up
        addop(L1),                 % find an addop
        expr(L2).                  % find an expr

Term
    term ::= factor | factor multop term

Clause 1:
    term(L) :- factor(L).

Clause 2:
    term(L) :-
        append(First, Rest, L),    % split the list up
        factor(First),             % find a factor at the start
        append(L1, L2, Rest),      % split the Rest up
        multop(L1),                % find a multop
        term(L2).                  % find a term

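Factors
    factor ::= 'x' | 'y' | lbr expr rbr

A list of symbols is a factor if it is 'x' or 'y', or if it is an expression enclosed in brackets:
    factor([x]).
    factor([y]).
    factor(L) :-
        append(First, Rest, L),    % split the list up
        lbr(First),                % find a left bracket at the start
        append(L1, L2, Rest),      % split the Rest up
        expr(L1),                  % find an expr
        rbr(L2).                   % find a right bracket at the end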

Parsing operators and brackets
    addop  ::= '+' | '-'
    multop ::= '*' | '/'
    lbr    ::= '('
    rbr    ::= ')'

Defining the operators and the brackets:
    addop([+]).
    addop([-]).
    multop([*]).
    multop([/]).
    lbr(['(']).
    rbr([')']).
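Collected into one file, the clauses from the preceding slides form a complete parser. The listing below is a sketch of that assembly, assuming SWI-Prolog or another system that provides append/3. The operator symbols are quoted here for portability, and the division operator is written as multop(['/']) so that the facts match the grammar rule multop ::= '*' | '/'.

```prolog
:- use_module(library(lists)).   % append/3 (autoloaded in SWI-Prolog)

% expr ::= term | term addop expr
expr(L) :- term(L).
expr(L) :-
    append(First, Rest, L),      % split the list up
    term(First),                 % a term at the start
    append(L1, L2, Rest),        % split the rest up
    addop(L1),                   % an addop in the middle
    expr(L2).                    % an expr at the end

% term ::= factor | factor multop term
term(L) :- factor(L).
term(L) :-
    append(First, Rest, L),
    factor(First),
    append(L1, L2, Rest),
    multop(L1),
    term(L2).

% factor ::= 'x' | 'y' | lbr expr rbr
factor([x]).
factor([y]).
factor(L) :-
    append(First, Rest, L),
    lbr(First),
    append(L1, L2, Rest),
    expr(L1),
    rbr(L2).

% Operators and brackets, each as a one-element list.
addop(['+']).
addop(['-']).
multop(['*']).
multop(['/']).
lbr(['(']).
rbr([')']).
```

Every recursive call works on a strictly shorter list, because a term or factor can never be empty, so the search terminates for any finite input list. The repeated splitting with append/3 does mean that the same sublists may be tried many times.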

Now we can use expr to parse strings of symbols as expressions:

    x+(y*x)
    ?- expr([x, +, '(', y, *, x, ')']).
    Yes

    y-(x+y)
    ?- expr([y, -, '(', x, +, y, ')']).
    Yes

    y-(x+y
    ?- expr([y, -, '(', x, +, y]).
    No

    y/(x+y)
    ?- expr([y,/, '(', x, +, y,')']).
    Yes

    ?- expr([y,/, (, x, +, y,)]).
    ERROR: Syntax error: Operator expected
    ERROR: expr([y,/, (,
    ERROR: ** here **
    ERROR: x, +, y,)]).

The last query is rejected by Prolog itself because the brackets are not quoted.

We have seen that we can construct a parser in Prolog from a grammar. It is even possible to write a Prolog program which takes a grammar and produces a parser. Such a program is called a parser generator. With most Prolog implementations it is possible to write a grammar directly in Prolog and use a special predicate to parse strings.

Here is the grammar that we presented earlier, rewritten using Prolog grammar rules:

    expr --> term.
    expr --> term, addop, expr.
    term --> factor.
    term --> factor, multop, term.
    factor --> [x].
    factor --> [y].
    factor --> lbr, expr, rbr.
    addop --> ['+'].
    addop --> ['-'].
    multop --> ['*'].
    multop --> ['/'].
    lbr --> ['('].
    rbr --> [')'].
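Prolog systems typically turn each grammar rule into an ordinary clause with two extra arguments: the list of symbols before the phrase and the list that remains after it. The sketch below writes out such a translation by hand for the expression grammar. The predicate names ending in _dl are hypothetical and chosen only to keep them apart from the grammar rules; the clauses a particular Prolog system generates may differ in detail.

```prolog
% Each predicate relates the input list S0 to the remainder S
% left over after one phrase has been consumed from the front.

expr_dl(S0, S) :- term_dl(S0, S).
expr_dl(S0, S) :- term_dl(S0, S1), addop_dl(S1, S2), expr_dl(S2, S).

term_dl(S0, S) :- factor_dl(S0, S).
term_dl(S0, S) :- factor_dl(S0, S1), multop_dl(S1, S2), term_dl(S2, S).

factor_dl([x|S], S).
factor_dl([y|S], S).
factor_dl(S0, S) :- lbr_dl(S0, S1), expr_dl(S1, S2), rbr_dl(S2, S).

% A terminal symbol consumes exactly one element of the list.
addop_dl(['+'|S], S).
addop_dl(['-'|S], S).
multop_dl(['*'|S], S).
multop_dl(['/'|S], S).
lbr_dl(['('|S], S).
rbr_dl([')'|S], S).
```

A whole list is an expression when nothing remains after parsing, so the query is ?- expr_dl([x, '+', y], []). Unlike the version built on append/3, this form never guesses where to split the list: each predicate consumes symbols from the front and hands the remainder on.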

The Prolog grammar rules are stored as ordinary clauses, like any others we might have in our database. To use them for parsing we need to give Prolog a query of the form phrase(P, List):

    ?- phrase(expr, [y, '*', '(', x, '+', x, ')']).
    Yes
    ?- phrase(factor, [y]).
    Yes
    ?- phrase(rbr, [')']).
    Yes
    ?- phrase(factor, [y, '*', x]).
    No
    ?- phrase(expr, [y, '*', x]).
    Yes
    ?- phrase(factor, ['(',y, '*', x,')']).
    Yes

If the list is left unbound, phrase generates expressions instead of checking them. Because brackets can be nested to any depth, this query has infinitely many answers:

    ?- phrase(expr, L).
    L = [x] ;
    L = [y] ;
    L = ['(', x, ')'] ;
    L = ['(', y, ')'] ;
    …

A grammar for a very small fragment of English

    sentence --> noun_phrase, verb_phrase.
    noun_phrase --> determiner, noun.
    noun_phrase --> proper_noun.
    determiner --> [the].
    determiner --> [a].
    proper_noun --> [pedro].
    noun --> [man].
    noun --> [apple].
    verb_phrase --> verb, noun_phrase.
    verb_phrase --> verb.
    verb --> [eats].
    verb --> [sings].

    ?- phrase(sentence, [the, man, eats]).
    Yes
    ?- phrase(sentence, [the, man, eats, the, apple]).
    Yes
    ?- phrase(sentence, [the, apple, eats, a, man]).
    Yes
    ?- phrase(sentence, [pedro, sings, the, pedro]).
    No
    ?- phrase(sentence, [eats, apple, man]).
    No
    ?- phrase(sentence, L).

    L = [the, man, eats, the, man] ;
    L = [the, man, eats, the, apple] ;
    L = [the, man, eats, a, man] ;
    L = [the, man, eats, a, apple] ;
    L = [the, man, eats, pedro] ;
    L = [the, man, sings, the, man] ;
    L = [the, man, sings, the, apple] ;
    L = [the, man, sings, a, man] ;
    L = [the, man, sings, a, apple] ;
    L = [the, man, sings, pedro] ;
    L = [the, man, eats] ;
    L = [the, man, sings] ;
    L = [the, apple, eats, the, man] ;
    L = [the, apple, eats, the, apple] ;
    L = [the, apple, eats, a, man] ;
    L = [the, apple, eats, a, apple] ;
    L = [the, apple, eats, pedro] ;
    L = [the, apple, sings, the, man] ;
    L = [the, apple, sings, the, apple] ;
    L = [the, apple, sings, a, man] ;
    L = [the, apple, sings, a, apple] ;
    L = [the, apple, sings, pedro] ;
    L = [the, apple, eats] ;
    L = [the, apple, sings] ;
    L = [a, man, eats, the, man] ;
    L = [a, man, eats, the, apple] ;
    L = [a, man, eats, a, man] ;
    L = [a, man, eats, a, apple] ;
    L = [a, man, eats, pedro] ;
    L = [a, man, sings, the, man] ;
    L = [a, man, sings, the, apple] ;
    L = [a, man, sings, a, man] ;
    L = [a, man, sings, a, apple] ;
    L = [a, man, sings, pedro] ;
    L = [a, man, eats] ;
    L = [a, man, sings] ;
    L = [a, apple, eats, the, man] ;
    L = [a, apple, eats, the, apple] ;
    L = [a, apple, eats, a, man] ;
    L = [a, apple, eats, a, apple] ;
    L = [a, apple, eats, pedro] ;
    L = [a, apple, sings, the, man] ;
    L = [a, apple, sings, the, apple] ;
    L = [a, apple, sings, a, man] ;
    L = [a, apple, sings, a, apple] ;
    L = [a, apple, sings, pedro] ;
    L = [a, apple, eats] ;
    L = [a, apple, sings] ;
    L = [pedro, eats, the, man] ;
    L = [pedro, eats, the, apple] ;
    L = [pedro, eats, a, man] ;
    L = [pedro, eats, a, apple] ;
    L = [pedro, eats, pedro] ;
    L = [pedro, sings, the, man] ;
    L = [pedro, sings, the, apple] ;
    L = [pedro, sings, a, man] ;
    L = [pedro, sings, a, apple] ;
    L = [pedro, sings, pedro] ;
    L = [pedro, eats] ;
    L = [pedro, sings] ;
    No
    ?-
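Since this grammar has no recursive rules, the set of sentences it generates is finite and can be collected in one step. The sketch below repeats the grammar so that it loads on its own and counts the answers with findall/3 and length/2. The predicate name count_sentences/1 is a hypothetical helper, not part of the original slides.

```prolog
% The grammar for a small fragment of English.
sentence --> noun_phrase, verb_phrase.
noun_phrase --> determiner, noun.
noun_phrase --> proper_noun.
determiner --> [the].
determiner --> [a].
proper_noun --> [pedro].
noun --> [man].
noun --> [apple].
verb_phrase --> verb, noun_phrase.
verb_phrase --> verb.
verb --> [eats].
verb --> [sings].

% count_sentences(-N): N is the number of word lists the grammar generates.
count_sentences(N) :-
    findall(L, phrase(sentence, L), Sentences),
    length(Sentences, N).
```

The query ?- count_sentences(N). gives N = 60: there are 5 noun phrases (2 determiners times 2 nouns, plus pedro) and 12 verb phrases (2 verbs times 5 noun phrases, plus the 2 verbs alone), and 5 * 12 = 60.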