Copyright © 2003-2014 Curt Hill Languages and Grammars This is not English Class. But there is a resemblance.



Introduction
We have already determined that some computations are impossible
  –The halting problem was one and there are others
What we want are models of computation that give us insight into what is and is not computable
Strangely enough, models of computation are closely related to the complexity of languages

Languages
Every natural language is spoken
Usually written as well
Such languages are extremely complicated
Every language has a syntax and semantics
  –Syntax: the form that the language must have
  –Semantics: the meaning
It may be difficult or impossible to assign meaning to a sentence that violates the syntax

Grammar
The grammar of a language describes the syntax of that language
Since natural languages are extremely complicated, we would expect their grammars to also be complicated
Perhaps you recall diagramming sentences from high school
  –Diagramming confirms the syntax of a sentence

Natural Languages
These are extremely complicated
A full grammar for a natural language can fill volumes
See the text for some simple examples from English

Formal Languages
In contrast with natural languages are formal languages
  –Artificial languages
These are typically not designed for person to person communication
  –Rather person to machine or machine to machine
In comparison with natural languages they:
  –Have very few rules
  –Have very few exceptions to these rules

Examples
The largest class of these is likely programming languages
There are others as well
Mathematical notation may be considered a formal language even though it is designed for a form of person to person communication

Noam Chomsky
Professor emeritus of linguistics at MIT
Developed a theory of generative grammars
This includes a language hierarchy
  –AKA Chomsky-Schützenberger Hierarchy
Most of the theory of this section was developed by Chomsky

Phrase Structure Grammar
A grammar, G, is a four-tuple: G = (V, T, S, P)
V is the alphabet or vocabulary
T is a set of terminal elements
S is the start symbol or distinguished symbol
P is a set of productions
  –Productions are rewrite rules
A grammar should be able to enumerate any legal sentence of the language

Formal Grammars
Each grammar consists of four things
V – a finite vocabulary of symbols
  –Holds both the terminals and the non-terminals (aka variables)
T – a finite set of terminal symbols
  –Words made up from an alphabet
S – the start symbol
  –Must be a non-terminal element of V
P – a set of productions

V
A set of elements or symbols
We may think about this as the character set
  –Although that is a little misleading
  –The alphabet of English is made up of letters, digits and punctuation
  –But not every combination of letters is a word
Perhaps the better way to think about V is as words and stand-alone symbols
  –There is usually a rule for construction

T and N
There is a set of terminal symbols, T, as well as a set of non-terminal symbols, N
  –T is a subset of V, and N is the rest of V
Terminals can exist in a legal instance of the language
Non-terminals are concepts that need to be instantiated, that is, converted into concrete terminals

Examples
In English any legal word is a terminal
A concept like “noun phrase” is a non-terminal
  –This can be instantiated in a myriad of actual words
In C++ the reserved word “for” and any identifier are terminals
In C++ the concept “if statement” is a non-terminal

P
A set of productions
A production is a rewrite rule
Form:
  –X → Y
This means that we can rewrite X as Y
  –Since → is hard to type we often use ::=
Each production must have at least one non-terminal on the left
The complexity of these rules determines the type of language

S
The start symbol or distinguished symbol
This is a non-terminal from which all derivations start
In English this is usually something like “sentence”
In most programming languages it is something like “program” or “unit”

Grammars
We should be able to produce two things from a grammar
  –A generator
  –A recognizer
A generator should produce any legal string in the language
A recognizer should determine if a string is legal or not
  –This process is part of parsing

Language Recognizer
An automaton that reads in a purported construction in the language
It answers yes or no: is this construction indeed in the language?
Sometimes a reference recognizer is produced
A recognizer is not a compiler
  –Its only purpose is to classify

Language generators
Generates correct statements or correct programs
If given enough time (∞), it should generate every correct statement in the language
Since it generates random correct statements it has some use in learning the syntax

Some Examples
Let's consider a simple grammar that generates any bit string
G = (V, T, S, P)
V = {Z, B, 0, 1}
T = {0, 1}
S = Z
P = {Z → B, B → BB, B → 0, B → 1}
Terminals are 0 and 1
Non-terminals are Z and B
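The four parts of this grammar can be written down directly as data. The following is only a sketch: the variable names and the `is_well_formed` helper are illustrative assumptions, not part of the slides, and every symbol is assumed to be a single character.

```python
# Hypothetical encoding of the bit-string grammar; all names are illustrative.
V = {"Z", "B", "0", "1"}          # vocabulary
T = {"0", "1"}                    # terminals
S = "Z"                           # start symbol
P = [("Z", "B"), ("B", "BB"), ("B", "0"), ("B", "1")]  # (left, right) pairs
N = V - T                         # non-terminals are the rest of V


def is_well_formed(V, T, S, P):
    """Check the structural rules stated for a phrase structure grammar."""
    non_terminals = V - T
    if not T <= V or S not in non_terminals:
        return False
    for left, right in P:
        # Every symbol used in a production must come from the vocabulary.
        if not all(sym in V for sym in left + right):
            return False
        # The left side needs at least one non-terminal.
        if not any(sym in non_terminals for sym in left):
            return False
    return True


print(sorted(N))                   # the non-terminals
print(is_well_formed(V, T, S, P))  # structural check on the grammar
```

Keeping the productions as plain (left, right) pairs is just one convenient choice; a dictionary from non-terminal to alternatives would work equally well for grammars of this shape.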

Derivations
Is the above grammar able to generate all possible bit strings?
Let’s consider a few:
1 (start with Z)
  –Z → B (B)
  –B → 1 (1)
10 (start with Z)
  –Z → B (B)
  –B → BB (BB)
  –B → 1 (1B)
  –B → 0 (10)

One More
010 (start with Z)
  –Z → B (B)
  –B → BB (BB)
  –B → 0 (0B)
  –B → BB (0BB)
  –B → 1 (01B)
  –B → 0 (010)
Are you convinced?
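The derivation of 010 can be replayed mechanically. This sketch assumes each step rewrites the leftmost non-terminal, which matches the steps listed above but is not stated as a rule in the slides; the function names are hypothetical.

```python
NONTERMINALS = {"Z", "B"}


def apply_leftmost(form, left, right):
    """Rewrite the leftmost non-terminal of `form`, which must equal `left`."""
    for i, sym in enumerate(form):
        if sym in NONTERMINALS:
            if sym != left:
                raise ValueError(f"leftmost non-terminal is {sym}, not {left}")
            return form[:i] + right + form[i + 1:]
    raise ValueError("no non-terminal left to rewrite")


def replay(start, steps):
    """Return every intermediate string produced by a list of productions."""
    forms = [start]
    for left, right in steps:
        forms.append(apply_leftmost(forms[-1], left, right))
    return forms


# The six productions used for 010, in the order given.
STEPS_010 = [("Z", "B"), ("B", "BB"), ("B", "0"),
             ("B", "BB"), ("B", "1"), ("B", "0")]

forms_010 = replay("Z", STEPS_010)
print(" => ".join(forms_010))
```

Each intermediate string printed corresponds to the parenthesized form beside the matching step.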

Definitions
A string may be derived from the start symbol if it is a legal construct of the language
A string is a direct derivation from another if it can be obtained by applying only one production
A string is a derivation from another if it can be obtained by one or more production applications
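The difference between a direct derivation and a derivation can be sketched in a few lines for the bit-string grammar. The function names and the step limit are assumptions made for illustration; the step limit only keeps the search finite.

```python
P = [("Z", "B"), ("B", "BB"), ("B", "0"), ("B", "1")]


def one_step(form, productions):
    """All strings that are direct derivations from `form`."""
    results = set()
    for left, right in productions:
        at = form.find(left)
        while at != -1:
            results.add(form[:at] + right + form[at + len(left):])
            at = form.find(left, at + 1)
    return results


def derives(source, target, productions, max_steps):
    """True if `target` follows from `source` in 1..max_steps applications."""
    frontier = {source}
    for _ in range(max_steps):
        frontier = {nxt for form in frontier
                    for nxt in one_step(form, productions)}
        if target in frontier:
            return True
    return False


print(sorted(one_step("BB", P)))   # direct derivations from BB
print(derives("Z", "10", P, 4))    # 10 needs four production applications
```

Here `one_step` rewrites at any position, not only the leftmost one, since the definition places no restriction on where a production is applied.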

A Language
Definition: The language of a grammar is the set of all possible strings that may be derived from the start symbol of the grammar
  –The finished string must contain only terminals
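For the bit-string grammar, this definition can be explored by brute force: collect every all-terminal string reachable from the start symbol, up to a length limit. This is a sketch under one assumption, that no production shortens a string, which holds for this grammar and makes the length cut-off safe. The names are illustrative.

```python
from itertools import product

T = {"0", "1"}
S = "Z"
P = [("Z", "B"), ("B", "BB"), ("B", "0"), ("B", "1")]


def language_up_to(max_len):
    """All terminal-only strings of length <= max_len derivable from S."""
    seen, stack, words = {S}, [S], set()
    while stack:
        form = stack.pop()
        if all(sym in T for sym in form):
            words.add(form)          # finished: only terminals remain
            continue
        for left, right in P:
            at = form.find(left)
            while at != -1:
                nxt = form[:at] + right + form[at + len(left):]
                if len(nxt) <= max_len and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
                at = form.find(left, at + 1)
    return words


all_bit_strings = {"".join(bits)
                   for n in range(1, 4)
                   for bits in product("01", repeat=n)}

print(language_up_to(3) == all_bit_strings)  # every non-empty string up to length 3
print("" in language_up_to(3))               # the empty string is not derivable
```

One thing the enumeration makes visible: every non-empty bit string appears, but the empty string does not, because no production produces nothing.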

Other Way
Let us now try one where we want a particular language and we have to come up with the grammar
Let's consider the set of bit strings that start with 00 and end with a sequence of 1s
As a regular expression: 00(0|1)*1+
  –001, among others
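The regular expression can be tried out directly with Python's standard `re` module. The helper names below are illustrative; the pattern itself is the one given above.

```python
import re
from itertools import product

PATTERN = re.compile(r"00(0|1)*1+")


def matches(s):
    """True if the whole string s is in the language of the pattern."""
    return PATTERN.fullmatch(s) is not None


def members_up_to(max_len):
    """Every matching bit string of length <= max_len, shortest first."""
    return ["".join(bits)
            for n in range(1, max_len + 1)
            for bits in product("01", repeat=n)
            if matches("".join(bits))]


print(members_up_to(5))
print(matches("0010"), matches("011"))  # both fail: wrong ending, wrong start
```

Using `fullmatch` rather than `search` matters here, since the question is whether the entire string has the required shape.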

The Grammar
There is more than one way to write a grammar for the previous language
G = (V, T, S, P)
T = {0, 1}
S = S
What is P?

P
Must have at least one production starting with S:
  –S → 00 B 1
B then looks like the bit string of before:
  –B → 0
  –B → 1
  –B → BB
Since B cannot derive the empty string, 001 itself also needs a production:
  –S → 001
What other possibilities could we have?
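A candidate P can be checked against the regular expression 00(0|1)*1+ by enumerating both up to a length limit. This sketch tests the productions S → 00 B 1, B → 0, B → 1, B → BB on their own and then with S → 001 added. The helper names and the limit of 6 are assumptions, and the length cut-off relies on no production shortening a string.

```python
import re
from itertools import product

TERMINALS = {"0", "1"}
BASE_PRODUCTIONS = [("S", "00B1"), ("B", "0"), ("B", "1"), ("B", "BB")]
TARGET = re.compile(r"00(0|1)*1+")


def generated_words(productions, max_len, start="S"):
    """Terminal-only strings of length <= max_len derivable from start."""
    seen, stack, words = {start}, [start], set()
    while stack:
        form = stack.pop()
        if all(sym in TERMINALS for sym in form):
            words.add(form)
            continue
        for left, right in productions:
            at = form.find(left)
            while at != -1:
                nxt = form[:at] + right + form[at + len(left):]
                if len(nxt) <= max_len and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
                at = form.find(left, at + 1)
    return words


def target_words(max_len):
    """Bit strings of length <= max_len that match the regular expression."""
    return {"".join(bits)
            for n in range(1, max_len + 1)
            for bits in product("01", repeat=n)
            if TARGET.fullmatch("".join(bits))}


missing = target_words(6) - generated_words(BASE_PRODUCTIONS, 6)
extra = generated_words(BASE_PRODUCTIONS, 6) - target_words(6)
patched = BASE_PRODUCTIONS + [("S", "001")]

print(missing, extra)
print(generated_words(patched, 6) == target_words(6))
```

Agreement up to a length limit is evidence, not a proof, but it is a quick way to catch a production set that is too narrow or too generous.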

Audience Participation
What is the grammar for the bit strings that look like this: 0^h 1^j 0^k where h>0, j>0, k>0
This includes:
  –010, among others
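Without giving the grammar away, a small membership checker is handy for testing candidate answers. This sketch collapses each run of repeated symbols and compares the result with the pattern 0, 1, 0; the function name is hypothetical.

```python
from itertools import groupby


def in_language(s):
    """True if s is one or more 0s, then one or more 1s, then one or more 0s."""
    runs = [symbol for symbol, _ in groupby(s)]
    return runs == ["0", "1", "0"]


for candidate in ["010", "0011100", "01", "0101", ""]:
    print(repr(candidate), in_language(candidate))
```

Any string produced by a proposed grammar should pass this check, and any string that passes should be derivable from the grammar.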

One Last Thing (or not)
Finally, let's look at an example programming language
A subset of C

C Subset as an Example
N – the non-terminals in V
  –Statement
  –Declaration
  –For-statement
T – set of terminals
  –Reserved words
  –Punctuation
  –Identifiers

C example again
S – Start symbol
  –Independently compilable part
  –Program
  –Function
  –Constant
P – set of productions
  –Rewrite rules
  –Start at the start symbol
  –End at terminals

C For Production
For-statement → for ( expression ; expression ; expression ) statement
This contains the terminals:
  –for ( ; )
Non-terminals
  –Expression
  –Statement

Productions Again
Every non-terminal must have one or more productions that define it
Multiple productions usually signify alternation
Recursion is allowed

Recursion
Productions may be recursive
Recall for-statement, here is Statement
Statement → expression ;
Statement → for-statement ;
Statement → if-statement ;
Statement → while-statement ;
Statement → compound-statement
Etc.
Statement refers to for-statement, which in turn refers back to Statement
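Recursive productions map naturally onto mutually recursive functions. The recognizer below is a rough sketch under several assumptions that are not in the slides: tokens are separated by spaces, every expression is the single placeholder token `expr`, a compound statement is `{` followed by zero or more statements and `}`, and a for-statement is not followed by an extra semicolon (as in ordinary C). Only three of the Statement alternatives are covered.

```python
def expect(tokens, pos, wanted):
    """Consume one specific token and return the next position."""
    if pos < len(tokens) and tokens[pos] == wanted:
        return pos + 1
    raise SyntaxError(f"expected {wanted!r} at token {pos}")


def parse_for(tokens, pos):
    """for-statement -> for ( expr ; expr ; expr ) statement"""
    for wanted in ["for", "(", "expr", ";", "expr", ";", "expr", ")"]:
        pos = expect(tokens, pos, wanted)
    return parse_statement(tokens, pos)   # recursion back into Statement


def parse_statement(tokens, pos):
    """Return the position just past one statement starting at pos."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] == "for":
        return parse_for(tokens, pos)
    if tokens[pos] == "{":                # assumed compound-statement form
        pos += 1
        while pos < len(tokens) and tokens[pos] != "}":
            pos = parse_statement(tokens, pos)
        return expect(tokens, pos, "}")
    pos = expect(tokens, pos, "expr")     # Statement -> expression ;
    return expect(tokens, pos, ";")


def is_statement(source):
    """True if the whole token string is exactly one statement."""
    tokens = source.split()
    try:
        return parse_statement(tokens, 0) == len(tokens)
    except SyntaxError:
        return False


print(is_statement("for ( expr ; expr ; expr ) { expr ; expr ; }"))
print(is_statement("for ( expr ; expr ) expr ;"))
```

`parse_statement` calls `parse_for`, which calls `parse_statement` again, so a for loop nested inside another for loop needs no extra code.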

Exercises
13.1a
  –1, 5, 13