Languages & Strings String Operations Language Definitions.


Presentation on theme: "Languages & Strings String Operations Language Definitions."— Presentation transcript:

1 Languages & Strings String Operations Language Definitions

2 Strings
A string x (over alphabet A) is a finite sequence x = x1 x2 ... xn where each xi ∈ A.
Length – the length of x is the number of characters, n, in the sequence.
Empty String – λ denotes the empty string, of length 0.
Recursive definition of the set of strings A* over alphabet A:
– Basis: the empty string λ ∈ A*
– Recursive Step: if x ∈ A* and a ∈ A, then xa ∈ A*
– Closure: A* contains no other strings
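The recursive definition can be read as a procedure for listing A*. The following is a minimal Python sketch, not part of the original slides; the function name `strings_over` and the length bound `max_len` are illustrative assumptions (A* itself is infinite, so the listing has to stop somewhere).

```python
def strings_over(alphabet, max_len):
    """Yield every string over `alphabet` of length <= max_len, shortest first."""
    level = [""]  # basis: the empty string λ is in A*
    for _ in range(max_len + 1):
        yield from level
        # recursive step: if x is in A* and a is in A, then xa is in A*
        level = [x + a for x in level for a in alphabet]


print(list(strings_over("ab", 2)))
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

Each pass of the loop applies the recursive step once to every string found so far, so the strings of length n appear after exactly n applications.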

3 Languages and String Operations
– Language: a language L over alphabet A is any subset of A*.
– Concatenation of two strings: the concatenation of strings x and y is xy, a string whose length is length(x) + length(y).
– Concatenation of two languages: the concatenation of languages L and M is LM = { z | z = xy where x ∈ L, y ∈ M }.
Example: let T = D* and O = {"+", "-"}, where D = {0, 1, ..., 9}. Then TOT is the language {"1+1", "12+24", ...}.
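Language concatenation maps directly onto a set comprehension. The sketch below is an illustration rather than part of the slides: the helper name `concat` is an assumption, and because T = D* is infinite, T is truncated here to digit strings of length at most 2.

```python
def concat(L, M):
    """Concatenation of two languages: { xy | x in L, y in M }."""
    return {x + y for x in L for y in M}


D = set("0123456789")
# Assumption: T = D* is cut off at length 2 so the sets stay finite.
T = {""} | D | concat(D, D)
O = {"+", "-"}
TOT = concat(concat(T, O), T)

print("1+1" in TOT, "12+24" in TOT)  # True True
print("+" in TOT)                    # True, because λ is a member of D*
```

The last line shows a consequence of choosing T = D*: since λ ∈ D*, strings with a missing operand, such as "+", also belong to TOT.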

4 Recursive Definition of Regular Sets
Let A be an alphabet. The regular sets over A are defined as follows:
– Basis: ∅, {λ} and {a}, for each a ∈ A, are regular sets
– Recursive Step: if X and Y are regular sets, so are X ∪ Y, XY and X*
– Closure: X is a regular set over A iff it can be obtained from the basis sets by a finite number of applications of the recursive step

5 Regular Set Examples
– Signed and unsigned integers
– Unparenthesized expressions with variable operands and binary operators, where a variable is a letter followed by a string of letters and digits (l – letter, d – digit)
– English sentences with a structure built from lexical categories: d – determiner, a – adjective, n – noun, x – adverb, v – verb

6 Regular Set Examples
– Signed and unsigned integers: ({λ} ∪ {+} ∪ {-}){d}{d}*
– Expressions without parentheses: ({l}({l} ∪ {d})*)(({+} ∪ {*})({l}({l} ∪ {d})*))*
– Sentences: ({d}{a}*{n})(({λ} ∪ {x}){v})({d}{a}*{n})
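These three regular sets can be tried out with Python's `re` module, where `|` or a character class plays the role of ∪ and `?` stands for a union with {λ}. This is a sketch under stated assumptions, not the slides' own notation: d is taken to be an ASCII digit and l a lowercase letter in the first two patterns, and the sentence pattern works on strings of category symbols (d, a, n, x, v) rather than on English words. The names `INTEGER`, `EXPRESSION`, `SENTENCE` and `in_set` are illustrative.

```python
import re

# ({λ} ∪ {+} ∪ {-}){d}{d}*  with d = digit
INTEGER = re.compile(r"[+-]?[0-9][0-9]*")
# ({l}({l} ∪ {d})*)(({+} ∪ {*})({l}({l} ∪ {d})*))*  with l = letter, d = digit
EXPRESSION = re.compile(r"[a-z][a-z0-9]*([+*][a-z][a-z0-9]*)*")
# ({d}{a}*{n})(({λ} ∪ {x}){v})({d}{a}*{n})  over category symbols
SENTENCE = re.compile(r"da*nx?vda*n")


def in_set(pattern, s):
    """True if the whole string s belongs to the regular set."""
    return pattern.fullmatch(s) is not None


print(in_set(INTEGER, "-42"), in_set(INTEGER, "+"))             # True False
print(in_set(EXPRESSION, "x1+y*z2"), in_set(EXPRESSION, "x+"))  # True False
print(in_set(SENTENCE, "danvdn"), in_set(SENTENCE, "dnvn"))     # True False
```

`fullmatch` is used rather than `search` because membership in a regular set concerns the entire string, not a substring.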

7 Regular Expressions
The set of strings that begin with an "a" and end with a "b" is a regular set over {a, b}, since it equals {a}({a} ∪ {b})*{b}.
Regular expressions represent regular sets as follows:
– ∅, λ and a represent ∅, {λ} and {a}.
– If u and v are regular expressions (representing regular sets), then (u ∪ v), (uv) and (u*) are regular expressions representing their union, concatenation and Kleene closure.
– Dropping superfluous parentheses, a(a ∪ b)*b represents the regular set of all strings starting with a and ending with b.
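The claim that the expression describes exactly the strings starting with a and ending with b can be spot-checked by brute force. The sketch below is an illustration only: it uses Python's `re` syntax (`|` for union), and the function names and the length bound are assumptions. Agreement up to a bound is evidence, not a proof.

```python
import re
from itertools import product

PATTERN = re.compile(r"a(a|b)*b")


def starts_a_ends_b(s):
    """Direct statement of the property the expression is meant to capture."""
    return s.startswith("a") and s.endswith("b")


def agree_up_to(max_len):
    """Compare the expression with the property on every string over {a, b}."""
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            if (PATTERN.fullmatch(s) is not None) != starts_a_ends_b(s):
                return False
    return True


print(agree_up_to(8))  # True
```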

8 Grammars
A context free grammar G is a 4-tuple G = (V, Σ, P, S), where
1. V is a set of nonterminals (or string variables), each representing a sublanguage from which the variable takes its values. One example is a variable that takes on values such as "the big box"; another is T, which takes on string values used to represent products in an algebraic expression.
2. Σ is a finite alphabet, containing the symbols from which language strings are formed. Examples are the English vocabulary (over a hundred thousand words, each treated as an atomic symbol), the printable ASCII character set, and the binary alphabet {0, 1}.

9 Grammars Continued
3. P is a finite set of productions, or rules, used to define the sublanguages represented by the nonterminals. In a context free grammar, a rule has the format A → X where A ∈ V and X ∈ (V ∪ Σ)*. The interpretation is that the strings in the sublanguage represented by A can be constructed according to the format indicated by X: each terminal character in X appears as itself in the A string, and each variable in X is replaced by a string from that variable's sublanguage. An example is T → a * T.
4. S is a designated variable (referred to as the start symbol or the head of the language). It represents the language being defined by the grammar G.
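One way to make the 4-tuple concrete is as a small record with a well-formedness check. This is a sketch, not a standard API: the class name `Grammar`, its field names and the method `is_well_formed` are assumptions, and the terminating rule T → a is a hypothetical addition so that the example grammar can produce finite strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    variables: frozenset   # V
    alphabet: frozenset    # Σ
    productions: tuple     # P, as (A, X) pairs where X is a tuple of symbols
    start: str             # S

    def is_well_formed(self):
        """Check S ∈ V, V ∩ Σ = ∅, and A ∈ V, X ∈ (V ∪ Σ)* for every rule."""
        symbols = self.variables | self.alphabet
        return (
            self.start in self.variables
            and not (self.variables & self.alphabet)
            and all(
                head in self.variables and all(sym in symbols for sym in body)
                for head, body in self.productions
            )
        )


# T → a * T is from the slide; T → a is a hypothetical terminating rule.
products = Grammar(
    variables=frozenset({"T"}),
    alphabet=frozenset({"a", "*"}),
    productions=(("T", ("a", "*", "T")), ("T", ("a",))),
    start="T",
)
print(products.is_well_formed())  # True
```

Storing each right-hand side as a tuple of symbols, rather than a plain string, leaves room for alphabets whose symbols are whole words, as in the English vocabulary example.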

10 Grammar Examples
– Signed and unsigned integers
– Unparenthesized expressions with variable operands and binary operators, where a variable is a letter followed by a string of letters and digits (l – letter, d – digit)
– English sentences with a structure built from lexical categories: d – determiner, a – adjective, n – noun, x – adverb, v – verb

11 Grammar Examples
– Signed and unsigned integers: I → SD, S → + | - | λ, D → dD, D → d
– Unparenthesized expressions with variable operands and binary operators, where a variable is a letter followed by a string of letters and digits (l – letter, d – digit): E → VE, E → V, V → lU, U → lU, U → dU, U → λ
– English sentences with a structure built from lexical categories: d – determiner, a – adjective, n – noun, x – adverb, v – verb

12 Grammars and Derivations
Derivations: if u, v are strings in (V ∪ Σ)*, A is in V and A → X is in P, then uAv ⇒ uXv, read as uAv "derives" uXv by application of the rule A → X. For repeated application of zero or more rules, the symbol ⇒* is used.
Language Definition: the language L(G) defined by G is { x | x ∈ Σ*, S ⇒* x }.
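The derivation relation and L(G) can be explored by brute force on a small grammar. The sketch below uses the signed and unsigned integer grammar I → SD, S → + | - | λ, D → dD | d, reading the empty alternative of S as λ (an assumption). It also assumes every symbol is a single character; the names `RULES`, `step` and `language` are illustrative.

```python
from collections import deque

# Integer grammar; the empty string "" stands for λ.
RULES = {
    "I": ["SD"],
    "S": ["+", "-", ""],
    "D": ["dD", "d"],
}
START = "I"


def step(form):
    """Yield every form uXv reachable from form = uAv in one step uAv ⇒ uXv."""
    for i, sym in enumerate(form):
        if sym in RULES:
            u, v = form[:i], form[i + 1:]
            for X in RULES[sym]:
                yield u + X + v


def language(max_len):
    """Strings x over Σ with START ⇒* x and len(x) <= max_len (breadth-first)."""
    seen, found = {START}, set()
    queue = deque([START])
    while queue:
        form = queue.popleft()
        for nxt in step(form):
            # Terminals are never erased, so forms with too many can be pruned.
            terminals = sum(1 for c in nxt if c not in RULES)
            if terminals > max_len or nxt in seen:
                continue
            seen.add(nxt)
            if terminals == len(nxt):
                found.add(nxt)      # no variables left: a string of L(G)
            else:
                queue.append(nxt)
    return found


print(list(step("SD")))     # ['+D', '-D', 'D', 'SdD', 'Sd']
print(sorted(language(2)))  # ['+d', '-d', 'd', 'dd']
```

The search terminates for this grammar because every repeated rule adds a terminal; a grammar with cycles that add no terminals would need an extra bound.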

13 Language Definition
Language definition is a means of specifying which strings belong to the language. Two approaches to language definition are:
– Acceptive: given a string, a device specifies whether or not it belongs to the language. An automaton A that processes a language string x accepts x as belonging to the language if its final state belongs to the set of legal final states. A parser constructed from the grammar defining the language accepts the string if it can parse it.
– Generative: given an alphabet, a generative device tells how strings in the language are formed. A language manual that tells how strings are formed can be used to generate language strings. A grammar is a generative means of specification: any string that can be derived from the start symbol by applying grammar rules is in the language.

14 Grammars and Derivations
Derivations: if u, v are strings in (V ∪ Σ)*, A is in V and A → X is in P, then uAv ⇒ uXv, read as uAv "derives" uXv by application of the rule A → X. For repeated application of zero or more rules, the symbol ⇒* is used.
Language Definition: the language L(G) defined by G is { x | x ∈ Σ*, S ⇒* x }.

15 Finite state automata and language recognition
[State diagram with states S, I, D and F and transitions on d and "."]
The finite state automaton has Σ = {d, .}, start state S and legal final states I and D. The transition function is represented by the diagram above or the table below:
State   d   .
S       I   F
I       I   D
F       D   -
D       D   -
Accepts: ddd, d.dd, .ddd
Rejects: d.dd.d

16 Automata as Acceptors
[State diagram with states S, I, D and F, as on the previous slide]
– The string ddd.d produces the state sequence SIIIDD and is accepted because the last state, D, is a legal final state.
– The string .dd produces the state sequence SFDD and is accepted because D is legal.
– The string ddd produces the state sequence SIII and is accepted because I is legal.
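The acceptor can be simulated with a dictionary for the transition function. This is a sketch rather than the slides' own code: the names `DELTA`, `run` and `accepts` are illustrative, the decimal point is written ".", and it assumes that states F and D have no move on "." (a missing entry means the automaton gets stuck and rejects).

```python
# Transition function; absent (state, symbol) pairs mean "no move".
DELTA = {
    ("S", "d"): "I", ("S", "."): "F",
    ("I", "d"): "I", ("I", "."): "D",
    ("F", "d"): "D",
    ("D", "d"): "D",
}
START = "S"
FINAL = {"I", "D"}


def run(string):
    """Return the list of states visited, or None if the automaton gets stuck."""
    states = [START]
    for ch in string:
        nxt = DELTA.get((states[-1], ch))
        if nxt is None:
            return None
        states.append(nxt)
    return states


def accepts(string):
    """Accept if the run completes and ends in a legal final state."""
    states = run(string)
    return states is not None and states[-1] in FINAL


for s in ["ddd.d", ".dd", "ddd", "d.dd.d"]:
    seq = run(s)
    print(s, "".join(seq) if seq else "stuck", accepts(s))
```

A string of n symbols always yields n + 1 states, since the sequence includes the start state.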

17 Parsing
Given a grammar G with distinguished nonterminal S and a string X over the alphabet, does S ⇒* X? Parsing attempts to find a sequence of rules by which S ⇒* X.

18 Grammar for Decimal Numbers
I → d I
I → d
I → . D
D → d D
D → d
[Parse tree diagram for dd.ddd]
A parse tree has an intermediate node for each nonterminal, a child node for each RHS character in the production used to replace the nonterminal, and a leaf node for each character in the language string produced by the derivation. The language is the set of strings for which there exist parse trees.
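A parse tree for this grammar can be built with a small recursive-descent parser. The sketch below is one possible approach, not the method of the slides. It assumes the third rule carries the decimal point (I → . D), since the string dd.ddd could not be parsed otherwise. Trees are nested tuples of the form (nonterminal, child, ...), and the function names are illustrative.

```python
def parse_I(s, i=0):
    """Parse s[i:] as an I; return a tree (label, children...) or None."""
    if i < len(s) and s[i] == "d":
        if i + 1 == len(s):
            return ("I", "d")                       # I → d
        rest = parse_I(s, i + 1)
        return ("I", "d", rest) if rest else None   # I → d I
    if i < len(s) and s[i] == ".":
        rest = parse_D(s, i + 1)
        return ("I", ".", rest) if rest else None   # I → . D
    return None


def parse_D(s, i):
    """Parse s[i:] as a D; return a tree or None."""
    if i < len(s) and s[i] == "d":
        if i + 1 == len(s):
            return ("D", "d")                       # D → d
        rest = parse_D(s, i + 1)
        return ("D", "d", rest) if rest else None   # D → d D
    return None


def leaves(tree):
    """Concatenate the leaf characters of a parse tree, left to right."""
    if isinstance(tree, str):
        return tree
    return "".join(leaves(child) for child in tree[1:])


tree = parse_I("dd.ddd")
print(tree)
print(leaves(tree))  # dd.ddd
```

Reading the leaves from left to right recovers the input string, which is the sense in which the tree witnesses I ⇒* dd.ddd. A result of `None` means no parse tree exists, so the string is not in the language.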

