SIE 550 – Formal Languages Lecture for SIE 550 Matt Dube Doctoral Student – Spatial IGERT Fellow
Languages What do we think of when we think of languages? BUILDING BLOCKS: letters, numbers, punctuation. BUILDINGS IN THE LANGUAGE: syllables, words, parts of speech, grammar, phrases, sentences.
What are we going to do today? Not quite what you bargained for. We are going to learn how to write in a foreign language that none of us can actually speak. We will then use that language to set up the constructs that govern computers and databases.
The Language Is…. KOREAN
Korean Words 말하다 In English this is a one-word physical action, so everyone take a guess by acting it out. Answer: SPEAK
Now That You Understand Korean… (pause for raucous laughter) Korean appears to be a graphical language, like Chinese and quite unlike English. Looks are very deceiving, though: 말하다
KOREAN ALPHABET Korean is an alphabetical language! [Chart: Consonants | Vowels]
[Chart: Consonants | Vowels] 말하다 But wait… those characters aren’t in the alphabet! Let’s take a closer look. So what exactly is each character in the word “speak”?
Syllables Each character in written Korean is actually a syllable! Any patterns?
[Chart: Consonants | Vowels] 말하다
Syllables Korean characters are syllables! Any patterns? –Consonant followed by a vowel –Consonant followed by a vowel followed by a consonant. Every character in written Korean follows one of those two patterns. A syllable is thus a composite of letters in a specified order (and, ironically enough, every such combination exists). Words, phrases, sentences, etc. are composites of syllables.
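The two patterns can be checked mechanically. The sketch below is one way to do it in Python, not part of the lecture: it relies on Unicode canonical decomposition (NFD), which splits a precomposed Hangul syllable into a leading consonant, a vowel, and an optional trailing consonant. The function name `syllable_pattern` is made up for illustration.

```python
import unicodedata


def syllable_pattern(syllable: str) -> str:
    """Return "CV" or "CVC" for one precomposed Hangul syllable."""
    # Precomposed Hangul syllables occupy U+AC00 through U+D7A3.
    if len(syllable) != 1 or not ("\uac00" <= syllable <= "\ud7a3"):
        raise ValueError(f"not a single Hangul syllable: {syllable!r}")
    # NFD yields 2 parts (consonant, vowel) or 3 parts
    # (consonant, vowel, trailing consonant).
    parts = unicodedata.normalize("NFD", syllable)
    return "CV" if len(parts) == 2 else "CVC"


for character in "말하다":
    print(character, syllable_pattern(character))
```

Run on the word for "speak", this reports one consonant-vowel-consonant syllable followed by two consonant-vowel syllables.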
What About English? NOT SO SIMPLE!
Example “dog”: CUTE, CUDDLY, FURRY ANIMAL THAT BARKS. What then does “qeb” mean? ABSOLUTELY NOTHING!
English Language –Syllables do not map cleanly onto our alphabet, so some letter combinations are not valid syllables –Syllables are phonetically based –Korean is a phonetically based language –English is often called one of the hardest languages to learn –Written Korean is often called one of the easiest scripts to learn
Why Did We Do That? Necessity: computers and databases need a “Korean” type of language as opposed to an “English” type of language –Parsing –Relevancy. The syllable structure of written Korean behaves like a formal language.
Formal Languages Characters –Letters, numbers, punctuation –Building blocks –We call these terminal symbols or axioms Uses of characters –Syllables, words, phrases, grammar, sentences, etc. –Buildings –We call these non-terminal symbols or production rules or predicates
Where do we find them? Music, Mathematics, World Languages, Art, Computer Languages… Almost Anywhere
Conventions –The production rules form the syntax (grammar/spelling) of a language –Valid combinations under the syntax are called well-formed formulas, or WFFs (sentences) Example: –“Jack in the box” is not well formed –We should say: “Jack is in the box.” [Diagram labels: JACK, IN, BOX]
Forming a WFF
Operators are needed for these languages:
–|  = exclusive or
–::=  = is replaced by
–[ ]  = optional (0 or 1)
–{ }  = optional (0 to many)
–“ ”  = designators of terminal symbols
Examples of situations for these?
–Assigning a sign (+ or -)
–Fullname ::= First Last
–Area code
–Letters in a name
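For readers who already know regular expressions, the operators have rough counterparts in simple, non-recursive cases. The sketch below is illustrative only: the rule `number ::= [sign] digit {digit}` is a hypothetical example built from the "assigning a sign" situation, and `is_signed_number` is a made-up name.

```python
import re

# Hypothetical grammar, written with the operators above:
#   number ::= [sign] digit {digit}
#   sign   ::= "+" | "-"
#   digit  ::= "0" | "1" | ... | "9"
#
# Rough regular-expression counterparts:
#   "+" | "-"   ->  [+-]     (choose one alternative)
#   [sign]      ->  [+-]?    (0 or 1 occurrences)
#   {digit}     ->  [0-9]*   (0 to many occurrences)
SIGNED_NUMBER = re.compile(r"[+-]?[0-9][0-9]*")


def is_signed_number(text: str) -> bool:
    """True when the whole string can be derived from `number`."""
    return SIGNED_NUMBER.fullmatch(text) is not None


for candidate in ["+42", "-7", "42", "+-4", "+"]:
    print(candidate, is_signed_number(candidate))
```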
Programming Korean Syllables Start with what we are trying to create
Korean Syllables
    start ::= syllable
Read as: start “is replaced by” syllable
Programming Korean Syllables –Start with what we are trying to create –Establish the form of the creation
Korean Syllables
    start ::= syllable
    syllable ::= consonant vowel [consonant]
Read as: a consonant followed by a vowel, then another consonant if necessary
Programming Korean Syllables –Start with what we are trying to create –Establish the form of the creation –Establish the terminal symbols
Korean Syllables
    start ::= syllable
    syllable ::= consonant vowel [consonant]
    consonant ::= “A” | “B” | “C” | “D” | “E” | “F” | “G” | “H” | “I” | “J” | “K” | “L” | “M” | “N”
    vowel ::= “0” | “1” | “2” | “3” | “4” | “5” | “6” | “7” | “8” | “9”
Any one of the terminals “A” through “N” can be substituted for consonant; any one of “0” through “9” can be substituted for vowel.
Let’s Try to Program Korean Syllables –Start with what we are trying to create –Establish the form of the creation –Establish the terminal symbols. Congratulations! We can now generate any Korean syllable mechanically. Let’s test a few examples.
Korean Syllables
    start ::= syllable
    syllable ::= consonant vowel [consonant]
    consonant ::= “A” | “B” | “C” | “D” | “E” | “F” | “G” | “H” | “I” | “J” | “K” | “L” | “M” | “N”
    vowel ::= “0” | “1” | “2” | “3” | “4” | “5” | “6” | “7” | “8” | “9”
Which of these are valid syllables? A3A? 100? AAA? K9? M4? O7K? 3D? C3P0?
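The test can also be run mechanically. Below is a minimal sketch of a recognizer for the syllable grammar, assuming the letters A through N and the digits 0 through 9 as the terminal symbols; the names `CONSONANTS`, `VOWELS`, and `is_syllable` are made up for illustration.

```python
CONSONANTS = set("ABCDEFGHIJKLMN")  # terminals allowed for consonant
VOWELS = set("0123456789")          # terminals allowed for vowel


def is_syllable(text: str) -> bool:
    """Check text against: syllable ::= consonant vowel [consonant]"""
    # A syllable has exactly 2 symbols, or 3 when the optional
    # trailing consonant is present.
    if len(text) not in (2, 3):
        return False
    if text[0] not in CONSONANTS or text[1] not in VOWELS:
        return False
    return len(text) == 2 or text[2] in CONSONANTS


for candidate in ["A3A", "100", "AAA", "K9", "M4", "O7K", "3D", "C3P0"]:
    print(candidate, is_syllable(candidate))
```

Under this grammar only A3A, K9, and M4 are accepted. O7K fails because O is not among the consonants, and C3P0 fails both because it is too long for one syllable and because P is not a consonant here.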
Syllables aren’t enough though… We speak and write in words. What do we need to do to make our program generate possible Korean words?
Korean Words
    start ::= word
    word ::= {syllable}
Read as: an arbitrary number of syllables. What is wrong with this? A word can have 0 syllables??? How can we deal with this?
Korean Words Revised
    start ::= word
    word ::= syllable {syllable}
Read as: one syllable, then 0 to n more. Now add the lines from syllable.
Korean Words
    start ::= word
    word ::= syllable {syllable}
    syllable ::= consonant vowel [consonant]
    consonant ::= “A” | “B” | “C” | “D” | “E” | “F” | “G” | “H” | “I” | “J” | “K” | “L” | “M” | “N”
    vowel ::= “0” | “1” | “2” | “3” | “4” | “5” | “6” | “7” | “8” | “9”
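One way to check candidate words against this grammar is to translate it into a regular expression, which works here because the grammar has no recursion. This is a sketch using Python's `re` module rather than anything shown in the lecture; `WORD` and `is_word` are made-up names.

```python
import re

# word     ::= syllable {syllable}            -> one or more syllables
# syllable ::= consonant vowel [consonant]    -> [A-N][0-9][A-N]?
WORD = re.compile(r"(?:[A-N][0-9][A-N]?)+")


def is_word(text: str) -> bool:
    """True when the whole string is one or more valid syllables."""
    # fullmatch backtracks when needed, so "A3A3" is split as
    # A3 + A3 rather than failing after reading A3A.
    return WORD.fullmatch(text) is not None


for candidate in ["K9", "A3A3", "A3AB4", "A3AA", "C3P0", ""]:
    print(repr(candidate), is_word(candidate))
```

The empty string is rejected, which is exactly the fix made in the revised rule: a word needs at least one syllable.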
The Problem with English…
    start ::= syllable
    syllable ::= consonant vowel [consonant]
    consonant ::= “B” | “C” | “D” | “F” | “G” | “H” | “J” | “K” | “L” | “M” | “N” | “P” | “Q” | “R” | “S” | “T” | “V” | “W” | “X” | “Y” | “Z”
    vowel ::= “A” | “E” | “I” | “O” | “U”
DOG, SHY, BEE, and AXE are all English words, but they do not share one consonant/vowel pattern; QEV fits the pattern above and means nothing. No matter what, we can’t define syllables or words such that we get all of the real “words”, and only those, as results (unless we code all the words in)! This goes to show that semantics are not accounted for in a formal language.
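To see the mismatch concretely, the consonant-vowel-[consonant] rule can be applied to English letters. This is a minimal sketch under the slide's letter classes (A, E, I, O, U as vowels, every other letter as a consonant); `fits_pattern` is a made-up name.

```python
ENGLISH_CONSONANTS = set("BCDFGHJKLMNPQRSTVWXYZ")
ENGLISH_VOWELS = set("AEIOU")


def fits_pattern(text: str) -> bool:
    """Check text against: syllable ::= consonant vowel [consonant]"""
    if len(text) not in (2, 3):
        return False
    if text[0] not in ENGLISH_CONSONANTS or text[1] not in ENGLISH_VOWELS:
        return False
    return len(text) == 2 or text[2] in ENGLISH_CONSONANTS


for candidate in ["DOG", "SHY", "BEE", "AXE", "QEV"]:
    print(candidate, fits_pattern(candidate))
```

The rule accepts DOG and the meaningless QEV, and rejects the real words SHY, BEE, and AXE. The grammar only checks form; it knows nothing about meaning.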
Positive Integers Let’s construct a language for positive integers on the board.
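One possible outcome of the board exercise is sketched below. It is not necessarily the grammar developed in class: it assumes decimal notation and forbids leading zeroes, and the names `nonzero`, `digit`, and `is_positive_integer` are chosen for illustration.

```python
# A possible grammar (an assumption, not the board result):
#   start    ::= positive
#   positive ::= nonzero {digit}
#   nonzero  ::= "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
#   digit    ::= "0" | nonzero
NONZERO = set("123456789")
DIGITS = NONZERO | {"0"}


def is_positive_integer(text: str) -> bool:
    """Check text against: positive ::= nonzero {digit}"""
    # The first symbol must be a nonzero digit, which rules out
    # the empty string, "0", and leading zeroes.
    if not text or text[0] not in NONZERO:
        return False
    # Zero or more further digits may follow.
    return all(symbol in DIGITS for symbol in text[1:])


for candidate in ["7", "120", "0", "007", "12a"]:
    print(candidate, is_positive_integer(candidate))
```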
Why Formal Languages? Mathematical Induction –How many syllables exist in Korean? 14 Consonants * 10 Vowels * (14 Consonants + 1 Blank) = 2,100 –How many syllables in English? Can’t tell without counting one by one
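The count can be confirmed by generating every syllable the grammar allows. This sketch again assumes the letters A through N and the digits 0 through 9 as stand-in terminals; `all_syllables` is a made-up name.

```python
from itertools import product

CONSONANTS = "ABCDEFGHIJKLMN"  # 14 consonant terminals
VOWELS = "0123456789"          # 10 vowel terminals


def all_syllables() -> list[str]:
    """Generate every string derivable from syllable."""
    # The optional final consonant is either blank or one of the 14.
    finals = [""] + list(CONSONANTS)
    return [c + v + f for c, v, f in product(CONSONANTS, VOWELS, finals)]


syllables = all_syllables()
print(len(syllables))  # 14 * 10 * (14 + 1)
```

Because the grammar fixes the form of every syllable, the total follows from multiplication alone, with no need to inspect the syllables one by one.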
Why Formal Languages? Parsing –Conversion to other useful information, e.g. Φ becomes “ph” and 1/2 becomes ½
Why Formal Languages? Graffiti on Palm Pilots –Allows for symbolic recognition –Writing on a small device is easier than typing on one (compare texting on a cell phone)
Why Formal Languages? Sequential Operations on a Computer –Selecting text –Drawing –Menu browsing
Why Formal Languages? Mechanical recognition of commands –Spell checkers –Valid commands at a DOS prompt
To think about. Won’t be collected, but for your own exercise: Create a formal language that will output addition and subtraction questions for positive integers, for example 4+9-7=? –Should allow an arbitrary number of operations –No leading zeroes for a number. Discuss on Friday.