Transformational Grammars and PROSITE Patterns Roland Miezianko CIS 595 - Bioinformatics Prof. Vucetic.


1 Transformational Grammars and PROSITE Patterns Roland Miezianko CIS 595 - Bioinformatics Prof. Vucetic

2 Agenda
Transformational Grammars
– Definition
– The Chomsky Hierarchy
Finite State Automata
– FMR-1 Triplet Repeat Region
– Regular Grammar Example
PROSITE
– Patterns in Regular Grammar Form

3 Assumptions
Biological sequences have so far been treated as one-dimensional strings of independent and uncorrelated symbols.
To understand secondary structure, we need to address the interactions among base pairs.

4 Secondary Structures
The 3-D folding of proteins and nucleic acids involves extensive physical interactions between residues that are not adjacent in primary sequence. [1]
We require a model of secondary structure that reflects the interactions among base pairs.

5 Modeling Strings
General theories for modeling strings of symbols have been developed by computational linguists.
– Chomsky in 1956, 1959
– Interested in how a brain or computer program could algorithmically determine whether a sentence was grammatical or not

6 Transformational Grammars
Transformational grammars consist of:
– Symbols
    Abstract nonterminal symbols
    Terminal symbols
– Rewriting rules (productions)
    A --> B

7 Transformational Grammars, Example
Example grammar
– Two-letter terminal alphabet: {a, b}
– Single nonterminal letter: S
– Three productions: S->aS, S->bS, S->e (e = special blank terminal symbol)
Example derivation in our simple grammar: S->aS->abS->abbS->abb
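
The derivation above can be reproduced with a short Python sketch. This is an illustration only and not part of the original slides: the names `PRODUCTIONS` and `derive` are hypothetical, and the blank terminal e is modeled by simply dropping the trailing S.

```python
# Productions that emit a terminal: S -> aS and S -> bS.
# The third production, S -> e, is applied at the end by erasing S.
PRODUCTIONS = {"a": "aS", "b": "bS"}


def derive(target):
    """Return the successive sentential forms that rewrite S into `target`.

    Raises ValueError if `target` uses a symbol outside the alphabet {a, b}.
    """
    current = "S"
    steps = [current]
    for symbol in target:
        if symbol not in PRODUCTIONS:
            raise ValueError(f"{symbol!r} is not in the terminal alphabet")
        # Rewrite the single trailing nonterminal S with S -> aS or S -> bS.
        current = current[:-1] + PRODUCTIONS[symbol]
        steps.append(current)
    # Apply S -> e: the remaining nonterminal disappears.
    steps.append(current[:-1])
    return steps


if __name__ == "__main__":
    print("->".join(derive("abb")))
```

Since every string over {a, b} has exactly one such derivation, the grammar generates all strings of a's and b's, including the empty string.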

8 Chomsky Hierarchy
Four types of restrictions on a grammar’s productions result in four classes of grammars.
– Regular Grammars
– Context-Free Grammars
– Context-Sensitive Grammars
– Unrestricted Grammars

9 Chomsky Hierarchy
[Figure: nested grammar classes, from innermost to outermost: regular, context-free, context-sensitive, unrestricted]

10 Automata
Each grammar has a corresponding abstract computational device called an automaton.
Grammar              Parsing Automaton
Regular              Finite State
Context-Free         Push-Down
Context-Sensitive    Linear Bounded
Unrestricted         Turing Machine

11 FMR-1 Triplet Repeat Region
The FMR-1 gene sequence contains a CGG triplet that is repeated a number of times.
The number of triplets is highly variable between individuals.
Increased copy number is associated with a genetic disease.

12 FMR-1 Triplet Repeat Region
The FSA matches any string in the “language” that contains the strings:
GCG CTG
GCG CGG CTG
GCG CGG CGG CTG
GCG CGG CGG CGG CGG … CTG
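
A finite state automaton for these strings can be sketched as a transition table in Python. The sketch assumes the language is GCG, followed by zero or more CGG triplets, followed by CTG, as the listed strings suggest. The state numbering and the names `TRANSITIONS`, `ACCEPTING`, and `accepts` are illustrative and not taken from the slides.

```python
# (state, base) -> next state. State 0 is the start state.
TRANSITIONS = {
    (0, "G"): 1, (1, "C"): 2, (2, "G"): 3,  # leading GCG
    (3, "C"): 4,                            # C begins either CGG or CTG
    (4, "G"): 5, (5, "G"): 3,               # CGG repeat, then loop back
    (4, "T"): 6, (6, "G"): 7,               # closing CTG
}
ACCEPTING = {7}


def accepts(sequence):
    """Return True if the whole sequence is GCG (CGG)* CTG.

    Spaces between triplets are ignored and case is normalized.
    """
    state = 0
    for base in sequence.replace(" ", "").upper():
        state = TRANSITIONS.get((state, base))
        if state is None:
            # No transition defined for this base: reject.
            return False
    return state in ACCEPTING


if __name__ == "__main__":
    for s in ("GCG CTG", "GCG CGG CGG CTG", "GCG CGA CTG"):
        print(s, accepts(s))
```

The automaton either accepts or rejects a complete string. Scanning a longer genomic sequence for the repeat region would require running it from each start position, which this sketch does not do.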

13 FMR-1 Triplet Repeat Region

14 Regular Grammar for our Finite State Automaton
Finds any number of copies of CGG

15 PROSITE Patterns
The PROSITE database is an example of a biological application of regular grammars.
– Unlike methods that assign scores to alignments, PROSITE patterns either match a sequence or do not.

16 PROSITE Patterns
A pattern consists of a string of pattern elements separated by dashes and terminated by a period.
– Single letter: that residue
– [ ]: any one of the enclosed letters
– { }: any letter except the enclosed ones
– x: any residue
– x(y): any y residues in a row

17 PROSITE Patterns
RNP-1 Motif
[RK]-G-{EDRKHPCG}-[AGSCI]-[FY]-[LIVA]-x-[FYM].
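
Because PROSITE patterns are regular, they can be translated into ordinary regular expressions. The following Python sketch converts the element types listed on the previous slide into `re` syntax. The function name `prosite_to_regex` and the test peptide are invented for illustration, and the converter covers only the elements described here, not other PROSITE features such as ranges or anchors.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Separate an optional repeat count, e.g. "x(3)" -> ("x", "3").
        match = re.fullmatch(r"(.+?)(?:\((\d+)\))?", element)
        if match is None:
            raise ValueError(f"empty pattern element in {pattern!r}")
        core, count = match.group(1), match.group(2)
        if core in ("x", "X"):
            part = "."                        # any residue
        elif core.startswith("[") and core.endswith("]"):
            part = core                       # any one of the enclosed letters
        elif core.startswith("{") and core.endswith("}"):
            part = "[^" + core[1:-1] + "]"    # anything but the enclosed letters
        elif len(core) == 1 and core.isalpha():
            part = core                       # a single fixed residue
        else:
            raise ValueError(f"unsupported pattern element: {element!r}")
        if count:
            part += "{" + count + "}"         # repeat the element `count` times
        parts.append(part)
    return "".join(parts)


if __name__ == "__main__":
    rnp1 = prosite_to_regex("[RK]-G-{EDRKHPCG}-[AGSCI]-[FY]-[LIVA]-x-[FYM].")
    print(rnp1)
    # Hypothetical peptide, constructed only to exercise the pattern.
    print(bool(re.search(rnp1, "MSRGAAFVTYQ")))
```

As with the pattern itself, the result is a yes-or-no match with no score attached.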

18 Conclusion
Transformational grammars are useful in developing acceptors of different-length sequences and for matching specific multi-sequence regions.
Higher-order grammars in the Chomsky hierarchy are more difficult to program and apply.

19 References
[1] Durbin, R., et al. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998.
[2] Gibson, G. A Primer of Genome Science. Sinauer Associates, Inc. Publishers, 2002.
[3] Mount, D. Bioinformatics: Sequence and Genome Analysis. Cold Spring Harbor Laboratory Press, 2001.
[4] PROSITE Database http://us.expasy.org/prosite/

20 Transformational Grammars and PROSITE Patterns
Questions and Answers

