Dynamic Programming (cont’d) CS 466 Saurabh Sinha.


RNA secondary structure prediction

RNA: RNA is chemically similar to DNA, but it is usually single-stranded, and T (thymine) is replaced by U (uracil). Some RNAs form secondary structures by "pairing up" with themselves, which can change their properties dramatically. [Figure: linear and 3D views of an RNA molecule]

RNA: There is more to RNA than mRNA. RNA can adopt interesting non-linear structures and catalyze reactions. tRNAs (transfer RNAs) are the "adapters" that implement translation.

Secondary structure: Several interesting RNAs have a conserved secondary structure (resulting from base-pairing interactions). Sometimes the function is retained even though the sequence itself is not conserved. Predicting the secondary structure is therefore important for homology detection.

Conserved secondary structure: [Figure: consensus binding site for R17 phage coat protein] N = A/C/G/U, N’ is a base complementary to N, Y is C/U, R is A/G. Source: DEKM

Basics of secondary structure: G-C pairing has three hydrogen bonds (strong); A-U pairing has two (weaker). Base pairs are approximately coplanar. Base pairs are stacked onto other base pairs (arranged side by side), forming "stems".

Secondary structure elements: Loops are single-stranded subsequences bounded by base pairs. Hairpin loop: a loop at the end of a stem. Bulge: single-stranded bases within a stem, on only one side of the stem. Interior loop: single-stranded bases within a stem, on both sides of the stem.

Non-canonical base pairs: G-C and A-U are the canonical base pairs. G-U is also possible, and is almost as stable.

Nesting: Base pairs almost always occur in a nested fashion. If positions i and j are paired, and positions i’ and j’ are paired, then these two base pairs are said to be nested if i < i’ < j’ < j or i’ < i < j < j’. Base pairs that interleave instead (i < i’ < j < j’) form a pseudoknot.

Pseudoknot example: the base pairs (2, 11) and (9, 18) are NOT NESTED, because 2 < 9 < 11 < 18.
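The nesting condition can be checked mechanically. The following is a minimal sketch, not part of the original slides; the function names `is_nested` and `is_crossing` are hypothetical, and pairs are given as 1-based position tuples.

```python
def is_nested(p, q):
    """Return True if one base pair lies strictly inside the other."""
    (i, j), (i2, j2) = sorted(p), sorted(q)
    return (i < i2 < j2 < j) or (i2 < i < j < j2)


def is_crossing(p, q):
    """Return True if the two base pairs interleave (a pseudoknot)."""
    (i, j), (i2, j2) = sorted(p), sorted(q)
    return (i < i2 < j < j2) or (i2 < i < j2 < j)


# The example from the slide: (9, 18) and (2, 11) interleave.
print(is_crossing((9, 18), (2, 11)))  # True
print(is_nested((9, 18), (2, 11)))    # False
```

Two pairs that sit side by side, such as (1, 4) and (6, 9), are neither nested nor crossing under these definitions; they do not form a pseudoknot.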

Pseudoknot problems: Pseudoknots are not handled by the algorithms we shall see. Pseudoknots do occur in many important RNAs, but the total number of pseudoknotted base pairs is typically relatively small.

Secondary structure prediction: Find the secondary structure with the most base pairs. Nussinov’s algorithm is recursive: it finds the best structure for small subsequences, and works its way outwards to larger subsequences.

Nussinov’s algorithm: idea. There are only four possible ways of getting the best structure for subsequence (i,j) from the best structures of the smaller subsequences:
(1) Add unpaired position i onto the best structure for subsequence (i+1,j).
(2) Add unpaired position j onto the best structure for subsequence (i,j-1).
(3) Add the (i,j) pair onto the best structure for subsequence (i+1,j-1).
(4) Combine two optimal substructures (i,k) and (k+1,j).

Nussinov RNA folding algorithm: Given a sequence $s$ of length $L$ with symbols $s_1 \dots s_L$. Let $\delta(i,j) = 1$ if $s_i$ and $s_j$ are a complementary base pair, and 0 otherwise. We recursively calculate scores $g(i,j)$, which are the maximal number of base pairs that can be formed for the subsequence $s_i \dots s_j$. This is dynamic programming.
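As an illustration, the pairing indicator can be written as a small function. This is a sketch under assumptions: the name `delta`, the 1-based indexing (chosen to match the slides), and the optional `allow_wobble` flag for G-U pairs are all choices made here, not something the slides specify.

```python
# Canonical Watson-Crick pairs, plus the G-U pair mentioned earlier.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def delta(s, i, j, allow_wobble=False):
    """Return 1 if s_i and s_j can pair, else 0 (i and j are 1-based)."""
    pair = (s[i - 1], s[j - 1])
    if pair in CANONICAL:
        return 1
    if allow_wobble and pair in WOBBLE:
        return 1
    return 0


s = "GGGAAAUCC"
print(delta(s, 1, 9))  # G with C: 1
print(delta(s, 1, 2))  # G with G: 0
```

Whether G-U should count is a modeling decision; the flag defaults to off so that only complementary pairs score.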

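Recursion. Initialization: $g(i,i-1) = 0$ and $g(i,i) = 0$. Then, starting with all subsequences of length 2 and working up to length $L$:
$$
g(i,j) = \max
\begin{cases}
g(i+1,j) \\
g(i,j-1) \\
g(i+1,j-1) + \delta(i,j) \\
\max_{i<k<j} \left[ g(i,k) + g(k+1,j) \right]
\end{cases}
$$
Running time: $O(n^2)$? No. The table has $O(n^2)$ entries, but the last case scans $O(n)$ split points $k$ per entry, so the total is $O(n^3)$.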
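The fill stage can be sketched in Python as follows. This is a minimal illustration, not the course's reference code: the names `can_pair` and `nussinov_fill` are hypothetical, indices are 0-based rather than the 1-based indices of the slides, and only G-C and A-U pairs are scored.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def can_pair(a, b):
    """Pairing indicator: 1 for a complementary pair, else 0."""
    return 1 if (a, b) in PAIRS else 0


def nussinov_fill(s):
    """Return the table g, where g[i][j] is the maximum number of
    base pairs in s[i..j] (0-based, inclusive)."""
    L = len(s)
    # All zeros covers the initialization g(i,i) = 0 and g(i,i-1) = 0.
    g = [[0] * L for _ in range(L)]
    for span in range(1, L):          # span = j - i, shortest first
        for i in range(L - span):
            j = i + span
            best = max(
                g[i + 1][j],                              # case 1: i unpaired
                g[i][j - 1],                              # case 2: j unpaired
                g[i + 1][j - 1] + can_pair(s[i], s[j]),   # case 3: i pairs with j
            )
            for k in range(i + 1, j):                     # case 4: bifurcation
                best = max(best, g[i][k] + g[k + 1][j])
            g[i][j] = best
    return g


g = nussinov_fill("GGGAAAUCC")
print(g[0][-1])  # maximum number of base pairs for the whole sequence
```

The three nested loops (span, i, k) make the cubic running time visible. Note that this scoring allows pairs between adjacent positions, as the recursion on the slide does.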

Traceback: Is it the same as in sequence alignment? Not quite. An optimal sequence alignment is a linear path in the dynamic programming table, whereas an optimal secondary structure can have “bifurcations”. The traceback therefore uses a pushdown stack.

Traceback
Push (1,L) onto stack
Repeat until stack is empty:
    pop (i,j)
    if i >= j: continue
    else if g(i+1,j) = g(i,j): push (i+1,j)
    else if g(i,j-1) = g(i,j): push (i,j-1)
    else if g(i+1,j-1) + δ(i,j) = g(i,j): record (i,j) base pair; push (i+1,j-1)
    else for k = i+1 to j-1:
        if g(i,k) + g(k+1,j) = g(i,j): push (k+1,j); push (i,k); break (for loop)
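A runnable version of the traceback might look like the sketch below. It repeats a compact fill step so that the block is self-contained. The function names are hypothetical, the table uses 0-based indices internally, and recorded pairs are converted back to the 1-based positions used on the slides.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def can_pair(a, b):
    return 1 if (a, b) in PAIRS else 0


def nussinov_fill(s):
    """Fill g[i][j] = max base pairs in s[i..j] (0-based, inclusive)."""
    L = len(s)
    g = [[0] * L for _ in range(L)]
    for span in range(1, L):
        for i in range(L - span):
            j = i + span
            best = max(g[i + 1][j], g[i][j - 1],
                       g[i + 1][j - 1] + can_pair(s[i], s[j]))
            for k in range(i + 1, j):
                best = max(best, g[i][k] + g[k + 1][j])
            g[i][j] = best
    return g


def nussinov_traceback(s, g):
    """Recover one optimal set of base pairs using an explicit stack."""
    pairs = []
    stack = [(0, len(s) - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        if g[i + 1][j] == g[i][j]:            # i left unpaired
            stack.append((i + 1, j))
        elif g[i][j - 1] == g[i][j]:          # j left unpaired
            stack.append((i, j - 1))
        elif g[i + 1][j - 1] + can_pair(s[i], s[j]) == g[i][j]:
            pairs.append((i + 1, j + 1))      # report 1-based positions
            stack.append((i + 1, j - 1))
        else:                                 # bifurcation: split at some k
            for k in range(i + 1, j):
                if g[i][k] + g[k + 1][j] == g[i][j]:
                    stack.append((k + 1, j))
                    stack.append((i, k))
                    break
    return pairs


seq = "GGGAAAUCC"
table = nussinov_fill(seq)
print(nussinov_traceback(seq, table))
```

Several structures can share the maximum score; this traceback returns just one of them, determined by the order in which the cases are tested.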