For Prediction of microRNA Genes Vertebrate MicroRNA Genes Lee P. Lim, et al. SCIENCE 2003 The microRNAs of Caenorhabditis elegans Lee P. Lim, et al. GENES.

1 For Prediction of microRNA Genes. Vertebrate MicroRNA Genes, Lee P. Lim, et al., SCIENCE 2003. The microRNAs of Caenorhabditis elegans, Lee P. Lim, et al., GENES & DEVELOPMENT 2003. Presented by Nam jin-wu

2 Direction of research: Learning of RNA Structural Grammar using Genetic Programming, KISS 2003 (only structure); Learning of Structure & Words, CBGI 2003 (Deadline 2003. 5. 1); Prediction of miRNA site in precursor sequence, Wha-Jin Lee; Confirmation of hypothetical miRNA with experiments, Dr Kim; Prediction of promoter region of miRNA; Prediction of miRNA target

3 Why miRNA? rRNA miRNA RNAs have active roles in gene expression !!! – “ miRNA ” RNAs have passive roles in gene expression ??? arrest

4 miRNAs generated by biogenesis of precursor: polycistronic structure in the miRNA pre-precursor; the miRNA precursor is a ~70 nt precursor with a hairpin fold

5 miRNAs generated by biogenesis of precursor

6 miRNA? A ~20 nt miRNA molecule produced when Dicer cuts the stem loop

7 How did they discover miRNA? P : predicted with BLAST C : Cloned N : Northern blot confirmed

8 Are all small RNAs negative regulators? Positive regulation by small RNAs.

9 Vertebrate MicroRNA Genes, Lee P. Lim, et al., SCIENCE 2003

10

11 1 2 3 4 5

12

13

14

15 Problems of MiRscan: predicting miRNAs requires highly homologous miRNA sequences from another species. Without such homologs, novel miRNAs with new features are difficult to detect, so common structural learning is needed …

16 Distant sequence and structure similarity
>gb|AE002602|AE002602.trna11-AlaAGC (12229502-12229574) Ala (AGC) 73 bp Sc: 59.84
GGGGATGTAGCTCAGATGGTAGAGCGCTCGCTTAGCATGTGAGAGGTACGGGGATCGATG
CCCCGCATCTCCA
>gb|AE002708|AE002708.trna16-AlaAGC (9890622-9890694) Ala (AGC) 73 bp Sc: 63.54
GGGGATGTAGCTCAGATGGTAGAGCGCTCGCTTAGCATGTGAGAGGTACGGGGATCGATA
CCCCGCATCTCCA
>gb|AE002575|AE002575.trna6-AlaCGC (4289242-4289313) Ala (CGC) 72 bp Sc: 76.51
GGGGACGTAGCTCAGTGGTAGAGCGCTCGCTTCGCATGTGAGAAGTCCCGGGTTCAAACC
CCGGCGTCTCCA
>gb|AE002787|AE002787.trna12-AlaTGC (3350803-3350874) Ala (TGC) 72 bp Sc: 75.92
GGGGATGTAGCTCAGTGGTAGAGCGCTCGCTTTGCATGTGAGAGGCCCCGGGTTCGATCC
CCGGCATCTCCA
>gb|AE002708|AE002708.trna66-ArgACG (425602-425530) Arg (ACG) 73 bp Sc: 73.12
GGTCCTGTGGCGCAATGGATAACGCGTCTGACTACGGATCAGAAGATTCCAGGTTCGACT
CCTGGCAGGATCG
>gb|AE002769|AE002769.trna10-ArgACG (114601-114529) Arg (ACG) 73 bp Sc: 73.12
GGTCCTGTGGCGCAATGGATAACGCGTCTGACTACGGATCAGAAGATTCCAGGTTCGACT
CCTGGCAGGATCG
>gb|AE002602|AE002602.trna11-AlaAGC (12229502-12229574) Ala (AGC) 73 bp Sc: 59.
GGGGATGTAGCTCAGATGGTAGAGCGCTCGCTTAGC
ATGTGAGAGGTACGGGGATCGATGCCCCGCATC
>gb|AE002708|AE002708.trna16-AlaAGC (9890622-9890694) Ala (AGC) 73 bp Sc: 63.54
GGGGATGTAGCTCAGATGGTAGAGCGCTCGCTTAGC
ATGTGAGAGGTACGGGGATCGATACCCCGCATC
[Tree diagram: node labels f1 f2 f1 f2, f1 f2, root]
h5 (minlen=2, maxlen=3) h5 (len=5, mispair=1) ss (len=2) h3 ss (len=4) h5 (minlen=3, maxlen=5) ss (len=3) h3 ss
Discovering using PSSM
Discovering with Grammar

17 Distant sequence and structure similarity: small RNA precursors have low sequence similarity but high structural similarity, so structural learning is important

18 Schematic Overview: First Screened Data (no false negative); Learning of RNA Structural Grammar using Genetic Programming; Learning both sequence and structure with a GA or others; Prediction of microRNA genes

19 Learning of RNA Structural Grammar using Genetic Programming Genetic programming Positive Data Set (tRNA || miRNA..) Negative Data Set (no tRNA || no miRNA) RNAmotif Grammar Parser (Counter of Hit) Evaluation Step Data set

20 Learning of RNA Structural Grammar using Genetic Programming descr H5 (minlen=8, maxlen=16, mispair=1) ss ( len=7 ) H3
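To make the descriptor on this slide concrete, here is a minimal Python sketch of what matching "H5 (minlen=8, maxlen=16, mispair=1) ss (len=7) H3" could mean: a 5' stem of 8 to 16 nt, a 7 nt loop, and a 3' stem that pairs with the 5' stem with at most one mispair. This is not RNAmotif itself. The function names, the restriction to Watson-Crick pairs, and the exhaustive window scan are all assumptions made for illustration.

```python
# Watson-Crick pairs only; counting G-U as a mispair is an assumption of this sketch.
WC_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def matches_hairpin(seq, stem_len, loop_len=7, max_mispair=1):
    """Return True if seq is exactly 5' stem + loop + 3' stem with few mispairs."""
    if len(seq) != 2 * stem_len + loop_len:
        return False
    five = seq[:stem_len]
    three = seq[stem_len + loop_len:]
    # Base i of the 5' strand pairs with base i counted from the end of the 3' strand.
    mispairs = sum((a, b) not in WC_PAIRS for a, b in zip(five, reversed(three)))
    return mispairs <= max_mispair


def find_hairpins(seq, minlen=8, maxlen=16, loop_len=7, max_mispair=1):
    """Scan seq and return (start, stem_len) for every window that matches."""
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA input
    hits = []
    for stem_len in range(minlen, maxlen + 1):
        width = 2 * stem_len + loop_len
        for start in range(len(seq) - width + 1):
            window = seq[start:start + width]
            if matches_hairpin(window, stem_len, loop_len, max_mispair):
                hits.append((start, stem_len))
    return hits
```

For example, `find_hairpins("GCGCGCAU" + "AAAAAAA" + "AUGCGCGC")` reports one hairpin with an 8 nt stem starting at position 0.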

21 Structural conversion of Functional Tree
Fun    Grammar             Variable
f1     h5 (f1 or f2) h3    minlen/maxlen, len, mispair
f2     ss                  minlen/maxlen, len
root   descr
[Tree diagram: node labels f1 f2 f1 f2, f1 f2, root]
h5 (minlen=2, maxlen=3) h5 (len=5, mispair=1) ss (len=2) h3 ss (len=4) h5 (minlen=3, maxlen=5) ss (len=3) h3 ss
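The mapping on this slide (f1 to an h5 ... h3 helix, f2 to a single strand ss, root to descr) can be sketched as a small recursive conversion. The `Node` class, the `to_grammar` function, and the flat one-line output format are hypothetical choices for illustration; the slides do not show the actual data structure used.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One node of the functional tree: fun is 'root', 'f1' or 'f2'."""
    fun: str
    params: Dict[str, int] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


def fmt_params(params):
    """Render {'len': 7} as ' (len=7)'; an empty dict renders as ''."""
    if not params:
        return ""
    return " (" + ", ".join(f"{k}={v}" for k, v in params.items()) + ")"


def to_grammar(node):
    """Convert a functional tree into a flat descriptor string."""
    if node.fun == "f2":  # single-stranded region, no children
        return "ss" + fmt_params(node.params)
    inner = [to_grammar(child) for child in node.children]
    if node.fun == "f1":  # helix: children sit between the h5 and h3 strands
        return " ".join(["h5" + fmt_params(node.params), *inner, "h3"])
    if node.fun == "root":
        return " ".join(["descr", *inner])
    raise ValueError(f"unknown function symbol: {node.fun}")


# A simple hairpin tree: root -> f1 (stem) -> f2 (loop).
tree = Node("root", children=[
    Node("f1", {"minlen": 8, "maxlen": 16, "mispair": 1},
         [Node("f2", {"len": 7})]),
])
print(to_grammar(tree))
```

Because every f1 node emits its own h5 and h3, any tree built from these symbols yields a descriptor with balanced helices.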

22 RNA Structural Learning Algorithms using GP
1. Create the initial population of trees
2. Convert each tree to a grammar
3. Evaluate with RNAmotif
3.1 Local search from generation X
4. Repeat the next steps until all new individuals are created
4.1 Add the best-fitness tree to the new population
4.2 Select the top 50%
4.3 Vary the selected trees
4.4 Add the varied trees to the new population
5. If the end condition is reached, stop; otherwise replace the old population with the new population and repeat steps 2~5
6. Local search with 7-mer words
C -> C++
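The generational loop on this slide (keep the best tree, select the top 50%, vary, replace the population) can be sketched generically in Python. The individuals, the `fitness` callable, and the `vary` operator are placeholders supplied by the caller; in the presented work the fitness would come from RNAmotif hits on the data sets, which this sketch does not implement. A fixed number of generations stands in for the unspecified end condition, and local search is omitted.

```python
import random


def evolve(init_population, fitness, vary, generations, rng):
    """Elitist GP-style loop: returns the best individual and per-generation best fitness."""
    population = list(init_population)
    history = []
    for _ in range(generations):
        # Steps 2-3: evaluate and rank every individual (higher fitness is better).
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        new_population = [ranked[0]]                   # 4.1 keep the best tree
        parents = ranked[:max(1, len(ranked) // 2)]    # 4.2 select the top 50%
        while len(new_population) < len(population):   # 4.3-4.4 vary and add
            new_population.append(vary(rng.choice(parents), rng))
        population = new_population                    # 5. replace the old population
    return max(population, key=fitness), history


# Toy usage: integers stand in for trees, the target value 42 for a good grammar.
rng = random.Random(0)
start = [rng.randint(0, 100) for _ in range(20)]
best, history = evolve(
    start,
    fitness=lambda x: -abs(x - 42),
    vary=lambda x, r: x + r.choice([-1, 1]),
    generations=30,
    rng=rng,
)
```

Carrying the best individual over unchanged means the best fitness recorded in `history` can never decrease from one generation to the next.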

23 Fitness
Fitness = spC*Specificity + stC*Sensitivity + Complexity (1)
spC + stC = 1 (spC = 0.9, stC = 0.1) (2)
Specificity = TP/(TP+FP) (3)
Sensitivity = TP/(TP+FN) (4)
Complexity = 1/(NS+PS)^2 × (iComp/ibestComp) (5)
iComp = TreeDepth×10 + NodeNum (6)
Fitness = (NH+1)/(PH+1)*(NS/PS) - Complexity
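As a worked example of formulas (1) to (6), the following Python sketch computes the weighted fitness from hit counts and tree size. It follows the slide's definitions literally, including TP/(TP+FP) for "Specificity" (a quantity more commonly called precision) and the plus sign on Complexity in (1). The slide does not define NS, PS, or ibestComp; reading NS and PS as the sizes of the negative and positive data sets and ibestComp as the iComp of the current best tree is an assumption, as are all function and parameter names.

```python
def specificity(tp, fp):
    """Formula (3) as given on the slide: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def sensitivity(tp, fn):
    """Formula (4): TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def i_comp(tree_depth, node_num):
    """Formula (6): iComp = TreeDepth * 10 + NodeNum."""
    return tree_depth * 10 + node_num


def complexity(tree_depth, node_num, best_i_comp, ns, ps):
    """Formula (5); ns and ps are assumed to be the data-set sizes."""
    return (1.0 / (ns + ps) ** 2) * (i_comp(tree_depth, node_num) / best_i_comp)


def fitness(tp, fp, fn, tree_depth, node_num, best_i_comp, ns, ps,
            sp_c=0.9, st_c=0.1):
    """Formula (1) with the weights from formula (2)."""
    return (sp_c * specificity(tp, fp)
            + st_c * sensitivity(tp, fn)
            + complexity(tree_depth, node_num, best_i_comp, ns, ps))


# Example: 8 true positives, 2 false positives, 2 false negatives,
# a tree of depth 3 with 5 nodes, and 50 sequences in each data set.
score = fitness(tp=8, fp=2, fn=2, tree_depth=3, node_num=5,
                best_i_comp=35, ns=50, ps=50)
```

With data sets of this size the complexity term is on the order of 1e-4, so under these assumptions it mainly separates trees that score equally on the two weighted terms.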

24 Result
[Plot: average fitness (red), best fitness (blue)]
Best Fitness Grammar: descr h5(len=1) h5(minlen=4, maxlen=17) ss(minlen=4, maxlen=21) h3 ss(minlen=4, maxlen=24) h3 h5(mispair=5) h5(minlen=4, maxlen=21) ss(len=7) h3 ss(len=13) h3

25 Additional Study
Local search with words (almost done)
  Local search of the top 5 trees of each generation
  Make 7-tuple words from the training sequences
  Multiplication of the best 100 trees over all generations to 10000 trees including words
Learning both sequence and structure with GA (have idea)
Application of miRNA prediction
  Bioinformatics application notes
miRNA site prediction in precursor sequence
  Sequence-specific Dicer site
  Structure-specific Dicer site
miRNA promoter prediction (controversial)
  Promoter prediction of polycistronic gene
miRNA target prediction
  3' UTR of mRNA has low similarity to miRNA
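The "7-tuple words" step can be illustrated by enumerating every overlapping 7-mer in a set of training sequences and counting how often each occurs. How the words are then attached to the trees is not described on the slides, so this Python sketch covers only the word extraction; the function name and the use of overlapping windows are assumptions.

```python
from collections import Counter


def kmer_counts(sequences, k=7):
    """Count every overlapping k-mer across the given sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # A sequence shorter than k contributes no words.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


# Toy usage with two short sequences that share their first seven bases.
words = kmer_counts(["GGGGATGTAG", "GGGGATGAAA"])
shared = [w for w, n in words.items() if n > 1]
```

Here `shared` contains only "GGGGATG", the single 7-mer present in both toy sequences.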

26 Learning both sequence and structure with a GA

27

28 miRNAs may be located in introns. Previously identified miRNAs were found within annotated introns (Lau et al. 2001). 10 of 12 known C. elegans miRNAs predicted to be in introns are in the same orientation as the predicted mRNAs, suggesting that some of these miRNAs are not transcribed from their own promoters but instead derive from the excised pre-mRNA introns (Lee P. Lim, et al. 2003, Genes & Development)
