17.02.2005 Fast and effective prediction of miRNA targets Marc Rehmsmeier CeBiTec, Bielefeld University, Germany Junior Research Group Bioinformatics of.


1 17.02.2005 Fast and effective prediction of miRNA targets Marc Rehmsmeier CeBiTec, Bielefeld University, Germany Junior Research Group Bioinformatics of Regulation

2 Small interfering RNAs versus small temporal RNAs Hannon. Nature. 418:244-251, 2002.

3 miRNA/target duplexes Grosshans and Slack. The Journal of Cell Biology, 156(1):17-21, 2002.

4 A direct approach Given a miRNA and a potential target: What are the energetically most favourable binding sites? Calculation of multiple mfe secondary structure duplexes

5 The language of RNA duplexes
    hybrid            = nil ><< tt (region,region) |||
                        unpaired_left_top |||
                        closed ... h
    unpaired_left_top = ult <<< tt (base,empty) ~~~ unpaired_left_top |||
                        unpaired_left_bot ... h
    unpaired_left_bot = ulb <<< tt (empty,base) ~~~ unpaired_left_bot |||
                        edangle ... h
    edangle           = eds <<< tt (base,base) ~~~ closed |||
                        edt <<< tt (base,emptybase) ~~~ closed |||
                        edb <<< tt (emptybase,base) ~~~ closed ... h

6 The language of RNA duplexes
    closed          = stacking_region |||
                      bulge_top |||
                      bulge_bottom |||
                      internal_loop |||
                      end_loop ... h
    stacking_region = sr <<< basepair ~~~ closed
    bulge_top       = (bt <<< basepair ~~~ tt (uregion,empty)) `topbound` closed
    bulge_bottom    = (bb <<< basepair ~~~ tt (empty,uregion)) `botbound` closed
    internal_loop   = (il <<< basepair ~~~ tt (uregion,uregion)) `symbound` closed
    end_loop        = el <<< basepair ~~~ tt (region,region)

7 Dynamic Programming recurrences Time/memory complexity: linear in target length
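
To illustrate why such recurrences scale linearly with the target, here is a minimal sketch in Python. It is not RNAhybrid's energy model: as a stand-in it simply maximises the number of base pairs (Watson-Crick and G-U) in an antiparallel duplex, and the name `max_duplex_pairs` is hypothetical. For a miRNA of fixed length m, each target position costs m table updates, so the running time grows linearly with the target length. This sketch keeps only two rows of the table in memory.

```python
# Toy duplex DP: maximise the number of base pairs, NOT the free energy.
# Assumption: allowed pairs are Watson-Crick plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_duplex_pairs(mirna, target):
    """Return the maximal number of pairs in an antiparallel duplex.

    Both sequences are given 5'->3'. Reversing the miRNA turns the
    antiparallel pairing into an order-preserving one, so an LCS-style
    recurrence applies. Unpaired bases (bulges, loops) cost nothing here.
    """
    q = mirna[::-1]
    prev = [0] * (len(q) + 1)      # row for the previous target position
    for t in target:               # one pass over the target
        cur = [0]
        for j, m in enumerate(q, start=1):
            best = max(prev[j], cur[j - 1])        # leave t or m unpaired
            if (m, t) in PAIRS:
                best = max(best, prev[j - 1] + 1)  # pair m with t
            cur.append(best)
        prev = cur
    return prev[-1]
```

For example, `max_duplex_pairs("UGAG", "CUCA")` pairs all four miRNA bases with their reverse complement.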

8 let-7/lin-41 binding sites: position 688, mfe -28.0 kcal/mol; position 737, mfe -29.0 kcal/mol

9 Requirements: for prediction of miRNA targets in large databases we need a fast program and good statistics.

10 Length normalisation of minimum free energies

11 p-values of individual binding sites
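
A minimal sketch of how a length-normalised energy could be turned into a p-value. Everything specific here is an assumption rather than something stated on the slides: the normalisation by log(m·n), the extreme value (Gumbel) form of the score distribution, and the `location` and `scale` parameters, which would have to be estimated from random sequences.

```python
import math


def normalised_energy(mfe, mirna_len, target_len):
    """Negated mfe divided by log(m * n) (assumed form of the normalisation).

    Negating makes stronger (more negative) energies give larger scores.
    """
    return -mfe / math.log(mirna_len * target_len)


def evd_pvalue(score, location, scale):
    """P(X >= score) for a Gumbel-distributed X.

    `location` and `scale` are hypothetical, externally fitted parameters.
    """
    # 1 - exp(-y), computed with expm1 for accuracy when y is tiny
    return -math.expm1(-math.exp(-(score - location) / scale))
```

With made-up parameters, `evd_pvalue(normalised_energy(-29.0, 21, 1000), location=1.9, scale=0.2)` gives the probability of seeing a score at least that high by chance under the assumed distribution.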

12 Poisson statistics of multiple binding sites Probability of k binding sites: with For small p-values: The probability of at least k binding sites:
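
The quantities named on this slide can be sketched as follows, assuming the standard Poisson model: the probability of k sites is e^(-λ) λ^k / k!, and the probability of at least k sites is one minus the summed probabilities of 0 to k-1 sites. Taking λ = -ln(1 - p), with p the p-value of a single site, is an assumption of this sketch; for small p it gives λ ≈ p. The function names are hypothetical.

```python
import math


def rate_from_pvalue(p):
    """Poisson rate implied by P(at least one site) = p (assumption)."""
    return -math.log1p(-p)  # -ln(1 - p), close to p when p is small


def poisson_pmf(k, lam):
    """Probability of exactly k binding sites."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def poisson_at_least(k, lam):
    """Probability of at least k binding sites."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))
```

For a single-site p-value of 0.01, `poisson_at_least(2, rate_from_pvalue(0.01))` is much smaller than 0.01, which is why several sites in one target count as stronger evidence than one.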

13 Comparative analysis of orthologous targets

14 Multi-species p-values Poisson p-values: multi-species p-value: General case: k species

15 A dependence problem: we should see a p-value as often as it says (blue curve), but we don't (red curve).

16 let-7b/NME4 (human/mouse) binding sites
    -GGCTCAAGCTGCCCTTACCACCCCATCCCCCACGCAGGACCAACTACCTCCGTCAGCAAGAACCCAAGCCCACATCCAAACCTGCCTGTCCCAAACCAC
    GGGCTTGCACTGCCTTCTGCACTTCAGGTCT-ACCCATGACCTACTACCTCTGTCAACAAGAAGTCAAGCCCCCATGC---TTCCCATGTCCCCAAAC--

    TTACTTCCCTGTTCACCTCTGCCCCACCCCAGCCCAGAGGAGTTTGAGCCACCAACTTCAGTGCCTTTCTGTACCCCAAGCCAGCACAAGATTGGACCAA
    -CACTCCCTACTCCCGCTCTACCCAACTCCAGCCCAGGGGAGTCTAAGCCTCAACTCTATGTGCCTTTTTGTATCCTAAGTCAATACAATATTGGACCAT

    TCCTTTTTGCACCAAAGTGCCGGACAACCTTTGTGGTGGGGGGGGGTCTTCACATTATCATAACCTCTCCTCTAAAGGGGAGGCATTAAAATTCACTGTG
    GTCCTTGTGTACAAAAGTGCCAGACAACCTTTG--------GGGCATTGTCA-AAGGTGACTTCACCTGCCTCAAAGGAGAGACATTAAAATTT--TATG

    CCCAGCACATGGGTGGTACACTAATTATGACTTCCCCCAGCTCTGAGGTAGAAATGACGCCTTTATGCAAGTTGTAAGGAGTTGAACAGTAAAGAGGAAG
    CTTAAAAT--------------------------------------------------------------------------------------------
  Multi-species p-value with k = 2: 1.5e-08
  Multi-species p-value with k = 1.1: 5.0e-05
  k = 1.1 is the effective k
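
The two p-values quoted for this example are numerically consistent with raising the largest per-species p-value to the power k, which is what one would expect for k independent orthologs. A minimal sketch under that assumption follows; the function name and the per-species values in the usage note are illustrative, not taken from RNAhybrid.

```python
def multi_species_pvalue(pvalues, k=None):
    """Joint p-value for orthologous targets: (max p) ** k.

    Assumption: with k independent species, the chance that all k
    p-values are at most p is p ** k. Passing a non-integer effective k
    (between 1 and the number of species) discounts for the dependence
    between closely related sequences.
    """
    if k is None:
        k = len(pvalues)   # treat the species as fully independent
    return max(pvalues) ** k
```

If the larger of the two per-species p-values is the square root of 1.5e-08 (about 1.2e-04), then k = 2 returns 1.5e-08, while k = 1.1 returns about 5.0e-05, matching the figures above.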

17 Effective number of orthologous targets

18 Requirements: for prediction of miRNA targets in large databases we need a fast program and good statistics.

19 True and false positives and negatives
                             Positives   Negatives
    Classify as Positives    TP          FP
    Classify as Negatives    FN          TN

20 Sensitivity and specificity: p-values control specificity.
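
For reference, a short sketch of the two measures using their standard definitions. If the p-values are accurate, calling every candidate with a p-value of at most α a positive keeps the false positive rate near α, so specificity stays near 1 - α. The counts in the usage note are made up.

```python
def sensitivity(tp, fn):
    """Fraction of true targets that are classified as positives."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-targets that are classified as negatives."""
    return tn / (tn + fp)
```

With 8 true positives and 2 false negatives, sensitivity is 0.8; with 95 true negatives and 5 false positives, specificity is 0.95.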

21 RNAhybrid target prediction workflow: target db, miRNA registry, individual p-values, multi-species p-values, Poisson p-values
  bantam
    target gene   E-value       #sites Dm   #sites Dp   #sites Ag
    CG13906       0.000141369   2           1           1
    CG3629        0.029351532   2           2           0
    CG17136       0.047489474   2           0           1
    CG5123        0.048580874   2           2           0
    CG13761       0.120263377   0           2           2
    CG11624       0.605310610   0           3           0
    CG1142        0.677123716   0           0           1
    CG13333       0.714171923   2           0           0

22 Prediction of Drosophila miRNA targets: 78 miRNAs; 28,645 3' UTRs (1/3 from D. mel, 1/3 from D. pseu, 1/3 from A. gamb)

23 Bantam hits
    target              E-value   #sites Dm   #sites Dp   #sites Ag
    nervous fingers 1   0.00014   2           1           1
    Distal-less         0.029     2           2           0
    Wrinkled (Hid)      0.049     2           2           0

24 miR-7 hits
    target                         E-value    #sites Dm   #sites Dp   #sites Ag
    CG8394                         0.000095   2           3           3
    Twin of m4                     0.00014    2           2           0
    E(spl) region transcript m3    0.0083     1           1           0
    E(spl) region transcript m     0.094      1           2           0
    CG7342                         0.21       1           1           0
    CG10444                        0.27       1           1           1
    Him                            0.30       1           2           0
    CG11132                        0.86       1           1           0
    Arginine methyltransferase 1   0.87       1           1           0

25 miR-2 hits (miR-2a, miR-2b, miR-2c), plus a number of others
    target   E-value   #sites   E-value   #sites   E-value   #sites
    grim     0.014     1 1 0    0.071     1 2 0    0.045     1 1 0
    reaper   0.00061   1 1 0    0.11      1 1 0    0.0095    1 1 0
    sickle   0.054     2 2 0

26 RNAhybrid functionality: length normalisation, Poisson statistics, web server, seed/loop constraints, miRNA-specific statistics, effective k, comparative analysis, multiple binding sites

27 miRNA target selection: surprise, rank-based p-values, E-values, user guidance. p-values indicate not only biochemical possibility, but also biological function.

28 Acknowledgements: Peter Steffen, Robert Giegerich, Jan Krüger, Matthias Höchsmann, Alexander Stark, Julius Brennecke, Stephen M. Cohen, Sven Rahmann, Gregor Obernosterer, Robert Heinen, Leonie Ringrose

29 References Rehmsmeier M, Steffen P, Höchsmann M and Giegerich R. Fast and effective prediction of microRNA/target duplexes. RNA, 10:1507-1517, 2004. bibiserv.techfak.uni-bielefeld.de/rnahybrid

