Download presentation
Presentation is loading. Please wait.
Published bySheryl Watson Modified over 9 years ago
1
Identifying and classifying functional small RNAs from pine Ryan Morin BC Genome Sciences Centre (presenting research conducted in the lab of Dr. Peter Unrau)
2
Small RNAs in the land plants
3
MicroRNA Biogenesis MicroRNA (miRNA) transcripts can be mono- or polycistronic Each miRNA derives from its own hairpin foldback region of the primary miRNA transcript A number of processing steps are required to form the mature miRNA sequence The final mature miRNA is bound to a complex of proteins known as RISC
4
MicroRNAs are post-transcriptional gene regulators MiRNAs are short (18-24nt) RNAs that are derived from larger precursor transcripts Mature miRNAs hybridize to near-complementary sites within their target messenger RNAs Messenger RNAs hybridized to microRNAs are enzymatically degraded, preventing translation
5
siRNAs can be post- or pre- transcriptional gene regulators siRNA precursors are also processed by dicer (plants have multiple Dicer paralogs) Many active mature siRNAs are produced from the primary duplex Mature siRNAs can elicit post-transcriptional regulation by a microRNA-like mechanism SiRNAs can also direct methylation of histones or DNA, altering chromatin state (so-called heterochromatin siRNAs)
6
Purification, adaptor ligation and amplification of small RNAs smRNA 5’ RNA adaptor 3’ DNA adaptor ligate Random 10mer Reverse PCR Primer complementary sequence Extract sequences in desired size range RT-PCR & PCR amplification Gel Purify (remove un-ligated sequences) 454 sequencing and amplification primer sequences Library Identification Tag (8mer) Forward PCR Primer sequence 454 sequencing and amplification primer sequences Extract Total RNA biotin Pine Rice
7
Rice and Pine sequences produced by 454 SBS technology 142,493 reads from pine, 11,436 from rice Small RNA sequence was extracted by identifying flanking adaptor sequences Random tag sequence was used to separate PCR- amplified constructs from multiple occurrences of the same small RNA species There were 58,466 unique sequences from pine and 8,615 from rice Most common sequence was observed 1168 times in pine and 89 times in rice
8
Positional annotation of small RNAs and identification of conserved pine small RNAs Alignment of both of pine small RNA sequences to the rice genome reveals perfectly conserved pine small RNAs Pine sequences aligning to rRNA or tRNA genes in the rice genome are likely degradation products of pine rRNA/tRNA Sequences aligning within annotated regions were annotated as rRNA, tRNA, miRNA, repeat-derived siRNA etc.
9
Positionally-annotated small RNAs show distinct trends in sequence length
10
1)No more than 4 unpaired nt total (or >2 consecutive unpaired nt) within miRNA/miRNA* region 2) The length of the hairpin must be at least 60nt 3) No more than one asymmetrically unpaired nucleotide 4) The pairing must extend 4nt beyond the mature miRNA Pre-miRNA structure: the ‘gold standard’ for miRNA annotation To classify a sequence as a miRNA, one usually has to be convinced it forms a suitable pre-miRNA structure Knowing the sequence of the mature miRNA helps to separate random hairpins from real pre-miRNAs Rules enforced by the Bartel lab (MIRcheck)
11
Finding clusters of related sequences Align each pair of sequences in the database including all known miRNAs (mature miRNA sequences from every plant species in miRBase) Link sequences in the database if they align over >18 nt with < 4 mismatches Treating these relationships as a graph, extract all the connected components 1 1 1 2 2 2 2 1 1 1 1 1 1 1 3 1
12
Putting clusters to use Part 1: Guilt by association Known miRNA families generally partition into the same cluster (along with un-annotated sequences) If they are low-quality (i.e. large/diverse) clusters are ignored sequences can be readily annotated as conserved miRNAs if they share a cluster with a known miRNA
13
Leveraging EST/cDNA sequences (miRNA annotation Method 1) Pine small RNAs were also aligned to the publicly available pine EST sequences miRNAs align to their ‘parent’ transcripts in a distinctive way We fished for un-annotated small RNA sequences with miRNA-like alignments to pine ESTs Folding and validation of these ESTs revealed 19 putative novel pine miRNA families miRNA sequences miRNA* sequences
14
Putting clusters to use part 2: miRNA-like Clusters (method 2) Clusters of known miRNA families had many similarities which might allow their separation from non-miRNA clusters –High information content –Mean sequence length –Low variance in sequence length –Few indels, relatively few genomic loci A support vector machine (SVM) classifier was trained on all known microRNA clusters This classifier demonstrated high specificity (0.96) and modest sensitivity (0.65) 54 clusters were predicted to be miRNA clusters (all pine-specific) and may represent novel pine microRNA families 5 of these clusters are confirmed miRNAs based on EST folding
15
Putting clusters to use, part 3: miRNA/miRNA* pairs (method 3) Maturation of a miRNA often leads to two small RNA molecules: the miRNA and its semi-complementary partner (miRNA*) Many of the clusters with small RNAs that align in forward/reverse pairs contained known microRNAs Some of these clusters contained sequences that aligned to ESTs, which could be folded and checked for good pre-miRNA structures This method revealed an additional 24 clusters of novel miRNA sequences
16
Mean information content for clusters annotated as microRNAs
17
Summary Some of the 24nt heterachromatin siRNAs (including repeat-derived siRNAs) are conserved between gymnosperms and angiosperms Pine small RNA sequences are rich in conserved microRNAs Various methods, including sequence clustering, RNA folding and machine learning, reveal many putative novel pine miRNA families
18
Acknowledgements Thanks: Peter Unrau (SFU, MBB) Cenk Sahinalp & his lab (SFU, CS) Gozde Cozen (SFU, CS) Alex Ebhardt (SFU, MBB) Elena Dolgosheina (SFU, MBB) Matt Hickenbotham (WashU) Vincent Magrini (WashU) The VanBUG Dev team Other VanBUG sponsors: IBM, Genome BC
19
Alignment of 3 ESTs from a novel pine miRNA family
Similar presentations
© 2025 SlidePlayer.com Inc.
All rights reserved.