Work Presentation Novel RNA genes in A. thaliana Gaurav Moghe Oct, 2008-Nov, 2008.


1 Work Presentation Novel RNA genes in A. thaliana Gaurav Moghe Oct, 2008-Nov, 2008

2 Source: Nature (Commentary on ENCODE

3 Starting databases: Putative Unique Transcripts (PUTs) and Expressed Sequence Tags (ESTs)

4 ESTs vs PUTs
42% of the total EST sequences in GenBank are assembled into PUTs.
82% of the ESTs can be mapped to a unique genomic region, vs 72% of the PUTs.

Percentile    No. of ESTs/PUT
50            1
95            6
99            29
100           828
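A percentile table like the one on this slide can be computed from a list of per-PUT EST counts. The sketch below is only an illustration: it assumes the nearest-rank definition of a percentile (the slides do not say which definition was used), and the `ests_per_put` values are made-up toy data, not the real counts.

```python
import math


def nearest_rank_percentile(values, pct):
    """Return the nearest-rank percentile (0 < pct <= 100) of a non-empty list."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    # 1-based rank of the smallest value covering pct percent of the data.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical toy data: number of ESTs assembled into each PUT.
ests_per_put = [1, 1, 1, 1, 1, 1, 2, 3, 6, 40]

summary = {p: nearest_rank_percentile(ests_per_put, p) for p in (50, 95, 99, 100)}

for pct, n_ests in summary.items():
    print(f"{pct:>3}  {n_ests}")
```

With a strongly skewed distribution such as this one, the median stays at 1 EST per PUT while the top percentiles are dominated by a few large assemblies, which is the same shape the slide reports.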

5 Download PUT sequences
Map them to the genome using GMAP
Map to protein-coding regions
Map to AT RNA genes
Map to other AT features
BLASTn against all known CDS sequences + GeneWise to confirm alignment on translated CDS sequences
BLASTx against all known proteins to verify absence of any protein in the sequences
Coding Index to double-verify absence of protein-like sequence
BLASTn against Repetitive Sequence Database
Branch labels on the flowchart: Yes? / No? / No match?
Sequence counts along the flowchart: ~324,000; 236,011; 3630; 551; 2023; 1849; 1739; 1453; 1260
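The flowchart is a filter cascade: each step removes sequences and the number of survivors shrinks. A minimal sketch of that bookkeeping follows. It does not run GMAP, BLAST, or GeneWise; it assumes the result of each tool has already been reduced to a flag per PUT. The field names, the `CODING_INDEX_CUTOFF` value, the exact filter order, and the toy records are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Filter = Tuple[str, Callable[[Record], bool]]

CODING_INDEX_CUTOFF = 0.5  # hypothetical threshold, not taken from the slides


def make_put(put_id: str, **overrides: object) -> Record:
    """Build a toy PUT record that passes every filter unless overridden."""
    record: Record = {
        "id": put_id,
        "maps_to_genome": True,
        "in_protein_coding": False,
        "in_annotated_feature": False,
        "cds_hit": False,
        "protein_hit": False,
        "coding_index": 0.1,
        "repeat_hit": False,
    }
    record.update(overrides)
    return record


def run_cascade(records: List[Record], filters: List[Filter]):
    """Apply filters in order; return survivors and the count left after each step."""
    counts = [("input", len(records))]
    remaining = list(records)
    for name, keep in filters:
        remaining = [r for r in remaining if keep(r)]
        counts.append((name, len(remaining)))
    return remaining, counts


FILTERS: List[Filter] = [
    ("maps to genome", lambda r: r["maps_to_genome"]),
    ("outside protein-coding regions", lambda r: not r["in_protein_coding"]),
    ("outside annotated RNA genes/features", lambda r: not r["in_annotated_feature"]),
    ("no CDS hit (BLASTn)", lambda r: not r["cds_hit"]),
    ("no protein hit (BLASTx)", lambda r: not r["protein_hit"]),
    ("low coding index", lambda r: r["coding_index"] < CODING_INDEX_CUTOFF),
    ("no repeat hit (BLASTn)", lambda r: not r["repeat_hit"]),
]

puts = [
    make_put("PUT1"),
    make_put("PUT2", maps_to_genome=False),
    make_put("PUT3", in_protein_coding=True),
    make_put("PUT4", protein_hit=True),
    make_put("PUT5", coding_index=0.9),
    make_put("PUT6", repeat_hit=True),
]

survivors, counts = run_cascade(puts, FILTERS)
for step, n in counts:
    print(f"{n:>3}  {step}")
```

Recording the count after every step, rather than only at the end, makes it easy to see which filter removes the most candidates.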

7 Issues
PUT sequences are not of very good quality. Use the sequence of the genomic region where these PUTs map? Use EST sequences?
BLAST against the database does not return all hits. BLAST against a different database, of a different size.
PUTs extremely close to genes may be part of extended UTR regions. Remove the ones that are extremely close. Check the directions of the other PUTs.
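The last issue (PUTs that sit very close to a gene and may be UTR extensions) can be sketched as a distance-and-strand check. This is one possible reading of the slide, not the method actually used: the two thresholds `remove_within` and `check_within`, the three labels, and the coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    name: str
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def gap(a: Interval, b: Interval) -> int:
    """Bases strictly between two intervals on the same chromosome (0 if they touch or overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end) - 1)


def classify(put: Interval, genes, remove_within: int = 50, check_within: int = 500) -> str:
    """Label a PUT as 'remove', 'flag', or 'keep' based on its nearest annotated gene."""
    same_chrom = [g for g in genes if g.chrom == put.chrom]
    if not same_chrom:
        return "keep"
    nearest = min(same_chrom, key=lambda g: gap(put, g))
    distance = gap(put, nearest)
    if distance < remove_within:
        return "remove"  # so close that it is likely part of the gene's UTR
    if distance < check_within and nearest.strand == put.strand:
        return "flag"    # nearby and same direction: possible extended UTR
    return "keep"


genes = [Interval("geneA", "Chr1", 1000, 2000, "+")]
puts = [
    Interval("P1", "Chr1", 2010, 2300, "+"),
    Interval("P2", "Chr1", 2200, 2500, "+"),
    Interval("P3", "Chr1", 2200, 2500, "-"),
    Interval("P4", "Chr1", 5000, 5400, "+"),
    Interval("P5", "Chr2", 100, 400, "+"),
]

labels = {p.name: classify(p, genes) for p in puts}
print(labels)
```

Using strand only for the intermediate distance band mirrors the slide's two separate actions: drop the closest PUTs outright, then look at direction for the rest.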

8 What if… A sequence passes through all filters… but still is a protein sequence?

9 Issues
Most of these PUTs do not show conservation. Does that mean they are non-functional?
Most of these PUTs do not seem to have an RNA-like secondary structure. Does that mean they are not RNA genes?

10 Plans for the next month
Get the final list of novel PUTs
Assign them directionality and estimate assembly error rates using EST mapping
Conservation
Secondary structure

