RBP1 Splicing Regulation in Drosophila Melanogaster 03-711 - Fall 2005 Jacob Joseph, Ahmet Bakan, Amina Abdulla This presentation available at


1 RBP1 Splicing Regulation in Drosophila Melanogaster
03-711 - Fall 2005
Jacob Joseph, Ahmet Bakan, Amina Abdulla
This presentation available at http://www.jjoseph.org/biology/

2 Alternative Splicing in Dros.

3 RBP1 Regulation
Involved in dsx splicing and Rbp1 auto-regulation
Suspected in many other related pathways

4 Genome Data
Sequence of all introns of known splice variants
Two annotated genomes available
  D. melanogaster
  D. pseudoobscura
As the gene names for D. melanogaster and D. pseudoobscura differ, a list of gene orthologs was also obtained

5 Computational Approach
Create a profile HMM for each motif (B-B, B-A)
Select the end of every intron (~50 bases)
Perform an HMM search on each intron segment, in both D. melanogaster and D. pseudoobscura
Keep matches found in both species
Keep matches within ~15 bases of the intron end
Return the alignment of both species
Examine the biological similarity of the matches
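The segment selection and cross-species filtering steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the HMM search itself is left out, and the names `intron_tail`, `conserved_hits`, and the `Hit` tuple layout are hypothetical. The ortholog pair in the usage note is taken from the results slides; the scores are placeholders.

```python
from typing import Dict, List, Tuple

TAIL_LEN = 50  # "~50 bases" at the end of every intron, per the slide

# Hypothetical hit record: (intron id, start, end, score),
# with 1-based coordinates inside the intron tail.
Hit = Tuple[str, int, int, float]


def intron_tail(seq: str, n: int = TAIL_LEN) -> str:
    """Return the last n bases of an intron (the whole intron if it is shorter)."""
    return seq[-n:]


def conserved_hits(
    hits_mel: Dict[str, List[Hit]],
    hits_pse: Dict[str, List[Hit]],
    orthologs: Dict[str, str],
) -> List[Tuple[Hit, Hit]]:
    """Pair up hits whose genes are orthologs and that occur in both species.

    hits_mel / hits_pse map a gene name to its motif hits;
    orthologs maps a D. melanogaster gene name to its D. pseudoobscura ortholog.
    """
    pairs = []
    for mel_gene, pse_gene in orthologs.items():
        for mel_hit in hits_mel.get(mel_gene, []):
            for pse_hit in hits_pse.get(pse_gene, []):
                pairs.append((mel_hit, pse_hit))
    return pairs
```

For example, with `orthologs = {"CG30271": "GA15740"}` and one hit per gene, `conserved_hits` returns a single pair; a gene with a hit in only one species contributes nothing.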

6 Data Summary

7 Profile Hidden Markov Models (HMMs) and HMMer
We needed a program to build profile HMMs and search with them.
HMMer uses a revised version of the Krogh/Haussler model, called Plan 7.
Supports more than global alignment

8 HMMer Advantages
Possible alignments
  Classic global alignment
  Classic local alignment
  Global profile, local sequence alignment
  Fully local “multihit” alignment
Scoring
  Raw alignment score
  E-value, showing the significance of the alignment
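To make the idea of a raw alignment score concrete, here is a deliberately simplified sketch: an ungapped, match-states-only profile scored as a log-odds sum in bits. It is not the Plan 7 model (no insert or delete states, no alignment modes) and it does not compute E-values. The function names, the pseudocount, the uniform background, and the toy motif instances are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

BASES = "ACGT"


def build_profile(
    instances: List[str], pseudocount: float = 1.0, background: float = 0.25
) -> List[Dict[str, float]]:
    """Build per-column log2-odds scores from equal-length motif instances."""
    length = len(instances[0])
    profile = []
    for i in range(length):
        column = [s[i].upper() for s in instances]
        total = len(column) + pseudocount * len(BASES)
        profile.append(
            {
                b: math.log2(((column.count(b) + pseudocount) / total) / background)
                for b in BASES
            }
        )
    return profile


def score(profile: List[Dict[str, float]], window: str) -> float:
    """Raw log-odds score (bits) of one window against the profile."""
    return sum(col[base] for col, base in zip(profile, window.upper()))


def best_window(profile: List[Dict[str, float]], segment: str) -> Tuple[int, int, float]:
    """Return (start, end, score) of the best window, 1-based inclusive."""
    width = len(profile)
    best = max(
        range(len(segment) - width + 1),
        key=lambda i: score(profile, segment[i : i + width]),
    )
    return best + 1, best + width, score(profile, segment[best : best + width])


# Toy usage with hypothetical motif instances (not the B-A or B-B training data).
toy_profile = build_profile(["ACGT", "ACGT", "ACGA"])
toy_start, toy_end, toy_score = best_window(toy_profile, "ttttacgttt")
```

A positive score means the window looks more like the motif than like background; an E-value would additionally account for how many windows were searched.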

9 HMMer
Create an HMM from a multiple alignment for each of the B-B and B-A motifs
Genome is scanned for high-scoring matches
Only hits within 15 base pairs of the 3’ splice site are considered
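The distance filter can be sketched as below. It assumes that hit coordinates are 1-based and inclusive within a 50-base segment whose last base abuts the 3’ splice site, and that distance is measured from the end of the hit; both are assumptions, chosen because they are consistent with the coordinates shown on the results slides. `Hit`, `distance_to_3prime`, and `near_splice_site` are hypothetical names.

```python
from typing import List, NamedTuple

SEGMENT_LEN = 50  # length of the intron segment that was searched (assumed)
MAX_DIST = 15     # maximum distance to the 3' splice site, per the slide


class Hit(NamedTuple):
    intron: str
    start: int  # 1-based, inclusive, within the segment
    end: int    # 1-based, inclusive, within the segment


def distance_to_3prime(hit: Hit, segment_len: int = SEGMENT_LEN) -> int:
    """Number of bases between the end of the hit and the 3' splice site."""
    return segment_len - hit.end


def near_splice_site(
    hits: List[Hit], max_dist: int = MAX_DIST, segment_len: int = SEGMENT_LEN
) -> List[Hit]:
    """Keep only hits that end within max_dist bases of the 3' splice site."""
    return [h for h in hits if distance_to_3prime(h, segment_len) <= max_dist]
```

Under these assumptions, the hit CG30271-RC-in_5 (27 - 39) lies 11 bases from the splice site and is kept, while a hypothetical hit at (5 - 17) would be discarded.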

10 Results: B-A Motif

CG30271-RC-in_5 (27 - 39), GA15740-in_5 (27 - 39) score: -6
ctgttgaatcacttggaaagcaatcaGTCGACAATTGTTtacttttacag
| |||||||||| |||||||||||||||||||||||||||||||||||
cctttgaatcactcggaaagcaatcaGTCGACAATTGTTtacttttacag

CG30020-RA-in_3 (25 - 37), GA15581-in_9 (24 - 36) score: -8
ccgtcccagtgacttacaatacgaTTCTACTATTTTTtgtacgcttacag
| | | | | ||||| |||| | |
taaggctcttcatactttatcaaATCTACAATTTCTcaatgtaattgcag

Klp3A-RA-in_3 (31 - 43), GA21186-in_3 (26 - 38) score: -9
ttgaagttcgaaaactcctgaaactaattgTTCCACAATTTTTttttatt
| || || || ||| || ||||| | |
tgttcaattcttaaataaaaccaatTTCGACTCTTTTTctcttctttcag

na-RB-in_0 (33 - 45), GA13546-in_2 (25 - 37) score: -9
tctggtgcactgagagaaatgccatctacttcATCGATACTCTTTtgcag
| | || | | || || |
tgtaaacactcgttgcaaacacaaATTTACAATCAATttccatgttttat

CG30428-RA-in_2 (33 - 45), GA15840-in_1 (25 - 37) score: -9
ggtaaggaagcgtaaaaataaattctttttttATCACCAATATTTttcag
| || || ||||| |||| |||||
aaaatatcaagccgaaacaaatttATGTACAATTTTTtttttatggaaag

CG2199-RB-in_0 (36 - 48), GA15296-in_0 (33 - 45) score: -10
ttgctactgccattataggtagtttaaaaactgttTTCTACACTCTTTct
| | | | | || ||||| | |
aacaaaaacaaaaatatggccctctgataattGGGGACACTTTATttcag

11 Results: B-B Motif

ps-RD-in_4 (31 - 42), GA20847-in_4 (31 - 42) score: -11
catttaatatcttgaaaatatttaacataaATCTGATGCAAAtattccag
| || | || ||||||||||||||||||||||||||||||||
attactattcttaaaatatatttaacataaATCTGATGCAAAtattccag

fru-RE-in_6 (26 - 37), GA12896-in_5 (24 - 35) score: -13
cccacccccacagtgatgacgcctaATATGAACCAAGcaaatgtttgcag
| | | | | | ||| | || | | | |
tgctaaataaaccaaattccaaaCTCTGATCAAAAaataccgataaaaag

Ptp52F-RA-in_0 (38 - 49), GA14851-in_14 (34 - 45) score: -13
tactctttgaaaaataagcatatggatgtcactgataATATGATATTAAt
| | | | || | ||| || ||
tctaaatcgtattcaaatcgaattgaaacataaATCGAATCCAAAaacag

CG9455-RA-in_0 (32 - 43), GA21800-in_0 (27 - 38) score: -13
aatagtggctttgttttaataacaatgtaatATCTGATATTTAttctcag
| | | | | ||||| | | |
cagagcgtgccccgtctgatgatccgAACTGATCTGATgtttttcggtag

CG8709-RA-in_2 (34 - 45), GA21271-in_9 (34 - 45) score: -13
acaaatcttaggaaataccaaagttgttctacgATCTTATCTATGgagtc
| | | | | | || || | ||||||
gccccatcagtgtcagtggcagctgaccccaccATTTGATCTATTtgcag

CG7966-RA-in_0 (37 - 48), GA20727-in_4 (26 - 37) score: -13
tatatgtacacattgtactgcaaacacatgccctgaATCTTTGATAAAga
| | ||| | | |||||| | ||||
gtgttgaatgaaagaatacacttgaATCGGTTCTAAAttgcatcgcacag

12 Biomolecular Activity: B-A

13 Biomolecular Activity: B-B

14 Biomolecular activity analysis
The fru gene, regulated by the tra and tra2 genes, is expressed at the same time as the dsx gene, which helps validate our results.
Expected presence of sxl and tra genes.
Functional similarity:
  B-A motif: SNF4Agamma, rdgc, qtc.
  B-B motif: ps, ptp, CG9455.

15 Difficulties & Future Directions
Support Vector Machines were applied
Lack of significant training data.
Lack of direct experimental data for cross-validation.
Since the current D. pseudoobscura genome has far fewer intron sequences, reliance upon orthologs introduces many false negatives.

16 Alternate Approach: Support Vector Machines (SVM)
Used for data classification
Finds a hyperplane that separates the data into two classes with maximum margin
Appropriate for multidimensional classification problems
Examples
  Article classification
  Protein classification
Critical points
  Feature selection
  Training
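As a rough illustration of maximum-margin classification, the sketch below trains a linear soft-margin SVM by batch subgradient descent on the regularized hinge loss. It is a teaching sketch only: the slides do not say which SVM implementation or kernel was used, and the function names, hyperparameters, and toy data here are assumptions.

```python
from typing import List, Sequence, Tuple


def train_linear_svm(
    xs: Sequence[Sequence[float]],
    ys: Sequence[int],
    lam: float = 0.01,
    lr: float = 0.1,
    epochs: int = 200,
) -> Tuple[List[float], float]:
    """Minimize lam/2 * |w|^2 + mean(max(0, 1 - y * (w.x + b))) by subgradient descent.

    Labels in ys must be +1 or -1.
    """
    dim = len(xs[0])
    n = len(xs)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(xs, ys):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:  # point is inside the margin or misclassified
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Classify x by which side of the hyperplane w.x + b = 0 it falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy, linearly separable data (hypothetical, two features per point).
toy_x = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -3), (-2, -3)]
toy_y = [1, 1, 1, -1, -1, -1]
toy_w, toy_b = train_linear_svm(toy_x, toy_y)
```

The regularization term keeps the weight vector small, which is what pushes the separating hyperplane toward the widest margin that the training points allow.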

17 HMM and SVM
HMMer is used to generate features
  Whole genome searched for A and B consensus sequences
  Search results for each intron combined to create features
Features
  Scores of the two motifs upstream of the splice site (2)
  Distance of the motifs to the splice site (1)
  Length of consensus sequence overlap (1)
  Length of motif (1)
  Whether consensus sequence B precedes A (1)
Number of features = 6
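One way the six features could be assembled from a pair of motif hits is sketched below. The slide lists the features but not their exact definitions, so several choices here are assumptions: distance is measured from the later hit end to the end of a 50-base segment, "length of motif" is taken as the combined span of both hits, and `MotifHit` and `feature_vector` are hypothetical names.

```python
from typing import List, NamedTuple

SEGMENT_LEN = 50  # assumed length of the searched intron segment


class MotifHit(NamedTuple):
    start: int    # 1-based, inclusive
    end: int      # 1-based, inclusive
    score: float  # search score for this consensus sequence


def feature_vector(a: MotifHit, b: MotifHit, segment_len: int = SEGMENT_LEN) -> List[float]:
    """Build the 6-element feature vector for one intron from its A and B hits."""
    # Distance from the downstream-most hit end to the splice site (assumed definition).
    distance = segment_len - max(a.end, b.end)
    # Number of positions shared by the two hits; 0 if they do not overlap.
    overlap = max(0, min(a.end, b.end) - max(a.start, b.start) + 1)
    # Combined span of both hits (assumed meaning of "length of motif").
    length = max(a.end, b.end) - min(a.start, b.start) + 1
    # 1.0 if consensus sequence B starts before A, else 0.0.
    b_precedes_a = 1.0 if b.start < a.start else 0.0
    return [a.score, b.score, float(distance), float(overlap), float(length), b_precedes_a]
```

For hypothetical hits A at (33 - 39) with score -3.0 and B at (27 - 34) with score -2.5, this gives `[-3.0, -2.5, 11.0, 2.0, 13.0, 1.0]`.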

18 Summary
Profile HMMs used for modeling the motifs
Comparative analysis with the D. pseudoobscura genome
High-scoring alignments for both motifs were further analyzed for biomolecular activity
The presence of fru and other close matches helps validate our results

