NP - Positive set Negative Set Full length ORFs Genome Annotated Candidate NPs Top ranked NPs Input Training NP catalogue Negative Set Negative set NP.

1 Workflow diagram labels: NP positive set; Negative set; Full length ORFs; Genome; Annotated; Candidate NPs; Top ranked NPs; Input; Training; NP catalogue; NP processing tools; Translated proteome; ML quality: Cross validation; NeuroPID prediction

4 Figure with panels A and B; axis label: 1-log(p-value, t-test)

6 MRSRTSVLTSSLAFLYFFGIVGRSALAMEETPASSMNLQHYNN MLNPMVFDDTMPEKRAYTYVSEYKRLPVYNFGIGKRWIDTND N KRGRDYSFGLGKRRQYSFGLGKRNDNADYPLRLNLDYLPVD NP AFHSQENTDDFLEEKRGRQPYSFGLGKRAVHYSGGQPLGSK RP NDMLSQRYHFGLGKRMSEDEEESSQR
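
The precursor on this slide contains several KR and RR pairs flanking repeated FGLG-type stretches. As a minimal sketch, assuming that dibasic pairs (KR, RR, RK, KK) mark candidate processing sites, the fragment below can be scanned and split with the standard library. The names `FRAGMENT`, `dibasic_sites`, and `candidate_peptides` are hypothetical and are not taken from the presentation or from NeuroPID; the fragment is one contiguous stretch copied from the slide.

```python
import re

# One contiguous stretch copied from the slide's sequence (illustration only).
FRAGMENT = "KRGRDYSFGLGKRRQYSFGLGKRNDNADYPLRLNLDYLPVD"

# Assumption: any pair of adjacent K/R residues is a candidate cleavage site.
# The lookahead lets overlapping pairs (e.g. KRR -> KR and RR) both be reported.
DIBASIC = re.compile(r"(?=([KR][KR]))")


def dibasic_sites(seq):
    """Return (0-based position, pair) for every dibasic pair in seq."""
    return [(m.start(), m.group(1)) for m in DIBASIC.finditer(seq)]


def candidate_peptides(seq, min_len=4):
    """Split seq on runs of two or more K/R residues and keep longer pieces."""
    pieces = re.split(r"[KR]{2,}", seq)
    return [p for p in pieces if len(p) >= min_len]


if __name__ == "__main__":
    print(dibasic_sites(FRAGMENT))
    print(candidate_peptides(FRAGMENT))
```

This only locates motifs; it does not model signal peptides, amidation, or any of the features a trained predictor would use.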

8 Cross validation performance
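
The slide names cross validation without giving the protocol. As a rough sketch of the general idea, assuming a plain shuffled k-fold split (the value of k, the seed, and the function name `k_fold_indices` are illustrative assumptions, not details from the presentation), the index partition could look like this:

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for a shuffled k-fold split."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for f in folds[:i] + folds[i + 1:] for j in f)
        yield train, test


if __name__ == "__main__":
    for train, test in k_fold_indices(10, k=5):
        print(len(train), test)
```

Each sample lands in exactly one test fold, so every positive and negative sequence is scored once by a model that did not see it during training.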

16 Figure with panels A, B, C. Species and number labels:
S. frugiperda (Fall armyworm): 5
H. armigera (Cotton bollworm): 6
S. gregaria (Desert locust): 4
A. florea (Little honeybee): 0
M. rotundata (Alfalfa leafcutter bee): 1
C. floridanus (Florida carpenter ant): 2
A. echinatior (Leafcutter ant): 3

21 Training sets: SW Arthropods, UniProt Arthropods. Columns per training set: Random Forest, Gradient Boosting, Linear SVC
Mean Accuracy: 0.94 0.95 0.94 0.92 0.86
Mean Precision: 0.94 0.95 0.93 0.94 0.95
Mean Recall: 0.92 0.95 0.85
Mean AUC: 0.94 0.95 0.94 0.89 0.90 0.87
Training sets: SW Chordata, UniProt Chordata. Columns per training set: Random Forest, Gradient Boosting, Linear SVC
Mean Accuracy: 0.96 0.97 0.95 0.90 0.91 0.85
Mean Precision: 0.94 0.88 0.91 0.92 0.89
Mean Recall: 0.91 0.92 0.93 0.91 0.83
Mean AUC: 0.95 0.94 0.90 0.91 0.85
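
The slide reports mean accuracy, precision, and recall per classifier. For readers who want to see how such means are typically obtained, here is a small sketch that computes the three metrics for each fold and averages them. It assumes binary labels (1 = NP precursor, 0 = negative set), and the function names are hypothetical; AUC is left out because it needs ranked scores rather than hard predictions.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Convention chosen here: report 0.0 when the denominator is empty.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def mean_over_folds(per_fold):
    """Average each metric over a list of per-fold metric dicts."""
    return {k: sum(f[k] for f in per_fold) / len(per_fold) for k in per_fold[0]}


if __name__ == "__main__":
    # Toy labels for two folds; not data from the presentation.
    folds = [
        binary_metrics([1, 1, 0, 0], [1, 0, 0, 0]),
        binary_metrics([1, 0, 1, 0], [1, 1, 1, 0]),
    ]
    print(mean_over_folds(folds))
```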

22 Columns: Organism; # sequences UniProtKB; # of full length UniProtKB; # of SP; # of NP & SP; # NeuroPID All methods; Functional annotation enrichment
B. mori: 1790817069138669; Innate immunity; Insulin-like; Chorion; Hormone (NP)
S. invicta: 14356841224; Innate immunity
D. melanogaster: 399613109147521120; Innate immunity; Developmental; Channel ligand; Receptor; Hormone (NP)
C. elegans: 26005255344642189; Hormone (NP); Channel ligand; Receptor; Protease

26 Columns: organism / taxa; # of UniProt (UniRef90); # of NPP in SW (UniRef90); # of NPP in UniProt (UniRef90); Prediction NeuroPID RBF a
Apis mellifera: 103946197
SP in Apis mellifera b: 213957
Gallus gallus: 20760555
SP in Gallus gallus: 70111
Bombyx mori: 152505179
SP in Bombyx mori: 112559
Octopoda: 224444
SP in Octopoda: 7633

27 Updates 5 7 2013

29 Figure labels: RF 7; ExtraTree 8; SVM-SVC 16; GBR 79. Other values in the figure: 4, 11, 2, 2, 1

30 Figure with two panels, labelled Apis mellifera and Thaumeledone gunteri.
RF Ext-Tree 79 SVM-SVC GBR 16 8 7; 86% 42% 100% 60% 75%
RF Ext-Tree 18 SVM-SVC GBR 10 9 5; 100% 60% 100% 33% 80%
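
The per-classifier counts on these slides, together with the "All methods" column on slide 22, suggest that predictions from RF, Ext-Tree, SVM-SVC, and GBR were compared with one another. As a hedged sketch of that kind of comparison, assuming each classifier yields a set of predicted sequence identifiers (the identifiers and function names below are made up for illustration), the consensus and classifier-specific predictions can be taken with set operations:

```python
def consensus(predictions):
    """Identifiers predicted by every classifier in the dict."""
    sets = list(predictions.values())
    return set.intersection(*sets) if sets else set()


def unique_to(predictions, name):
    """Identifiers predicted only by the named classifier."""
    others = set().union(*(s for n, s in predictions.items() if n != name))
    return predictions[name] - others


if __name__ == "__main__":
    # Hypothetical identifiers; not results from the presentation.
    preds = {
        "RF": {"p1", "p2", "p3"},
        "Ext-Tree": {"p2", "p3"},
        "SVM-SVC": {"p2", "p3", "p4"},
        "GBR": {"p2", "p5"},
    }
    print(consensus(preds))
    print(unique_to(preds, "GBR"))
```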

42 Alignment of repeats (residue range, then aligned sequence):
66- 97 KRL....YDFG.........LG..............KRA..YsyvSEYKRL.............................pvYN..FGLGKR
98- 120 SKM....YGFG.........LG..............KR.......DG..RM...............................YS..FGLGKR
121- 164 DYD....Y.YGeededdqqaIGdedieesdvgdlmdKR..........DRL...............................YS..FGLGKR
165- 191 ARP....YSFG.........LG..............KRA..P...SGAQRL...............................YG..FGLGKR
192- 216 GGS...lYSFG.........LG..............KR........GDGRL...............................YA..FGLGKRPVN
222- 253 GRSsgsrFNFG.........LG..............KRS..D...DIDFRE...............................LEekFAEDKR
254- 316 .YPqehrFSFG.........LG..............KREveP...SELEAVrneekdnssvhdkknntndmhsgerikrslhYP..FGIRKL
347- 367 RRP....FNFG.........LG..............KRI..P........M...............................YD..FGIGKR

Condensed version of the same alignment:
66- 97 KRL....YDFG.........LG..............KRA..YsyvSEYKRL........pvYN..FGLGKR
98-120 SKM....YGFG.........LG..............KR.......DG..RM..........YS..FGLGKR
121-164 DYD....Y.YGeededdqqaIGdedieesdvgdlmdKR..........DRL..........YS..FGLGKR
165-191 ARP....YSFG.........LG..............KRA..P...SGAQRL..........YG..FGLGKR
192-220 GGS...lYSFG.........LG..............KR........GDGRL..........YA..FGLGKRPVNS
221-253 GRSsgsrFNFG.........LG..............KRS..D...DIDFRE..........LEekFAEDKR
254-316 YPqehrFSFG.........LG..............KREveP...SELEAVrne(25)slhYP..FGIRKL
346-367 RRP....FNFG.........LG..............KRI..P........M..........YD..FGIGKR

