Herpes Jeff Brown Dante Kappotis Robert Vanderley Anthony Biasella.

Presentation transcript:

1 Herpes Jeff Brown Dante Kappotis Robert Vanderley Anthony Biasella

2 Human Herpes Virus 8
– Found in Kaposi’s Sarcoma
– Kaposi’s Sarcoma is a type of skin cancer found in patients infected with HIV
– Patients infected with both HHV-8 and HIV are at high risk of developing Kaposi’s Sarcoma
– HHV-8 belongs to the same virus family as Herpes Simplex and the viruses that cause Chicken Pox, Shingles and Mono

3 Research Background
– Work based on Patrick Shaugnessy’s 2008 thesis
– Investigates using one organism to build a model that predicts protein-protein interactions in another organism

4 Protein-Protein Interaction
– Protein-protein interactions (PPI) are part of biological function
– Signals coming from outside of the cell to inside of the cell (biological function and diseases)
– Forming complexes to carry to another protein
– Modifying another protein

5 Features
– Domains: properties predicted from similar proteins
– Secondary Structure: physical structure
– Localization: location within the cell
– Primary Features: amino acid sequences from the proteome
– Physiochemical: known chemical properties

6 Previous Testing
Focused on determining which algorithm and parameters were most useful with the dataset.
Algorithms
– Random Forests (found to be generally the best)
– SVM
– Bagging
– Boosting
– Decision Trees

7 Next Steps
– Eliminate domains from testing
– Focus on the Random Forests algorithm (Fast Random Forests)
– Five datasets: all features combined, plus four leave-one-out sets that each drop one feature group
– Same-organism performance
– Examine the effect of varying the number of examples
– Prediction on other organisms
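
The five datasets listed above could be assembled along the following lines. This is only a minimal sketch: it assumes each example is stored as a mapping from feature group to a list of numeric features, with group names taken from the Features slide (domains already excluded). The function `build_variants` and the toy values are hypothetical and do not come from the original work.

```python
# Hypothetical sketch: build the "combined" dataset plus one variant per
# left-out feature group. Group names follow the slides; the data is made up.
FEATURE_GROUPS = ["localization", "physiochemical", "primary", "secondary"]


def build_variants(example):
    """Return {variant_name: flat feature list} for a single example.

    `example` maps each feature group name to a list of numeric features.
    """
    variants = {
        "combined": [v for g in FEATURE_GROUPS for v in example[g]]
    }
    for left_out in FEATURE_GROUPS:
        # Concatenate every group except the one being left out.
        variants["no_" + left_out] = [
            v for g in FEATURE_GROUPS if g != left_out for v in example[g]
        ]
    return variants


# Toy protein-pair example (values are illustrative only).
toy = {
    "localization": [1.0],
    "physiochemical": [0.2, 0.7, 0.1],
    "primary": [0.05, 0.11],
    "secondary": [0.4],
}
variants = build_variants(toy)
for name, features in variants.items():
    print(name, len(features))
```

Training the same classifier on each variant and comparing the scores then shows how much each feature group contributes.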

8 Same Organism Testing

Dataset                  Num Features   Max % Correct   Max AUC
Combined                 1382           77.31           0.84
No Localization          1372           77.08           0.84
No Physiochemical        288            77.55           0.84
No Primary Features      1122           74.76           0.83
No Secondary Structure   1366           76.39           0.84

– Little variation as the number of trees (500, 1000, 2000) and the number of features (0.5x, x, 2x) were varied
– Best overall (77.55/0.84) with 1000 trees and x features
– Worst overall (73.59/0.81)
– Primary features appear to be the most important: removing them gives the lowest maximum accuracy (74.76%)
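
The feature counts in the table also imply how large each feature group is, since each leave-one-out dataset is the combined set minus one group. The sketch below works through that arithmetic together with the change in maximum accuracy. The numbers are copied from the table; the helper `ablation_summary` is a hypothetical name used only for illustration.

```python
# Feature counts and maximum % correct, copied from the table above.
results = {
    "Combined": (1382, 77.31),
    "No Localization": (1372, 77.08),
    "No Physiochemical": (288, 77.55),
    "No Primary Features": (1122, 74.76),
    "No Secondary Structure": (1366, 76.39),
}


def ablation_summary(results):
    """Return (group, group_size, accuracy_change) for each left-out group."""
    combined_count, combined_acc = results["Combined"]
    rows = []
    for name, (count, acc) in results.items():
        if name == "Combined":
            continue
        group = name[len("No "):]
        # Group size = features lost when the group is removed.
        rows.append((group, combined_count - count, round(acc - combined_acc, 2)))
    return rows


summary = ablation_summary(results)
for group, size, change in summary:
    print(f"{group}: {size} features, {change:+.2f} points when removed")

# The group whose removal costs the most accuracy.
most_important = min(summary, key=lambda row: row[2])[0]
print("Largest drop:", most_important)
```

Under this reading, removing the 260 primary features costs about 2.5 points, while removing the 1094 physiochemical features changes the result very little. The four group sizes sum to 1380, two fewer than the combined count; the slides do not say what accounts for the difference.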

9 Testing Number of Examples

Set      % Correct   ROC AUC
25% A    69.2        0.75
25% B    73.3        0.80
25% C    72.9        0.77
50% A    67.1        0.79
50% B    82.3        0.90
50% C    66.4        0.66
75% A    71.0        0.77
75% B    71.0        0.78
75% C    74.5        0.79
100%     77.6        0.84
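
One way to produce training subsets like these is to draw several independent random samples at each fraction. The sketch below assumes three draws (A, B, C) per fraction, sampled without replacement; the slides do not say how the subsets were actually chosen, and `make_subsets`, the seed and the stand-in data are all hypothetical.

```python
import random


def make_subsets(examples, fractions=(0.25, 0.50, 0.75), labels="ABC", seed=0):
    """Return {(fraction, label): sampled list}, plus the full set at 1.0."""
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    subsets = {}
    for fraction in fractions:
        size = round(len(examples) * fraction)
        for label in labels:
            # Each draw is an independent sample without replacement.
            subsets[(fraction, label)] = rng.sample(examples, size)
    subsets[(1.0, "all")] = list(examples)
    return subsets


examples = list(range(200))  # stand-in for 200 labelled protein pairs
subsets = make_subsets(examples)
for (fraction, label), subset in subsets.items():
    print(f"{fraction:.0%} {label}: {len(subset)} examples")
```

In the table, results at the same fraction differ considerably (66.4% to 82.3% correct at 50%), so the choice of subset appears to matter about as much as its size.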

10 All vs. Varied
[Figure: two panels, "All Training Examples" and "Varying Number of Examples"]

11 Herpes and Yeast
We trained the FastRandomForest algorithm on our Herpes data and tested the resulting model on our Yeast data. The results were only slightly better than a coin flip.

12 Herpes and Yeast Data Results

Name                ROC AUC   % Correct
All                 0.43      52.08
No Localization     0.42      52.08
No Physiochemical   0.40      52.43
No Primary          0.53      55.06
No Secondary        0.43      51.69
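
To make the coin-flip comparison concrete: ROC AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, so 0.5 corresponds to random ranking. The sketch below computes it by direct pairwise comparison. The scores are made up, and `roc_auc` is an illustrative helper, not the evaluation code used in the original work.

```python
def roc_auc(labels, scores):
    """ROC AUC by pairwise comparison; tied scores count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


labels = [1, 1, 0, 0]
perfect = roc_auc(labels, [0.9, 0.8, 0.2, 0.1])        # positives ranked first
uninformative = roc_auc(labels, [0.5, 0.5, 0.5, 0.5])  # all scores tied
inverted = roc_auc(labels, [0.1, 0.2, 0.8, 0.9])       # positives ranked last
print(perfect, uninformative, inverted)
```

Values below 0.5, such as the 0.40 to 0.43 in the table, mean the model ranked interacting pairs below non-interacting pairs more often than chance would.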

13 Yeast ROC Area Under Curve

14 Herpes and Arabidopsis
We tried multiple runs of training the FastRandomForest algorithm on our Herpes data, then testing the resulting model on the Arabidopsis data. Our results were about as good as a coin flip.

15 Herpes and Arabidopsis Data Results
Technical difficulties caused incomplete data.

Name                ROC AUC   % Correct
All                 0.51      49.85
No Localization     0.54      50.46
No Physiochemical   -         -
No Primary          -         -
No Secondary        0.5       50.05

16 Arabidopsis ROC Area Under Curve

