Predicting Protein Interactions: HERPES! Team Question Mark: Jeff Brown, Dante Kappotis, Robert Vanderley, Anthony Biasella

Dataset Overview: The data come from the herpes virus. There were five files, each containing a different set of attributes about the viral proteins, and all five algorithms were run on each data set. Domains was the largest of the data sets, with 428 cases and roughly 23,000 attributes per case; the smaller sets had between 44 and 126 attributes. The five data sets are herpes table localization, herpes table physiochemical features, herpes table primary features, herpes table secondary features, and herpes table domains.
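As a rough illustration of the workflow, the sketch below sets up the five data sets and the five algorithms with scikit-learn. This is a minimal sketch, not the team's original code: the CSV file names and the label column name are assumptions made for the example.

# Minimal sketch (not the team's code) of setting up the five herpes data sets
# and the five algorithms with scikit-learn. File names and the label column
# name are assumptions for illustration only.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier, RandomForestClassifier
from sklearn.svm import SVC

# Hypothetical CSV exports of the five herpes tables.
dataset_files = {
    "Domains": "herpes_table_domains.csv",
    "Primary Features": "herpes_table_primary_features.csv",
    "Secondary Structure": "herpes_table_secondary_features.csv",
    "Localization": "herpes_table_localization.csv",
    "Physiochemical": "herpes_table_physiochemical_features.csv",
}

# The five algorithms compared in the project.
classifiers = {
    "Decision Trees": DecisionTreeClassifier(),
    "Bagging": BaggingClassifier(),        # bagged decision trees by default
    "Boosting": AdaBoostClassifier(),      # boosted decision stumps by default
    "SVM": SVC(),
    "Random Forests": RandomForestClassifier(),
}

def load_dataset(path, label_column="interaction"):
    """Split one table into a feature matrix X and the class label y."""
    table = pd.read_csv(path)
    return table.drop(columns=[label_column]), table[label_column]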

Below is a table giving the best accuracies obtained overall for each data set.

ALGORITHM        DOMAINS   PRIMARY FEATURES   SECONDARY STRUCTURE   LOCALIZATION   PHYSIOCHEMICAL
DECISION TREES
BAGGING
BOOSTING
SVM
RANDOM FORESTS
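The sketch below shows one way a grid of accuracies like this could be filled in, scoring every algorithm on every data set with cross-validation. It reuses the dataset_files, classifiers, and load_dataset names from the previous sketch; 10-fold cross-validation is an assumption, not a detail stated on the slide.

# Minimal sketch of filling in the grid above: cross-validated accuracy for
# every (data set, algorithm) pair. Reuses dataset_files, classifiers, and
# load_dataset from the previous sketch; 10-fold CV is an assumption.
from sklearn.model_selection import cross_val_score

accuracy_table = {}
for dataset_name, path in dataset_files.items():
    X, y = load_dataset(path)
    accuracy_table[dataset_name] = {
        clf_name: cross_val_score(clf, X, y, cv=10, scoring="accuracy").mean()
        for clf_name, clf in classifiers.items()
    }

# Report the best algorithm per data set, as in the table above.
for dataset_name, results in accuracy_table.items():
    best = max(results, key=results.get)
    print(f"{dataset_name}: best = {best} ({results[best]:.2%})")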

Outline of results:
Overall best performer: SVM, with an average score of 72.48%
Overall worst performer: Boosting, with an average score of 68.15%
Highest AUC ROC: Random Forests on Primary Features
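For context, the sketch below shows how the two kinds of summary figures quoted here (average accuracy per algorithm, and the ROC AUC of Random Forests on the Primary Features set) could be computed. It reuses names from the earlier sketches and is an illustrative assumption, not the team's original evaluation code.

# Minimal sketch of the two summary figures quoted above: average accuracy per
# algorithm across the five data sets, and the ROC AUC of Random Forests on the
# Primary Features set. Reuses names from the earlier sketches.
from sklearn.model_selection import cross_val_score

# Average accuracy of each algorithm over all five data sets.
for clf_name in classifiers:
    avg = sum(accuracy_table[d][clf_name] for d in accuracy_table) / len(accuracy_table)
    print(f"{clf_name}: average accuracy = {avg:.2%}")

# Cross-validated ROC AUC of Random Forests on the Primary Features data set.
X, y = load_dataset(dataset_files["Primary Features"])
auc = cross_val_score(classifiers["Random Forests"], X, y, cv=10, scoring="roc_auc").mean()
print(f"Random Forests on Primary Features: ROC AUC = {auc:.3f}")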