Senior Project – Computer Science – 2008: Machine Learning in Football. Andrew Finley. Advisor – Prof. Striegnitz.


Research Question:
Every year there are players who move from collegiate football to professional football with high expectations and never meet them. Likewise, there are players with low expectations who exceed them. This leads me to ask: is it possible to accurately predict the success of NFL players based on their collegiate performance? A player is generally considered successful if he is starting a majority of his games by his third season. The goal of this project is to build a program that will predict a player’s professional statistics, given their collegiate statistics. For the sake of time, I am only looking at quarterbacks and running backs.

Sample input for running back data (blue are inputs, red are possible outputs):

Player: Ronnie Brown   School: Auburn   Height: 6'-1''   Weight: 230

Collegiate statistics, one set per college year (attribute names carry the suffix 1, 2, or 3):
Year, Pos, Cl, G, Rush Yds, Car, Rush TD, Yds/Car, RushYds/G, Rec Yds, Rec, Rec TD, Yds/Rec, Rec/G, RecYds/G, PR, PR Yds, PR TD, Yds/PR, PR/G, KR, KR Yds, KR TD, Yds/KR, KR/G, Ret TD, Tot Yds, Tot TD, TotYds/G

Year   Pos   Cl
2002   RB    So
2003   RB    Jr
2004   RB    Sr

Professional statistics, one set per NFL season (attribute names carry the suffix 1, 2, or 3):
Season, Team, G, GS, Att, RushYds, RushAvg, RushLng, RushTD, Rec, RecYds, RecAvg, RecLng, RecTD, FUM, Lost, Starting

Season   Team             Starting
2005     Miami Dolphins   TRUE
2006     Miami Dolphins   TRUE
2007     Miami Dolphins   TRUE

Data:
Step 1: Gather data by parsing it off websites (NFL.com, NCAA.org) with Python scripts, and through Collegio Football (database program).
Step 2: Use more Python scripts to combine the data into two large .csv files, one for quarterbacks and one for running backs.
Step 3: Fix any leftover formatting errors, and fill in missing statistics where possible.
Step 4: Input the data into Weka (ML software), and predict the desired statistics.
Step 5: Evaluate accuracy using cross-validation.

Preliminary Results:
- Building trees is difficult with large sets of training data; better trees result when attributes are selected by hand.
- The baseline for accuracy is 68%. This is the accuracy obtained if all predictions for “starting third season” are set to false and no tree is constructed.
- Accuracy of the program varies significantly with different feature sets, so feature selection is very important.

Classification using Decision Trees:
The idea behind this project is to use classification algorithms to train a program to predict NFL stats when given collegiate stats. Classification is the process of training a program on a set of known instances so that it can predict unknown ones. I am using a Decision Tree algorithm to train the program. A decision tree algorithm:
- Creates a graph (tree) from the training data. The leaves are the classes, and the branches are attribute values.
- Aims to make the smallest tree possible that covers all instances.
- Uses the tree to make a set of classification rules.

Next Step:
Continue trying different feature selections to improve accuracy beyond the baseline.
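Step 2 of the data pipeline combines separately gathered statistics into one .csv file per position. The project's actual scripts are not shown, so the following is only a minimal sketch of one way to do that join. The function names, the "Player" join key, and the toy rows (Player A, School X, Team Z) are hypothetical, not project data.

```python
import csv
import io


def merge_by_player(college_csv, nfl_csv, key="Player"):
    """Join college rows with NFL rows that share the same player name.

    Assumption: both inputs are CSV text with a header row and a common
    `key` column. Players missing from the NFL data are skipped.
    """
    nfl_rows = {row[key]: row for row in csv.DictReader(io.StringIO(nfl_csv))}
    merged = []
    for row in csv.DictReader(io.StringIO(college_csv)):
        pro = nfl_rows.get(row[key])
        if pro is None:
            continue  # no professional record to pair with
        combined = dict(row)
        combined.update({k: v for k, v in pro.items() if k != key})
        merged.append(combined)
    return merged


def to_csv(rows):
    """Serialize merged rows back to CSV text (column order of the first row)."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Made-up toy input, only to exercise the functions.
college_csv = "Player,School,Rush Yds1\nPlayer A,School X,900\nPlayer B,School Y,450\n"
nfl_csv = "Player,Team1,Starting3\nPlayer A,Team Z,TRUE\n"

merged = merge_by_player(college_csv, nfl_csv)
merged_csv = to_csv(merged)
```

Joining on the player name alone is a simplification: two players can share a name, so a real script would likely need a second key such as school or draft year.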
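The baseline in the preliminary results comes from always predicting the most common class without building a tree. The sketch below shows how such a baseline can be computed. The labeling rule (started more than half of the games played) is one reading of "starting a majority of his games", and the games-started counts are invented for illustration.

```python
from collections import Counter


def is_starting(games_started, games_played):
    """Hypothetical label rule: True if the player started a majority of his games."""
    return games_played > 0 and games_started > games_played / 2


def majority_baseline(labels):
    """Return the most common label and the accuracy of always predicting it."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


# Made-up (games started, games played) pairs for a third season.
third_seasons = [(16, 16), (3, 14), (0, 10), (9, 16), (2, 12), (0, 8)]
labels = [is_starting(gs, g) for gs, g in third_seasons]

baseline_label, baseline_accuracy = majority_baseline(labels)
```

Any trained classifier has to beat `baseline_accuracy` on the same data to be useful. In the project, that threshold is the reported 68%.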
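The decision tree description above (build a tree from training data, then read classification rules off it) can be illustrated with a small self-contained sketch. It assumes nominal attributes and picks splits by information gain, which is one common criterion and not necessarily the one Weka applied in this project. The attribute names and the four training rows are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Reduction in entropy obtained by splitting on `attr`."""
    remainder = 0.0
    for value in set(row[attr] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


def build_tree(rows, labels, attrs):
    """Return a class label (leaf) or a tuple (attribute, {value: subtree})."""
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attrs if a != best]
    branches = {}
    for value in sorted(set(row[best] for row in rows)):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        branches[value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], remaining
        )
    return (best, branches)


def tree_to_rules(tree, conditions=()):
    """Flatten the tree into (conditions, predicted class) rules, one per leaf."""
    if not isinstance(tree, tuple):
        return [(conditions, tree)]
    attr, branches = tree
    rules = []
    for value, subtree in branches.items():
        rules.extend(tree_to_rules(subtree, conditions + ((attr, value),)))
    return rules


def classify(tree, row):
    """Follow branches until a leaf is reached (unseen values raise KeyError)."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[row[attr]]
    return tree


# Made-up training instances; the label is "starting third season".
rows = [
    {"rush_yds": "high", "class_year": "Sr"},
    {"rush_yds": "high", "class_year": "Jr"},
    {"rush_yds": "low", "class_year": "Sr"},
    {"rush_yds": "low", "class_year": "Jr"},
]
labels = [True, True, False, False]

tree = build_tree(rows, labels, ["class_year", "rush_yds"])
rules = tree_to_rules(tree)
prediction = classify(tree, {"rush_yds": "high", "class_year": "Sr"})
```

In this toy data `rush_yds` separates the classes perfectly, so the tree has a single split and yields two rules. Real statistics are numeric, so they would first need to be binned or handled with threshold splits.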
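Step 5 evaluates accuracy with cross-validation. As a rough illustration, the sketch below implements plain k-fold cross-validation around any train/predict pair. The fold count, the seed, and the stand-in majority-class "model" are assumptions for the example, and the labels are invented. The project itself ran its evaluation inside Weka.

```python
import random
from collections import Counter


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k roughly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(rows, labels, train_fn, predict_fn, k=5, seed=0):
    """Train on k-1 folds, test on the held-out fold, and pool the accuracy."""
    correct = 0
    for test_idx in k_fold_indices(len(rows), k, seed):
        held_out = set(test_idx)
        train_rows = [rows[i] for i in range(len(rows)) if i not in held_out]
        train_labels = [labels[i] for i in range(len(rows)) if i not in held_out]
        model = train_fn(train_rows, train_labels)
        correct += sum(predict_fn(model, rows[i]) == labels[i] for i in test_idx)
    return correct / len(rows)


def train_majority(rows, labels):
    """Stand-in learner: remember the most common training label."""
    return Counter(labels).most_common(1)[0][0]


def predict_majority(model, row):
    """Stand-in predictor: always answer with the remembered label."""
    return model


# Made-up data: 10 instances, 7 labeled False and 3 labeled True.
rows = [{"id": i} for i in range(10)]
labels = [False] * 7 + [True] * 3

folds = k_fold_indices(len(rows), 5)
accuracy = cross_validate(rows, labels, train_majority, predict_majority, k=5)
```

Swapping `train_majority` and `predict_majority` for a real learner gives an accuracy that can be compared directly against the majority-class baseline on the same folds.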