Published by Mabel Jenkins. Modified over 8 years ago.
1
Bike Sharing Demand Prediction
Presented by: Akshay Patil, 14MCB1031, MAIL@AKSHAY.IM
Research Facilitator: Prof. BVANSS Prabhakar Rao
M.Tech 1st Year, RBL First Review Presentation, VIT-Chennai
2
Objective
Primary Objective: To build an accurate statistical model that predicts the number of bicycles rented, given the available data.
Secondary Objectives:
1) To learn how real-time data is represented in datasets.
2) To understand how to pre-process such data.
3) To compare the results achieved by various machine learning techniques such as Regression, Decision Trees, Random Forests and SVMs.
3
Research Scope
Introduction to bike sharing systems.
Use of data analysis in such systems.
4
Literature Survey (R packages and functions used)
Regression: lm function (stats package)
Decision Trees: rpart package; ctree function (party package)
Random Forests: randomForest package
SVM: e1071 package
5
Proposed Methodology
Fetch & Analyze Data
Clean Data
Partition Data
Remove Missing Data
Clean Data
Create New Factors
PreProcessing
Building a Prediction Model
Validate the Model
Predict Values for Test Data
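The last three steps of the methodology (build a model, validate it, predict) can be sketched as follows. This is a minimal illustration in standard-library Python rather than the R packages named in the deck. The per-hour mean "model" is a stand-in baseline and not one of the techniques the presentation evaluates. The field names `hour` and `count` and the toy rows are assumptions made for the example.

```python
from collections import defaultdict
from math import sqrt

def fit_hourly_mean(rows):
    """Build a baseline model: the mean rental count observed for each hour."""
    totals, counts = defaultdict(float), defaultdict(int)
    for r in rows:
        totals[r["hour"]] += r["count"]
        counts[r["hour"]] += 1
    # Fall back to the overall mean for hours never seen during fitting.
    overall = sum(totals.values()) / sum(counts.values())
    means = {h: totals[h] / counts[h] for h in totals}
    return lambda hour: means.get(hour, overall)

def rmse(model, rows):
    """Validate the model: root mean squared error on held-out rows."""
    errors = [(model(r["hour"]) - r["count"]) ** 2 for r in rows]
    return sqrt(sum(errors) / len(errors))

# Hypothetical toy rows; real rows would come from the pre-processed training data.
fit_rows = [
    {"hour": 8, "count": 100},
    {"hour": 8, "count": 120},
    {"hour": 3, "count": 4},
]
holdout_rows = [{"hour": 8, "count": 110}, {"hour": 3, "count": 6}]

model = fit_hourly_mean(fit_rows)             # build
validation_error = rmse(model, holdout_rows)  # validate
prediction = model(8)                         # predict for an unseen row
print(prediction, round(validation_error, 3))
```

Holding out part of the labelled data, as `holdout_rows` does here, gives an error estimate before predicting on the unlabelled test set.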
6
About Data: The data covers the years 2011 and 2012. The training set comprises the first 19 days of each month, while the test set runs from the 20th to the end of each month.
Training Data: 10886 observations of 12 variables.
Test Data: 6493 observations of 9 variables.
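The day-of-month rule behind this split can be sketched as follows, in standard-library Python rather than R. The column name `datetime` and the timestamp format `YYYY-MM-DD HH:MM:SS` are assumptions about the files, and the sample rows are hypothetical.

```python
from datetime import datetime

def split_by_day(rows, last_train_day=19):
    """Split rows into (train, test) using the day of month of each timestamp."""
    train, test = [], []
    for row in rows:
        stamp = datetime.strptime(row["datetime"], "%Y-%m-%d %H:%M:%S")
        # Days 1-19 belong to the training period, day 20 onward to the test period.
        (train if stamp.day <= last_train_day else test).append(row)
    return train, test

# Hypothetical sample rows; real rows would be read from the competition CSV files.
sample = [
    {"datetime": "2011-01-01 00:00:00"},
    {"datetime": "2011-01-19 23:00:00"},
    {"datetime": "2011-01-20 00:00:00"},
    {"datetime": "2012-12-31 23:00:00"},
]
train_rows, test_rows = split_by_day(sample)
print(len(train_rows), len(test_rows))
```

The same rule is useful when carving a validation set out of the training data, for example by holding out the last few training days of each month.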
7
Dataset Description
8
Implementation Tools: R, Weka
9
Work Done: Understanding Data
1) Factorize training set and test set
2) Create time column by stripping out timestamp
3) Create new timestamp column
4) Create day of week column
5) Create and factorize Sunday variable
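The derived columns listed above can be sketched as follows, in standard-library Python rather than R. The column name `datetime`, the timestamp format, and the output names `hour`, `day_of_week` and `is_sunday` are assumptions chosen for illustration. Converting these values to R factors is not shown.

```python
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def add_time_features(row):
    """Return a copy of row with hour, day-of-week and Sunday-flag columns added."""
    stamp = datetime.strptime(row["datetime"], "%Y-%m-%d %H:%M:%S")
    out = dict(row)
    out["hour"] = stamp.hour                          # time of day taken from the timestamp
    out["day_of_week"] = DAY_NAMES[stamp.weekday()]   # weekday() gives Monday=0 ... Sunday=6
    out["is_sunday"] = stamp.weekday() == 6           # binary Sunday indicator
    return out

# Hypothetical example row.
features = add_time_features({"datetime": "2011-01-02 05:00:00"})
print(features["hour"], features["day_of_week"], features["is_sunday"])
```

Keeping the hour and weekday as categorical values rather than plain numbers lets tree-based models and regressions treat each level separately, which is the purpose of factorizing them.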
10
Factorized Data:
11
Timeline
Till 20th January: Finalizing RBL topic
20th January – 5th February: Understanding dataset and gaining domain knowledge
6th February – 20th February: Literature survey and methods
21st February – 20th March: Implementation
21st March – 10th April: Testing and improving model
11th April – 30th April: Writing paper
12
Stats: “In the world of data analysis, Analysts require only 20% of the total project time in building the actual models, about 60% of the period is spent in understanding and pre-processing the data” - Mat McHogan, Data Scientist, SVDS.com
14
Any Suggestions?
15
Thank You