Walter Hop: Web-shop Order Prediction Using Machine Learning. Master’s Thesis, Computational Economics.


Background
E-commerce is displacing physical stores; the goal is to maximize customer interest and turnover.
- Physical store: human contact with customers, expert advice
- Web shop: customize the web-shop experience
  - Big spender: upselling, package deals
  - Small spender: discounts
- But who is who?

(News clipping: De Volkskrant, 27 Aug 2013)

Data mining
- Sources: webserver log files, customer database (CRM)
- 400,000 transactions by customers

Data mining
24 attributes known for a customer transaction:
- day and time
- maximum price of products viewed
- price of products put in the shopping basket
- stock levels of products
- customer age, gender, # past orders
- step in the order process
- etc…
Many transactions have some missing data.

Research problem
Order prediction: will the customer place an order during this website visit?
- Which algorithm can best predict order probability, based on transaction data?
- How to deal with missing data?
- Can we improve predictions by combining prediction algorithms?
Train/test data: DMC 2013

Feature extraction
Derive variables for a user session (visit) from all its component transactions, using aggregation operators:
- average: e.g. average price of products
- count: e.g. number of transactions
- sum: e.g. total time spent browsing the site
- standard deviation: e.g. similarity of prices
- etc…
Result: 62 features
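As a sketch, the aggregation step for one session could look like the following. The field names (`price_viewed`, `seconds`) are hypothetical illustrations, not the thesis's actual attributes:

```python
from statistics import mean, pstdev

# One user session = a list of transaction records (hypothetical fields).
session = [
    {"price_viewed": 19.99, "seconds": 40},
    {"price_viewed": 34.50, "seconds": 75},
    {"price_viewed": 24.00, "seconds": 20},
]

def extract_features(transactions):
    """Aggregate per-transaction values into per-session features."""
    prices = [t["price_viewed"] for t in transactions]
    return {
        "avg_price_viewed": mean(prices),                          # average
        "n_transactions": len(transactions),                       # count
        "total_seconds": sum(t["seconds"] for t in transactions),  # sum
        "price_stdev": pstdev(prices),       # standard deviation: price similarity
    }

features = extract_features(session)
```

In the thesis, applying such operators across the 24 raw attributes yields the 62 session-level features.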

Handling missing data (56%)
Options:
- Exclude incomplete sessions from the analysis
- Imputation: replace a missing value by a newly generated value
  - mean imputation
  - predictive imputation
  - unique-value imputation: use an extreme value, such as –1
Imputation method makes no difference in prediction accuracy, so prefer the simple one: unique-value imputation.
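A minimal sketch of unique-value imputation, using `None` to represent a missing value and a sentinel that cannot occur in the real data (field names are hypothetical):

```python
MISSING_SENTINEL = -1.0  # extreme value outside the range of real data

def impute_unique_value(rows, keys):
    """Replace missing (None) values with a fixed sentinel value."""
    return [
        {k: (MISSING_SENTINEL if row.get(k) is None else row[k]) for k in keys}
        for row in rows
    ]

rows = [
    {"age": 34, "past_orders": None},
    {"age": None, "past_orders": 2},
]
clean = impute_unique_value(rows, ["age", "past_orders"])
```

Tree-based models can then treat the sentinel as its own category, which is one reason such simple imputation often suffices.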

Prediction algorithms: Random Forest
- 500 decision trees
- Trees are diverse: bootstrapping, and each tree only looks at some of the variables
+ Very fast
+ Few parameters to set
+ Internal error estimate
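The thesis's models were not built in Python, but a comparable Random Forest in scikit-learn (an assumption for illustration, with synthetic data standing in for the 62 session features) might look like:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the 62 session features and the order/no-order label.
X, y = make_classification(n_samples=400, n_features=62, random_state=0)

clf = RandomForestClassifier(
    n_estimators=500,  # 500 diverse trees, as on the slide
    oob_score=True,    # internal error estimate from the bootstrap samples
    random_state=0,
)
clf.fit(X, y)
oob_accuracy = clf.oob_score_  # accuracy estimate without a separate test set
```

The out-of-bag score is the "internal error estimate" mentioned above: each tree is evaluated on the samples its bootstrap draw left out.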

Prediction algorithms: Support Vector Machine
- Numerical optimization: finds the best separating hyperplane in a high-dimensional space
+ Recognizes complex patterns
+ Guarantees a wide margin
– Slow tuning by cross-validation

Prediction algorithms: Neural network
- Neurons and connections; weights adapted from training examples
+ Similar to the human brain
– Many parameters
– Unstable results
– Not guaranteed to converge
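A minimal multi-layer perceptron sketch in scikit-learn (again an assumption, not the thesis's implementation); the hidden-layer size, learning settings, and iteration budget are exactly the kind of parameters the slide warns about:

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# Weights are adapted from training examples by backpropagation; results
# depend on the random initialization, hence the "unstable results" caveat.
net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
net.fit(X, y)
train_accuracy = net.score(X, y)
```

Changing `random_state` can noticeably change the fitted model, which is the instability noted above.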

Combining algorithms: Stacked generalization
- Meta-classifier: learns from the outputs of various models
- Determines, for each class, the best linear combination of models
- Can use ordinary linear regression or SVM regression
– Requires custom code in R
– Extremely slow training
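The thesis implemented this with custom R code; the idea can be sketched in Python with scikit-learn (an assumption for illustration). The base models are trained on one half of the data, and an ordinary-least-squares meta-model learns a linear combination of their predicted probabilities on the other half:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_base, X_meta, y_base, y_meta = train_test_split(X, y, random_state=0)

# Level-0 base models, trained on one half of the data.
base_models = [
    RandomForestClassifier(n_estimators=100, random_state=0).fit(X_base, y_base),
    SVC(probability=True, random_state=0).fit(X_base, y_base),
]

# Level-1 meta-model: an OLS linear combination of the base models'
# class probabilities on held-out data (as on the slide).
Z = np.column_stack([m.predict_proba(X_meta)[:, 1] for m in base_models])
meta = LinearRegression().fit(Z, y_meta)
stacked_pred = (meta.predict(Z) > 0.5).astype(int)
stack_accuracy = (stacked_pred == y_meta).mean()
```

Note the cost: every base model must be trained and then re-evaluated to produce the meta-features, which is why stacking trains so slowly. (For a fair accuracy estimate the meta-model would also need its own held-out evaluation set; this sketch omits that.)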

Prediction accuracy

Model                          Accuracy
Random Forest                  90.3%
Support Vector Machine         86.8%
Neural network                 75.6%
Stack: ordinary least-squares  90.3%
Stack: SVM regression          90.3%

Random Forest makes the most accurate predictions; stacking does not help in our situation.

DMC 2013 Competition
- 63 teams created predictions; best accuracy = 97.2%
- Winning approach: 160 features; average of 600 C4.5 decision trees, each built from a bootstrap sample (bagging)
- The accuracy difference is likely due to feature extraction

Limitations
- Webserver log data is collected during the visit, but by the end of the visit it may be too late to influence the customer
- When using only customer data, accuracy drops from 90.3% to 69.2%
- Accuracies may vary on other data sets

Principal Findings
- Order prediction is possible to a large degree
- Random Forest highly recommended: good accuracy, fast model
- For prediction problems, simple imputation methods are sufficient; iterate quickly, adding new features
- Stacking not useful in this setting: inflexible, no accuracy gain

Thank you for your attention! lifeforms.nl/thesis

Questions and Comments lifeforms.nl/thesis