Quantitative Research and Analytics, Proprietary and Confidential
Ryan Michaluk
Introduction to Skytree
  Since 2012
  Leading Platform for ML on Big Data
Some laws of data science
  Predictive Accuracy grows with Data Size, Model Complexity, and # of Iterations
  But: Computation Time grows with them too
Allstate - We are …
  Largest public personal lines insurer in US
  16 million households
  40,000 employees
  11,000 agencies
  4 brands
  Auto, home, life, retirement
Data is the foundation of our business
  Quotes – personal, item, offer
  Policies – personal, item, history
  Claims – loss, participant, repair, adjuster notes
  Telematics – events, mileage
  Agencies – policy, sales, region, internal
Data drives our decisions
  Pricing
  Fraud Prevention
  Underwriting
  Marketing
  Customer Experience
  Make the best possible decisions
What is machine learning?
  Statistical models that find patterns
  Descriptive (good): What happened
  Predictive (better): What will happen
  Prescriptive (best): What action will produce the best outcome
Predictive model
  Action -> Predictive Model -> Result
Prescriptive model
  Action -> Predictive Model -> Result
  Best Result -> Do This
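A minimal sketch of the prescriptive step, assuming it amounts to scoring each candidate action with a predictive model and choosing the action with the best predicted result. The `prescribe` function, the action names, and the toy scores are all hypothetical and stand in for a real trained model.

```python
from typing import Callable, Iterable, Tuple


def prescribe(actions: Iterable[str],
              predict: Callable[[str], float]) -> Tuple[str, float]:
    """Score every candidate action and return the one with the best predicted result."""
    scored = [(action, predict(action)) for action in actions]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical stand-in for a trained predictive model: action -> predicted result.
toy_results = {"action_a": 0.12, "action_b": 0.31, "action_c": 0.27}


def toy_predict(action: str) -> float:
    return toy_results[action]


# "Do This" is the action whose predicted result is best.
best_action, best_result = prescribe(toy_results, toy_predict)
```

The predictive model answers "what will happen if we take this action"; the prescriptive layer is the search over actions wrapped around it.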
Machine learning has many benefits
  Make data-driven decisions
  Adapt to change
  Power scales with amount of data
… but can be difficult to implement
  Data Challenges - accessible, usable
  Resource Challenges - people, tools, time
  Cultural Challenges - information is valuable, algorithms are useful
Machine learning creates a big impact
  Improvement comparable to including the most important variable
Machine learning is hard
  Better algorithms take more time: GLM, Random Forest, GBM, K-Means, Topological Methods
  More data takes more time
  Machine learning is a process
Machine learning is iterative
  Model Development Cycle: Change Parameters -> Build Model -> Validate Model -> (repeat)
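The cycle above can be sketched as a simple loop. This is an illustration only: the one-parameter ridge regression, the `penalty` setting, and the toy data are assumptions chosen to keep the example self-contained, not the models or data used at Allstate.

```python
def build_model(train, penalty):
    """Build: fit y ~ slope * x by least squares with a ridge penalty."""
    sxy = sum(x * y for x, y in train)
    sxx = sum(x * x for x, _ in train)
    return sxy / (sxx + penalty)


def validate_model(slope, holdout):
    """Validate: mean squared error on data the model has not seen."""
    return sum((y - slope * x) ** 2 for x, y in holdout) / len(holdout)


def development_cycle(train, holdout, penalties):
    """Run one build/validate pass per parameter setting and keep the best."""
    best = None
    history = []
    for penalty in penalties:                    # change parameters
        slope = build_model(train, penalty)      # build model
        error = validate_model(slope, holdout)   # validate model
        history.append((penalty, error))
        if best is None or error < best[1]:
            best = (penalty, error)
    return best, history


# Toy data following y = 2x (illustrative only).
train = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
holdout = [(4.0, 8.0), (5.0, 10.0)]
best, history = development_cycle(train, holdout, penalties=[0.0, 1.0, 10.0])
```

Every parameter setting costs one full build and one full validation, which is why the number of iterations matters so much to total time.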
Data scientist time
  Data Size, Model Complexity, Computation Speed -> Iteration Time
  Iteration Time, # of Iterations -> Data Scientist Time
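One way to make the relationship concrete is a back-of-the-envelope calculation. The slide names the quantities but not a formula, so the functional form below (iteration time proportional to data size times model complexity, divided by computation speed) is an assumption, and the units and numbers are arbitrary.

```python
def iteration_time(data_size, model_complexity, computation_speed):
    """Assumed form: work grows with data size and model complexity, shrinks with speed."""
    return data_size * model_complexity / computation_speed


def data_scientist_time(data_size, model_complexity, computation_speed, n_iterations):
    """Total time is the time per iteration multiplied by the number of iterations."""
    return iteration_time(data_size, model_complexity, computation_speed) * n_iterations


# Arbitrary illustrative numbers: same data, model, and iteration count,
# with four times the computation speed in the second case.
baseline = data_scientist_time(1_000_000, 10, 1_000_000, 20)
faster = data_scientist_time(1_000_000, 10, 4_000_000, 20)
```

Under this assumed form, raising computation speed is the only lever that cuts total time without giving up data, model complexity, or iterations.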
Reducing time requires tradeoffs
  Avoid hard / large problems
  Reduce data size
  Reduce model complexity
  Reduce # of iterations
  Increase computation speed
  Machine Learning on Hadoop
Is ML on Hadoop right for you?
  Challenges exist
    Algorithms don’t parallelize easily
    More than just model training
  Options
    Build your own
      Exactly what you want, maybe
      Really hard
      Large opportunity cost
    Vended solution
ML on Hadoop is right for Allstate
  Bring ML to data
  Scalable ML environment
  Improve existing solutions
  Tackle new projects
  Data scientists have more time to solve problems
ML on Hadoop is right for Allstate
  Good for the Business
  Good for Data Scientists
Questions