Anthony Goldbloom CEO, Kaggle Predictive modeling competitions Photo by mikebaird,


1 Anthony Goldbloom CEO, Kaggle Predictive modeling competitions Photo by mikebaird, making data science a sport

2 Global competitions: Predicting HIV viral load. State of the art: 70%. After 1½ weeks: 70.8%. Competition closes: 77%.

3 Diverse experts solving diverse problems. Competitions: HIV Research; Stock Price Prediction; Chess Ratings; Travel Time Prediction; Grant Application Forecasting. Participants: Dr. Derek Gatherer (UK); John Blatz (Baltimore); Edmund & Adrian (London & USA); Jason Trigg (Pennsylvania); Chih-Li Sung & Roy Tseng (Penghu & Taipei); Jure Zbontar (Ljubljana); Thomas Mahony (Canberra); Emir Delic (Australia); Glen Maher (Canberra); Chris Raimondi (Baltimore); Claudio Perlich (USA); Gzegorz Swiszcz Gera; Edmund & Adrian (London & USA); Rajstennaj Barrabas (USA); Jason Trigg (Pennsylvania); Felipe Maia (Uppsala University); Lee Baker (Las Cruces, NM); Cole Harris (Texas); Nan Zhou (Pittsburgh); Uri Blass (Tel-Aviv); Giuseppe Ragusa (Rome); Robert (Warsaw); Jeremy Howard (Australia); Ivan (Russian Federation); Chris DuBois (Portland); Philipp Emanuel Widmann (Heidelberg, DE); Dr. Christopher Hefele (New York)

4 1. Motivation 2. Why host a competition? 3. Why compete? 4. How it works 5. Heritage Health Prize 6. Questions

5 "I keep saying the sexy job in the next ten years will be statisticians." Hal Varian, Google Chief Economist, 2009

6 Crowdsourcing: mismatch between those with data and those with the skills to analyse it

7 Countless possible approaches to any data prediction problem. Which to choose?

8 18-year-old beating his professors

9 1. Motivation 2. Why host a competition? 3. Why compete? 4. How it works 5. Heritage Health Prize 6. Questions

10 Tourism Forecasting Competition. Chart: forecast error (MASE) compared with the existing model at Aug 9, 2 weeks later, 1 month later, and competition end.
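The forecast error on this slide is reported as MASE (mean absolute scaled error). As a minimal sketch of how that metric is commonly computed, and not necessarily the exact procedure used in the competition, the forecast's mean absolute error is divided by the in-sample mean absolute error of a naive forecast. The function name `mase`, its parameters, and the sample numbers are illustrative assumptions.

```python
def mase(actual, forecast, history, season=1):
    """Mean absolute scaled error (illustrative sketch).

    actual, forecast: out-of-sample values and their predictions.
    history: in-sample series used to scale the error.
    season: lag of the naive benchmark (1 = "repeat the last value").
    """
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and equal length")
    if len(history) <= season:
        raise ValueError("history must be longer than the seasonal lag")

    # Mean absolute error of the forecast being scored.
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

    # Mean absolute error of the naive forecast on the in-sample history.
    naive_errors = [abs(history[i] - history[i - season])
                    for i in range(season, len(history))]
    naive_mae = sum(naive_errors) / len(naive_errors)
    if naive_mae == 0:
        raise ValueError("naive benchmark has zero error; MASE is undefined")

    return mae / naive_mae


# Made-up numbers: a value below 1 means the forecast beats the naive benchmark.
score = mase(actual=[18, 20], forecast=[17, 22], history=[10, 12, 14, 16])
print(score)
```

A MASE below 1 indicates the forecast is more accurate, on average, than the naive benchmark.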

11 Chess Ratings Competition. Chart: error rate (RMSE) compared with the existing model (Elo) at Aug 4, 1 month later, 2 months later, and today.
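The slide compares entries with the Elo model using RMSE. The sketch below shows the standard Elo expected-score formula and a plain RMSE over game outcomes (1 for a win, 0.5 for a draw, 0 for a loss). The slides do not describe the competition's exact evaluation procedure, so the pairing of the two here, the function names, and the ratings are assumptions for illustration only.

```python
import math


def elo_expected(rating_a, rating_b):
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def rmse(actual, predicted):
    """Root mean squared error between outcomes and predictions."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical games: (rating of A, rating of B, result for A).
games = [(1500, 1500, 0.5), (1900, 1500, 1.0), (1500, 1900, 0.0)]
predictions = [elo_expected(a, b) for a, b, _ in games]
outcomes = [result for _, _, result in games]
print(rmse(outcomes, predictions))
```

A competing rating system would be scored the same way: swap in its predicted scores and compare the resulting RMSE with the Elo baseline.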

12 Our User Base

13 Users apply different techniques: neural networks, logistic regression, support vector machines, decision trees, ensemble methods, AdaBoost, Bayesian networks, genetic algorithms, random forests, Monte Carlo methods, principal component analysis, Kalman filters, evolutionary fuzzy modeling

14 Benchmarking

15

16 Successful grant applications: ~25%. NASA tried, now it's our turn

17 Ideal for complex problems

18 Successful grant applications: ~25%. Outcomes of a competition to predict the success of grant applications:
- Better identify likely successes to avoid wasting resources on hopeless applications
- Identify and communicate the characteristics of a successful application to future applicants

19 1. Motivation 2. Why host a competition? 3. Why compete? 4. How it works 5. Heritage Health Prize 6. Questions

20 Why Participants Compete: clean, real-world data; professional reputation and experience; interactions with experts in related fields; prizes; more fun than Sudoku

21 User base

22

23 1. Motivation 2. Why host a competition? 3. Why compete? 4. How it works 5. Heritage Health Prize 6. Questions

24 1. Upload 2. Submit 3. Evaluate & Exchange

25 Use the wizard to post a competition

26 Participants make their entries

27 Competitions are judged based on predictive accuracy

28 Competition Mechanics Competitions are judged on objective criteria
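Judging on an objective criterion can be pictured as scoring every entry against a held-back answer key and ranking the results. The sketch below uses simple classification accuracy. The function names, team names, and data are hypothetical, and the slides do not say how the platform implements scoring.

```python
def accuracy(solution, submission):
    """Fraction of cases where the submitted prediction matches the answer key.

    Cases missing from the submission count as wrong.
    """
    if not solution:
        raise ValueError("solution must not be empty")
    correct = sum(1 for case_id, truth in solution.items()
                  if submission.get(case_id) == truth)
    return correct / len(solution)


def leaderboard(solution, submissions):
    """Return (team, score) pairs sorted from best to worst score."""
    scores = {team: accuracy(solution, preds)
              for team, preds in submissions.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical answer key and two entries.
solution = {1: 1, 2: 0, 3: 1, 4: 0}
submissions = {
    "team_a": {1: 1, 2: 0, 3: 0, 4: 0},
    "team_b": {1: 1, 2: 0, 3: 1, 4: 0},
}
for team, score in leaderboard(solution, submissions):
    print(team, score)
```

Because the score depends only on the answer key and the submitted predictions, any modeling technique can compete on equal terms.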

29 1. Motivation 2. Why host a competition? 3. Why compete? 4. How it works 5. Heritage Health Prize 6. Questions

30 An upcoming competition, powered by Kaggle. De-identified dataset containing medical records of 100,000 Americans. $3 million prize.

31 Probability of going to hospital in the next year & Unfilled Prescriptions Diabetes & Hypertension & High Cholesterol

32 Netflix Prize, 2006 – 2009: $1 million prize, 50,000 registrations. 2011: $3 million prize, projected 100,000 registrations.

33 1. Motivation 2. Why host a competition? 3. Why compete? 4. How it works 5. Heritage Health Prize 6. Questions

34 Chess Ratings – Elo vs. the Rest of the World; IJCNN Social Network Challenge; Tourism Forecasting (Part 2); Predict Grant Applications

35 Anthony Goldbloom, Jeff Moser, Jeremy Howard, Nicholas Gruen

36 Photo by gidzy, What could the world's best analysts find in your data? phone

