Real-Time Big Data Analytics: From Deployment to Production. David Smith, Revolution Analytics

Presentation transcript:

1 Real-Time Big Data Analytics: From Deployment to Production. David Smith, Revolution Analytics, @revodavid

2

3 Buzzword Bingo! REAL TIME; BIG DATA; PREDICTIVE ANALYTICS

4 Photo: Sarah&Boston (flickr: pocheco), Creative Commons BY-SA 2.0

5 Predictive Analytics: Factors, Model, Scores
Factors: User ID; Browser; Time/Date/Location; Previous purchases; Friend data; Any known information
Predictive Model: Decision Tree; Logistic Regression; Neural Network; K-means clustering; Ensemble Model
Scores: Product of most interest; Offer of most likely sale; Most relevant link; Forecast sale value; Optimal Bid
Also on slide: Prediction or Selection; Scoring Rules
Image: "IO VAPOURA" by Jaya Prime, flickr.com/photos/sanjayaprime/4924462993, CC-BY 2.0
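
The factors-to-scores flow on this slide can be sketched in a few lines. This is only an illustration using logistic regression (one of the model types listed); the factor names, the coefficients, and the `score` and `best_offer` functions are hypothetical and do not come from the talk.

```python
import math

# Hypothetical coefficients; a real model would be estimated from data.
INTERCEPT = -2.0
COEFFICIENTS = {
    "previous_purchases": 0.4,   # number of earlier purchases
    "is_mobile_browser": -0.3,   # 1 if mobile browser, else 0
    "friends_who_bought": 0.25,  # count derived from friend data
}

def score(factors):
    """Return the modeled purchase probability for one set of factors."""
    linear = INTERCEPT + sum(
        COEFFICIENTS[name] * factors.get(name, 0.0) for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-linear))

def best_offer(factors_by_offer):
    """Selection: pick the offer whose factors give the highest score."""
    return max(factors_by_offer, key=lambda offer: score(factors_by_offer[offer]))
```

Calling `score` on a single factor dictionary corresponds to "prediction"; `best_offer` corresponds to "selection" among candidates.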

6 Real-time Deployment
1. Data distillation
2. Model development and validation
3. Model deployment
4. Real-time model scoring
5. Model refresh
Image: "CLOCK" by Heiko Klingele, flickr.com/photos/divdax/3458668053/, CC-BY 2.0

7 1. Data Distillation in Hadoop
Diagram labels: Unstructured Data; Log Files; Sensor Streams; Language Text; HDFS Load; Map-Reduce (rmr); Structured Data; Analytics Data Mart
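
As a rough illustration of the distillation step, the sketch below turns raw log lines into structured counts using a map step and a reduce step. It runs in memory in plain Python and does not use Hadoop or rmr; the tab-separated `user_id<TAB>event` log format and all function names are assumptions made for the example.

```python
from collections import defaultdict

def map_log_line(line):
    """Map step: parse 'user_id<TAB>event' and emit ((user_id, event), 1)."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 2:
        return []  # skip malformed lines
    user_id, event = parts
    return [((user_id, event), 1)]

def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

def distill(lines):
    """Run map then reduce over raw log lines, returning structured counts."""
    emitted = [pair for line in lines for pair in map_log_line(line)]
    return reduce_counts(emitted)
```

For example, `distill(["u1\tview", "u1\tview", "u2\tbuy"])` yields one count per (user, event) pair, which is the kind of structured table a data mart could hold.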

8 2. The Model Development Cycle
Cycle steps: Feature Selection; Sampling; Aggregation; Variable Transformation; Model Estimation; Model Refinement; Model Comparison / Benchmarking
Also on slide: Structured Data; Predictive Model
R White Paper: bit.ly/r-is-hot
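
The sampling and model comparison / benchmarking steps of the cycle can be sketched as a holdout split followed by scoring each candidate on the same held-out rows. This is a generic illustration, not the workflow from the talk; the function names, the accuracy metric, and the 25% holdout fraction are all assumptions.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Sampling: shuffle a copy of the rows and split into train and holdout."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def accuracy(model, rows):
    """Share of (features, label) rows the model classifies correctly."""
    correct = sum(1 for features, label in rows if model(features) == label)
    return correct / len(rows)

def compare_models(models, holdout):
    """Benchmark candidate models on the same holdout; return name -> accuracy."""
    return {name: accuracy(model, holdout) for name, model in models.items()}
```

Here each model is any callable that maps a feature tuple to a predicted label, so refined versions of a model can be compared against the previous one on equal terms.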

9 3. Deployment Options
Factors known in advance: Batch; Lookup Tables
Unknown factors: SQL / Rules Engine; Code (C++, Java, R, Hadoop); PMML Engine
Also on slide: Factors; Scores
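
One way to read the two branches on this slide: when the factor combinations are known in advance, scores can be precomputed in batch and served from a lookup table; otherwise the model has to be evaluated at request time. The sketch below illustrates that split. The `toy_model`, the `(segment, day_of_week)` factor tuples, and the function names are hypothetical.

```python
def build_lookup_table(model, known_factor_sets):
    """Batch step: precompute a score for every factor combination known in advance."""
    return {factors: model(factors) for factors in known_factor_sets}

def score_request(factors, lookup_table, model):
    """Serve from the lookup table when possible; otherwise run the model."""
    if factors in lookup_table:
        return lookup_table[factors], "lookup"
    return model(factors), "model"

# Hypothetical stand-in model: factors are (segment, day_of_week) tuples.
def toy_model(factors):
    segment, day = factors
    base = {"new": 0.1, "returning": 0.3}.get(segment, 0.05)
    return base + (0.05 if day in ("Sat", "Sun") else 0.0)
```

The second element of the returned tuple records which path served the request, which makes it easy to see how often the precomputed table is sufficient.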

10 Why did I buy that blender?
Just browsing in the mall; TV ad / magazine ad; Coupon in the mail; "Just moved" promo email; Webstore recommendation; Browsing catalog

11 UpStream: Attribution Modeling

12 4. Model Scoring
ETL: Marketing channel data; Behavioral variables; Promotional data; Overlay data
Exploratory data analysis
Time-to-event models; GAM survival models
Scoring for inference; Scoring for prediction
5 billion scores per day per retailer
UPSTREAM DATA FORMAT; CUSTOM VARIABLES (PMML)
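
To put the "5 billion scores per day" figure in per-second terms: spread evenly over a day, it is roughly 58,000 scores per second, or about 17 microseconds per score on a single scorer. The small sketch below does that arithmetic; the assumption of a uniform load over 24 hours and the `workers` parameter are illustrative and not from the talk.

```python
SCORES_PER_DAY = 5_000_000_000  # figure quoted on the slide, per retailer
SECONDS_PER_DAY = 24 * 60 * 60

def average_scores_per_second(scores_per_day=SCORES_PER_DAY):
    """Average scoring rate, assuming load is spread evenly over the day."""
    return scores_per_day / SECONDS_PER_DAY

def time_budget_microseconds(scores_per_day=SCORES_PER_DAY, workers=1):
    """Average time available per score on each of `workers` parallel scorers."""
    return workers * SECONDS_PER_DAY / scores_per_day * 1_000_000
```

Peak traffic would be higher than the daily average, so these numbers are a lower bound on the required capacity.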

13 5. Model refresh
Factors; Scores; Actual Outcomes
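
A refresh decision can be driven by comparing stored scores with the outcomes that were later observed. The sketch below uses the Brier score (mean squared error between predicted probabilities and 0/1 outcomes) as the comparison; the choice of metric, the `baseline` value, and the `tolerance` threshold are assumptions for illustration, not the method described in the talk.

```python
def brier_score(scores, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(scores) != len(outcomes) or not scores:
        raise ValueError("scores and outcomes must be non-empty and equal length")
    return sum((s - o) ** 2 for s, o in zip(scores, outcomes)) / len(scores)

def needs_refresh(scores, outcomes, baseline, tolerance=0.02):
    """Flag a refresh when the Brier score is worse than baseline + tolerance."""
    return brier_score(scores, outcomes) > baseline + tolerance
```

Here `baseline` would typically be the Brier score measured when the model was last validated, so a refresh is flagged only when live performance has drifted noticeably from that level.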

14 Big Data / Real Time
Data rates: Kilobytes/Sec; Megabytes/Sec
Data volumes: Gigabytes → Terabytes; Petabytes → Exabytes
Time scales: Seconds; Milliseconds; Minutes; Minutes → Hours

15 PREDICTIVE ANALYTICS; BIG DATA; REAL TIME

16 Real-Time Big Data Predictive Analytics: From Deployment to Production
David Smith, @revodavid
Booth 618 / Office Hours Weds 1:30PM
Revolution Analytics: The leading enterprise provider of software and services for Open Source R
www.revolutionanalytics.com; +1 650 646 9545; Twitter: @RevolutionR

