Presentation is loading. Please wait.

Presentation is loading. Please wait.

Copyright © 2014 Criteo 2014-11-20 15 millions de prédictions par seconde Les défis de Criteo Nicolas Le Roux Scientific Program Manager - R&D.

Similar presentations


Presentation on theme: "Copyright © 2014 Criteo 2014-11-20 15 millions de prédictions par seconde Les défis de Criteo Nicolas Le Roux Scientific Program Manager - R&D."— Presentation transcript:

1 Copyright © 2014 Criteo 2014-11-20 15 millions de prédictions par seconde Les défis de Criteo Nicolas Le Roux Scientific Program Manager - R&D

2 Copyright © 2014 Criteo What does Criteo do? We buy advertising spaces on websites We display ads for our partners We get paid if the user clicks on the ad

3 Copyright © 2014 Criteo Retargeting

4 Copyright © 2014 Criteo In practice 1. A user lands on a webpage 2. The website contacts an ad-exchange 3. The ad-exchange contacts Criteo and its competitors 4. It is an auction: each competitor tells how much it bids 5. The highest bidder wins the right to display an ad

5 Copyright © 2014 Criteo Details of the auction Real-time bidding (RTB) Second-price auction: the winner pays the second highest price Optimal strategy: bid the expected gain Expected gain = price per click (CPC) * probability of click (CTR)

6 Copyright © 2014 Criteo Goal of the arbitrage Choose the appropriate advertiser Only win the profitable displays: it is a prediction problem An overbid means you are losing money An underbid means you are losing potential revenue

7 Copyright © 2014 Criteo What to do once we win the display? We are now directly in contact with the website Choose the best products Choose the color, the font and the layout Generate the banner

8 Copyright © 2014 Criteo Goal of the recommendation Increase the value of a display: it is a ranking problem The potential gains are huge

9 Copyright © 2014 Criteo Real-time constraints at Criteo More than 2 billion banners displayed per day

10 Copyright © 2014 Criteo Real-time constraints at Criteo More than 2 billion banners displayed per day

11 Copyright © 2014 Criteo Real-time constraints at Criteo More than 2 billion banners displayed per day

12 Copyright © 2014 Criteo Fitting within the real-time constraints To avoid timeouts, only simple models are allowed Logistic regression Decision trees What freedom do we have?

13 Copyright © 2014 Criteo Predicting the CTR

14 Copyright © 2014 Criteo Predicting the CTR

15 Copyright © 2014 Criteo A large number of parameters

16 Copyright © 2014 Criteo A large number of parameters

17 Copyright © 2014 Criteo A large number of parameters

18 Copyright © 2014 Criteo Modeling higher-order information A linear model cannot represent higher order information

19 Copyright © 2014 Criteo Modeling higher-order information A linear model cannot represent higher order information E.g. CurrentUrl = "disney.com" and Advertiser = "Guns4Life"

20 Copyright © 2014 Criteo Modeling higher-order information A linear model cannot represent higher order information E.g. CurrentUrl = "disney.com" and Advertiser = "Guns4Life" We model these by creating "cross-features"

21 Copyright © 2014 Criteo Modeling higher-order information

22 Copyright © 2014 Criteo Hashing This model has estimation and computational issues

23 Copyright © 2014 Criteo Hashing

24 Copyright © 2014 Criteo Hashing

25 Copyright © 2014 Criteo Hashing

26 Copyright © 2014 Criteo Hashing

27 Copyright © 2014 Criteo Dealing with collisions

28 Copyright © 2014 Criteo Dealing with collisions

29 Copyright © 2014 Criteo Dealing with collisions

30 Copyright © 2014 Criteo The cost of adding variables « Hey, I thought of this great variable: Time since last product view. Can we add it to the model? »

31 Copyright © 2014 Criteo The cost of adding variables « Hey, I thought of this great variable: Time since last product view. Can we add it to the model? » Storage: #Banners/day x #Days x 4 = 480GB

32 Copyright © 2014 Criteo The cost of adding variables « Hey, I thought of this great variable: Time since last product view. Can we add it to the model? » Storage: #Banners/day x #Days x 4 = 480GB RAM: #Users x #Campaigns x 4 = 40GB

33 Copyright © 2014 Criteo Feature selection About 40 original features

34 Copyright © 2014 Criteo Feature selection About 40 original features 780 level-2 cross-features 9880 level-3 cross-features

35 Copyright © 2014 Criteo Feature selection About 40 original features 780 level-2 cross-features 9880 level-3 cross-features Each feature contains many modalities

36 Copyright © 2014 Criteo Feature selection at scale Greedy methods score the variables independently Group sparsity methods score them jointly but are biased (think L1)

37 Copyright © 2014 Criteo Feature selection at scale Greedy methods score the variables independently Group sparsity methods score them jointly but are biased (think L1) Let us use greedy methods to provide a good initial set of features Refine with group sparsity

38 Copyright © 2014 Criteo Learning the parameters n = 10^9, p = 10^7 Theory tells us that stochastic gradient methods should be used

39 Copyright © 2014 Criteo Learning the parameters - The actual situation Tens of models are trained several times a day Warm starts favor batch methods Not all points are equal Stochastic methods are harder to parallelize.

40 Copyright © 2014 Criteo Criteo's optimizer

41 Copyright © 2014 Criteo Criteo's optimizer

42 Copyright © 2014 Criteo Dealing with spurious clicks Some clicks do not bring sales Some clicks are pure fake We can build a fraud detection system

43 Copyright © 2014 Criteo Dealing with spurious clicks Some clicks do not bring sales Some clicks are pure fake We can build a fraud detection system What is the real issue here?

44 Copyright © 2014 Criteo Dealing with spurious clicks The real goal is to only buy clicks which bring sales

45 Copyright © 2014 Criteo Dealing with spurious clicks The real goal is to only buy clicks which bring sales We have labeled data

46 Copyright © 2014 Criteo Dealing with spurious clicks The real goal is to only buy clicks which bring sales We have labeled data Let's build a sale prediction model.

47 Copyright © 2014 Criteo Are we done yet? We know how to build a good model We know how to train it We are predicting a meaningful target

48 Copyright © 2014 Criteo Simpson’s paradox CTROverall First position60/9000 (0.67%) Second position50/7000 (0.71%)

49 Copyright © 2014 Criteo Simpson’s paradox CTROverallLow-value usersHigh-value users First position60/9000 (0.67%)48/8000 (0.6%)12/1000 (1.2%) Second position50/7000 (0.71%)2/1000 (0.2%)48/6000 (0.8%)

50 Copyright © 2014 Criteo Simpson’s paradox CTROverallLow-value usersHigh-value users First position60/9000 (0.67%)48/8000 (0.6%)12/1000 (1.2%) Second position50/7000 (0.71%)2/1000 (0.2%)48/6000 (0.8%) Because of the underprediction, we might not detect our error.

51 Copyright © 2014 Criteo Offline model evaluation The data collection process depends on the current algorithm Changing the predictor changes the input distribution How can we predict what will happen?

52 Copyright © 2014 Criteo Evaluating the model A/B test Slow Costly

53 Copyright © 2014 Criteo Evaluating the model A/B test Slow Costly Counterfactual analysis: “What would have happened if…?” Based around importance sampling

54 Copyright © 2014 Criteo Counterfactual analysis Requires randomization in real time Requires the ability to replay what happened “Would I have taken another decision?” Extensive logging: ALL campaigns

55 Copyright © 2014 Criteo Using side information There is never enough data With more information, we could learn finer behaviours But there are only so many sales, can we do better?

56 Copyright © 2014 Criteo From unsupervised to weakly supervised learning Unsupervised learning tries to learn about X Weakly supervised learning uses related tasks Long visits on the website Sales which do not follow a click Big data: unstructured targets rather than inputs

57 Copyright © 2014 Criteo Summary for the prediction The technical constraints shape our solution Accuracy is not the only goal Scaling is much more than the quantity of data

58 Copyright © 2014 Criteo Summary for the prediction The technical constraints shape our solution Accuracy is not the only goal Scaling is much more than the quantity of data Which latest developments can we incorporate and benefit from?

59 Copyright © 2014 Criteo Product recommendation

60 Copyright © 2014 Criteo Product recommendation

61 Copyright © 2014 Criteo Two-stage approach Stage 1: Product preselection based on: Popularity Browsing history of the user

62 Copyright © 2014 Criteo Two-stage approach Stage 1: Product preselection based on: Popularity Browsing history of the user Stage 2: Exact scoring using a prediction model

63 Copyright © 2014 Criteo Major hurdles for product recommendation Products come and go There might be a sale (Black Friday) Complementary vs. similar products

64 Copyright © 2014 Criteo Other challenges Multiple products in a banner Interaction between products and layout Different timeframes for different products

65 Copyright © 2014 Criteo A first recipe for success There are many sources of success/failure It is often suboptimal to focus on one The first step to address each source is often manual

66 Copyright © 2014 Criteo Conclusion Prediction is at the core of our business Huge engineering constraints Different bottlenecks than in academia Build from the ground up, not the other way around

67 Copyright © 2014 Criteo This is the end Thank you! Questions?


Download ppt "Copyright © 2014 Criteo 2014-11-20 15 millions de prédictions par seconde Les défis de Criteo Nicolas Le Roux Scientific Program Manager - R&D."

Similar presentations


Ads by Google