Large-Scale Real-Time Product Recommendation at Criteo


1 Large-Scale Real-Time Product Recommendation at Criteo
Simon Dollé RecSys FR, December 1st, 2015

2 Catalog data: feed provided by the merchants
User behavior data: large-scale intent data (all visits to merchant websites; page views, basket, sales events)
Ad display data: displayed and clicked ads

3 We buy Ad spaces

4 We buy Ad spaces We sell Clicks

5 We buy Ad spaces We sell Clicks that convert

6 We buy Ad spaces We sell Clicks that convert a lot

7 We buy Ad spaces We sell Clicks that convert a lot We take the risk

8 displays

9 displays leads to 50 clicks

10 displays leads to 50 clicks leads to 1 sale

11 3 billion ads/day, 3 billion products

12 10ms to pick relevant products

13 7 data centers, 15,000 servers, 1,200-node Hadoop cluster

14 Catalog data: 3B+ products

15 Catalog data: 3B+ products
Browsing history: 2B events / day

16 Catalog data: 3B+ products
Browsing history: 2B events / day
Ad display data: 20B events / day

17 How do we do it?

18 Recommend products for a user
What we want: reco(user) = products
1B users x 3B products! But we need to scale and keep it fresh.
What we can do: pre-select products offline, then refine scoring online to get the final candidates.
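A rough sketch of the offline/online split described on this slide, not the production design. The table `PRESELECTION`, the function `reco`, and the toy scores are hypothetical names and values introduced only for illustration; the user and product counts are the ones quoted on the slide.

```python
# Scoring every (user, product) pair is out of reach at the sizes on the slide.
USERS = 1_000_000_000
PRODUCTS = 3_000_000_000
ALL_PAIRS = USERS * PRODUCTS  # 3 * 10**18 scores for a single full pass

# Hypothetical output of the offline step: a short candidate list per product.
PRESELECTION = {
    "orange_shoes": ["red_shoes", "orange_sandals", "shoe_polish"],
}

# Toy scores standing in for the online model (made-up values).
TOY_SCORES = {"red_shoes": 0.2, "orange_sandals": 0.7, "shoe_polish": 0.4}


def reco(last_viewed, score, top_n=2):
    """Online step: score only the preselected candidates, keep the best top_n."""
    candidates = PRESELECTION.get(last_viewed, [])
    return sorted(candidates, key=score, reverse=True)[:top_n]
```

Calling `reco("orange_shoes", TOY_SCORES.get)` scores three candidates instead of three billion products, which is the point of pre-selecting offline.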

19 Bob saw orange shoes

20 Bob saw orange shoes Some candidate products Historical

21 Bob saw orange shoes Some candidate products Historical Most viewed

22 Bob saw orange shoes Some candidate products Historical Most viewed

23 Bob saw orange shoes Some candidate products Historical Most viewed Similar

24 Bob saw orange shoes Some candidate products Historical Most viewed Similar

25 Bob saw orange shoes Some candidate products Historical Most viewed Similar Complementary

26 Recommendation Service
20K qps

27 Recommendation Service: 20K qps
HADOOP: preselection computation, Map-Reduce jobs
Browsing history: 50B

28 Recommendation Service: 20K qps
Preselections, 12h, 500M
HADOOP: preselection computation, Map-Reduce jobs
Browsing history: 50B
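The slides do not say how the preselection jobs work internally. As one plausible illustration, here is a minimal in-memory sketch of a map/reduce-style co-view count that produces a "similar products" list per product. The function names, the co-view heuristic, and the parameter `k` are assumptions, and a real job would run on Hadoop rather than in a single Python process.

```python
from collections import defaultdict
from itertools import permutations


def map_user_history(products):
    """Map step: emit ((a, b), 1) for every ordered pair one user co-viewed."""
    for a, b in permutations(sorted(set(products)), 2):
        yield (a, b), 1


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each (a, b) key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return totals


def preselect(histories, k=2):
    """Keep, for each product, the k products most often co-viewed with it."""
    emitted = (
        kv for products in histories.values() for kv in map_user_history(products)
    )
    totals = reduce_counts(emitted)
    neighbours = defaultdict(list)
    for (a, b), count in totals.items():
        neighbours[a].append((count, b))
    # Sort by descending count, then by product id to break ties deterministically.
    return {
        a: [b for _, b in sorted(items, key=lambda cb: (-cb[0], cb[1]))[:k]]
        for a, items in neighbours.items()
    }
```

The output is a small lookup table keyed by product, which is the kind of artifact an online service can load and query cheaply.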

29 Online: sources
Similarities, most viewed, most bought

30 Online: merge of products
Similarities, most viewed, most bought
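A minimal sketch of what merging the three sources could look like, assuming each source returns an ordered list of product ids. The function `merge_sources` and the product ids are hypothetical; the slides only name the sources.

```python
def merge_sources(sources):
    """Merge candidate lists, dropping duplicates but remembering their origin.

    `sources` maps a source name to an ordered list of product ids.
    Returns a dict: product id -> list of source names that proposed it,
    in first-seen order (Python dicts preserve insertion order).
    """
    merged = {}
    for name, products in sources.items():
        for product in products:
            merged.setdefault(product, []).append(name)
    return merged


example = merge_sources({
    "similarities": ["p1", "p2"],
    "most_viewed": ["p2", "p3"],
    "most_bought": ["p3", "p4"],
})
```

Keeping the list of originating sources per product is a design choice made here for illustration; it could later serve as an input feature for scoring.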

31 ML model
Logistic regression models, because: they scale, they are fast, and they can handle lots of features.
Product-specific: price, category
User-specific: user segment, user last category
User-product interactions: time since last view, category match
Display-specific: desktop vs mobile
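To make the feature families concrete, here is a small sketch of logistic regression scoring over sparse categorical features. The weights, the bias, the 24-hour recency bucket, and the feature names are all invented for illustration; a real model would learn its weights from display, click and sale logs.

```python
import math

# Made-up weights, one per active categorical feature (assumption).
WEIGHTS = {
    "product_category=shoes": 0.3,
    "user_last_category=shoes": 0.2,
    "category_match=yes": 1.1,
    "recently_viewed=yes": 0.8,
    "device=mobile": -0.2,
}
BIAS = -4.0


def extract_features(product, user, display):
    """Build product, user, user-product and display features as strings."""
    matched = product["category"] == user["last_category"]
    recent = user["hours_since_last_view"] < 24  # hypothetical bucket
    return [
        f"product_category={product['category']}",
        f"user_last_category={user['last_category']}",
        f"category_match={'yes' if matched else 'no'}",
        f"recently_viewed={'yes' if recent else 'no'}",
        f"device={display['device']}",
    ]


def predict(features):
    """Logistic regression: sigmoid of the bias plus the active weights."""
    z = BIAS + sum(WEIGHTS.get(f, 0.0) for f in features)
    return 1.0 / (1.0 + math.exp(-z))
```

Scoring is a handful of dictionary lookups and one exponential per product, which helps explain why this model family fits a tight latency budget.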

32 Recommendation Service: 20K qps
Preselections, 12h, 500M
HADOOP: preselection computation, Map-Reduce jobs
Browsing history: 50B

33 Recommendation Service: 20K qps
Preselections, 6h, 12h, 500M
HADOOP: preselection computation, Map-Reduce jobs, prediction models
Browsing history: 50B

34 Recommendation Service: 20K qps
Display, Click, Sale logs
Preselections, 6h, 12h, 500M
HADOOP: preselection computation, Map-Reduce jobs, prediction models
Browsing history: 50B

35 Recommendation Service: 20K qps
Display, Click, Sale logs
Preselections, 6h, 12h, 500M
HADOOP: preselection computation, Map-Reduce jobs, prediction models
Browsing history: 50B

36 Online: scoring
Similarities, most viewed, most bought

37 Online: scoring
Similarities, most viewed, most bought

38 Online: candidates
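Once each merged product has a score, picking the final candidates amounts to a top-k selection. The sketch below assumes a dict of product id to predicted score; the scores and the number of banner slots are made up for illustration.

```python
import heapq


def pick_candidates(scored, banner_slots=4):
    """Return the banner_slots product ids with the highest score, best first."""
    return heapq.nlargest(banner_slots, scored, key=scored.get)


# Made-up scores for five merged products.
scored_products = {"a": 0.007, "b": 0.004, "c": 0.010, "d": 0.001, "e": 0.006}
final_candidates = pick_candidates(scored_products)
```

`heapq.nlargest` avoids a full sort when only a few winners are needed out of a longer candidate list.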

39 What’s next?

40 What’s next for us: Upcoming challenges
Long(er)-term user profiles

41 What’s next for us: Upcoming challenges
Long(er)-term user profiles More and better product information (images, semantic, NLP)

42 What’s next for us: Upcoming challenges
Long(er)-term user profiles More and better product information (images, semantic, NLP) Instant-update of similarities

43 What’s next for us: Upcoming challenges
Long(er)-term user profiles More and better product information (images, semantic, NLP) Instant-update of similarities Joint product scoring (score full banner and not products independently)

44 What’s next for you: Fancy a try?
On your own: we published datasets for click prediction.
4GB display-click data (Kaggle challenge)
1TB display-click data (industry’s largest dataset): 4 billion observations, 156 billion feature-values, available on Microsoft Azure, used by edX (UC Berkeley)
With us!

45

46 Questions?

47 Thank you!
s.dolle@criteo.com @simondolle @recsysfr
Credits: Creative Stall, Gilbert Bages

