Published by Kylie Perkins. Modified over 2 years ago.

1
Thore Graepel, Online Services and Advertising Group, Microsoft Research Cambridge

2
Complex large-scale data in the enterprise
–What kind of data is available?
–What technologies are used?
–Tasks and enterprise-specific challenges?
Methodology:
–Bayesian Inference in Factor Graph Models
–PQL: Using SQL to describe probability models
Applications:
–Gamer Rating and Matchmaking: TrueSkill
–Click-Through Rate Prediction: AdPredictor
–Large-Scale Recommendations: Matchbox

3
Joint work with Tom Minka & Phillip Trelford

4
Online Services Division
–Web index
–Search and ad click logs (12-15 TB / day)
–Hotmail, instant messaging, Internet Explorer (hundreds of millions of users)
–MSN portal and Bing Maps
Xbox Live Gaming Service
–User transaction log data
–Ranking and matchmaking data
–Game instrumentation for user testing

5
Development and Software Instrumentation
–Watson (customer feedback data)
–Source depot (MS source code, e.g., Office, Windows)
–Multilingual technical documentation
Business
–Customer databases
–Sales and Marketing

6
Prediction of user behaviour and preferences
–Improve web search
–Improve targeting for advertising
–Spam filtering and content prioritisation
Improve user experience
–Matchmaking for games
–Multi-modal user interfaces (Natal, speech)
Improve software development process
–Improve productivity of developers
–Analyse software for defects

7
Relational Databases/SQL
–Great agility for analysis and reliability for business
–Limited scalability
–Need to import data into SQL
Windows HPC
–Complex computations / fine-grained parallelism
–Need to move data to HPC cluster
Cosmos
–Take the computation to the data
–Super efficient stream-based computations

8
[Diagram labels: Cosmos, Stream, Cluster, Machine, Dryad, SCOPE, DryadLINQ, Sputnik]

9
Privacy
–Privacy requirements limit the ways in which data can be used
–Interesting trade-offs (differential privacy)
Incentives
–Data produced by self-interested agents
–Need to design incentive-compatible mechanisms
Exploration/Exploitation
–Results of inference feed back into the business process and determine future observations
–Need to aim at long-term benefits

10

11
Definition: Graphical representation of product structure of a function (Wiberg, 1996)
–Nodes: factors and variables
–Edges: dependencies of factors on variables
Question:
–What are the marginals of the function (all but one variable are summed out)?
–What is the mode of the function?

12
[Figure: factor graph over variables s, s1, s2, t1, t2, d, y. Annotations: Bayes' law; factorising prior; factorising likelihood; sum out latent variables]

13
[Figure: factor graph with variables v, w, x, y, z and factors f1(v,w), f2(w,x), f3(x,y), f4(x,z)]
Observation: Sum of products becomes product of sums of all messages from neighbouring factors to variable!

14
[Figure: factor graph with variables w, x, y, z and factors f2(w,x), f3(x,y), f4(x,z)]
Observation: Factors only need to sum out all their local variables!

15
[Figure: factor graph with variables x, y, z and factors f2(w,x), f3(x,y), f4(x,z)]
Observation: Variables pass on the product of all incoming messages!

16
Three update equations (Aji & McEliece, 1997)
Update equations can be directly derived from the distributive law.
Efficient for messages in the exponential family.
Calculate all marginals at the same time.
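
The three observations from the preceding slides can be sketched in code. The example below is a minimal illustration on the tree with factors f1(v,w), f2(w,x), f3(x,y), f4(x,z); the binary variables and the factor values are invented for illustration and do not come from the talk. It computes the marginal of v once by brute-force summation and once by passing messages, and the two should agree.

```python
from itertools import product

# Hypothetical factor tables over binary variables (values invented for illustration).
f1 = {(v, w): 1.0 + v + 2 * w for v, w in product((0, 1), repeat=2)}
f2 = {(w, x): 2.0 + w * x for w, x in product((0, 1), repeat=2)}
f3 = {(x, y): 1.0 + 3 * x * (1 - y) for x, y in product((0, 1), repeat=2)}
f4 = {(x, z): 0.5 + x + z for x, z in product((0, 1), repeat=2)}


def marginal_v_brute_force():
    """Sum the full product of factors over all assignments of w, x, y, z."""
    out = [0.0, 0.0]
    for v, w, x, y, z in product((0, 1), repeat=5):
        out[v] += f1[v, w] * f2[w, x] * f3[x, y] * f4[x, z]
    return out


def marginal_v_message_passing():
    """Compute the same (unnormalised) marginal with local messages only."""
    # Factor-to-variable: each leaf factor sums out its own local variable.
    m_f3_x = [sum(f3[x, y] for y in (0, 1)) for x in (0, 1)]
    m_f4_x = [sum(f4[x, z] for z in (0, 1)) for x in (0, 1)]
    # Variable-to-factor: x passes on the product of its other incoming messages.
    m_x_f2 = [m_f3_x[x] * m_f4_x[x] for x in (0, 1)]
    # Factor-to-variable: f2 multiplies in the incoming message and sums out x.
    m_f2_w = [sum(f2[w, x] * m_x_f2[x] for x in (0, 1)) for w in (0, 1)]
    # w has a single other neighbour, so it forwards that message unchanged.
    m_w_f1 = m_f2_w
    # The marginal of v is the product of all messages into v (here just one).
    return [sum(f1[v, w] * m_w_f1[w] for w in (0, 1)) for v in (0, 1)]
```

The brute-force version touches all 32 joint assignments, while the message-passing version only ever sums over one variable at a time, which is the saving the distributive law provides.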

17
Problem: The exact messages from factors to variables may not be closed under products.
Solution: Approximate the marginal as well as possible in the sense of minimal KL divergence.
Expectation Propagation (Minka, 2001): Approximate the marginal by moment matching.
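
As a hedged illustration of moment matching, consider a Gaussian N(x; mu, sigma^2) multiplied by a step factor that is 1 for x > 0 and 0 otherwise. The product is a truncated Gaussian, which is not Gaussian, so it is replaced by the Gaussian with the same mean and variance. The choice of factor and the function names below are assumptions made for this sketch; the slide does not specify them.

```python
import math


def normal_pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def normal_cdf(t):
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def match_moments_positive(mu, sigma):
    """Mean and variance of N(mu, sigma^2) restricted to x > 0 (closed form)."""
    t = mu / sigma
    v = normal_pdf(t) / normal_cdf(t)  # additive correction to the mean
    w = v * (v + t)                    # multiplicative correction to the variance
    return mu + sigma * v, sigma * sigma * (1.0 - w)


def numeric_moments_positive(mu, sigma, steps=200_000, width=12.0):
    """Same moments by midpoint-rule integration, as an independent check."""
    upper = max(0.0, mu) + width * sigma
    h = upper / steps
    z = m1 = m2 = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        p = normal_pdf((x - mu) / sigma)
        z += p
        m1 += x * p
        m2 += x * x * p
    mean = m1 / z
    return mean, m2 / z - mean * mean
```

Matching the first two moments is what minimising the KL divergence from the true marginal to a Gaussian approximation amounts to, which is why the closed-form corrections v and w are enough here.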

18
Map-Reduce for IID data
–Map: Nodes compute messages $m_{f_i \to s}$ from data $y_i$ and $m_{s \to f_i}$
–Reduce: Combine messages $m_{f_i \to s}$ into $p(s)$ by multiplication
Caveats:
–All approximate data factors need the incoming message $m_{s \to f_i}$!
–All messages $m_{f_i \to s}$ need to be stored if the same data point is considered multiple times
[Figure: variable s connected to data factors for y1, y2, y3]
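
A minimal sketch of this map-reduce pattern, assuming the simplest possible model: a scalar parameter s with a Gaussian prior and observations y_i = s + Gaussian noise. These data factors are exact, so the map step does not need the incoming message; the caveat above applies once the factors are approximate. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class GaussianMessage:
    """Gaussian in natural parameters, so that products become sums."""
    precision: float        # 1 / variance
    precision_mean: float   # mean / variance

    @property
    def mean(self):
        return self.precision_mean / self.precision

    @property
    def variance(self):
        return 1.0 / self.precision


def multiply(a, b):
    """Product of two Gaussian messages (up to normalisation)."""
    return GaussianMessage(a.precision + b.precision,
                           a.precision_mean + b.precision_mean)


def map_message(y, noise_variance):
    """Map step: message from the data factor for one observation y to s."""
    return GaussianMessage(1.0 / noise_variance, y / noise_variance)


def posterior(prior, data, noise_variance):
    """Reduce step: multiply the prior with all factor-to-variable messages."""
    messages = (map_message(y, noise_variance) for y in data)
    return reduce(multiply, messages, prior)
```

Because multiplication of messages is associative and commutative, the reduce step can be carried out in any order and on any partition of the data, which is what makes the scheme fit a map-reduce cluster.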

19
Joint work with Ralf Herbrich & Jurgen Van Gael

20

21
People = AUGMENT DB.People ADD weight FLOAT

22

23
FACTOR Normal(p.weight, 75.0, 25.0) FROM People p
[Diagram label: People]

24
FACTOR Normal(g.weight, p.weight, 1.0) FROM People p, DrVisit g WHERE p.PersonID = g.PersonID
[Diagram labels: DrVisit, People]
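
To make the PQL fragments concrete, here is a hedged Python sketch of the inference they appear to describe: each person has a latent weight with a Normal prior, and each doctor visit joined on PersonID contributes a Normal measurement factor. It is an assumption of this sketch that the third argument of Normal is a variance; the slides do not say whether it is a variance or a standard deviation. The tables are represented as lists of dictionaries, and the function name is hypothetical.

```python
from collections import defaultdict

# Assumption: Normal(x, mean, variance). The slides do not define the third argument.
PRIOR_MEAN, PRIOR_VARIANCE = 75.0, 25.0
VISIT_VARIANCE = 1.0


def infer_weights(people, dr_visits):
    """Posterior (mean, variance) of each person's latent weight."""
    # Equivalent of the join condition p.PersonID = g.PersonID.
    measured_by_person = defaultdict(list)
    for visit in dr_visits:
        measured_by_person[visit["PersonID"]].append(visit["weight"])

    posteriors = {}
    for person in people:
        pid = person["PersonID"]
        # Prior factor in natural parameters.
        precision = 1.0 / PRIOR_VARIANCE
        precision_mean = PRIOR_MEAN / PRIOR_VARIANCE
        # One measurement factor per joined DrVisit row.
        for measured in measured_by_person[pid]:
            precision += 1.0 / VISIT_VARIANCE
            precision_mean += measured / VISIT_VARIANCE
        posteriors[pid] = (precision_mean / precision, 1.0 / precision)
    return posteriors
```

A person with no visits keeps the prior, while each additional visit pulls the estimate towards the measured values and shrinks the variance.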

25

26
Joint work with Tom Minka & Phillip Trelford

27
Given:
–Match outcomes: Orderings among k teams consisting of n_1, n_2, ..., n_k players, respectively
Questions:
–Skill s_i for each player such that …
–Global ranking among all players
–Fair matches between teams of players

28
[Figure: factor graph with skills s1, s2, s3, s4, team performances t1, t2, t3 and outcomes y12, y23. Labels: Gaussian prior factors; ranking likelihood factors]
Fast and efficient approximate message passing using Expectation Propagation
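
For the simplest case of two single-player teams and no draws, the Expectation Propagation update has a closed form. The sketch below follows the published two-player TrueSkill update; the default values mu = 25, sigma = 25/3 and beta = 25/6 are commonly quoted defaults and are assumptions here, not values taken from the slides. Draws, teams and skill dynamics are left out.

```python
import math
from typing import NamedTuple


class Rating(NamedTuple):
    mu: float      # mean skill
    sigma: float   # standard deviation of the skill belief


BETA = 25.0 / 6.0  # assumed performance noise (not stated on the slides)


def _pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def _cdf(t):
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def update_after_win(winner, loser, beta=BETA):
    """Return the updated (winner, loser) ratings after one decisive game."""
    c2 = 2.0 * beta ** 2 + winner.sigma ** 2 + loser.sigma ** 2
    c = math.sqrt(c2)
    t = (winner.mu - loser.mu) / c
    v = _pdf(t) / _cdf(t)   # mean correction from moment matching
    w = v * (v + t)         # variance correction from moment matching

    def apply(rating, sign):
        var = rating.sigma ** 2
        new_mu = rating.mu + sign * var / c * v
        new_var = var * (1.0 - var / c2 * w)
        return Rating(new_mu, math.sqrt(new_var))

    return apply(winner, +1.0), apply(loser, -1.0)
```

An unexpected result (small or negative t) gives large corrections, while an expected result barely moves the means, and players with larger sigma move further than players whose skill is already well known.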

29
[Plot: "TrueSkill: Superfast convergence to True Skills". Y-axis: Level. X-axis: Games played. Series: char (Halo 2 Beta), SQLwildman (Halo 2 Beta), char (TrueSkill), SQLwildman (TrueSkill)]

30
Leaderboard
–Global ranking of all players
Matchmaking
–For gamers: Most uncertain outcome
–For inference: Most informative
–Both are equivalent!
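
One way to read "most uncertain outcome" is to pick the opponent for whom the predicted win probability is closest to 50%. The sketch below does this for two-player games under Gaussian skill beliefs. It is an illustration only: the selection rule, the beta value and the function names are assumptions, and a production system may use a different match-quality criterion.

```python
import math

BETA = 25.0 / 6.0  # assumed performance noise (not stated on the slides)


def win_probability(mu_a, sigma_a, mu_b, sigma_b, beta=BETA):
    """P(player a beats player b) under Gaussian skills and performances."""
    scale = math.sqrt(2.0 * beta ** 2 + sigma_a ** 2 + sigma_b ** 2)
    return 0.5 * math.erfc(-(mu_a - mu_b) / (scale * math.sqrt(2.0)))


def most_uncertain_opponent(player, candidates):
    """Name of the candidate whose match against `player` is closest to 50/50.

    `player` is a (mu, sigma) pair; `candidates` maps names to (mu, sigma) pairs.
    """
    mu, sigma = player
    return min(
        candidates,
        key=lambda name: abs(win_probability(mu, sigma, *candidates[name]) - 0.5),
    )
```

A game whose outcome is close to a coin flip is also the game whose result changes the skill beliefs most, which is the equivalence the slide points to.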

31

32
Joint work with Joaquin Quiñonero Candela, Onno Zoeter, Tom Borchert, Phillip Trelford

33
Advantages of improved probability estimates:
–Increase user satisfaction by better targeting
–Fairer charges to advertisers
–Increase revenue by showing ads with high click-through rate

Display (according to expected revenue) and charge (per click):

Bid      P(click)   Expected revenue   Charge per click
$1.00    10%        $0.10              $0.80
$2.00    4%         $0.08              $1.25
$0.10    50%        $0.05              $0.05
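
The arithmetic on this slide can be reproduced with a short sketch. Ads are ranked by bid times click probability, and the first two charges on the slide match a rule in which each ad pays the expected revenue of the next-ranked ad divided by its own click probability. That rule is inferred from the numbers, not stated in the source. The slide gives $0.05 for the last ad without showing how it is derived, so the sketch takes the last charge as a parameter. All names are hypothetical.

```python
def rank_and_price(ads, last_charge):
    """Rank ads by expected revenue and compute a per-click charge for each.

    `ads` is a list of dicts with keys "bid" and "p_click".
    `last_charge` is the per-click charge for the lowest-ranked ad (assumption).
    """
    ranked = sorted(ads, key=lambda ad: ad["bid"] * ad["p_click"], reverse=True)
    results = []
    for position, ad in enumerate(ranked):
        expected_revenue = ad["bid"] * ad["p_click"]
        if position + 1 < len(ranked):
            runner_up = ranked[position + 1]
            # Smallest per-click price that still keeps this ad above the next one.
            charge = runner_up["bid"] * runner_up["p_click"] / ad["p_click"]
        else:
            charge = last_charge
        results.append({
            "bid": ad["bid"],
            "p_click": ad["p_click"],
            "expected_revenue": expected_revenue,
            "charge": charge,
        })
    return results


SLIDE_ADS = [
    {"bid": 1.00, "p_click": 0.10},
    {"bid": 2.00, "p_click": 0.04},
    {"bid": 0.10, "p_click": 0.50},
]
```

Since both the ranking and the charge depend on the click probability, a miscalibrated estimate changes which ad is shown and what the advertiser pays, which is why the slide stresses accurate probability estimates.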

34
[Figure: input features Client IP, Match Type (Exact Match, Broad Match) and Position (ML-1, SB-1, SB-2) are summed to give P(pClick)]

35
[Figure: factor graph with weights w1, w2 summed into a score s, which determines the outcome c (Click / No Click)]
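
The graph above corresponds to a probit model with a Gaussian belief over each feature weight. The following sketch shows prediction and a single online update in the form published for this kind of model; the value of beta, the feature names and the dictionary layout are assumptions for illustration, and refinements such as priors per feature group or forgetting are omitted.

```python
import math

BETA = 1.0  # assumed noise standard deviation on the score (hypothetical value)


def _pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def _cdf(t):
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def predict(weights, active, beta=BETA):
    """Predicted click probability for an impression.

    `weights` maps feature name -> (mean, variance); `active` lists the
    features that are switched on for this impression.
    """
    total_mean = sum(weights[f][0] for f in active)
    total_var = beta ** 2 + sum(weights[f][1] for f in active)
    return _cdf(total_mean / math.sqrt(total_var))


def update(weights, active, clicked, beta=BETA):
    """Return new weight beliefs after observing a click or a non-click."""
    y = 1.0 if clicked else -1.0
    total_mean = sum(weights[f][0] for f in active)
    total_var = beta ** 2 + sum(weights[f][1] for f in active)
    total_std = math.sqrt(total_var)
    t = y * total_mean / total_std
    v = _pdf(t) / _cdf(t)
    w = v * (v + t)
    new_weights = dict(weights)
    for f in active:
        mean, var = weights[f]
        new_weights[f] = (mean + y * var / total_std * v,
                          var * (1.0 - var / total_var * w))
    return new_weights
```

Because the prediction divides by the total uncertainty, an ad with little data is pulled towards 50% rather than towards an overconfident estimate.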

36

37

38
AdPredictor now serves 100% of paid search traffic in Microsoft's Bing search engine.
Relevance and click-through rate of ads improved.
Calibrated CTR prediction provides a solid foundation for further improvements.
AdPredictor is being explored for other tasks such as contextual and display advertising.

39
Joint work with David Stern and Ralf Herbrich

40
[Figure: users A, B, C, D and items, with unknown ratings ("?") between them. Annotation: Metadata?]

41
[Figure: user features: User ID; Gender (Male, Female); Country (UK, USA); Height (1.2m). Item features: Item ID; Movie Genre (Horror, Drama, Documentary, Comedy)]

42
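[Figure: factor graph in which user trait variables s1, s2 are sums of latent user weights (u11, u21, u12, u22, u01, u02), item trait variables t1, t2 are sums of latent item weights (v11, v21, v12, v22), and the rating r is a sum of the products of matching traits]
Message update functions powered by Infer.NET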
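
Read as a point-estimate model, the graph describes a bilinear predictor: each active user feature contributes a latent vector to the user's trait vector, each active item feature does the same for the item, and the predicted affinity is the inner product of the two trait vectors. The sketch below ignores the uncertainty that the full model tracks, and every feature name and number in it is invented for illustration.

```python
def trait_vector(latent, active_features):
    """Sum the latent vectors of the active features (one '+' node per trait)."""
    dims = len(next(iter(latent.values())))
    total = [0.0] * dims
    for feature in active_features:
        for d, value in enumerate(latent[feature]):
            total[d] += value
    return total


def predicted_affinity(user_latent, user_features, item_latent, item_features):
    """Inner product of user and item trait vectors ('*' nodes, then '+')."""
    s = trait_vector(user_latent, user_features)
    t = trait_vector(item_latent, item_features)
    return sum(a * b for a, b in zip(s, t))


# Hypothetical two-dimensional latent weights.
USER_LATENT = {"gender=female": [1.0, 0.0], "country=UK": [0.5, -1.0]}
ITEM_LATENT = {"genre=horror": [2.0, 1.0], "genre=comedy": [-1.0, 0.5]}
```

Because traits are built from metadata features as well as IDs, a new user or item with known metadata can still receive a prediction before any ratings have been observed.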

43
Preference Cone for user

44

45

46
Great variety of data sources and tasks
Challenges: privacy, incentives, exploration
Tools: SQL, No-SQL, HPC
Modelling platform (Factor Graphs & PQL):
–Represent uncertainty
–Composable models
–Distributed, data-centric computation
Applications: TrueSkill, AdPredictor, Matchbox
Thanks!
