Published by Daniel Folds. Modified over 2 years ago.

1
TAU Agent
Team: Yishay Mansour, Mariano Schain
Tel Aviv University
TAC-AA 2010

2
Overview
Machine Learning approach:
–Regret Minimization
–Simple: adaptive scheme
–Robust: performance bounds
Low dependency on the exact models
Started (very) late:
–3 weeks (for everything)
–Influenced many of the strategic decisions

3
Regret Minimization: Overview
Setting: single player, multiple actions
At every time step:
–Player chooses a distribution over actions
–Player observes the gain of each action
The model can even be adversarial
Partial information model (multi-armed bandit, MAB)
Goal:
–Maximize cumulative gain
Benchmark:
–Best static choice of action (external regret)
Guarantee:
–Near optimal w.r.t. the benchmark
–Vanishing average regret
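To make the benchmark concrete, here is a minimal sketch of how external regret could be computed after the fact in the full-information setting. The function name and the list-of-lists layout are assumptions made for illustration; they do not come from the slides.

```python
def external_regret(gains, dists):
    """External regret of a sequence of action distributions.

    gains[t][i] is the gain of action i at time step t.
    dists[t][i] is the probability the player put on action i at step t.
    (Both layouts are illustrative assumptions.)
    """
    num_actions = len(gains[0])
    # Expected cumulative gain of the player.
    player_gain = sum(
        sum(p * g for p, g in zip(dist, gain))
        for dist, gain in zip(dists, gains)
    )
    # Cumulative gain of the best single action in hindsight.
    best_static_gain = max(
        sum(gain[i] for gain in gains) for i in range(num_actions)
    )
    return best_static_gain - player_gain


gains = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
dists = [[0.5, 0.5]] * 3
regret = external_regret(gains, dists)
average_regret = regret / len(gains)
```

"Vanishing average regret" means that `average_regret` tends to zero as the number of time steps grows.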

4
RM Algorithm (full information)
Main idea: Smoothed Greedy
–Best action: highest weight
–Near-best action: high weight
–Inferior actions: low weight
Non-trivial analysis
Many algorithms
Polynomial Weights:
–Parameter u
–Maintain weights w_{i,t}; play p_{i,t} = w_{i,t} / W_t, where W_t = sum_i w_{i,t}
–Initially w_{i,1} = 1, W_1 = m
–At time step t, with observed gains g_{i,t-1}: w_{i,t} = w_{i,t-1} (1 + u g_{i,t-1})
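The Polynomial Weights update above can be sketched in a few lines. This is an illustrative sketch, not the team's code: it assumes gains are scaled to [0, 1] and that u is a small positive step size, and the function name is hypothetical.

```python
def polynomial_weights(gain_rows, u=0.1):
    """Run the Polynomial Weights update over a sequence of gain vectors.

    gain_rows[t][i] is the observed gain of action i at step t,
    assumed here to lie in [0, 1]. Returns the distribution played
    at each step and the final weights.
    """
    num_actions = len(gain_rows[0])
    weights = [1.0] * num_actions          # w_{i,1} = 1, so W_1 = m
    history = []
    for gains in gain_rows:
        total = sum(weights)               # W_t
        probs = [w / total for w in weights]   # p_{i,t} = w_{i,t} / W_t
        history.append(probs)
        # w_{i,t+1} = w_{i,t} * (1 + u * g_{i,t})
        weights = [w * (1.0 + u * g) for w, g in zip(weights, gains)]
    return history, weights


history, final_weights = polynomial_weights([[1.0, 0.0], [1.0, 0.0]], u=0.5)
```

With two actions where only the first one pays, the probability of the first action rises from 0.5 to 0.6 after a single update, which is the "smoothed greedy" behavior described on the slide.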

5
Applying Regret Minimization to AA: Challenges
Partial Information
–Explore vs. exploit
–Partial-information (MAB) regret minimization algorithms exist, with similar regret bounds
–Higher dependency on the size of the action space
–More time for initial exploration
Very Large Action Space
–Action = (bid, ad type, budget limit) for every query
–Observed gain = Value Per Unit Sold for every query
–Theoretical results may not directly apply
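As an illustration of a partial-information regret minimization algorithm, the sketch below implements Exp3, a standard bandit algorithm. The slides do not say which bandit algorithm, if any, the agent used, so Exp3 is a stand-in chosen for this example; the function name, the `gain_fn` callback, and the parameter values are assumptions. Gains are assumed to lie in [0, 1].

```python
import math
import random


def exp3(gain_fn, num_actions, rounds, gamma=0.1, seed=0):
    """Exp3: only the gain of the chosen action is observed each round.

    gain_fn(t, action) returns the gain in [0, 1] of the action played
    at round t. Returns the final distribution over actions.
    """
    rng = random.Random(seed)
    weights = [1.0] * num_actions

    def distribution():
        total = sum(weights)
        # Mix the weight-based distribution with uniform exploration.
        return [(1.0 - gamma) * w / total + gamma / num_actions
                for w in weights]

    for t in range(rounds):
        probs = distribution()
        action = rng.choices(range(num_actions), weights=probs)[0]
        gain = gain_fn(t, action)
        # Importance-weighted estimate: unbiased for the unseen gains.
        estimate = gain / probs[action]
        weights[action] *= math.exp(gamma * estimate / num_actions)
        # Rescale to avoid floating-point overflow; probabilities unchanged.
        top = max(weights)
        weights = [w / top for w in weights]
    return distribution()


fixed_gains = [0.1, 0.5, 0.9]
final_probs = exp3(lambda t, a: fixed_gains[a], num_actions=3, rounds=3000)
```

The `gamma / num_actions` exploration term is where the higher dependency on the action space shows up: every action must keep being sampled, so a larger action space means more rounds spent exploring.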

6
The elements of the TAU scheme
(Almost) constant high bids on specialty queries:
–Reduce action space!
–Win an impression for every user in the population, which eases exploration!
–Also: high conversion rate, high click-through rate, high revenue
Adaptive score, based on Value Per Unit Sold:
–Main limitation is capacity units
–Use regret minimization to select action distribution
Fractional allocation of capacity based on score:
–Based on regret minimization output
–Profitable queries get most of the capacity
–Maintain exploration: a minimum budget to probe all queries and adapt to trends
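The fractional allocation with an exploration floor might look roughly like the sketch below. The slides do not give the actual rule, so the proportional split, the `min_share` parameter, and the query names are all hypothetical choices made for illustration.

```python
def allocate_capacity(scores, capacity, min_share=0.02):
    """Split capacity across queries in proportion to their scores.

    scores maps a query name to a non-negative score (for example, a
    weight produced by a regret minimization algorithm). Every query
    keeps at least min_share of the capacity so it can still be probed.
    """
    num_queries = len(scores)
    if num_queries * min_share > 1.0:
        raise ValueError("min_share is too large for this many queries")
    total_score = sum(scores.values())
    remaining = 1.0 - num_queries * min_share  # share split by score
    allocation = {}
    for query, score in scores.items():
        # Fall back to a uniform split when all scores are zero.
        share = score / total_score if total_score > 0 else 1.0 / num_queries
        allocation[query] = capacity * (min_share + remaining * share)
    return allocation


quota = allocate_capacity(
    {"query_a": 3.0, "query_b": 1.0, "query_c": 0.0},
    capacity=100.0,
    min_share=0.1,
)
```

In this example the highest-scoring query receives 62.5 units, while the zero-score query still receives its floor of 10 units, so it continues to be explored and can recover if trends change.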

7
Software
[Diagram. Modules: Overall Capacity Control; Analysis (score, estimates); Allocation; Bid (cpc, limit). Labels on the connections: reports, sales, quota, scores, est. sales, est. cpc, est. convrate.]

8
Plans / Enhancements
Features:
–Burst identification
–Bottom fishing
–Tuning parameters to capacity
–ML to estimate sales
–Reinforcement learning of capacity allocation decisions
Post-competition analysis: validate robustness
–Varying game simulation parameters

9
Thank You
Mariano Schain
