Using Value of Information to Learn and Classify under Hard Budgets
Russell Greiner, Daniel Lizotte, Aloak Kapoor, Omid Madani
Dept of Computing Science, University of Alberta; Yahoo! Research

Task:
– Need a classifier for diagnosing cancer

Given:
– pool of patients whose subtype is known; feature values are NOT known
– cost c(X_i) of purchasing feature X_i
– budget for purchasing feature values

Produce:
– classifier to predict the subtype of a novel instance, based on the values of its features
– … learned using only the features purchased

Process:
– Initially, learner R knows NO feature values
– At each time, R can purchase the value of one feature of one instance, at its cost, based on the results of prior purchases … until exhausting the fixed budget
– Then R produces a classifier

Challenge:
– At each time, what should R purchase? Which feature of which instance?
– Purchasing a feature value… alters the accuracy of the classifier, and decreases the remaining budget

Quality:
– accuracy of the classifier obtained
– REGRET: difference between this classifier and the optimal one

Simpler Task
Task: Determine which coin has the highest P(head) … based on the results of only 20 flips
[Figure: coins C_1…C_7 with running heads/tails tallies as the flip budget drops from 20 to 0; once the budget is exhausted, the selector answers "Which coin??" with, e.g., "C_7".]

Original Task
[Figure: a pool of instances with features X_1…X_4 (costs 30, 15, 10, 20) and known labels Y but unknown feature values; the selector repeatedly asks "Which feature of which instance??" as the budget drops from $95 to $0, and the learner then produces the classifier.]

Bayesian Framework:
– Coin C_i drawn from Beta(a_i, b_i) ⇒ MDP
– State = ⟨a_1, b_1, …, a_k, b_k, r⟩
– Action = "Flip coin i"
– Reward = 0 while r > 0; when r = 0, reward = max_i { a_i / (a_i + b_i) }
– Solving for the optimal purchasing policy is NP-hard ⇒ develop tractable heuristic policies that perform well

Heuristic Policies
Round Robin
– Flip C_1, then C_2, then …
Biased Robin
– Flip C_i
– If heads, flip C_i again; else flip C_{i+1}
Greedy Loss Reduction
– Loss1(C_i) = loss of flipping C_i once
– Flip C* = argmin_i { Loss1(C_i) }, once
Single Feature Lookahead (k)
– SFL(C_i, k) = loss of spending k flips on C_i
– Flip C* = argmin_i { SFL(C_i, k) }, once
(A small Python sketch of these policies appears below, after the related-work notes.)

[Figure: the instance table filling in as feature values are purchased and the budget drops from $100 to $85 to …]

A is an APPROXIMATION algorithm iff A's regret is bounded by a constant worse than optimal (for any budget, #coins, …).
NOT approximation algorithms: Round Robin, Random, Greedy, Interval Estimation.

[Results plots: Beta(1,1), n=10, b=10; Beta(1,1), n=10, b=40; Beta(10,1), n=10, b=40]

Results:
– The obvious approach, round robin, is NOT good!
– Contingent policies work best
– Important to know/use the remaining budget

http://www.cs.ualberta.ca/~greiner/BudgetLearn (UAI'03; UAI'04; COLT'04; ECML'05)

Related Work
– Not a standard bandit problem: pure exploration for "b" steps, then a single exploit
– Not on-line learning: no "feedback" until the end
– Not PAC-learning: fixed #instances; NOT "polynomial"
– Not standard experimental design
– This is simple active learning; general budgeted learning is different …

Use a NaïveBayes classifier as…
– it handles missing data ⇒ no feature interaction; each +class instance is "the same"
– … only O(N) parameters to estimate

[Diagram: regret vs. budget, comparing the optimal policy with an algorithm A; the gap r_A is A's regret.]
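The coin task and the heuristic policies above can be made concrete with a small simulation. The following is a minimal Python sketch, not the authors' code: Beta posteriors per coin plus Round Robin, Biased Robin, and Single Feature Lookahead (k = 1 gives Greedy Loss Reduction). The helper names (beta_binomial_pmf, sfl_score, run), the policy signatures, the state["current"] bookkeeping, and the default k are my own choices for illustration; the scoring uses the slide's terminal reward max_i a_i/(a_i+b_i), so maximizing expected reward ranks coins the same way as minimizing the corresponding loss.

```python
import math
import random

def beta_binomial_pmf(h, k, a, b):
    """P(h heads in k flips) when P(head) ~ Beta(a, b)."""
    log_beta = lambda x, y: math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)
    return math.exp(math.log(math.comb(k, h))
                    + log_beta(a + h, b + k - h) - log_beta(a, b))

def sfl_score(i, coins, k):
    """Expected terminal reward (max posterior mean) if the next k flips
    are all spent on coin i; other coins' posteriors stay fixed."""
    a, b = coins[i]
    best_other = max(aj / (aj + bj) for j, (aj, bj) in enumerate(coins) if j != i)
    total = 0.0
    for h in range(k + 1):
        mean_i = (a + h) / (a + b + k)
        total += beta_binomial_pmf(h, k, a, b) * max(mean_i, best_other)
    return total

def round_robin(t, coins, budget_left, state):
    # Flip C_1, then C_2, then ..., cycling through the coins.
    return t % len(coins)

def biased_robin(t, coins, budget_left, state):
    # Stay on the same coin after a head, move to the next after a tail
    # (state["current"] is kept up to date by the run() loop below).
    return state["current"]

def single_feature_lookahead(t, coins, budget_left, state, k=5):
    # Flip (once) the coin with the best k-step lookahead score;
    # with k = 1 this is Greedy Loss Reduction.
    horizon = min(k, budget_left)
    return max(range(len(coins)), key=lambda i: sfl_score(i, coins, horizon))

def run(policy, true_p, budget=20, prior=(1, 1), seed=0, **policy_kwargs):
    """Simulate one budgeted run and return the index of the reported coin."""
    rng = random.Random(seed)
    coins = [list(prior) for _ in true_p]          # Beta parameters [a_i, b_i]
    state = {"current": 0}
    for t in range(budget):
        i = policy(t, coins, budget - t, state, **policy_kwargs)
        heads = rng.random() < true_p[i]
        coins[i][0 if heads else 1] += 1           # Bayesian posterior update
        state["current"] = i if heads else (i + 1) % len(true_p)
    # Budget exhausted: report the coin with the highest posterior mean.
    return max(range(len(true_p)), key=lambda j: coins[j][0] / sum(coins[j]))
```

For example, run(single_feature_lookahead, true_p=[0.3, 0.5, 0.7], budget=20, k=5) returns the coin the learner would report after its 20 purchased flips; averaging the reported coin's true P(head) over many seeds estimates each policy's regret.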
Extended Task
So far, the LEARNER (researcher) has to pay for features… but the CLASSIFIER ("MD") gets ALL feature values for free!
Typically the MD also has constraints [… capitation …]
⇒ Extended model: hard budget for BOTH learner & classifier
Eg: spend b_L = $10,000 to learn a classifier that can spend only b_C = $50/patient…

Classifier = "Active Classifier" [GGR, 2002]
– a policy (decision tree) that sequentially gathers information about the instance, until rendering a decision
– (must make the decision by depth ≤ b_C)

Learner
– spends b_L gathering information ⇒ posterior distribution P(·) [using the naïve Bayes assumption]
– uses a dynamic program to find the best cost-b_C policy for P(·)
– Double dynamic program! Too slow ⇒ use heuristic policies

Issues:
– Use "analogous" heuristic policies
– Round-robin (the standard approach) is still bad
– SingleFeatureLookahead: how far ahead, i.e. which k in SFL(k)?

[Results plots: Glass – identical feature costs (b_C = 3); Heart Disease – different feature costs (b_C = 7)]

Randomized SFL
– Flip C_i with probability ∝ exp( SFL(C_i, k) ), once (see the sampling sketch below)

Issues:
– Round-robin is still bad… very bad…
– Randomized SFL is best
– (Deterministic) SFL is "too focused"
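A minimal sketch of the Randomized SFL rule, under my reading of "flip C_i with probability ∝ exp(SFL(C_i, k))"; it is not the authors' code. It takes a list of lookahead scores (e.g. produced by sfl_score from the earlier sketch) and samples the next coin to flip; the temperature knob and the max-subtraction for numerical stability are added assumptions, not something stated on the poster.

```python
import math
import random

def randomized_sfl_choice(scores, temperature=1.0, rng=random):
    """Sample index i with probability proportional to exp(scores[i] / T)."""
    m = max(scores)                              # subtract max for stability
    weights = [math.exp((s - m) / temperature) for s in scores]
    total = sum(weights)
    r = rng.random() * total
    for i, w in enumerate(weights):
        r -= w
        if r <= 0.0:
            return i
    return len(weights) - 1                      # guard against rounding error
```

Usage would look like next_coin = randomized_sfl_choice([sfl_score(i, coins, k) for i in range(len(coins))]). The randomization spreads purchases across several promising features instead of repeatedly buying the single apparently best one, which is the "too focused" failure mode noted above for deterministic SFL.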

