Optimal Stopping - Amit Goyal, 27th Feb 2007

2 Overview Why? What? How? Example References

3 WHY?
House selling: You have a house and wish to sell it. Each day you are offered X_n for your house, and pay k to continue advertising it. If you sell your house on day n, you earn y_n = X_n - n*k. You wish to choose a stopping rule that maximises the amount you earn.
Secretary problem: You are observing a sequence of objects which can be ranked from best to worst. You wish to choose a stopping rule which maximises your chance of picking the best object.
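The house-selling problem can be made concrete with a small simulation. This is a sketch under assumptions not stated on the slide: offers are taken to be i.i.d. Uniform(0, 1), and the rule simulated is a simple (not necessarily optimal) threshold rule; the function name `sell_house` and all parameters are hypothetical.

```python
import random

def sell_house(threshold, k=0.01, max_days=1000, seed=0):
    """Accept the first offer >= threshold; each day of advertising
    costs k.  Offers X_n ~ Uniform(0, 1) are an illustrative assumption.
    Returns the reward y_n = X_n - n*k at the day of sale."""
    rng = random.Random(seed)
    for n in range(1, max_days + 1):
        offer = rng.random()
        if offer >= threshold:
            return offer - n * k
    return -max_days * k  # never sold within the horizon

# Average reward over many runs for a few thresholds: too low a
# threshold sells cheap, too high a threshold pays too much in fees.
for t in (0.5, 0.8, 0.95):
    avg = sum(sell_house(t, seed=s) for s in range(2000)) / 2000
    print(f"threshold {t}: average reward {avg:.3f}")
```

This already shows the core tension of optimal stopping: the stopping rule, not just the offers, determines the expected reward.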

4 WHAT? (Problem Definition)
A stopping rule problem is defined by two objects:
- A sequence of random variables, X_1, X_2, ..., whose joint distribution is assumed known
- A sequence of real-valued reward functions, y_0, y_1(x_1), y_2(x_1, x_2), ..., y_∞(x_1, x_2, ...)
Given these objects, the problem is as follows:
- You observe the sequence of random variables and, at each step n, choose either to stop observing or to continue
- If you stop at step n, you receive the reward y_n
- You want to choose a stopping rule Φ to maximize your expected reward (or minimize the expected loss)

5 Stopping Rule
A stopping rule Φ consists of a sequence Φ = (Φ_0, Φ_1(x_1), Φ_2(x_1, x_2), ...), where
- Φ_n(x_1, ..., x_n) is the probability that you stop after step n
- 0 <= Φ_n(x_1, ..., x_n) <= 1
If N is the random variable giving the time at which you stop, its probability mass function ψ is defined as
ψ_0 = Φ_0
ψ_n(x_1, ..., x_n) = (1 − Φ_0)(1 − Φ_1(x_1)) ... (1 − Φ_{n−1}(x_1, ..., x_{n−1})) Φ_n(x_1, ..., x_n)
i.e. the probability of not stopping at any earlier step, times the probability of stopping at step n.
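The pmf ψ of the stopping time N can be computed mechanically from the stop probabilities. A minimal sketch, under the simplifying assumption that each Φ_n is a constant (ignoring its dependence on the observed x's); the function name `stop_time_pmf` is hypothetical:

```python
def stop_time_pmf(phi):
    """Given stop probabilities phi[0..T] (constants for simplicity),
    return the pmf psi of the stopping time N:
        psi[n] = (1 - phi[0]) * ... * (1 - phi[n-1]) * phi[n]."""
    psi, survive = [], 1.0
    for p in phi:
        psi.append(survive * p)   # stop exactly at this step
        survive *= 1.0 - p        # probability of continuing past it
    return psi

pmf = stop_time_pmf([0.0, 0.5, 0.5, 1.0])
print(pmf)  # stopping is certain by step 3, so the pmf sums to 1
```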

6 HOW? (Solution Framework)
So the problem is to choose the stopping rule Φ to maximize the expected return V(Φ), defined as
V(Φ) = E[y_N(X_1, ..., X_N)] = Σ_{j=0..J} E[ψ_j(X_1, ..., X_j) y_j(X_1, ..., X_j)]
where the sum runs to J = ∞ if we may never stop (infinite horizon), or to J = T if we must stop by time T (finite horizon).

7 Optimal Stopping in Finite Horizon
A special case of the general problem, obtained by setting y_{T+1} = y_{T+2} = ... = y_∞ = −∞
Use the DP algorithm:
V_T(x_1, ..., x_T) = y_T(x_1, ..., x_T)
V_j(x_1, ..., x_j) = max{ y_j(x_1, ..., x_j), E[ V_{j+1}(x_1, ..., x_j, X_{j+1}) | X_1 = x_1, ..., X_j = x_j ] }
Here, V_j(x_1, ..., x_j) represents the maximum return one can obtain starting from stage j, having observed X_1 = x_1, ..., X_j = x_j.
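The backward-induction idea can be sketched on one concrete instance (an assumption, not from the slide): stop a sequence of i.i.d. Uniform(0, 1) draws with reward y_j = x_j and a forced stop at stage T. For Uniform(0, 1), E[max(X, v)] = (1 + v^2)/2, so the recursion collapses to a one-line update; the function name is hypothetical.

```python
def uniform_stopping_values(T):
    """Backward induction for stopping i.i.d. Uniform(0,1) draws with
    reward y_j = x_j and forced stop at stage T.  The recursion
    V_j = E[max(X_{j+1}, V_{j+1})] uses the closed form
    E[max(X, v)] = (1 + v**2) / 2; the optimal rule is:
    stop at stage j iff x_j >= V_j (the value of continuing)."""
    V = [0.0] * (T + 1)
    V[T] = 0.5                           # must stop at T; E[X] = 1/2
    for j in range(T - 1, -1, -1):
        V[j] = (1.0 + V[j + 1] ** 2) / 2.0
    return V

print(uniform_stopping_values(5))  # values grow as more stages remain
```

Note how the value at each stage rises the more future draws remain: the option to continue is itself worth something.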

8 The Classical Secretary Problem (CSP)
Also known as the marriage problem, the hiring problem, the sultan's dowry problem, the fussy suitor problem, and the best choice problem.
Rules of the game:
- There is a single secretarial position to fill.
- There are n applicants for the position, and n is known.
- The applicants can be ranked from best to worst with no ties.
- The applicants are interviewed sequentially in a random order, with each order being equally likely.
- After each interview, the applicant is accepted or rejected.
- The decision to accept or reject an applicant can be based only on the relative ranks of the applicants interviewed so far.
- Rejected applicants cannot be recalled.
- The object is to select the best applicant.
Win: you select the best applicant. Lose: otherwise.
Note: an applicant should be accepted only if it is relatively best among those already observed.
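The rules above are easy to simulate for the cutoff rule discussed on the following slides (reject the first r − 1 applicants, then take the first one better than everything seen so far). A Monte Carlo sketch; the function name and trial count are hypothetical:

```python
import random

def csp_win_prob(n, r, trials=20000, seed=1):
    """Estimate the probability that the cutoff rule selects the
    overall best of n applicants arriving in uniformly random order."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        ranks = list(range(n))
        rng.shuffle(ranks)                       # ranks[i]: quality, n-1 = best
        best_seen = max(ranks[:r - 1], default=-1)
        chosen = ranks[-1]                       # stuck with the last if none qualifies
        for x in ranks[r - 1:]:
            if x > best_seen:                    # first relatively-best applicant
                chosen = x
                break
        wins += chosen == n - 1
    return wins / trials

print(csp_win_prob(10, 4))  # near the exact value 0.3987 for n = 10
```

Simulating before deriving is a useful sanity check on the exact formula that follows.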

9 CSP – Solution Framework
A relatively best applicant is called a candidate.
Reward function:
y_j(x_1, ..., x_j) = j/n if applicant j is a candidate, and 0 otherwise
(j/n is the probability that the best of the first j applicants is also the best overall.)
Suppose the interviewer rejects the first r − 1 applicants and then accepts the next relatively best applicant. We wish to find the optimal r.

10 CSP – Solution Framework Cont.
The probability that the best applicant is selected is
P(r) = Σ_{j=r..n} (1/n) · (r − 1)/(j − 1) = ((r − 1)/n) Σ_{j=r..n} 1/(j − 1)
(The j-th applicant is the best with probability 1/n, and is reached by the rule only if the best of the first j − 1 applicants falls among the first r − 1, which has probability (r − 1)/(j − 1).)
** the j = r term (r − 1)/(r − 1) is taken to be 1 when r = 1, so that P(1) = 1/n
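The success probability of the cutoff rule is a one-line sum to evaluate; a sketch with a hypothetical function name:

```python
def csp_success_prob(n, r):
    """Exact success probability of the cutoff rule:
        P(r) = ((r - 1) / n) * sum_{j=r..n} 1 / (j - 1)   for r > 1,
        P(1) = 1 / n  (take the very first applicant)."""
    if r == 1:
        return 1.0 / n
    return (r - 1) / n * sum(1.0 / (j - 1) for j in range(r, n + 1))

print(csp_success_prob(10, 4))  # ≈ 0.3987: reject 3, then take the first candidate
```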

11 CSP – Solution Framework Cont.
For the optimal r, the slide tabulates n, r*, and the success probability P(r*); for example, n = 10 gives r* = 4 and P(r*) ≈ 0.399.
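The table of optimal cutoffs can be regenerated by brute force: evaluate P(r) for every r and take the maximizer. A sketch (function name hypothetical; P(r) restated so the block is self-contained):

```python
def optimal_cutoff(n):
    """Find the optimal cutoff r* by maximizing the exact success
    probability P(r) of the reject-first-(r-1) rule."""
    def P(r):
        if r == 1:
            return 1.0 / n
        return (r - 1) / n * sum(1.0 / (j - 1) for j in range(r, n + 1))
    r_star = max(range(1, n + 1), key=P)
    return r_star, P(r_star)

for n in (3, 5, 10, 100, 1000):
    r_star, p = optimal_cutoff(n)
    print(f"n={n:5d}  r*={r_star:4d}  r*/n={r_star / n:.3f}  P(r*)={p:.4f}")
```

The printed ratios r*/n and probabilities P(r*) both drift toward 1/e ≈ 0.368 as n grows, previewing the large-n result on the next slide.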

12 CSP – for large n
As n → ∞, r*/n → 1/e, and the probability of selecting the best applicant also tends to 1/e ≈ 0.368. The optimal rule is therefore: observe roughly the first n/e applicants without accepting any, then accept the first candidate thereafter.

13 References
Optimal Stopping and Applications, Thomas S. Ferguson.