There have been 10 votes for "yes" and 5 votes for "no". The intuitive answer to
the puzzle is yes ($u_0=1$ is optimal). That might make you suspicious that I am going to argue for the non-intuitive answer, no ($u_0=1$ is not necessarily optimal). I wonder if you will like my answer, or think I am cheating?
I said nothing about $u_0$ being the
only decision. I make no apology, because this lecture course has been all about taking a sequence of decisions (or controls) $u_0,u_1,u_2,\dotsc$ over time. That possibility should have been in your mind, and I was giving you a hint by denoting the decision as $u_0$ (the first of a sequence of decisions).
Suppose we are faced with a stopping problem in which $u_0=1$ means "continue" and $u_0=0$ means "stop". We should consider the possibility that it is optimal to continue if we know that A is true, and also optimal to continue if we know that A is false, and yet optimal to stop if we do not know whether A is true or false.
It is not completely obvious that this can happen. So let's see how to construct an example in which it does. Consider a version of the secretary problem in which we will see 3 candidates. They are "best", "middle" and "worst" (B,M,W). Of course we can only compare them with one another, so if they are presented in the order M,B,W, we would see this as $(x_1,x_2,x_3)=(1, 1, 0)$, where $x_i=1$ if the $i$th candidate is the best so far. Suppose that the candidates will be presented in one of the following four orders, with the consequent sequences of 0s and 1s, and with the probabilities given:
1. B,M,W (1,0,0) with probability 0.2
2. M,B,W (1,1,0) with probability 0.3
3. B,W,M (1,0,0) with probability 0.2
4. W,M,B (1,1,1) with probability 0.3
It is a bit artificial that all 6 possible orders are not equally likely, but for the purpose of this discussion we are not trying to think about a really practical problem.
We wish to maximize the probability of stopping on the last 1.
Now suppose the unknown "A" is whether "the worst candidate will be seen last".
If A is true the order is the first or second of the four listed above, but if A is false it is the third or fourth.
You can now verify that given A is true it is optimal not to stop on the first 1 ($u_0=1=$ "continue"). The same decision is optimal if A is false. In both cases we win with probability 0.6.
However, if we do not know whether A is true or false, then it is optimal to stop on the first 1 ($u_0=0$, winning with probability 0.4). If we do not stop, then with probability 0.6 we reach $(x_1,x_2)=(1,1)$, and from there we can win with probability only 0.5. So our win probability is only 0.3.
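If you would rather check these numbers by machine than by hand, here is a minimal sketch in Python. The names `ORDERS`, `value` and `stop_vs_continue` are my own inventions for this illustration. It does a small dynamic programme over what has been observed so far, and uses exact fractions so that no rounding is involved.

```python
from fractions import Fraction as F

# (indicator sequence, probability, "worst candidate is last") for the four orders
ORDERS = [
    ((1, 0, 0), F(2, 10), True),   # B,M,W
    ((1, 1, 0), F(3, 10), True),   # M,B,W
    ((1, 0, 0), F(2, 10), False),  # B,W,M
    ((1, 1, 1), F(3, 10), False),  # W,M,B
]

def last_one(seq):
    """0-based position of the last 1 in an indicator sequence."""
    return max(i for i, x in enumerate(seq) if x == 1)

def value(orders, prefix):
    """Best achievable (unnormalised) probability of stopping on the last 1,
    given that we have observed `prefix` and have not yet stopped."""
    live = [(s, p) for s, p, _ in orders if s[:len(prefix)] == prefix]
    if not live:
        return 0
    n = len(live[0][0])
    stop = 0
    if prefix[-1] == 1:
        # Stopping now wins exactly when the current 1 is the last 1.
        stop = sum(p for s, p in live if last_one(s) == len(prefix) - 1)
    cont = 0
    if len(prefix) < n:
        cont = value(orders, prefix + (0,)) + value(orders, prefix + (1,))
    return max(stop, cont)

def stop_vs_continue(orders):
    """Win probabilities (stop on the first 1, continue and then play optimally)."""
    total = sum(p for _, p, _ in orders)
    stop = sum(p for s, p, _ in orders if last_one(s) == 0)
    cont = value(orders, (1, 0)) + value(orders, (1, 1))
    return stop / total, cont / total

ALL = stop_vs_continue(ORDERS)
A_TRUE = stop_vs_continue([o for o in ORDERS if o[2]])
A_FALSE = stop_vs_continue([o for o in ORDERS if not o[2]])
print(ALL, A_TRUE, A_FALSE)
```

Each pair is (win probability if we stop on the first 1, win probability if we continue). Knowing A either way gives 2/5 against 3/5, so continuing is better. Not knowing A gives 2/5 against 3/10, so stopping is better.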
Do you think this example is a cheat (in that it does not really match up with the way I posed the original puzzle), or have you learned something interesting?
I was thinking about this recently in connection with a similar question. I wonder if any of you can invent a neater example to illustrate this point, or one with a better "back-story".
The question I have been thinking about concerns an application of Bruss's odds theorem in circumstances in which we do not know the odds. Recall that if $X_1,\dotsc,X_n$ are independent, with $X_i=1$ or $0$ with probabilities $p_i$ and $q_i=1-p_i$, then we maximize the probability of stopping on the last 1 if we stop at the first 1 that we find amongst $X_s,X_{s+1},\dotsc,X_n$, where $s$ is the greatest integer $i$ such that
$\frac{p_i}{q_i}+\cdots+\frac{p_n}{q_n}\geq 1$.
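As a reminder of how the rule works when the $p_i$ are known, here is a minimal sketch in Python. The function name `odds_rule` is mine. It sums the odds backwards from $n$ until the sum first reaches 1, and returns $s$ together with the win probability $q_s\cdots q_n\,(p_s/q_s+\cdots+p_n/q_n)$. I assume $p_i<1$ for every index the backward sum reaches, and take $s=1$ if the sum never reaches 1.

```python
from fractions import Fraction

def odds_rule(p):
    """Return (s, win_probability) for the odds algorithm, with s 1-based.

    `p` lists p_1, ..., p_n. The policy is: stop at the first 1 among
    X_s, ..., X_n. Assumes p_i < 1 wherever the backward sum reaches.
    """
    odds_sum = Fraction(0)
    q_prod = Fraction(1)
    for i in range(len(p), 0, -1):
        p_i = Fraction(p[i - 1])
        if p_i == 1:
            raise ValueError("p_i = 1 gives infinite odds; not handled in this sketch")
        q_i = 1 - p_i
        odds_sum += p_i / q_i
        q_prod *= q_i
        if odds_sum >= 1:
            return i, q_prod * odds_sum
    # The sum of odds never reached 1, so stop at the first 1 seen anywhere.
    return 1, q_prod * odds_sum

# Classical secretary problem with 5 candidates in uniformly random order:
# the indicators "best so far" are independent with p_i = 1/i.
print(odds_rule([Fraction(1, i) for i in range(1, 6)]))
```

For the classical problem with 5 candidates this gives $s=3$ and a win probability of $13/30$: pass over the first two candidates and then take the first one who is best so far.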
Now suppose we do not know the actual values of $p_1,\dotsc,p_n$. But we have seen $X_{i-1}=1$ and somehow know that $p_i/q_i+\cdots+p_n/q_n=1$. Can we say that it is optimal to stop? Yes. Since sum-of-odds $=1$ is the borderline case in the odds theorem, is it also optimal not to stop? Yes.
But suppose we know only that $E[p_i/q_i+\cdots+p_n/q_n] > 1$. This is not enough information to determine whether or not it is optimal to stop.
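To see why the expectation alone cannot settle the matter, here is a small sketch of one way it can go wrong. This is a toy example of my own, not part of the question above. I assume only one variable $X_n$ remains after the 1 we have just seen, and that our belief about $p_n$ is a given distribution (already conditioned on what we have observed). The name `compare` and both beliefs are hypothetical.

```python
from fractions import Fraction as F

def compare(belief):
    """`belief` is a list of (weight, p_n) pairs for the one remaining variable X_n.

    Returns (expected odds, win probability if we stop now,
             win probability if we continue and stop at X_n = 1).
    """
    exp_odds = sum(w * p / (1 - p) for w, p in belief)
    win_stop = sum(w * (1 - p) for w, p in belief)  # we win if X_n = 0
    win_cont = sum(w * p for w, p in belief)        # we win if X_n = 1
    return exp_odds, win_stop, win_cont

# Belief 1: p_n is 0 or 4/5, each with probability 1/2 (odds 0 or 4).
UNCERTAIN = compare([(F(1, 2), F(0)), (F(1, 2), F(4, 5))])
# Belief 2: p_n is known to be 2/3 (odds exactly 2).
CERTAIN = compare([(F(1), F(2, 3))])
print(UNCERTAIN, CERTAIN)
```

Both beliefs have expected odds equal to 2, which exceeds 1. Under the first belief stopping wins with probability 3/5 against 2/5 for continuing. Under the second belief stopping wins with probability 1/3 against 2/3 for continuing. So the same expected sum of odds is compatible with either decision being optimal.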