Thursday, January 21, 2016

Lecture 3

Theorem 3.1 is our first serious theorem. Its proof was easy but non-trivial. The theorem is important because it tells us that $F(x)$ satisfies the DP equation (3.7). It holds in each of the cases D (discounted), N (negative) and P (positive) programming.

In the proof of Theorem 3.1 we used that $\lim_{s\to\infty}EF_s(x_1)=E[\lim_{s\to\infty}F_s(x_1)]$ when $F_s$ is either monotone increasing or decreasing in $s$, as it indeed is in the N and P cases. This is the Lebesgue monotone convergence theorem. In the D case the interchange of $E$ and $\lim_{s\to\infty}$ is valid because $F_s(x)$ is close to its limit for large $s$, uniformly in $x$.
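
To spell out the D case, here is a sketch under the assumption (standard for discounted programming, but not restated in this post) that the one-step rewards are bounded, $|r|\leq B$, and that $F_s$ is the $s$-horizon value with zero terminal reward. The symbol $B$ is introduced here only for illustration. The rewards from time $s$ onwards contribute at most $\sum_{t\geq s}\beta^t B$ in absolute value, so
$$
\sup_x |F_s(x)-F(x)| \leq \sum_{t=s}^\infty \beta^t B = \frac{\beta^s B}{1-\beta},
$$
and therefore
$$
\bigl|EF_s(x_1)-EF(x_1)\bigr| \leq E\bigl|F_s(x_1)-F(x_1)\bigr| \leq \frac{\beta^s B}{1-\beta}\to 0 \quad\text{as } s\to\infty.
$$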

The problem of selling a tulip bulb collection in Section 3.5 is very much like the secretary problem in Section 2.3. The differences are that now (i) we observe values (not just relative ranks), (ii) we wish to maximize the expected value of the selected candidate (rather than the probability of choosing the best), and (iii) the number of offers is infinite, but with a discount factor $\beta$. We see that one way in which discounting by a factor $\beta$ can naturally occur is via a catastrophe, occurring with probability $1-\beta$ in each period, that brings the problem to an end.
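
As a short check of the catastrophe interpretation, suppose (as an illustrative assumption) that the catastrophe strikes independently between successive periods with probability $1-\beta$, that no reward is collected once it has struck, and that the reward stream $r_0,r_1,\dotsc$ is independent of the catastrophe time $T$. Then $P(T>t)=\beta^t$ and
$$
E\left[\sum_{t=0}^{T-1} r_t\right]=\sum_{t=0}^\infty r_t\,P(T>t)=\sum_{t=0}^\infty \beta^t r_t,
$$
which is exactly the discounted total reward.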

How might the asset selling problem differ if past offers for the tulip bulb collection remain open (so long as the market has not collapsed)? The state $x$ is now the best offer so far received. The DP equation would be
$$
F(x) = \int_0^\infty\max\Bigl[x,y,\beta F(\max\{x,y\})\Bigr] g(y) dy.
$$
The validity of this equation follows from Theorem 3.1 and the fact that this is a Positive case of dynamic programming. In fact the solution is exactly the same as when offers did not remain open. Can you see why intuitively? Can you prove it from the above dynamic programming equation?
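
Before attempting the proof, it may help to check the claim numerically. The sketch below is not a proof and rests on several assumptions that are not in the notes: offers are taken to be Uniform$[0,1]$ (so $g(y)=1$), $\beta=0.9$, and the no-recall problem is written, with the value taken before the current offer is seen, as $G=\int_0^\infty\max[y,\beta G]\,g(y)\,dy$. Both equations are solved by value iteration on a grid with trapezoidal quadrature. The function names are hypothetical.

```python
import numpy as np

beta = 0.9                        # discount factor (assumed value)
n = 201                           # number of grid points (assumed)
grid = np.linspace(0.0, 1.0, n)   # offers assumed Uniform[0,1], so g(y) = 1
w = np.full(n, 1.0 / (n - 1))     # trapezoidal quadrature weights on the grid
w[0] = w[-1] = 0.5 / (n - 1)


def value_without_recall(iters=300):
    """Iterate G <- integral of max[y, beta*G] g(y) dy (past offers lapse)."""
    G = 0.0
    for _ in range(iters):
        G = np.maximum(grid, beta * G) @ w
    return G


def value_with_recall(iters=300):
    """Iterate F(x) <- integral of max[x, y, beta*F(max{x,y})] g(y) dy."""
    # k[i, j] is the grid index of max{x_i, y_j}
    k = np.maximum.outer(np.arange(n), np.arange(n))
    F = np.zeros(n)
    for _ in range(iters):
        # with z = max{x, y}: max[x, y, beta*F(z)] = max[z, beta*F(z)]
        m = np.maximum(grid, beta * F)
        F = m[k] @ w
    return F


G = value_without_recall()
F = value_with_recall()
# For Uniform[0,1] offers the threshold a solves a = beta*(1 + a^2)/2
a = (1 - np.sqrt(1 - beta**2)) / beta
print("no recall   G    =", G)
print("with recall F(0) =", F[0])
print("closed form a/beta =", a / beta)
```

Under these assumptions the three printed numbers should agree up to discretization error, and `F` should be flat for states below the threshold `a`. That flatness is a hint for the intuitive argument: an offer that was not worth accepting when it arrived is never worth going back to.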