Thursday, March 14, 2013

Give your feedback

This year's course finished on 12 March 2013. 

Your initial state of knowledge was $W_0$, and has been brought to $W_{16}$. It is now your job to control the trajectory onwards. Information about certain unobservables will be revealed by variables $y_1,y_2,y_3$ in the Easter Term.

As you study and revise from the notes I would appreciate an email from you if you see any typos or grammatical mistakes, or find something that you think could be explained better. I try to tread a Goldilocks path in these notes: neither too verbose nor too succinct.

You can give the standard feedback on-line here. This is identical to the Faculty's paper form that I will hand out on Thursday. It will be sent to my email anonymously. After reading the responses, I will forward them to the Faculty Office. The online form is easy to use and is particularly useful for anyone who is following the course from my notes but not attending lectures.

Proof by Genie, continued

I am interested in collecting more examples of Proof by Genie, as mentioned in a previous post. If you come across an example of this, please let me know.

I've now remembered another example of proof by genie, from one of my own recent papers. Two people are trying to find each other on the vertices of a triangle. Initially they start at two different vertices, and at each step $t=1,2,\dotsc$ each player stays put or moves to another vertex, trying to minimize the expected number of steps until they meet. In working out the best symmetric rendezvous strategy it helps to assume that the players are given a common notion of what is "clockwise" on the triangle. However, once the optimal strategy is found, it turns out that a common notion of clockwise is not needed to implement it. If you would like to learn more about this problem see


R. R. Weber. Optimal symmetric rendezvous search on three locations. Math. Oper. Res. 37(1):111-122, 2012.

or seminar slides. One of the slides begins with the words "Suppose the problem is made a bit easier: the players are provided with the same notion of clockwise. (We will see later that this does not actually help the players.)"

Tuesday, March 12, 2013

Broom balancing in the 21st century

Have a look at this link to see what optimal control can do when it is coupled to today's fast microprocessors and feedback controllers:

Quadrocopter pole acrobatics

Thanks to Felix Horns for finding this link.

Risk-sensitive optimal control

I ended the course by saying a little about risk-sensitive optimal control. This is a concept introduced by Peter Whittle and an area in which there are still new ideas to be researched. The objective function is now
\[
\gamma_\pi(C)=-\frac{1}{\theta}\log E_\pi e^{-\theta C} = E_\pi C -\tfrac{1}{2}\theta\,\text{var}_\pi(C)+\cdots
\] So when $\theta>0$ we are risk-seeking, or optimistic. We like variance in $C$. This is perhaps the attitude of those who gamble or who are high-frequency traders of bonds. Many of the ideas we have met in the course go through, such as a certainty-equivalence principle in the LQG case. One can even obtain a risk-sensitive stochastic maximum principle.
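To see how the criterion and its expansion relate, here is a minimal numerical sketch in Python (the parameter values are my own, chosen only for illustration). For a Gaussian cost the two-term expansion is in fact exact, so the Monte Carlo estimate of $\gamma_\pi(C)$ and the expansion should agree up to sampling error.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, theta = 10.0, 2.0, 0.1          # illustrative values
C = rng.normal(mu, sigma, size=1_000_000)  # sampled costs

# Risk-sensitive criterion, estimated by Monte Carlo
gamma = -np.log(np.mean(np.exp(-theta * C))) / theta

# Two-term expansion: E[C] - (theta/2) var(C)
approx = np.mean(C) - 0.5 * theta * np.var(C)

print(gamma, approx)  # both close to mu - theta*sigma**2/2 = 9.8
```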

I talked about the problem of maximizing consumption over a lifetime, but now with uncertain lifetime. I had to be brief, so here are more careful details if you are interested. The capital $x$ evolves as $dx/dt=ax-u$. We wish to maximize
\[ \int_0^T \log u(t) dt + \kappa \log x(T).
\] If $T$ is certain, then $u(t)=x(t)/(\kappa+s)$, where $s=T-t$. We saw this in Lecture 14.
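As a sanity check on the deterministic solution, here is a minimal Python sketch (the values of $a$, $\kappa$, $T$ and $x_0$ are my own choices). It integrates $dx/dt=ax-u$ under the policy $u(t)=x(t)/(\kappa+T-t)$ and compares the realized utility with that of a rival constant-proportion policy; the first should come out larger.

```python
import numpy as np

a, kappa, T, x0 = 0.05, 1.0, 10.0, 1.0   # illustrative values
dt = 1e-3
times = np.arange(0.0, T, dt)

def run(policy):
    """Euler-integrate dx/dt = a*x - u and return
    the integral of log(u) dt plus kappa*log(x(T))."""
    x, utility = x0, 0.0
    for t in times:
        u = policy(x, t)
        utility += np.log(u) * dt
        x += (a * x - u) * dt
    return utility + kappa * np.log(x)

# The optimal policy from Lecture 14: u = x/(kappa + s), s = T - t.
opt = run(lambda x, t: x / (kappa + T - t))

# A rival policy: consume a constant fraction of capital.
rival = run(lambda x, t: 0.1 * x)

print(opt, rival)   # opt should be the larger of the two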

But now suppose the lifetime is uncertain, given by $\dot y = -1 +\epsilon$, where $\epsilon\,dt=dB$ for a Brownian motion $B$ (with variance $N$ per unit time). The problem takes place over $[0,\tau]$, where $\tau$ is the time at which $y(\tau)=0$. The cute answer is that the solution is $u(t)=x(t)/(\kappa+ s)$, where $s$ is now the "effective remaining life", which is related to $x,y$ by
\[
y^2-s^2=2\theta N s^2\left(1-a(\kappa+s)+\log\left(\frac{\kappa+s}{x}\right)\right).
\] If $\theta=0$ then $s=y$, which gives us a certainty-equivalence result. But now think about $\theta>0$ (the optimistic case). For a given $y$, the relation exhibits the interesting property that if $x$ is large then $s > y$. In this case the individual is optimistic of a long life, and has sufficient capital to believe he can build it up, so he consumes more slowly than he would if he were not optimistic. But if $x$ is small then $s < y$. Having only a little capital, there is not much time to build it up, and so he optimistically takes the view that his life will be short (!), and that it is best to consume what little he has at a faster rate than he would if he were certain that his remaining life were to be $y$.
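The relation above only defines $s$ implicitly, but it is easy to solve numerically. Here is a minimal Python sketch (all parameter values are my own, chosen for illustration) that finds $s$ by root-finding and exhibits the behaviour just described: for $\theta>0$, large $x$ gives $s>y$ and small $x$ gives $s<y$.

```python
import numpy as np
from scipy.optimize import brentq

# Illustrative parameter values (my own choices, not from the notes)
a, kappa, theta, N = 0.05, 1.0, 0.1, 1.0
y = 5.0   # nominal remaining life

def effective_life(x):
    """Solve y^2 - s^2 = 2*theta*N*s^2*(1 - a*(kappa+s) + log((kappa+s)/x))
    for the effective remaining life s, given capital x."""
    f = lambda s: (y**2 - s**2
                   - 2*theta*N*s**2*(1 - a*(kappa + s) + np.log((kappa + s)/x)))
    return brentq(f, 1e-9, 50.0)

for x in [0.1, 1.0, 30.0, 100.0]:
    s = effective_life(x)
    print(f"x = {x:6.1f}:  s = {s:.3f}  ({'s > y' if s > y else 's < y'})")
```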

For $\theta < 0$ the policy is reversed. The pessimist is risk-averse: he guards against running out of money by living longer than expected, and against failing to spend fast enough when life turns out to be shorter than expected.

Lecture 16

Brownian motion and the Wiener process are covered in the Part II course Stochastic Financial Models. The Wikipedia articles referenced in the previous sentence are good and contain some nice video demonstrations of things like self-similar scaling. In this lecture I reviewed only the very basics, but enough for us to develop a simple stochastic differential equation that corresponds to the optimality equation of dynamic programming in the case of continuous time and a continuous state space. Although Brownian motion is a subtle thing, one can infer (or at least guess) many of its properties by thinking about the simple discrete time symmetric random walk, of which it is a limit.
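For instance, here is a minimal Python sketch (my own illustration) of that limiting relationship: a symmetric random walk taking steps of size $\sqrt{\delta t}$ every $\delta t$ units of time reproduces the hallmark properties $EB(t)=0$, $\operatorname{var}B(t)=t$, and independent increments.

```python
import numpy as np

rng = np.random.default_rng(0)

# A symmetric random walk taking steps of size sqrt(dt) every dt units
# of time; as dt -> 0 it converges in distribution to Brownian motion.
T, dt, paths = 1.0, 1e-3, 10_000
n = int(T / dt)
steps = rng.choice([-1.0, 1.0], size=(paths, n)) * np.sqrt(dt)
B = steps.cumsum(axis=1)

print(B[:, -1].mean(), B[:, -1].var())     # B(1): mean ~ 0, variance ~ 1
print((B[:, -1] - B[:, n//2 - 1]).var())   # increment over (1/2, 1]: variance ~ 1/2
```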

There are many interesting things to be said about Brownian motion. Paul Lévy's forgery theorem for Brownian motion implies that in two dimensions almost surely every arc of the path contains the complete works of Shakespeare in the handwriting of Bacon.

I mentioned in the lecture that it is initially surprising to see that in Example 16.4 the cost is greatest when there is no noise. This is a little counterintuitive, so we should try to understand why. I think it is because $C_0$ and $C_1$ are almost the same, and $L$ is small compared to $Q$, so we can let the noise carry us to the endpoints while expending only minimal effort (which is costed at $Qu^2$). My guess is that if $C_0$ were a lot less than $C_1$, or if $L$ were bigger compared to $Q$, then noise would not be helpful and we would expect Figure 4 to look different.

Monday, March 11, 2013

Solving problems using PMP

In Questions 6-10 of Sheet 3 the strategy is to use the necessary conditions of PMP to synthesize the optimal solution. What I mean by "synthesize" is that you write down all the information that PMP says is necessary (in terms of $u(t)$ maximizing $H(x,u,\lambda)$ to the value $-\lambda_0(t)$, $\dot \lambda_i=-\partial H/\partial x_i$, and any boundary and transversality conditions) until you convince yourself that the optimal policy can only be one thing. Question 7 is a bit like proving that the shortest distance between two points is a straight line. The monster's optimal strategy is fairly obvious, but we can prove it is optimal by applying the necessary conditions of PMP to rule out the possibility of any other solution.

Thursday, March 7, 2013

Feldbaum-Bushaw problem

I think I have finally tracked down the answer to the question in the blog for Lecture 14, when I wondered about the naming of the rocket car problem. I am now guessing that the answer is none of the above. According to J. T. Olympio, A Continuous Implementation of a Second-Variation Optimal Control Method for Space Trajectory Problems, J. Optim. Theory Appl., 2013:

"The double integrator problem (also called the Feldbaum–Bushaw problem) is a time minimization problem, where a frictionless particle moving along a line with an initial velocity and position should be put to rest."

This is exactly our rocket car problem. Apparently, it was first solved by D. W. Bushaw, Differential Equations with a Discontinuous Forcing Term, Ph.D. Thesis, Princeton, 1952.

In the obituary of Donald W. Bushaw (1926-2012) it is stated that "Don’s Ph.D. thesis is recognized as the beginning of modern optimal control theory."

The name of A. A. Feldbaum (a Russian) is also mentioned in connection with this problem, which he solved at about the same time. Pontryagin came up with his maximum principle a few years later, in 1956.

Lecture 15

Today's lecture was about solving some problems using Pontryagin's Maximum Principle. Some personal experience of the power of PMP came in solving the problem in the paper

R. R. Weber. Optimal search for a randomly moving object. J. Appl. Prob. 23:708-717, 1986.

The problem that is solved in this paper in continuous time is still an open problem in discrete time. That shows the power of PMP, in that with it we can solve problems in continuous time whose discrete-time counterparts remain unsolved.

I mentioned today in connection with Section 15.5 (insects as optimizers) that zoologists have observed that in their real-world choices of $t_{\text{switch}}$ some insect populations seem to do a good job of predicting $T$ (the time of the serious frost that ends the lives of the workers).

What makes for a beautiful problem in science?

In his paper What makes for a beautiful problem in Science? (Journal of Political Economy, Vol. 78, No. 6, 1970), Paul Samuelson writes:

What is the shortest distance between two points? A straight line. Can you prove it?

What is the longest distance between two points? Surely, no answer is possible. But consider two points a mile apart on a horizontal line in the plane. Suppose that the admissible paths cannot have an absolute slope exceeding one million. Then it is not hard to envisage that the longest distance is not (sic) a bit over a million miles. But it takes the methods of Pontryagin and not those of the classical theory to handle such a problem in which the control variables are constrained within a closed set.

You might like to see if you can use Pontryagin's maximum principle to prove what Samuelson claims. You must start by deciding what he means by an admissible path. I expect he is thinking of a curve between $(0,0)$ and $(1,0)$ that is the graph of $y(x)$ determined by $y'(x)=u(x)$, $0\leq x \leq 1$. I have added (sic) (for sic erat scriptum) because I think the word "not" should be deleted. He is trying to say that the maximum distance is a bit over a million miles, in fact $\sqrt{10^{12}+1}$.
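Here, for what it is worth, is a sketch of how that PMP argument might go (under my reading of the problem, with $M=10^6$ for the slope bound). We maximize $\int_0^1 \sqrt{1+u^2}\,dx$ subject to $y'=u$, $|u|\leq M$, $y(0)=y(1)=0$. The Hamiltonian and adjoint equation are
\[
H(y,u,\lambda)=\lambda u+\sqrt{1+u^2},\qquad \dot\lambda=-\partial H/\partial y=0,
\] so $\lambda$ is constant. Since $H$ is convex in $u$, its maximum over the closed set $[-M,M]$ is attained at an endpoint, so $u(x)=\pm M$ (almost) everywhere. The boundary conditions force the path to spend equal amounts of $x$ at slopes $+M$ and $-M$ (which also forces $\lambda=0$, so that both endpoints maximize $H$), and every such path has length
\[
\int_0^1\sqrt{1+M^2}\,dx=\sqrt{10^{12}+1}.
\]

Samuelson goes on: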

A scientific problem gains in aesthetic interest if you can explain it to your neighbor at a dinner party. 

That no map requires more than four colors is such a problem. Can the turnpikes of Professor Chakravarty pass this test? In a system in which everything produces everything, a critical set of proportions at which balanced growth is most rapid seems reasonable. So the general notion of the von Neumann turnpike is conveyed. But suppose we begin away from this highway, and also are to end up away from it. Still, if the journey is to be a sufficiently long one, as seen we shall do well to move in the beginning toward that fast highway; catch a fast ride along it, staying so long there as to leave behind in all directions any rival who sticks to any slower path; then, toward the end of our journey, we can swing off the turnpike to our final rendezvous.

We saw this in today's lecture (the turnpike of economic growth, Section 15.6).

I cannot resist one more excerpt from Samuelson's paper, in which he muses on the way in which an optimal path may sometimes seem strange.

Begin with durable hammers and robots that can make new hammers or robots with the same symmetrical input requirements for the one as for the other. Each input is subject to familiar diminishing returns; but increasing both inputs proportionally leads to constant returns to scale. I begin with 10 robots and 5 hammers and must end up in minimum time with (at least) 1,000 robots and 5,000 hammers. How should I proceed? Surely, I begin by producing only hammers, since the redundant robots are subject to diminishing returns. When I reach 10 of each input, I shift gears to produce equal amounts of both, proceeding along what is known as a von Neumann-DOSSO turnpike. You might think I go along it until 1,000 of each are attained, thereafter producing only the 4,000 more hammers. And you might be right. But you might be wrong. Depending upon the exact technology, it might pay to produce more of both along the turnpike, ceasing to produce more robots only after they are somewhere between 1,000 and 5,000 in number. We end up with more than we needed of robots. Irrational, since we don't want them and they cost something? Not necessarily, since they may have been worthwhile for their help in producing the extra hammers we do want.

Tuesday, March 5, 2013

Lecture 14

The maximum principle is due to Lev Pontryagin. It is remarkable that despite being blind he was one of the greatest mathematicians of his generation. The key thing to grasp is that the PMP provides necessary conditions. We use the fact that an adjoint trajectory $\lambda$ exists to deduce properties of, or completely determine, the optimal control and the optimally controlled trajectory. To my thinking, the PMP is notoriously badly explained in most books in which it appears. I hope I have been able to make it seem more intuitive. Lagrange multipliers give a helpful interpretation, as does differentiation of the infinitesimal version of the optimality equation.

The rocket car example is a celebrated problem that is nicely solved using the PMP. Whittle calls it the "Bush problem", but does not explain why.  There are two plausible answers. This may be a reference to the famous engineer Vannevar Bush, but I have not been able to track down any definitive reference. Whittle talks about bringing the rollers of a rolling mill to rest in a standard position in minimum time. So perhaps he is thinking about a roller chain or "bush roller chain", which (according to Wikipedia) is the type of chain drive most commonly used for transmission of mechanical power on many kinds of domestic, industrial and agricultural machinery, including conveyors, wire- and tube-drawing machines, printing presses, cars, motorcycles, and bicycles. I will ask Peter Whittle if he remembers.

I have since tracked down the answer to this question: see the post of Thursday, March 7 above, on the Feldbaum-Bushaw problem.

There is a nice interactive demo of the solution to the rocket car parking problem that you can try.
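If you would rather experiment in code than with the demo, here is a minimal Python sketch of the bang-bang solution (my own illustration; the step size and stopping tolerance are arbitrary). The state is $(x,v)$ with $\dot x=v$, $\dot v=u$, $|u|\leq 1$, and the control switches on the curve $x=-\tfrac{1}{2}v|v|$.

```python
import numpy as np

def park(x, v, dt=1e-4, tol=1e-3):
    """Time-optimal rocket car: dx/dt = v, dv/dt = u, |u| <= 1.
    Drive the car to rest at the origin using the switching curve
    x = -v|v|/2, and return the elapsed time."""
    t = 0.0
    while abs(x) > tol or abs(v) > tol:
        sigma = x + 0.5 * v * abs(v)          # side of the switching curve
        u = -np.sign(sigma) if sigma != 0 else -np.sign(v)
        x += v * dt
        v += u * dt
        t += dt
    return t

# Starting at rest a unit distance away, the minimum time is
# 2*sqrt(1) = 2: full thrust toward the origin for half the journey,
# then full braking.
print(park(1.0, 0.0))   # approximately 2.0
```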