Optimization and Control: Risk-sensitive optimal control

I ended the course by saying a little about risk-sensitive optimal control. This is a concept introduced by Peter Whittle and an area in which there are still new ideas to be researched. The objective function is now
\[
\gamma_\pi(C)=-\frac{1}{\theta}\log E_\pi e^{-\theta C} = E_\pi C -\tfrac{1}{2}\theta\,\text{var}_\pi(C)+\cdots
\] So when $\theta>0$ we are risk-seeking, or optimistic. We like variance in $C$. This is perhaps the attitude of those who gamble or who are high-frequency traders of bonds. Many of the ideas we have met in the course go through, such as a certainty-equivalence principle in the LQG case. One can even obtain a risk-sensitive stochastic maximum principle.
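As a quick numerical illustration of the expansion, here is a minimal sketch that takes the criterion to be $-\frac{1}{\theta}\log E\,e^{-\theta C}$ and compares it with the two-term approximation $EC-\tfrac{1}{2}\theta\,\text{var}(C)$. The function names and the two-point distribution for $C$ are assumptions made purely for illustration; they are not from the course.

```python
import math

def risk_sensitive_value(costs, probs, theta):
    """Return -(1/theta) * log E[exp(-theta * C)] for a discrete C."""
    if theta == 0.0:
        # The limit theta -> 0 is the plain expectation.
        return sum(p * c for c, p in zip(costs, probs))
    mgf = sum(p * math.exp(-theta * c) for c, p in zip(costs, probs))
    return -math.log(mgf) / theta

def mean_variance_approx(costs, probs, theta):
    """Return the two-term expansion E[C] - (theta/2) * var(C)."""
    mean = sum(p * c for c, p in zip(costs, probs))
    var = sum(p * (c - mean) ** 2 for c, p in zip(costs, probs))
    return mean - 0.5 * theta * var

# Hypothetical example: C is 0 or 2 with equal probability (mean 1, variance 1).
costs, probs = [0.0, 2.0], [0.5, 0.5]
for theta in (-0.1, 0.0, 0.1):
    exact = risk_sensitive_value(costs, probs, theta)
    approx = mean_variance_approx(costs, probs, theta)
    print(f"theta={theta:+.1f}  exact={exact:.5f}  two-term={approx:.5f}")
```

For small $|\theta|$ the two columns should nearly agree, with the criterion falling below $EC$ when $\theta>0$ and rising above it when $\theta<0$.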

I talked about the problem of maximizing consumption over a lifetime, but now with uncertain lifetime. I had to be brief, so here are more careful details if you are interested. The capital $x$ evolves as $dx/dt=ax-u$. We wish to maximize
\[ \int_0^T \log u(t)\,dt + \kappa \log x(T).
\] If $T$ is certain, then $u(t)=x(t)/(\kappa+s)$, where $s=T-t$. We saw this in Lecture 14.
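For anyone who wants to check the certain-lifetime policy, here is a sketch of one way to do it by dynamic programming. The symbols $F$ and $g$ are introduced only for this check, and the form tried for $F$ is a guess to be verified rather than something derived. Let $F(x,s)$ be the maximal reward with capital $x$ and time $s$ to go. It should satisfy
\[
\frac{\partial F}{\partial s}=\max_u\left[\log u+(ax-u)\frac{\partial F}{\partial x}\right],\quad F(x,0)=\kappa\log x.
\]
Try $F(x,s)=(\kappa+s)\log x+g(s)$. The maximum is where $1/u=\partial F/\partial x=(\kappa+s)/x$, that is, $u=x/(\kappa+s)$. Substituting back, the left side is $\log x+g'(s)$ and the right side is $\log x-\log(\kappa+s)+a(\kappa+s)-1$, so the equation holds provided
\[
g'(s)=a(\kappa+s)-1-\log(\kappa+s),\quad g(0)=0.
\]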

But now suppose the lifetime is uncertain: the remaining life $y$ evolves as $\dot y = -1 +\epsilon$, where $\epsilon\,dt=dB$. The problem takes place over $[0,\tau]$, where $\tau$ is the first time at which $y(\tau)=0$. The cute answer is that the optimal control is again $u(t)=x(t)/(\kappa+ s)$, where $s$ is now the "effective remaining life", which is related to $x,y$ by
\[
y^2-s^2=2\theta N s^2\left(1-a(\kappa+s)+\log\left(\frac{\kappa+s}{x}\right)\right).
\] If $\theta=0$ then $s=y$, which gives us a certainty-equivalence result. But now think about $\theta>0$ (the optimistic case). For a given $y$, this exhibits the interesting property that if $x$ is large then $s > y$. In this case the individual is optimistic about a long life, and has sufficient capital to believe he can build it up, while consuming more slowly than if he were not optimistic. But if $x$ is small then $s < y$. Having only a little capital, there is not much time to build it up, and so he optimistically takes the view that his life will be short (!), and that it is best to consume what little he has at a faster rate than he would if he were certain that his remaining life would be $y$.

For $\theta < 0$ the policy is reversed. The pessimist is averse both to running out of money by living longer than expected, and to not having spent fast enough if life turns out to be shorter than expected.
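To see these effects numerically, here is a minimal sketch that solves the relation between $y$ and $s$ above for $s$, by stepping away from $s=y$ until the residual changes sign and then bisecting. The function names and all parameter values are hypothetical. $N$ is simply treated as a given positive constant, since it is not defined above, and $\kappa>0$ is assumed. The code returns the root nearest to $y$ in the direction searched, which is only sensible when $|\theta|$ is small.

```python
import math

def residual(s, x, y, theta, N, a, kappa):
    """Left side minus right side of the relation linking y and s."""
    bracket = 1.0 - a * (kappa + s) + math.log((kappa + s) / x)
    return y**2 - s**2 - 2.0 * theta * N * s**2 * bracket

def effective_remaining_life(x, y, theta, N=1.0, a=0.05, kappa=1.0, tol=1e-10):
    """Find a root s of the residual near s = y (assumes kappa > 0, y > 0)."""
    f = lambda s: residual(s, x, y, theta, N, a, kappa)
    f_y = f(y)
    if f_y == 0.0:
        return y  # includes the certainty-equivalent case theta = 0
    # Near s = y the residual decreases in s when |theta| is small,
    # so search upward if it is positive at y, downward otherwise.
    direction = 1.0 if f_y > 0 else -1.0
    step = 0.01 * y
    prev = y
    for k in range(1, 10001):
        cur = max(y + direction * k * step, 0.0)
        if (f(cur) > 0) != (f_y > 0):
            break  # sign change between prev and cur
        if cur == 0.0:
            raise ValueError("no sign change found below y")
        prev = cur
    else:
        raise ValueError("no sign change found above y")
    lo, hi = min(prev, cur), max(prev, cur)
    f_lo = f(lo)
    # Plain bisection on the bracketing interval [lo, hi].
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical numbers: y = 10, optimist (theta > 0), rich versus poor.
y = 10.0
for x in (1000.0, 1.0):
    s = effective_remaining_life(x, y, theta=0.01)
    print(f"x={x:g}: s={s:.4f}, consumption rate u/x={1.0 / (1.0 + s):.4f}")
```

With these illustrative numbers the rich optimist should get $s>y$ and the poor optimist $s<y$, and flipping the sign of `theta` should reverse the two.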


Tuesday, March 12, 2013
