Derive the optimal feedback control and the value function of a stochastic control problem with a log-cosh terminal cost

Greetings~ This is a question I encountered in my stochastic control course homework. Does anyone know how to derive closed-form expressions for the optimal feedback control and the value function of this problem? I am confused by the hint about "completing a square for $J(u_\cdot; s, y)+\ln (\cosh (y))$".

Consider the following stochastic control problem (the state, control and Brownian motion are all one-dimensional): $$ \begin{aligned} \text{Minimise } & J(u_\cdot; s, y)=\mathbb{E}\left[\int_s^T\left(u_t^2+1\right) d t-\ln \left(\cosh \left(X_T\right)\right)\right] \\ \text{subject to } & \left\{\begin{array}{l} d X_t=2 u_t\, d t+\sqrt{2}\, d W_t, \quad t \in[s, T] \\ X_s=y \\ u_t \in[-1,1], \quad t \in[s, T] \end{array}\right. \end{aligned} $$ where $(s, y) \in[0, T] \times \mathbb{R}$ and $\cosh (x)=\frac{1}{2}\left(e^x+e^{-x}\right)$. By completing a square for $J(u_\cdot; s, y)+\ln (\cosh (y))$, derive the optimal feedback control and the value function of the problem.

I was planning to reduce this to a linear-quadratic type of stochastic control problem by letting $\widetilde{J}(u_\cdot; s, y):=J(u_\cdot; s, y)+\ln (\cosh (y))$, so that $\widetilde{J}$ is zero at $s=T$. But I struggled to guess the right form of the value function.

There is 1 best solution below

Follow the hint and apply Itô's formula to $\ln(\cosh(X_t))$, using $\frac{d}{dx}\ln(\cosh(x)) = \tanh(x)$, $\frac{d^2}{dx^2}\ln(\cosh(x)) = \operatorname{sech}^2(x)$ and $\left(dX_t\right)^2 = 2\,dt$:

$$ \begin{aligned} & J(u_\cdot; s, y) + \ln \left(\cosh \left(y\right)\right)\\ =&\mathbb{E}\left[\int_s^T\left(u_t^2+1\right) d t- \ln \left(\cosh \left(X_T\right)\right) + \ln \left(\cosh \left(y\right)\right)\right]\\ =&\mathbb{E}\left[\int_s^T\left(u_t^2+1\right) d t-\int_s^T d\ln \left(\cosh \left(X_t\right)\right)\right]\\ =& \mathbb{E}\left[\int_s^T\left(u_t^2+1\right) d t-\int_s^T \tanh (X_{t})\,dX_{t} - \frac{1}{2}\int_s^T \operatorname{sech}^2(X_{t}) \left(dX_{t}\right)^{2}\right]\\ =& \mathbb{E}\left[\int_s^T\left(u_t^2+1\right) d t-\int_s^T \tanh (X_{t})\left(2 u_t\, d t+\sqrt{2}\, d W_t\right) - \frac{1}{2}\int_s^T \operatorname{sech}^2(X_{t})\, 2\,dt\right]\\ =& \mathbb{E}\left[\int_s^T\left(u_t^2+1\right) d t-\int_s^T 2\tanh (X_{t})\, u_t\, d t - \int_s^T \operatorname{sech}^2(X_{t})\, dt\right]\\ =& \mathbb{E}\left[\int_s^T\left(u_t^2+1 - 2\tanh (X_{t})\, u_t - \operatorname{sech}^2(X_{t}) \right) dt\right] \end{aligned} $$

The $dW_t$ term drops out because $|\tanh(x)| \le 1$, so $\int_s^{\cdot} \tanh(X_t)\, dW_t$ is a true martingale with zero expectation.

Next, recall that $$ \operatorname{sech}(x) = \frac{1}{\cosh(x)} = \frac{2}{e^x+e^{-x}}, $$ so that

$$ \begin{aligned} 1 - \operatorname{sech}^2(x) =& \left(1 - \operatorname{sech}(x)\right)\left(1 + \operatorname{sech}(x)\right)\\ =& \frac{\left(e^x+e^{-x}\right) - 2}{e^x+e^{-x}}\times \frac{\left(e^x+e^{-x}\right) + 2}{e^x+e^{-x}} \\ =& \frac{\left(e^x+e^{-x}\right)^{2} - 4}{\left(e^x+e^{-x}\right)^{2}} \\ =& \frac{e^{2x}+e^{-2x} - 2}{\left(e^x+e^{-x}\right)^{2}} \\ =& \frac{\left(e^{x} -e^{-x}\right)^{2}}{\left(e^x+e^{-x}\right)^{2}} = \tanh^{2}(x). \end{aligned} $$

Thus, we have:

$$ \begin{aligned} &\mathbb{E}\left[\int_s^T\left(u_t^2+1-2 \tanh \left(X_t\right) u_t-\operatorname{sech}^2\left(X_t\right)\right) d t\right] \\ =&\mathbb{E}\left[\int_s^T\left(u_t^2-2 \tanh \left(X_t\right) u_t + \left(1-\operatorname{sech}^2\left(X_t\right)\right)\right) d t\right] \\ =&\mathbb{E}\left[\int_s^T\left(u_t^2-2 \tanh \left(X_t\right) u_t + \tanh^{2}(X_{t})\right) d t\right] \\ =& \mathbb{E}\left[\int_s^T\left(u_t - \tanh(X_{t})\right)^2 d t\right]. \end{aligned} $$

The integrand is nonnegative, so $J(u_\cdot; s, y) + \ln(\cosh(y)) \ge 0$ for every admissible control, with equality exactly when $u_t = \tanh(X_t)$ for almost every $t$, almost surely. Since $\tanh$ takes values in $(-1,1)$, this feedback respects the constraint $u_t \in [-1,1]$, and since $\tanh$ is Lipschitz the closed-loop SDE has a unique strong solution. Hence the optimal feedback control is $$ u_{t}^{\star} = \tanh(X_{t}), $$ and the value function is $$ V(s, y) = -\ln(\cosh(y)). $$
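If you want a numerical sanity check of the completed square, here is a minimal Monte Carlo sketch in plain Python. It is my own illustration, not part of the homework: it assumes an Euler-Maruyama discretisation, and the name `simulate_cost`, the step and path counts, and the test point $(s, y, T) = (0, 0.5, 1)$ are arbitrary choices. Because $J(u_\cdot; s, y) + \ln(\cosh(y))$ equals the expected integral of $\left(u_t - \tanh(X_t)\right)^2$, the feedback $u = \tanh(X)$ should give an estimate close to $-\ln(\cosh(y))$, while another admissible policy such as $u \equiv 0$ should cost more.

```python
import math
import random


def simulate_cost(policy, s, y, T, n_steps=100, n_paths=4000, seed=0):
    """Monte Carlo estimate of J(u; s, y) for a feedback policy u_t = policy(X_t).

    Hypothetical helper: uses Euler-Maruyama for dX = 2u dt + sqrt(2) dW.
    """
    rng = random.Random(seed)
    dt = (T - s) / n_steps
    noise_scale = math.sqrt(2.0 * dt)  # std of sqrt(2) * dW over one step
    total = 0.0
    for _ in range(n_paths):
        x = y
        running = 0.0
        for _ in range(n_steps):
            # Clip to enforce the control constraint u in [-1, 1].
            u = max(-1.0, min(1.0, policy(x)))
            running += (u * u + 1.0) * dt
            x += 2.0 * u * dt + noise_scale * rng.gauss(0.0, 1.0)
        # Running cost minus the log-cosh terminal term.
        total += running - math.log(math.cosh(x))
    return total / n_paths


if __name__ == "__main__":
    s, y, T = 0.0, 0.5, 1.0
    print("tanh feedback :", simulate_cost(math.tanh, s, y, T))
    print("zero control  :", simulate_cost(lambda x: 0.0, s, y, T))
    print("-ln cosh(y)   :", -math.log(math.cosh(y)))
```

The estimates carry both sampling noise and a small time-discretisation bias, so expect agreement with $-\ln(\cosh(y))$ only to a couple of decimal places at these settings.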