I am working on the following control problem.
Consider the controlled stochastic differential equation \begin{align} dX_t&=-\lambda X_t\cdot dt+ U_t\cdot dt+\sigma\sqrt{1+X_t^2}\cdot dB_t \end{align} with the performance function
\begin{align} \mathbb{E}\left[\int_0^T \left(\frac{1}{2}qX_t^2+\frac{1}{2}U_t^2\right)dt +\frac{1}{2}\alpha\cdot X_T^2\right] \end{align}
The goal is to minimize the performance function over all Markov controls $U_t=\mu(X_t,t)$.
Furthermore, I want to determine an $\alpha>0$ such that, for all $q>0$, the optimal control does not depend on $t$, i.e. $U_t=\mu(X_t)$.
Question: How can such an $\alpha>0$ be determined, so that $U_t=\mu(X_t)$ for all $q>0$?
Does anybody have an idea?
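One possible starting point is the Hamilton-Jacobi-Bellman (HJB) equation; the following is only a sketch under stated assumptions, not a verified solution. The value function $V(x,t)$, the quadratic ansatz, and the functions $P(t)$ and $r(t)$ are notation introduced here for illustration and are not part of the problem statement. Assuming $V$ is smooth enough, the HJB equation would read
\begin{align} -\frac{\partial V}{\partial t}&=\min_{u}\left[\frac{1}{2}qx^2+\frac{1}{2}u^2+(-\lambda x+u)\frac{\partial V}{\partial x}+\frac{1}{2}\sigma^2(1+x^2)\frac{\partial^2 V}{\partial x^2}\right], & V(x,T)&=\frac{1}{2}\alpha x^2, \end{align}
and the minimum over $u$ is attained at $u=-\partial V/\partial x$. Trying the ansatz $V(x,t)=\frac{1}{2}P(t)x^2+r(t)$ and comparing the coefficients of $x^2$ and of the constant term gives
\begin{align} -P'(t)&=q-P(t)^2-(2\lambda-\sigma^2)P(t), & P(T)&=\alpha,\\ -r'(t)&=\frac{1}{2}\sigma^2P(t), & r(T)&=0, \end{align}
with candidate control $U_t=-P(t)X_t$. Under this ansatz the control is independent of $t$ exactly when $P$ is constant, i.e. when $\alpha$ is a stationary point of the Riccati equation:
\begin{align} \alpha^2+(2\lambda-\sigma^2)\alpha-q&=0, & \alpha&=\frac{-(2\lambda-\sigma^2)+\sqrt{(2\lambda-\sigma^2)^2+4q}}{2}>0. \end{align}
If this sketch is right, the resulting $\alpha$ depends on $q$ (and on $\lambda$ and $\sigma$), so the requirement would have to be read as "for each $q>0$ there is an $\alpha>0$" rather than as a single $\alpha$ that works for all $q$. A verification argument would still be needed to confirm that the ansatz actually yields the optimal control.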