Riccati equation in optimal control problem

Using the Riccati equation

$$\dot{P} = - PA - A^\textrm{T}P + P B R^{-1} B^{\textrm{T}} P - Q,\ P(T)= F$$

find the minimum cost $J$ if

$$\dot{x} = 2x + u,\;J(u) = 5 x^2(1) + \int_0^1 \left[x^2(t) + u^2(t)\right]dt, \;x(0) = x_0.$$

I can't see how to relate the Riccati equation to this problem. Any ideas on how to start will be greatly appreciated.

There are 1 best solutions below

In general, this problem can be formulated using the following dynamics

$$ \dot{x} = A\,x + B\,u, \tag{1} $$

with $x\in\mathbb{R}^n$, $u\in\mathbb{R}^m$, $A\in\mathbb{R}^{n\times n}$, $B\in\mathbb{R}^{n\times m}$ and $(A,B)$ stabilizable. The goal is to find the control input $u(t)$ which solves

$$ \min_u \int_0^T \begin{bmatrix} x(t) \\ u(t) \end{bmatrix}^\top \begin{bmatrix} Q & N \\ N^\top & R \end{bmatrix} \begin{bmatrix} x(t) \\ u(t) \end{bmatrix} \,dt + x(T)^\top Q_T\,x(T), \tag{2} $$

with $R = R^\top \succ 0$, $Q_T = Q_T^\top\succeq0$, $Q = Q^\top$, $Q - N\,R^{-1} N^\top = W^\top W \succeq 0$ and $(A,W)$ detectable. When $N = 0$, $(2)$ simplifies to

$$ \min_u \int_0^T \left(x(t)^\top Q\,x(t) + u(t)^\top R\,u(t)\right) dt + x(T)^\top Q_T\,x(T). \tag{3} $$

The optimal input can be found by starting at the end and going backwards in time, similar to dynamic programming. To do this, only the terminal cost and the last sliver of the integral from $(2)$ are taken into consideration

$$ \min_u \int_{T-\delta}^T \begin{bmatrix} x(t) \\ u(t) \end{bmatrix}^\top \begin{bmatrix} Q & N \\ N^\top & R \end{bmatrix} \begin{bmatrix} x(t) \\ u(t) \end{bmatrix} \,dt + x(T)^\top Q_T\,x(T). \tag{4} $$

Let $\delta$ in $(4)$ be infinitesimally small and, for convenience, write $\chi = x(T-\delta)$ and $\mu = u(T-\delta)$. The integral in the cost function can then be written as a single term of a Riemann sum. Because $\delta$ is infinitesimally small, $x(T)$ can be expressed using Euler's method, $x(T) = x(T-\delta) + \delta\,\dot{x}(T-\delta)$, with $\dot{x}(T-\delta) = A\,\chi + B\,\mu$ from $(1)$. Applying this to $(4)$ yields

$$ \min_\mu \begin{bmatrix} \chi \\ \mu \end{bmatrix}^\top \begin{bmatrix} Q & N \\ N^\top & R \end{bmatrix} \begin{bmatrix} \chi \\ \mu \end{bmatrix} \,\delta + \left(\chi + \delta(A\,\chi + B\,\mu)\right)^\top Q_T\,\left(\chi + \delta(A\,\chi + B\,\mu)\right). \tag{5} $$

Discarding all negligible $\delta^2$ terms in $(5)$ and collecting the rest into one quadratic form yields

$$ \min_\mu \delta\,\mu^\top R\,\mu + 2\,\delta\,\mu^\top \left(N + Q_T B\right)^\top \chi + \chi^\top \left(Q_T + \delta\,(Q + A^\top Q_T + Q_T A)\right)\chi, \tag{6} $$

which, since $R \succ 0$, can be shown to have the unique solution

$$ \mu = -R^{-1} \left(N + Q_T B\right)^\top \chi. \tag{7} $$
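The step from $(6)$ to $(7)$ can be spelled out as follows (a short sketch; the shorthand $M = N + Q_T B$ is introduced here only for brevity and is not part of the original notation). Setting the gradient of the cost in $(6)$ with respect to $\mu$ to zero gives

$$ 2\,\delta\,R\,\mu + 2\,\delta\,M^\top \chi = 0 \quad\Longrightarrow\quad \mu = -R^{-1} M^\top \chi, $$

and because $R \succ 0$ and $\delta > 0$ the cost is strictly convex in $\mu$, so this stationary point is the unique minimizer.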

Substituting $(7)$ into the cost function of $(6)$ gives the minimal cost to go from $t = T - \delta$ to $t = T$ given $x(T - \delta)$

$$ \label{eq:optimal_cost_end} \left.J_{\min}\right|_{T-\delta}^T = \chi^\top \left[Q_T + \delta\left(Q + A^\top Q_T + Q_T A - (N + Q_T B) R^{-1} (N + Q_T B)^\top\right)\right] \chi. \tag{8} $$
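To see where the subtracted term in $(8)$ comes from, write $M = N + Q_T B$ (a shorthand used only here) and insert $\mu = -R^{-1} M^\top \chi$ into the two $\mu$-dependent terms of $(6)$:

$$ \delta\,\mu^\top R\,\mu + 2\,\delta\,\mu^\top M^\top \chi = \delta\,\chi^\top M R^{-1} M^\top \chi - 2\,\delta\,\chi^\top M R^{-1} M^\top \chi = -\delta\,\chi^\top M R^{-1} M^\top \chi. $$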

When considering the next infinitesimally small step back in time, the same type of problem is encountered, except that the terminal cost defined with $Q_T$ is replaced by $(8)$. The matrix $P(t)$ is defined as the equivalent of $Q_T$ at time $t$, thus $P(T) = Q_T$. The term added to $Q_T$ in $(8)$ is proportional to $\delta$, so the update of $P(t)$, interpreted as a step of Euler's method, can also be written as the following matrix differential equation

$$ -\dot{P} = Q + A^\top P + P\,A - \left(N + P\,B\right) R^{-1} \left(N + P\,B\right)^\top, \tag{9} $$

which is a Riccati differential equation. The minus sign in front of $\dot{P}$ appears because the equation is integrated backwards in time, starting from $P(T)=Q_T$. Using $P(t)$, $x(t)$ and $u(t)$ instead of $Q_T$, $\chi$ and $\mu$ respectively, the optimal control solution shown in $(7)$ can be generalized to

$$ u(t) = -R^{-1} \left(N + P(t)\,B\right)^\top x(t). $$
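For the problem in the question, the data are $A = 2$, $B = 1$, $Q = 1$, $R = 1$, $N = 0$, $Q_T = 5$ and $T = 1$, so $(9)$ reduces to the scalar equation $-\dot{P} = 1 + 4P - P^2$ with $P(1) = 5$, and the optimal cost is $J_{\min} = P(0)\,x_0^2$. The following is a minimal Python sketch, not a definitive implementation: it steps backwards from $t = T$ with the same Euler update used in the derivation and compares the result with a closed form obtained by separation of variables. The function names `riccati_backward_euler` and `riccati_closed_form` and the step count are illustrative choices.

```python
import math


def riccati_backward_euler(a, b, q, r, q_T, T, steps):
    """Integrate the scalar version of (9) with N = 0 backwards from P(T) = q_T."""
    delta = T / steps
    p = q_T
    for _ in range(steps):
        # One Euler step from t to t - delta, mirroring the update in (8):
        # P <- P + delta * (Q + A^T P + P A - (P B) R^{-1} (P B)^T).
        p = p + delta * (q + 2.0 * a * p - (p * b) ** 2 / r)
    return p


def riccati_closed_form(t, T=1.0, p_T=5.0):
    """Closed form of -dP/dt = 1 + 4P - P^2 with P(T) = p_T (question's data only)."""
    s5 = math.sqrt(5.0)
    p_plus, p_minus = 2.0 + s5, 2.0 - s5  # roots of P^2 - 4P - 1 = 0
    # (P - p_plus) / (P - p_minus) grows like exp(2 * sqrt(5) * t).
    ratio = (p_T - p_plus) / (p_T - p_minus) * math.exp(2.0 * s5 * (t - T))
    return (p_plus - ratio * p_minus) / (1.0 - ratio)


p0_euler = riccati_backward_euler(a=2.0, b=1.0, q=1.0, r=1.0, q_T=5.0, T=1.0, steps=100_000)
p0_exact = riccati_closed_form(0.0)
print(p0_euler, p0_exact)
```

Both values should come out at roughly $4.24$, giving $J_{\min} \approx 4.24\,x_0^2$ with the feedback $u(t) = -P(t)\,x(t)$.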

Note that $x(\tau)^\top P(\tau)\,x(\tau)$ is equal to the minimal cost from $t=\tau$ to $t=T$ given $x(\tau)$. For the infinite horizon problem, $T = \infty$, the state $x$ and thus also the input $u$ can be assumed to go to zero, so the terminal cost defined by $Q_T$ becomes meaningless. Instead it can be assumed that $P(t)$, integrated backwards from $t = T$, converges to a constant value, so that as $T \to \infty$ it equals this constant at every finite time. This constant value of $P$ can be obtained by setting $\dot{P}$ in $(9)$ to zero

$$ Q + A^\top P + P\,A - \left(N + P\,B\right) R^{-1} \left(N + P\,B\right)^\top = 0, \tag{10} $$

which is also known as a continuous algebraic Riccati equation (CARE).
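As a small illustration, take the scalar data of the question, $A = 2$, $B = 1$, $Q = 1$, $R = 1$, $N = 0$, but with an infinite horizon (an assumption made only for this example; the question itself has $T = 1$). Then $(10)$ becomes

$$ 1 + 4P - P^2 = 0 \quad\Longrightarrow\quad P = 2 \pm \sqrt{5}. $$

Only $P = 2 + \sqrt{5}$ is positive semidefinite, and it gives the closed loop $\dot{x} = (A - B R^{-1} B^\top P)\,x = -\sqrt{5}\,x$, which is stable, so this is the root to use.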