linear quadratic regulator proof


Consider the basic LQR problem of designing an optimal control $u^*(t)$ for the state-space system:

$\dot{x} = Ax + Bu$

which minimizes the following functional:

$J = \frac{1}{2}\int_{0}^{t_f} \left[ x^TQx + u^TRu \right]dt$

I am to prove that, with $t_f$ fixed and $x(t_f)$ free, if $x^*(t_f) = 0$ then both of the following are solutions to the problem:

$u^*(t) = R^{-1}B^T(P_{12}^{-1}P_{11})x^*(t)$
$u^*(t) = R^{-1}B^T(P_{22}^{-1}P_{21})x^*(t)$

where

$\left[\begin{array}{c|c} P_{11} & P_{12} \\ \hline P_{21} & P_{22} \end{array}\right] = e^{P(t_f-t)}$

and

$P = \left[\begin{array}{c|c} A & -BR^{-1}B^T \\ \hline -Q & -A^T \end{array}\right]$

I am having trouble proving this. I looked at how the Riccati equation is derived, and it starts from a Hamiltonian formulation of the optimization problem:

$H = g + p^Ta$

where $g$ is the integrand of the functional, $p$ is the vector of Lagrange multipliers (costates), and $a$ is the right-hand side of the "subject to" constraint $\dot{x} = a$:

$H = \frac{1}{2}x^T Q x + \frac{1}{2}u^TRu + p^T(Ax + Bu)$

with the necessary conditions of:

$\dot{x} = \frac{\partial H}{\partial p} = Ax + Bu$
$\dot{p} = -\frac{\partial H}{\partial x} = -\left[Qx + A^Tp \right]$
$0 = \frac{\partial H}{\partial u} = Ru + B^Tp$

Solving the last necessary condition for $u$:

$u = -R^{-1}B^Tp$

Substituting this into the other two equations turns $x$ and $p$ into a coupled system of linear ODEs:

$\left[\begin{array}{c} \dot{x} \\ \dot{p} \end{array} \right]= \left[\begin{array}{c|c} A & -BR^{-1}B^T \\ \hline -Q & -A^T \end{array}\right]\left[\begin{array}{c} x \\ p \end{array} \right]$
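
One observation that may be relevant (assuming I have the standard result for constant-coefficient linear systems right): because the matrix on the right is constant, the solution is propagated from time $t$ to $t_f$ by its matrix exponential,

$\left[\begin{array}{c} x(t_f) \\ p(t_f) \end{array}\right] = e^{P(t_f-t)}\left[\begin{array}{c} x(t) \\ p(t) \end{array}\right]$

which, written out with the blocks $P_{ij}$ defined earlier, reads

$x(t_f) = P_{11}x(t) + P_{12}p(t)$
$p(t_f) = P_{21}x(t) + P_{22}p(t)$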

It will turn out that:

$p(t) = K(t)x(t)$

making

$\dot{p} = \dot{K}x + K\dot{x}$

After substituting into the costate equation and collecting terms:

$\left( \dot{K} + KA + A^TK + Q - KBR^{-1}B^TK \right)x = 0$

Here $K(t)$ is the unknown to solve for in this differential Riccati equation, and:

$u^* = -R^{-1}B^TKx$
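
For reference, this is how I believe the substitution behind the Riccati equation above goes, step by step. Differentiating $p = Kx$ and inserting $\dot{x} = Ax - BR^{-1}B^Tp$ with $p = Kx$ gives

$\dot{p} = \dot{K}x + K\left(Ax - BR^{-1}B^TKx\right)$

while the costate equation with $p = Kx$ gives

$\dot{p} = -Qx - A^TKx$

Setting the two expressions equal and moving everything to one side yields $\left( \dot{K} + KA + A^TK + Q - KBR^{-1}B^TK \right)x = 0$.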

This is about as far as I can go without some numerical values. I can conclude that the solution for $K$ is related to $P_{12}^{-1}P_{11}$ and $P_{22}^{-1}P_{21}$, but I don't know how to prove that in general. I'm not sure whether I am supposed to actually find the blocks of $e^{P(t_f-t)}$, or whether that is even possible (you can't write out the exponential of a matrix in closed form, or even diagonalize it, without specific values, right?). I also haven't really made use of the fact that $x^*(t_f) = 0$.
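
To at least get some numerical values, here is a minimal Python sketch of a sanity check for the scalar case. Everything in it is an assumption on my part rather than part of the problem: the numbers `a, b, q, r, t_f` are hypothetical, and the candidate gains $K = -P_{22}^{-1}P_{21}$ and $K = -P_{12}^{-1}P_{11}$ are my guess from matching the sign of $u^* = -R^{-1}B^TKx$ against the two target formulas. For scalars, $P$ is a $2 \times 2$ matrix with zero trace, so $P^2 = \lambda^2 I$ with $\lambda^2 = a^2 + qb^2/r$, and the exponential can be written with `cosh` and `sinh`. The script checks by finite differences whether each candidate satisfies the scalar Riccati equation.

```python
import math

# Hypothetical scalar data (not from the problem statement): A=a, B=b, Q=q, R=r.
a, b, q, r, t_f = 1.0, 1.0, 2.0, 0.5, 1.0
s = b * b / r                    # scalar version of B R^{-1} B^T
lam = math.sqrt(a * a + s * q)   # for scalars, P @ P = lam**2 * I


def exp_blocks(tau):
    """Blocks (P11, P12, P21, P22) of exp(P * tau), with P = [[a, -s], [-q, -a]].

    Uses exp(P * tau) = cosh(lam * tau) * I + sinh(lam * tau) / lam * P,
    which holds here because P @ P = lam**2 * I.
    """
    c = math.cosh(lam * tau)
    sh = math.sinh(lam * tau) / lam
    return c + sh * a, -sh * s, -sh * q, c - sh * a


def k_free(t):
    """Candidate K(t) = -P22^{-1} P21 (assumes the costate vanishes at t_f)."""
    _, _, p21, p22 = exp_blocks(t_f - t)
    return -p21 / p22


def k_zero(t):
    """Candidate K(t) = -P12^{-1} P11 (assumes x(t_f) = 0; needs t < t_f)."""
    p11, p12, _, _ = exp_blocks(t_f - t)
    return -p11 / p12


def riccati_residual(k, t, h=1e-5):
    """Scalar Riccati residual dK/dt + 2*a*K + q - s*K**2 at time t.

    dK/dt is approximated by a central difference with step h.
    """
    k_dot = (k(t + h) - k(t - h)) / (2.0 * h)
    return k_dot + 2.0 * a * k(t) + q - s * k(t) ** 2


if __name__ == "__main__":
    for t in (0.0, 0.3, 0.6):
        print(t, riccati_residual(k_free, t), riccati_residual(k_zero, t))
```

Both residuals should come out near zero, up to finite-difference error, if the candidates are right. That is only a numerical check for one scalar example, not the general proof I am after.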

Can someone give me a pointer on how to prove this?