Optimal control input for nonlinear dynamics

This is from Chapter 1 of the book "Robust Nonlinear Control Design" by Freeman and Kokotovic.

Consider the system $$\dot{x} = -x^3 + u + wx,$$ where $u$ is an unconstrained input and $w$ is known to take values in the interval $[-1,1]$.

If we consider the cost functional, $$ J = \int_0^{\infty} (x^2 + u^2) dt, $$ then the optimal feedback law that minimizes the cost functional (in the worst case) is claimed to be given by $$ u = x^3 - x - x\sqrt{x^4 -2x^2 + 2}. $$

Question:

How is the optimal feedback law obtained?
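Before the attempts below, a quick numerical sanity check of my own (not from the book): simulating the worst-case dynamics ($w=1$) under the claimed law and under a naive comparison law $u=-x$ suggests the claimed law indeed incurs the smaller cost.

```python
import math

def u_star(x):
    # claimed optimal feedback law from the book
    return x**3 - x - x * math.sqrt(x**4 - 2 * x**2 + 2)

def rollout_cost(feedback, x0=1.0, w=1.0, dt=1e-3, T=50.0):
    # forward-Euler rollout of xdot = -x^3 + u + w*x,
    # accumulating the cost integrand x^2 + u^2
    x, J = x0, 0.0
    for _ in range(int(round(T / dt))):
        u = feedback(x)
        J += (x**2 + u**2) * dt
        x += (-x**3 + u + w * x) * dt
    return J, x

J_opt, xT_opt = rollout_cost(u_star)
J_naive, _ = rollout_cost(lambda x: -x)
print(J_opt, J_naive)  # the claimed law should give the smaller value
```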


Attempt #1: I tried using Pontryagin's Maximum Principle and got the following. The Hamiltonian is $$ H = x^2 + u^2 + \lambda(-x^3 + u + wx). $$ Taking derivatives with respect to $x$ and $u$ and solving (treating $x$ as constant when integrating the costate ODE), $$ \dot{\lambda} = -\frac{\partial H}{\partial x} = -2x + 3\lambda x^2 - \lambda w \Rightarrow \lambda = ke^{(3x^2-w)t}+\frac{2x}{3x^2-w}, $$ $$ \frac{\partial H}{\partial u} = 0 \Rightarrow u = -\frac{\lambda}{2}. $$ Substituting the first expression into the second, $$ u = -\frac{k}{2}e^{(3x^2-w)t} - \frac{x}{3x^2-w}. $$

The initial or final conditions for the ODE in $\lambda$ are not stated, so I cannot solve for $k$, but the form of the solution I obtained is quite different from the one given.

Does anyone have any ideas or alternative approaches?


Attempt #2 (added on Jan 5 2021): I also tried to work directly with the given optimal control input $u$ and the nonlinear dynamics.

Substituting $u$ into the nonlinear dynamics and using the worst-case $w = 1$ gives $\dot{x} = -x\sqrt{x^4 - 2x^2 + 2}$, so $$ \dot{x}^2 = x^2 (x^4 - 2x^2 +2) = x^6 - 2x^4 + 2x^2. $$

Also note that the nonlinear dynamics give an expression for $u$ that is independent of the given optimal control input: $$ \dot{x} = -x^3 + u + x \Rightarrow u = \dot{x} + x^3 - x. $$

With this expression of $u$, if $\dot{x}$ can be assumed to be $0$ (may not be a valid assumption), then the following can be said of the cost functional,

$$ \begin{aligned} J = \int_0^{\infty} (x^2 + u^2) dt &= \int_0^{\infty} (x^2 + (\dot{x} + x^3 - x)^2) dt\\ &= \int_0^{\infty} (x^2 + x^6 - 2x^4 + x^2) dt\\ &= \int_0^{\infty} (x^6 - 2x^4 + 2x^2) dt\\ &= \int_0^{\infty} \dot{x}^2 dt = 0, \end{aligned} $$ which is the minimum, since the cost functional is nonnegative.

Answer 1:

First of all, your solution for $\lambda$ as a function of time does not look correct. Namely, taking its time derivative yields

\begin{align} \dot{\lambda} &= \frac{d}{dt}\left(k\,e^{(3\,x^2 - w)\,t} + \frac{2\,x}{3\,x^2 - w}\right), \\ &= \left(3\,x^2 - w + 6\,x\,\dot{x}\,t\right) k\,e^{(3\,x^2 - w)\,t} - \frac{2\,(3\,x^2 + w)}{(3\,x^2 - w)^2} \dot{x}, \\ &= - 2\,x + 3\,x^2\,\lambda - w\,\lambda + \left(6\,x\,t\,\left(\lambda - \frac{2\,x}{3\,x^2 - w}\right) - \frac{2\,(3\,x^2 + w)}{(3\,x^2 - w)^2}\right) \dot{x}, \end{align}

which matches the desired costate equation only when $\dot{x} = 0$; the constant-$x$ assumption used to solve the ODE does not hold along trajectories.


Instead, one can solve it in a way similar to solving the infinite-horizon LQR problem using the PMP: consider only the initial conditions that bring the state to the origin as time goes to infinity. If $x$ or $u$ (and thus $\lambda$) does not go to zero as $t \to \infty$, the cost integral becomes infinite. Near the origin one can use a linearization of the dynamics. After substituting in the expression for $u$, the dynamics can be written as

$$ \begin{bmatrix} \dot{x} \\ \dot{\lambda} \end{bmatrix} = f(x,\lambda), $$

with

$$ f(x,\lambda) = \begin{bmatrix} -x^3 - 1/2\,\lambda + w\,x \\ -2\,x + 3\,x^2 \lambda - w\,\lambda \end{bmatrix}. $$

These dynamics in $z=\begin{bmatrix}x & \lambda\end{bmatrix}^\top$ can be linearized near an equilibrium point $\bar{z}$, i.e. a point where $\dot{z}=0$. The linearization is obtained from the Taylor expansion

$$ \dot{z} = f(\bar{z}) + \left.\frac{\partial\,f}{\partial z}\right|_{z=\bar{z}} (z - \bar{z}) + \text{h.o.t.}, $$

with $\text{h.o.t.}$ short for higher order terms. Applying this to the given dynamics at the equilibrium point $(x,\lambda)=(0,0)$ yields

$$ \begin{bmatrix} \dot{x} \\ \dot{\lambda} \end{bmatrix} = \underbrace{ \begin{bmatrix} w & -1/2 \\ -2 & -w \end{bmatrix}}_A \begin{bmatrix} x \\ \lambda \end{bmatrix} + \text{h.o.t.}. $$

It can be noted that the matrix $A$ has one unstable (positive) and one stable (negative) eigenvalue, namely $s = \pm\sqrt{w^2+1}$. The linearized system therefore converges to the origin only if its initial condition lies on the stable manifold, which can be shown to be $\lambda = 2\,(w + \sqrt{w^2 + 1})\,x$. For the nonlinear system, however, this manifold is valid only very close to the origin. The manifold of the nonlinear system could in principle be found by starting infinitesimally close to the origin, on the manifold of the linearized system, and solving backwards in time, but that is not an easy problem to solve analytically.
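The eigenstructure can be checked directly (my own sketch, not part of the original answer): since $\mathrm{tr}(A)=0$ and $\det(A)=-(w^2+1)$, the eigenvalues are $s=\pm\sqrt{w^2+1}$, and the first row of $(A - sI)v = 0$ gives the stable-eigenvector slope $\lambda/x = 2(w - s)$.

```python
import math

def stable_slope(w):
    # A = [[w, -1/2], [-2, -w]] has trace 0 and det -(w^2 + 1),
    # so its eigenvalues are s = +/- sqrt(w^2 + 1)
    s = -math.sqrt(w * w + 1)  # stable (negative) eigenvalue
    # first row of (A - s I) v = 0: (w - s) x - lam/2 = 0  =>  lam/x = 2 (w - s)
    return 2 * (w - s)

for w in (-1.0, 0.0, 1.0):
    print(w, stable_slope(w), 2 * (w + math.sqrt(w * w + 1)))  # last two columns should agree
```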

Because there are no constraints on the control input $u$ and the problem is time-invariant, the Hamiltonian remains constant along optimal trajectories (the choices for $\dot{\lambda}$ and $u$ ensure $\dot{H}=0$). The value of this constant can be obtained by substituting any state known to lie on the manifold, such as $(x,\lambda)=(0,0)$, which gives $H=0$. Substituting $u=-\lambda/2$ into the Hamiltonian yields

$$ H = x^2 + \lambda^2/4 + \lambda\,(-x^3 - \lambda/2 + w\,x). $$

From this expression it should also be clear why $H=0$ when $(x,\lambda)=(0,0)$. The equation $H=0$ can then be written as the following quadratic equation in $\lambda$:

$$ \lambda^2 + 4\,(x^3 - w\,x)\,\lambda - 4\,x^2 = 0, $$

which has the following two solutions

$$ \lambda = 2\,w\,x - 2\,x^3 \pm 2\,x \sqrt{(x^2 - w)^2 + 1}. $$
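A quick numeric spot check (mine, not part of the original answer) that both roots satisfy the quadratic, using the identity $(x^3 - wx)^2 + x^2 = x^2\big((x^2-w)^2+1\big)$:

```python
import math

def lam_roots(x, w):
    # lambda = 2wx - 2x^3 +/- 2x sqrt((x^2 - w)^2 + 1)
    r = 2 * x * math.sqrt((x * x - w) ** 2 + 1)
    return (2 * w * x - 2 * x**3 + r, 2 * w * x - 2 * x**3 - r)

def quad_residual(lam, x, w):
    # left-hand side of lambda^2 + 4 (x^3 - wx) lambda - 4 x^2 = 0
    return lam**2 + 4 * (x**3 - w * x) * lam - 4 * x * x

worst = max(abs(quad_residual(l, x, w))
            for x in (-1.5, -0.3, 0.7, 2.0)
            for w in (-1.0, 0.2, 1.0)
            for l in lam_roots(x, w))
print(worst)  # should be zero up to floating-point rounding
```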

However, only one of these two solutions matches the stable manifold $\lambda = 2\,(w+\sqrt{w^2+1})\,x$ of the linearized system near the origin, namely

$$ \lambda = 2\,w\,x - 2\,x^3 + 2\,x \sqrt{(x^2 - w)^2 + 1}. $$

Substituting this into $u = -\lambda/2$ yields the following solution for any $w$:

$$ u = x^3 - w\,x - x\sqrt{(x^2 - w)^2 + 1}. $$

This matches the given control law when $w = 1$. That is to be expected, since $w = 1$ makes the origin most unstable (when $u = 0$).
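Two consistency checks on this final formula (my own, not from the original answer): at $w=1$ it reduces to the law quoted in the question, since $(x^2-1)^2 + 1 = x^4 - 2x^2 + 2$, and near the origin $\lambda = -2u$ approaches the stable-manifold slope $2(w+\sqrt{w^2+1})$.

```python
import math

def u_general(x, w):
    # u = x^3 - w x - x sqrt((x^2 - w)^2 + 1), for any w in [-1, 1]
    return x**3 - w * x - x * math.sqrt((x * x - w) ** 2 + 1)

def u_book(x):
    # law quoted in the question (worst case, w = 1)
    return x**3 - x - x * math.sqrt(x**4 - 2 * x**2 + 2)

# agreement with the quoted law at w = 1
diff = max(abs(u_general(x, 1.0) - u_book(x)) for x in (-2.0, -0.5, 0.1, 1.3))

# small-x slope of lambda = -2u versus the linearized stable manifold
w, x = 0.4, 1e-6
slope_err = abs(-2 * u_general(x, w) / x - 2 * (w + math.sqrt(w * w + 1)))
print(diff, slope_err)  # both should be tiny
```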

Answer 2:

Not sure if you still need the answer, but you can get this from the following HJB equation (with the worst-case $w = 1$ already substituted into the dynamics):

$$ {\rm min}_{u} \ (-x^3+x+u) V_{x} + x^2 + u^2 = 0.$$

Here, $V_{x}$ is the derivative of the value function $V$ with respect to $x$. A condition for $u^{\star}$ to be a minimizer is $$ u^{\star} = -\frac{1}{2} V_{x}. $$ Substituting this condition into the HJB equation yields $$ -\frac{V_x^2}{4} + x^2 + V_x (x - x^3) = 0. $$ Solving for $V_x$ gives $$ V_{x} = 2 \left(x - x^3 \pm x \sqrt{2 - 2 x^2 + x^4} \right). $$ At this point you can plug the right expression for $V_x$ into $u^\star$ to get the feedback law.

But which root should we pick? The value function must be positive definite, since the objective functional is quadratic and hence nonnegative. If you integrate $V_x$ using the negative root, you get a function that cannot possibly be positive definite. On the other hand, if you integrate the expression with the positive root and add the constant term $\frac{1}{2}(\sqrt{2} + \sinh^{-1}(1))$, you get a function which is positive definite (in fact a Lyapunov function) for the closed-loop system induced by the feedback law $$ \mu(x) = -\frac{1}{2}V_{x} = x^3-x-x\sqrt{2-2x^2+x^4}. $$ Remember that you can always add a constant to the value function without changing the optimal feedback.
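A numerical sketch (my own; the antiderivative below comes from the substitution $s=x^2$, which gives $\int 2x\sqrt{x^4-2x^2+2}\,dx = \tfrac{s-1}{2}\sqrt{(s-1)^2+1} + \tfrac12\sinh^{-1}(s-1)$) confirming that the resulting $V$ vanishes at the origin, is positive on a sample grid, and has derivative matching the positive-root $V_x$:

```python
import math

C = 0.5 * (math.sqrt(2) + math.asinh(1))  # constant chosen so that V(0) = 0

def V(x):
    # V(x) = x^2 - x^4/2 + int 2x sqrt(x^4 - 2x^2 + 2) dx + C, via s = x^2
    s = x * x
    g = 0.5 * (s - 1) * math.sqrt((s - 1) ** 2 + 1) + 0.5 * math.asinh(s - 1)
    return s - 0.5 * s * s + g + C

def Vx(x):
    # positive-root solution of the quadratic HJB equation
    return 2 * (x - x**3 + x * math.sqrt(2 - 2 * x * x + x**4))

pos_ok = all(V(x) > 0 for x in (-2.0, -0.5, 0.3, 1.7))
h = 1e-6
deriv_err = max(abs((V(x + h) - V(x - h)) / (2 * h) - Vx(x))
                for x in (-1.2, -0.4, 0.8, 1.5))
print(V(0.0), pos_ok, deriv_err)
```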