Differential Dynamic Programming gives infinite values in the backward pass

I am implementing the Differential Dynamic Programming (DDP) algorithm to optimize the controls of a dynamic system that drives from an initial position to a goal position. The problem appears when I include the tensor products with $f_{xx}$, $f_{uu}$ and $f_{ux}$ in $Q_{xx}$, $Q_{uu}$ and $Q_{ux}$ (convention used in https://en.wikipedia.org/wiki/Differential_dynamic_programming). During the backward pass the $Q$’s grow so large that they end up as $\mathrm{NaN}$’s, and I do not know why, because the dynamics are pretty simple: $$ \underline{x_i} = \left[ \begin{array}{c} x_i \\ y_i \\ \theta_i \end{array} \right] $$ $$ \underline{u_i} = \left[ \begin{array}{c} \kappa_i \\ \Delta_{s_i} \end{array} \right] $$ $$ \underline{x_{i+1}} = \left[ \begin{array}{c} x_{i}+\Delta_{s_i} * \cos(\theta_i)\\ y_{i}+\Delta_{s_i} * \sin(\theta_i)\\ \theta_{i}+\Delta_{s_i} * \kappa_i \end{array} \right] $$
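In code, one step of these dynamics is simply (a minimal sketch; the helper name `dynamics_step` is my own):

```python
import numpy as np

def dynamics_step(x, u):
    """One step of the dynamics: x = [x, y, theta], u = [kappa, dS]."""
    kappa, dS = u
    return np.array([x[0] + dS * np.cos(x[2]),
                     x[1] + dS * np.sin(x[2]),
                     x[2] + dS * kappa])
```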

So my corresponding Jacobians are: $$ \begin{align} f_{u_i} &= \begin{bmatrix} 0 & \cos(\theta_{i}) \\ 0 & \sin(\theta_{i}) \\ \Delta_{s_i} & \kappa_i \end{bmatrix} \end{align} $$ $$ \begin{align} f_{x_i} &= \begin{bmatrix} 1 & 0 & -\Delta_{s_i}\sin(\theta_{i}) \\ 0 & 1 & \Delta_{s_i}\cos(\theta_{i}) \\ 0 & 0 & 1 \end{bmatrix} \end{align} $$ And for the Hessians I attach the Python code:

f_xx = np.array([
    [[0, 0, 0], [0, 0, 0], [0, 0, -dS * np.cos(theta_trailer)]],
    [[0, 0, 0], [0, 0, 0], [0, 0, -dS * np.sin(theta_trailer)]],
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]]])

f_uu = np.array([
    [[0, 0], [0, 0]],
    [[0, 0], [0, 0]],
    [[0, 1], [1, 0]]])

f_ux = np.zeros((x.shape[0], u.shape[0], x.shape[0]))
f_ux[:, :, 0] = [[0, 0], [0, 0], [0, 0]]
f_ux[:, :, 1] = [[0, 0], [0, 0], [0, 0]]
f_ux[:, :, 2] = [[0, -np.sin(theta_trailer)], [0, np.cos(theta_trailer)], [0, 0]]
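To rule out sign or indexing mistakes in these tensors, I compare them against a finite-difference estimate (a sketch; `f` and `fd_f_xx` are my own helper names, and `eps` is a tuning choice):

```python
import numpy as np

def f(x, u):
    """Dynamics: x = [x, y, theta], u = [kappa, dS]."""
    kappa, dS = u
    return np.array([x[0] + dS * np.cos(x[2]),
                     x[1] + dS * np.sin(x[2]),
                     x[2] + dS * kappa])

def fd_f_xx(x, u, eps=1e-4):
    """Finite-difference estimate of f_xx[i, j, k] = d^2 f_i / (dx_j dx_k)."""
    n = x.shape[0]
    H = np.zeros((n, n, n))
    for j in range(n):
        for k in range(n):
            ej = np.zeros(n); ej[j] = eps
            ek = np.zeros(n); ek[k] = eps
            H[:, j, k] = (f(x + ej + ek, u) - f(x + ej, u)
                          - f(x + ek, u) + f(x, u)) / eps**2
    return H

# Compare with the analytic f_xx at an arbitrary test point.
x = np.array([0.2, -0.1, 0.7])
u = np.array([0.4, 0.3])  # kappa, dS
theta, dS = x[2], u[1]
f_xx_analytic = np.array([
    [[0, 0, 0], [0, 0, 0], [0, 0, -dS * np.cos(theta)]],
    [[0, 0, 0], [0, 0, 0], [0, 0, -dS * np.sin(theta)]],
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]]])
assert np.allclose(fd_f_xx(x, u), f_xx_analytic, atol=1e-3)
```

The analogous checks for `f_uu` and `f_ux` also pass, so I believe the derivatives themselves are right.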

The specific problem occurs when updating the $Q$’s during the backward pass (I attach the lines), whereas if I drop the tensor-product terms (which reduces the method to the iterative Linear Quadratic Regulator, iLQR) everything runs fine.

        # Q_xx = l_xx + np.dot(f_x^T, np.dot(V'_xx, f_x)) + tensor_prod(V'_x, f_xx)
        Q_xx = l_xx[t] + np.dot(F_x[t].T, np.dot(V_xx, F_x[t])) + np.tensordot(V_x, F_xx[t], axes=1)

        # Q_ux = l_ux + np.dot(f_u^T, np.dot(V'_xx, f_x)) + tensor_prod(V'_x, f_ux)
        Q_ux = l_ux[t] + np.dot(F_u[t].T, np.dot(V_xx, F_x[t])) + np.tensordot(V_x, F_ux[t], axes=1)

        # Q_uu = l_uu + np.dot(f_u^T, np.dot(V'_xx, f_u)) + tensor_prod(V'_x, f_uu)
        Q_uu = l_uu[t] + np.dot(F_u[t].T, np.dot(V_xx, F_u[t])) + np.tensordot(V_x, F_uu[t], axes=1)

        # Calculate Q_uu^-1 with regularization term set by
        # Levenberg-Marquardt heuristic (at end of this loop)
        try:
            Q_uu_evals, Q_uu_evecs = np.linalg.eig(Q_uu)
            Q_uu_evals[Q_uu_evals < 0] = 0.0
            Q_uu_evals += lamb  # regularization term (Levenberg-Marquardt)
            Q_uu_inv = np.dot(Q_uu_evecs,
                              np.dot(np.diag(1.0 / Q_uu_evals), Q_uu_evecs.T))
        except np.linalg.LinAlgError:
            print("LinAlgError because of NaNs in Q_uu, breaking the loop ...")
            break

        k[t] = -np.dot(Q_uu_inv, Q_u)
        K[t] = -np.dot(Q_uu_inv, Q_ux)

        V_x = Q_x - np.dot(K[t].T, np.dot(Q_uu, k[t]))
        V_xx = Q_xx - np.dot(K[t].T, np.dot(Q_uu, K[t]))
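
One thing I did verify in the lines above is the `np.tensordot(..., axes=1)` convention: with the first axis of each Hessian indexing the state component $i$, `axes=1` contracts $V'_x$ against exactly that axis, which is what the DDP update requires. A quick check with random data of the same shapes as my problem:

```python
import numpy as np

rng = np.random.default_rng(0)
V_x = rng.standard_normal(3)           # value gradient, shape (n,)
F_xx = rng.standard_normal((3, 3, 3))  # f_xx[i, j, k] = d^2 f_i / (dx_j dx_k)

# axes=1 contracts the last axis of V_x with the first axis of F_xx,
# i.e. (V_x . f_xx)[j, k] = sum_i V_x[i] * f_xx[i, j, k]
a = np.tensordot(V_x, F_xx, axes=1)
b = np.einsum('i,ijk->jk', V_x, F_xx)
assert np.allclose(a, b)
```

So the contraction itself seems to do what the comments say.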

Do you know a possible reason why these matrices grow so much? I think my $f$’s are correct.