I am trying to see the effect of adding a linear term to the conventional quadratic cost function used in designing an LQR controller for the finite-horizon, free-terminal-state, discrete-time case. I expect the optimal input policy to change when I add the extra term to the objective function, but the result I get with the additional term is the same as the one with the conventional cost function. How can I bring out the effect of the additional term in the expression for the optimal control input?
All references I could find only consider the conventional cost function containing the z'Qz and i'Ri terms, where z is the state vector and i is the control input.
I have explained my approach below:
The given discrete-time plant model is \begin{align} z[k+1] &= \Phi z[k] +\Gamma i[k] \end{align}
The performance index $ J $ to be minimized, with the linear weighting term $q_t[k]$ in addition to the quadratic terms containing Q and R, is given as \begin{align} J &= \dfrac{1}{2} z[N]^T S[N] z[N] + \sum_{k=0}^{N-1} \left( \dfrac{1}{2} \left[ z[k]^TQz[k] + i[k]^TRi[k] \right]+{\color{red}q_t[k]z[k]}\right) \end{align} S and Q are positive semidefinite, and R is positive definite. I have highlighted the term I am adding in red.
The final state z[N] is free, while the initial state z[0] is fixed. The objective is to minimize J by choosing the optimal control input i[k]. At the final sampling instant N, the optimal performance index is \begin{align} J^*[N] = \dfrac{1}{2} z[N]^T S[N] z[N]. \end{align} For the penultimate time step, the performance index is \begin{align} J[N-1] = \dfrac{1}{2} z[N]^T S[N] z[N] + \left( \dfrac{1}{2} \left( z[N-1]^TQz[N-1] + i[N-1]^TRi[N-1] \right)+{\color{red}q_t[N-1]z[N-1]}\right) \end{align} Substituting for z[N] using the state equation, so that every term is expressed at the instant [N-1], \begin{align*} J[N-1] &= \dfrac{1}{2} \left( \Phi z[N-1] +\Gamma i[N-1]\right)^T S[N] \left( \Phi z[N-1] +\Gamma i[N-1]\right) \\&+ \left( \dfrac{1}{2} \left( z[N-1]^TQz[N-1] + i[N-1]^TRi[N-1] \right)+{\color{red}q_t[N-1]z[N-1]}\right) \end{align*} Differentiating with respect to i[N-1] and solving for the optimal control input, \begin{align} i^*[N-1] &= -\left( \Gamma^TS[N]\Gamma+R\right)^{-1}\Gamma^T S[N] \Phi z[N-1]\\ &= -K[N-1] z[N-1] \end{align} where the Kalman gain is given by $ K[N-1] = \left( \Gamma^T S[N] \Gamma+R\right)^{-1} \Gamma^T S[N] \Phi $. Substituting the expression for the optimal control input into the PI (performance index) at [N-1], the optimal PI at [N-1] is obtained as \begin{align*} J^*[N-1] &= \dfrac{1}{2} z[N-1]^T {\color{black} S[N-1] }z[N-1] + {\color{red}q_t[N-1]z[N-1]} \end{align*}
where the time-varying state weighting matrix S[N-1] is given by \begin{align*} S[N-1] = \left( \Phi -\Gamma K[N-1]\right)^T S[N] \left( \Phi -\Gamma K[N-1] \right) + Q + K[N-1]^T R K[N-1] \end{align*} From the computed S[N-1], the Kalman gain matrix and hence the optimal control input can be obtained. Since differentiating with respect to the input i[N-1] eliminates the weighting term introduced in addition to Q and R, the optimal input $i^* $ does not involve the additional weighting vector $ q_t[k] $.
Let $V_k$ be the value function at time $k$; then from dynamic programming we get that
$$\min_{i_k}\{V_{k+1}-V_k+\dfrac{1}{2}(z_k^TQz_k+i_k^TRi_k)+q_k^Tz_k\}=0.$$
So, while the expression for the optimal $i_k$ in terms of the value function does not change per se because of the linear term, the value and the structure of the value function do. In this case, for the HJB equation to be satisfied, the value function should take the form $V_k(z)=\dfrac{1}{2}z^TS_kz+b_k^Tz+c_k$, where $c_k$ is a constant that does not depend on $z$.
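As a rough sketch of where such an ansatz leads (a derivation under stated assumptions, not a quoted result): assume $V_k(z)=\tfrac{1}{2}z^TS_kz+b_k^Tz+c_k$ with $b_N=0$, and treat $q_k$ as a column vector. Carrying the minimization through then appears to give

$$i_k^*=-K_kz_k-\left(R+\Gamma^TS_{k+1}\Gamma\right)^{-1}\Gamma^Tb_{k+1},\qquad b_k=q_k+\left(\Phi-\Gamma K_k\right)^Tb_{k+1},$$

with $K_k$ and $S_k$ following the usual recursion. Because $b_N=0$, the last input $i^*_{N-1}$ has no offset, but $b_{N-1}=q_{N-1}$ already shifts $i^*_{N-2}$. The NumPy snippet below is one way to check this numerically; the function names `affine_lqr`, `rollout` and `cost`, and the double-integrator numbers, are hypothetical and chosen only for illustration.

```python
import numpy as np

def affine_lqr(Phi, Gam, Q, R, S_N, q):
    """Backward pass for the cost with an extra linear term q[k] @ z[k].

    q has shape (N, n). Returns lists K, d such that the candidate
    optimal input is i[k] = -K[k] @ z[k] - d[k].
    """
    N, n = q.shape
    S, b = S_N.copy(), np.zeros(n)          # S[N] and b[N] = 0
    K, d = [None] * N, [None] * N
    for k in range(N - 1, -1, -1):
        M = R + Gam.T @ S @ Gam
        K[k] = np.linalg.solve(M, Gam.T @ S @ Phi)   # feedback gain
        d[k] = np.linalg.solve(M, Gam.T @ b)         # offset caused by q
        A_cl = Phi - Gam @ K[k]
        S_prev = A_cl.T @ S @ A_cl + Q + K[k].T @ R @ K[k]
        b = q[k] + A_cl.T @ b                        # b[k] from b[k+1]
        S = S_prev
    return K, d

def rollout(Phi, Gam, K, d, z0):
    """Apply i[k] = -K[k] z[k] - d[k] from z0 and return the input sequence."""
    z, inputs = z0.copy(), []
    for Kk, dk in zip(K, d):
        u = -Kk @ z - dk
        inputs.append(u)
        z = Phi @ z + Gam @ u
    return np.array(inputs)

def cost(Phi, Gam, Q, R, S_N, q, z0, inputs):
    """Evaluate J for a given input sequence by simulating the plant."""
    z, J = z0.copy(), 0.0
    for k, u in enumerate(inputs):
        J += 0.5 * (z @ Q @ z + u @ R @ u) + q[k] @ z
        z = Phi @ z + Gam @ u
    return J + 0.5 * z @ S_N @ z

# Illustrative data (assumed, not from the question): a sampled double integrator.
Phi = np.array([[1.0, 0.1], [0.0, 1.0]])
Gam = np.array([[0.005], [0.1]])
Q, R, S_N = np.eye(2), np.array([[0.1]]), np.eye(2)
q = np.tile([1.0, 0.0], (5, 1))             # N = 5, linear weight on the first state
z0 = np.array([1.0, 0.0])

K, d = affine_lqr(Phi, Gam, Q, R, S_N, q)
inputs = rollout(Phi, Gam, K, d, z0)
print("offsets d[k]:", [float(x[0]) for x in d])
print("cost:", cost(Phi, Gam, Q, R, S_N, q, z0, inputs))
```

Printing `d` should show a zero offset at the last step and nonzero offsets before it, which is consistent with the observation that differentiating at step N-1 alone does not reveal the effect of the linear term.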