I have implemented an LQG controller in MATLAB, but I do not know how to determine the value of the optimal cost: each way of calculating it returns a different value. Which one should I trust when comparing with other methods? Moreover, the optimal trajectory (x) does not converge to zero as it does in LQR. This is my code:
% Problem: minimize J = x[N]' P[N] x[N] + L[x,u]
% s.t. x[k+1] = A x[k] + B u[k] + w[k]
% where L[x,u] = sum_{k=0}^{N-1} ( x[k]' Q x[k] + u[k]' R u[k] )
N = 50; % Horizon
% System Data
A = 1; B = 1; Q = 1; R = 1;
W = 1; P(:,:,N) = 1;
x(:,1) = 5;
% Calculate gain and Riccati
for k = N-1:-1:1
    Aux1 = inv(R + B' * P(:,:,k+1) * B);
    K(:,:,k) = - Aux1 * B' * P(:,:,k+1) * A;
    Aux2 = P(:,:,k+1) - P(:,:,k+1) * B * Aux1 * B' * P(:,:,k+1);
    P(:,:,k) = A' * Aux2 * A + Q;
end
% System Simulation
for i = 1:N-1
    w = mvnrnd(0,W);
    u(:,i) = K(:,:,i) * x(:,i);
    x(:,i+1) = A * x(:,i) + B * u(:,i) + w;
    if i == 1
        J(i) = x(:,i)' * Q * x(:,i) + u(:,i)' * R * u(:,i);
    else
        J(i) = x(:,i)' * Q * x(:,i) + u(:,i)' * R * u(:,i) + J(i-1);
    end
end
% Optimal Cost (calculated)
J_opt = J(N-1) + x(:,N)' * P(:,:,N) * x(:,N)
% Case I - Optimal Cost (Dynamic Programming)
Aux = 0;
for j = 1:N-1
    Aux = Aux + trace( P(:,:,j+1) * W);
end
V = x(:,1)' * P(:,:,1) * x(:,1) + Aux
% Case II - Optimal Cost (Dynamic Programming)
X0 = cov( x(:,1) );
V2 = trace( P(:,:,1) * X0) + Aux
In addition, I looked for lecture notes on this topic but did not find good teaching material. Could someone recommend some?
Keep in mind that the first method uses one specific noise realization, so its result is itself a random quantity. For example, consider the extreme case $N=1$: the single noise sample may be very close to or very far from zero. Only if you simulate the system many times should the average cost approach the expected value that the dynamic programming expression predicts.
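To illustrate the point about averaging, here is a minimal sketch that repeats the simulation for many independent noise realizations and compares the sample mean of the realized cost with the Case I value. It reuses the scalar data from the question; the number of runs `M`, the fixed seed, and the use of `sqrt(W) * randn` instead of `mvnrnd` (to avoid a toolbox dependency) are my own assumptions, not part of the original code.

```matlab
% Monte Carlo check of the expected LQG cost (scalar system from the question)
rng(0);                          % fixed seed, only for reproducibility
N = 50; A = 1; B = 1; Q = 1; R = 1; W = 1;
x0 = 5;
M = 20000;                       % number of noise realizations (assumed value)

% Backward Riccati recursion, same recursion as in the question
P = zeros(1, N); K = zeros(1, N-1);
P(N) = 1;
for k = N-1:-1:1
    S    = R + B' * P(k+1) * B;
    K(k) = -(S \ (B' * P(k+1) * A));
    P(k) = A' * (P(k+1) - P(k+1) * B * (S \ (B' * P(k+1)))) * A + Q;
end

% Expected cost predicted by dynamic programming (Case I)
V = x0' * P(1) * x0 + sum(P(2:N)) * W;

% Realized cost of each independent simulation
J = zeros(M, 1);
for m = 1:M
    x = x0;
    for k = 1:N-1
        u    = K(k) * x;
        J(m) = J(m) + x' * Q * x + u' * R * u;       % stage cost
        x    = A * x + B * u + sqrt(W) * randn;      % scalar Gaussian noise
    end
    J(m) = J(m) + x' * P(N) * x;                     % terminal cost
end

fprintf('DP expected cost V : %.3f\n', V);
fprintf('Monte Carlo mean   : %.3f\n', mean(J));
fprintf('Std of a single run: %.3f\n', std(J));
```

The spread of a single run shows why one realization of `J_opt` can differ noticeably from `V`, while the mean over many runs should land close to it. The same reasoning explains why x does not settle at zero: the noise keeps exciting the state at every step.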
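I am not sure where you got your expression for the optimal cost of Case II. Namely, MATLAB will always return zero when evaluating cov( x(:,1) ) if the state dimension is one, because the sample covariance of a single scalar is zero. Case II therefore drops the contribution of the initial state entirely.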
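If Case II was meant to cover a random initial state, the standard expression would look as follows. This is a sketch under the assumption that the initial state has mean $m_0$ and covariance $X_0$ (symbols I am introducing here, not taken from your code) and is independent of the noise, using the 1-based indexing of your code:

$$
E[J] = m_0^\top P_1 m_0 + \operatorname{tr}(P_1 X_0) + \sum_{k=1}^{N-1} \operatorname{tr}(P_{k+1} W)
$$

With a deterministic initial state, as in your code where x(:,1) = 5, we have $X_0 = 0$ and $m_0 = 5$, so the trace term in $X_0$ vanishes and the expression reduces to Case I.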