What could possibly be wrong, that causes instability in this simulation, even though Barbalat's lemma has been employed?


My dearest Math.Stackexchangers...,

I have been struggling with an issue for some time now, and no matter what I try, I cannot pin down the cause of this instability.

I have reproduced the example from Section IV-B (HEBBIAN FEEDBACK COVARIANCE OPTIMIZATION IN DYNAMIC SYSTEMS) of this paper.

The example used is of the form

$$ \dot x = \theta_0 - ax + b(u-\theta_1)$$

and the long-term objective function $J=\frac{1}{2}(x^2+u^2)$ to be minimized. With some manipulation, using an intermediate filtered variable $s_f(n+1)$, we obtain the short-term objective function $$ Q(s_f(n+1), u(n)) = \frac{s_f^2(n+1)}{2} + \frac{u^2(n)}{2}.$$
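(For reference: as I implement it, the filtered variable is $s_f(n+1) = x(n+1)/(aT) + \left(1 - 1/(aT)\right)x(n)$; rearranging gives
$$ s_f(n+1) = x(n) + \frac{x(n+1) - x(n)}{aT} \approx x + \frac{\dot x}{a}, $$
so at an equilibrium, where $\dot x = 0$, it coincides with the state $x$ itself. This is my own reading of it, not a statement from the paper.)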

This minimization is achieved by employing an adaptive feedback gain $W(n)$, where the control action is computed as $$u(n) = W(n) \cdot s_f(n). $$ Using Barbalat's lemma, an adaptation rule that guarantees convergence is derived as $$ \Delta W(n+1) = -k\cdot s_f^2(n+1) \cdot \left[ \delta s_f(n+1) \delta u(n) + \frac{u(n)}{s_f(n+1)} \delta u^2(n)\right],$$

given the objective function. If I follow these steps (with a bit more detail, as explained in the paper), I indeed end up with the state evolution shown in the paper's figures:

[figure: state $x$ vs. time $t$, converging]

However, when I change the system parameter $a$ from 1 to 0.5, for example, stability is lost!

[figure: state $x$ vs. time $t$, diverging]

How can that be, when stability is supposed to be guaranteed by Barbalat's lemma? As long as the adaptation rule derived from this lemma is adhered to, stability should be maintained no matter the system parameters, am I right? And even if all system parameters are assumed to be known exactly, all disturbances and noise are removed completely, and this information is used to compute the derivatives explicitly (see paper), the instability still occurs. (I found that both the adaptation rate and the system parameters influence whether convergence is achieved.) How can this ever be the case?

To be more specific about what I did, here is my MATLAB code (I know this is not Stack Overflow, but I am asking about mathematical principles rather than about finding bugs in code):

clear all;
close all;
clc;
%% Hebbian Feedback Covariance Example
% system info: 
% continuous time: xdot = theta0 - a*x + b*(u - theta1);
% exact discretization (u held constant over each step T):
% x(n+1) = exp(-a*T)*x(n) - b/a*(exp(-a*T)-1)*u(n)
%          - (theta0 - b*theta1)/a*(exp(-a*T)-1);
% filtered variable: sf(n+1) = x(n+1)/(a*T) + (1 - 1/(a*T))*x(n);
% control action: uf(n+1) = W(n)*sf(n);
% adaptation rule: DeltaW(n+1) = -k*sf(n+1)^2*(deltasf(n+1)*deltau(n) +
%                  u(n)/sf(n+1)*deltau(n)^2);

T = 0.1; % [s] Simulation Time step;
theta0 = 70;
theta1 = 2;
a = 1;
k = 0.4;

%% State Evolution
A = exp(-a*T);
t = 0:T:100;
b = -0.5;
W0 = -b/a;
u0 = (b^2*theta1 - b*theta0)/(b^2+a^2);
x0 = u0/W0;
x(1) = x0;
W(1) = W0;
uf(1) = u0;
sf(1) = x0;
DeltaW(1) = 0;
deltasf(1) = 0;
deltauf(1) = 0;
b1=-0.5;
b2=-0.75;
for n = 1:length(t)-1
    b=b1*(t(n)<10)+b2*(t(n)>=10);
    B = -b/a*(exp(-a*T)-1);
    C = (theta0 - b*theta1)/(-a)*(exp(-a*T)-1);
    d = 0.1*(rand-0.5); % optional disturbance (currently disabled)
    uf(n+1) = W(n)*sf(n); %+d;
    x(n+1) = A*x(n) + B*(uf(n+1)) + C;
    sf(n+1) = x(n+1)/(a*T) + (1-1/(a*T))*x(n);
    DeltaW(n+1) = -k*(b/a+uf(n+1)/sf(n+1)); %*sf(n+1)^2
    % DeltaW(n+1) = -k*sf(n+1)^2*B/(a*T)-k*sf(n+1)*uf(n+1); %*sf(n+1)^2;
    W(n+1) = W(n) + DeltaW(n+1);
end

plot(t, x);
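To double-check the initial conditions and the exact discretization used above, I translated the relevant lines into a quick Python sanity check (the variable names and the test point are mine, not from the paper):

```python
import math

# system parameters (same values as in the MATLAB script above)
theta0, theta1, a, b, T = 70.0, 2.0, 1.0, -0.5, 0.1

# optimal equilibrium from minimizing J = (x^2 + u^2)/2 subject to
# 0 = theta0 - a*x + b*(u - theta1), which gives u = -(b/a)*x at the optimum
u0 = (b**2 * theta1 - b * theta0) / (b**2 + a**2)
W0 = -b / a
x0 = u0 / W0

# the equilibrium residual of the ODE should vanish at (x0, u0)
residual = theta0 - a * x0 + b * (u0 - theta1)
print(x0, u0, residual)  # residual should be ~0

# exact discretization of xdot = theta0 - a*x + b*(u - theta1),
# with u held constant over one step T (matches A, B, C in the MATLAB code)
def step_exact(x, u):
    A = math.exp(-a * T)
    B = -b / a * (A - 1.0)
    C = (theta0 - b * theta1) / (-a) * (A - 1.0)
    return A * x + B * u + C

# compare against a fine-grained Euler integration over the same step
def step_euler(x, u, substeps=100000):
    dt = T / substeps
    for _ in range(substeps):
        x += dt * (theta0 - a * x + b * (u - theta1))
    return x

x_next_exact = step_exact(10.0, 5.0)
x_next_euler = step_euler(10.0, 5.0)
print(abs(x_next_exact - x_next_euler))  # agreement confirms the discretization
```

Both checks pass for me, so I believe the instability is not caused by the discretization or by the initial conditions.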

To summarize: is there something obvious I am missing, such as an additional condition for stability that is not met for certain parameter values?
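For completeness, here is my own algebraic reading of the adaptation rule (this factoring is mine, not from the paper): pulling out $s_f(n+1)\,\delta u(n)$,
$$ \Delta W(n+1) = -k\, s_f(n+1)\, \delta u(n) \left[ s_f(n+1)\, \delta s_f(n+1) + u(n)\, \delta u(n) \right] \approx -k\, s_f(n+1)\, \delta u(n)\, \delta Q(n+1), $$
since $\delta Q \approx s_f\,\delta s_f + u\,\delta u$ to first order. So the weight is pushed against the correlation between the control perturbation and the resulting change in the short-term cost, which is how I understand the "feedback covariance" idea. Please correct me if this reading is wrong.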

Thanks for the help!