I have an optimization problem where I attempt to identify an unknown system. I use a linear, discrete-time function approximator, say
$$ x_{k+1} = Ax_k. $$
The optimization method is gradient descent, where $A$ is updated by the rule $$ A_{2\times2} \leftarrow A_{2\times2} - \alpha \cdot \text{(gradient w.r.t. } A)_{2\times2}. $$
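To make the setting concrete, here is a minimal sketch of such an identification loop. Everything in it is an assumption for illustration: the true system `A_true`, the data generation, the least-squares loss, and the step size are all hypothetical, not from the question.

```python
import numpy as np

# Hypothetical setup: identify A_true from one-step transition data
# via gradient descent on a least-squares loss.
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1],
                   [0.0, 0.8]])              # a stable system (|eig| < 1)
X = rng.standard_normal((2, 50))             # sampled states x_k
Y = A_true @ X                               # next states x_{k+1} = A_true x_k

A = np.zeros((2, 2))                         # initial estimate
alpha = 0.01                                 # scalar step size
for _ in range(500):
    # gradient of 0.5 * mean ||A x_k - x_{k+1}||^2 w.r.t. A
    grad = (A @ X - Y) @ X.T / X.shape[1]
    A = A - alpha * grad                     # the update A <- A - alpha * grad
```

In this toy version the data is fixed, so nothing forces $A$ to stay stable; the question below is about the setting where each iterate of $A$ must itself generate the next batch of simulation data.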
All well and good. However, my $A$ matrix sometimes becomes unstable. This is a problem because $A$ generates the simulation data needed to compute the next gradient; when $A$ is unstable, the simulated trajectory diverges and the gradient becomes undefined.
My question is: Is there a way to regularize the scalar $\alpha$ in this setting such that $A$ remains stable throughout the optimization? I.e., writing the gradient as $B$, is there an $\alpha$ such that $$ |\operatorname{eig}(A-\alpha B)|<1 \quad \text{ given that } \quad |\operatorname{eig}(A)|<1 \quad? $$ That is, select a scalar $\alpha$ such that my function approximator never becomes unstable. Maybe one could even choose a matrix step size $$ \alpha = \begin{bmatrix}\alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{bmatrix} $$ and multiply it element-wise with the gradient, to treat each parameter of $A$ differently, as long as we still descend in the right direction. Any ideas?
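One practical heuristic (an assumption on my part, not something the question prescribes) is a backtracking step size: try a candidate $\alpha$, and shrink it until the updated matrix has spectral radius below 1. Since eigenvalues depend continuously on the matrix entries, a small enough $\alpha$ keeps $A - \alpha B$ stable whenever the current $A$ is strictly stable, so the loop terminates. The function name, thresholds, and example matrices below are all hypothetical.

```python
import numpy as np

def safe_step(A, grad, alpha0=0.1, rho_max=0.99, shrink=0.5, max_tries=30):
    """Backtracking step size: halve alpha until A - alpha*grad is stable.

    Heuristic sketch: rho_max < 1 leaves a stability margin, and
    max_tries bounds the search. Returns the updated matrix and the
    step size actually used (0.0 if no stable step was found).
    """
    alpha = alpha0
    for _ in range(max_tries):
        A_new = A - alpha * grad
        if np.max(np.abs(np.linalg.eigvals(A_new))) < rho_max:
            return A_new, alpha
        alpha *= shrink
    return A, 0.0  # skip the update rather than destabilize A

# Hypothetical example: a stable A and a gradient whose full step
# would push the (1,1) entry past 1.
A = np.array([[0.95, 0.0],
              [0.0,  0.9]])
grad = np.array([[-1.0, 0.0],
                 [0.0,  0.0]])
A_new, alpha = safe_step(A, grad)
```

In this example the candidate steps $\alpha = 0.1$ and $0.05$ both produce a spectral radius of at least 1, so the backtracking settles on $\alpha = 0.025$, keeping the iterate stable while still moving in the descent direction.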
If there's any information missing/required, let me know.