Solving a matrix equation using numerical optimization


To my knowledge, if $A \in \mathbf{S}^n_{++}$, then given any $b \in \mathbb{R}^n$, the system of linear equations $Ax = b$ has a unique solution $x^* \in \mathbb{R}^n$. Moreover, this solution $x^*$ minimizes the objective function $$f(x) = \frac{1}{2}x^TAx - b^Tx$$ It is known that $f$ is a smooth convex function. The gradient and the Hessian of $f$ are respectively $$\nabla f(x) = Ax - b$$ $$\nabla^2 f(x) = A \succ 0$$ So the unique solution $x^*$ of $Ax = b$ satisfies the first- and second-order optimality conditions. Although inefficient, we can apply any convex optimization method to $f$ in order to find the solution $x^*$.
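This is easy to illustrate numerically. A small sketch with NumPy/SciPy (the random SPD matrix and the choice of `scipy.optimize.minimize` with the CG method are my own illustration, not part of the question):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 5

# Build a random SPD matrix A and right-hand side b.
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)   # SPD by construction
b = rng.standard_normal(n)

f = lambda x: 0.5 * x @ A @ x - b @ x   # objective
grad = lambda x: A @ x - b              # its gradient

# A generic convex optimization method recovers the linear-system solution.
res = minimize(f, np.zeros(n), jac=grad, method="CG")
x_star = np.linalg.solve(A, b)
print(np.allclose(res.x, x_star, atol=1e-4))  # True
```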


Now my question is what if $A$ is a general nonsingular square matrix (not necessarily SPD) and we follow the same idea to define an objective function $$f(x) = \frac{1}{2}x^TAx - b^Tx$$ where $b \in R(A) = \mathbb{R}^n$. Then

  1. Is the unique solution $x^*$ of $Ax = b$ an optimal point of $f$?
  2. What kind of optimization methods would converge to $x^*$ given any initial guess (with the fastest rate)?
  3. Any improvements if $A$ has a positive real symmetric part?

My guess for the first question is that $x^*$ may be a saddle point of $f$, which satisfies the first-order optimality condition but not necessarily the second-order one. I don't know anything about non-convex optimization, so I have no ideas for the second and third questions. I know there are numerical linear algebra methods like conjugate gradient, which converges for SPD matrices, and GMRES, which converges for any nonsingular matrix, but I don't have much knowledge about the underlying principles or which methods are best either.

Any ideas and references are welcome. Thanks in advance. :)




There are several mistakes in the body of the question.

  1. For a general (not necessarily symmetric) $A$, $\nabla f(x) = \frac{1}{2}(A+A^T)x - b$.

Assume that $A$ is a square matrix. Then $x^*$ is a critical point of $f$ iff $(A+A^T)x^* = 2b$; as Thoth wrote, putting $A := \frac{1}{2}(A+A^T)$, we may assume that $A$ is symmetric, $f(x)=\frac{1}{2}x^TAx-b^Tx$, and the required equation becomes (*) $Ax^*=b$.
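For a general square $A$, the gradient $\nabla f(x) = \frac{1}{2}(A+A^T)x - b$ can be verified by finite differences. A NumPy sketch (the random test data is purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))   # general, nonsymmetric
b = rng.standard_normal(n)
x0 = rng.standard_normal(n)

f = lambda x: 0.5 * x @ A @ x - b @ x

# Central finite-difference approximation of the gradient of f at x0.
eps = 1e-6
fd = np.array([(f(x0 + eps * e) - f(x0 - eps * e)) / (2 * eps)
               for e in np.eye(n)])

# Only the symmetric part of A contributes to the quadratic form.
analytic = 0.5 * (A + A.T) @ x0 - b
print(np.allclose(fd, analytic, atol=1e-6))  # True
```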

  2. $A$ must be invertible; otherwise there is either no solution $x^*$ or infinitely many.

Your equation (*) is equivalent to $Bx^* = c$, where $B = A^2$ (a symmetric positive definite matrix, since $A$ is symmetric and invertible) and $c = Ab$. In this way the second problem is reduced to the first one: find the minimizer of $\frac{1}{2}x^TBx - c^Tx$.
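This reduction can be checked numerically. A NumPy/SciPy sketch, using a symmetric but indefinite $A$ of my own construction (so CG is safe only on $B = A^2$, not on $A$ itself):

```python
import numpy as np
from scipy.sparse.linalg import cg

rng = np.random.default_rng(2)
n = 6

# Symmetric, invertible, but indefinite A (mixed-sign eigenvalues).
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag([3.0, -2.0, 1.5, -1.0, 2.5, -3.5]) @ Q.T
b = rng.standard_normal(n)

# B = A^2 is symmetric positive definite, so CG applies.
B = A @ A
c = A @ b
x, info = cg(B, c)
print(info)                              # 0 signals convergence
print(np.allclose(A @ x, b, atol=1e-3))  # x also solves the original system
```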


In general you should look into iterative methods for solving $Ax=b$. If $A$ is not positive definite, there are provably no cheap methods. The rate of convergence and the effective complexity depend strongly on the structure of $A$, and preconditioning is almost a must.

There are fixed-point iteration based methods and Krylov subspace based methods (like CG). A good starting point is Wikipedia: https://en.wikipedia.org/wiki/Iterative_method
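As a minimal illustration of a Krylov method on a nonsymmetric system, here is a SciPy sketch with GMRES (the well-conditioned random test matrix is my own choice):

```python
import numpy as np
from scipy.sparse.linalg import gmres

rng = np.random.default_rng(3)
n = 50

# A nonsymmetric, well-conditioned, nonsingular matrix:
# eigenvalues cluster near 1, so GMRES converges quickly.
A = np.eye(n) + 0.5 * rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)

x, info = gmres(A, b)
print(info)                       # 0 signals convergence
print(np.linalg.norm(A @ x - b))  # small residual
```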

A caveat aside: you should avoid applying CG to the normal equations for large or ill-conditioned $A$, since the condition number of $A^TA$ is the square of the condition number of $A$.
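The squaring of the condition number is easy to observe numerically (a NumPy sketch with a random matrix of my own choosing):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((30, 30))

k = np.linalg.cond(A)                # 2-norm condition number of A
k_normal = np.linalg.cond(A.T @ A)   # condition number of the normal matrix
print(np.isclose(k_normal, k**2, rtol=1e-6))  # True
```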