Find minimum of $f(\mathbf{v})= \mathbf{v}^\top A \mathbf{v} - \sum_{i=1}^N \log((\mathbf{v}^\top \mathbf{u}_i)^2) $

87 Views Asked by At

Consider the function: $$ f(\mathbf{v})= \mathbf{v}^\top A \mathbf{v} - \log((\mathbf{v}^\top \mathbf{u})^2), $$ where $\mathbf{v}, \mathbf{u} \in \mathbb{R}^n$ and $A \in \mathbb{R}^{n \times n}$ is symmetric and positive definite.

I would like to find the solution(s) in $\mathbf{v}$ forming the minimum of this function.

Differentiating, we have $$f’(\mathbf{v}) = 2A \mathbf{v} - \frac{2 \mathbf{u}}{\mathbf{v}^\top \mathbf{u}}.$$

Setting this to zero and multiplying the LHS by $\mathbf{v}^\top$ gives the solution set as: $$\mathbf{v}^\top A \mathbf{v} = 1.$$

However, I’m unsure how to interpret the above and find the vectors that constitute as the minimum of the original function.

Now, can this then be extended to find the minimum of: $$f(\mathbf{v})= \mathbf{v}^\top A \mathbf{v} - \sum_{i=1}^N \log((\mathbf{v}^\top \mathbf{u}_i)^2),$$ with the known and independent $\mathbf{u}_i \in \mathbb{R}^n$?

2

There are 2 best solutions below

7
On

Rearranging a bit and renaming $\lambda=1/v^T u$ we obtain

$$Av=\lambda u\iff v=\lambda A^{-1}u$$

Dotting with $u$ yields the equation that determines $\lambda$:

$$\lambda^2=\frac{1}{u^T A^{-1}u}$$

which if $A$ is positive definite yields two solutions:

$$v=\pm\frac{A^{-1}u}{\sqrt{u^TA^{-1}u}}$$

both of which are minima of the function.

EDIT: To reflect the update of the question I will dive into the solution of the problem when there are multiple vectors $u_i$ that form an independent set.

We find that the critical points of the function are given by

$$\nabla_\mathbf{v}f=2\left[Av-\frac{\sum_{i=1}^N(v^T u_i)u_i}{\sum_{i=1}^N(v^Tu_i)^2}\right]=0$$

We note that this equation yields the constraint $v^T Av=1$, which is going to be important for the determination of the minimum. We also note that $N\leq n$ or we cannot have an independent set of vectors. For notational convenience we set $v^Tu_i=\lambda_i~,~ r^2=\sum_i\lambda^2_i$. These numbers have to be determined. Solving the equation for the critical points yields

$$v=\frac{\sum_i\lambda_i A^{-1}u_i}{r^2}$$

Dotting on the left with $u_j~,~j=1,...,N$ we obtain the set of equations $$r^2\lambda_j=\sum_{i}\lambda_i( u_j^TA^{-1}u_i)$$

We note that when we define the matrix $Q_{ij}:=u_i^TA^{-1}u_j$ this can be written as an eigenvalue problem

$$Q\lambda= r^2 \lambda$$

which basically says that the projections on the set of vectors $u$ have to be eigenvectors of the matrix $Q$- the representation of $A^{-1}$ in the subspace generated by the $u$'s. Since $A$ is invertible, $Q$ must be invertible as well (proof?) in which case it contains no zero eigenvalues. Furthermore, $A^{-1}$ is symmetric and all it's eigenvalues are real.

We are left with considering the space of solutions of the eigensystem above. Note that upon a first look, this is quite the eccentric eigenvalue problem; the eigenvalues are constrained to be the norm of the eigenvectors themselves! Not to worry however, for suppose the orthonormal solution to the eigenproblem of $Q$ is given by the set $\{\gamma_i\in\mathbb{R}, e_i\in \mathbb{R}^N\}$ (one should be always able to find such a solution for the matrices in question). Since the eigensystem is homogeneous we can divide it by a constant, so we will solve the equivalent problem

$$Q\frac{\lambda}{||\lambda||}=r^2\frac{\lambda}{||\lambda||}$$

We use the orthonormal solution above to construct the solution. Indeed we find that the solution is given by the set

$$\{r_i=\sqrt{\gamma_i},\lambda_i=\pm\sqrt{\gamma_i}e_i,~~ \gamma_i>0\}$$

There are no solutions allowed corresponding to negative eigenvalues. Since the value of the function on the critical points is

$$f(v)=1-\log(r_i^2)$$

The minimum is given by the vector that corresponds to the maximum positive eigenvalue of the matrix $Q$:

$$r=\max_i(\sqrt{\gamma_i})~,~ \lambda=\pm \sqrt{\gamma_{\max}}e_{\gamma_{\max}}~,~ f=1-\log\gamma_{\max}$$

Note that the minimum is always at least doubly degenerate, however it can have higher degeneracy if the maximum eigenvalue has multiplicity $>1$.

0
On

Your work suggests the problem is equivalent to maximizing $|v^\top u|$ subject to the constraint $v^\top A v = 1$.

Making the change of variables $z=A^{1/2} v$ and $y=A^{-1/2} u$, this is equivalent to maximizing $|z^\top y|$ subject to $z^\top z = 1$. By Cauchy-Schwarz, this is attained when $z = \pm y/\|y\|$. Working backwards, this leads to $v = A^{-1/2} z = \pm A^{-1/2} y/\|y\| = \pm \frac{A^{-1} u}{\sqrt{u^\top A^{-1} u}}$.