Minimum number of required function values to maximize a quadratic form


Consider a Rayleigh quotient $$f(\mathbf{x})=\frac{\mathbf{x}^H \mathbf{G}\mathbf{x}}{\mathbf{x}^H \mathbf{x}}$$ where all quantities are complex valued, $\mathbf{x}$ is an $N\times 1$ vector and $\mathbf{G}$ an $N\times N$ positive semidefinite matrix with (known) rank $r$; bold font indicates a vector/matrix. Maximizing $f(\mathbf{x})$ is trivial when $\mathbf{G}$ is known. My question concerns the case of unknown $\mathbf{G}$ (but known $r$).
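To make the "trivial when $\mathbf{G}$ is known" remark concrete, here is a minimal numpy sketch (with a hypothetical random rank-$r$ PSD matrix built as $C C^H$): the maximum of the Rayleigh quotient is the largest eigenvalue of $\mathbf{G}$, attained at the corresponding eigenvector.

```python
import numpy as np

rng = np.random.default_rng(0)

N, r = 5, 2
# Hypothetical example: a random complex PSD matrix of rank r, G = C C^H.
C = rng.standard_normal((N, r)) + 1j * rng.standard_normal((N, r))
G = C @ C.conj().T

def f(x, G):
    """Rayleigh quotient x^H G x / x^H x (real for Hermitian G)."""
    return (x.conj() @ G @ x).real / (x.conj() @ x).real

# With G known, max_x f(x) is the largest eigenvalue, attained at its eigenvector.
eigvals, eigvecs = np.linalg.eigh(G)  # ascending eigenvalue order
lam_max, v_max = eigvals[-1], eigvecs[:, -1]

assert np.isclose(f(v_max, G), lam_max)
```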

One has access to function values $f(\mathbf{x}_\ell), 1\leq \ell\leq L$, for a set of vectors $\{\mathbf{x}_\ell\}$ that can be chosen arbitrarily.

Consider now the two optimizations $$\max_\mathbf{x} f(\mathbf{x}) \quad \mathrm{and} \quad \arg \max_\mathbf{x} f(\mathbf{x}).$$

My question is about the smallest value of $L$ such that the problems are solvable (i.e., such that we are in the same situation as if $\mathbf{G}$ were known).

  1. Is this minimal $L$ different for the two problems?
  2. What is this minimum $L$, and how does the rank $r$ affect it?

I am interested in general results, not special cases with special structure of $\mathbf{G}$ (e.g., diagonal, etc.).

Some observations. For any $N$ and $r=1$, the two problems have different minimum $L$. The solution of the first problem is the largest eigenvalue, and since a rank-1 PSD matrix has exactly one nonzero eigenvalue, which therefore equals the trace, it suffices to obtain the trace of $\mathbf{G}$ from the measurements. This can be done with $L=N$ using $\mathbf{x}_\ell =\mathbf{e}_\ell$, where $\mathbf{e}_\ell$ is the $\ell$-th standard basis vector (all zeros except a 1 in position $\ell$). However, to solve the second problem, we need information about the eigenvectors of $\mathbf{G}$. This essentially requires knowledge of the full matrix $\mathbf{G}$, hinting that one needs $L=N^2$.
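The rank-1 observation can be checked numerically. Since $\mathbf{e}_\ell^H \mathbf{e}_\ell = 1$, each probe returns $f(\mathbf{e}_\ell) = G_{\ell\ell}$, and the sum of the $N$ probes is $\operatorname{tr}\mathbf{G}$, which for a rank-1 PSD matrix equals the single nonzero eigenvalue. A sketch with a hypothetical $\mathbf{G} = \mathbf{g}\mathbf{g}^H$:

```python
import numpy as np

rng = np.random.default_rng(1)

N = 6
# Hypothetical rank-1 PSD example: G = g g^H.
g = rng.standard_normal(N) + 1j * rng.standard_normal(N)
G = np.outer(g, g.conj())

def f(x, G):
    return (x.conj() @ G @ x).real / (x.conj() @ x).real

# L = N probes x_ell = e_ell, each returning the diagonal entry G_{ll}.
trace_est = sum(f(np.eye(N)[ell], G) for ell in range(N))

# For rank-1 PSD G, the trace equals the single nonzero eigenvalue,
# i.e., the value of max_x f(x).
lam_max = np.linalg.eigvalsh(G)[-1]
assert np.isclose(trace_est, lam_max)
```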

1 Answer

Heuristically, I would expect roughly $L \approx 2rN$ values to suffice. In particular, $G$ can be written as $G=CF$, where $C$ is an $N\times r$ matrix and $F$ an $r \times N$ matrix. Consider the entries of $C$ and $F$ as unknowns. Each observation gives one quadratic equation in these $2rN$ unknowns. If the $\mathbf{x}_\ell$ are chosen randomly, then after about $2rN+O(1)$ values are observed, I would (heuristically) expect this system of equations to have a unique solution, which is then sufficient to answer both of your problems.

Of course, this is not a proof. And of course, it could always take fewer values if you find a more data-efficient method to compute the answer, as you did in the rank-1 case by using the trace to solve the first optimization.
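As a numerical sanity check on the $L=N^2$ upper bound suggested in the question: each value $f(\mathbf{x}_\ell)$, multiplied by the known quantity $\mathbf{x}_\ell^H\mathbf{x}_\ell$, gives $m_\ell = \mathbf{x}_\ell^H \mathbf{G}\, \mathbf{x}_\ell$, which is a *real-linear* functional of the $N^2$ real parameters of a Hermitian $\mathbf{G}$ ($N$ real diagonal entries plus real and imaginary parts of the strict upper triangle). So $N^2$ generic probes determine $\mathbf{G}$ exactly by a linear solve, regardless of $r$. A hedged sketch (random probes, hypothetical rank-$r$ example):

```python
import numpy as np

rng = np.random.default_rng(2)

N, r = 4, 2
C = rng.standard_normal((N, r)) + 1j * rng.standard_normal((N, r))
G = C @ C.conj().T  # the unknown rank-r PSD matrix to be recovered

def f(x, G):
    return (x.conj() @ G @ x).real / (x.conj() @ x).real

# x^H G x = sum_i |x_i|^2 G_ii + sum_{i<j} 2[Re(c_ij) a_ij - Im(c_ij) b_ij],
# where c_ij = conj(x_i) x_j and G_ij = a_ij + i b_ij. Build one row per probe.
def row(x):
    coeffs = [abs(x[i]) ** 2 for i in range(N)]  # diagonal entries G_ii
    for i in range(N):
        for j in range(i + 1, N):
            c = x[i].conj() * x[j]
            coeffs.append(2 * c.real)   # coefficient of a_ij
            coeffs.append(-2 * c.imag)  # coefficient of b_ij
    return np.array(coeffs)

L = N * N  # N^2 random probes
X = rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N))
A = np.stack([row(x) for x in X])
m = np.array([f(x, G) * (x.conj() @ x).real for x in X])  # undo normalization

p = np.linalg.solve(A, m)  # recover the N^2 real parameters

# Reassemble the Hermitian matrix from the parameter vector.
G_hat = np.zeros((N, N), dtype=complex)
G_hat[np.diag_indices(N)] = p[:N]
k = N
for i in range(N):
    for j in range(i + 1, N):
        G_hat[i, j] = p[k] + 1j * p[k + 1]
        G_hat[j, i] = p[k] - 1j * p[k + 1]
        k += 2

assert np.allclose(G_hat, G)
```

This only confirms that $L=N^2$ always suffices; whether the $\approx 2rN$ count above is attainable in general is exactly the heuristic part of the answer.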