How do singular values of a matrix change when left multiplying by a row vector?


Suppose I have a matrix $M \in \mathbb{R}^{A \times B}$ with SVD $M = U S V^T$.

For a given vector $p \in \mathbb{R}^A$, can I compute the SVD of $p^T S V^T$ as a function of the original singular values $S$ and original right singular vectors $V$?

BEST ANSWER

Given the SVD of the $A\times B$ matrix $M$, $M=USV^\top$, and the $1\times B$ matrix $L:=p^\top SV^\top$, the SVD of $L$ can be worked out from that of $M$.

The only singular value of $L$ is the square root of the sole eigenvalue of the $1\times1$ matrix $LL^\top$: $$LL^\top=p^\top SV^\top VS^\top p=p^\top SS^\top p=r^2$$ where $$r^2=p_1^2\sigma_1^2+\cdots+p_n^2\sigma_n^2,\qquad n=\min(A,B).$$ Thus $L$ has a single nonzero singular value $r$, with one corresponding right singular vector.

The related right singular vector is the eigenvector of $L^\top L$: $$r^2v=L^\top Lv=VS^\top p\,p^\top SV^\top v=(p^\top SV^\top v)(VS^\top p)$$ so $v$ is the unit vector in the direction of $VS^\top p=p_1\sigma_1v_1+\cdots+p_n\sigma_nv_n$, where the $v_i$ are the right singular vectors (columns of $V$) of $M$.

Conclusion: an SVD of $L$ is $$\underbrace{\begin{bmatrix}1\end{bmatrix}}_{1\times1}\underbrace{\begin{bmatrix}r&0&\cdots&0\end{bmatrix}}_{1\times B}\underbrace{\begin{bmatrix}v^\top\\\vdots\end{bmatrix}}_{B\times B}$$ The first row of the new $V^\top$ matrix is the vector $v$ above, and the remaining rows are any set of orthonormal vectors orthogonal to $v$ (so the decomposition is not unique).
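To make the derivation concrete, here is a minimal NumPy sketch (the setup and variable names are mine, not from the answer) checking that the single singular value of $L=p^\top SV^\top$ is $r=\|S^\top p\|$ and that the right singular vector points along $VS^\top p$:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B = 4, 3
M = rng.standard_normal((A, B))
U, s, Vt = np.linalg.svd(M)          # s holds the min(A, B) singular values

# Full A x B singular-value matrix S, and a test vector p in R^A
S = np.zeros((A, B))
np.fill_diagonal(S, s)
p = rng.standard_normal(A)

L = p @ S @ Vt                        # the 1 x B matrix L = p^T S V^T

# Predicted singular value: r = sqrt(p_1^2 s_1^2 + ... ) = ||S^T p||
r = np.linalg.norm(S.T @ p)
assert np.isclose(np.linalg.norm(L), r)

# Predicted right singular vector: unit vector along V S^T p
v = Vt.T @ (S.T @ p)
v /= np.linalg.norm(v)
assert np.isclose(L @ v, r)           # L acts as r * v^T

# Cross-check against NumPy's own SVD of the row vector
sv = np.linalg.svd(L[None, :], compute_uv=False)
assert np.isclose(sv[0], r)
```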


As stated in my comment below your question (and in @Chrystomath's response), the general form of an SVD of a row vector $x^T\in\mathbb{R}^{1\times n}$ is:

$\left[1\right]\ \left[||x||\ \ 0^{n-1}\right]\ \left[\dfrac{x}{||x||}\ \ \ U_{\bot x}\right]^T\tag{1}$

Where $0^{n-1}$ denotes $n-1$ zeros and $U_{\bot x}$ is any orthonormal basis for the orthogonal complement of $\mathbb{R}x$ in $\mathbb{R}^n$, i.e., the part of $\mathbb{R}^n$ orthogonal to $x$. (In general, both $0^{n-1}$ and $U_{\bot x}$ can be dropped to obtain a truncated SVD.)
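As an illustration of form (1), here is a NumPy sketch that builds such an SVD explicitly; the helper name and the QR-based construction of $U_{\bot x}$ are my own, not from the answer, and assume $x\neq0$:

```python
import numpy as np

def row_vector_svd(x):
    """Hypothetical helper: SVD of the 1 x n row vector x^T in form (1)."""
    n = x.size
    s = np.zeros(n)
    s[0] = np.linalg.norm(x)             # the single singular value ||x||
    # Full QR of x as a column gives Q = [x/||x|| | U_{perp x}] up to sign
    Q, R = np.linalg.qr(x.reshape(-1, 1), mode="complete")
    if R[0, 0] < 0:                      # pin the first column to +x/||x||
        Q[:, 0] *= -1
    U = np.array([[1.0]])
    Vt = Q.T                             # first row of V^T is x^T/||x||
    return U, s, Vt

x = np.array([3.0, 4.0, 0.0])
U, s, Vt = row_vector_svd(x)
Sigma = s[None, :]                       # 1 x n: [||x||, 0, ..., 0]
assert np.allclose(U @ Sigma @ Vt, x[None, :])   # reconstructs x^T
assert np.allclose(Vt @ Vt.T, np.eye(3))         # V is orthogonal
```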


That said, $p^TSV^T$ can only be loosely interpreted in terms of $M$, to such an extent that it's largely irrelevant that $S$ and $V$ came from an SVD of $M$.

  1. Assuming that $M$ is square, if $USV^T$ is an SVD of $M$, then so is $(UP^T)(PSP^T)(PV^T)$ for any permutation matrix $P$. This is because $UP^T$ and $VP^T$ are still orthogonal, and $PSP^T$ is still diagonal, containing exactly the singular values of $M$. (This is easily extended to the rectangular case.) This means that, if $p^TSV^T$ is a possible outcome of your algorithm, then so is $p^TPSV^T$ for any permutation matrix $P$.

  2. If a singular value has multiplicity $k$ then the corresponding spaces of singular vectors have dimension $k$. This means that any orthonormal basis for the corresponding spaces can be used in $U$ and $V$. For example, suppose $M=I$. Since $UIU^T=I$ for any orthogonal $U$, this means that $UIU^T$ is a possible SVD of $I$. So, $p^TU^T$ would be a possible outcome of your algorithm when $M=I$, i.e., literally any orthogonal transformation of $p^T$. (The same principle can be applied more locally with repeated singular values.)
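Both ambiguities can be checked numerically. The following NumPy sketch (setup and values are mine, purely illustrative) verifies points 1 and 2 above:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
M = rng.standard_normal((n, n))
U, s, Vt = np.linalg.svd(M)
S = np.diag(s)

# Point 1: (U P^T)(P S P^T)(P V^T) is another valid SVD of the same M
P = np.eye(n)[::-1]                  # a permutation matrix (here: reversal)
assert np.allclose(U @ P.T @ (P @ S @ P.T) @ (P @ Vt), M)

# ...so p^T S V^T and p^T (P S P^T)(P V^T) = p^T P S V^T are both
# possible outcomes, and they generally differ:
p = np.array([1.0, 2.0, 3.0])
assert not np.allclose(p @ S @ Vt, p @ P @ S @ Vt)

# Point 2: with M = I, U I U^T = I for any orthogonal U, so the
# algorithm may return p^T U^T -- an arbitrary rotation of p
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
assert np.allclose(Q @ np.eye(n) @ Q.T, np.eye(n))
out = p @ np.eye(n) @ Q.T            # same length as p, arbitrary direction
assert np.isclose(np.linalg.norm(out), np.linalg.norm(p))
```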

In summary, $p^TSV^T$ itself is not well-defined in terms of $M$; you just need to multiply the whole thing out once you have $S$ and $V$. This essentially reduces the problem to finding an SVD for an arbitrary row vector.


(An exception to all of this is if $p$ is dependent on $U$, in which case the question is completely different. If that's the case, my answer would be quite a bit different.)


Edit: Just to show that these concerns aren't pedantry, here is a working demonstration.

Let $A$ be an $n\times n$ matrix of random values. In general, $A$ will be invertible; however, $AA^{-1}$ will differ from $I$ slightly, due to numerical errors. This will cause the singular values of $AA^{-1}$ to vary slightly from $1$, affecting how iterative algorithms generate $U$ and $V$. This simulates how the SVD algorithm ends up choosing an arbitrary basis when a singular value is repeated.

The following function in the R programming language demonstrates this:

f <- function(p) {
  n <- ncol(p)

  # Generate a random n x n matrix
  A <- matrix(rnorm(n^2), nrow = n, ncol = n)

  # Approximate I with A A^-1
  I <- A %*% solve(A)

  # Compute the SVD of the approximate identity
  I.svd <- svd(I)

  # Compute p S V^T
  val <- p %*% diag(I.svd$d) %*% t(I.svd$v)

  # Return the two Frobenius-norm errors
  list(p.error = norm(p - val, type = 'F'),
       I.error = norm(diag(n) - I, type = 'F'))
}


# Call f with the row vector [1,2,3]
f(cbind(1,2,3))

f approximates $M=I$ by multiplying a random matrix by its inverse, then computes $pSV^T$ using the provided row vector $p$ and the SVD of $M$. Ideally, the result would be close to $pII^T=p$, but because a slight perturbation of a repeated singular value has a large effect on the singular vectors the SVD algorithm returns, the result is nowhere near $p$.

Below is a table of 10 sequential calls to $f$ with $p=[1,2,3]$. The first column is the error between $p$ and $pSV^T$ and the second is the error between $AA^{-1}$ and $I$. (Both using the Frobenius norm.)

      p.error  I.error     
 [1,] 3.844951 5.551115e-16
 [2,] 4.978845 1.073535e-15
 [3,] 3.04258  3.002225e-16
 [4,] 6.458657 4.014976e-16
 [5,] 4.472136 5.087681e-16
 [6,] 1.414214 2.237726e-16
 [7,] 6.224307 4.007774e-16
 [8,] 3.260453 6.426114e-16
 [9,] 3.545585 3.521788e-16
[10,] 3.406453 1.428806e-16

Notice how the error in approximating $I$ is $O(10^{-16})$, whereas the error between $pSV^T$ and its "intuitive" result $p$ is $O(||p||)$. The latter errors completely disappear if you use $I$ with no perturbations, however.