Given $[S]$, $[A]$, and $[D]$, can we solve $[S]$ = $[A]$ $[B]$ + $[C]$ $[D]$ for $[B]$ and $[C]$?


Given matrices $[\mathbf S]\in\Bbb R^{m\times n}$, $[\mathbf A]\in\Bbb R^{m\times p}$, $[\mathbf D]\in\Bbb R^{q\times n}$, with $(m\times n) > (p\times n + m\times q)$, how can we solve the following simultaneously for the least-squares best-fit $\color{red}{[\mathbf B]}$ and $\color{red}{[\mathbf C]}$? $$[\mathbf S]_{m\times n}=[\mathbf A]_{m\times p}\color{red}{[\mathbf B]_{p\times n}}+\color{red}{[\mathbf C]_{m\times q}}[\mathbf D]_{q\times n}$$ For context, consider $[\mathbf S]$ as a matrix of $n$ spectra having $m$ channels. The spectra are linear combinations of pure components, some of which have known spectra, $[\mathbf A]$, but unknown concentration scaling factors, $\color{red}{[\mathbf B]}$, plus other components with unknown spectra, $\color{red}{[\mathbf C]}$, but known concentrations, $[\mathbf D]$. Because we collected enough spectra from enough different mixtures, $(m\times n) > (p\times n + m\times q)$, there should be more knowns than unknowns in the system. I've had a little luck with an iterative approach, but can't help but think that there must be a way of arriving at some analytical solution (or family of solutions).
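For concreteness, here is a minimal NumPy sketch of the setup, with arbitrary placeholder dimensions and random data standing in for real spectra; it just constructs a consistent $\mathbf S = \mathbf{AB} + \mathbf{CD}$ and checks the knowns-vs-unknowns count:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p, q = 8, 6, 2, 2               # channels, spectra, known-spectrum and known-concentration components

A = rng.standard_normal((m, p))       # known pure-component spectra
D = rng.standard_normal((q, n))       # known concentrations
B_true = rng.standard_normal((p, n))  # unknown scaling factors (to be recovered)
C_true = rng.standard_normal((m, q))  # unknown spectra (to be recovered)

S = A @ B_true + C_true @ D           # measured spectra (noise-free here)

# more knowns (entries of S) than unknowns (entries of B and C)
assert m * n > p * n + m * q
```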


1 Answer


Rearrange the equation to create a squared-sum residual error statement: $$\mathbf E=\sum_{i=1}^m \sum_{j=1}^n(S_{ij}-[AB]_{ij}-[CD]_{ij})^2$$ where $$[AB]_{ij}=\sum_{k=1}^p a_{ik} b_{kj}$$ and $$[CD]_{ij}=\sum_{l=1}^q c_{il} d_{lj}$$ Now minimize $$\mathbf E=\sum_{i=1}^m \sum_{j=1}^n\Bigl(S_{ij}-\sum_{k=1}^p a_{ik} b_{kj}-\sum_{l=1}^q c_{il} d_{lj}\Bigr)^2$$ over $b$ and $c$ by setting the partial derivatives of $\mathbf E$ with respect to each entry $b_{kj}$ and $c_{il}$ equal to zero: $$\frac {\partial E}{\partial b_{kj}} =\sum_{i=1}^m 2\Bigl(S_{ij}-\sum_{k'=1}^p a_{ik'} b_{k'j}-\sum_{l=1}^q c_{il} d_{lj}\Bigr)(-a_{ik})=0$$ (using the chain rule) and $$\frac {\partial E}{\partial c_{il}} =\sum_{j=1}^n 2\Bigl(S_{ij}-\sum_{k=1}^p a_{ik} b_{kj}-\sum_{l'=1}^q c_{il'} d_{l'j}\Bigr)(-d_{lj})=0$$ In matrix form: $$\frac {\partial E}{\partial \mathbf B} = -2\mathbf A^\top(\mathbf S-\mathbf {AB}-\mathbf {CD})=\mathbf 0$$ and $$\frac {\partial E}{\partial \mathbf C} = -2(\mathbf S-\mathbf {AB}-\mathbf {CD})\mathbf D^\top=\mathbf 0$$ where the transpose $(^\top)$ makes the matrix dimensions match up.
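As a sanity check on the matrix form of the gradient, one can compare $-2\mathbf A^\top(\mathbf S-\mathbf{AB}-\mathbf{CD})$ against finite differences of $\mathbf E$ with respect to the entries of $\mathbf B$. A sketch with arbitrary placeholder dimensions and random data:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, p, q = 5, 4, 2, 2
A, D = rng.standard_normal((m, p)), rng.standard_normal((q, n))
B, C = rng.standard_normal((p, n)), rng.standard_normal((m, q))
S = rng.standard_normal((m, n))  # arbitrary data: the gradient need not vanish here

E = lambda B_: np.sum((S - A @ B_ - C @ D) ** 2)
grad_analytic = -2 * A.T @ (S - A @ B - C @ D)  # dE/dB in matrix form

# central finite differences, one entry of B at a time
eps = 1e-6
grad_fd = np.zeros_like(B)
for k in range(p):
    for j in range(n):
        Bp, Bm = B.copy(), B.copy()
        Bp[k, j] += eps
        Bm[k, j] -= eps
        grad_fd[k, j] = (E(Bp) - E(Bm)) / (2 * eps)

print(np.max(np.abs(grad_analytic - grad_fd)))  # small (finite-difference error only)
```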

Now distribute, simplify, and rearrange: $$\mathbf A^\top \mathbf S = \mathbf A^\top \mathbf{A\color{red}B}+\mathbf A^\top \mathbf{\color{red}CD}$$ and $$\mathbf{SD}^\top=\mathbf{A\color{red}BD}^\top+\mathbf{\color{red}CDD}^\top$$

Two equations, two unknowns. Solve one for one variable and substitute into the other. Solving the first for $\mathbf {\color{red}B}$ $$(\mathbf A^\top \mathbf A)^{-1}\mathbf A^\top \mathbf S=\mathbf {\color{red}B} +(\mathbf A^\top \mathbf A)^{-1}\mathbf A^\top \mathbf {\color{red}C} \mathbf D$$ Note that $(\mathbf A^\top \mathbf A)^{-1}\mathbf A^\top$ is the left pseudoinverse of $\mathbf A$, or $\mathbf A^+$, then:

$$\mathbf{\color{red}B}=\mathbf A^+(\mathbf S-\mathbf{\color{red}CD})$$
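A quick numerical check (random full-column-rank $\mathbf A$, sizes arbitrary) that the explicit formula $(\mathbf A^\top\mathbf A)^{-1}\mathbf A^\top$ agrees with NumPy's Moore–Penrose pseudoinverse, and that it is a one-sided inverse only:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((8, 3))            # full column rank with probability 1
A_plus = np.linalg.inv(A.T @ A) @ A.T      # left pseudoinverse, as above

assert np.allclose(A_plus, np.linalg.pinv(A))
assert np.allclose(A_plus @ A, np.eye(3))  # A^+ A = I, but A A^+ != I in general
```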

Substituting into the second and solving for $\mathbf{\color{red}C}$: $$\mathbf{SD}^\top=\mathbf A\Bigl(\mathbf A^+(\mathbf S-\mathbf{\color{red}C}\mathbf D)\Bigr)\mathbf D^\top+\mathbf{\color{red}C}\mathbf{DD}^\top$$

$$\mathbf{SD}^\top=\mathbf A(\mathbf A^+ \mathbf{SD}^\top-\mathbf A^+\mathbf{\color{red}CDD}^\top)+\mathbf{\color{red}CDD}^\top$$

Note that $\mathbf A\mathbf A^+$ is $(m\times m)$ square and is not an identity matrix! $$(\mathbf I-\mathbf A\mathbf A^+)\mathbf{SD}^\top=(\mathbf I-\mathbf {AA}^+) \mathbf{\color{red}C}\mathbf{DD}^\top$$ Almost there. Cancelling the common left factor $(\mathbf I-\mathbf{AA}^+)$ from both sides (a step to revisit: $\mathbf I-\mathbf{AA}^+$ is a projection, hence singular, so it has no inverse and this cancellation picks out only one solution): $$\mathbf{SD}^\top=\mathbf{\color{red}C}\mathbf{DD}^\top$$ $$\mathbf{\color{red}C}=\mathbf{SD}^\top(\mathbf{DD}^\top)^{-1}$$

Hmm. Maybe something problematic here: $\mathbf{\color {red} C}$ appears to be independent of $\mathbf A$; it is simply $\mathbf{SD}^+$, where $\mathbf D^+=\mathbf D^\top(\mathbf{DD}^\top)^{-1}$ is the right pseudoinverse of $\mathbf D$.
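The right pseudoinverse can be checked the same way as the left one: for a random full-row-rank $\mathbf D$ (sizes arbitrary), $\mathbf D^\top(\mathbf{DD}^\top)^{-1}$ matches `np.linalg.pinv` and is a one-sided inverse:

```python
import numpy as np

rng = np.random.default_rng(3)
q, n, m = 2, 7, 5
D = rng.standard_normal((q, n))            # full row rank with probability 1
S = rng.standard_normal((m, n))

D_plus = D.T @ np.linalg.inv(D @ D.T)      # right pseudoinverse of D
C = S @ D_plus                             # C = S D^+, independent of A

assert np.allclose(D_plus, np.linalg.pinv(D))
assert np.allclose(D @ D_plus, np.eye(q))  # D D^+ = I, but D^+ D != I in general
```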

Now back to $\mathbf{\color {red} B}$: $$\mathbf{\color{red}B}=\mathbf A^+(\mathbf S-(\mathbf{SD}^+)\mathbf{D})$$ Note that $\mathbf D^+\mathbf D$ is $(n \times n)$ square and is not an identity matrix! $$\mathbf{\color{red}B}=\mathbf A^+\mathbf S(\mathbf I-\mathbf{D}^+\mathbf{D})$$
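One can verify numerically that the pair $\mathbf C=\mathbf{SD}^+$, $\mathbf B=\mathbf A^+\mathbf S(\mathbf I-\mathbf D^+\mathbf D)$ reproduces $\mathbf S$ exactly when the data are consistent (noise-free). A sketch with arbitrary dimensions and random data:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, p, q = 8, 6, 2, 2
A = rng.standard_normal((m, p))
D = rng.standard_normal((q, n))
# consistent data: S lies exactly in the model class
S = A @ rng.standard_normal((p, n)) + rng.standard_normal((m, q)) @ D

A_plus, D_plus = np.linalg.pinv(A), np.linalg.pinv(D)
C = S @ D_plus                             # C = S D^+
B = A_plus @ S @ (np.eye(n) - D_plus @ D)  # B = A^+ S (I - D^+ D)

print(np.max(np.abs(S - A @ B - C @ D)))   # ~ machine precision for consistent S
```

For noisy $\mathbf S$ the residual works out to $(\mathbf I-\mathbf{AA}^+)\mathbf S(\mathbf I-\mathbf D^+\mathbf D)$, the part of $\mathbf S$ that neither $\mathbf A$'s column space nor $\mathbf D$'s row space can explain.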

My sense here, however, is that these solutions for $\mathbf{\color {red} B}$ and $\mathbf{\color {red} C}$ may be just the first members of a whole family of solutions. In other words, the solution is not unique; but why that is so (is the system singular but consistent?), what the family of solutions looks like, and what this all means in this situation are still unclear to me. Any help would be appreciated!
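One concrete way to see the non-uniqueness: for any $p\times q$ matrix $\mathbf M$, the pair $(\mathbf B-\mathbf{MD},\ \mathbf C+\mathbf{AM})$ gives exactly the same fit, because the $\mathbf{AMD}$ terms cancel: $\mathbf A(\mathbf B-\mathbf{MD})+(\mathbf C+\mathbf{AM})\mathbf D=\mathbf{AB}+\mathbf{CD}$. A sketch (the shift matrix `M` is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, p, q = 8, 6, 2, 2
A = rng.standard_normal((m, p))
D = rng.standard_normal((q, n))
B = rng.standard_normal((p, n))
C = rng.standard_normal((m, q))
M = rng.standard_normal((p, q))  # any p x q matrix generates another solution

fit1 = A @ B + C @ D
fit2 = A @ (B - M @ D) + (C + A @ M) @ D  # the A M D terms cancel

assert np.allclose(fit1, fit2)
```

So the solution set is (at least) a $pq$-parameter family, which is consistent with the iterative approach wandering among equally good fits.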