How to find a matrix $\mathbf{W} = \text{argmin}_{\mathbf{W}} \mathbb{E} [ ||\mathbf{W}^T X - g(X)||^2]$ with a random vector $X$?


Assume $X$ is a random vector in $\mathbb{R}^d$ and $g: \mathbb{R}^d \to \mathbb{R}^D$ is a function.

Then how do we find a matrix $\mathbf{W} = \text{argmin}_{\mathbf{W}} \mathbb{E} [ ||\mathbf{W}^T X - g(X)||^2]$?

Can we prove $$ \mathbf{W} = \lim_{N \to \infty}(\mathbf{X}_N^T\mathbf{X}_N)^{-1}\mathbf{X}_N^Tg(\mathbf{X}_N) $$ where $\mathbf{X}_N = [\mathbf{x}_1 \dots \mathbf{x}_N]^T$ with the $\mathbf{x}_i$ sampled from the distribution of $X$, and $g$ applied to each row of $\mathbf{X}_N$, by, for example, the CLT?

Or is there any other approach to find (or express) it or to approximate it?

Is there a textbook that gives a rigorous derivation of multivariate regression?


There is 1 solution below


For a vector argument $x$, consider the function $$y(x)=W^Tx-g(x)$$ Draw samples $\{x_i\}$ from the distribution, write $g_i = g(x_i)$, and evaluate the function $$\eqalign{ y_i &= W^Tx_i - g_i \cr Y &= W^TX - G \cr }$$ where $(Y,X,G)$ are matrices whose columns are the vectors $(y_i,x_i,g_i)$ respectively.

Up to the constant factor $1/N$, which does not affect the minimizer, the empirical expectation after $N$ samples is $$\eqalign{ E &= \|Y\|_F^2 = Y:Y \cr }$$ where the colon denotes the trace/Frobenius product, i.e. $\,\,A:B={\rm tr}(A^TB)$
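
As a quick sanity check on this notation, here is a minimal NumPy sketch. The dimensions, the random seed, and the random matrix standing in for $G$ are all arbitrary assumptions made for illustration. It compares three ways of writing the same quantity: the squared Frobenius norm, the trace form $Y:Y$, and the sum of squared residual norms $\sum_i \|y_i\|^2$.

```python
import numpy as np

rng = np.random.default_rng(0)
d, D, N = 3, 2, 5                 # arbitrary illustrative sizes

X = rng.normal(size=(d, N))       # columns are the samples x_i
G = rng.normal(size=(D, N))       # columns stand in for g(x_i)
W = rng.normal(size=(d, D))       # any candidate W

Y = W.T @ X - G                   # columns are the residuals y_i

frob_sq = np.linalg.norm(Y, "fro") ** 2                      # ||Y||_F^2
trace_form = np.trace(Y.T @ Y)                               # Y:Y = tr(Y^T Y)
sum_of_norms = sum(np.sum(Y[:, i] ** 2) for i in range(N))   # sum_i ||y_i||^2

print(np.isclose(frob_sq, trace_form), np.isclose(frob_sq, sum_of_norms))
```

Dividing any of these by $N$ gives the sample mean of $\|W^Tx_i - g_i\|^2$.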

Calculate the differential and gradient of $E$ $$\eqalign{ dE &= 2Y:dY = 2Y:dW^T\,X = 2YX^T:dW^T = 2XY^T:dW \cr \frac{\partial E}{\partial W} &= 2XY^T = 2X(W^TX-G)^T = 2(XX^TW-XG^T) \cr }$$ Minimize the expectation by setting the gradient to zero and solving, assuming $XX^T$ is invertible, $$\eqalign{ XX^TW &= XG^T \cr W &= (XX^T)^{-1}XG^T \cr }$$
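
The closed form can be tried numerically. The following is only a sketch under stated assumptions: the function `g` is a hypothetical nonlinear map chosen for illustration, the samples are standard normal, and `fit_w` is a helper name introduced here. It solves the normal equations $XX^TW = XG^T$ with `np.linalg.solve` rather than forming the inverse explicitly, then checks that the gradient vanishes and that the result agrees with NumPy's least-squares routine.

```python
import numpy as np

def fit_w(X, G):
    """Solve X X^T W = X G^T for W, where X is d x N and G is D x N."""
    return np.linalg.solve(X @ X.T, X @ G.T)

def g(x):
    # Hypothetical nonlinear map R^3 -> R^2, applied column-wise to a d x N array.
    return np.stack([np.sin(x[0]) + x[1], x[2] ** 2])

rng = np.random.default_rng(0)
d, N = 3, 10_000
X = rng.normal(size=(d, N))       # columns are samples x_i (assumed standard normal)
G = g(X)                          # columns are g(x_i), shape (2, N)

W = fit_w(X, G)                   # shape (d, D) = (3, 2)

# The gradient 2 (X X^T W - X G^T) should vanish at the minimizer.
grad = 2 * (X @ X.T @ W - X @ G.T)

# Cross-check against least squares on the transposed problem X^T W ~ G^T.
W_lstsq, *_ = np.linalg.lstsq(X.T, G.T, rcond=None)

print(np.allclose(grad, 0.0, atol=1e-6), np.allclose(W, W_lstsq))
```

With $\mathbf{X}_N = X^T$ this is the same expression as in the question. Writing $W = \left(\tfrac1N XX^T\right)^{-1}\left(\tfrac1N XG^T\right)$ and applying the law of large numbers (rather than the CLT) to each factor suggests the limit $\mathbb{E}[XX^T]^{-1}\,\mathbb{E}[X g(X)^T]$, provided the second moments are finite and $\mathbb{E}[XX^T]$ is invertible.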