Optimizing a matrix multiplication


Given the following operation: $$ \left(\begin{array}{c} f(1) \\ f(2) \\ \vdots \\ f(n)\end{array}\right) \begin{pmatrix} 1^{-1} & 1^0 & \cdots & 1^{n-2}\\ 2^{-1} & 2^0 & \cdots & 2^{n-2}\\ \vdots & \vdots & & \vdots \\ n^{-1} & n^0 & \cdots & n^{n-2}\end{pmatrix}$$

So an $n\times 1$ vector multiplied by an $n\times n$ matrix (read the vector as a row vector, so the shapes are compatible). As an example, the $n=4$ matrix would be

$$\begin{pmatrix} 1 & 1 & 1 & 1\\ 1/2 & 1 & 2 & 4 \\ 1/3 & 1 & 3 & 9 \\ 1/4 & 1 & 4 & 16\end{pmatrix}$$

I'm looking for anything I can take advantage of to reduce the complexity of this operation from

$$O(n^2)$$

down to something better. The Fast Fourier Transform, for example, exploits the cyclic property of the complex exponential to reduce a Discrete Fourier Transform (DFT) from $O(n^2)$ to $O(n\log n)$. That doesn't apply here, since I am not using a sinusoid, but I'm wondering if there is anything that can achieve the same effect for this matrix.
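For concreteness, a minimal NumPy sketch of the $O(n^2)$ baseline being asked about (the function name and the row-vector reading of the product are my assumptions, not part of the question):

```python
import numpy as np

# A minimal sketch of the O(n^2) baseline. Row k of the matrix holds
# k^{-1}, k^0, ..., k^{n-2}; the n x 1 vector on the left is read as
# a row vector, so we compute f^T M.
def naive_product(f):
    n = len(f)
    k = np.arange(1, n + 1, dtype=float)[:, None]
    M = k ** np.arange(-1, n - 1)   # M[k-1, j] = k**(j-1)
    return f @ M                    # O(n^2) multiply-adds

f = np.array([1.0, 2.0, 3.0, 4.0])
print(naive_product(f))
```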

There are 2 answers below.


Your matrix looks like a Vandermonde matrix (https://en.wikipedia.org/wiki/Vandermonde_matrix). It seems that vector-matrix multiplication can be sped up for this type of matrix, according to this article: http://www.netlib.org/utk/people/JackDongarra/etemplates/node384.html

To be more specific, if you have a Vandermonde matrix $$ \bf V = \left[\begin{array}{cccc} 1&a_1&\dots&a_1^n\\1&a_2& \dots&a_2^n\\\vdots&\vdots& \ddots&\vdots\\1&a_k& \dots&a_k^n\end{array}\right] $$ and a vector ${\bf b} =\left[b_0 \ \dots \ b_n\right]^T$, then the multiplication $\bf Vb$ is the same as evaluating the polynomial $ f(a) = \sum_i b_i a^i$ at the $k$ points $a_1,\dots,a_k$. With fast multipoint evaluation this can be done in $O(k\log^2 k)$ operations (see https://mathoverflow.net/questions/69476/fast-evaluation-of-polynomials). In your case it would be $O(n\log^2 n)$, since your matrix is square.
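A small sanity check of this equivalence (the node and coefficient values below are arbitrary, chosen only for illustration):

```python
import numpy as np

# V b equals evaluating f(a) = sum_i b_i a^i at the nodes a_k.
a = np.array([1.0, 2.0, 3.0, 4.0])           # nodes a_1..a_k
b = np.array([2.0, -1.0, 0.5, 3.0])          # coefficients b_0..b_3
V = np.vander(a, N=len(b), increasing=True)  # V[k, i] = a_k ** i
direct = V @ b                               # the matrix-vector route
horner = np.polyval(b[::-1], a)              # polyval wants highest degree first
assert np.allclose(direct, horner)
print(direct)
```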

I'll try to think whether there is a way to also use the fact that in your case $a_{k+1} = a_k + 1$, to further reduce the computation.
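One known consequence of consecutive nodes, sketched here as a hypothetical helper (my addition, not part of the answer): once the forward differences of $f$ are tabulated, each successive value $f(a+1)$ costs only $\deg f$ additions — the classic difference-engine trick.

```python
# Evaluate p(x) = sum_i b_i x^i at the consecutive integers 1..m using
# forward differences: after an O(deg^2) setup, each further value
# costs only deg additions, with no multiplications at all.
def poly_at_consecutive_ints(b, m):
    deg = len(b) - 1

    def p(x):
        return sum(c * x**i for i, c in enumerate(b))

    # initial values p(1), ..., p(deg + 1)
    diffs = [p(1 + k) for k in range(deg + 1)]
    # convert in place to the forward-difference column at x = 1
    for level in range(1, deg + 1):
        for k in range(deg, level - 1, -1):
            diffs[k] -= diffs[k - 1]

    out = []
    for _ in range(m):
        out.append(diffs[0])
        # advance every difference by one step: D^k p(x+1) = D^k p(x) + D^(k+1) p(x)
        for k in range(deg):
            diffs[k] += diffs[k + 1]
    return out
```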


To be more specific and systematic, we can define

$${\bf D}=\left[\begin{array}{cccc}1&0&0&0\\0&2&0&0\\0&0&3&0\\0&0&0&4\end{array}\right], {\bf L}=\left[\begin{array}{cccc}1&0&0&0\\1&0&0&0\\1&0&0&0\\1&0&0&0\end{array}\right] , {\bf C}=\left[\begin{array}{cccc} 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\\ 1&0&0&0 \end{array}\right]$$

  • $\bf C$ is the representation of a generating element of a cyclic group,

  • $\bf L$ acts as a selection matrix,

  • $\bf D$ performs each of the power increments.

We can then systematically construct the Vandermonde matrix like this:

$${\bf V}^T={\bf D(L+D(L+D(L+DLC)C)C)}$$

Evaluating this gives us $${\bf V}^T = \left[\begin{array}{cccc} 1&1&1&1\\2&4&8&16\\3&9&27&81\\4&16&64&256\end{array}\right]$$

Please note that $\bf C$ and $\bf L$ only shift and select indices; only the additions and the multiplications by $\bf D$ perform any actual arithmetic, and those multiplications are sparse. The transpose doesn't complicate things much, since all the matrices involved are so simple.
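The construction can be checked numerically; here is a sketch in NumPy (my own code, not from the answer) that builds $\bf D$, $\bf L$, $\bf C$ and accumulates the nested product for $n = 4$, applying the innermost factor $\bf DLC$ first, then $\bf D(L + \cdot)C$ layers, then a final $\bf D(L + \cdot)$:

```python
import numpy as np

# Verify the D, L, C construction of V^T for n = 4.
n = 4
D = np.diag(np.arange(1.0, n + 1))   # power increments
L = np.zeros((n, n)); L[:, 0] = 1.0  # selection: a column of ones in column 1
C = np.roll(np.eye(n), 1, axis=1)    # cyclic shift of the columns

M = D @ L @ C                        # innermost factor DLC
for _ in range(n - 2):               # the nested layers D(L + .)C
    M = D @ ((L + M) @ C)
VT = D @ (L + M)                     # outermost D(L + .)

# entry (i, j) of V^T should be i**j (1-indexed)
i, j = np.indices((n, n)) + 1
assert np.allclose(VT, i ** j)
print(VT)
```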