Compute matrix-vector multiplication without having the matrix.


So I am trying to think of a way to implement a matrix-vector multiplication. What I have is the global stiffness matrix of a 1D element with linear shape functions:

$\begin{bmatrix} \dfrac{1}{3} & \dfrac{1}{6} & 0 & 0 \\ \dfrac{1}{6} & \dfrac{2}{3} & \dfrac{1}{6} & 0 \\ 0 & \dfrac{1}{6} & \dfrac{2}{3} & \dfrac{1}{6} \\ 0 & 0 & \dfrac{1}{6} & \dfrac{1}{3} \\ \end{bmatrix}$

What I mean is, say I have a vector $\begin{bmatrix} x_1 & x_2 & x_3 & x_4 \end{bmatrix}^{T}$, and I don't want to allocate extra memory in my computational implementation. Hence I want to reuse the memory that has already been allocated to the original vector.

My instinct is that there should be a way to do this, similar to how computer scientists (which I am not!) have shown how to swap two variables a and b without using an extra variable:

a = a + b;
b = a - b;
a = a - b;

Does anyone have any thoughts?


BEST ANSWER

What about just writing a function that multiplies a vector by this matrix using a for loop? You can overwrite $x$ as the multiplication is performed, something like this:

a = x[1]                                     # save the original x[1]
x[1] = (1/3)x[1] + (1/6)x[2]                 # first row
for i = 2, 3, ..., n-1
   aNext = x[i]                              # save the original x[i] for the next step
   x[i] = (1/6)a + (2/3)x[i] + (1/6)x[i+1]
   a = aNext
end
x[n] = (1/6)a + (1/3)x[n]                    # last row

After the loop and the final assignment for the last row, the vector $x$ stores the result of the matrix-vector product.
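
For concreteness, here is a minimal runnable sketch of that loop in Python (the function name, the 0-based indexing, and the use of NumPy are my own choices; the coefficients are those of the matrix in the question, generalized to any length $n \ge 2$):

import numpy as np

def multiply_in_place(x):
    # Overwrites x with M x, where M is the tridiagonal matrix with 2/3 on the
    # main diagonal (1/3 in the two corners) and 1/6 on the off-diagonals.
    n = len(x)
    prev = x[0]                        # holds the original x[i-1]
    x[0] = x[0] / 3 + x[1] / 6         # first row
    for i in range(1, n - 1):
        nxt = x[i]                     # save the original x[i] for the next iteration
        x[i] = prev / 6 + 2 * x[i] / 3 + x[i + 1] / 6
        prev = nxt
    x[-1] = prev / 6 + x[-1] / 3       # last row
    return x

x = np.array([1.0, 2.0, 3.0, 4.0])
multiply_in_place(x)                   # x now holds the product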

You could also note that you are performing a convolution, which can be computed very efficiently using the Fast Fourier Transform (if you are careful to handle the boundary correctly). But since your convolution kernel is so small, this might not be faster than just using a for loop. And the FFT approach might not conserve memory as much as you want. Although, I just googled and learned that "in place" FFT algorithms have been developed which aim to conserve memory.
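
To make the convolution remark concrete (this sketch is my own addition): the interior rows of the product are a convolution of $x$ with the kernel $[1/6, 2/3, 1/6]$, and only the two boundary rows need separate treatment. The sketch below uses NumPy's direct convolution for clarity; an FFT-based routine such as scipy.signal.fftconvolve could be swapped in. Note that it allocates a new vector, so it does not meet the in-place requirement and is only meant to show the structure.

import numpy as np

def multiply_by_convolution(x):
    # Interior rows: convolution with [1/6, 2/3, 1/6] (zero-padded at both ends).
    y = np.convolve(x, [1 / 6, 2 / 3, 1 / 6], mode="same")
    # The corner entries of the matrix are 1/3 rather than 2/3, so the two
    # boundary rows differ from the zero-padded convolution and are patched here.
    y[0] = x[0] / 3 + x[1] / 6
    y[-1] = x[-2] / 6 + x[-1] / 3
    return y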

ANOTHER ANSWER

Assumed problem: Given the setting \begin{align} A &= \text{tridiag}(a, b, c) \in \mathbb{R}^{n\times n} \\ a &\in \mathbb{R}^n && \text{(main diagonal)} \\ b, c &\in \mathbb{R}^{n-1} && \text{(upper and lower off-diagonals)} \\ x &\in \mathbb{R}^n \end{align} you want to calculate $y = Ax$ in such a way that $y$ replaces $x$ in memory.

We have \begin{align} y_1 &= a_1 \, x_1 + b_1 \, x_2 \\ y_n &= c_{n-1} \, x_{n-1} + a_n x_n \\ y_i &= c_{i-1} \, x_{i-1} + a_i \, x_i + b_i \, x_{i+1} \quad (i \in \{ 2, \ldots, n-1 \}) \end{align}
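
As a point of reference (my own addition, not part of this answer), a direct translation of these formulas into Python with a separate output vector could look like this; the in-place variant further below can be checked against it:

import numpy as np

def tridiag_mult(a, b, c, x):
    # y = A x for A = tridiag(a, b, c); a is the main diagonal (length n),
    # b the upper off-diagonal, c the lower off-diagonal (both length n-1).
    n = len(x)
    y = np.empty(n)
    y[0] = a[0] * x[0] + b[0] * x[1]
    for i in range(1, n - 1):
        y[i] = c[i - 1] * x[i - 1] + a[i] * x[i] + b[i] * x[i + 1]
    y[-1] = c[-1] * x[-2] + a[-1] * x[-1]
    return y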

Without memory optimization we could try this:

  • have indices $(i_-, i, i_+)$, initialize with $(1,2,3)$
  • $y_1 = a_1 \, x_1$
  • $y_2 = c_1 \, x_1$
  • do loop
    • $y_{i_-} = y_{i_-} + b_{i_-} \, x_i$
    • $y_i = y_i + a_i \, x_i$
    • $y_{i_+} = c_i \, x_i$ (this is the first contribution to $y_{i_+}$)
    • $i_- = i$, $i = i_+$, $i_+ = i_+ + 1$
  • until $i_+ > n$
  • $y_{i_-} = y_{i_-} + b_{n-1} \, x_n$
  • $y_i = y_i + a_n \, x_n$

We walk through the matrix from left to right, using each $x_i$ to contribute to the accumulating sums for $y$.
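
In code, this column-wise sweep could read as follows (again my own sketch, with 0-based indexing; it still allocates a separate output vector $y$):

import numpy as np

def tridiag_mult_sweep(a, b, c, x):
    # Each x[i] contributes to at most three entries of y, after which it is
    # no longer needed -- this is what the in-place variant below exploits.
    n = len(x)
    y = np.empty(n)
    y[0] = a[0] * x[0]
    y[1] = c[0] * x[0]
    for i in range(1, n - 1):
        y[i - 1] += b[i - 1] * x[i]
        y[i] += a[i] * x[i]
        y[i + 1] = c[i] * x[i]         # first contribution to y[i+1]
    y[-2] += b[-1] * x[-1]
    y[-1] += a[-1] * x[-1]
    return y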

After each step we can abandon one $x_i$ and use it to store $y_i$:

  • have indices $(i_-, i)$, initialize with $(1,2)$
  • have a window of two accumulators $(y, y_+)$, initialized by
  • $y = a_1 \, x_1$
  • $y_+ = c_1 \, x_1$
  • do loop
    • $x_{i_-} = y + b_{i_-} \, x_i$
    • $y = y_+ + a_i \, x_i$
    • $y_+ = c_i \, x_i$
    • $i_- = i$, $i = i + 1$
  • until $i \ge n$
  • $x_{n-1} = y + b_{n-1} \, x_n$
  • $x_n = y_+ + a_n \, x_n$

I have not tested this, so there might be bugs. But I hope the idea is clear.
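
A direct translation of this in-place procedure into Python might look like the following sketch (my own, 0-based indexing; it can be compared with the tridiag_mult reference above):

import numpy as np

def tridiag_mult_in_place(a, b, c, x):
    # Overwrites x with A x for A = tridiag(a, b, c), keeping only a two-entry
    # window (y, y_plus) of partially accumulated results.
    n = len(x)
    y = a[0] * x[0]
    y_plus = c[0] * x[0]
    for i in range(1, n - 1):
        x[i - 1] = y + b[i - 1] * x[i]   # x[i-1] is no longer needed: store y_{i-1} there
        y = y_plus + a[i] * x[i]
        y_plus = c[i] * x[i]
    x[-2] = y + b[-1] * x[-1]
    x[-1] = y_plus + a[-1] * x[-1]
    return x

# Quick check against the reference, using the matrix from the question (n = 4):
a = np.array([1 / 3, 2 / 3, 2 / 3, 1 / 3])
b = c = np.full(3, 1 / 6)
x = np.array([1.0, 2.0, 3.0, 4.0])
print(tridiag_mult(a, b, c, x.copy()))
print(tridiag_mult_in_place(a, b, c, x))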