I'm reading a paper about systolic arrays. The author gives this formula for convolution, and I cannot map it to the formula I have in mind.

My interpretation is as follows: I have to pad the shorter sequence with zeros to match the length of the other, so that each $y_i$ is a sum of $k$ terms, $y_i = \sum_{j = 1}^{k} w_j \cdot x_{i + j - 1}$. But when I run this algorithm and compare the results to a simple graphical convolution sum, the results do not match. For example, the formula implies that every output $y_i$ is a sum of $k$ terms, whereas I would expect the very first and the very last non-zero output samples to consist of only one term each: a single product of one sample from each of the two sequences.
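To make the mismatch concrete, here is a minimal sketch of the comparison. It assumes the formula's 1-based indices map to 0-based Python lists, and that $i$ only runs over positions where all $k$ terms exist (so no padding is applied). The function names and the sample sequences are my own choices for illustration, not taken from the paper.

```python
def paper_formula(w, x):
    """y_i = sum_{j=1..k} w_j * x_{i+j-1}, written with 0-based indices.

    Assumption: i only covers positions where the whole window of
    k samples fits inside x, so there are len(x) - k + 1 outputs.
    """
    k, n = len(w), len(x)
    return [sum(w[j] * x[i + j] for j in range(k)) for i in range(n - k + 1)]


def graphical_convolution(w, x):
    """Textbook convolution sum: every product x[i] * w[j] lands in y[i + j]."""
    k, n = len(w), len(x)
    y = [0] * (n + k - 1)
    for i in range(n):
        for j in range(k):
            y[i + j] += x[i] * w[j]
    return y


w = [1, 2, 3]
x = [1, 2, 3, 4, 5]

print(paper_formula(w, x))          # [14, 20, 26]
print(graphical_convolution(w, x))  # [1, 4, 10, 16, 22, 22, 15]
```

With these inputs, the graphical convolution has 7 outputs whose first and last samples are single products ($1 \cdot 1$ and $5 \cdot 3$), while the formula as I read it gives only 3 outputs, each a sum of $k = 3$ terms, and none of them match.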
Can someone please clarify for me what I'm missing?