Unless I'm mistaken, matrices follow a distributive law, provided that the dimensions line up, so $X(Y+Z) = XY + XZ$. I'm struggling somewhat to prove this fact, though. Is it enough to say that because matrices represent linear transformations, we're simply applying the linearity of $X$? Or is a more rigorous derivation, likely requiring the summation representation of a matrix, required?
A hint on this would be very helpful. I'm mainly interested in the 'how' here rather than the proof itself, as it's very possible I'm misunderstanding the definition of linearity.
The fact that matrices represent linear functions means that for a matrix $A$ and vectors $u, v$, we have $A(u + v) = Au + Av$. That is a statement about a matrix acting on vectors; by itself it doesn't tell us anything about what happens when we multiply matrices together.
So yes, to prove distributivity you will need something more. If you consider the summation representation, though, the proof is quite straightforward.
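
As a rough sketch of how the summation argument might go: assume $X$ is $m \times n$ and $Y$, $Z$ are both $n \times p$ (these dimension names and the index letters $i, j, k$ are my own labels, not from the question), and use the entrywise definition of the product, $(AB)_{ij} = \sum_k A_{ik} B_{kj}$. Then for each $i, j$:

$$
\begin{aligned}
\big(X(Y+Z)\big)_{ij} &= \sum_{k=1}^{n} X_{ik}\,(Y+Z)_{kj} \\
&= \sum_{k=1}^{n} X_{ik}\,(Y_{kj} + Z_{kj}) \\
&= \sum_{k=1}^{n} X_{ik} Y_{kj} + \sum_{k=1}^{n} X_{ik} Z_{kj} \\
&= (XY)_{ij} + (XZ)_{ij} = (XY + XZ)_{ij}.
\end{aligned}
$$

The only ingredients are the entrywise definitions of matrix addition and multiplication, plus the distributive law for the scalar entries. Since the $(i,j)$ entries agree for every $i$ and $j$, the two matrices are equal.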