I have an MS in Statistics, and I did well in the MS-level course in statistical inference. I had no problem with that course because its proofs relied mostly on calculus, which I understand well. However, the PhD-level course in statistical inference scares me quite a bit because it relies heavily on linear algebra. Although I did very well in my undergraduate linear algebra courses, I am not fully comfortable with linear algebra operations. To be more specific, I get scared when everything is expressed in terms of matrices, and when I look at a matrix equation I do not know how to interpret it.
For example, my textbook says that in a generalized linear model,

and I can't see how the expression under the summation sign is equivalent to $X^TVX$. My question is:
Is there any trick I can use to convert an expression that does not involve matrices into one that does? I just cannot do this quickly. For instance, how can the term $x_ix_i^T$ inside the sum be obtained by computing $X^TX$? Shouldn't $X^TX$ instead put $x_i^Tx_i$ inside the sum? Thank you.

Remember: writing $Ax = b$ is a compact way of writing the corresponding system of equations, where each row represents one equation.
Let me expand on this. Consider the system
\begin{align*} 3x + 5y &= -10 \\ 2x - 3y &= -1. \end{align*}
This can be compactly written as
$$\begin{pmatrix} 3 & 5 \\ 2 & -3 \\ \end{pmatrix} \begin{pmatrix} x \\ y \\ \end{pmatrix} = \begin{pmatrix} -10 \\ -1 \\ \end{pmatrix}$$
as carrying out the matrix multiplication yields the scalar equations above. Again, each row of the coefficient matrix holds the coefficients of one equation in the system.
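
To see the row-by-row correspondence concretely, here is a minimal Python sketch. The helper names `matvec` and `solve_2x2` are made up for this illustration, and Cramer's rule is just a convenient way to get a solution for a $2 \times 2$ system, not something the argument depends on. The point is that each entry of the product is the dot product of one row of the coefficient matrix with the vector of unknowns, i.e. the left-hand side of one scalar equation.

```python
from fractions import Fraction

# Coefficient matrix and right-hand side of the system
#   3x + 5y = -10
#   2x - 3y = -1
A = [[3, 5],
     [2, -3]]
b = [-10, -1]

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v.

    Entry i of the result is the dot product of row i with v,
    which is exactly the left-hand side of equation i.
    """
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def solve_2x2(M, rhs):
    """Solve a 2x2 system by Cramer's rule (assumes a nonzero determinant)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    x = Fraction(rhs[0] * M[1][1] - M[0][1] * rhs[1], det)
    y = Fraction(M[0][0] * rhs[1] - rhs[0] * M[1][0], det)
    return [x, y]

solution = solve_2x2(A, b)     # exact rational values of x and y
product = matvec(A, solution)  # row i gives the left-hand side of equation i

print("x, y =", solution)
print("A @ (x, y) =", product)
```

Substituting the solution back in, `product` reproduces the right-hand side `b` entry by entry, one entry per equation.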