Not directly related to the title, but I recently went through LADR, and did not find it particularly intuitive. I've started to work through LADW. I've found it much more intuitive.
There is one thing, however, that I don't quite understand. Why is it that we need to think about the columns of $A^{T}$ when they are equivalent to the rows of $A$? What am I missing?
What are the motivations for thinking of the transpose of a matrix?
The transpose of a matrix is the matrix associated to the dual linear map. That is the main reason for the importance of the transpose in abstract linear algebra.
Let $f : V \to W$ be a linear map between two finite-dimensional vector spaces over a field $F$ and $f^* : W^* \to V^*$ be the dual map. For bases $v_1,\ldots,v_n$ and $w_1,\ldots,w_m$ of $V$ and $W$, let $A$ be the $m \times n$ matrix describing $f$ relative to those two bases. Then the matrix describing $f^*$ relative to the duals of those two bases is the transpose $A^\top$.
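To make this concrete, here is a small numeric sketch (the matrix $A$ and functional $\varphi$ below are arbitrary illustrative choices): in coordinates, a functional $\varphi \in W^*$ is recorded by its values on the basis $w_1,\ldots,w_m$, and the coordinate vector of $f^*(\varphi) = \varphi \circ f$ in the dual basis of $V^*$ is obtained by applying $A^\top$ to the coordinate vector of $\varphi$.

```python
# A small check that the dual map is computed by A^T, in plain Python.
# The matrix A and functional phi are arbitrary illustrative choices.

def transpose(A):
    return [list(col) for col in zip(*A)]

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# f : V -> W with dim V = 2, dim W = 3, given by a 3x2 matrix A.
A = [[1, 2],
     [0, 3],
     [4, 1]]

# phi in W*, recorded by its values (phi(w_1), phi(w_2), phi(w_3)).
phi = [2, -1, 1]

# Coordinates of f*(phi) in the dual basis of V*: multiply by A^T.
pullback = mat_vec(transpose(A), phi)

# Direct computation from the definition: (f*(phi))(v_j) = phi(f(v_j)),
# and f(v_j) has coordinates given by column j of A.
direct = [dot(phi, col) for col in transpose(A)]

print(pullback, direct)  # the two lists agree
```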
Working over the real numbers, there are a few different roles for the transpose of an $m \times n$ real matrix $A$.
As mentioned in a comment above, $A\mathbf v\cdot \mathbf w = \mathbf v \cdot A^\top\mathbf w$ for all $\mathbf v$ in $\mathbf R^n$ and $\mathbf w \in \mathbf R^m$. This is a crucial connection between transposes and dot products. (It is related to the abstract description above because the dot products on $\mathbf R^n$ and $\mathbf R^m$ allow you to think of the duals of these vector spaces as the original vector spaces in a "natural" way.)
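This identity is easy to verify numerically. Here is a quick pure-Python check, with a sample matrix and vectors chosen arbitrarily for illustration:

```python
# Check that (A v) . w == v . (A^T w) for sample data.
# The identity holds for every real A, v, w; these values are arbitrary.

def transpose(A):
    return [list(col) for col in zip(*A)]

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

A = [[1.0, 2.0, 0.0],
     [3.0, -1.0, 4.0]]   # a 2x3 matrix, so A : R^3 -> R^2
v = [1.0, -2.0, 5.0]     # v in R^3
w = [2.0, 3.0]           # w in R^2

lhs = dot(mat_vec(A, v), w)             # (A v) . w
rhs = dot(v, mat_vec(transpose(A), w))  # v . (A^T w)
print(lhs, rhs)  # the two values agree
```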
The rotations of $\mathbf R^n$ around the origin should be $n \times n$ real matrices $A$ that preserve the dot product: $A\mathbf v \cdot A\mathbf v' = \mathbf v \cdot \mathbf v'$ for all $\mathbf v$ and $\mathbf v'$ in $\mathbf R^n$. That's equivalent, by the previous property, to $\mathbf v \cdot A^\top A\mathbf v' = \mathbf v\cdot \mathbf v'$ for all $\mathbf v$ and $\mathbf v'$, which is equivalent to $A^\top A = I_n$. This condition defines the orthogonal group ${\rm O}_n(\mathbf R)$. (Such $A$ have determinant $\pm 1$. Those with determinant $1$ are closer to actual rotations in $2$ or $3$ dimensions; they are called proper rotations and form the special orthogonal group ${\rm SO}_n(\mathbf R)$. Those with determinant $-1$ are called improper rotations, and each is a product of a proper rotation and a reflection across a hyperplane through the origin.)
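As a sanity check, a plane rotation by any angle satisfies $A^\top A = I_2$ and has determinant $1$, so it is a proper rotation. A minimal pure-Python sketch (the angle below is arbitrary):

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

theta = math.pi / 5  # an arbitrary angle
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]  # rotation of R^2 by theta

P = mat_mul(transpose(R), R)  # should be the identity, up to rounding
det = R[0][0] * R[1][1] - R[0][1] * R[1][0]  # should be 1: a proper rotation

print(P, det)
```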
To compute the operator norm $||A||$ of an $m \times n$ real matrix $A$ (the least $b \geq 0$ such that $||A\mathbf v|| \leq b||\mathbf v||$ for all $\mathbf v$ in $\mathbf R^n$), first one shows that for every symmetric real matrix $S$ (meaning $S = S^\top$), $||S|| = \max |\lambda|$, where $\lambda$ runs over the eigenvalues of $S$, which are all real. Then, for an $m\times n$ real matrix $A$, both $A^\top A$ and $AA^\top$ are symmetric (of sizes $n\times n$ and $m\times m$), and it turns out that $||A|| = \sqrt{||A^\top A||} = \sqrt{||AA^\top||}$. See Theorems 3.4(vi) and 4.1 and exercise 1 of section 3 here.
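For a $2 \times 2$ matrix this recipe can be carried out by hand: the largest eigenvalue of the symmetric matrix $A^\top A$ gives $||A||^2$. A pure-Python sketch, where the matrix is an arbitrary illustrative choice and the eigenvalues come from the quadratic formula:

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

A = [[3.0, 0.0],
     [4.0, 5.0]]  # an arbitrary sample matrix

S = mat_mul(transpose(A), A)  # S = A^T A is symmetric
a, b, c = S[0][0], S[0][1], S[1][1]
# Largest eigenvalue of a symmetric 2x2 [[a, b], [b, c]].
lam_max = ((a + c) + math.sqrt((a - c) ** 2 + 4 * b * b)) / 2
op_norm = math.sqrt(lam_max)  # ||A|| = sqrt(||A^T A||)

# Spot check: ||A v|| <= ||A|| * ||v|| for some sample vectors,
# with equality along the top eigenvector of A^T A (here (1, 1)).
for v in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
    assert norm(mat_vec(A, v)) <= op_norm * norm(v) + 1e-12
print(op_norm)  # equals 3 * sqrt(5) for this A
```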