An example in my book states:
A linear transformation $T$ has equations $x'=3x-y$ , $y'=x+y$ . Under $T$, find the image of the line with equation $y=2x-1$.
The solution then starts
We need to write the equation of the image line in terms of $x'$ and $y'$.
They then set the $x'$, $y'$ vector equal to the transformation matrix multiplied by the $x, y$ vector and solve for $x$ and $y$. They plug these expressions into the original equation of the line and, once it's in standard form, switch $x'$ to $x$ and $y'$ to $y$.
Why are they doing this? I understand how to do the whole thing, but I don't understand why. Can we not just plug in the $x'$ into the $x$ and the $y'$ into the $y$? It seems to me like they are actually finding the inverse if they do as they did... What am I missing?
They then use these $x$ and $y$ to find the image of equations of circles, parabolas, etc.
You can’t simply “plug in the $x'$ into the $x$ and the $y'$ into the $y$.” You need to end up with an equation in the new (transformed) variables $x'$ and $y'$, but replacing $x$ with $3x-y$ doesn’t get you any closer to having that: the new variables still don’t appear anywhere in the expression. Moreover, making these substitutions doesn’t make algebraic sense: you can legitimately plug in values of $x$ and $y$ for these variables, but $x'$ and $y'$ aren’t values of these variables. Instead, you need to invert the transformation so that you have expressions for $x$ and $y$ in terms of $x'$ and $y'$. You can then legitimately substitute those expressions for the variables $x$ and $y$ in the original equation.
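For the example in the question, the inversion and substitution might go as follows. This is a worked sketch; the intermediate steps are mine and need not match the book's layout. Adding the two equations eliminates $y$, and the second equation then gives $y$:

$$\begin{aligned}
x' + y' &= 4x &&\Longrightarrow\quad x = \tfrac14\,(x' + y'),\\
y &= y' - x &&\Longrightarrow\quad y = \tfrac14\,(3y' - x').
\end{aligned}$$

Substituting these into $y = 2x - 1$:

$$\tfrac14\,(3y' - x') = \tfrac12\,(x' + y') - 1 \quad\Longrightarrow\quad 3y' - x' = 2x' + 2y' - 4 \quad\Longrightarrow\quad y' = 3x' - 4.$$

As a check, the point $(0,-1)$ lies on the original line and $T$ sends it to $(3\cdot 0 - (-1),\ 0 + (-1)) = (1,-1)$, which does satisfy $y' = 3x' - 4$. Dropping the primes at the very end is only a relabeling, so that the image line is written in the usual variable names.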
This is really no different from what you learned to do in order to translate and/or scale the graph of a function of one variable. E.g., shifting the graph right by $2$ corresponds to adding $2$ to all of the $x$-coordinates of points on the graph, that is, it is accomplished by the transformation $x' = x +2$. However, to obtain the equation of the shifted graph, you had to replace $x$ by $x-2$ (really by $x'-2$), which is the inverse of the transformation. The reasoning is the same: $x$ is replaced by an expression that equals $x$, not by an expression that equals $x'$. Similarly, to scale the graph by a factor of $2$, you had to divide $x$ by $2$, again using the inverse transformation.
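To make the one-variable case concrete, take the parabola $y = x^2$ (a function chosen purely for illustration, not taken from the book) and shift it right by $2$:

$$x' = x + 2,\quad y' = y \quad\Longrightarrow\quad x = x' - 2,\quad y = y' \quad\Longrightarrow\quad y' = (x' - 2)^2.$$

The vertex $(0,0)$ moves to $(2,0)$, which satisfies the new equation. The naive substitution would instead give $y = (x+2)^2$, whose vertex is at $(-2,0)$: a shift in the wrong direction.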