When dealing with linear regression, we are concerned about how far away a given point's $y$ component is from the "best fitting line".
My question: why do we choose the $y$ component instead of the $x$ one, or, better still, the length of the perpendicular dropped from a given point to this "best fitting line" (which makes the most sense out of all three options)?
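We're generally trying to model a situation where $x$ is our input data and $y$ is some measurement that depends on it. So $x$ is assumed to be known exactly, and all of the error is assumed to be in the measurement of $y$. Minimizing the vertical distances is the choice that matches this assumption. If $x$ and $y$ were both measured with comparable error, the perpendicular distance would indeed be the more natural thing to minimize.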
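To make the three options in the question concrete, here is a minimal sketch in plain Python that fits a line by minimizing vertical, horizontal, and perpendicular squared distances. The function names and the sample data are made up for illustration, and the closed-form slopes assume the centered cross-sum `sxy` is nonzero (the data are not perfectly uncorrelated).

```python
import math

def centered_sums(xs, ys):
    """Return the means and the centered sums of squares and cross-products."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return mx, my, sxx, syy, sxy

def fit_vertical(xs, ys):
    """Ordinary least squares: minimize squared vertical (y) distances."""
    mx, my, sxx, syy, sxy = centered_sums(xs, ys)
    slope = sxy / sxx
    return slope, my - slope * mx

def fit_horizontal(xs, ys):
    """Regress x on y (minimize squared horizontal distances),
    then express the result as a line y = slope * x + intercept."""
    mx, my, sxx, syy, sxy = centered_sums(xs, ys)
    slope = syy / sxy  # assumes sxy != 0
    return slope, my - slope * mx

def fit_perpendicular(xs, ys):
    """Orthogonal regression: minimize squared perpendicular distances."""
    mx, my, sxx, syy, sxy = centered_sums(xs, ys)
    d = syy - sxx
    slope = (d + math.sqrt(d * d + 4 * sxy * sxy)) / (2 * sxy)  # assumes sxy != 0
    return slope, my - slope * mx

# Hypothetical noisy data scattered around y = x.
xs = [0, 1, 2, 3, 4]
ys = [0.1, 0.9, 2.2, 2.8, 4.1]
for name, fit in [("vertical", fit_vertical),
                  ("horizontal", fit_horizontal),
                  ("perpendicular", fit_perpendicular)]:
    slope, intercept = fit(xs, ys)
    print(f"{name:13s} slope={slope:.4f} intercept={intercept:.4f}")
```

All three lines pass through the centroid of the data and coincide when the points are exactly collinear. With noise they differ, and for positively correlated data the perpendicular slope lies between the other two. Which one is appropriate depends on where the error is assumed to be, not on which distance looks most geometric.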