What is the relationship between the number of inputs and the number of outputs in regression?
Particularly, what if $|y| > |x|$? What if $|y|<|x|$?
Why not always $|y|=|x|$?
Here $|\cdot|$ denotes cardinality.
Basically, if $n\ge p$, i.e., the number of observations is at least as large as the number of $\beta$s, then $|x| = |y|$. If you have seen otherwise, it is probably due to incomplete cases: if at least one value of an observation is missing, that observation is automatically dropped from the regression. The number of observations used still equals the number of fitted values; the incomplete observations simply were never included in the computation. Even if $p > n$, you can still use a generalized inverse to get $n$ fitted values; however, unlike in the case $n\ge p$, the coefficient estimates are then not unique and depend on which generalized inverse you use.
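As a rough numerical check of the $p > n$ case, here is a minimal sketch using NumPy. The sizes, the random data, and the variable names are illustrative assumptions, not part of the question. It uses the Moore-Penrose pseudoinverse as one particular generalized inverse, builds a second solution by adding a null-space vector, and compares the two.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 3, 5                       # assumed sizes: fewer observations than coefficients
X = rng.normal(size=(n, p))       # made-up design matrix
y = rng.normal(size=n)            # made-up responses

# Minimum-norm least-squares solution via the Moore-Penrose pseudoinverse.
beta_pinv = np.linalg.pinv(X) @ y

# A second solution: add a vector from the null space of X.
# With full_matrices=True (the default), the last p - rank(X) rows of Vt span that null space.
_, _, Vt = np.linalg.svd(X)
beta_other = beta_pinv + 2.0 * Vt[-1]

fitted_pinv = X @ beta_pinv
fitted_other = X @ beta_other

print(fitted_pinv.shape)                        # one fitted value per observation
print(np.allclose(beta_pinv, beta_other))       # do the coefficient vectors agree?
print(np.allclose(fitted_pinv, fitted_other))   # do the fitted values agree?
```

In this least-squares setting the two coefficient vectors differ, while both produce $n$ fitted values that coincide, because the added vector is annihilated by $X$.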
It helps to think of straight-line regression first. It takes two parameters to define a straight line, say the slope and the intercept. If you have only one point, there are infinitely many lines that go through it. If you have two points, so $|y|=|x|$, there is exactly one line that passes through both. If you have more than two points, there is likely no line that passes through all of them, but you can still determine the best-fitting line. The added points make the fit less sensitive to errors in your input data.
The situation is the same with more parameters to fit. If you have fewer points than parameters, you expect an infinite number of solutions. If you have the same number of points as parameters, you expect (though this can fail if things are dependent) one solution that passes exactly through all the data. If you have more points than parameters, you can do a least-squares fit to get the best fit to your data, but it will probably not pass exactly through any data point.
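The three regimes can be sketched for the straight-line case with NumPy's `np.linalg.lstsq`. The helper `fit_line` and the data points are hypothetical, chosen only for illustration. The reported rank shows whether the two parameters are determined, and the largest residual shows whether the line passes through the points.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns the coefficients, the rank of the design matrix,
    and the largest absolute residual.
    """
    A = np.column_stack([np.ones_like(x), x])    # two parameters: intercept, slope
    coef, _, rank, _ = np.linalg.lstsq(A, y, rcond=None)
    max_resid = float(np.max(np.abs(A @ coef - y)))
    return coef, rank, max_resid

# One point: fewer points than parameters. The rank is 1, so the line is not
# determined; lstsq returns just one of the infinitely many exact fits.
coef1, rank1, resid1 = fit_line(np.array([1.0]), np.array([2.0]))

# Two points: as many points as parameters. One line passes through both.
coef2, rank2, resid2 = fit_line(np.array([1.0, 2.0]), np.array([2.0, 4.0]))

# Three non-collinear points: more points than parameters. The best-fitting
# line leaves nonzero residuals.
coef3, rank3, resid3 = fit_line(np.array([1.0, 2.0, 3.0]), np.array([2.0, 4.0, 5.0]))

for label, rank, resid in [("1 point", rank1, resid1),
                           ("2 points", rank2, resid2),
                           ("3 points", rank3, resid3)]:
    print(f"{label}: rank={rank}, max |residual|={resid:.4f}")
```

With these particular points, the three-point fit happens to miss every data point, which matches the remark that a least-squares line will probably not pass exactly through any of them.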