Bishop's book [1] describes a least-squares approach to classification with a linear model
$$y_k(x)=w_k^Tx + w_{k0}$$
and a sum-of-squares error function.
Then it mentions an interesting fact:
> An interesting property of least-squares solutions with multiple target variables is that if every target vector in the training set satisfies some linear constraint $a^T t_n + b = 0$ for some constants $a$ and $b$, then the model prediction for any value of $x$ will satisfy the same constraint, so that $a^Ty(x) + b = 0$.
I can't understand how a linear constraint on the targets can influence the predictions in this way. Is there an intuitive way to see this, or a nice proof?
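To convince myself the claim holds at all, I wrote a small numerical check (a sketch, not from the book: the variable names, dimensions, and the projection trick for generating constrained targets are my own). It fits the multi-output least-squares model with a bias term and verifies that predictions at new inputs satisfy the same affine constraint as the training targets:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: N inputs in R^d, K-dimensional targets constrained by a^T t_n + b = 0.
N, d, K = 50, 3, 4
X = rng.normal(size=(N, d))
a = rng.normal(size=K)
b = 2.0

# Draw random targets, then project each onto the affine set {t : a^T t + b = 0}.
T = rng.normal(size=(N, K))
T = T - ((T @ a + b) / (a @ a))[:, None] * a  # now a^T t_n + b = 0 for every n

# Least-squares fit with bias: y(x) = W^T x_tilde, where x_tilde = [1, x].
Xt = np.hstack([np.ones((N, 1)), X])
W, *_ = np.linalg.lstsq(Xt, T, rcond=None)

# Predictions at fresh inputs should satisfy the same constraint.
Xnew = rng.normal(size=(10, d))
Y = np.hstack([np.ones((10, 1)), Xnew]) @ W
print(np.max(np.abs(Y @ a + b)))  # tiny, i.e. zero up to floating-point error
```

The residual is at floating-point noise level, so the property really does hold exactly, which is what makes me think there should be a clean algebraic argument behind it.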
Thanks!
[1] C. M. Bishop, *Pattern Recognition and Machine Learning*, Springer, 2006.