I was reading *Pattern Recognition and Machine Learning* by Christopher Bishop, and in Section 4.1.3 (page 186), on the failure of least-squares classification, I stumbled on this phrase:

"The failure of least squares should not surprise us when we recall that it corresponds to maximum likelihood under the assumption of a Gaussian conditional distribution."

However, I cannot understand this. What does least squares have to do with a conditional distribution? Why are we talking about a conditional distribution at all, and how does it relate to a Gaussian? I would be grateful for any help.

Suppose the relationship between the feature vectors $\mathbf x_i$ and the target variables $y_i$ is modelled as
$$y_i = f(\mathbf x_i) + \epsilon_i,$$
where the function $f$ represents the "true model", and the noise terms $\epsilon_i \sim \mathcal N(0, \sigma^2)$ are independent Gaussian.
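This is where the conditional distribution comes in: the noise model says that, *given* $\mathbf x_i$, the target $y_i$ is Gaussian-distributed around $f(\mathbf x_i)$,
$$ p(y_i \mid \mathbf x_i) = \mathcal N\!\left(y_i \mid f(\mathbf x_i), \sigma^2\right) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left( -\frac{(y_i - f(\mathbf x_i))^2}{2\sigma^2} \right), $$
and by independence the likelihood of the whole dataset is the product $\prod_{i=1}^N p(y_i \mid \mathbf x_i)$. Taking the logarithm turns this product into a sum.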
Then the log likelihood for the dataset is $$ \log P(y_1, \dots, y_N | \mathbf x_1 , \dots, \mathbf x_N) = - \frac{1}{2\sigma^2} \sum_{i=1}^N (y_i - f(\mathbf x_i))^2 - \frac{N}{2} \log (2\pi \sigma^2).$$
Treating $\sigma^2$ as a constant, and ignoring the constant term, we see that this log-likelihood is, up to the negative factor $-\frac{1}{2\sigma^2}$, exactly the least-squares loss function,
$$ L(y_1, \dots, y_N | \mathbf x_1, \dots, \mathbf x_N) = \sum_{i=1}^N (y_i - f(\mathbf x_i))^2.$$
So maximising the log-likelihood (under the assumption that the noise is Gaussian) is equivalent to minimising the least-squares loss function.
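You can check this equivalence numerically. The sketch below (a toy example, not from the book) fits a linear model two ways: once by ordinary least squares, and once by gradient descent on the Gaussian negative log-likelihood. The two solutions coincide.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: y = 1 + 2x + Gaussian noise
X = np.column_stack([np.ones(50), rng.normal(size=50)])
w_true = np.array([1.0, 2.0])
y = X @ w_true + rng.normal(scale=0.5, size=50)

# 1) Least-squares solution: minimises sum_i (y_i - w.x_i)^2
w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

# 2) Gaussian maximum likelihood: minimise the negative log-likelihood
#    (1/2σ²) Σ_i (y_i - w.x_i)²  + const   by gradient descent
sigma2 = 0.25  # treated as a known constant, as in the derivation above
w_ml = np.zeros(2)
for _ in range(5000):
    grad = -(X.T @ (y - X @ w_ml)) / sigma2  # gradient of the NLL in w
    w_ml -= 1e-3 * grad

print(np.allclose(w_ls, w_ml, atol=1e-4))  # same minimiser
```

Note that $\sigma^2$ only rescales the objective, so its value does not change the location of the minimum, only the step size a fixed learning rate corresponds to.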
The point that Bishop is making here is that, for classification problems, this Gaussian noise model is not very sensible. For one thing, $y_i$ should always be $0$ or $1$ for classification! But the Gaussian noise model can give you fractional values for $y_i$, and even negative values or values greater than one!
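A small illustration of that failure (my own toy data, in the spirit of Bishop's Figure 4.4): treat binary labels as regression targets, fit by least squares, and look at the fitted values. Points far from the decision boundary drag the line so that some "predictions" fall outside $[0, 1]$, which makes no sense as class probabilities.

```python
import numpy as np

rng = np.random.default_rng(1)

# Binary data: class 0 near x=0, class 1 near x=4, plus two far-out
# class-1 points (the kind of "too correct" outliers Bishop discusses)
x = np.concatenate([rng.normal(0, 1, 20), rng.normal(4, 1, 20), [12.0, 14.0]])
y = np.concatenate([np.zeros(20), np.ones(22)])

# Least-squares fit of y on x, as if this were a regression problem
X = np.column_stack([np.ones_like(x), x])
w, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ w

# The fitted values are not confined to {0, 1}, nor even to [0, 1]
print(pred.max())  # greater than 1 at the outlying points
```

This is exactly why classification uses models like logistic regression, whose output is squashed into $(0, 1)$ and whose likelihood is Bernoulli rather than Gaussian.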