Consider a training set $\{(\vec{x}^{(n)}, \vec{t}^{(n)}) \in \mathbb{R}^D \times \mathbb{R}^K : n = 1, \dots, N \}$ where:
- $\vec{t}^{(n)}$ is a one-hot indicator of the class membership of $\vec{x}^{(n)}$, e.g. $\vec{t}^{(n)} = (0, 1, 0, \dots, 0)$ if $\vec{x}^{(n)}$ belongs to class $C_2$
- $X$ is an $N \times (D+1)$ matrix whose $n$th row is $(\vec{x}^{(n)})^T$ augmented with a leading $1$ for the bias term
- $T$ is an $N \times K$ matrix whose $n$th row is $(\vec{t}^{(n)})^T$
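For concreteness, here is a minimal NumPy sketch of how $X$ and $T$ might be assembled from raw inputs and integer class labels. The toy data, variable names, and the bias-column convention are my own assumptions for illustration, not part of the original setup:

```python
import numpy as np

# Toy data: N = 4 points in D = 2 dimensions, K = 3 classes (assumed for illustration).
x = np.array([[0.5, 1.2],
              [1.0, 0.3],
              [2.1, 1.8],
              [0.2, 2.5]])          # shape (N, D)
labels = np.array([0, 2, 1, 2])     # integer class index of each point

N, D = x.shape
K = 3

# X: N x (D+1) design matrix, row n is (1, x^(n)^T); the leading 1 absorbs the bias term.
X = np.hstack([np.ones((N, 1)), x])

# T: N x K matrix of one-hot targets, row n is t^(n)^T.
T = np.zeros((N, K))
T[np.arange(N), labels] = 1.0
```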
The least-squares problem is to minimise the objective function:
$$G(W) = \frac{1}{2}\sum_{n=1}^{N}\sum_{k=1}^{K}\left[y_k(\vec{x}^{(n)};\vec{w}_k) - t_k^{(n)}\right]^2$$
where $y_k(\vec{x}^{(n)};\vec{w}_k)$ is the $k$th linear discriminant function evaluated at $\vec{x}^{(n)}$.
I know that $G(W)$ is minimised by $W = (X^TX)^{-1}X^TT$.
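To make the matrix form concrete, here is a minimal NumPy sketch, continuing the toy $X$ and $T$ above and assuming the discriminants are linear in the augmented input and collected as the columns of $W$, so the stacked predictions are $XW$ (that rewriting is my assumption, not stated in the original):

```python
import numpy as np

def objective(W, X, T):
    """G(W) = 1/2 * sum over n, k of (y_k(x^(n)) - t_k^(n))^2, with predictions stacked as X @ W."""
    residual = X @ W - T              # N x K matrix of errors y_k(x^(n)) - t_k^(n)
    return 0.5 * np.sum(residual ** 2)

def closed_form(X, T):
    """Least-squares solution W = (X^T X)^{-1} X^T T, computed via a linear solve."""
    return np.linalg.solve(X.T @ X, X.T @ T)   # avoids forming the explicit inverse
```

Using `np.linalg.solve` (or `np.linalg.lstsq`) rather than explicitly inverting $X^TX$ is the usual numerically safer way to evaluate the closed-form expression.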
However, I don't understand why there is this $\frac{1}{2}$ coefficient in front of the double summation.
It's purely a notational convenience. When you differentiate the objective function, the exponent $2$ comes down from the square and cancels the $\frac{1}{2}$, leaving a cleaner gradient. It also has no effect on the solution: scaling the objective by any positive constant doesn't change where its minimum is, so when you set the derivative to $0$ you could just as well divide the constant out.
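To spell the cancellation out, write the double sum in matrix form as $G(W) = \frac{1}{2}\lVert XW - T\rVert_F^2$ (this rewriting assumes the $\vec{w}_k$ are stacked as the columns of $W$). Then

$$\nabla_W G = \frac{1}{2}\cdot 2\, X^T(XW - T) = X^T(XW - T),$$

and setting the gradient to zero gives the normal equations $X^TXW = X^TT$, i.e. $W = (X^TX)^{-1}X^TT$. Whether or not the $\frac{1}{2}$ is included, the stationarity condition, and hence the minimiser, is exactly the same.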