Lagrange Multiplier Confusion (Bishop, PRML)


I am trying to solve Exercise 4.4 of the book Pattern Recognition and Machine Learning by Bishop. Implicitly, we need to show that if we
maximize $$w^{T}(m_{2}-m_{1}) \tag{1}\label{1}$$ subject to $$w^{T}w = 1 \tag{2}\label{2}$$ then the optimal value of $w$ is of the form $$w \propto (m_{2}-m_{1})$$
My approach:
I started with the Lagrangian, which is:

\begin{align} L=w^{T}(m_{2}-m_{1}) + \lambda(w^{T}w-1) \tag{3}\label{3} \end{align}

where $\lambda$ is the Lagrange multiplier.
From this, I get:
\begin{align} \nabla_{w}L = (m_{2}-m_{1}) + 2\lambda w = 0 \\ \implies w = - \frac{m_{2}-m_{1}}{2 \lambda} \tag{4}\label{4} \end{align}
This is where the official solution concludes, with the author writing, "it follows that $w \propto (m_{2}-m_{1})$". If, however, I try to see what $\lambda$ looks like, I substitute this value of $w$ into the constraint, Eq. (2), to get:
\begin{align} \frac{(m_{2}-m_{1})^{T} (m_{2}-m_{1})}{4 \lambda^{2}} = 1 \\ \implies \lambda = \frac{|m_{2}-m_{1}|}{2} \tag{5}\label{5} \end{align}
Now if I substitute this $\lambda$ into Eq. (4), I get
\begin{align} w = -\frac{m_{2}-m_{1}}{|m_{2}-m_{1}|} \\ \implies \frac{w}{m_{2}-m_{1}} = -\frac{1}{|m_{2}-m_{1}|} \end{align}

which does not establish direct proportionality between $w$ and $m_{2}-m_{1}$, as $-\frac{1}{|m_{2}-m_{1}|}$ is not constant.

Edit 2: The reason why $|m_{2}-m_{1}|$ is not constant is that $m_{1}$ (and, for that matter, $m_{2}$) would change as new points are classified into the classes $C_{1}$ or $C_{2}$.
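To sanity-check the algebra numerically, here is a minimal sketch in Python with NumPy. The class means `m1` and `m2` are made-up values chosen only for illustration, and the helper names `stationary_w` and `cosine` are hypothetical, not from the book. Eq. (5) as written takes the positive root of $\lambda^{2}$; the sketch evaluates the negative root as well, purely for comparison.

```python
import numpy as np


def stationary_w(m1, m2, sign=1.0):
    """Return (w, lam) from Eq. (4), using lam = sign * |m2 - m1| / 2."""
    d = np.asarray(m2, dtype=float) - np.asarray(m1, dtype=float)
    lam = sign * np.linalg.norm(d) / 2.0
    w = -d / (2.0 * lam)
    return w, lam


def cosine(a, b):
    """Cosine of the angle between vectors a and b."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical class means (assumption: any two distinct vectors would do).
m1 = np.array([1.0, 2.0])
m2 = np.array([4.0, 6.0])
d = m2 - m1

for sign in (+1.0, -1.0):
    w, lam = stationary_w(m1, m2, sign)
    grad = d + 2.0 * lam * w  # gradient of the Lagrangian in Eq. (3)
    print(
        f"lambda={lam:+.2f}  w={w}  |w|^2={w @ w:.3f}  "
        f"objective={w @ d:+.3f}  cos(w, d)={cosine(w, d):+.3f}  "
        f"|grad|={np.linalg.norm(grad):.1e}"
    )
```

For these values, both candidates satisfy the constraint in Eq. (2) and make the gradient vanish; they differ in the sign of the objective in Eq. (1) and in whether $w$ points along or against $m_{2}-m_{1}$.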

So where is my interpretation or mathematics going wrong?