In “Practical Methods of Optimization” by R. Fletcher, I came across an interesting problem that uses a penalty method to solve the following constrained optimization problem:
\begin{align}
{\mathop{\text{minimize} }} \quad & \left \| Ay \right \|^2 \\
\text{s}\text{.t}\text{.} \quad\quad\quad\,\,&
y \ge 0 \, , \,\, e^Ty = 1.
\end{align}
The solution $y^*$ of this problem can be obtained by solving the problem
\begin{align}
{\mathop{\text{minimize} }} \quad & \left \| Ay \right \|^2 + \left ( e^Ty - 1 \right ) ^2 \\
\text{s}\text{.t}\text{.} \quad\quad\quad\,\,&
y \ge 0
\end{align}
giving a vector $y'$, say, and then normalizing to get $y^* = y'/(e^Ty')$.
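As a quick sanity check of this claim, here is a minimal sketch in plain Python on a $2 \times 2$ example. The matrix `A` is an arbitrary choice of mine, not taken from the book, and I am assuming that $e$ is the all-ones vector and that the norm is Euclidean. The helper names (`solve_penalty`, `solve_original_2d`) are hypothetical. The penalized problem is solved by projected gradient descent, which is just one convenient method and not necessarily what the book uses. The original problem is solved in closed form by substituting $y = (s, 1-s)$.

```python
def matvec(A, y):
    # Plain-Python matrix-vector product A @ y.
    return [sum(a * v for a, v in zip(row, y)) for row in A]

def penalty_grad(A, y):
    # Gradient of ||Ay||^2 + (e^T y - 1)^2, i.e. 2 A^T A y + 2 (e^T y - 1) e.
    r = matvec(A, y)
    g = [2.0 * sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(y))]
    s = 2.0 * (sum(y) - 1.0)
    return [gj + s for gj in g]

def solve_penalty(A, iters=5000):
    # Projected gradient on y >= 0; the projection is a componentwise clamp.
    n = len(A[0])
    frob2 = sum(a * a for row in A for a in row)
    step = 1.0 / (2.0 * (frob2 + n))  # 1/L using a crude bound on the Hessian norm
    y = [1.0 / n] * n
    for _ in range(iters):
        g = penalty_grad(A, y)
        y = [max(0.0, yi - step * gi) for yi, gi in zip(y, g)]
    return y

def solve_original_2d(A):
    # With y = (s, 1 - s), ||Ay||^2 is a quadratic in s; minimize it over [0, 1].
    a1 = [row[0] for row in A]
    a2 = [row[1] for row in A]
    d = [u - v for u, v in zip(a1, a2)]
    dd = sum(x * x for x in d)
    s = -sum(u * v for u, v in zip(a2, d)) / dd if dd > 0 else 0.5
    s = min(1.0, max(0.0, s))
    return [s, 1.0 - s]

A = [[2.0, 0.0], [0.0, 1.0]]  # arbitrary test matrix (my assumption)
y_pen = solve_penalty(A)                    # y' from the penalized problem
y_norm = [v / sum(y_pen) for v in y_pen]    # y' / (e^T y')
y_star = solve_original_2d(A)               # solution of the original problem
print(y_pen, y_norm, y_star)
```

For this particular `A`, solving the stationarity conditions by hand gives $y' = (1/9,\, 4/9)$, which normalizes to $(0.2,\, 0.8)$, the same point obtained from the original problem. So the claim holds on this example, but of course that does not explain why.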
My question is: why is this equivalence true? The normalization makes sense intuitively, but why should it yield the solution of the original problem? I tried writing out the KKT conditions for both problems, and it seems the equivalence can be derived from them, but only with a lot of manipulation. Is there a simpler way to justify it?