In the penalty method, I would like to prove (*).
Let $x^*$ be an optimal solution of the problem below:
$\min f(x) $
subject to
$g_i(x)\le 0 \quad \forall i=1,\dots,p$
$h_j(x)=0 \quad \forall j=1,\dots,q$
and let $x^*_\mu$ be an optimal solution of the penalized problem $f_{a,\mu}(x)=f(x)+\mu\, p(x)$.
\begin{align} (*) \qquad \text{If }\mu_1 < \mu_2 ~\text{ then }~ f_{a,\mu_1}(x^*_{\mu_1}) \le f_{a,\mu_2}(x^*_{\mu_2}). \end{align} We have the statement $$ \forall \mu\ge0: \quad f_{a,\mu}(x^*_{\mu}) \le f(x^*). $$
From the statement above I then write
$$f_{a,\mu_1}(x^*_{\mu_1}) \le f(x^*) \le f(x^*_{\mu_2}),$$
because $x^*$ is optimal for the original problem.
A penalty function is non-negative and satisfies $p(x)=0$ for all $x$ in the admissible set, i.e., whenever $g(x)\le 0$ and $h(x)=0$. Then either the optimal solution $x_\mu^*$ of $f_{a,\mu}=f+\mu p$ lies inside the admissible set, in which case $x_\mu^*=x^*$, or it lies outside, and then by the minimality of $x_\mu^*$ $$ f(x^*)=f_{a,\mu}(x^*)\ge f_{a,\mu}(x_\mu^*). $$

If $\mu_1<\mu_2$, then first by the minimality of $x_{\mu_1}^*$ and then by the non-negativity of $p$, $$ f_{a,\mu_1}(x_{\mu_1}^*)\le f_{a,\mu_1}(x_{\mu_2}^*)=f_{a,\mu_2}(x_{\mu_2}^*)-(\mu_2-\mu_1)p(x_{\mu_2}^*)\le f_{a,\mu_2}(x_{\mu_2}^*), $$ which proves (*).

However, the last inequality $$f(x^*)\le f(x_{\mu_2}^*)$$ is only guaranteed to hold if $x_{\mu_2}^*$ is an admissible point, which more often than not it will not be. If everything is convex, the minimizer $x^*$ will lie on the boundary of the admissible set, while $x_{\mu_2}^*$ will lie outside it and usually have smaller $f$ values.
Example: $f(x)=-x$, $g(x)=x\le 0$, $p(x)=x_+^2=\max(0,x)^2$. Then for $x>0$ $$f_{a,\mu}(x)=-x+\mu x^2=\mu\left(x-\frac1{2\mu}\right)^2-\frac1{4\mu},$$ so that $f_{a,\mu}$ has its global minimum at $x_\mu^*=\frac1{2\mu}$, outside the admissible set, with value $f_{a,\mu}(x_\mu^*)=-\frac1{4\mu}<f(x^*)=0$. And also $$ f(x_\mu^*)=-\frac1{2\mu}<0=f(x^*). $$
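The behavior in this example can be checked numerically. The sketch below (a minimal illustration, not part of the original answer; the function names `f_a` and `x_star` are my own) evaluates the penalized objective at the closed-form minimizers $x_\mu^*=\frac1{2\mu}$ derived above and verifies that the penalized optimal values increase with $\mu$, stay below $f(x^*)=0$, and that the minimizers are all inadmissible:

```python
# Numerical check of the example: f(x) = -x, constraint g(x) = x <= 0,
# quadratic penalty p(x) = max(0, x)^2, penalized objective
# f_a(x; mu) = -x + mu * max(0, x)^2.
# Closed form derived in the text: minimizer x_mu* = 1/(2*mu),
# minimum value f_a(x_mu*; mu) = -1/(4*mu).

def f_a(x, mu):
    return -x + mu * max(0.0, x) ** 2

def x_star(mu):
    return 1.0 / (2.0 * mu)

mus = [0.5, 1.0, 2.0, 4.0]
vals = [f_a(x_star(mu), mu) for mu in mus]

# The penalized optimal values are monotonically increasing in mu
# (statement (*))...
assert all(vals[i] <= vals[i + 1] for i in range(len(vals) - 1))

# ...and every penalized optimum stays below f(x*) = 0,
assert all(v <= 0.0 for v in vals)

# while the unconstrained minimizers lie outside the admissible set x <= 0.
assert all(x_star(mu) > 0.0 for mu in mus)

print(vals)  # [-0.5, -0.25, -0.125, -0.0625]
```

Note that the monotone sequence of values $-\frac1{4\mu}$ approaches $f(x^*)=0$ from below as $\mu\to\infty$, which is exactly the behavior the inequality chain in the answer predicts.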