I am looking at a proof of entropy maximization and trying to understand the step where the derivative is taken. The problem is to find a probability distribution $p(x)$ with a given mean $\mu$ and variance $\sigma^2$. The Lagrangian is given as
$$ J(p)=\int p(x) \ln(p(x))\mathrm{d} x-\lambda_0\left(\int p(x) \mathrm{d}x-1\right)-\lambda_1\left(\int p(x)(x-\mu)^2 \mathrm{d}x-\sigma^2\right) $$
The proof continues by taking the derivative of the expression above with respect to $p(x)$, obtaining the following:
$$ \begin{gathered} \frac{\delta J}{\delta p(x)}=1+\ln(p(x))-\lambda_0-\lambda_1(x-\mu)^2=0 \\ \ln(p(x))=\lambda_0-1+\lambda_1(x-\mu)^2 \\ p(x)=\exp\left(\lambda_0-1+\lambda_1(x-\mu)^2\right) \end{gathered} $$
I am trying to understand how this derivative is computed explicitly, and how the integral signs disappear given that $p(x)$ is a function of $x$.
To see why this works, let us consider a cost functional of the form
$$J(p)=\int_\Omega F(x,p(x))\,dx$$ and assume that $p^*$ is a global minimizer. Now consider $J(p+h\nu)$, where $h\in\mathbb{R}$ and $\nu$ is a function. We have
$$ J(p+h\nu)=\int_\Omega F(x,p(x)+h\nu(x))\,dx. $$ Assuming that $F$ is differentiable with respect to its second argument, we get that
$$ J(p+h\nu)=J(p)+h\int_\Omega \dfrac{\partial F}{\partial p}(x,p(x))\nu(x)dx+o(h) $$
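(Pointwise, this is just the first-order Taylor expansion $F(x,p(x)+h\nu(x))=F(x,p(x))+h\,\dfrac{\partial F}{\partial p}(x,p(x))\,\nu(x)+o(h)$, integrated over $\Omega$; we assume enough regularity that the remainder stays $o(h)$ after integration.)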
Therefore, we get that
$$ \lim_{h\to0}\dfrac{J(p+h\nu)-J(p)}{h}=\int_\Omega \dfrac{\partial F}{\partial p}(x,p(x))\nu(x)dx $$
and this value is called the directional (Gateaux) derivative of the functional $J$ in the direction $\nu$; we denote it by $DJ[\nu](p)$.
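As a quick numerical sanity check of this limit, here is a minimal sketch with an assumed toy integrand $F(x,p)=p^2$ on $\Omega=[0,1]$ and arbitrary choices of $p$ and $\nu$ (all of these are illustrative, not from the proof); the difference quotient should converge to $\int_\Omega 2\,p(x)\nu(x)\,dx$:

```python
import numpy as np
from scipy.integrate import quad

# Toy setup (assumed): F(x, p) = p^2 on Omega = [0, 1], so dF/dp = 2p
# and the formula predicts  DJ[nu](p) = \int_0^1 2 p(x) nu(x) dx.
p = lambda x: np.exp(-x)           # base function p
nu = lambda x: np.sin(np.pi * x)   # perturbation direction nu

J = lambda f: quad(lambda x: f(x) ** 2, 0.0, 1.0)[0]   # J(f) = \int F(x, f(x)) dx

predicted = quad(lambda x: 2.0 * p(x) * nu(x), 0.0, 1.0)[0]

# Finite-difference approximation of the directional derivative
for h in [1e-2, 1e-4, 1e-6]:
    fd = (J(lambda x: p(x) + h * nu(x)) - J(p)) / h
    print(f"h={h:.0e}  quotient={fd:.8f}  predicted={predicted:.8f}")
```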
Now, if $p^*$ is a minimizer of $J$, then $DJ[\nu](p^*)$ must be zero for all $\nu$. By the fundamental lemma of the calculus of variations, this is equivalent to saying that $\dfrac{\partial F}{\partial p}(x,p^*(x))=0$ for all $x\in\Omega$. This is of course only a necessary condition.
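To see the forward direction concretely (a sketch, assuming the integrand below is continuous in $x$), take the direction $\nu(x)=\dfrac{\partial F}{\partial p}(x,p^*(x))$ itself:
$$0=DJ[\nu](p^*)=\int_\Omega\left(\frac{\partial F}{\partial p}(x,p^*(x))\right)^2\mathrm{d}x,$$
and a continuous nonnegative function with zero integral must vanish everywhere on $\Omega$.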
Now, applying this idea to the current scenario, we have
$$F(x,p)=p\ln p-\lambda_0\,p-\lambda_1(x-\mu)^2\,p$$
(the constant terms $\lambda_0$ and $\lambda_1\sigma^2$ in $J$ do not involve $p$, so they do not affect the derivative), and the rest follows along the same lines:
$$\frac{\partial F}{\partial p}=\ln p+1-\lambda_0-\lambda_1(x-\mu)^2,$$
which is exactly the stationarity condition quoted above.
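For completeness, here is a minimal SymPy sketch of this last step. It is illustrative rather than part of the proof: the multiplier values $\lambda_1=-\dfrac{1}{2\sigma^2}$ and $\lambda_0=1-\ln\left(\sqrt{2\pi}\,\sigma\right)$ substituted at the end are the ones forced by the normalization and variance constraints; they are plugged in rather than derived here.

```python
import sympy as sp

x, p, mu, lam0, lam1 = sp.symbols('x p mu lambda_0 lambda_1', real=True)
sigma = sp.symbols('sigma', positive=True)

# Integrand of the Lagrangian; the constant terms lambda_0*1 and
# lambda_1*sigma^2 do not involve p and are omitted.
F = p * sp.log(p) - lam0 * p - lam1 * (x - mu)**2 * p

dF_dp = sp.diff(F, p)
print(dF_dp)       # log(p) + 1 - lambda_0 - lambda_1*(x - mu)**2

# Pointwise stationarity condition dF/dp = 0, solved for p
p_star = sp.solve(sp.Eq(dF_dp, 0), p)[0]
print(p_star)      # exp(lambda_0 - 1 + lambda_1*(x - mu)**2)

# Multiplier values assumed from the two constraints (not derived here)
gauss = p_star.subs({lam1: -1 / (2 * sigma**2),
                     lam0: 1 - sp.log(sp.sqrt(2 * sp.pi) * sigma)})
print(sp.simplify(gauss))  # the N(mu, sigma^2) density

# Sanity checks: both constraints hold for this choice
print(sp.integrate(gauss, (x, -sp.oo, sp.oo)))                             # 1
print(sp.simplify(sp.integrate(gauss * (x - mu)**2, (x, -sp.oo, sp.oo))))  # sigma**2
```

The final two integrals confirm that, with those multipliers, the stationary distribution is the $\mathcal{N}(\mu,\sigma^2)$ density, which is the expected conclusion of the maximum-entropy argument.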