How to justify that the first-order total differential equals zero when considering conditional local maxima (minima)


I don't know if I am overthinking this.


Suppose a two-variable function $f(x,y)$ has a local maximum/minimum at the point $(a,b)$. When $x$ and $y$ are independent variables, it is easy to justify that $\frac{\partial f}{\partial x}(a,b)=\frac{\partial f}{\partial y}(a,b)=0$ by saying that the single-variable functions $f(x,b)$ and $f(a,y)$ must each attain a local maximum/minimum, and so the total differential $df(a,b)=0$.


Now I want to find the conditional maxima/minima of a function of $(m+n)$ variables $\{x_1,x_2,...,x_m,...,x_{m+n}\}$, connected by $n$ relations $$\varphi_i(x_1,x_2,...,x_m,...,x_{m+n})=0,\ (i=1,2,...,n).$$ To apply the method of Lagrange multipliers, the first step is to set the total differential equal to zero: $$\displaystyle\sum_{s=1}^{m+n}\frac{\partial f}{\partial x_s}\,dx_s=0.$$ But for a conditional maximum/minimum, I thought the partial derivatives $\frac{\partial f}{\partial x_s}$ may not all vanish at the same point (probably I am wrong about this?).

For example, take two variables $x$, $y$ and the function $f(x,y)=x^2+y^2$. $$\begin{cases}\text{When $x$ and $y$ are independent, there is obviously a local minimum at $f(0,0)$, where $\frac{\partial f}{\partial x}(0,0)=\frac{\partial f}{\partial y}(0,0)=0$.}\\\text{But when $x$ and $y$ are dependent, say $\varphi(x,y)=x-y-1=0$, the conditional minimum is at $f(0.5,-0.5)$, where $\frac{\partial f}{\partial x}(0.5,-0.5)=1\ne0$ and $\frac{\partial f}{\partial y}(0.5,-0.5)=-1\ne0$.}\end{cases}$$
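This example can be checked numerically. The sketch below (my own sanity check, not part of the question) scans $f(x,y)=x^2+y^2$ along the constraint line $x-y-1=0$ by substituting $x=y+1$, confirming the constrained minimum sits near $(0.5,-0.5)$ even though the unconstrained partials are nonzero there:

```python
import math

def f(x, y):
    return x**2 + y**2

# On the constraint x - y - 1 = 0 we can write x = y + 1
# and scan f along the constraint line.
ys = [i / 1000 - 2 for i in range(4001)]          # y in [-2, 2], step 0.001
best_y = min(ys, key=lambda y: f(y + 1, y))

print(best_y)            # ≈ -0.5, i.e. the constrained minimum is at (0.5, -0.5)

# The unconstrained partials of f do NOT vanish there:
fx = 2 * 0.5             # ∂f/∂x at (0.5, -0.5) = 1
fy = 2 * (-0.5)          # ∂f/∂y at (0.5, -0.5) = -1
print(fx, fy)
```

So the constrained minimizer is a point where the full gradient of $f$ is nonzero, which is exactly the puzzle posed above.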


So how do I actually justify setting the first-order total differential to zero? Any examples?

Best answer

OK, I think I may have figured it out myself. For a function of $(m+n)$ variables $f(x_1,x_2,...,x_m,...,x_{m+n})$ constrained by $n$ auxiliary equations $$\varphi_i(x_1,x_2,...,x_m,...,x_{m+n})=0,\ (i=1,2,...,n)$$ one can try to solve the auxiliary equations: treat $m$ of the variables as independent, and express the remaining $n$ variables in terms of those $m$.


That is to say, because there are only $n$ equations $\varphi_i$, one can solve for only $n$ variables; but since there are $(m+n)$ variables in total, each solved variable will be some combination of the remaining $m$ variables.


  • For example, if I move the first $m$ variables $\{x_1,x_2,...,x_m\}$ to the right side of the above equations as independent variables (supposing I can actually do that), then I have $n$ equations in the $n$ variables $\{x_{m+1},x_{m+2},...,x_{m+n}\}$ on the left side, which means I have $$\begin{cases}\varphi_i(x_1,x_2,...,x_m,...,x_{m+n})=\phi_i(x_{m+1},x_{m+2},...,x_{m+n})-\delta_i(x_1,x_2,...,x_m)=0\\\phi_i(x_{m+1},x_{m+2},...,x_{m+n})=\delta_i(x_1,x_2,...,x_m)\end{cases}$$ [Here I suppose I can actually construct the above equations, which means I must guarantee the existence of the implicit function $\delta_i(x_1,x_2,...,x_m)$ for each equation $\varphi_i=0$, so that I can rewrite them as $\phi_i=\delta_i(x_1,x_2,...,x_m)$.]

    Then, because there are $n$ equations, I can try to solve this system and express each variable from the set $\{x_{m+1},x_{m+2},...,x_{m+n}\}$ as some combination of $x_1,x_2,...,x_m$: $$\begin{cases}x_{m+1}=\rho_1(x_1,x_2,...,x_m)\\x_{m+2}=\rho_2(x_1,x_2,...,x_m)\\...\\x_{m+n}=\rho_n(x_1,x_2,...,x_m)\end{cases}$$ [The above is just an example; one does not always need to choose the first $m$ variables. Any $m$ of the $(m+n)$ variables $\{x_1,x_2,...,x_{m+n}\}$ can serve as the independent ones, provided the system can be solved for the remaining $n$.]


  • Take a more specific example: suppose there are $(2+2=4)$ variables $\{x_1,x_2,x_3,x_4\}$ and 2 constraint relations $\varphi_1=0$, $\varphi_2=0$ as follows: $$\begin{cases}\varphi_1(x_1,x_2,x_3,x_4)=\ln(x_1x_3)-x_2x_4=0\\\varphi_2(x_1,x_2,x_3,x_4)=(x_1e^{x_2x_4})-\sin\left(\sqrt {x_3^2+x_4^2}\right)=0\end{cases}$$ Then I can solve the above two equations by expressing $x_1,x_2$ in terms of $x_3,x_4$: $$\begin{cases}x_1=\frac{e^{\left(\frac{1}{2}\ln\left(x_3\sin\left(\sqrt {x_3^2+x_4^2}\right)\right)\right)}}{x_3}\\x_2=\frac{\ln\left(x_3\sin\left(\sqrt {x_3^2+x_4^2}\right)\right)}{2x_4}\end{cases}$$ [or similarly, I can express any two of the variables $\{x_1,x_2,x_3,x_4\}$ in terms of the other two]
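The solved expressions for $x_1,x_2$ can be verified numerically. The sketch below (my own check; the sample point $x_3=x_4=1$ is an assumption chosen so that $\sin\left(\sqrt{x_3^2+x_4^2}\right)>0$ and $x_1$ is real) plugs the formulas back into both constraints:

```python
import math

# Sample values where sin(sqrt(x3^2 + x4^2)) > 0, so x1 is real.
x3, x4 = 1.0, 1.0
s = math.sin(math.sqrt(x3**2 + x4**2))

# The solved expressions obtained from the two constraints:
x1 = math.exp(0.5 * math.log(x3 * s)) / x3       # equals sqrt(s / x3)
x2 = math.log(x3 * s) / (2 * x4)

# Both original constraints should be (numerically) satisfied:
phi1 = math.log(x1 * x3) - x2 * x4
phi2 = x1 * math.exp(x2 * x4) - s
print(phi1, phi2)        # both ≈ 0
```

Both residuals come out at floating-point zero, confirming that the formulas for $x_1,x_2$ really do parametrize the constraint set near this point.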

So when one claims the total differential equals zero, $$\displaystyle\sum_{s=1}^{m+n}\frac{\partial f}{\partial x_s}\,dx_s=0,$$ it means that when computing the partial derivatives $\frac{\partial f}{\partial x_s}$ one first chooses $n$ of the variables, replaces them with the combinations of the other $m$ variables, and differentiates the entire function as a composite function using the chain rule.


  • For example, taking the above case with the first $m$ variables as independent, I then have $$f(x_1,x_2,...,x_m,x_{m+1},...,x_{m+n})=f[x_1,x_2,...,x_m,\rho_1(x_1,x_2,...,x_m),...,\rho_n(x_1,x_2,...,x_m)]$$ and I should differentiate the right side of the equation instead of treating each variable as independent.


  • Take a more specific example, using the function $f(x,y)=x^2+y^2$ from the question above. In this case I have $m=1$, $n=1$, so there are $(m+n=2)$ variables with $n=1$ auxiliary equation $$\varphi_1(x,y)=x-y-1=0$$ [In this case, I know the conditional minimum is at $f(0.5,-0.5)$.]

    By solving $\varphi_1=0$, I get $$x=\rho_1(y)=y+1\\f(x,y)=f[\rho_1(y),y]=2y^2+2y+1$$ [or $y=\rho_1(x)=x-1$ if I instead choose $x$ as the single independent variable]

    If I then differentiate $f[\rho_1(y),y]$ to find the total differential $df$, I have $$df=f'_{\rho_1}\,d\rho_1(y)+f'_y\,dy=f'_{\rho_1}(\rho_1'(y)\,dy)+f'_y\,dy=(4y+2)\,dy\\\text{so that }df[\rho_1(-0.5),-0.5]=0\text{ for any }dy.$$ This is different from directly differentiating $f(x,y)$ while treating each variable as independent (because $\frac{\partial f}{\partial y}(0.5,-0.5)=-1\ne0$, as mentioned above). Thus the total differential $df$ at the point $(0.5,-0.5)$ is $0$.
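The chain-rule computation above can be checked directly. The sketch below (my own verification of the worked example) evaluates the reduced one-variable function $g(y)=2y^2+2y+1$ and its derivative $4y+2$, and cross-checks the derivative at $y=-0.5$ with a central difference:

```python
# After substituting x = rho_1(y) = y + 1 into f(x, y) = x^2 + y^2,
# the reduced one-variable function is g(y) = 2y^2 + 2y + 1.
def g(y):
    return 2 * y**2 + 2 * y + 1

def g_prime(y):
    # dg/dy = 4y + 2, by the chain rule
    return 4 * y + 2

print(g_prime(-0.5))     # 0: the total differential df vanishes at y = -0.5

# Central-difference check of the same derivative:
h = 1e-6
numeric = (g(-0.5 + h) - g(-0.5 - h)) / (2 * h)
print(numeric)           # ≈ 0
```

The derivative of the reduced function vanishes exactly at $y=-0.5$, i.e. at the constrained minimizer $(0.5,-0.5)$, even though the unconstrained partials of $f$ do not.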