Lagrange Multiplier Method: Why is the Lagrangian function defined as $f(x,y)+\lambda \cdot g(x,y)$?


Edit: As AlexR points out in this comment, there is no deep mathematical reason for defining the Lagrangian this way, other than that it makes the Lagrange Multiplier Method easier to memorize. I find this confusing; for me, it is easier to find the critical points of the system of equations directly.


I still haven't found an explanation of why the Lagrange function is defined as:

$$\Lambda(x,y,\lambda) = f(x,y)+\lambda \cdot g(x,y)$$

No author explains the procedure, and I don't think it is necessary to learn Lagrangian mechanics to understand the formula. Some examples:

Wikipedia:

To incorporate these conditions into one equation, we introduce an auxiliary function

[image of the auxiliary function; cf. the formula for $\Lambda$ above]

The Idea Shop:

Now, if we're clever we can write a single equation that will capture this idea. This is where the familiar Lagrangian equation comes in:

$L=f-\lambda(g-c)$

Lagrange Multipliers Without Permanent Scarring by Dan Klein:

We can compactly represent both equations at once by writing the Lagrangian:

$\Lambda(x,\lambda)=f(x)-\lambda g(x)$

The list goes on. The point is that "Lagrangian" seems to have a different meaning depending on the context: https://en.wikipedia.org/wiki/Lagrangian_%28disambiguation%29

So in this context, what does "Lagrangian function" mean, and what are the steps to arrive at that function?

Thanks in advance.


There is 1 answer below.

Best answer (score 24):

I'm not entirely sure what you're getting at, but I think this may be helpful to you:

The intuition is that an extreme point of $f(x)$ under the condition $g(x) = 0$ must satisfy $g(x) = 0$ and $\nabla_\nu f(x) = 0$ for every direction $\nu$ tangential to the candidate set $\{x : g(x) = 0\}$. If this were not the case, we could move a small, positive distance along $\nu$ (or $-\nu$) to improve the function value without leaving the constraint set.

Now, the tangential derivatives of $f$ all vanish exactly when $\nabla f$ is perpendicular to the constraint set, i.e. parallel to $\nabla g$; in other words, $\nabla f + \lambda \nabla g = 0$ for some scalar $\lambda$. Stacking this condition on top of the constraint gives $$\pmatrix{\nabla f + \lambda \nabla g\\g}(x) = 0$$ But this is precisely the $z$-gradient of $f + \lambda g$ if we write $z = (x,\lambda)$.

It turns out that this is in fact the first-order necessary condition for a constrained minimisation (or maximisation).
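To make the system above concrete, here is a small SymPy sketch (my own illustrative example, not from the post): maximise $f(x,y) = x + y$ on the unit circle $g(x,y) = x^2 + y^2 - 1 = 0$ by finding the critical points of $\Lambda = f + \lambda g$ with respect to all three variables at once.

```python
# Illustrative example (not from the original post): find constrained
# extrema of f(x, y) = x + y subject to g(x, y) = x^2 + y^2 - 1 = 0
# by solving grad Lambda = 0, where Lambda = f + lam * g.
import sympy as sp

x, y, lam = sp.symbols("x y lam", real=True)
f = x + y
g = x**2 + y**2 - 1

Lagrangian = f + lam * g

# The full gradient with respect to (x, y, lam) reproduces exactly the
# stacked system {grad f + lam * grad g = 0, g = 0}: differentiating
# in lam recovers the constraint g = 0.
grad = [sp.diff(Lagrangian, v) for v in (x, y, lam)]
solutions = sp.solve(grad, [x, y, lam], dict=True)

for sol in solutions:
    print(sol, "  f =", f.subs(sol))
```

The solver returns the two candidate points $(\pm\tfrac{1}{\sqrt2}, \pm\tfrac{1}{\sqrt2})$, i.e. the constrained maximum and minimum of $f$ on the circle, together with the corresponding multiplier $\lambda$.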