I just learnt about Lagrange multipliers and am confused about why they are useful. Why can we not just check for critical points by checking where the gradient vector of the objective function $f$ is $0$? Is it because in higher dimensions the boundary of a set may consist of infinitely many points, unlike the case of $[a,b]$, where, if there is no critical point in $(a,b)$, the max and min must lie at the two endpoints?
Also, I may be wrong, but the general procedure seems to be to first check for critical points of $g$, the constraint function, on the level set, and then to look for Lagrange points. If this is correct, could someone explain why we look for critical points of $g$ rather than of $f$? It doesn't make sense to look for maximum or minimum points of the constraint function.
Perhaps it would help to consider a simple example: maximize and minimize $f(x,y) = 3x + 4y$ subject to the constraint $g(x,y) = x^2 + y^2 = 25$.
We can now address one of your concerns: why can we not just check for critical points by checking if the gradient vector of the objective function $f$ is $0$? As you can see, the gradient of the objective function is $\nabla f = (3,4)$, which is never zero. By your logic, we should expect no extrema since we have no critical points (you can verify, however, that we obtain a max at $(3,4)$ and a min at $(-3,-4)$).
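In case it helps to see the Lagrange computation spelled out (taking the objective $f(x,y) = 3x + 4y$ with constraint $x^2 + y^2 = 25$, which matches the gradient $\nabla f = (3,4)$ and the extrema quoted above): the condition $\nabla f = \lambda \nabla g$ gives

$$(3,4) = \lambda(2x, 2y) \implies x = \frac{3}{2\lambda}, \quad y = \frac{2}{\lambda}.$$

Substituting into the constraint,

$$\frac{9}{4\lambda^2} + \frac{4}{\lambda^2} = \frac{25}{4\lambda^2} = 25 \implies \lambda = \pm\frac{1}{2},$$

so the candidate points are $(3,4)$, where $f = 25$ (the max), and $(-3,-4)$, where $f = -25$ (the min).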
To your second question: it amounts to the fact that points where the gradient of $g$ is zero or undefined are "poorly behaved" points of the domain over which we're optimizing. For instance, consider the example: minimize $f(x,y) = y$ subject to the constraint $g(x,y) = x^2 - y^3 = 0$.
Verify that, although there are no Lagrange points, $f$ attains a minimum at $(x,y) = (0,0)$. To see what I mean by "poorly behaved", note that $x^2 - y^3 = 0$ traces out the graph of $y = x^{2/3}$, which fails to be differentiable at $x = 0$.
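To spell out why there are no Lagrange points (taking the objective to be $f(x,y) = y$, consistent with the minimum at the origin): the condition $\nabla f = \lambda \nabla g$ reads

$$(0,1) = \lambda(2x, -3y^2).$$

The first component forces $\lambda = 0$ or $x = 0$. If $\lambda = 0$, the second component gives $1 = 0$, a contradiction; if $x = 0$, the constraint forces $y = 0$, and the second component gives $1 = -3\lambda \cdot 0 = 0$, again a contradiction. So the system has no solutions, even though a constrained minimum exists. Note that the obstruction sits exactly at the bad point: $\nabla g = (2x, -3y^2)$ vanishes at $(0,0)$, which is why that point must be checked separately before applying the multiplier condition.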