When finding a constrained maximum or minimum, the equation used is $$\nabla f =\lambda \nabla g$$ where $f$ is the objective function and $g$ is the constraint function.
However, this method can fail if $\nabla g = \mathbf 0$ at an extremum.
So is this equation correct to use in such situations? $$\lambda \nabla f =\nabla g$$
Whatever you do, you have to check two cases. The standard approach is to find all points $\mathbf x \in \mathbb R^n$ such that $g(\mathbf x) = 0$ and either

- $\nabla f(\mathbf x) = \lambda \nabla g(\mathbf x)$ for some $\lambda \in \mathbb R$, or
- $\nabla g(\mathbf x) = \mathbf 0$.
(Your constraint might also look like $g(\mathbf x)= c$, but we can put it into the standard form above by writing it as $g(\mathbf x) - c = 0$. So I'll assume $g(\mathbf x) = 0$ is the constraint.)
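As a concrete illustration of the two-case check, here is a small sketch (the example $f = x + y$ on the unit circle is my own, not from the question) using SymPy:

```python
# Worked example of the standard two-case check:
# optimize f(x, y) = x + y subject to g(x, y) = x^2 + y^2 - 1 = 0.
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
f = x + y
g = x**2 + y**2 - 1

grad_f = [sp.diff(f, v) for v in (x, y)]
grad_g = [sp.diff(g, v) for v in (x, y)]

# Case 1: grad f = lam * grad g, together with g = 0.
case1 = sp.solve([grad_f[0] - lam*grad_g[0],
                  grad_f[1] - lam*grad_g[1],
                  g], [x, y, lam], dict=True)

# Case 2: grad g = 0, together with g = 0 (degenerate constraint points).
case2 = sp.solve([grad_g[0], grad_g[1], g], [x, y], dict=True)

print(case1)  # two candidates: (±sqrt(2)/2, ±sqrt(2)/2)
print(case2)  # []: grad g vanishes only at the origin, where g != 0
```

Here case 2 is empty because the circle is a well-behaved constraint; the point of the two-case check is that case 2 is not always empty.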
You're right that you could take the first possibility into account by moving the $\lambda$ to the other side, but then another case "falls out": you now need to find points $\mathbf x \in \mathbb R^n$ where $g(\mathbf x) = 0$ and either

- $\lambda \nabla f(\mathbf x) = \nabla g(\mathbf x)$ for some $\lambda \in \mathbb R$, or
- $\nabla f(\mathbf x) = \mathbf 0$.
The two formulations are equivalent for the purpose of finding potential optima: in the end, they identify the same set of points.
We could even imagine a third formulation: find points $\mathbf x \in \mathbb R^n$ where $g(\mathbf x) = 0$ and either

- $\nabla f(\mathbf x) = \lambda \nabla g(\mathbf x)$ for some $\lambda \in \mathbb R$, or
- $\lambda \nabla f(\mathbf x) = \nabla g(\mathbf x)$ for some $\lambda \in \mathbb R$.
This would also be valid, though redundant: the only points found by one equation but not the other are those where $\lambda = 0$, so solving either of the two-case formulations above is simpler.
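One can check the equivalence computationally on a simple example of my own choosing ($f = x + y$, $g = x^2 + y^2 - 1$, where neither gradient vanishes on the constraint set), solving the gradient equation with $\lambda$ on each side:

```python
# Sketch: both placements of lambda find the same candidate points
# for f = x + y subject to g = x^2 + y^2 - 1 = 0 (an assumed example).
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
f, g = x + y, x**2 + y**2 - 1
fx, fy = sp.diff(f, x), sp.diff(f, y)
gx, gy = sp.diff(g, x), sp.diff(g, y)

# Formulation 1: grad f = lam * grad g, with g = 0.
first = sp.solve([fx - lam*gx, fy - lam*gy, g], [x, y, lam], dict=True)
# Formulation 2: lam * grad f = grad g, with g = 0.
second = sp.solve([lam*fx - gx, lam*fy - gy, g], [x, y, lam], dict=True)

# Compare the (x, y) candidate sets numerically.
pts_first = {(round(float(s[x]), 9), round(float(s[y]), 9)) for s in first}
pts_second = {(round(float(s[x]), 9), round(float(s[y]), 9)) for s in second}
print(pts_first == pts_second)  # True: same candidates, different lambdas
```

On this example the two $\lambda$ values are reciprocals of each other at each candidate point, which is exactly why the $\lambda \neq 0$ cases coincide.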
That being said, the first formulation is the better way to think about it. Essentially, the existence of any values $\mathbf x$ such that $g(\mathbf x) = 0$ and $\nabla g(\mathbf x) = \mathbf 0$ means that $g$ is somehow an ill-behaved constraint at $\mathbf x$. (For example, this sort of thing happens when the set $\{\mathbf x \in \mathbb R^n : g(\mathbf x) = 0\}$ has isolated points.) This can also happen if we introduce unnecessary complexity into the constraint: for example, turn the constraint $x+y-1=0$ into the constraint $(x+y-1)^2=0$ and now every point where $g(x,y)=0$ also satisfies $\nabla g(x,y) = (0,0)$.
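The squared-constraint degeneracy described above is easy to verify symbolically; a quick sketch:

```python
# Check the degenerate constraint from the text:
# g(x, y) = (x + y - 1)^2 has grad g = 0 at *every* feasible point.
import sympy as sp

x, y = sp.symbols('x y', real=True)
g = (x + y - 1)**2
grad_g = [sp.diff(g, v) for v in (x, y)]  # (2(x+y-1), 2(x+y-1))

# On the constraint set we have y = 1 - x; substituting shows the
# gradient vanishes identically there, so grad f = lam * grad g would
# wrongly force grad f = 0 and miss ordinary constrained optima.
on_constraint = [gi.subs({y: 1 - x}) for gi in grad_g]
print(on_constraint)  # [0, 0]
```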
In other words, the first formulation says "either the constraint is degenerate, or the gradient equation holds", which gives a useful interpretation to the "awkward" alternative.
In much greater generality, if we allowed multiple constraints and/or allowed some constraints to be inequalities, the condition of having no point $\mathbf x$ where $\nabla g(\mathbf x) = \mathbf 0$ and $g(\mathbf x) = 0$ would generalize to one of the regularity conditions of the Karush–Kuhn–Tucker theorem.
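For concreteness, here is one standard statement (not specific to this question): when minimizing $f$ subject to equality constraints $g_i(\mathbf x) = 0$ and inequality constraints $h_j(\mathbf x) \le 0$, the KKT conditions require
$$\nabla f(\mathbf x) + \sum_i \lambda_i \nabla g_i(\mathbf x) + \sum_j \mu_j \nabla h_j(\mathbf x) = \mathbf 0, \qquad \mu_j \ge 0, \qquad \mu_j\, h_j(\mathbf x) = 0,$$
and a common regularity condition (LICQ) asks that the gradients of the equality constraints and of the *active* inequality constraints be linearly independent at $\mathbf x$. With a single equality constraint, this reduces precisely to the requirement $\nabla g(\mathbf x) \neq \mathbf 0$.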