Consider the following problem: \begin{align} &\min_{(x, y)} (c_x \cdot x + c_y \cdot y)\\ \text{s.t.} \quad & K = x + y\\ &(x, y) \in[0, X] \times [0, \infty) \end{align} where $c_x < c_y$, $X \geq 0$, and $K \geq 0$ are given parameters (not constraints on the decision variables).
The solution is trivial: $(x, y) = (K, 0)$ if $X \geq K$ and $(x, y) = (X, K - X)$ if $X < K$.
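This closed-form solution is easy to check numerically; here is a minimal sketch (the function name and the parameter values are my own, purely illustrative):

```python
def solve(c_x, c_y, K, X):
    """Closed-form minimizer of c_x*x + c_y*y  s.t.  x + y = K, 0 <= x <= X, y >= 0."""
    assert c_x < c_y and X >= 0 and K >= 0
    if K <= X:
        return K, 0.0      # the cheap variable x covers all of K
    return X, K - X        # x is capped at X; the remainder goes to y

print(solve(1.0, 3.0, K=2.0, X=5.0))  # (2.0, 0.0) -- the case X >= K
print(solve(1.0, 3.0, K=7.0, X=5.0))  # (5.0, 2.0) -- the case X < K
```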
I'd like to solve the problem via the Lagrangian approach. Let $\mathcal L(x, y, \lambda) = c_x \cdot x + c_y \cdot y + \lambda \cdot(K - x - y)$ denote the Lagrangian. The FOCs read \begin{align} &\frac{\partial \mathcal L}{\partial x} = c_x - \lambda = 0\\ &\frac{\partial \mathcal L}{\partial y} = c_y - \lambda = 0\\ &\frac{\partial \mathcal L}{\partial \lambda} = K - x - y = 0 \end{align} Obviously, there is a contradiction: $c_x = \lambda = c_y$ and $c_x < c_y$ cannot both be true.
I'd rather think the FOCs read:
Case I: For $K \in [0, X)$ \begin{align} &\frac{\partial \mathcal L}{\partial x} = c_x - \lambda = 0 \Longrightarrow x = K\\ &\frac{\partial \mathcal L}{\partial y} = c_y - \lambda > 0 \Longrightarrow y = 0\\ &\frac{\partial \mathcal L}{\partial \lambda} = K - x - y = 0 \end{align}
Case II: For $K \geq X$ \begin{align} &\frac{\partial \mathcal L}{\partial x} = c_x - \lambda < 0 \Longrightarrow x = X\\ &\frac{\partial \mathcal L}{\partial y} = c_y - \lambda = 0 \Longrightarrow y = K - X\\ &\frac{\partial \mathcal L}{\partial \lambda} = K - x - y = 0 \end{align}
And the interpretation of the Lagrange multiplier as a shadow price would be: if $x$ alone can serve the demand $K$ and demand rises by one unit, the cost of serving that extra unit is $c_x$; otherwise it is $c_y$.
- What is the proper reasoning for the FOCs and the correct derivation of $\lambda$? I arrived at this conclusion only because I knew the correct solution beforehand, which need not be the case in general.
Don't forget the multipliers for the inequality constraints too: $$\mathcal{L}(x,y,\lambda, \mu) = c_x x + c_y y + \lambda(K - x - y) - \mu_1 (X - x) - \mu_2 x - \mu_3 y$$ Stationarity of the Lagrangian is no longer sufficient on its own; we need to add the complementarity conditions to arrive at the "Karush-Kuhn-Tucker" (KKT) conditions for first-order optimality: $$\nabla_{x} \mathcal{L} = c_x - \lambda + \mu_1 - \mu_2 = 0$$ $$\nabla_{y} \mathcal{L} = c_y - \lambda - \mu_3 = 0$$ $$K - x - y = 0$$ $$0 \le \mu_1 \perp (X - x) \ge 0$$ $$0 \le \mu_2 \perp x \ge 0$$ $$0 \le \mu_3 \perp y \ge 0$$
These last three conditions are "complementary slackness"; the "perp" symbol $\perp$ indicates that at most one of the values can be nonzero (e.g. if $\mu_2 > 0$, then $x = 0$, or if $x > 0$, then $\mu_2 = 0$).
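As a sketch, the whole KKT system can be checked mechanically for a candidate point. The helper below and all numeric values are my own illustrations, not part of the original problem:

```python
def kkt_ok(x, y, lam, mu1, mu2, mu3, c_x, c_y, K, X, eps=1e-9):
    """Return True iff (x, y, lam, mu) satisfies the KKT system above (up to eps)."""
    stationary = (abs(c_x - lam + mu1 - mu2) <= eps and
                  abs(c_y - lam - mu3) <= eps)
    primal = abs(K - x - y) <= eps and -eps <= x <= X + eps and y >= -eps
    dual = min(mu1, mu2, mu3) >= -eps                  # multipliers nonnegative
    comp = (abs(mu1 * (X - x)) <= eps and              # complementary slackness
            abs(mu2 * x) <= eps and abs(mu3 * y) <= eps)
    return stationary and primal and dual and comp

# Case 1 with illustrative data c_x=1, c_y=3, K=2, X=5:
print(kkt_ok(x=2.0, y=0.0, lam=1.0, mu1=0.0, mu2=0.0, mu3=2.0,
             c_x=1.0, c_y=3.0, K=2.0, X=5.0))  # True
```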
We have a system of nonsmooth equations that must be solved for the values of $(x, y, \lambda, \mu)$. In general, this is hard. For this problem, though, it helps us understand your two cases (slightly modified):
Case 1: $K \in (0,X)$
We expect the solution to be $(x,y) = (K, 0)$. We always have $X - x > 0$ and $x > 0$, which means $\mu_1 = 0$ and $\mu_2 = 0$ by complementary slackness. Further, since $y$ is zero, $\mu_3$ is allowed to be nonzero. The stationarity of the Lagrangian becomes $c_x - \lambda = 0$ and $c_y - \lambda - \mu_3 = 0$, which gives $\lambda = c_x$ and $\mu_3 = c_y - c_x$.
Case 2: $K > X$
We expect the solution to be $(x,y) = (X, K - X)$. We have $X - x = 0$, $x > 0$, and $y > 0$, which, again by complementary slackness, implies $\mu_1$ may be nonzero, $\mu_2 = 0$, and $\mu_3 = 0$. Using this in the other equations we get $c_x - \lambda + \mu_1 = 0$, and $c_y - \lambda = 0$. Similarly, we get $\lambda = c_y$, and $\mu_1 = c_y - c_x$.
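Both cases can be verified by substituting the derived multipliers back into the KKT system; the numbers below ($c_x = 1$, $c_y = 3$, $X = 5$) are illustrative choices of mine:

```python
c_x, c_y, X = 1.0, 3.0, 5.0
# Each tuple is (K, x, y, lam, mu1, mu2, mu3):
cases = [
    (2.0, 2.0, 0.0, c_x, 0.0,       0.0, c_y - c_x),  # K < X: lam = c_x, mu3 = c_y - c_x
    (7.0, 5.0, 2.0, c_y, c_y - c_x, 0.0, 0.0),        # K > X: lam = c_y, mu1 = c_y - c_x
]
for K, x, y, lam, mu1, mu2, mu3 in cases:
    assert abs(c_x - lam + mu1 - mu2) < 1e-12          # stationarity in x
    assert abs(c_y - lam - mu3) < 1e-12                # stationarity in y
    assert K - x - y == 0                              # primal feasibility
    assert mu1 * (X - x) == 0 and mu2 * x == 0 and mu3 * y == 0  # complementarity
print("both cases satisfy the KKT conditions")
```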
In general, the value of $\lambda$ is what you were expecting. A little more precisely, the value of $\lambda$ is the derivative of the optimal objective value as a function of $K$: $$ f(K) = \min_{x,y} \{c_x x + c_y y : x + y = K, x \in [0,X], y \ge 0\} $$ $$ \frac{\partial f}{\partial K} = \lambda $$
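A quick finite-difference sketch confirms this: with illustrative parameters $c_x = 1$, $c_y = 3$, $X = 5$ (my own choice), the slope of $f$ is $c_x$ for $K < X$ and $c_y$ for $K > X$, matching the multipliers derived above:

```python
def f(K, c_x=1.0, c_y=3.0, X=5.0):
    """Optimal objective value: use x up to the cap X, then fill with y."""
    x = min(K, X)
    return c_x * x + c_y * (K - x)

h = 1e-6
for K, lam in [(2.0, 1.0), (7.0, 3.0)]:       # expect lam = c_x, then lam = c_y
    deriv = (f(K + h) - f(K - h)) / (2 * h)   # central difference, away from K = X
    assert abs(deriv - lam) < 1e-4
print("shadow prices match the multipliers")
```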
Note I have left out the "breakpoint" $K = X$ in the cases above since this is a point where multiple values of the multipliers $(\lambda, \mu)$ are possible. This corresponds to a point where $f$ has a non-singleton-valued generalized gradient.
Caveats: the KKT conditions are necessary for optimality only under a constraint qualification, and sufficient only for convex problems; both hold here, since this is a linear program. A decent resource for diving deeper is https://stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf (plus many other books).