Oftentimes economists (mostly ones who come to the profession with a mathematics or physics background) use mathematics in a sloppy way in order to make their own lives easier (while making life harder for somebody else who wants to understand the models deeply), and they do not explain much besides giving mathematical expressions. They do not explain or prove why the way they formulate economic problems mathematically is correct in mathematical terms. My question is about one such case.
I have used and derived the results of this model (and models like it) many times in my course work and beyond, but never had enough time to take care of the mathematical subtleties (you know, all those deadlines and stuff like that). So I decided to revisit one such model from the famous book "Monetary Policy, Inflation, and the Business Cycle: An Introduction to the New Keynesian Framework and Its Applications" (first edition), chapter 3, "Basic New Keynesian Model", and understand every mathematical detail in it. Here is the one that I cannot make sense of mathematically:
Problem Definition
Suppose there are infinitely many goods indexed by $ i $ over the interval $ [0,1] $. Let $ C(i) $ denote the amount consumed and $ P(i) $ the price of good $ i $. To find the consumption $ C(i) $ of each good $ i $ that minimizes total expenditure on all goods, subject to the constraint that the total amount of all goods (aggregated in the special way described in the constraint) equals $ C $, the agent has to solve the following problem:
\begin{align*} \min_{C(i)} \quad & \int_0^1 P(i)C(i)di \\ \text{s.t.} \quad & \left[{\int_0^1 C(i)^{\frac{\epsilon-1}{\epsilon}}di}\right]^{\frac{\epsilon}{\epsilon-1}}=C \end{align*}
Problem Solution (Suggested by the author)
Form the Lagrangian and take the derivative w.r.t each $C(i)$:
\begin{align*} L(C(i)) = \int_0^1 P(i)C(i)di - \lambda \left( \left[{\int_0^1 C(i)^{\frac{\epsilon-1}{\epsilon}}di}\right]^{\frac{\epsilon}{\epsilon-1}} - C \right) \end{align*}
\begin{align*} FOC[C(i)]: P(i) - \lambda \frac{\epsilon}{\epsilon-1} \left[{\int_0^1 C(i)^{\frac{\epsilon-1}{\epsilon}}di}\right]^{\frac{1}{\epsilon-1}} \frac{\epsilon-1}{\epsilon} C(i)^{-\frac{1}{\epsilon}} = 0 \end{align*}
He then goes on to find each $ C(i) $ in terms of $\lambda$ and the other parameters (if anyone is interested I can write up the whole problem, but for my question this is enough).
Question:
As you can see in the above problem, the author takes the derivative w.r.t. $ C(i) $ while the integration variable is $ i $. I would like to know whether this makes sense to a seasoned mathematician, and how. I know how (and why it is mathematically correct) to take the derivative of a definite integral w.r.t. a variable limit of integration as well as w.r.t. the integration variable, but this kind of derivative, and especially its result in the FOC $\left(\frac{d\int_0^1 P(i)C(i)di}{dC(i)} = P(i) \right)$, does not really make sense to me. Does it make sense to a mathematician concerned with rigor, and why?
A good way to gain intuition about this is to imagine that we start with a discrete number of goods N, with the “size” of each good (in terms of price and production function) equal to 1/N, and then take the limit as N approaches infinity. That is, we’re assuming an infinite number of infinitesimally small intermediate firms, which is of course unrealistic but mathematically convenient. But I digress.
For a given level of N we have the problem:
$\min_{C_i} \sum_{i=1}^N \frac{1}{N} P_i C_i$
s.t. $C= \left( \sum_{i=1}^N \left( \frac{1}{N}C_i\right)^{\frac{\epsilon-1}{\epsilon}} \right) ^{\frac{\epsilon}{\epsilon -1}}$
Notice that the $1/N$ can be factored out at the end, leaving an FOC analogous to the one listed in the continuous case, but with a sum instead of an integral.
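To make the factoring explicit, here is a sketch of the discrete first-order condition, taking the constraint exactly as written above. The summation index $j$ is my notation, used only to keep it apart from the good $i$ being differentiated:

\begin{align*} \left( \sum_{j=1}^N \left( \frac{1}{N} C_j \right)^{\frac{\epsilon-1}{\epsilon}} \right)^{\frac{\epsilon}{\epsilon-1}} &= \frac{1}{N} \left( \sum_{j=1}^N C_j^{\frac{\epsilon-1}{\epsilon}} \right)^{\frac{\epsilon}{\epsilon-1}} \\ FOC[C_i]: \quad \frac{1}{N} P_i - \lambda \frac{1}{N} \left( \sum_{j=1}^N C_j^{\frac{\epsilon-1}{\epsilon}} \right)^{\frac{1}{\epsilon-1}} C_i^{-\frac{1}{\epsilon}} &= 0 \end{align*}

The $\frac{1}{N}$ multiplies both terms, so it drops out, and what remains has the same form as the continuous FOC with $\sum_j$ in place of $\int_0^1 \cdot \, di$.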
So we can see the solution is intuitively the continuous analog of the discrete case. Next is the question of whether the problem as listed is mathematically acceptable.
In terms of math, what the optimization problem asks us to do is to pick a function (yes, our choice here is a function) $C(i)$ on $[0,1]$ that minimizes the integral, given the constraint.
This is a perfectly reasonable problem to solve; I believe similar things come up in engineering and physics, for example. The subfield of optimal control exists to solve this type of problem, so if you’re looking for some of the more nitty-gritty aspects of the topic, you’ll need to delve into that literature, but I can offer some further intuition. Technically, what we need to do to solve a convex problem like this is to set the Fréchet derivative of the problem equal to 0. Luckily, when things are well behaved, we can solve this problem by treating $di$ like a constant and the integral like a sum, and setting the derivative with respect to $C(i)$ (NOT the derivative of $C$ w.r.t. $i$) equal to 0, in the same way we would if it were a variable rather than a point in a function.
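As a sketch of what setting that derivative to zero means here, assume $C(i)$ is bounded away from zero and perturb it in the direction of an arbitrary bounded function $h(i)$ (the symbols $h$, $t$ and the inner integration variable $j$ are mine, introduced only for illustration). Differentiating the Lagrangian along $C + th$ at $t = 0$ gives

\begin{align*} \frac{d}{dt} L(C + th) \Big|_{t=0} = \int_0^1 \left( P(i) - \lambda \left[{\int_0^1 C(j)^{\frac{\epsilon-1}{\epsilon}}dj}\right]^{\frac{1}{\epsilon-1}} C(i)^{-\frac{1}{\epsilon}} \right) h(i) \, di \end{align*}

For this to vanish for every admissible $h$, the term in parentheses has to be zero for almost every $i$, which is the same pointwise condition that the shortcut above produces.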
If we do this, we’ll notice that the FOC comes out as you listed, with both sides of the equality multiplied by $di$. Intuitively this is the $\frac{1}{N}$ in the limit of the problem listed above. To be clear, that means we are saying that the marginal cost and marginal benefit of increasing $C(i)$ are infinitesimally small, but that makes sense because it’s just one of infinitely many points in the interval.
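If you want to convince yourself numerically, the following is a minimal sketch rather than anything from the book. It assumes a midpoint-rule discretization of the two integrals in the continuous problem (each term weighted by $1/N$), a made-up price schedule $P(i) = 1 + i$, and $\epsilon = 4$; all function names are hypothetical. It uses the closed-form solution of the FOC, $C(i) = (P(i)/P)^{-\epsilon} C$, where $P = \left[\int_0^1 P(i)^{1-\epsilon} di\right]^{\frac{1}{1-\epsilon}}$ is the value that $\lambda$ takes at the optimum. It then checks that this bundle satisfies the constraint and that randomly drawn feasible bundles never cost less.

```python
import random


def aggregate(c, eps):
    """Midpoint-rule version of [ integral of C(i)^a di ]^(1/a), a = (eps-1)/eps."""
    a = (eps - 1.0) / eps
    return (sum(x ** a for x in c) / len(c)) ** (1.0 / a)


def expenditure(p, c):
    """Midpoint-rule version of the integral of P(i) C(i) di."""
    return sum(pi * ci for pi, ci in zip(p, c)) / len(c)


def foc_demand(p, total, eps):
    """Bundle implied by the FOC: C_i = (P_i / P)^(-eps) * C, with P the price index."""
    price_index = (sum(pi ** (1.0 - eps) for pi in p) / len(p)) ** (1.0 / (1.0 - eps))
    return [total * (pi / price_index) ** (-eps) for pi in p], price_index


def check(n=200, eps=4.0, total=2.0, trials=500, seed=0):
    rng = random.Random(seed)
    grid = [(k + 0.5) / n for k in range(n)]   # midpoints of [0, 1]
    prices = [1.0 + i for i in grid]           # assumed price schedule P(i) = 1 + i
    c_star, price_index = foc_demand(prices, total, eps)
    best = expenditure(prices, c_star)
    smallest_gap = float("inf")
    for _ in range(trials):
        # Draw a random positive bundle and rescale it onto the constraint
        # (the aggregator is homogeneous of degree one, so rescaling works).
        raw = [rng.uniform(0.1, 3.0) for _ in range(n)]
        scale = total / aggregate(raw, eps)
        alt = [scale * x for x in raw]
        smallest_gap = min(smallest_gap, expenditure(prices, alt) - best)
    return aggregate(c_star, eps), best, price_index * total, smallest_gap


if __name__ == "__main__":
    agg, best, pc, gap = check()
    print("aggregator at FOC bundle:", agg)
    print("expenditure at FOC bundle:", best, "vs P*C:", pc)
    print("smallest extra cost among random feasible bundles:", gap)
```

Random search is of course not a proof, only a sanity check that the pointwise FOC picks out the cheapest feasible bundle in the discretized problem.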