Let $X$ and $Y$ be random variables. On solving the theoretical least squares for simple linear regression case, I came across the following step (while differentiating the expected mean square error with respect to the first parameter): $$\frac{\partial}{\partial{a}} \mathbb{E}(Y-aX-b)^2 = \mathbb{E}\big[\frac{\partial}{\partial{a}}(Y-aX-b)^2\big]$$
I have the following doubts about this step:
- Is this always true? Can we interchange derivative and expectation operator like this? Or do we need to check any condition before doing so?
- Since we are differentiating the term $\mathbb{E}(Y-aX-b)^2$ with respect to $a$ (keeping $b$ constant), the term $\mathbb{E}(Y-aX-b)^2$ is now a function of two random variables $X, Y$ or is it a function of $a, b$ ?
Yes, the interchange is valid here. In full generality it does require a regularity condition (differentiating under the integral sign is usually justified via the dominated convergence theorem), but for an integrand like $(Y-aX-b)^2$, which is a polynomial in $a$, it holds whenever the relevant moments of $X$ and $Y$ are finite. Expectation is a linear operation over the random variables, and $a$ is a free variable.
$\mathbb E((Y-aX-b)^2)$ is a function of the free variables $a$ and $b$, not of $X$ and $Y$: taking the expectation averages out the randomness, for exactly the same reason that $\Bbb E(X)$ is a constant rather than a function of the random variable $X$.
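To see concretely that the expectation is an ordinary deterministic function of $a$ and $b$, here is a minimal sketch using a made-up discrete joint distribution for $(X, Y)$ (the support points and probabilities are purely illustrative):

```python
import numpy as np

# Hypothetical joint distribution of (X, Y): three equally likely outcomes.
xs = np.array([0.0, 1.0, 2.0])
ys = np.array([1.0, 3.0, 4.0])
ps = np.array([1/3, 1/3, 1/3])

def mse(a, b):
    """E[(Y - aX - b)^2]: a plain deterministic function of a and b."""
    return np.sum(ps * (ys - a * xs - b) ** 2)

# Once the expectation is taken, no randomness remains; each (a, b)
# just maps to a number.
print(mse(0.0, 0.0))  # E[Y^2] in this toy example
print(mse(1.0, 1.0))
```

Different choices of $(a, b)$ give different numbers, which is exactly the sense in which $\mathbb E((Y-aX-b)^2)$ is a function of $a$ and $b$ alone.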
In general, when the domain $\mathcal D$ does not depend on $a$, we have for a summation over a discrete domain:
$$\dfrac{\partial ~~}{\partial a}\sum_{x\in\mathcal D} g(a,x)=\sum_{x\in\mathcal D} \dfrac{\partial g(a,x)}{\partial a}$$
And analogously for integrals over a continuous domain:
$$\dfrac{\partial ~~}{\partial a}\int_{\mathcal D} g(a,x)\,\mathrm d x=\int_{\mathcal D} \dfrac{\partial g(a,x)}{\partial a}\,\mathrm d x$$
Remark: the partial differentiations on the right-hand sides also treat $x$ as an independent variable; it is held constant while differentiating with respect to $a$.
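The discrete-domain identity above can be checked numerically: compare a finite-difference derivative of $\mathbb E[(Y-aX-b)^2]$ with the expectation of the pointwise derivative $-2X(Y-aX-b)$. A minimal sketch, again using a made-up discrete distribution:

```python
import numpy as np

# Hypothetical discrete joint distribution of (X, Y).
xs = np.array([0.0, 1.0, 2.0])
ys = np.array([1.0, 3.0, 4.0])
ps = np.array([0.2, 0.5, 0.3])

a, b, h = 1.5, 0.5, 1e-6

def mse(a, b):
    """E[(Y - aX - b)^2] as a weighted sum over the support."""
    return np.sum(ps * (ys - a * xs - b) ** 2)

# Left side: numerical d/da of E[(Y - aX - b)^2].
lhs = (mse(a + h, b) - mse(a - h, b)) / (2 * h)

# Right side: E[ d/da (Y - aX - b)^2 ] = E[-2X(Y - aX - b)],
# i.e. differentiating inside the sum term by term.
rhs = np.sum(ps * (-2 * xs * (ys - a * xs - b)))

print(lhs, rhs)  # the two agree up to finite-difference error
```

Since the sum here is over a fixed finite domain, the interchange is just term-by-term differentiation of a finite sum, and the two quantities match.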