I am a physicist, and I’ve never taken any statistics or probability classes. Nevertheless, I'm trying to build a set of lecture notes on error propagation for my experimental-physics students. Sadly, I haven’t found much literature treating the subject rigorously (the rigorous treatment probably has a name and I just don’t know it).
In any case, in physics we usually denote our measurements by $x\pm\Delta x$ where $x$ is our “best guess” and $\Delta x$ is an interval where we are reasonably certain that our measurement lies. If we have multiple measurements $x_1\pm\Delta x_1,\dots,x_n\pm\Delta x_n$ and we want to calculate a function of these, we would report it as $$f(x_1,\dots,x_n)\pm\sqrt{\sum_{i=1}^n\big(\partial_i f(x_1,\dots,x_n)\Delta x_i\big)^2}.$$
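For concreteness, the formula can be evaluated numerically with central-difference derivatives; a minimal sketch (the function name `propagate` and the step size `h` are my own illustrative choices, not standard):

```python
import math

def propagate(f, x, dx, h=1e-6):
    """Return (f(x), Δf) with Δf = sqrt(Σ_i (∂_i f · Δx_i)²).

    Partial derivatives are approximated by central differences with
    step h; `propagate` and `h` are illustrative names."""
    best = f(*x)
    var = 0.0
    for i, dxi in enumerate(dx):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        deriv = (f(*xp) - f(*xm)) / (2 * h)
        var += (deriv * dxi) ** 2
    return best, math.sqrt(var)

# Example: f(x, y) = x·y with x = 3 ± 0.1 and y = 2 ± 0.2, so
# Δf = sqrt((2·0.1)² + (3·0.2)²) = sqrt(0.4) ≈ 0.632
value, err = propagate(lambda a, b: a * b, [3.0, 2.0], [0.1, 0.2])
```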
In my set of notes, I wanted to take a more rigorous approach. Instead of using the $x\pm \Delta x$ notation, I insisted on measurements being probability measures on an outcome space $X$. Then error propagation through a random variable $f:X\rightarrow Y$ is just the pushforward measure induced on $Y$ by $f$: $$P_f(F)=P(f^{-1}(F)).$$ Nevertheless, for my students to make calculations, I wanted to relate back to the old notation. I have two options:
I can define $x\pm\Delta x$ as the probability measure on $\mathbb{R}$ whose probability density function is a Gaussian centered at $x$ with standard deviation $\Delta x$. If I take this option, I would like to prove that for all random variables $f:\mathbb{R}^n\rightarrow \mathbb{R}$, the induced measure is reasonably approximated by the Gaussian measure described by the physicists’ formula. Moreover, I find the approach very appealing because of the central limit theorem. Am I right that this theorem can be interpreted as saying that the average of a bunch of probability measures (the process of averaging understood as a random variable) tends to induce a Gaussian probability measure?
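As a numerical sanity check of this reading of the central limit theorem, one can compare the shape of a single, strongly non-Gaussian measurement distribution with that of an average of many independent copies (the sample sizes below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# A single "error source": exponential, strongly skewed (population skewness 2).
single = rng.exponential(1.0, size=200_000)

# The average of 100 independent copies of the same source.
avg = rng.exponential(1.0, size=(200_000, 100)).mean(axis=1)

def skewness(s):
    """Sample skewness: third moment of the standardised sample."""
    z = (s - s.mean()) / s.std()
    return (z ** 3).mean()

print(skewness(single))  # ≈ 2: far from Gaussian
print(skewness(avg))     # ≈ 0.2 (= 2/sqrt(100)): much closer to Gaussian
```

The skewness of the average shrinks like $1/\sqrt{n}$, illustrating how accumulation pushes the distribution toward a Gaussian.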
Alternatively, I can define the expected value and the variance of a measure, and introduce the notation $x\pm \Delta x$ as stating only that the measure has expected value $x$ and standard deviation $\Delta x$. Can I justify the physicists’ formula?
I would really appreciate it if readers could pick their favorite of the two options and help me build arguments for the questions asked.
I would wrap up the mathematical background of error propagation as follows:
1. All basic error-propagation rules are based on the assumption of independent, normal distributions. This assumption is often justified by the central limit theorem, as a typical measurement error is the accumulation of several error sources. The individual sources may be distributed in various ways, but their accumulation is approximately normal due to the central limit theorem. (Also see this answer of mine on Cross Validated.)
2. Under these assumptions, it’s easy to show that $Δ(λx) = λ·Δx$.
3. It’s also easy to show that $Δ(x+y) = \sqrt{Δx^2 + Δy^2}$ (convolve the two distributions).
4. If you assume that the errors are small enough that the second-order terms of the Taylor expansion of $f$ are negligible, you can linearise $f$ and apply points 2 and 3 to obtain the desired formula.
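These rules are easy to verify by Monte Carlo; a minimal sketch (the central values and uncertainties are arbitrary example numbers) checking the sum rule and the linearised product rule:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# x = 10 ± 0.3 and y = 5 ± 0.4, modelled as independent normal distributions.
x = rng.normal(10.0, 0.3, n)
y = rng.normal(5.0, 0.4, n)

# Sum rule: Δ(x+y) = sqrt(0.3² + 0.4²) = 0.5
print((x + y).std())  # ≈ 0.5

# Linearised product rule: Δ(xy) ≈ sqrt((5·0.3)² + (10·0.4)²) ≈ 4.27,
# accurate here because the relative errors Δx/x and Δy/y are small.
print((x * y).std())  # ≈ 4.27
```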
With this in mind, to your individual questions:
1. I don’t think that the central limit theorem is of relevance to this particular step (see above), so it is not clear to me why you find the approach appealing on account of it.
2. No, you additionally need to assume a normal distribution. You can easily see this with a numerical experiment in which you multiply two exponentially distributed variables.
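A minimal sketch of that experiment, using unit-mean exponentials so that each factor has expected value 1 and standard deviation 1, i.e. would be written as $1\pm 1$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Two independent exponential variables, each with mean 1 and standard
# deviation 1 — both "1 ± 1" in the x ± Δx notation.
x = rng.exponential(1.0, n)
y = rng.exponential(1.0, n)
f = x * y

# Knowing only means and variances, the physicists' formula predicts
# Δ(xy) = sqrt((1·1)² + (1·1)²) = sqrt(2) ≈ 1.414 …
print(np.sqrt(2.0))

# … but the actual standard deviation is sqrt(3) ≈ 1.732, and the induced
# distribution is strongly right-skewed, so it is poorly summarised by a
# Gaussian centred at the product of the means.
print(f.std())
z = (f - f.mean()) / f.std()
print((z ** 3).mean())  # large positive skewness
```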