Error propagation: why use variances?


I have been reading up on error propagation and am slightly confused about something.

We can write the error in $c=f(a,b)$ as: $$\sigma(c)= f_a \sigma_a+f_b \sigma_b$$ Firstly, is this correct, and am I correct in saying that the partial derivatives are evaluated at the means of $a$ and $b$?

Everywhere I look, however, this formula is not quoted; instead it is squared, and under the assumption that $a$ and $b$ are independent we get:

$$\sigma(c)^2= f_a^2 \sigma_a^2+f_b^2 \sigma_b^2$$ What is the advantage of this expression over my first one? It seems less accurate, since it requires assuming independence, which my first formula does not.

BEST ANSWER

Let's assume that you measure $a$ and $b$ with errors that have standard deviations $\sigma_a$ and $\sigma_b$, respectively. We can model your measurements as $\hat a = a+\epsilon_a$, $\hat b = b+\epsilon_b$, where $Var(\epsilon_a)=\sigma_a^2$, $Var(\epsilon_b)=\sigma_b^2$ and $E[\epsilon_a]=E[\epsilon_b]=0$.

Then, we can write $\hat c = f(\hat a, \hat b)$.

Now, here's the big assumption in all of error propagation: $\frac{\sigma_a}{\mu_a}, \frac{\sigma_b}{\mu_b} \ll 1$. This justifies using the linearization of $f(a,b)$ (i.e., $L[f(a,b)]$) to estimate the measurement error of $c$:

$L[f(a+\Delta a,b+\Delta b)]_{a,b}=f(a,b) + f_x|_a \times(\Delta a) + f_y|_b \times (\Delta b)$

Given small enough perturbations ($\Delta$), we can say that $f(a+\Delta a,b+\Delta b) \approx L[f(a+\Delta a,b+\Delta b)]_{a,b}$ when $|\Delta a|, |\Delta b|\ll 1$.

Now, let's replace $a+\Delta a$ with $a+\epsilon_a$, and similarly for $b$, and see where this leads us:

$\hat c = f(a+\epsilon_a,b+\epsilon_b) \approx f(a,b)+f_x|_a \times(\epsilon_a) + f_y|_b \times (\epsilon_b)$

We can clean this up by letting $f_a:=f_x|_a$ and $f_b:=f_y|_b$:

$\hat c \approx f(a,b)+ f_a\epsilon_a + f_b\epsilon_b$
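To see the linearization at work numerically, here is a minimal sketch (my own toy example, not from the post), using $f(a,b)=ab$ so that $f_a=b$ and $f_b=a$:

```python
# Toy example (assumed values): f(a, b) = a * b, so f_a = b and f_b = a.
def f(a, b):
    return a * b

a, b = 2.0, 3.0
f_a, f_b = b, a  # partial derivatives evaluated at (a, b)

# Small perturbations playing the role of the errors eps_a, eps_b.
eps_a, eps_b = 0.01, -0.02

exact = f(a + eps_a, b + eps_b)
linear = f(a, b) + f_a * eps_a + f_b * eps_b

# For this bilinear f, the linearization error is exactly the
# second-order term eps_a * eps_b, which is tiny when both are small.
print(exact - linear)  # eps_a * eps_b = -0.0002, up to float rounding
```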

Thus, we've turned $\hat c$ into a random variable:

$E[\hat c]= f(a,b), \;\; Var[\hat c] = (f_a)^2\sigma^2_a + (f_b)^2\sigma^2_b + 2 f_af_b\sigma_{ab}$

If the errors are independent, then the last term is $0$ and we get:

$\sigma^2_{\hat c}=(f_a)^2\sigma^2_a + (f_b)^2\sigma^2_b$, which is your second equation.
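A quick Monte Carlo sketch confirms the independent-error formula (all numbers here are my own assumed values, again with $f(a,b)=ab$, not from the post):

```python
import random

random.seed(0)

# Assumed setup: f(a, b) = a * b with true values a = 2, b = 3 and
# independent Gaussian errors with sigma_a = 0.01, sigma_b = 0.02.
a, b = 2.0, 3.0
sigma_a, sigma_b = 0.01, 0.02
f_a, f_b = b, a  # partials of f(a, b) = a * b at (a, b)

n = 200_000
samples = [
    (a + random.gauss(0, sigma_a)) * (b + random.gauss(0, sigma_b))
    for _ in range(n)
]
mean = sum(samples) / n
var_mc = sum((s - mean) ** 2 for s in samples) / (n - 1)

# Predicted variance from the independent-error formula.
var_pred = f_a**2 * sigma_a**2 + f_b**2 * sigma_b**2
print(var_mc, var_pred)  # both close to 2.5e-3
```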

Now, if we don't assume independence, we can use the Cauchy-Schwarz inequality, which in a probabilistic context states:

$$(\sigma_{XY})^2\leq \sigma^2_X \sigma^2_Y$$

Therefore, to maximize the value of $\sigma_{\hat c}^2$ (assuming $f_a f_b > 0$), we should set $\sigma_{ab}=\sigma_{a}\sigma_{b}$, which gives us:

$\sigma^2_{\hat c}=(f_a)^2\sigma^2_a + (f_b)^2\sigma^2_b + 2 f_af_b\sigma_{a}\sigma_{b}=(f_a\sigma_{a}+f_b\sigma_{b})^2\implies \sigma_{\hat c}=f_a\sigma_{a}+f_b\sigma_{b}$, which is your first equation.
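The perfectly-correlated case can be checked the same way: drive both errors with a single shared draw, so that $\sigma_{ab}=\sigma_a\sigma_b$ (again a sketch with my own assumed values, $f(a,b)=ab$, $a=2$, $b=3$):

```python
import random

random.seed(1)

# Assumed setup: same as before, but now one shared underlying error
# drives both measurements, so sigma_ab = sigma_a * sigma_b exactly.
a, b = 2.0, 3.0
sigma_a, sigma_b = 0.01, 0.02
f_a, f_b = b, a  # partials of f(a, b) = a * b at (a, b)

n = 200_000
samples = []
for _ in range(n):
    z = random.gauss(0, 1)  # one shared standard-normal draw
    samples.append((a + sigma_a * z) * (b + sigma_b * z))

mean = sum(samples) / n
sd_mc = (sum((s - mean) ** 2 for s in samples) / (n - 1)) ** 0.5

# Predicted standard deviation from the first equation in the question.
sd_pred = f_a * sigma_a + f_b * sigma_b  # = 0.07
print(sd_mc, sd_pred)  # both close to 0.07
```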

So, as copper.hat said, the second equation in your post assumes independent errors, and hence does not maximize the variability. The first equation in your post is an upper bound on the error, since it assumes they are perfectly correlated, so the errors are always in the same direction as each other and proportional in magnitude.

To recap, propagation-of-error equations rely on the smallness of the error relative to the measurement (i.e., a large signal-to-noise ratio) to allow simplifying linearizations to be used. Then, depending on how correlated you believe the errors are, you can adjust the variability appropriately: use the covariance explicitly, or rely on an assumption of independence or of perfect positive correlation.
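All three cases fold into one small helper (a hypothetical function of my own, not a standard library API):

```python
# Hypothetical helper: propagated variance of c = f(a, b) given the
# partials at the means, the standard deviations, and the covariance.
def propagate(f_a, f_b, sigma_a, sigma_b, cov_ab=0.0):
    """cov_ab = 0 for independence,
    cov_ab = sigma_a * sigma_b for perfect positive correlation."""
    return f_a**2 * sigma_a**2 + f_b**2 * sigma_b**2 + 2 * f_a * f_b * cov_ab

# With the assumed numbers used above (f(a, b) = a * b at a = 2, b = 3,
# sigma_a = 0.01, sigma_b = 0.02):
print(propagate(3.0, 2.0, 0.01, 0.02))              # ~0.0025 (independent)
print(propagate(3.0, 2.0, 0.01, 0.02, 0.01 * 0.02)) # ~0.0049 = 0.07**2
```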