Variance of ratio of mean value of functions of a random variable

The problem:

I have a random process in which a real-valued random variable takes the independent values $x_1, x_2, \ldots, x_n$. I then define $Q$ as

$$ Q = \frac{n^{-1}\sum^n f(x_i)}{n^{-1}\sum^n g(x_i)}$$

where $f$ and $g$ are known functions of the $x_i$. Note that the same values $x_1, x_2, \ldots$ appear in both the numerator and the denominator.

My goal is to obtain an expression for $\operatorname{var}\left[Q\right]$.


My efforts:

I tried the equation

$$\operatorname{var}\left[\frac{X}{Y}\right]\approx\frac{\operatorname{var}\left[X\right]}{\operatorname{E}\left[Y\right]^2}-\frac{2\operatorname{E}\left[X\right]}{\operatorname{E}\left[Y\right]^3}\operatorname{cov}\left[X,Y\right]+\frac{\operatorname{E}\left[X\right]^2}{\operatorname{E}\left[Y\right]^4}\operatorname{var}\left[Y\right]$$

from this article, taking $X$ and $Y$ to be the numerator and denominator, respectively.

I took $\operatorname{E}\left[X\right]=\operatorname{E}\left[f(x_i)\right]$ and $\operatorname{var}\left[X\right]=\operatorname{var}\left[f(x_i)\right]/n$.


Specific doubts:

  • Is this the right procedure?

  • How do I evaluate $\operatorname{cov}(X,Y)$? I thought that $\operatorname{cov}(X,Y)=\operatorname{E}\left[XY\right]-\operatorname{E}\left[X\right]\operatorname{E}\left[Y\right]$, but I am unsure because of the next point and because the variance should be divided by $n$.

  • How does the fact that $x_1, x_2, \ldots$ are the same in the numerator and the denominator affect this analysis?

Thank you very much.

There are 1 best solutions below

I actually have the exact same problem and I had been somewhat confused about the difference between (what I perceived to be) two separate situations:

  1. The expectation value and variance of the ratio of two random variables:

$$\text{E}\left[\frac{A}{B}\right] \quad\text{and}\quad \text{Var}\left[\frac{A}{B}\right]$$

  2. The variance of an estimator constructed by taking the ratio of the sample means of two random variables:

$$Q = \frac{A}{B} = \frac{\frac{1}{N}\sum_{i=1}^N a(x_i)}{\frac{1}{N}\sum_{i=1}^N b(x_i)} \quad\longrightarrow\quad \text{Var}\left[Q\right]$$


In the first case, there is a known formula, as you have already pointed out. Yet, I can relate to your lingering doubts as to whether the given formula applies to the second case as it feels somehow different. However, after some thinking, I believe the formula also applies to the second scenario. My reasoning goes as follows:

In the case where $Q = A/B$, $A$ and $B$ are the sample means of $a(x)$ and $b(x)$, that is, estimates of the corresponding population means. Because they are computed from random draws, $A$ and $B$ are themselves random variables, each with an associated error. Thus, even though we're taking the ratio of two sample means, $\text{Var}[Q]$ really can be computed using the formula for $\text{Var}[A/B]$ as in the first case.
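To make the substitution explicit, here is a sketch of the moments that go into the formula. It assumes the $x_i$ are independent and identically distributed (the question states independence; identical distribution is my assumption), so that terms with $i \neq j$ drop out of the covariance:

$$\text{E}[A] = \text{E}[a(x)], \qquad \text{Var}[A] = \frac{\text{Var}[a(x)]}{N}, \qquad \text{Var}[B] = \frac{\text{Var}[b(x)]}{N}, \qquad \text{Cov}[A,B] = \frac{1}{N^2}\sum_{i=1}^N \text{Cov}[a(x_i), b(x_i)] = \frac{\text{Cov}[a(x), b(x)]}{N}$$

Plugging these into the formula from the question gives

$$\text{Var}[Q] \approx \frac{1}{N}\left(\frac{\text{Var}[a(x)]}{\text{E}[b(x)]^2} - \frac{2\,\text{E}[a(x)]}{\text{E}[b(x)]^3}\,\text{Cov}[a(x), b(x)] + \frac{\text{E}[a(x)]^2}{\text{E}[b(x)]^4}\,\text{Var}[b(x)]\right)$$

Under these assumptions every term carries the same factor $1/N$, and the fact that the numerator and denominator share the same $x_i$ enters only through the covariance term.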

This reasoning becomes even clearer when viewed as an exercise in error propagation. Since $A$ and $B$ are estimates of some fluctuating quantities, they each have an associated error. To find the variance of a function $Q$ defined as their ratio (as in the case we're considering), we can use error propagation to see how the errors in estimating $A$ and $B$ carry over to the error in $Q$. Indeed, the same formula for the ratio of two random variables, $\text{Var}[f] = \text{Var}[A/B]$, is provided in a table of formulas on the Wikipedia page on error propagation.
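One way to gain some confidence in the formula is to compare it with a brute-force simulation. The following is a minimal sketch, not a definitive implementation: the function names `ratio_variance_delta` and `monte_carlo_check`, the choices $a(x) = x^2$ and $b(x) = 1 + x$, and the uniform distribution on $[1, 2]$ are all hypothetical and picked only so that the denominator stays well away from zero. It assumes i.i.d. draws and estimates the moments from the sample itself.

```python
import numpy as np


def ratio_variance_delta(fx, gx):
    """First-order (error propagation) estimate of Var[mean(fx) / mean(gx)].

    fx and gx hold the paired values a(x_i) and b(x_i) evaluated on the
    SAME draws x_i, which are assumed to be i.i.d.
    """
    fx = np.asarray(fx, dtype=float)
    gx = np.asarray(gx, dtype=float)
    n = fx.size
    mean_f, mean_g = fx.mean(), gx.mean()
    # 2x2 sample covariance matrix of the pairs (a(x_i), b(x_i))
    c = np.cov(fx, gx, ddof=1)
    var_f, var_g, cov_fg = c[0, 0], c[1, 1], c[0, 1]
    per_draw = (
        var_f / mean_g**2
        - 2.0 * mean_f * cov_fg / mean_g**3
        + mean_f**2 * var_g / mean_g**4
    )
    # Variances and covariance of the sample means all scale as 1/n
    return per_draw / n


def monte_carlo_check(n=200, trials=2000, seed=0):
    """Compare the formula with the empirical variance of Q over many repeats."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(1.0, 2.0, size=(trials, n))  # hypothetical distribution
    a = x**2        # hypothetical a(x)
    b = 1.0 + x     # hypothetical b(x), bounded away from zero
    q = a.mean(axis=1) / b.mean(axis=1)
    empirical = q.var(ddof=1)
    # Average the per-sample formula estimates to reduce their noise
    predicted = np.mean([ratio_variance_delta(ai, bi) for ai, bi in zip(a, b)])
    return empirical, predicted


if __name__ == "__main__":
    empirical, predicted = monte_carlo_check()
    print(f"empirical Var[Q]: {empirical:.3e}")
    print(f"formula   Var[Q]: {predicted:.3e}")
```

If the two numbers agree to within the simulation noise for your own $a$, $b$, and sample size, that is some evidence the first-order formula is adequate in your setting; a large discrepancy would suggest the approximation is breaking down, for instance when the denominator fluctuates close to zero.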

Though I've been using this formula for my own work, hopefully someone else can weigh in about the soundness of this reasoning and any caveats.


Note: Depending on the degree of nonlinearity in $a(x)$ and $b(x)$, the error estimate (for the ratio of their sample averages) will be biased. There are probably other caveats that I'm not aware of, but I hope this boosts your confidence in your use of the formula.