How to take the Variance of the Mean

In Sensitivity Analysis (specifically Sobol's Method) I keep coming across the following equation to calculate a decision variable's importance. (An example of such from Wikipedia.)

$$ V_i = \operatorname{Var}_{X_i}\bigl(E_{X_{\sim i}}(Y \mid X_i)\bigr) $$

Knowing that the expectation (the expected value, or mean) can be written as $\mathbb E(X)$ or $\mu_x$, I'm having trouble understanding conceptually how to take the variance of a mean.

Is there a more detailed description of how Monte Carlo sampling, with $X_i$ held fixed, is used with the above equation for a given function such as $f(\pmb x) = x_1 + x_2 x_3^2$?
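One way to read the formula, sketched here under the assumption that the inputs are independent: $E_{X_{\sim i}}(Y \mid X_i)$ is not a single number but a function of $X_i$ (the mean of $Y$ over all the other inputs, for each value of $X_i$). Since $X_i$ is random, that function is itself a random variable, so it has a variance. For the example function this can be worked out by hand:

$$
\begin{aligned}
E(Y \mid X_1 = x_1) &= x_1 + E(X_2)\,E(X_3^2), & V_1 &= \operatorname{Var}(X_1),\\
E(Y \mid X_2 = x_2) &= E(X_1) + x_2\,E(X_3^2), & V_2 &= E(X_3^2)^2\,\operatorname{Var}(X_2),\\
E(Y \mid X_3 = x_3) &= E(X_1) + E(X_2)\,x_3^2, & V_3 &= E(X_2)^2\,\operatorname{Var}(X_3^2).
\end{aligned}
$$

If one further assumes $X_1$ uniform on the integers $1,\dots,1000$, $X_2$ uniform on the integers $1,\dots,100$, and $X_3$ uniform on $[1,10]$, these come out to roughly $V_1 \approx 8.3\times10^{4}$, $V_2 \approx 1.14\times10^{6}$ and $V_3 \approx 2.18\times10^{6}$, so $x_3$ would rank first.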


Edited to add more detail to the question.

There are basically two ways to read this for the function noted above, and I'm not sure which interpretation is correct. Does one hold one variable fixed and sample from the other two? Or hold the other two variables fixed and sample from the $i^{th}$ variable?
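For what it's worth, the formula seems to match the first reading, applied in two nested steps: hold $X_i$ at one value and average $Y$ over the other inputs (the inner mean), then repeat for many values of $X_i$ and take the variance of those means (the outer variance). Below is a minimal brute-force sketch of that double loop. It assumes independent inputs; the helper names `draw`, `condMean` and `firstOrderV` and the sample sizes are made up for illustration. It is not the more efficient estimator usually used for Sobol indices, and the finite inner sample adds a small upward bias to the estimate.

```mathematica
(* Function to analyze, as in the question *)
f[x1_, x2_, x3_] := x1 + x2 x3^2;

(* Hypothetical helper: one draw of input i from the ranges in the question *)
draw[1] := RandomInteger[{1, 1000}];
draw[2] := RandomInteger[{1, 100}];
draw[3] := RandomReal[{1, 10}];

(* Inner loop: estimate E[Y | X_i = xi] by averaging over fresh draws of the other inputs *)
condMean[i_, xi_, nInner_] := Mean@Table[
    f @@ ReplacePart[{draw[1], draw[2], draw[3]}, i -> xi],
    {nInner}];

(* Outer loop: variance of the conditional mean over draws of X_i *)
firstOrderV[i_, nOuter_, nInner_] :=
  Variance@Table[condMean[i, draw[i], nInner], {nOuter}];

SeedRandom[1];

(* Total variance of Y, with every input sampled *)
varY = Variance@Table[f[draw[1], draw[2], draw[3]], {20000}];

(* Estimated first-order indices V_i / Var(Y) for i = 1, 2, 3 *)
Table[firstOrderV[i, 500, 500]/varY, {i, 3}]
```

Note that `X_i` is fixed only within one inner average. Fixing it at a single value for the whole run would give a conditional variance of $Y$, not the variance of the conditional mean.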

(* Function to Analyze *)
f[x1_, x2_, x3_] := x1 + x2 (x3^2);

(* Fixed Values *)
x1 = RandomInteger[{1, 1000}];
x2 = RandomInteger[{1, 100}];
x3 = RandomReal[{1, 10}];

(* Random Values *)
num = 10000; (* sample size; not given in the original post, assumed here *)
randx1 = RandomInteger[{1, 1000}, num];
randx2 = RandomInteger[{1, 100}, num];
randx3 = RandomReal[{1, 10}, num];

(* Total Variance of Y *)
VarY = Variance[
   f[#1, #2, #3] & @@@ Transpose[{randx1, randx2, randx3}]];

(* Variable of interest Fixed *)
FixedX1 = Variance[f[x1, #1, #2] & @@@ Transpose[{randx2, randx3}]];
FixedX2 = Variance[f[#1, x2, #2] & @@@ Transpose[{randx1, randx3}]];
FixedX3 = Variance[f[#1, #2, x3] & @@@ Transpose[{randx1, randx2}]];

(* Variable of interest Random *)
(* Map (/@) is used here: @@@ leaves a flat list of numbers unchanged, so f would never be applied *)
VarX1 = Variance[f[#, x2, x3] & /@ randx1];
VarX2 = Variance[f[x1, #, x3] & /@ randx2];
VarX3 = Variance[f[x1, x2, #] & /@ randx3];

The above should be close to the first-order Sobol index. With some post-processing in Mathematica I get:

However, from the example I'm following we should know that the function is most sensitive to $x_3$, which is the opposite of what I'm getting. This can be shown intuitively just by playing around with values drawn from the ranges above.