I have data divided into chunks, and because of software limitations I can only calculate the variance within each chunk. But I want the variance of the whole data set, not of the chunks. I know the variance is not a linear operator. I would like something like an average of the chunk variances, but it has to be the same number I would get if I calculated the variance of the whole data set at once. Example: rolling a die 6 times in 3 groups of 2 rolls. I can calculate the variance of each group, and from those numbers I want to calculate the variance of the whole set of 6 rolls. Thank you for your help.
2026-03-29 19:10:20.1774811420
Can I work out the variance in batches?
1.3k Views Asked by Bumbble Comm https://math.techqa.club/user/bumbble-comm/detail At
Refer to the following answer to this question: How do I combine standard deviations of two groups?
In particular, the final formula
$$s_z^2 = \frac{(n-1) s_x^2 + (m-1) s_y^2}{n+m-1} + \frac{nm(\bar x - \bar y)^2}{(n+m)(n+m-1)}$$
illustrates how to compute the total variance of two samples, one of size $n$ with sample mean $\bar x$ and sample variance $s_x^2$, and one of size $m$ with sample mean $\bar y$ and sample variance $s_y^2$. Those are the quantities you need to track. Also note that the total sample mean is given by the formula $$\bar z = \frac{n \bar x + m \bar y}{n + m}.$$ These formulas readily lend themselves to an extended calculation for any number of groups: combine the first two groups, then treat the result as a single group and combine it with the next one, and so on.
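For reference, applying the two-sample formula repeatedly to $k$ groups should be equivalent to the following closed form. The notation is introduced here for illustration and is not taken from the linked answer: group $i$ has size $n_i$, sample mean $\bar x_i$, and sample variance $s_i^2$, and $N = \sum_{i=1}^k n_i$.

$$\bar x_T = \frac{1}{N} \sum_{i=1}^k n_i \bar x_i, \qquad s_T^2 = \frac{\sum_{i=1}^k (n_i - 1) s_i^2 + \sum_{i=1}^k n_i (\bar x_i - \bar x_T)^2}{N - 1}.$$

With $k = 2$, the second sum reduces to $\frac{nm(\bar x - \bar y)^2}{n+m}$, which recovers the two-sample formula above.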
Since the original poster has claimed that the formula does not work, I will furnish a numerical example to illustrate. This example will employ discrete data to match the scenario described in the question, but realizations from a continuous distribution can just as easily be provided.
Let $D_i$ represent dataset $i$. Then
$$\begin{align*} D_1 &= \{1, 1, 3, 4, 1, 5, 6, 3, 5, 5\} \\ D_2 &= \{5, 6, 2, 4, 2, 1, 1, 4, 2, 4, 4, 1, 3, 5, 6\} \\ D_3 &= \{3, 2, 6, 4, 1, 5, 2, 1, 3, 1, 5, 2, 2\} \\ D_4 &= \{5, 3, 1, 5, 1\} \end{align*}$$
Consequently, $$\begin{array}{|c|c|c|c|} \hline i & n_i & \bar x_i & s_{x_i}^2 \\ \hline 1 & 10 & \frac{17}{5} & \frac{18}{5} \\ \hline 2 & 15 & \frac{10}{3} & \frac{65}{21} \\ \hline 3 & 13 & \frac{37}{13} & \frac{73}{26} \\ \hline 4 & 5 & 3 & 4 \\ \hline \end{array}$$
We now calculate the combined sample sizes, means, and variances of datasets $1$ through $T$:
$$\begin{array}{|c|c|c|c|} \hline T & n_T & \bar x_T & s_T^2 \\ \hline 1 & 10 & \frac{17}{5} & \frac{18}{5} \\ \hline 2 & 25 & \frac{84}{25} & \frac{947}{300} \\ \hline 3 & 38 & \frac{121}{38} & \frac{4245}{1406} \\ \hline 4 & 43 & \frac{136}{43} & \frac{2749}{903} \\ \hline \end{array}$$
The last row represents the total sample size, sample mean, and sample variance for the $4$ combined datasets.
Here is a sample calculation of the aggregate variance of datasets $1$ through $3$:
$$s_T^2 (T = 3) = \frac{(25 - 1)(\frac{947}{300}) + (13 - 1)(\frac{73}{26})}{25 + 13 - 1} + \frac{(25)(13)(\frac{84}{25} - \frac{37}{13})^2}{(25 + 13)(25 + 13 - 1)} = \frac{4245}{1406},$$
which matches the direct calculation based on datasets $D_1, D_2, D_3$.
Finally, Mathematica code to replicate the above computations:
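The Mathematica listing is not included in this copy of the answer. As a stand-in, and not a reconstruction of the original code, here is a minimal Python sketch that folds the four datasets together one at a time using the two-sample formula. The function names `summarize` and `combine` are made up for this example, and exact `Fraction` arithmetic is used so the results can be compared with the tables above.

```python
from fractions import Fraction

def summarize(data):
    """Return (size, sample mean, sample variance) of one dataset, exactly."""
    xs = [Fraction(x) for x in data]
    n = len(xs)
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)
    return n, xbar, s2

def combine(n, xbar, s2x, m, ybar, s2y):
    """Merge the summaries of two groups with the two-sample formula."""
    total = n + m
    zbar = (n * xbar + m * ybar) / total
    s2z = ((n - 1) * s2x + (m - 1) * s2y) / (total - 1) \
        + n * m * (xbar - ybar) ** 2 / (total * (total - 1))
    return total, zbar, s2z

datasets = [
    [1, 1, 3, 4, 1, 5, 6, 3, 5, 5],
    [5, 6, 2, 4, 2, 1, 1, 4, 2, 4, 4, 1, 3, 5, 6],
    [3, 2, 6, 4, 1, 5, 2, 1, 3, 1, 5, 2, 2],
    [5, 3, 1, 5, 1],
]

# Only the running (size, mean, variance) triple is carried between batches.
running = summarize(datasets[0])
rows = [running]
for data in datasets[1:]:
    running = combine(*running, *summarize(data))
    rows.append(running)

for n, xbar, s2 in rows:
    print(n, xbar, s2)
# 10 17/5 18/5
# 25 84/25 947/300
# 38 121/38 4245/1406
# 43 136/43 2749/903
```

Only the per-batch size, mean, and variance enter `combine`, which is the situation described in the question: the raw data of earlier batches is never revisited.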
In the future, rather than simply asserting that the formula doesn't work, it would be more polite and instructive to provide your own computations showing where you are encountering problems, so that your error can be found.