Given descriptive stats for two sets $(x_1,y_1)$ and $(x_2, y_2)$, can $R^2$ be found for their union $(\{x_1 \cup x_2\}, \{y_1 \cup y_2\})$?

I have two sets of x-y coordinates: $(x_1,y_1)$ and $(x_2, y_2)$. I also have the $R^2$ value for each set of coordinates (e.g. $(x_1,y_1)$ has the coefficient of determination $R_1^2$).

Given only the stats listed below, and not the original data, how can I compute the $R^2$ value for the union of the two sets of coordinates? In math notation, how can I compute $R^2$ for $(\{x_1 \cup x_2\}, \{y_1 \cup y_2\})$?

Here's all the information I have about these two sets:
  • Coefficients of determination: $R_1^2$ and $R_2^2$
  • Means: $\mu_1$ and $\mu_2$
  • RMSE: $RMSE_1$ and $RMSE_2$
  • Number of elements: $n_1$ and $n_2$
  • Min/Max of each set: $min_1, max_1$ and $min_2, max_2$
  • Just about any other descriptive stat for each set individually (e.g. $\sigma_1$, $cov(x_1,y_1)$, etc.)
Assumptions:
  • $x_1$ is independent of $x_2$
  • $y_1$ is independent of $y_2$
  • To clarify, I'm looking for the coefficient of determination
  • I know about Simpson's Paradox, but I want to do this anyway

I'm trying to do things this way because if I have more than two sets (which I plan to), I cannot load the entire union of all the sets into my computer's memory all at once. I need to summarize each set and then compute combined values from there to avoid crashing my computer.
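
To make the summarize-then-combine idea concrete, here is a minimal sketch. It assumes that $R^2$ means the squared Pearson correlation of the pooled points (which matches the coefficient of determination of an ordinary least-squares line with intercept fitted to the union); if $R^2$ is measured against some other fixed model, this does not apply. The names `Summary`, `summarize`, `merge` and `r_squared` are made up for illustration. Each set is reduced to its count, its two means, and its sums of squared deviations and cross-deviations, which can also be recovered from the stats above (for example, the sum of squared deviations is $n_1\sigma_1^2$ when $\sigma_1$ is the population standard deviation).

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Summary:
    """Per-set statistics that are sufficient to pool R^2 (hypothetical helper)."""
    n: int
    mean_x: float
    mean_y: float
    sxx: float  # sum of (x - mean_x)^2
    syy: float  # sum of (y - mean_y)^2
    sxy: float  # sum of (x - mean_x) * (y - mean_y)


def summarize(xs, ys):
    """Reduce one set of points to a Summary; only this set is in memory."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return Summary(n, mx, my, sxx, syy, sxy)


def merge(a, b):
    """Combine two summaries into the summary of the union of their points."""
    n = a.n + b.n
    dx = b.mean_x - a.mean_x
    dy = b.mean_y - a.mean_y
    w = a.n * b.n / n  # weight of the between-set correction term
    return Summary(
        n,
        a.mean_x + dx * b.n / n,
        a.mean_y + dy * b.n / n,
        a.sxx + b.sxx + w * dx * dx,
        a.syy + b.syy + w * dy * dy,
        a.sxy + b.sxy + w * dx * dy,
    )


def r_squared(s):
    """Squared Pearson correlation from a summary."""
    return s.sxy ** 2 / (s.sxx * s.syy)


# Usage: summarize each set separately, then fold the summaries together.
chunks = [([1, 2, 3, 4], [2, 4, 5, 4]), ([10, 11, 12], [8, 9, 12])]
pooled = reduce(merge, (summarize(xs, ys) for xs, ys in chunks))
print(r_squared(pooled))
```

Because `merge` returns another `Summary`, the same fold works for any number of sets, and only one set needs to be in memory at a time. Note that the per-set $R_i^2$ values are not needed, and are not sufficient on their own, since the pooled value also depends on how far apart the set means are.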

If this is better suited for the Statistics Stack Exchange, please let me know! Thanks in advance.