I am learning the hypothesis test for two sample proportions. Suppose the sample sizes and numbers of successes are $(n_1, y_1)$ and $(n_2, y_2)$ for the two samples, respectively. Let the true proportions of successes be $p_1, p_2$.
Null hypothesis $H_0$: $p_1-p_2 = 0$
Alternative hypothesis $H_a$: $p_1 - p_2 \ne 0$
Everywhere I have looked, it is required that both samples have at least $10$ successes and $10$ failures. I understand that for a single binomial distribution to be well approximated by a normal distribution, that condition needs to be met.
Here, the null hypothesis is that $p_1 = p_2$. The estimate of the common true proportion $p$ under the null is $\hat p = \frac{y_1 + y_2}{n_1 + n_2}$. Is it not enough that the combined numbers of successes and failures meet $y_1 + y_2 > 10$ and $n_1 + n_2 - y_1 - y_2 > 10$? Under the null hypothesis, would that imply that the individual sample proportions are approximately normally distributed if $\hat p n_1,\ (1 - \hat p) n_1,\ \hat p n_2,\ (1 - \hat p) n_2 > 10$? This would then further imply that $\hat p_1 - \hat p_2$ is approximately normally distributed.
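To make the construction concrete, the pooled estimate and the resulting $z$ statistic could be computed as follows (a Python sketch; the function name and the example counts are my own illustrative assumptions, not from any particular library):

```python
import math

def pooled_two_prop_z(y1, n1, y2, n2):
    """z statistic for H0: p1 = p2, using the pooled proportion estimate."""
    p_hat = (y1 + y2) / (n1 + n2)            # pooled estimate under H0
    p1_hat, p2_hat = y1 / n1, y2 / n2        # per-group sample proportions
    se = math.sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

# Illustrative counts: (y1, n1) = (45, 100), (y2, n2) = (60, 100)
z = pooled_two_prop_z(45, 100, 60, 100)
```

Under $H_0$, this statistic is approximately standard normal only when the expected counts in each group are large enough, which is exactly what the rule of thumb is meant to check.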
The "10 successes and failures" rule of thumb only informs whether you could reasonably use a normal approximation for the test statistic, or whether an exact test statistic should be used.
That said, you can immediately see that your proposed modification of the rule, which considers the overall number of successes and failures across the two groups, is inadequate by constructing an extreme example such as:
$$(y_1, n_1) = (1, 2), \quad (y_2, n_2) = (9000, 10000).$$
This easily meets your criterion, but the sample size in the first group is tiny, so the data carry very little information about the parameter $p_1$.
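A quick Python check (assuming the combined-count criterion exactly as stated in the question) confirms that the criterion is satisfied here while the expected counts in group 1 remain far below 10:

```python
y1, n1 = 1, 2
y2, n2 = 9000, 10000

# Pooled estimate of the common proportion under H0
p_hat = (y1 + y2) / (n1 + n2)

# Combined-count criterion proposed in the question: satisfied here
combined_ok = (y1 + y2 > 10) and ((n1 + n2) - (y1 + y2) > 10)

# Expected success/failure counts in group 1: nowhere near 10
group1_counts = (p_hat * n1, (1 - p_hat) * n1)
```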
Let's actually look at testing the hypothesis $$H_0 : p_1 = p_2 \quad \text{vs.} \quad H_a : p_1 \ne p_2. \tag{1}$$ I argue that this is nearly equivalent to the test $$H_0 : p_1 = 0.9 \quad \text{vs.} \quad H_a : p_1 \ne 0.9. \tag{2}$$ This is because the observed proportion in group 2 is based on an extremely large sample size; thus the information contained in the data about the true value of $p_2$ is very high. For instance, the standard error of $\hat p_2 = 0.9$ is $$SE(\hat p_2) = \sqrt{\frac{0.9(1-0.9)}{10000}} = 0.003.$$ Therefore, the $p$-value of the test for hypothesis $(1)$ will be very close to the $p$-value for hypothesis $(2)$.
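The standard error above can be verified numerically with plain Python (nothing beyond the standard library):

```python
import math

# SE of the sample proportion in group 2, with p2_hat = 0.9 and n2 = 10000
se_p2 = math.sqrt(0.9 * (1 - 0.9) / 10000)
# sqrt(9e-6) = 0.003
```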
With this in mind, it is easy to compute an exact $p$-value for hypothesis $(2)$, given the observed $y_1 = 1$ success in $n_1 = 2$ trials:
$$\begin{align} p &= \Pr[Y_1 \le 1 \mid p_1 = 0.9] \\ &= 1 - \Pr[Y_1 = 2 \mid p_1 = 0.9] \\ &= 1 - (0.9)^2 \\ &= 0.19. \end{align}$$
This is certainly inadequate evidence to reject $H_0$. In other words, if we have a coin where the chance of seeing heads on any given flip is $90\%$, the chance that we see $0$ or $1$ head in $2$ flips is $19\%$.
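The same number can be reproduced from the exact binomial probabilities (a short Python sketch; `binom_pmf` is a small helper defined here, not a library function):

```python
from math import comb

def binom_pmf(k, n, p):
    """Exact binomial probability Pr[Y = k] for n trials with success prob p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Pr[Y1 <= 1 | n = 2, p = 0.9] = Pr[Y1 = 0] + Pr[Y1 = 1]
p_value = binom_pmf(0, 2, 0.9) + binom_pmf(1, 2, 0.9)
# 0.01 + 0.18 = 0.19
```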
What this tells us is that both groups must have sufficient data, not just to use a normal approximation, but to even have the possibility of rejecting the null hypothesis when an exact test statistic is used. This is why your proposed rule of thumb is flawed.