Calculating the expected value of the number of samples belonging to one class in the 'best' fold of cross-validation

12 Views Asked by Bumbble Comm At 27 Feb 2026 - 12:32

Suppose we run $k$-fold cross-validation on a dataset of $N$ samples, of which $R$ ($0\le R\le N$) belong to class A and the remaining $N-R$ belong to class B. Among the $k$ folds, one will contain the most class A samples. How do I calculate the expected number of class A samples in this 'best' fold, $\mathbb{E}[x_{max}]$? I understand that the class A count in each individual fold follows a hypergeometric distribution, and I can probably get a good estimate of the expected value from the density of the maximum of $k$ i.i.d. variables, where $F$ and $f$ are the CDF and density of a single fold's count: $$f_{X_{(k)}}(x) = \frac{d}{dx}F_{X_{(k)}}(x) = \frac{d}{dx}[F(x)]^k = k[F(x)]^{k-1}f(x)$$

However, this approach assumes the $k$ folds are independent, which they are not: they share a fixed total of $R$ class A samples, so a high count in one fold forces lower counts in the others. Is there a way to calculate the expected value more accurately, and is it possible to calculate the variance as well? Thank you!
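
For reference, here is a minimal Monte Carlo sketch of the quantity I am after, which could serve as a baseline for checking any closed-form answer. It rests on assumptions that are not stated above: the folds are made as equal in size as possible, samples are assigned to folds uniformly at random (no stratification), and the function name `simulate_best_fold` and its parameters are only illustrative. Under those assumptions the class A counts across the folds are jointly multivariate hypergeometric (the fold sizes play the role of the "colors" and $R$ is the number of draws), which accounts for the dependence between folds.

```python
import numpy as np

def simulate_best_fold(N, R, k, n_trials=100_000, seed=0):
    """Estimate mean and variance of the largest class A count over k folds.

    Assumes unstratified, uniformly random fold assignment and fold sizes
    that differ by at most one (illustrative assumptions).
    """
    rng = np.random.default_rng(seed)

    # Fold sizes as equal as possible: the first N % k folds get one extra sample.
    fold_sizes = np.full(k, N // k)
    fold_sizes[: N % k] += 1

    # Each row holds the class A counts of the k folds for one random split.
    # Rows always sum to R, so the folds are correctly dependent.
    counts = rng.multivariate_hypergeometric(fold_sizes, R, size=n_trials)

    # Class A count in the 'best' fold for each simulated split.
    x_max = counts.max(axis=1)
    return x_max.mean(), x_max.var(ddof=1)

if __name__ == "__main__":
    mean, var = simulate_best_fold(N=100, R=30, k=5)
    print(f"estimated E[x_max] = {mean:.3f}, Var[x_max] = {var:.3f}")
```

This only gives a numerical estimate, not the exact expression I am asking for, but comparing it with the i.i.d. formula should show how much the independence assumption distorts the result.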
