Binary Classification : Prove that $\mathbb{E}_{\mathcal{D}_n}\left[R_e(h)\right] = R(h)$

69 Views Asked by Bumbble Comm At 31 Mar 2026 - 7:10

I originally posted this question on Cross Validated but thought it might be more relevant here, since the answer I seek involves mathematical manipulation more than statistical technique.

Problem Statement

Let $h \in \mathcal{H}$ be a hypothesis from some class of binary classifiers $\mathcal{H}$. Show that $$\mathbb{E}_{\mathcal{D}_n}\left[R_e(h)\right] = R(h)$$ where the expectation on the LHS is over all possible training datasets $\mathcal{D}_n$ of size $n$.

$R_e(h)$ is the empirical risk of the hypothesis $h$ over a given dataset $\mathcal{D}_n$. It is defined as

$$R_e(h) = \frac1n\sum_{i=1}^{n}\mathcal{L}(x_i, h)$$

Here $\mathcal{L}$ is the loss function for the binary classification problem, defined as $$\mathcal{L}(x,h) = \begin{cases} 1, & s(x) \neq h(x) \\ 0, & \text{otherwise} \end{cases} $$

$s(x)$ is the system we are trying to model.

$R(h)$ is the true risk of the hypothesis $h$.
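To make the definitions concrete, here is a minimal numerical sketch of the claim. Everything specific in it is an assumption chosen for illustration and is not part of the question: inputs are drawn i.i.d. from Uniform(0, 1), the target `s` and the hypothesis `h` are simple thresholds, and `h` is fixed in advance rather than fitted to the sampled dataset. Under those assumptions the true risk is the probability that `s` and `h` disagree, which is 0.1.

```python
import random


def s(x):
    # Hypothetical target system: labels x as positive when x > 0.5.
    return x > 0.5


def h(x):
    # Hypothetical fixed hypothesis with a slightly shifted threshold.
    return x > 0.6


def loss(x, hyp):
    # 0-1 loss from the question: 1 when s(x) != hyp(x), 0 otherwise.
    return 1 if s(x) != hyp(x) else 0


def empirical_risk(dataset, hyp):
    # R_e(h): average 0-1 loss over one dataset D_n.
    return sum(loss(x, hyp) for x in dataset) / len(dataset)


def mean_empirical_risk(n, trials, rng):
    # Monte Carlo estimate of E_{D_n}[R_e(h)]: average R_e(h) over many
    # independently sampled datasets of size n.
    total = 0.0
    for _ in range(trials):
        dataset = [rng.random() for _ in range(n)]
        total += empirical_risk(dataset, h)
    return total / trials


rng = random.Random(0)  # fixed seed so the run is reproducible
estimate = mean_empirical_risk(n=20, trials=20000, rng=rng)
true_risk = 0.1  # P(0.5 < x <= 0.6) when x ~ Uniform(0, 1)
print(estimate, true_risk)
```

The two printed numbers should be close for any dataset size `n`, which is what the identity predicts. A simulation like this only illustrates the statement; it does not replace the proof.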

My work

$$R_e(h) = \frac1n\sum_{i=1}^{n}\mathcal{L}(x_i, h)$$ $$\mathbb{E}_{\mathcal{D}_n}\left[R_e(h)\right] = \int_{\mathcal{D}_n}{R_e(h)\,p(\mathcal{D}_n)\,d\mathcal{D}_n}$$ $$ = \frac{1}{n}\int_{\mathcal{D}_n}{\sum_{x_i \in \mathcal{D}_n}\mathcal{L}(x_i, h)\,p(\mathcal{D}_n)\,d\mathcal{D}_n}$$

Since I want to manipulate this into $R(h) = \int_{x}{\mathcal{L}(x,h)p(x)dx}$, I thought of grouping all the $x_i$ terms out of the above equation. But then I couldn't find a way to get the term $p(x)$ into the picture, and this is where I am stuck.

I am looking for progressive hints that will help me solve this myself. Thanks!
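
One possible direction, sketched under two assumptions that the question does not state explicitly: the $x_i$ are drawn i.i.d. from $p(x)$, and $h$ is fixed independently of $\mathcal{D}_n$. The first assumption is what brings $p(x)$ into the picture, because the joint density of the dataset factorizes:

$$p(\mathcal{D}_n) = \prod_{j=1}^{n} p(x_j), \qquad d\mathcal{D}_n = dx_1 \cdots dx_n$$

For a single term of the sum, every factor with $j \neq i$ then integrates to $1$:

$$\mathbb{E}_{\mathcal{D}_n}\left[\mathcal{L}(x_i, h)\right] = \int \mathcal{L}(x_i, h)\,p(x_i)\,dx_i \,\prod_{j \neq i} \int p(x_j)\,dx_j = \int_{x}{\mathcal{L}(x,h)\,p(x)\,dx} = R(h)$$

Swapping the finite sum with the integral (linearity of expectation) would then give

$$\mathbb{E}_{\mathcal{D}_n}\left[R_e(h)\right] = \frac1n\sum_{i=1}^{n}\mathbb{E}_{\mathcal{D}_n}\left[\mathcal{L}(x_i, h)\right] = \frac1n \cdot n\,R(h) = R(h)$$

If $h$ were instead chosen using $\mathcal{D}_n$, the second step would no longer go through as written, since $\mathcal{L}(x_i, h)$ would then depend on all of the $x_j$.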
