Why does the "True Error" being $0$, i.e. $L_{(\mathcal{D}, f)}(h^{*}) = 0$, imply that the "Training Error" is also $0$, i.e. $L_{S}(h^{*}) = 0$?


I have read this question and I am confused by one part of the first answer, even though the same point is raised in the comments there.

I don't understand why $$L_{(\mathcal{D}, f)}(h^{*}) = 0 \implies L_{S}(h^{*}) = 0$$

Why, if the "True Error" is equal to $0$, i.e. $L_{(\mathcal{D}, f)}(h^{*}) = 0$, does it follow that the "Training Error" is also equal to $0$, i.e. $L_{S}(h^{*}) = 0$?
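For reference, my understanding of the two quantities involved (following the book's notation, writing $S=(x_1,\dots,x_m)$ for the training sample) is:

$$L_{(\mathcal{D}, f)}(h) \;=\; \underset{x \sim \mathcal{D}}{\mathbb{P}}\left[h(x) \neq f(x)\right], \qquad L_{S}(h) \;=\; \frac{\left|\{\, i \in [m] : h(x_i) \neq f(x_i) \,\}\right|}{m}.$$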

Also, I would like to clarify my understanding of $\mathcal{D}$.

My understanding of $\mathcal{D}$ is that it is some probability distribution over the input domain $\mathcal{X}$ (as stated in the book).

For example, let $\mathcal{X}=\{x_1,x_2,x_3\}$ and $\mathcal{Y}=\{0,1\}$. Let $f(x_i)=1$ if $x_i\geq 0$ and $0$ otherwise. Then let $\mathcal{D}$ be such that $P(x_1\geq 0)=1/2$, $P(x_2\geq 0)=1/3$, $P(x_3 \geq 0)=1/4$. Is this a reasonable example of how to interpret $\mathcal{D}$?
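To make my interpretation concrete, here is a small Python sketch in which I treat $\mathcal{D}$ as assigning a probability of being sampled to each point of $\mathcal{X}$. All concrete values, probabilities, and helper names below are made up for illustration:

```python
import random

# Hypothetical sketch: D is a categorical distribution over the finite
# domain X = {x_1, x_2, x_3}. The values and probabilities are invented.
X = [-1.0, 0.5, 2.0]        # concrete values standing in for x_1, x_2, x_3
weights = [0.5, 0.3, 0.2]   # D: probability of drawing each point (sums to 1)

def f(x):
    """The true labeling function: 1 if x >= 0, else 0."""
    return 1 if x >= 0 else 0

def sample_training_set(m, seed=0):
    """Draw S = ((x, f(x)), ...) of size m i.i.d. from D, labeled by f."""
    rng = random.Random(seed)
    xs = rng.choices(X, weights=weights, k=m)
    return [(x, f(x)) for x in xs]

def empirical_error(h, S):
    """L_S(h): fraction of training points that h mislabels."""
    return sum(1 for x, y in S if h(x) != y) / len(S)

S = sample_training_set(m=100)
# A hypothesis with true error 0 mislabels no point that D can produce,
# so its training error is 0 on any sample S drawn from D.
print(empirical_error(f, S))  # 0.0
```

In this picture, $\mathcal{D}$ gives the chance of each $x_i$ appearing in the sample, rather than the chance of an event like $x_i \geq 0$.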