Utility of the definition of non-uniform learnability


The question concerns the book *Understanding Machine Learning* by Shalev-Shwartz & Ben-David and its definition of non-uniform learnability. Suppose $\mathcal{H}$ is non-uniformly learnable, that is,

Def: There exists an algorithm $A$ and a function $m_{\mathcal{H}}^{NUL}:(0,1)^2 \times \mathcal{H} \rightarrow \mathbb{N}$ such that for any $\epsilon, \delta\in (0,1)$ and any $h\in\mathcal{H}$, if $m\geq m_{\mathcal{H}}^{NUL}(\epsilon, \delta, h)$, then for any distribution $\mathcal{D}$ over the sample space, if the training sample $S$ has size $m$, then with probability at least $1-\delta$ we have $$L_{\mathcal{D}}(A(S)) \leq L_{\mathcal{D}}(h) + \epsilon,$$ where $L_{\mathcal{D}}$ denotes the true risk, i.e. the probability of misclassification.

Now, if I show that some algorithm $A$ and a function $m_{\mathcal{H}}^{NUL}$ exist for which this holds, what is the utility of knowing my class $\mathcal{H}$ is non-uniformly learnable? If our class is agnostic PAC-learnable, we know that, with probability at least $1-\delta$, our error is bounded by the best hypothesis in the class, \begin{equation} L_{\mathcal{D}}(A(S)) \leq \min_{h\in \mathcal{H}}L_{\mathcal{D}}(h) + \epsilon, \end{equation} provided our sample size is sufficient. But in non-uniform learnability I only know that the returned hypothesis can "compete" with a specific $h\in \mathcal{H}$ if we provide $m_{\mathcal{H}}^{NUL}(\epsilon, \delta, h)$ samples. So it does not tell me how many samples I need to compete with all $h\in \mathcal{H}$ simultaneously. For that I would need to know the $h$ that minimizes $L_{\mathcal{D}}(h)$ in order to determine my sample size.
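To make the dependence on $h$ concrete, here is a minimal numerical sketch. It assumes a countable class $\mathcal{H} = \{h_1, h_2, \dots\}$ with prior weights $w_n = \frac{6}{\pi^2 n^2}$ (so $\sum_n w_n = 1$) and uses an SRM/MDL-style bound of the form $m \geq \frac{\ln(1/w_n) + \ln(2/\delta)}{2\epsilon^2}$; the exact constants are illustrative, not the book's, but the qualitative point stands: the required sample size grows with the "complexity" (index) of the hypothesis one wants to compete with.

```python
import math

def m_nul(eps, delta, n):
    """Illustrative non-uniform sample-size bound for the n-th hypothesis
    of a countable class, with prior weights w_n = 6 / (pi^2 * n^2).
    Constants are for illustration only (SRM/MDL-style bound)."""
    w_n = 6.0 / (math.pi ** 2 * n ** 2)
    return math.ceil((math.log(1.0 / w_n) + math.log(2.0 / delta))
                     / (2.0 * eps ** 2))

# The sample size needed to compete with h_n grows with n:
for n in (1, 10, 1000):
    print(f"m_NUL(0.1, 0.05, h_{n}) = {m_nul(0.1, 0.05, n)}")
```

This is exactly the asker's point viewed from the other side: no single $m$ works for every $h$, but as soon as you fix which $h_n$ you want to compete with, the definition hands you a concrete, finite sample size.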