In the context of computational learning theory:
Let $f(w,z): \mathcal{W} \rightarrow \mathbb{R}$ be convex in $w \in \mathcal{W}$ where $\mathcal{W}$ bounded by R. Over $\mathcal{W}$, $|f| \leq C$.
Assuming that $f(w,z)$ is learnable in expectation, therefore exists algorithm B such that when given a sample $S \thicksim D^{m}, |S| = m > m(\epsilon)$ returns $w_S$ such that $$\mathop{{}\mathbb{E}}\limits_{S \thicksim D^{m}}[F(w_S)]\leq \min \limits_{w \in \mathcal{W}} F(w) + \epsilon$$
Where $F(w)$ is the true risk.
I need to show that for any $\epsilon, \delta \in [0,1]$, and $m \gt m(\epsilon,\delta)$, there exists an algorithm A that given samples $z_1, ... , z_{m_1}$ the algorithm outputs $\hat w$ such that $$F(\hat w) \leq \min \limits_{w \in \mathcal{W}} F(w) + \epsilon$$
I also need to provide a sample complexity bound in terms of $m(\epsilon)$ and C.
Here's what I have:
Clearly, $F(w) \geq \min \limits_{w \in \mathcal{W}} F(w)$ so $F(w) - \min \limits_{w \in \mathcal{W}} F(w) \geq 0$.
So Markov's inequality can used on the random variable $F(w) - \min \limits_{w \in \mathcal{W}} F(w)$:
$P(F(w) - \min \limits_{w \in \mathcal{W}} F(w) \geq \frac{E[F(w) - \min \limits_{w \in \mathcal{W}} F(w)]}{\delta}) \leq E[F(w) - \min \limits_{w \in \mathcal{W}} F(w)] \cdot \frac{\delta}{E[F(w) - \min \limits_{w \in \mathcal{W}} F(w)]} \leq \delta$
So, with probability $1-\delta$: $F(w) - \min \limits_{w \in \mathcal{W}} F(w) \leq \frac{E[F(w) - \min \limits_{w \in \mathcal{W}} F(w)]}{\delta} = \frac{E[F(w)] - \min \limits_{w \in \mathcal{W}} F(w)}{\delta} \leq \frac{\min \limits_{w \in \mathcal{W}} F(w) - \min \limits_{w \in \mathcal{W}} F(w) + \epsilon \cdot \delta}{\delta} = \epsilon$
Thus, $F(w) \leq \min \limits_{w \in \mathcal{W}} F(w) + \epsilon$. Using the assumption about the expectation of the risk, given sample complexity of $m(\epsilon \cdot \delta)$ samples.
So, I showed what was needed. But, My sample complexity does not mention C. This is probably wrong. I tried replaceing $F(w) - \min \limits_{w \in \mathcal{W}} F(w)$ with $F(w) +C$ and following the same logic, but didn't arrive at anything neat.
Please help!