I am currently reading up on uniform convergence in probability and have been trying to familiarize myself with the uniform law of large numbers. As I am not a mathematician or statistician by training, however, I find myself struggling with a few aspects of its proof.
I'll try to be concise. However, I will gladly give a bit more context, if that helps.
Consider a probability space $(\Omega,\mathcal{F},\mathbb{P}_{\theta_0})$ and a sequence of random variables $X_j:\Omega\to\mathbb{R}$.
The assumptions for the following are:
a)$\quad$The parameter space $\Theta$ is compact
b)$\quad g(X,\theta)$ is continuous at each $\theta\in\Theta$ with probability one
c)$\quad g(X,\theta)$ is dominated by a function $G(X)$, i.e. $|g(X,\theta)|\leq G(X)$ for all $\theta\in\Theta$
d)$\quad E[G(X)]<\infty$
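To fix ideas, here is a small toy simulation I have been using to convince myself of what the theorem asserts. The choice of $g$, the distribution of $X$, and the grid approximation of $\Theta$ are all my own, not from the text: I take $g(x,\theta)=(x-\theta)^2$, $X\sim N(0,1)$ and $\Theta=[-1,1]$, so that $\text{E}[g(X,\theta)]=1+\theta^2$ is available in closed form.

```python
import numpy as np

# Toy check of the uniform LLN (my own example, not from the text):
# g(x, theta) = (x - theta)^2, X ~ N(0, 1), Theta = [-1, 1],
# so E[g(X, theta)] = 1 + theta^2 exactly.
rng = np.random.default_rng(0)

def sup_deviation(n, grid_size=201):
    """sup over a grid of Theta of |n^{-1} sum_j g(X_j, theta) - E[g(X, theta)]|."""
    thetas = np.linspace(-1.0, 1.0, grid_size)   # grid approximation of Theta
    x = rng.standard_normal(n)                   # draw X_1, ..., X_n
    m1, m2 = x.mean(), (x ** 2).mean()
    # n^{-1} sum_j (X_j - theta)^2 expanded as m2 - 2*theta*m1 + theta^2
    emp = m2 - 2.0 * thetas * m1 + thetas ** 2
    exact = 1.0 + thetas ** 2                    # E[(X - theta)^2]
    return np.abs(emp - exact).max()

for n in [100, 10_000, 1_000_000]:
    print(n, sup_deviation(n))
```

The supremum of the deviation over the grid shrinks as $n$ grows, which is the uniform convergence the theorem is about.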
Question 1
Am I supposed to read assumption b) as
$$\mathbb{P}\big(\theta\mapsto g(X,\theta)\ \text{is continuous on}\ \Theta\big)=1$$
or as
$$\mathbb{P}\big(\theta\mapsto g(X,\theta)\ \text{is continuous at}\ \theta^*\big)=1\quad\forall\,\theta^*\in\Theta\,?$$
If that is not clear from the way assumption b) is formulated, I can give an instance of where it is used. I do understand that the former implies the latter; I'd just like to know whether the latter would suffice.
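The example that made me suspect the two readings genuinely differ is the following (my own construction, so please correct me if it is off): take $X$ uniform on $(0,1)$, $\Theta=[0,1]$ and $g(X,\theta)=\mathbf{1}\{\theta\geq X\}$. For every fixed $\theta^*$ we have $$\mathbb{P}\big(\theta\mapsto g(X,\theta)\ \text{is continuous at}\ \theta^*\big)=\mathbb{P}(X\neq\theta^*)=1,$$ yet the sample path $\theta\mapsto g(X,\theta)$ always jumps at $\theta=X\in(0,1)$, so it is continuous on all of $\Theta$ with probability zero. So the second reading can hold while the first fails, which is why I would like to know which one the proof actually needs.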
More importantly:
Question 2
Somewhere in the middle of the proof, it is stated that $$\max_{k\in K}\sup_{\theta\in\Theta_k}\left[n^{-1}\sum_{j=1}^ng(X_j,\theta)-\text{E}\big[g(X_j,\theta)\big] \right]\leq\max_{k\in K}\left[n^{-1}\sum_{j=1}^n\sup_{\theta\in\Theta_k}g(X_j,\theta)-\text{E}\big[\inf_{\theta\in\Theta_k} g(X_j,\theta)\big] \right],$$ where, if I understand the setup correctly, the $\Theta_k$, $k\in K$, are finitely many subsets covering the compact set $\Theta$.
While it seems viscerally plausible to me, in the sense that I know that $$\sup_x\,\sum_j f_j(x)\leq\sum_j\sup_x f_j(x)$$ and since it would make sense that for every $\theta$ $$\text{E}\big[ g(X_j,\theta)\big]\geq\text{E}\big[\inf_{\theta_*\in\Theta_k} g(X_j,\theta_*)\big],$$ I am afraid I don't see how this inequality is established: in particular, why the infimum appears and how it moves inside the expectation operator. I would very much appreciate a step-by-step line of reasoning as to why it is true.
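For what it's worth, here is how far my own attempt gets (I am not sure this is the intended argument). For any fixed $\theta\in\Theta_k$, $$n^{-1}\sum_{j=1}^ng(X_j,\theta)-\text{E}\big[g(X_j,\theta)\big]\leq n^{-1}\sum_{j=1}^n\sup_{\theta_*\in\Theta_k}g(X_j,\theta_*)-\text{E}\big[\inf_{\theta_*\in\Theta_k}g(X_j,\theta_*)\big],$$ because the first term can only increase when each summand is replaced by its supremum over $\Theta_k$, and because $\inf_{\theta_*\in\Theta_k}g(X_j,\theta_*)\leq g(X_j,\theta)$ pointwise, so monotonicity of the expectation gives $-\text{E}\big[g(X_j,\theta)\big]\leq-\text{E}\big[\inf_{\theta_*\in\Theta_k}g(X_j,\theta_*)\big]$. Since the right-hand side no longer depends on $\theta$, one could then take the supremum over $\theta\in\Theta_k$ and afterwards the maximum over $k\in K$. Is this the intended reasoning, and is the domination by $G(X)$ what guarantees these expectations are well defined?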
Question 3
I might be at a loss here, but how do I see that, for arbitrary $\varepsilon>0$, $$o_p(1)-\varepsilon\leq\inf_{\theta\in\Theta}\left[n^{-1}\sum_{j=1}^ng(X_j,\theta)-\text{E}\big[g(X_j,\theta)\big] \right]\leq\sup_{\theta\in\Theta}\left[n^{-1}\sum_{j=1}^ng(X_j,\theta)-\text{E}\big[g(X_j,\theta)\big] \right]\leq o_p(1)+\varepsilon$$ implies that $$\sup_{\theta\in\Theta}\left|\,n^{-1}\sum_{j=1}^ng(X_j,\theta)-\text{E}\big[g(X_j,\theta)\big]\, \right|\,\overset{p}{\longrightarrow}0\,?$$
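My only guess so far (which I cannot quite complete) is that one uses the identity $$\sup_{\theta\in\Theta}\big|f(\theta)\big|=\max\Big(\sup_{\theta\in\Theta}f(\theta),\;-\inf_{\theta\in\Theta}f(\theta)\Big),$$ so that the displayed sandwich would bound the absolute supremum by $o_p(1)+\varepsilon$; since $\varepsilon>0$ is arbitrary, the limit should be $0$ in probability. However, I do not see how to handle the order of the limits in $n$ and $\varepsilon$ rigorously.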
Thank you very much for your help.
Best,
Jon