I know the definitions of the three modes of convergence (almost sure, in probability, and in distribution). My question is: in statistical, machine learning, and/or engineering applications, do the stronger modes of convergence have any practical advantage over the weaker ones? One might guess that stronger means faster, but I know that is not necessarily true.
Can anyone point out the advantages, ideally with an example? Thanks.
Without a convergence rate, none of the convergence theorems are of much practical use. In particular, almost-sure convergence and convergence in probability are mainly of theoretical interest, while convergence in distribution is central to statistics but is usually combined with rules of thumb about when the convergence is "good enough".
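To illustrate the "good enough" rule of thumb, here is a minimal simulation sketch (assuming NumPy; all sample sizes are illustrative) that checks how quickly the standardized sample mean of a skewed distribution, Exponential(1), behaves like its N(0, 1) limit — the kind of check behind the usual "n ≥ 30" CLT folklore:

```python
import numpy as np

rng = np.random.default_rng(0)

def tail_prob(n, reps=50_000):
    # Estimate P(Z_n > 1.645), where Z_n = sqrt(n) * (mean - 1)
    # is the standardized mean of n draws from Exp(1) (mean 1, sd 1).
    x = rng.exponential(1.0, size=(reps, n))
    z = np.sqrt(n) * (x.mean(axis=1) - 1.0)
    return (z > 1.645).mean()

for n in (5, 30, 200):
    print(n, round(tail_prob(n), 3))
# The normal limit gives P(Z > 1.645) ~ 0.05; for small n the skewness
# of the exponential makes the right tail noticeably heavier than that.
```

The convergence-in-distribution theorem (the CLT) tells you the limit, but only a check like this tells you whether your n is in the regime where the approximation is usable.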
However, to answer your question: you would like to know that your estimator converges almost surely, since that not only implies convergence in probability but also guarantees that the experiment converges to the correct answer. That said, the convergence rate matters far more than the mode of convergence: an estimator that converges only in probability but at an exponential rate is much preferable to one that converges almost surely but very slowly.
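A toy numerical sketch of that trade-off (these are contrived error processes, not real estimators): estimator A converges almost surely but with error shrinking like 1/log(n), while estimator B converges only in probability — with probability 1/n it is off by a full unit, otherwise its error decays like exp(-n). At any practical sample size, B is almost always vastly more accurate:

```python
import numpy as np

rng = np.random.default_rng(1)

def error_A(n):
    # deterministic, hence almost-sure, convergence -- but painfully slow
    return 1.0 / np.log(n)

def error_B(n):
    # rare large excursions (probability 1/n) spoil almost-sure
    # convergence, but the typical error decays exponentially fast
    return 1.0 if rng.random() < 1.0 / n else np.exp(-n)

n = 1000
print(error_A(n))  # ~0.145
print(error_B(n))  # almost certainly ~0 (with probability 0.999)
```

The weaker mode of convergence with the better rate wins at every sample size you will ever use, which is the point of the paragraph above.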
In my own field, environmental engineering, the data are so messy that neither type of "first-order" convergence is very useful; it is much more important to know the shape of the uncertainty than how fast our risk estimates converge. For example, in probabilistic risk analysis I really need to know whether my uncertainty is Weibull, Gamma, or Gumbel, not whether a certain exposure estimate converges "almost surely".
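To see why the choice of family matters, here is a rough sketch (assuming SciPy is available; the exposure data and parameters are entirely synthetic) that fits Weibull, Gamma, and Gumbel models to the same data and compares the estimated 99th percentile — the tail quantity a risk assessor actually cares about:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# synthetic "exposure" data, drawn from a Gamma for illustration only
exposure = rng.gamma(shape=2.0, scale=5.0, size=2000)

for dist in (stats.weibull_min, stats.gamma, stats.gumbel_r):
    params = dist.fit(exposure)          # maximum-likelihood fit
    q99 = dist.ppf(0.99, *params)        # estimated 99th percentile
    print(dist.name, round(q99, 2))
# All three families track the bulk of the data, but their extreme
# quantiles -- the risk-relevant numbers -- can differ noticeably.
```

Mis-specifying the family shifts exactly the tail estimates that drive a risk decision, regardless of how fast any single point estimate converges.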