Suppose $h$ is a discrete random variable taking value $h_i$ with probability proportional to $h_i$, whose moments $M_k=E[h^k]$ exist for $k=1,2,\ldots$. Is it possible to bound, in terms of the $M_k$, the values of $s$ for which the following holds?
$$E[(1-h)^s]<\epsilon$$
Edit: This problem comes up in the average-case analysis of gradient descent. Here $s$ gives the number of gradient descent steps needed to reach average loss target $\epsilon$ when minimizing a quadratic form whose eigenvalues are a multiple of $h_1,h_2,\ldots$. Algorithms exist for computing $M_k$ in practice for small values of $k$, but an algorithm for $s$ is missing. A different approach, which doesn't rely on $M_k$ but makes strong assumptions on the form of $h$, is given here
A partial way towards a solution.
Use the following approximation: $$(1-h)^s \approx \exp(-hs)$$
Now, without loss of generality, rescale so that the $h_i$ sum to 1 (rescaling $s$ accordingly). Then $p(h=h_i)=h_i$, and we can write the expectation as
$$E[(1-h)^s]\approx \sum_{i} \exp[-h_i s] h_i$$
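As a quick sanity check on this approximation, here is a small numerical sketch; the spectrum `h` is a made-up example, normalized so the $h_i$ sum to 1:

```python
import numpy as np

# Hypothetical example spectrum, normalized so the h_i sum to 1.
h = np.array([0.5, 0.3, 0.15, 0.05])
s = 20

# Exact expectation: sum_i h_i * (1 - h_i)^s, since p(h = h_i) = h_i.
exact = np.sum(h * (1 - h) ** s)

# exp approximation: sum_i h_i * exp(-h_i * s).
approx = np.sum(h * np.exp(-h * s))

print(exact, approx)
```

Since $\exp(-x)\ge 1-x$ on $[0,1]$, the approximation over-estimates the expectation, but for this example the two agree to within roughly 10%.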
This is the moment generating function $g(s)$ of the distribution that puts mass $h_i$ at the point $-h_i$: $$p(-h_i)=h_i$$
Hence the problem comes down to finding the smallest $s$ such that $g(s)<\epsilon$.
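Since $g$ is strictly decreasing in $s$, the smallest such $s$ can be found by bisection once $g$ is computable. A minimal sketch, using the same hypothetical spectrum as above:

```python
import numpy as np

h = np.array([0.5, 0.3, 0.15, 0.05])  # hypothetical normalized spectrum
eps = 1e-3

def g(s):
    """g(s) = sum_i h_i * exp(-h_i * s), the mgf of p(-h_i) = h_i."""
    return np.sum(h * np.exp(-h * s))

# g is strictly decreasing, so bisect for the smallest s with g(s) < eps.
lo, hi = 0.0, 1.0
while g(hi) >= eps:  # grow the bracket until g(hi) < eps
    hi *= 2
for _ in range(60):  # maintain g(lo) >= eps > g(hi)
    mid = 0.5 * (lo + hi)
    if g(mid) < eps:
        hi = mid
    else:
        lo = mid
print(hi)  # smallest s (to bisection tolerance) with g(s) < eps
```

In practice $g$ is not directly computable from the $M_k$ alone, which is exactly the gap the question asks about; the bisection only shows that bounding $g$ suffices to bound $s$.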
The Taylor expansion of $g$ at $s=0$ is written in terms of the moments, $$g(s)=\sum_{k\ge 0}\frac{(-s)^k}{k!}M_k,\qquad M_0=1,$$ so what remains is to bound the error of the Taylor approximation when the terms beyond the first $k$ are dropped.
Edit: The Taylor approximation actually tends to give quite a poor fit to the solution. Example of plotting the true mgf vs the first 5 approximation orders:
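The poor fit is easy to reproduce numerically: the truncated series $\sum_{k\le K}(-s)^k M_k/k!$ matches $g$ only near $s=0$ and blows up for moderate $s$ (already at first order it goes negative, while $g>0$ everywhere). A sketch with the same hypothetical spectrum:

```python
import numpy as np
from math import factorial

h = np.array([0.5, 0.3, 0.15, 0.05])  # hypothetical normalized spectrum

def g(s):
    """True mgf: g(s) = sum_i h_i * exp(-h_i * s)."""
    return np.sum(h * np.exp(-h * s))

def taylor_g(s, K):
    """Degree-K Taylor polynomial of g at s=0: sum_{k<=K} (-s)^k M_k / k!."""
    M = [np.sum(h * h ** k) for k in range(K + 1)]  # M_k = E[h^k], M_0 = 1
    return sum((-s) ** k * M[k] / factorial(k) for k in range(K + 1))

for s in [0.1, 1.0, 5.0, 20.0]:
    print(s, g(s), [round(float(taylor_g(s, K)), 4) for K in range(1, 6)])
```

This is the usual failure mode of truncating an alternating series at an argument where the dropped terms are not small; it suggests the moments need to enter through something better behaved than a raw Taylor polynomial.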