I am trying to calculate $y = 1-(1-x)^n$ for $n$ in the 50-500 range, where $x$ ranges widely between, say, $10^{-20}$ and 1.
The straightforward way is not numerically robust, and loses precision under $y < 10^{-15}$ or so. Example in Python:
>>> import numpy as np
>>> x = np.logspace(-18,0,19)
>>> x
array([ 1.00000000e-18, 1.00000000e-17, 1.00000000e-16,
1.00000000e-15, 1.00000000e-14, 1.00000000e-13,
1.00000000e-12, 1.00000000e-11, 1.00000000e-10,
1.00000000e-09, 1.00000000e-08, 1.00000000e-07,
1.00000000e-06, 1.00000000e-05, 1.00000000e-04,
1.00000000e-03, 1.00000000e-02, 1.00000000e-01,
1.00000000e+00])
>>> 1 - (1-x)**20
array([ 0.00000000e+00, 0.00000000e+00, 2.22044605e-15,
1.99840144e-14, 1.99840144e-13, 2.00062189e-12,
1.99995576e-11, 2.00000017e-10, 2.00000017e-09,
1.99999992e-08, 1.99999982e-07, 1.99999810e-06,
1.99998100e-05, 1.99981001e-04, 1.99810114e-03,
1.98111352e-02, 1.82093062e-01, 8.78423345e-01,
1.00000000e+00])
The binomial formula says that I could alternatively calculate $y = nx - \frac{n(n-1)}{2}x^2 + \frac{n(n-1)(n-2)}{6}x^3 - \ldots - {n\choose k}(-x)^k + \ldots - n(-x)^{n-1} - (-x)^n$.
But that doesn't seem numerically robust (because of the size of $n\choose k$ for large n) unless I cut off the series after an appropriate number of terms that depends on each value I'm trying to compute.
For example, $nx - \frac{n(n-1)}{2}x^2$ is very accurate for $x$ in the $10^{-20}$ to $10^{-10}$ range. But I need more terms the larger $x$ gets.
>>> 20*x - 20*19/2*x*x
array([ 2.00000000e-17, 2.00000000e-16, 2.00000000e-15,
2.00000000e-14, 2.00000000e-13, 2.00000000e-12,
2.00000000e-11, 2.00000000e-10, 2.00000000e-09,
1.99999998e-08, 1.99999981e-07, 1.99999810e-06,
1.99998100e-05, 1.99981000e-04, 1.99810000e-03,
1.98100000e-02, 1.81000000e-01, 1.00000000e-01,
-1.70000000e+02])
Is there a better way to calculate this value, that works not only for large x but for small x?
Compute it as
For small $x$,
log1p(-x)computes the expected $\ln(1-x)=-x+O(x^2)$. For small $u$,expm1(u)computes $e^u-1=u+O(u^2)$. In combination, you should get the best accuracy for your formula.which returns with
You will have to switch formulas for $x\ge 1$, as then the logarithm is no longer defined as real function.