Is there a polynomial (or series) expression for summing $S_d(a,N)=\sum_{k=0}^{N-1} \log(1+{1\over a+k \cdot d})$? (perhaps Bernoulli-type)


I need a quickly evaluatable expression for sums of consecutive logarithms of the type $$ S_{d}(a,N) = \log(1+ {1\over a})+\log(1+ {1\over a+d})+\log(1+ {1\over a+2d})+ \cdots + \log(1+ {1\over a+(N-1)d}) $$ Here $N$ is typically very large (as is $a$, while $d$ may be a small integer), which is why it is not feasible to evaluate the sum directly. The goal is to find some $a$ (not necessarily an integer), for given fixed $d$ and $N$, which gives the sum a pre-determined value, for instance by a binary search inside a range $[a_\min, a_\max]$; such a search needs to evaluate the sum multiple times.
I thought of some idea like assuming a Hurwitz-zeta-like ansatz of an infinite series and take the difference of two "exemplars" of them $$ \begin{array}{lll} S^{^\star}_{d}(a) &= \log(1+ {1\over a})+\log(1+ {1\over a+d})+\log(1+ {1\over a+2d})+ \cdots \\ S_d(a,N) & \overset?= S^{^\star}_{d}(a/d) - S^{^\star}_{d}(a/d+N) \end{array}$$

Fumbling with the Bernoulli polynomials transposed into a matrix of coefficients, I seem to have found a possible solution, but perhaps there is a well-known expression, perhaps involving the $\psi()$ function (in Pari/GP) or the like, or possibly even some more obvious expression.

Q: Is there a polynomial or power-series expression for the finite sum of $\log(1+1/a_k)$ with equally spaced $a_k$ (having integer difference)?

Update: I now have an exposition of the method, which employs the Bernoulli numbers / zeta values for an asymptotic series. It was too long to develop in an answer box here, so I began with a draft in a PDF file (see here) which I shall soon convert into a proper answer.

Accepted answer (by Claude Leibovici)

I do not know how much this could help.

$$\sum_{k=0}^{N-1} \log(1+{1\over a+k \, d})=\sum_{k=0}^{N-1} \log(1+ a+k \, d)-\sum_{k=0}^{N-1} \log( a+k \, d)$$ $$\sum_{k=0}^{N-1} \log(b+k \, d)=\log\left(\prod_{k=0}^{N-1} (b+k \, d) \right)=\log \left(b d^{N-1} \left(\frac{b+d}{d}\right)_{N-1}\right)=\log \left(\frac{b d^{N-1} \Gamma \left(\frac{b}{d}+N\right)}{\Gamma \left(\frac{b+d}{d}\right)}\right)$$ $$S_{d}(a,N) =\log \left(\frac{\Gamma \left(\frac{a}{d}\right) \Gamma \left(\frac{a+1}{d}+N\right)}{\Gamma \left(\frac{a+1}{d}\right) \Gamma \left(\frac{a}{d}+N\right)}\right)$$ Using the log gamma function could make the problem "simple" from a numerical point of view.
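The closed form can indeed be evaluated cheaply with a log-gamma function. A minimal Python sketch (the original work used Pari/GP; here the standard-library `math.lgamma` stands in, and the function names are illustrative):

```python
from math import lgamma, log

def S(a, d, N):
    """S_d(a,N) via the log-gamma closed form above."""
    return (lgamma(a / d) + lgamma((a + 1) / d + N)
            - lgamma((a + 1) / d) - lgamma(a / d + N))

def S_direct(a, d, N):
    """Direct summation, feasible only for moderate N (for checking)."""
    return sum(log(1 + 1 / (a + k * d)) for k in range(N))
```

For moderate $N$ the two agree to near machine precision, e.g. $S_3(10,100)\approx 1.1773591$ as in the tables further below.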

The derivative is $$\frac{\partial} {\partial a} S_{d}(a,N)=\frac{\psi \left(\frac{a+1}{d}+N\right)-\psi \left(\frac{a}{d}+N\right)+\psi \left(\frac{a}{d}\right)-\psi\left(\frac{a+1}{d}\right)}{d}$$

I tried using $d=3$, $N=10^6$ and solved $S_{3}(a,10^6)=4.567$ using Newton's method with $a_0=1$. The iterates are $$\left( \begin{array}{cc} n & a_n \\ 0 & 1.00000 \\ 1 & 2.19160 \\ 2 & 3.54105 \\ 3 & 4.19248 \\ 4 & 4.27054 \\ 5 & 4.27141 \end{array} \right)$$
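The iteration can be reproduced without an explicit $\psi$ implementation by differencing the log-gamma closed form numerically; a sketch, where the step size `h` and the helper names are ad-hoc choices:

```python
from math import lgamma

def S(a, d, N):
    # log-gamma closed form for S_d(a,N)
    return (lgamma(a / d) + lgamma((a + 1) / d + N)
            - lgamma((a + 1) / d) - lgamma(a / d + N))

def newton_solve(target, d, N, a0, h=1e-5, iters=25):
    """Solve S_d(a,N) = target by Newton's method, with the
    derivative approximated by a central difference."""
    a = a0
    for _ in range(iters):
        f = S(a, d, N) - target
        df = (S(a + h, d, N) - S(a - h, d, N)) / (2 * h)
        a -= f / df
    return a
```

With `target=4.567`, `d=3`, `N=10**6`, `a0=1.0` this settles near $a\approx 4.27141$, as in the table.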

It seems that the function looks like a hyperbola with an infinite branch as $a \to 0^+$. Trying polynomials does not seem (at least to me) to be a good idea.

Edit

If $N$ is really large, using Stirling approximations $$S_{d}(a,N)\sim \log \left(\frac{\Gamma \left(\frac{a}{d}\right)}{\Gamma \left(\frac{a+1}{d}\right)}\right)+\frac{\log \left({N}\right)}{d}$$ Now, using an expansion around $a=1$ would give as a crude estimate $$a=1+\frac {d \left(S_{d}(a,N)-\log \left(\frac{\Gamma \left(\frac{1}{d}\right)}{\Gamma \left(\frac{2}{d}\right)}\right)\right)-\log (N) } {\psi \left(\frac{1}{d}\right)-\psi \left(\frac{2}{d}\right) }$$

For the worked example, this gives almost exactly the first iterate of the Newton method: $2.191599091$ instead of $2.191599310$. So this does not add much beyond the first iterate.
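This crude estimate is easy to check numerically. Since the Python standard library has no digamma, $\psi$ is approximated below by a central difference of `lgamma` (a convenient stand-in, adequate at this accuracy):

```python
from math import lgamma, log

def psi(x, h=1e-6):
    # digamma approximated by a central difference of lgamma
    return (lgamma(x + h) - lgamma(x - h)) / (2 * h)

d, N, target = 3, 10**6, 4.567
# a = 1 + [d(S - log(Gamma(1/d)/Gamma(2/d))) - log N] / [psi(1/d) - psi(2/d)]
a_est = 1 + (d * (target - (lgamma(1 / d) - lgamma(2 / d))) - log(N)) \
            / (psi(1 / d) - psi(2 / d))
```

This reproduces the quoted estimate $2.191599091$ to several digits.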

For the generation of the starting guess, we can avoid the use of the $\Gamma(.)$ and $\psi(.)$ functions by using the integral over $k$ instead of the sum. This would then require solving (even approximately) for $a$ the equation $$d\,S_{d}(a,N)=-(a+d (N-1)) \log (a+d (N-1))+(a+d (N-1)+1) \log (a+d (N-1)+1)+a \log (a)-(a+1) \log (a+1)$$ Considering that $N$ is large, this could reduce to $$(a+1) \log (a+1)-a \log (a)=C\qquad \text{where} \qquad C=\log (edN)-d\,S_{d}(a,N)$$ A quick and dirty regression (data were generated for the range $0 \leq a \leq 10^4$) revealed the good approximation $$\log(a)=C-1$$

Applied to the worked example, this gives $a=3.364$.
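This last estimate is a one-liner; note that $a=e^{C-1}$ with $C=\log(edN)-d\,S$ simplifies to $a=dN\,e^{-dS}$. A sketch under the regression above:

```python
from math import exp, log

d, N, target = 3, 10**6, 4.567
C = 1 + log(d * N) - d * target   # C = log(e*d*N) - d*S_d(a,N)
a0 = exp(C - 1)                   # regression estimate log(a) = C - 1
```

For the worked example this yields $a_0\approx 3.364$.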

Answer (by the question's author)

The following are precision comparisons of the two methods. (Update: see the end.)


Precision


Claude Leibovici's solution is called "su_lngamma", and my own (provisional) approximation using the modified ZETA-matrix (involving the Bernoulli numbers), giving 16 or 32 coefficients of a power series, is called "su_zetamat". The default numerical precision in Pari/GP was $200$ decimal digits:

    N=100             d=3                                                                           
        a              su      su_lngamma        err_lngamma       su_zetamat         err_zetamat
  -----------------------------------------------------------------------------  ----------------
       10       1.1773591       1.1773591   (2.7863493 E-200)       1.1773591  (0.00000000030768858)
      100      0.46460322      0.46460322  (-4.4142713 E-200)      0.46460322       (3.6176459 E-27)
     1000     0.087531701     0.087531701   (1.9429811 E-199)     0.087531701       (3.4555879 E-44)
    10000    0.0098539050    0.0098539050   (1.7019105 E-198)    0.0098539050       (1.3751738 E-61)
   100000   0.00099851296   0.00099851296   (1.7766216 E-197)   0.00099851296       (1.7278168 E-79)
  1000000  0.000099985103  0.000099985103   (4.0133228 E-196)  0.000099985103       (1.7700314 E-97)

  N=1000000         d=3                                                                   
          a         su   su_lngamma      err_lngamma     su_zetamat           err_zetamat
  ----------------------------------------------------------------------  ----------------
         10   4.2376194   4.2376194  (-1.0962866 E-196)   4.2376194  (0.00000000030768858)
        100   3.4396673   3.4396673  (-9.8020930 E-196)   3.4396673       (3.6176459 E-27)
       1000   2.6692336   2.6692336  (-7.2429241 E-196)   2.6692336       (3.4959594 E-44)
      10000   1.9024033   1.9024033  (-4.3753696 E-196)   1.9024033       (3.4815251 E-61)
     100000   1.1446656   1.1446656   (1.5592125 E-196)   1.1446656       (3.4800708 E-78)
    1000000  0.46209837  0.46209837   (7.7997020 E-196)  0.46209837       (3.4800344 E-95)

Obviously the lngamma()-solution as implemented in Pari/GP provides maximal precision, while the Zeta-matrix with the $O(x^{16})$ resp. $O(x^{32})$ truncation of its (only asymptotic!) power series naturally has little precision for small start values $a$.

For the application in the problem (eventually searching an integer lower bound for $a$), the orders of the observed errors show that the errors themselves are a minuscule problem; but of course an arbitrary-precision solution is much preferable to an asymptotic one, if the other requirements are not too costly...


Timing


The average timing with large $a$ and $N$ was $1.3 \text{ msec}$ for the lngamma solution and $0.3 \text{ msec}$ for the matrix solution.
I have not yet checked the binary-search/Newton-iteration application for time consumption; I'll add this soon. (Thanks also to Claude for the simple(!) idea to apply Newton; I had only fumbled around with a binary search...)

Given some value $\Lambda$ for the sum-of-logarithms (given $d,N$) we can find a very nice estimate $a_\text{init}$ for the $a$ for the Newton-algorithm:
From $$ \Lambda = \sum_{k=0}^{N-1} \log(1+1/(a+k \cdot d)) $$ using the logarithmic expression which occurs in the zetamatrix-method (I'll explain this method soon in another answer) this gives
$$ a_\text{init} = N \cdot {d \over \exp(\Lambda\cdot d)-1} \\ \small{\text{with } \mid a-a_\text{init} \mid \lt 1 \text{ heuristically}} $$
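A quick numerical check of this estimate, using the log-gamma closed form from Claude Leibovici's answer to produce an exact $\Lambda$ for a known $a$ (names illustrative):

```python
from math import lgamma, expm1

def S(a, d, N):
    # log-gamma closed form for the sum of logarithms
    return (lgamma(a / d) + lgamma((a + 1) / d + N)
            - lgamma((a + 1) / d) - lgamma(a / d + N))

a, d, N = 1000, 3, 10**6
Lam = S(a, d, N)                    # the target value Lambda
a_init = N * d / expm1(Lam * d)     # a_init = N*d/(exp(Lambda*d) - 1)
```

Here $|a - a_\text{init}|$ comes out right at about $1$, consistent with the heuristic bound.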
$\qquad \qquad$ (1) (The previous, insufficient initialization has been moved to the bottom of the text.)

  • Binary search

With this I get an average time consumption for the binary search with the su_lngamma solution of about $37 \text{ msec}$ at my default real precision of $200$ digits.

  • Newton-Algorithm

Testing su_lngamma with the Newton algorithm (with initial values $a_\text{init}$, using random $\Lambda$ as targets) I get an average time of $14 \text{ msec}$, reaching nearly the full precision of $200$ decimal digits for the found $a$ which satisfies the equation.
Using the su_zetamat method I get an average finding time of $6.5 \text{ msec}$ with an error of about $10^{-80}$ for $a >1000$ and $32$ terms of the power series.
It is remarkable that the estimate $a_\text{init}$ is at most $1$ away from the true value $a$ in all experimental tests, and even converges from below as $N$ increases.


Numerical issues (Update)

When testing with high $N$, the lngamma()-method ran into numerical problems where the zetamatrix-method stayed stable. With the default precision of $200$ decimal digits, I took $N$ from the convergents of the continued fraction of $\beta=\log_2(3)$ and defined $\Lambda$, using $S=\lceil \beta \cdot N \rceil$, as $$\Lambda = S \log 2 - N \log 3$$
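Such test values $N$ can be generated from the continued fraction of $\beta=\log_2 3$; a float-precision sketch (reliable only for roughly the first dozen convergents, unlike the arbitrary-precision Pari/GP computation):

```python
from math import log2

def convergents(x, n):
    """First n continued-fraction convergents (p, q) of x,
    computed in plain double precision."""
    p0, q0 = 1, 0
    p1, q1 = int(x), 1
    out = [(p1, q1)]
    f = x - int(x)
    for _ in range(n - 1):
        if f == 0:
            break
        x = 1.0 / f
        a = int(x)
        f = x - a
        p0, q0, p1, q1 = p1, q1, a * p1 + p0, a * q1 + q0
        out.append((p1, q1))
    return out

convs = convergents(log2(3), 8)   # contains e.g. 19/12 and 65/41
```

The denominators $q$ serve as the values of $N$, and the numerators $p$ as the corresponding $S$.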
To apply the methods discussed here in the thread to the problem of estimating upper bounds for the minimal element $a_1$ of attempted Collatz cycles, the basic sum-of-logarithms entries must get a scaling factor of $m=3$, so that we try to fit the given $\Lambda$ with the sum $$ \small{\Lambda = \log\left(1+{1\over 3a_1}\right) + \log\left(1+{1\over 3(a_1+d)}\right) + \log\left(1+{1\over 3(a_1+2d)}\right) +\cdots+\log\left(1+{1\over 3(a_1+(N-1)d)}\right)} \\ \qquad \qquad \to \text{search for the $a_1$ that gives equality} $$ That additional coefficient $m$ can easily be introduced into the lngamma() and psi() expressions given by Claude Leibovici as well as into the matrix computation.

The convergents of the continued fraction give values of $\Lambda$ alternating toward zero and toward $\log 2$. With the default real precision of $200$ digits I could go up to the 16th convergent before the Newton algorithm with the lngamma()-method ran into numerical errors and diverged. I needed $800$ digits of internal precision to allow $N$ from around the $90$th convergents, while the zetamatrix-method always gave its approximations reliably.

Note that the initial values for the Newton algorithm came from the simple expression $$a_\text{ini}= {d \cdot N \over \exp( d \cdot m \cdot \Lambda)-1}= {3 N \over \exp( 9 \Lambda)-1}$$ which gave extremely good initial values, at most $1.3333\ldots$ away from the final value of $a_1$.
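For the first logged case below this initialization can be reproduced even in plain double precision (the data $N=79335$, $S=125743$ are from the log; accuracy is limited by the cancellation in $\Lambda$):

```python
from math import log, expm1

N, S, m, d = 79335, 125743, 3, 3
Lam = S * log(2) - N * log(3)       # tiny but positive, since S = ceil(N*log2(3))
a_ini = d * N / expm1(d * m * Lam)  # = 3N / (exp(9*Lambda) - 1)
```

This matches the logged value ini = 7215983491.01 to several digits.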

mystat(cvgts[1,12])
 N=79335  S=125743   m=3 d=3
 sub_cyc=90957.1975845
     cyc=272871.592753                 (lower bounds for a1) by "1-cycle"-formula
 ---------
     ini=7215983491.01                 (upper bounds for a1)
     seq=7215983492.34    by Newton-lngamma 16 msec
     seq=7215983492.34    by Newton-mat 16 msec
    mean=7216102492.69                    (by a_1 ~ mean(a_k) formula)

mystat(cvgts[1,22])
 N=6586818670  S=10439860591   m=3 d=3
 sub_cyc=32927907417.2
     cyc=98783722251.5                     (lower bounds for a1)
 ---------
     ini=2.16890155331 E20                 (upper bounds for a1)
     seq=diverg    by Newton-lngamma 15 msec
     seq=2.16890155331 E20    by Newton-mat 16 msec
    mean=2.16890155341 E20

Changing precision to 800 digits

prec=800 display=g0.12
mystat(cvgts[1,22])
 N=6586818670  S=10439860591   m=3 d=3
 sub_cyc=32927907417.2
     cyc=98783722251.5                     (lower bounds for a1)
 ---------
     ini=2.16890155331 E20                 (upper bounds for a1)
     seq=2.16890155331 E20    by Newton-lngamma 187 msec
     seq=2.16890155331 E20    by Newton-mat 47 msec
    mean=2.16890155341 E20

mystat(cvgts[1,32])
 N=5750934602875680  S=9115015689657667   m=3 d=3
 sub_cyc=3.13413394194 E15
     cyc=9.40240182583 E15                 (lower bounds for a1)
 ---------
     ini=1.80241993368 E31                 (upper bounds for a1)
     seq=1.80241993368 E31    by Newton-lngamma 187 msec
     seq=1.80241993368 E31    by Newton-mat 47 msec
    mean=1.80241993368 E31

mystat(cvgts[1,92])
 N=31319827079776296150692564373472726097745399  S=49640751450516424688384944890954638315952916   m=3 d=3
 sub_cyc=1.22784746078 E44
     cyc=3.68354238234 E44                 (lower bounds for a1)
 ---------
     ini=3.84559701519 E87                 (upper bounds for a1)
     seq=3.84559701519 E87    by Newton-lngamma 94 msec
     seq=3.84559701519 E87    by Newton-mat 16 msec
    mean=3.84559701519 E87

mystat(cvgts[1,93])
 N=252466599014583305866715048411999465710443694  S=400150092122719341414742538102872703744992098   m=3 d=3
 sub_cyc=0.333333333333
     cyc=1.00000000000                     (lower bounds for a1)
 ---------
     ini=1.48219138365 E42                 (upper bounds for a1)
     seq=1.48219138365 E42    by Newton-lngamma 140 msec
     seq=1.48219138365 E42    by Newton-mat 31 msec
    mean=1.21410770129 E44

Reducing precision to 400 digits

prec=400 display=g0.12
mystat(cvgts[1,24])
 N=137528045312  S=217976794617   m=3 d=3
 sub_cyc=370924750025.
     cyc=1.11277425008 E12                 (lower bounds for a1)
 ---------
     ini=5.10125558286 E22                 (upper bounds for a1)
     seq=5.10125558286 E22    by Newton-lngamma 31 msec
     seq=5.10125558286 E22    by Newton-mat 16 msec
    mean=5.10125558288 E22

mystat(cvgts[1,44])
 N=205632218873398596256  S=325919355854421968365   m=3 d=3
 sub_cyc=3.74500056370 E21
     cyc=1.12350016911 E22                 (lower bounds for a1)
 ---------
     ini=7.70092775595 E41                 (upper bounds for a1)
     seq=7.70092775595 E41    by Newton-lngamma 32 msec
     seq=7.70092775595 E41    by Newton-mat 15 msec
    mean=7.70092775595 E41

mystat(cvgts[1,54])
 N=2438425051595107335801557824  S=3864812267597295609689840565   m=3 d=3
 sub_cyc=1.76555780448 E27
     cyc=5.29667341343 E27                 (lower bounds for a1)
 ---------
     ini=4.30518038047 E54                 (upper bounds for a1)
     seq=diverg    by Newton-lngamma 47 msec
     seq=4.30518038047 E54    by Newton-mat 0 msec
    mean=4.30518038047 E54

mystat(cvgts[1,64])
   *** for: division by zero

Presumably the problem here lies with the Pari/GP implementation of the lngamma() and/or psi() functions, and possibly other software is more robust here.

Finally, it really amazes me that the $a_\text{ini}$ values are so precise without any Newton iteration at all, and I am considering simply accepting this value, possibly corrected a little upwards, as the upper bound for $a_1$ in this estimation method (based on the sum of logarithms of linearly progressing arguments).



Conclusion

Well, since the finding routines run in about $15 \text{ msec}$ (nearly independently of the size of $N$, $d$ and $\Lambda$), plus the advantage of the system-immanent precision of the lngamma() and psi() functions, I think the solution provided by Claude Leibovici is much preferable to my initial rough approach using the ansatz via Bernoulli-numbers/ZETA-matrix.


(1) The original, but insufficient, initialization was:
Assuming $d=0$, we can determine an upper bound for $a$ (call it $a_u$): $$ \Lambda = N \log(1+1/(a_u+0)) $$ then $$ a_u=1/(\exp(\Lambda/N)-1) \qquad \qquad \qquad \; \\ a_l=a_u-N \cdot d \qquad \qquad \text{lower bound} $$