Probability that a linear Diophantine equation has a solution

240 Views Asked by At

Given a linear Diophantine equation $(\mathcal D)$ with two variable $x$ and $y$:

$$(\mathcal D):\quad ax+by=c$$

I want to calculate the probability $P$ that $(\mathcal D)$ has a solution.

We know that $(\mathcal D)$ has a solution if, and only if, $d:=\gcd(a,b)$ divides $c$.

So if $a$ and $b$ are coprime, $(\mathcal D)$ has a solution, so using a result on coprime integers, we have

$$p\geqslant \mathbb P(d=1)=\frac 1{\zeta(2)}=\frac 6{\pi^2}.$$

Since $d$ must divides $c$, for each $d$ there is $\frac 1d$ integers which will work for $c$, so

$$p=\sum_{k=1}^\infty \mathbb P(d=k)\frac 1k.$$

Then we can majorate $\mathbb P(d=k)$ by $\mathbb P(d\geqslant k)=\frac 1k\times \frac 1k$ for $k\geqslant 2$.

So

$$p=\mathbb P(d=1)\times 1+\sum_{k=2}^\infty\mathbb P(d=k)\frac 1k\leqslant \frac 1{\zeta(2)}+\sum_{k=2}^\infty\frac 1{k^2}\frac 1k=\frac 1{\zeta(2)}+\zeta(3)-1.$$

What we get so far with our rough analysis is:

$$\frac 1{\zeta(2)}\leqslant p\leqslant \frac 1{\zeta(2)}+\zeta(3)-1$$

$$0.6079<p<0.8100.$$

I have read somewhere (if I understand the article correctly) that

$$\mathbb P(d=k)=\frac 1{\zeta(2)k^2}.$$

This would mean that

$$p=\sum_{k=1}^\infty \mathbb P(d=k)\frac 1k=\frac 1{\zeta(2)}\sum_{k=1}^\infty\frac 1{k^3}=\fbox{$\frac {\zeta(3)}{\zeta(2)}\approx0.7308$}.$$

My questions.

  • Thanks to comments, I have the new following question: can I formalize this approach to make it rigorous, with a well chosen probabilistic distribution?

  • I am really unused to do probabilities, so is the following approach correct?

  • Can we generalize the result / the approach to find $p_n$ the probability that a linear Diophantine equation with $n$ variables has a solution?

  • If we can't get an exact value for $n$ variables, can we get a numerical approximation of $p_n$?

  • What is the limit of $p_n$?

1

There are 1 best solutions below

0
On

The modeling

The text does not mention the adopted signal model. But this model is presented in the article cited. This allows us to take into account the model in which $a,\ b,\ c$ distributed over a uniform discrete law $$a,b,c\in[1,n]\cap\mathcal N,\tag1$$ $$\Pr(a=m) = \Pr(b=m) = \Pr(c=m) = \begin{cases} {1\over n},\quad\text{if}\quad m\in[1,n]\cap\mathcal N,\\ 0,\quad\text{otherwise}. \end{cases}$$

It will be also considered the series of dividers $$p_{0\dots5} = 2,3,4,5,7,9;\quad p_{k} = k\log(k)\text{ for }k>5.\tag2$$ (the table for $k>5$ is there).

The index $k_{max}$ of greatest simple divider for $k\in[1,n]$ must satisfy the inequality $$k_{max}(n)\log(k_{max}(n)) < n,$$ $$k_{max}(n) = [e^{W(n)}],$$ where $W(x)$ is the Lambert W function (see also Wolfram Alpha)

It should be borne in mind that in the model under consideration, prime divisors are computed approximately, and accounting for the powers of prime divisors is bounded by the cases $ p_k = 4,\ 9.$ This could reduce the final result.

Will be assumed that $$\Pr(p\mid m) = {1\over p}\Pr(p\le m) = {n-p\over np},\quad\text{if}\quad m\in[1,n]\cap\mathcal N.$$

Calculations for the fixed $\mathbf n$

Let $S(p_k)$ is event "parameters $p,a,b,c$ allow the solution", then the opposite event $\overline{S(p_k)}$ is

$$\overline{S(p_k)} = (p_k\mid a) \wedge (p_k\mid b) \wedge (p_k\nmid c),$$ where the events $(p_k\mid a),\ (p_k\mid b),\ (p_k\nmid c)$ are independent and $$\Pr(p\mid a) = \Pr(p\mid b) = \Pr(p\mid c) = {n-p\over np},\quad \Pr(p\nmid c) = 1-{n-p_k\over np_k},$$ so $$\Pr(\overline S(p_k)) = \left({n-p_k\over np_k}\right)^2\left(1-{n-p_k\over np_k}\right),$$ $$\Pr(S(p_k)) = 1 - \left({n-p_k\over np_k}\right)^2\left(1-{n-p_k\over np_k}\right).\tag3$$

If $k=0\dots5$, then
$$p_k << n\rightarrow \Pr(S(p_k)) = 1-{p_k-1\over p_k^3},$$

The probability that the Diophantine equation has a solution will be asserted as the product of the corresponding probabilities for each of the dividers $p_k$. For example, $$P_{(2\dots9)} = \Pr(S(2)\wedge S(3)\wedge S(4)\wedge S(5)\wedge S(7)\wedge S(9)) = {7\over8} \cdot{25\over27} \cdot{61\over64} \cdot{121\over125} \cdot{337\over343} \cdot{721\over729},$$ $$P_{(2\dots9)}\approx0.726362.$$ Then for $n>>10$ $$P_n \approx \Pr(\bigwedge\limits_{k=0\dots k_{max}(n)} S) = P_{(2\dots9)} \prod\limits_{k=6}^{k_{max}(n)} \Pr(S(k\log(k))),$$ $$P_n\approx 0.726362e^{R(n)},\tag4$$ $$R(n) = \sum\limits_{k=6}^{[e^{W(n)}]} \left.\log\left(1 - \left({n-p_k\over np_k}\right)^2\left(1-{n-p_k\over np_k}\right)\right)\right|_{p_k=k\log(k)} \tag5.$$ For example, $$R(100) \approx -0.017900$$ (see also exp W(100) and R(100)), $$P_{100} \approx 0.726362e^{-0.017900} = 0.713476.$$ For $n>100$ $$P_n\approx 0.713476e^{Q(n)},\tag6$$ $$Q_n = \sum\limits_{k=30}^{[e^{W(n)}]} \lim\limits_{p_k\to k\log(k)}\log\left(1 - \left({n-p_k\over np_k}\right)^2\left(1-{n-p_k\over np_k}\right)\right),$$ $$Q_n \approx -\sum\limits_{k=30}^{[e^{W(n)}]} \lim\limits_{p_k\to k\log(k)}\left({n-p_k\over np_k}\right)^2\left(1-{n-p_k\over np_k}\right),$$ For example, $$Q(100000) \approx -0.001939$$ (see also exp W(100000) and Q(100000)), $$P_{100000} \approx 0.713476e^{-0.001939} = 0.712094.$$

This approach gives $$p\approx 0.712$$

On the other hand, direct multiplication of probabilities for factors $$p\in\{2,3,4,5,7,8,9,11,13,16,17,19,23,25,27,29,31,32,37,41,47\}$$ gives $$\boxed{p<0.641796}.$$

Conclusion

Comparison of the OP estimates with the results of accurate modeling shows that the evaluation of the required probability should be substantially reduced.

In the same time, the stability of the model parameters is achieved due to the identity of the laws of the distribution of the parameters $a,b,c.$

The proposed model can be used for the estimations based on the another distribution laws.