Elementary proof of Wright-Fisher's time until fixation formula?


Consider the Markov process $\def\Z{\mathbb{Z}}\def\N{\mathbb{N}}\def\E{\text{E}}\newcommand{\bp}[1]{{\left({#1}\right)}}(X_t^n)_{t\in\Z_+}$ given by $X_0^n = p \in [0,1]$ and $X_{t+1}^n|X_t^n \sim \frac1n \text{Binomial}(n, X_t^n)$, where $n\in\N$. In other words, $X_t^n \in \{0, \frac1n, \ldots, \frac{n-1}n, 1\}$ for $t\geq 1$, and $$\Pr\bp{X_{t+1}^n = \frac kn \mid X_t^n = p} = \binom{n}{k} p^k (1-p)^{n-k}.$$ It's easy to see that $X_t^n$ is a martingale and it converges almost surely to either $0$ or $1$. Let $\tau_n = \min\{t\geq1 : X_t^n \in \{0, 1\}\}$ be the time until fixation. It's known that $$\lim_{n\to\infty} \E\bp{\frac{\tau_n}n \mid X_0=x} = -2x\log x - 2(1-x)\log(1-x).$$ The proof of that formula requires using a diffusion approximation, which I find super hard to construct (it's done rigorously in this book, for example). Can you help me find an elementary proof?

My idea is the following: let $f_n(x) := \E(\frac1n\tau_n|X_0=x)$. First-step analysis gives $f_n(x) = \frac1n + \sum_{k=1}^{n-1} p_n(x,k) f_n(k/n)$ for $x\in[0,1]$, where $p_n(x,k) := \binom{n}{k} x^k (1-x)^{n-k}$. Since $f_n(0) = f_n(1) = \frac1n$, we can extend the sum over all $k$ and write $$\sum_{k=0}^n p_n(x,k) \bp{f_n\bp{\frac kn} - f_n(x)} + \frac1n\bp{1 - p_n(x,0) - p_n(x,n)} = 0.$$ For each $n$, $f_n(x)$ is a polynomial in $x$, so we can do a Taylor expansion $$\sum_{k=0}^n p_n(x,k) \bp{f_n'(x)\bp{\frac kn - x} + \frac12 f_n''(x) \bp{\frac kn - x}^2 + \frac16 f_n'''(\xi_k) \bp{\frac kn - x}^3} + \frac1n\bp{1 - p_n(x,0) - p_n(x,1)} = 0.$$ We have $\sum_{k=0}^n p_n(x,k)\bp{\frac kn - x} = 0$ and $\sum_{k=0}^n p_n(x,k) \bp{\frac kn - x}^2 = \frac1n x(1-x)$, so multiplying by $n$ we get $$\frac12 f_n''(x)\, x(1-x) = -1 + R_n(x),$$ where $$|R_n(x)| \leq \frac n6 \|f_n'''\|_\infty \sum_{k=0}^n p_n(x,k)\left|\frac kn - x\right|^3 + p_n(x,0) + p_n(x,n) = O\bp{\frac{\|f_n'''\|_\infty}{\sqrt n}} + (1-x)^n + x^n,$$ since the absolute third central moment of $\frac1n\text{Binomial}(n,x)$ is $O(n^{-3/2})$ (the signed moment is even smaller: $\sum_{k=0}^n p_n(x,k)\bp{\frac kn - x}^3 = \frac{1}{n^2} x(1-x)(1-2x)$). If we can prove that $f_n', f_n'', f_n'''$ are bounded uniformly in $n$ we can pass to the limit, and the solution of $\frac12 f''(x)\,x(1-x) = -1$ with $f(0)=f(1)=0$ is exactly $-2x\log x - 2(1-x)\log(1-x)$. This doesn't sound so hard.
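In the meantime, the finite-$n$ quantity is cheap to compute: over the interior states, the vector of expected absorption times solves the linear system $(I-Q)\,t=\underline1$, where $Q$ is the interior block of the transition matrix. A small sketch in Python/NumPy (the helper name `wf_mean_fixation` is mine), comparing $f_n$ with the conjectured limit:

```python
import numpy as np
from math import comb

def wf_mean_fixation(n):
    """f_n(k/n) = E[tau_n / n | X_0 = k/n] for the interior states
    k = 1..n-1, via the absorbing-chain formula (I - Q) t = 1."""
    A = np.array([[comb(n, k) * (j / n) ** k * (1 - j / n) ** (n - k)
                   for k in range(n + 1)] for j in range(n + 1)])
    Q = A[1:n, 1:n]                       # interior-to-interior block
    t = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
    return t / n                          # rescale time by n

n = 100
x = np.arange(1, n) / n
f_lim = -2 * x * np.log(x) - 2 * (1 - x) * np.log(1 - x)
print(np.max(np.abs(wf_mean_fixation(n) - f_lim)))
```

The printed maximal discrepancy over the grid shrinks roughly like $1/n$, consistent with the conjecture (though of course this proves nothing).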



On BEST ANSWER

I could not get a solution within the given framework; at some point one has to use, in a clever manner, the fact that we have a very specific Markov chain to work with.

I do not have an elementary solution so far, but below are some thoughts that may be useful in this context. My hope is that the story can be converted into a proof not involving an approximation by continuous processes.


Warming up:

To see what happens in some very small case, let us consider the toy case $N=7$. We fix this $N$ and consider the Markov chain with the states displayed (and identified) as $k$ instead of $k/N$:

The annotation originally drawn over the picture is the transition probability of the one displayed jump, $1\to4$: it equals $\binom{7}{4}\,p(1)^{4}\,q(1)^{7-4}$, where $p(1)=\frac17$ and $q(1)=\frac67$.

          .------------.
         /              \
        /                V
(x)----o----o----o----o----o----o----(x)
 0     1    2    3    4    5    6     7

and from each state but the extremal $0$, $7$ we can jump into any other state. Only one such jump was shown. So if we are on the left side, the left side is the "preferred" side to jump into. The transition matrix is explicitly, just for fixing ideas: $$ A = A(7) = \begin{bmatrix}1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ \frac{279936}{823543} & \frac{46656}{117649} & \frac{23328}{117649} & \frac{6480}{117649} & \frac{1080}{117649} & \frac{108}{117649} & \frac{6}{117649} & \frac{1}{823543} \\ \frac{78125}{823543} & \frac{31250}{117649} & \frac{37500}{117649} & \frac{25000}{117649} & \frac{10000}{117649} & \frac{2400}{117649} & \frac{320}{117649} & \frac{128}{823543} \\ \frac{16384}{823543} & \frac{12288}{117649} & \frac{27648}{117649} & \frac{34560}{117649} & \frac{25920}{117649} & \frac{11664}{117649} & \frac{2916}{117649} & \frac{2187}{823543} \\ \frac{2187}{823543} & \frac{2916}{117649} & \frac{11664}{117649} & \frac{25920}{117649} & \frac{34560}{117649} & \frac{27648}{117649} & \frac{12288}{117649} & \frac{16384}{823543} \\ \frac{128}{823543} & \frac{320}{117649} & \frac{2400}{117649} & \frac{10000}{117649} & \frac{25000}{117649} & \frac{37500}{117649} & \frac{31250}{117649} & \frac{78125}{823543} \\ \frac{1}{823543} & \frac{6}{117649} & \frac{108}{117649} & \frac{1080}{117649} & \frac{6480}{117649} & \frac{23328}{117649} & \frac{46656}{117649} & \frac{279936}{823543} \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} $$ Yes, this is a mess, but the eigenvalues are not, they are in decreasing order: $$ \boxed1\ ,\ \boxed1\ ,\ \frac{6}{7}\ ,\ \frac{30}{49}\ ,\ \frac{120}{343}\ ,\ \frac{360}{2401}\ ,\ \frac{720}{16807}\ ,\ \frac{720}{117649} \ . $$ And eliminating the boxed entries, these are: $$ \frac 67\ ,\ \frac {6\cdot 5}{7^2}\ ,\ \frac {6\cdot 5\cdot 4}{7^3}\ ,\ \frac {6\cdot 5\cdot 4\cdot 3}{7^4}\ ,\ \frac {6\cdot 5\cdot 4\cdot 3\cdot 2}{7^5}\ ,\ \frac {6\cdot 5\cdot 4\cdot 3\cdot 2\cdot 1}{7^6}\ . $$ Let $D$ be the diagonal matrix having these entries on the diagonal. 
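These eigenvalues follow the classical pattern $\lambda_j = \frac{N(N-1)\cdots(N-j+1)}{N^j}$, $j=0,\dots,N$ (with $\lambda_0=\lambda_1=1$). A quick numerical cross-check of this pattern for $N=7$ (a sketch, not part of the argument):

```python
import numpy as np
from math import comb, perm

N = 7
# the Wright-Fisher transition matrix A(7), rows indexed by states j
A = np.array([[comb(N, k) * (j / N) ** k * (1 - j / N) ** (N - k)
               for k in range(N + 1)] for j in range(N + 1)])
eig = np.sort(np.linalg.eigvals(A).real)[::-1]
# classical pattern: lambda_j = N(N-1)...(N-j+1) / N^j, j = 0..N
pred = np.sort(np.array([perm(N, j) / N ** j for j in range(N + 1)]))[::-1]
print(np.max(np.abs(eig - pred)))   # should be numerically zero
```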
Let $S,T$ be base change matrices, inverse to each other, so that $A$ diagonalizes as $$ \tag{$\bbox[yellow]{1}$} A=SDT\ , \qquad D= \begin{bmatrix} \boxed 1\\ & \boxed 1\\ & & \frac{6}{7}\\ & & & \frac{30}{49}\\ & & & & \frac{120}{343}\\ & & & & & \frac{360}{2401}\\ & & & & & & \frac{720}{16807}\\ & & & & & & & \frac{720}{117649} \end{bmatrix}\ . $$ Denote by $$ \tag{$\bbox[yellow]{2}$} \Delta = \begin{bmatrix} \bbox[lightblue]{\boxed 0}\\ & \bbox[lightblue]{\boxed 0}\\ & & \frac{6}{7}\\ & & & \frac{30}{49}\\ & & & & \frac{120}{343}\\ & & & & & \frac{360}{2401}\\ & & & & & & \frac{720}{16807}\\ & & & & & & & \frac{720}{117649} \end{bmatrix} $$ the matrix obtained from $D$ by replacing the boxed entries $\boxed 1$ by $\bbox[lightblue]{\text{zero}}$. The denominators of the eigenvalues are powers of $7$. (This is the reason we took a prime $N$ for the illustration.) The first two eigenvalues $\boxed1\ ,\ \boxed1$ correspond to the absorbing states $0$ and $N=7$. In fact, taking (the left eigenvectors) $w_0=(1,0,\dots,0)$ and $w_N=(0,0,\dots,0,1)$ we have $$ \begin{aligned} w_0A &=w_0\ ,\\ w_NA &=w_N\ . \end{aligned} $$ Define similarly $w_k$ to be the row vector with $(N+1)=8$ entries, all zero except for a single entry $1$ at state position $k$. For instance, $w_1=(0,1,0,\dots,0)$.

There is an obvious symmetry in the way the transition works, and in fact we may identify in computations the states $k$ and $N-k$, i.e. $x=k/N$ and its complement $1-x=(N-k)/N$. (But not in genetics, where the difference counts.) This is also the reason why the result $$ \bbox[yellow]{\qquad f(x) = -2x\log x-2(1-x)\log(1-x) \qquad} $$ shows the same symmetry. Now let us start with $w_1=(0,1,0,\dots,0)$ instead (of $w_0$, $w_N=w_7$), which corresponds to the starting point $x=1/7$ for the process. What is the probability of landing in $D=\{0,N\}$ in or before step $10$? Numerically computing $w_1A^{10}$ delivers $$ \scriptstyle (\ 0.7874\dots\ ,\qquad 0.0223\dots\ ,\qquad 0.0247\dots\ ,\qquad 0.0244\dots\ ,\qquad 0.0234\dots\ ,\qquad 0.0217\dots\ ,\qquad 0.0182\dots\ ,\qquad 0.0774\dots\ ) $$ and this is an approximate list of the probabilities of being in $0,1,2,3,4,5,6,7$ after ten steps. So starting in $1$ we land in either $0$ or $7$ with high probability in or before step ten; it is the sum of the extremal values, $\approx 0.864916775\dots$ -- but we want a general expression.

Let now $\tau = \tau(N) = \tau(7)$ be the time of entrance into the final set of states $D=\{0,N\}$. Let us compute its expectation, conditioned on the start in one of the states $1,2,3,4,5,6$. The process is $X_0,X_1,X_2,\dots$ -- and we omit here the upper index $N=7$, which is irritating for me. $$ \begin{aligned} \Bbb E[\ \tau \cdot 1_{\tau \le M}\ |\ X_0=k \ ] &= \sum_{1\le m\le M} m\cdot\Bbb P(\tau=m\ |\ X_0=k) \\ &= \sum_{1\le m\le M} \Bbb P(m\le \tau\le M\ |\ X_0=k) \\ &= \sum_{1\le m\le M} \Big(\Bbb P(\tau\le M\ |\ X_0=k)-\Bbb P(\tau\le m-1\ |\ X_0=k)\Big) \\ &= M\cdot \Bbb P(\tau\le M\ |\ X_0=k) - \sum_{0\le m< M} \Bbb P(\tau\le m\ |\ X_0=k) \\ &= M\cdot w_k A^M (w_0+w_N)^T - \sum_{0\le m< M} w_k\cdot A^m\cdot (w_0+w_N)^T \\ &= M\cdot w_k S\, D^M\, T (w_0+w_N)^T - \sum_{0\le m< M} w_k\cdot S\; D^m\; T\cdot (w_0+w_N)^T \ . \end{aligned} $$ For $m\ge1$ we have $w_k S D^m T (w_0+w_N)^T = 1 + w_k S \Delta^m T (w_0+w_N)^T$: the two $\boxed1$-eigenvalues contribute the spectral projection $S\,\mathrm{diag}(1,1,0,\dots,0)\,T$, and $w_k\,S\,\mathrm{diag}(1,1,0,\dots,0)\,T\,(w_0+w_N)^T$ is the probability of eventual absorption, which is one. Collecting these constant contributions and letting $M\to\infty$ (note that $w_kS\Delta^MT(w_0+w_N)^T\to0$ geometrically, so even $M$ times it vanishes) we obtain $$ \Bbb E[\ \tau\ |\ X_0=k\ ] = 1 - \sum_{m\ge 1} w_k\, S\,\Delta^m\, T\,(w_0+w_N)^T = -\,w_k\ F(A(7))\ (w_0+w_7)^T\ , $$ where $F$ is the functional calculus for matrices implemented by the function $$ F(\lambda)= \begin{cases} \frac 1{1-\lambda} &\text{ for }\lambda \ne 1\ ,\\ 0 &\text{ for }\lambda = 1\ . \end{cases} $$ (For the last equality use $\sum_{m\ge1}\lambda^m = \frac1{1-\lambda}-1$ on each non-unit eigenvalue; the subtracted ones reassemble to $w_k\,S\,\mathrm{diag}(0,0,1,\dots,1)\,T\,(w_0+w_N)^T = -1$, cancelling the leading $1$.) In particular $$ f_7(k/7) := \frac17\,\Bbb E[\ \tau\ |\ X_0=k\ ] = -\frac17\, w_k\ F(A(7))\ (w_0+w_7)^T\ . $$ This allows us to compute explicitly (I did it with Sage): $$ \scriptstyle \begin{aligned} f_7(1/7) &=\frac 17\Bbb E\Big[\ \tau(7)\ \Big| X_0=\frac 17\ \Big] = \frac{2546425}{3587401} \approx 0.70982446623614\dots\ ,\ & f(1/7) &\approx 0.82023263657682\dots \ ,\\ f_7(2/7) &=\frac 17\Bbb E\Big[\ \tau(7)\ \Big| X_0=\frac 27\ \Big] = \frac{3749893}{3587401} \approx 1.0452951872400\dots\ ,\ & f(2/7) &\approx 1.1965391771705\dots \ ,\\ f_7(3/7) &=\frac 17\Bbb E\Big[\ \tau(7)\ \Big| X_0=\frac 37\ \Big] = \frac{4310467}{3587401} \approx 1.2015570603900\dots\ ,\ & f(3/7) &\approx 1.3658162094009\dots \ , \end{aligned} $$ and the next values come by central symmetry w.r.t. $N/2$. We are far away from a good approximation.
The above data in a table is: $$ \scriptstyle \begin{array}{|r|c|l|l|} \hline k & f_7(k/7) & \approx & f(k/7)\\\hline\hline 1 & \frac{2546425}{3587401} & \approx 0.70982446623614 & 0.82023263657682\\\hline 2 & \frac{3749893}{3587401} & \approx 1.0452951872400 & 1.1965391771705\\\hline 3 & \frac{4310467}{3587401} & \approx 1.2015570603900 & 1.3658162094009\\\hline 4 & \frac{4310467}{3587401} & \approx 1.2015570603900 & 1.3658162094009\\\hline 5 & \frac{3749893}{3587401} & \approx 1.0452951872400 & 1.1965391771705\\\hline 6 & \frac{2546425}{3587401} & \approx 0.70982446623614 & 0.82023263657682\\\hline \end{array} $$
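The exact fractions above can be reproduced without any spectral theory, by solving the absorbing-chain system $(I-Q)\,\underline t=\underline 1$ over the rationals. A sketch in plain Python with exact `Fraction` arithmetic (the small Gauss-Jordan routine is mine, just to keep everything exact):

```python
from fractions import Fraction
from math import comb

N = 7
# exact interior transition block Q over the rationals (states 1..N-1)
Q = [[Fraction(comb(N, k)) * Fraction(j, N) ** k * Fraction(N - j, N) ** (N - k)
      for k in range(1, N)] for j in range(1, N)]
# Gauss-Jordan elimination solving (I - Q) t = 1 exactly
M = [[Fraction(int(i == j)) - Q[i][j] for j in range(N - 1)] + [Fraction(1)]
     for i in range(N - 1)]
for col in range(N - 1):
    piv = next(r for r in range(col, N - 1) if M[r][col] != 0)
    M[col], M[piv] = M[piv], M[col]
    M[col] = [v / M[col][col] for v in M[col]]
    for r in range(N - 1):
        if r != col and M[r][col] != 0:
            M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[col])]
t = [row[-1] for row in M]            # t[k-1] = E[tau | X_0 = k], exact
print(t[0] / N)                       # f_7(1/7) as an exact fraction
```

The central symmetry $t_k = t_{N-k}$ comes out for free.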

Now I am trying to understand your way of proceeding. Above we have explicit values for $f_7(k/7)$ for $k=1,2,3,4,5,6$. In your framework we do not know these values -- they were not so easy to obtain -- and for a huge $N$ (instead of $7$) they will presumably already incorporate the qualities of the function we need, the claimed $f$.

Now you build a polynomial $\tilde f_7$ that "interpolates somehow" the true $f_7$, computed only in the points $k/7$. I am using a distinct notation for the polynomial you want to use. But we do not have any control over the values at the interpolation points! At some essential place in the argumentation we need the information contained in $A(7)$, and I do not see how to get it for $\tilde f_7$ without knowing something real about the structure. I was really trying to understand the OP's point of view, and to make similar ideas work, unfortunately with no success.

Explicitly, $\tilde f_7$ is given by the formula $$ \begin{aligned} \tilde f_7(x) &= \frac 17 + \binom 71 (x^1(1-x)^6 + x^6(1-x)^1)\cdot f_7(1/7) \\ &\qquad\qquad + \binom 72 (x^2(1-x)^5 + x^5(1-x)^2)\cdot f_7(2/7) \\ &\qquad\qquad\qquad\qquad + \binom 73 (x^3(1-x)^4 + x^4(1-x)^3)\cdot f_7(3/7) \ . \end{aligned} $$ Then indeed, $\tilde f_7$ reproduces the values at $k/7$ for $k=1,2,3,4,5,6$. But it stays "far away" from the final $f$, since $$ \tilde f_7(0) = \frac 17= \tilde f_7(1)\ , $$ while $f(0)=f(1)=0$. (This is not important for the step of bounding the first few derivatives of $\tilde f_7$. Note also that with the question's convention $\tau\ge1$ one has $f_7(0)=f_7(1)=\frac17$, so $\tilde f_7$ actually agrees with $f_7$ even at $0$ and $1$.)
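For completeness, one can check numerically that the polynomial built from the recursion really reproduces $f_7$ at the interior grid points while taking the value $\frac17$ at the endpoints. A sketch (I write the sum over all $k=1,\dots,6$ instead of grouping symmetric pairs; by the central symmetry the two forms agree):

```python
import numpy as np
from math import comb

N = 7
A = np.array([[comb(N, k) * (j / N) ** k * (1 - j / N) ** (N - k)
               for k in range(N + 1)] for j in range(N + 1)])
# f_7(k/7), k = 1..6, from the absorbing-chain linear system
f = np.linalg.solve(np.eye(N - 1) - A[1:N, 1:N], np.ones(N - 1)) / N

def tilde_f(x):
    # the interpolating polynomial built from the recursion
    return 1 / N + sum(comb(N, k) * x ** k * (1 - x) ** (N - k) * f[k - 1]
                       for k in range(1, N))

print(max(abs(tilde_f(k / N) - f[k - 1]) for k in range(1, N)))  # ~ 0
print(tilde_f(0.0), tilde_f(1.0))                                # both 1/7
```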

And now my counterquestion: How do you want to easily control the values of the interpolation at $k/7$, $k=1,2,3,4,5,6$, without making any use of the data of the given Markov process? At some core point this input must show up. And it is the "combinatorial complexity" of the transition matrix $A(7)$ that rules the case $N=7$.


Some more experimental data: the same experiment for a bigger value of $N$, say $N=20$, is as follows: $$ \scriptstyle \begin{array}{|r|c|l|l|} \hline k & f_{20}(k/20) & \approx & f(k/20)\\\hline\hline 1 & \frac{7049188017677089148918383035074123555908980878133556724593594463258218383156940098323116366369}{6823789235596765144140629397025116769329495612742284464608652858407310928270219445149970976240} & \approx 0.36156096283229 & 0.39703048669174\\\hline 2 & \frac{2335814664092743620771139113457792637469879134230458597696621742955688645284519438410381117045}{1364757847119353028828125879405023353865899122548456892921730571681462185654043889029994195248} & \approx 0.59903310624523 & 0.65016594678290\\\hline 3 & \frac{3066866851137700430187978117403306202837994846657109853011948446724461100298583199265123415811}{1364757847119353028828125879405023353865899122548456892921730571681462185654043889029994195248} & \approx 0.78651564463532 & 0.84541817561198\\\hline 4 & \frac{11082195624050198044509776245729220461577721976957164252984248241915031612212168990733435588}{4140648808007745839891158614699706777505761900935852223670299064567543038998919566231778505} & \approx 0.93675379107648 & 1.0008048470764\\\hline 5 & \frac{8242352454110531301679729643769368646296120374656279626498263793099977510627523852061645098615}{2729515694238706057656251758810046707731798245096913785843461143362924371308087778059988390496} & \approx 1.0568993484917 & 1.1246702892376\\\hline 6 & \frac{22444958821943318833036770508497197960931017164921692885572709363738404636686839929695028046929}{6823789235596765144140629397025116769329495612742284464608652858407310928270219445149970976240} & \approx 1.1512277587209 & 1.2217286041098\\\hline 7 & \frac{9533245189029204917354883113933211804063813476242535247406674747794722799275789080450639972109}{2729515694238706057656251758810046707731798245096913785843461143362924371308087778059988390496} & \approx 1.2224277820432 & 1.2948932780693\\\hline 8 & 
\frac{310050519249895938543275899853764306860070874310658824806810900982118668951562345522587685054}{85297365444959564301757867462813959616618695159278555807608160730091386603377743064374637203} & \approx 1.2722278252250 & 1.3460233340185\\\hline 9 & \frac{204668169752762401897820253160959776770142211496912797141711476828487349284551071140420616811}{55030558351586815678553462879234812655883029135018423101682684341994442969921124557661056260} & \approx 1.3017105688044 & 1.3762776274272\\\hline 10 & \frac{5113842279229708841374847281282699449258692927812485655595027235115853099434657694262394715285}{1364757847119353028828125879405023353865899122548456892921730571681462185654043889029994195248} & \approx 1.3114742674008 & 1.3862943611199\\\hline \end{array} $$ And we are still far away from an approximation.

Note that the involved eigenvalues of the transition matrix are all rational; this happens for every $N$: $$ \scriptstyle 1\ ,\ 1\ ,\ \frac{19}{20}\ ,\ \frac{171}{200}\ ,\ \frac{2907}{4000}\ ,\ \frac{2907}{5000}\ ,\ \frac{8721}{20000}\ ,\ \frac{61047}{200000}\ ,\ \frac{793611}{4000000}\ ,\ \frac{2380833}{20000000}\ ,\ \frac{26189163}{400000000}\ ,\ \frac{26189163}{800000000}\ ,\ \frac{235702467}{16000000000}\ ,\ \frac{235702467}{40000000000}\ ,\ \frac{1649917269}{800000000000}\ ,\ \frac{4949751807}{8000000000000}\ ,\ \frac{4949751807}{32000000000000}\ ,\ \frac{4949751807}{160000000000000}\ ,\ \frac{14849255421}{3200000000000000}\ ,\ \frac{14849255421}{32000000000000000}\ ,\ \frac{14849255421}{640000000000000000}\ . $$ They are explicitly, if we neglect the two $\boxed 1$ "absorbing" eigenvalues: $$ \frac {19}{20}\ ,\ \frac {19\cdot 18}{20^2}\ ,\ \frac {19\cdot 18\cdot 17}{20^3}\ ,\ \frac {19\cdot 18\cdot 17\cdot 16}{20^4}\ ,\ \frac {19\cdot 18\cdot 17\cdot 16\cdot 15}{20^5}\ ,\ \frac {19\cdot 18\cdot 17\cdot 16\cdot 15\cdot 14}{20^6}\ , \ \dots\ $$ And for $N=40$ the table is, this time omitting the exact values: $$ \scriptstyle \begin{array}{|r|l|l|} \hline k & f_{40}(k/40) & f(k/40)\\\hline\hline 1 & \approx 0.21673266441314 & 0.23381369827506\\\hline 2 & \approx 0.37223854481506 & 0.39703048669174\\\hline 3 & \approx 0.50406088185097 & 0.53276892653584\\\hline 4 & \approx 0.61875521092950 & 0.65016594678290\\\hline 5 & \approx 0.72004737593955 & 0.75354032251287\\\hline 6 & \approx 0.81024430522157 & 0.84541817561198\\\hline 7 & \approx 0.89088644809320 & 0.92745287963882\\\hline 8 & \approx 0.96306563545260 & 1.0008048470764\\\hline 9 & \approx 1.0275905471024 & 1.0663276814746\\\hline 10 & \approx 1.0850792979196 & 1.1246702892376\\\hline 11 & \approx 1.1360149118420 & 1.1763375547084\\\hline 12 & \approx 1.1807805433076 & 1.2217286041098\\\hline 13 & \approx 1.2196828831926 & 1.2611620567720\\\hline 14 & \approx 1.2529682563011 & 1.2948932780693\\\hline 15 & 
\approx 1.2808339711431 & 1.3231264763160\\\hline 16 & \approx 1.3034364540774 & 1.3460233340185\\\hline 17 & \approx 1.3208971213928 & 1.3637092174616\\\hline 18 & \approx 1.3333065992289 & 1.3762776274272\\\hline 19 & \approx 1.3407276869489 & 1.3837933184102\\\hline 20 & \approx 1.3431973194352 & 1.3862943611199\\\hline \end{array} $$ So far we take away the message that the error $ \displaystyle \left|f_N\left(\frac kN\right)-f\left(\frac kN\right)\right| $ is of order $1/N$ (plus lower-order terms). I stop the experiments here.
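The experiments can be automated; a sketch recomputing the maximal error $\max_k\left|f_N\left(\frac kN\right)-f\left(\frac kN\right)\right|$ for several $N$ (the helper name `max_err` is mine):

```python
import numpy as np
from math import comb

def max_err(N):
    """Maximal deviation of f_N from the conjectured limit f on the grid."""
    A = np.array([[comb(N, k) * (j / N) ** k * (1 - j / N) ** (N - k)
                   for k in range(N + 1)] for j in range(N + 1)])
    t = np.linalg.solve(np.eye(N - 1) - A[1:N, 1:N], np.ones(N - 1)) / N
    x = np.arange(1, N) / N
    f = -2 * x * np.log(x) - 2 * (1 - x) * np.log(1 - x)
    return float(np.max(np.abs(t - f)))

for N in (7, 20, 40, 80):
    print(N, max_err(N), N * max_err(N))
```

The products $N\cdot\max_k|\cdot|$ vary only slowly, consistent with an error of order $1/N$ up to a slowly varying factor.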



An elementary solution?

It seems that everybody uses the diffusion process approximation. This is done in all references I found: first the state space $0,1,2,\dots,N$ is rescaled to fit in the interval $[0,1]$ (and time is rescaled by the factor $N$), then the process is replaced by a continuous one having the same distribution and the same "main features" (mean, covariance, independent increments, and so on); the continuous process is governed by a generator, a differential operator appears, and we are searching for solution(s) to a differential equation...

For the suggested path in the present question I have no continuation.

So I'll try to go my own (elementary) way.

We fix an $N$. Things can be done in generality, but I will display formulas for $N=7$ again. Recall that we have a matrix $A(7)$. It has an entry equal to one at the positions $(0,0)$ and $(N,N)=(7,7)$ (corresponding to the two absorbing states and the transitions from them to themselves), which in fact leads to the entries $\boxed 1$ in the spectrum (set of eigenvalues), and thus on the diagonal of the diagonal matrix $D$ realizing the diagonalization $\bbox[yellow]{(1)}$. We want to get rid of these entries in $D$, thus passing from $D$ to $\Delta$. It is enough to obtain a "partial diagonalization" (only for the data corresponding to the $\boxed 1$ eigenvalues) of the form $$ \begin{bmatrix} \boxed1\\ & Q\\ &&\boxed1 \end{bmatrix}\ . $$ We do this as follows. The column vectors with the entries $N,\dots,2,1,0$ and $0,1,2,\dots,N$ are right eigenvectors of $A(N)=A(7)$ for the eigenvalue one (by the martingale property). So we build the "partial Jordan form" base change matrix: $$ U = U(7) = \begin{bmatrix} 7 & & & & & & & 0 \\ 6 & 1 & & & & & & 1 \\ 5 & & 1 & & & & & 2 \\ 4 & & & 1 & & & & 3 \\ 3 & & & & 1 & & & 4 \\ 2 & & & & & 1 & & 5 \\ 1 & & & & & & 1 & 6 \\ 0 & & & & & & & 7 \end{bmatrix} = \begin{bmatrix} 7 & & 0 \\ c & I & d \\ 0 & & 7 \end{bmatrix} \ . $$ Denote by $V(7)$ the inverse of $U(7)$; here is a good point to introduce it (and forget about it for a while): $$ V = V(7) = \begin{bmatrix} 1/7 & & & & & & & 0 \\ -6/7 & 1 & & & & & & -1/7 \\ -5/7 & & 1 & & & & & -2/7 \\ -4/7 & & & 1 & & & & -3/7 \\ -3/7 & & & & 1 & & & -4/7 \\ -2/7 & & & & & 1 & & -5/7 \\ -1/7 & & & & & & 1 & -6/7 \\ 0 & & & & & & & 1/7 \end{bmatrix} \ . $$ We write $A(7)$ also in this block format $(1,6,1)\times(1,6,1)$, with the $6\times 6$ matrix "in the middle" denoted by $Q$, and with left and right neighbouring $6\times 1$ vectors $a,b$. And we multiply $A(7)\cdot U(7)$. Since the first and the last columns of $U$ are right $1$-eigenvectors, they are kept in place.
So $$ \begin{aligned} A(7)\cdot U(7) &= \begin{bmatrix} 1 & 0 & 0\\ a & Q & b\\ 0 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 7 & & 0 \\ c & I & d \\ 0 & & 7 \end{bmatrix} = \begin{bmatrix} 7 & & 0 \\ c & Q & d \\ 0 & & 7 \end{bmatrix} = \begin{bmatrix} 7 & & 0 \\ c & I & d \\ 0 & & 7 \end{bmatrix} \underbrace{ \begin{bmatrix} \boxed 1 & & \\ & Q & \\ & & \boxed 1 \end{bmatrix}}_{:=B(7)} \\ &=U(7)\cdot B(7) \ . \end{aligned} $$ (For the first column of the product we used $7a+Qc=c$, i.e. precisely that the first column of $U$ is a $1$-eigenvector; similarly $Qd+7b=d$ for the last one.) The last equality defines the notation $B(7)$.
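The partial diagonalization $A(7)\cdot U(7) = U(7)\cdot B(7)$ can be verified numerically in a few lines (a sketch):

```python
import numpy as np
from math import comb

N = 7
A = np.array([[comb(N, k) * (j / N) ** k * (1 - j / N) ** (N - k)
               for k in range(N + 1)] for j in range(N + 1)])
# U: first column (N, N-1, ..., 0), last column (0, 1, ..., N),
# identity block on the interior coordinates
U = np.zeros((N + 1, N + 1))
U[:, 0] = np.arange(N, -1, -1)
U[:, N] = np.arange(0, N + 1)
U[1:N, 1:N] = np.eye(N - 1)
# B: boxed ones at the corners, the interior block Q in the middle
B = np.zeros((N + 1, N + 1))
B[0, 0] = B[N, N] = 1.0
B[1:N, 1:N] = A[1:N, 1:N]
print(np.max(np.abs(A @ U - U @ B)))   # numerically zero
```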

Now as explained at the beginning, we want to understand for each $k$ (with $0<k<N$) the "object": $$ w_k\cdot F(A)\cdot (w_0+w_N)^T\ . $$ Here, $F(A)$ is the functional calculus in $A$, which sends $1$-eigenvalues to $0$, else each eigenvalue $\lambda$ to $1/(1-\lambda)$. In particular, $F(Q)=(I-Q)^{-1}$. So the information is inside $F(A) \cdot (w_0+w_N)^T$, which is: $$ \begin{aligned} F(A) \cdot (w_0+w_N)^T &=F(\ UBV\ ) \cdot (w_0+w_N)^T\\ &=U\ F(B)\ V \cdot (w_0+w_N)^T\\ &=U\ F\left(\ \begin{bmatrix} \boxed 1 & & \\ & Q & \\ & & \boxed 1 \end{bmatrix} \right)\ V \cdot (w_0+w_N)^T\\ &=U \begin{bmatrix} \boxed 0 & & \\ & (I-Q)^{-1} & \\ & & \boxed 0 \end{bmatrix} \begin{bmatrix} 1/7 \\ -\underline1\\ 1/7 \end{bmatrix} \\ &=U \begin{bmatrix} 0 \\ -(I-Q)^{-1}\underline1\\ 0 \end{bmatrix} \\ &= \begin{bmatrix} 0 \\ -(I-Q)^{-1}\underline1\\ 0 \end{bmatrix}\ . \end{aligned} $$ Here, $\underline1$ is the column vector with all $(N-1)=6$ components equal to one.

To obtain information on this "object", we do not need to get the "full inverse" of $(I-Q)$; it is enough to "solve a system" of the shape $(I-Q)\underline f=\underline 1$, which is arguably a much simpler task. (Although the experimental data shows that we cannot expect a total simplification and an immediate asymptotic insight.)
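As a numerical cross-check (a sketch): the route $U\,F(B)\,V\,(w_0+w_N)^T$ indeed reduces to solving the single system, i.e. the interior block of the object equals $-(I-Q)^{-1}\underline 1$ and the endpoint entries vanish:

```python
import numpy as np
from math import comb

N = 7
A = np.array([[comb(N, k) * (j / N) ** k * (1 - j / N) ** (N - k)
               for k in range(N + 1)] for j in range(N + 1)])
Q = A[1:N, 1:N]

# base-change matrix U from the text and its inverse V
U = np.zeros((N + 1, N + 1))
U[:, 0] = np.arange(N, -1, -1)
U[:, N] = np.arange(0, N + 1)
U[1:N, 1:N] = np.eye(N - 1)
V = np.linalg.inv(U)

# F(B) kills the two 1-eigenvalues and inverts (I - Q) on the middle block
FB = np.zeros((N + 1, N + 1))
FB[1:N, 1:N] = np.linalg.inv(np.eye(N - 1) - Q)

e = np.zeros(N + 1)
e[0] = e[N] = 1.0                      # (w_0 + w_N)^T
obj = U @ FB @ V @ e                   # = F(A) (w_0 + w_N)^T

# interior block = minus the mean absorption times from one linear solve
t = np.linalg.solve(np.eye(N - 1) - Q, np.ones(N - 1))
print(np.max(np.abs(obj[1:N] + t)), abs(obj[0]) + abs(obj[N]))
```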

The right eigenvectors of $Q$ have a combinatorial description. I was trying to get an approximation based on this information, but the problem is in the details. I may come back, but for the moment I have to stop here...

ANSWER

Let $H(x) = -x\log_2(x)-(1-x)\log_2(1-x)$, so that $0 \leq H \leq 1$. (The target function in the question is the constant multiple $2\ln 2\cdot H(x)$; constant factors do not affect the argument below.)

Then the recursion from the question, together with $f_n(0)=f_n(1)=\frac1n$, reads $$f_n(x) = \frac1n + \sum_{k = 1}^{n-1} P(x,k)\, f_n(k/n) = \sum_{k = 0}^{n} P(x,k)\, f_n(k/n) + \frac1n\left(1 - P(x,0) - P(x,n)\right),$$ where $P(x,k) := \binom nk x^k (1-x)^{n-k}$.

Hence, up to an error term of size at most $\frac1n$, we have $$f_n(x) \approx \sum_{k = 0}^{n} P(x,k)\, f_n(k/n) = E\left(f_n\left(\frac{X^n_1}{n}\right)\right),$$

and we test the candidate $f_n(y) = H(y)$ in this approximate fixed-point relation.

Since $X^n_1$ is $\text{Binomial}(n, x)$, the variable $Y_n := \frac{X^n_1}{n}$ concentrates at $x$: by the central limit theorem it is close in distribution to $Z_n \sim N\!\left(x, \frac{x(1-x)}{n}\right)$, and both converge in distribution to the constant $x$. Since $H$ is bounded ($0 \leq H \leq 1$) and continuous, this gives convergence of the expectations, $E(H(Y_n)) \rightarrow H(x)$, and hence the claim.

This only proves that the given expression is consistent with the limiting relation, i.e. that it is one of its solutions.

For applying the above we need to extend $H$ by $H(y) = 0$ for $y<0$ and $y>1$ (this matters only for the Gaussian comparison; $Y_n$ itself stays in $[0,1]$).
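Note that the limit $E(H(Y_n)) \to H(x)$ holds, by the same argument, for every bounded continuous function; what distinguishes the claimed expression is the finer, second-order property that the one-step deficit is $\approx \frac1n$, matching the source term of the recursion. A numerical sketch of this (using the natural-log normalization $G(x) = -2x\log x - 2(1-x)\log(1-x) = 2\ln 2\cdot H(x)$ from the question; the helper name `one_step_deficit` is mine):

```python
from math import comb, log

def G(y):
    # the candidate limit from the question, extended by 0 at the endpoints
    return 0.0 if y <= 0.0 or y >= 1.0 else -2*y*log(y) - 2*(1-y)*log(1-y)

def one_step_deficit(n, x):
    """n * (G(x) - E[G(X_1^n / n)]) for X_1^n ~ Binomial(n, x)."""
    EG = sum(comb(n, k) * x**k * (1 - x)**(n - k) * G(k / n)
             for k in range(n + 1))
    return n * (G(x) - EG)

for n in (50, 200, 800):
    print(n, one_step_deficit(n, 0.3))   # tends to 1 as n grows
```

This is exactly the Taylor-expansion computation from the question: $\frac12 G''(x)\cdot\frac{x(1-x)}{n} = -\frac1n$ for this particular $G$.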