Entropy of partitioning for set of all $\binom{n}{k}$ combinations

137 Views Asked by At

Let $\binom{[n]}{k}$ represent the set of all $k$-combinations of the set $[n]=\{1,\dotsc,n\}$. We will use the notation: $$\{x_1, \dotsc, x_k\} \in \binom{[n]}{k},$$ where $x_1 < x_2, < \dotsc < x_k$. Let $k$ be odd, and let $t = \left\lfloor \frac{k}{2}\right \rfloor$. One way to partition this set is to say all elements with the same values of $x_{2i}, \;\; i=1,\dotsc, t$ belong to the same part.

There are a total of $\binom{n-t-1}{t}$ parts in this partitioning. To see this, we let $x_{2i} - i = d_i, \;\;i=1,\dotsc,t$. We must have that $x_{2i} \geq x_{2(i-1)}+2$, and $x_{2i} \in \{2, \dotsc, n-1\}$. From these conditions, any set of $d_i$'s must satisfy $0 < d_1, < d_2, \dotsc, < d_t < n-t$. Clearly there are $\binom{n-t+1}{t}$ ways this inequality can be satisfied, which is equal to the number of parts.

For each part in the partition, indexed by $\mathbf{d} = \{d_1, \dotsc, d_t\}$ let the number of elements in that part be $c (\mathbf{d})$. We can show by a simple counting argument that

$$c (\mathbf{d}) = d_1(d_2-d_1)\dotsc,(d_t - d_{t-1})(n-t-d_t),$$ which is the form of a vendermonde determinant. My question is, if we let $p(\mathbf{d}) = \frac{c(\mathbf{d})}{\binom{n}{k}},$ is it possible to analytically calculate the following sum? $$\sum \limits_{0 < d_1 < \dotsc, < d_t<n-t} p(\mathbf{d})\log\left( p(\mathbf{d}) \right)$$

I've tried myself, but there doesn't seem to be any obvious way to do this. Furthermore, if there is no easy way to analytically compute this sum, is there an easy bound for it in terms of $n,k,t$? Obviously, there is the simple lower bound $-\log \left( \binom{n-t-1}{t}\right)$, but I'm wondering if there is anything better, which makes use of the structure, and not just the number of parts in the partition.

Somewhat loosely related to this question (Minimum number of $k$-partitions of a set of size $n$ to enumerate all $n \choose k$ combinations).

1

There are 1 best solutions below

6
On BEST ANSWER

$$ \begin{array}{lll} \displaystyle \sum_{\mathbf{d}} p(\mathbf{d})\ln p(\mathbf{d}) & = & \displaystyle \sum_{\mathbf{d}}\frac{c(\mathbf{d})}{\binom{n}{k}}\ln\frac{c(\mathbf{d})}{\binom{n}{k}} \\[10pt] & = & \displaystyle \binom{n}{k}^{-1} \sum_{\mathbf{d}} \left[ c(\mathbf{d})\ln c(\mathbf{d})-c(\mathbf{d})\ln\binom{n}{k}\right] \\[10pt] & = & \displaystyle \binom{n}{k}^{-1}\left[-\binom{n}{k}\ln\binom{n}{k}+\sum_{\mathbf{d}}c(\mathbf{d})\ln c(\mathbf{d})\right] \\[10pt] & = & \displaystyle -\ln\binom{n}{k}+\binom{n}{k}^{-1}\sum_{\mathbf{d}}c(\mathbf{d})\ln c(\mathbf{d}) \end{array} $$

Introduce the change of variables $c(\mathbf{d})=b_1\cdots b_tb_{t+1}$ (and later $\mathbf{b}'=(b_1,\cdots,b_t)$) so

$$ \begin{array}{lll} \displaystyle \sum_{\mathbf{d}}c(\mathbf{d})\ln c(\mathbf{d}) & = & \displaystyle \sum_{\mathbf{b}}b_1\cdots b_{t+1} \ln(b_1\cdots b_{t+1}) \\[10pt] & = & \displaystyle \sum_i \sum_{\mathbf{b}} b_1\cdots b_{t+1}\ln b_i \\[10pt] & = & \displaystyle (t+1)\sum_{\mathbf{b}} b_1\cdots b_{t+1}\ln b_{t+1} \\[10pt] & = & \displaystyle (t+1)\sum_{b_{t+1}=1}^{n-t} \left(\sum_{\mathbf{b}'}b_1\cdots b_t\right)b_{t+1}\ln b_{t+1} \end{array} $$

Evaluate the inner sum generatingfunctionologically (set $k=b_{t+1}$):

$$ \begin{array}{lll} \displaystyle \sum_{\mathbf{b}'}b_1\cdots b_t & = & \displaystyle [x^{n-t-k}]\left(\sum_{b_1\ge1} b_1x^{b_1}\right)\cdots\left(\sum_{b_t\ge1}b_tx^{b_t}\right) \\[10pt] & = & \displaystyle [x^{n-t-k}](1x+2x^2+\cdots)^t \\[1pt] & = & \displaystyle [x^{n-t-k}]\left(\frac{x}{(1-x)^2}\right)^t \\[10pt] & = & \displaystyle [x^{n-2t-k}](1-x)^{-2t} \\[10pt] & = & \displaystyle \binom{-2t}{n-2t-k}(-1)^{n-k} \end{array} $$

Therefore

$$ \sum_{\mathbf{d}} p(\mathbf{d})\ln p(\mathbf{d})=-\ln\binom{n}{k}+\frac{t+1}{\binom{n}{k}}\sum_{k=1}^{n-t} \binom{-2t}{n-2t-k}(-1)^{n-k} k\ln k $$

I'm not sure what else can be done besides express it as a rational combination of $\ln$s.

I'm not sure this is useful for getting an estimate either.