Is entropy "interpolative" (in the sense defined below...)?


Define the usual Shannon $H(p)=-\sum_i p_i\log_2(p_i)$ for discrete pdf (probability distribution function) $p=(p_i,i=1\ldots n)$. Now consider three pdf's: $p_a,p_b$ are given (with the same cardinality $n$), and $p_c$ is a convex combination $p_c=\alpha p_a+(1-\alpha)p_b$ for a given $0\leq\alpha\leq1$.

Then, is it true that $H(p_a)\leq H(p_c)\leq H(p_b)$? I've tried a variety of numerical cases (wrote a small program to try them for me), and it's been true for those cases. But I haven't been able to conjure up a general proof. So (a) can you conjure one up?, and (b) even more concretely, can you (is it even possible to) derive a closed-form formula for $H(p_c)=f(\alpha,H(p_a),H(p_b))$ in terms of some $f(\cdot)$?

P.S. Also, as an auxiliary question, does the same interpolation property hold (or not hold) for continuous pdf's (with the usual sum$\to$integral for $H$)?

Edit
----

Thanks to @IvanNeretin for answering the question, even if the answer has to be "no".
So let me ask a related question: Given $p_a$ and $p_b$ as above, and supposing $H(p_a)\leq H(p_b)$ (just flip $a\leftrightarrow b$ if not), is there any way to construct a $p_c$ such that $H(p_a)\leq H(p_c)\leq H(p_b)$? Ivan cleverly demonstrated that my not-so-clever convex combination isn't such a construction. But that doesn't necessarily rule out some construction that does satisfy the interpolative condition.

Edit #2
-------

Re Ivan's comment/question beneath his answer, "How did you produce your numerical cases?", the C language program is below. Save it in file entinterp.c (or whatever you like), and compile it as
    cc entinterp.c -lm -o entinterp
and run it as, e.g.,
    ./entinterp .5 ".1,.2,.3,.4&.5,.6,.7,.8"
The first .5 arg is $\alpha$. Then a space followed by the two pdf's, $p_a$ and $p_b$, all enclosed in quotes. Each $p_i,i=1\ldots n,$ is separated by a comma, and the two pdf's are separated by an ampersand. Note that each pdf is normalized by the program, so you don't have to worry about $\sum_i p_i=1$. But you do have to input the same number $n$ of $p_i$'s both before and after the ampersand.

Output from the example illustrated above is...

bash-5.0$ ./entinterp .5 ".1,.2,.3,.4&.5,.6,.7,.8"
H(0.100,0.200,0.300,0.400) = 1.8464
H(0.192,0.231,0.269,0.308) = 1.9785
H(0.146,0.215,0.285,0.354) = 1.9289
entinterp> end-of-job

where the first two output lines are $H(p_a)$ and $H(p_b)$, and the third line is $H(p_c)$.

Now, the above example, like all I originally tried, is indeed interpolative. But after Ivan's answer, I tried some more and found that it's really easy to come up with examples that don't work. So my first comment under Ivan's answer is pretty much irrelevant, being based on my mistaken assumption that most cases work.

Anyway, here's the code...

/* --- is entropy interpolative? --- */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>

/* --- entry point --- */
int main ( int argc, char *argv[] ) {
  /* --- command-line args --- */
  double  alpha = ( argc>1? atof(argv[1]) : 0.5 ); /* convexity multiplier */
  char *probstr = ( argc>2? argv[2] : ".1,.2,.3,.4&.5,.6,.7,.8" );
  /* --- other variables --- */
  char *probstr1=probstr, *amp=strchr(probstr,'&'), *probstr2=NULL;
  double *getprobs(char *probstr);           /* parse/normalize a pdf */
  double entropy(double *probs);             /* compute and print H   */
  double *probs1=NULL, *probs2=NULL, probs3[99];
  double H=0.0;
  int nprobs = 0;
  /* --- get pdf's --- */
  if ( amp == NULL ) goto end_of_job;        /* need "pa&pb" argument */
  probstr2 = amp+1;                          /* pb starts past the &  */
  probs1 = getprobs(probstr1);
  probs2 = getprobs(probstr2);
  if ( probs1==NULL || probs2==NULL ) goto end_of_job;
  while ( nprobs < 98 ) {                    /* stop at sentinel or bound */
    if ( probs1[nprobs]<0.0 || probs2[nprobs]<0.0 ) break;
    probs3[nprobs] = alpha*probs1[nprobs] + (1.0-alpha)*probs2[nprobs];
    nprobs++; }
  probs3[nprobs] = (-1.0);                   /* sentinel terminator */
  H = entropy(probs1);                       /* prints H(p_a) */
  H = entropy(probs2);                       /* prints H(p_b) */
  H = entropy(probs3);                       /* prints H(p_c) */
  end_of_job:
    printf("entinterp> end-of-job\n"); exit(0);
  } /* --- end-of-function main() --- */

/* --- parse "p1,p2,..." into a (-1.0)-terminated, normalized array --- */
double *getprobs( char *probstr ) {
  double *probs = (double *)malloc(99*sizeof(double));
  int    iprob=0, nprobs=0;
  char   *thisptr=probstr, *endptr=NULL;
  double thisprob=0.0, totprob=0.0;
  if ( probs==NULL || probstr==NULL ) return(NULL);
  while ( nprobs < 98 ) {                    /* leave room for sentinel */
    while ( *thisptr!='\000' && strchr(" ,",*thisptr)!=NULL ) thisptr++;
    if ( *thisptr=='\000' || *thisptr=='&' ) break;
    thisprob = strtod(thisptr,&endptr);
    if ( endptr == thisptr ) break;          /* unparsable char: stop   */
    if ( thisprob < 0.0 ) break;
    probs[nprobs++] = thisprob;
    totprob += thisprob;
    thisptr = endptr;
    } /* --- end-of-while() --- */
  if ( nprobs>0 && totprob>0.0 )             /* normalize to sum 1      */
    for ( iprob=0; iprob<nprobs; iprob++ ) probs[iprob] /= totprob;
  probs[nprobs] = (-1.0);                    /* sentinel terminator     */
  return ( probs );
  } /* --- end-of-function getprobs() --- */

/* --- compute H(p) = -sum p_i*log2(p_i) for a (-1.0)-terminated array --- */
double entropy( double *probs ) {
  double H = 0.0, log2=log(2.0);
  int    nprobs = 0;
  FILE   *prtfile = stdout; /* NULL for no print */
  if ( probs == NULL ) { H=(-1.0); goto end_of_job; }
  if ( prtfile != NULL ) fprintf(prtfile,"H(");
  while ( probs[nprobs] >= 0.0 ) {
    if ( prtfile != NULL )
      fprintf(prtfile,"%s%.3lf",(nprobs>0?",":""),probs[nprobs]);
    if ( probs[nprobs] > 0.0 ) H -= probs[nprobs]*(log(probs[nprobs])/log2);
    nprobs++; }
  if ( prtfile != NULL ) fprintf(prtfile,") = %.4lf\n",H);
  end_of_job:
    return ( H );
  } /* --- end-of-function entropy() --- */
/* --- end-of-file entinterp.c --- */
Best Answer
-----------

This is obviously not true.

Say, $p_a(i)={2\over n}$ for $i\in[1,{n\over2}]$ and $0$ elsewhere. Also, $p_b(i)={2\over n}$ for $i\in[{n\over2}+1,n]$ and $0$ elsewhere, and $\alpha={1\over2}$. Now $H(p_a)=H(p_b)=\log_2{n\over2}\mathop{\color{red}{\mathbf<}}H(p_c)=\log_2n$.
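For what it's worth, a standard fact (not part of this answer) explains why the failure is always in this direction: Shannon entropy is concave, so

$$H(\alpha p_a+(1-\alpha)p_b)\;\ge\;\alpha H(p_a)+(1-\alpha)H(p_b)\;\ge\;\min\bigl(H(p_a),H(p_b)\bigr),$$

i.e., the mixture's entropy can overshoot the larger of the two entropies (as it does here), but it can never undershoot the smaller.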

When you mix two gases, the entropy increases - that's the physical intuition which kinda justifies the "obviously" in the first sentence.

So it goes.