Consider the following problem. There are $10$ stacks of $10$ coins, each visually identical. In $9$ of the stacks, the coins weigh $10$ grams each, while in the other stack the coins weigh $11$ grams each. You have a digital scale but can only use it once to identify the heavier stack. How do you do it?
One answer to this is to take $k$ coins from the $k$-th stack, for each $1\leq k\leq 10$, and weigh them all at once. The resulting weight then contains enough "information" to tell which stack the heavier coins came from.
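To make the decoding step concrete, here is a small Python sketch of the strategy (the function name and setup are mine, purely for illustration):

```python
def find_heavy_stack(heavy: int) -> int:
    """Simulate the one-weighing strategy: take k coins from the k-th stack
    (stacks numbered 1..10), weigh them all together, and decode.

    `heavy` is the index of the 11-gram stack, unknown to the strategy."""
    coins_taken = list(range(1, 11))            # k coins from stack k
    baseline = 10 * sum(coins_taken)            # weight if every coin were 10 g
    weight = baseline + coins_taken[heavy - 1]  # each heavy coin adds 1 g
    return weight - baseline                    # excess grams = index of heavy stack

# Every possible position of the heavy stack is correctly decoded.
assert all(find_heavy_stack(k) == k for k in range(1, 11))
```

The point is that the excess weight over the baseline takes $10$ distinct values, one per stack, so a single reading pins down the answer.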
Now suppose we modify the situation so that each stack has only $9$ coins; then this method no longer works. In particular, there is a sense in which the strategy retrieves less information in this set-up than in the set-up with $10$ coins per stack.
Question: I would like to understand (perhaps using language from information theory) the precise sense in which this strategy yields less information.
Apologies if the question is a bit vague, but I hope it is understandable to someone who knows about information theory. I have no background in information theory, but I am trying to make a conceptual connection to things like entropy.
Not sure if this is a useful way to think about the problem, but maybe it is at least an interesting exercise.
To talk about entropy, we need probability, so suppose each of the 10 stacks is equally likely to be the outlier. Then, the amount of entropy in this information is just $\log(10)$ (for whatever choice of $\log$ base).
Consider a given weighing strategy for stacks of $8$ coins. (Note that $9$ coins per stack is actually still enough: taking $k-1$ coins from the $k$-th stack yields $10$ distinct outcomes, so the counting obstruction below only appears at $8$ coins per stack.) The result will be $10$ grams times the total number of coins weighed, plus $1$ gram times the number of coins weighed from the outlier stack. No matter what our strategy is, there are at most 9 possible outcomes (from $0$ to $8$) for the number of coins we weighed from the outlier stack. The entropy of any measurement with 9 (or fewer) outcomes is bounded by the entropy of 9 equally likely outcomes, which is $\log(9)$. This is less than $\log(10)$, so it is impossible to have recovered all the original information.
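The entropy bound can be checked numerically; here is a Python sketch (the `entropy` helper is my own naming, not anything from the problem):

```python
from math import log

def entropy(probs):
    """Shannon entropy (in nats) of a finite distribution."""
    return -sum(p * log(p) for p in probs if p > 0)

# 10 equally likely stacks carry log(10) nats of uncertainty...
prior = entropy([1/10] * 10)
# ...but a measurement with at most 9 outcomes is capped at log(9).
cap = entropy([1/9] * 9)

assert abs(prior - log(10)) < 1e-12
assert abs(cap - log(9)) < 1e-12
assert cap < prior  # one weighing of 8-coin stacks cannot always identify the outlier
```

The base of the logarithm is immaterial here, since it only rescales both sides of the inequality.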
Note, first, that this is obvious. If there are 9 outcomes for 10 inputs, then (at least) two inputs must map to the same output. (This is the pigeonhole principle.) This is exactly what makes things impossible — we cannot distinguish between two inputs that have the same output. Probably this corresponds to your ordinary understanding of the problem. (But I suppose that doesn't mean the entropy viewpoint will always be fruitless.)
Also note that there is no weighing strategy that will actually have entropy $\log(9)$. Different strategies will have different entropies based on the distribution of results (which is in turn induced by the distribution we initially chose for the unknown information). For example, weighing nothing (or otherwise weighing the same number of coins from each stack) will always give the same result; this is zero entropy. The highest entropy will be when we take different numbers of coins from every stack except for a single pair of stacks from which we take the same number of coins. We can calculate that this has entropy $-\frac{2}{10}\log(\frac{2}{10})-\frac{8}{10}\log(\frac{1}{10}) = \log(5 \cdot 2^{4/5})$.
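The final calculation can be verified directly; in this Python sketch (helper name mine), the maximizing strategy uses each count $0$ through $8$ once and repeats one count on the remaining stack:

```python
from math import log, isclose

def entropy(probs):
    """Shannon entropy (in nats) of a finite distribution."""
    return -sum(p * log(p) for p in probs if p > 0)

# Nine stacks get the distinct counts 0..8; the tenth stack repeats one of
# those counts. The repeated count's outcome occurs with probability 2/10,
# and each of the other 8 outcomes with probability 1/10.
outcome_probs = [2/10] + [1/10] * 8
best = entropy(outcome_probs)

assert isclose(best, log(5 * 2 ** 0.8))  # matches log(5 * 2^(4/5))
assert best < log(9) < log(10)           # strictly below both bounds
```

So even the best achievable strategy falls short of the $\log(9)$ ceiling, consistent with the observation that no strategy makes the 9 outcomes equally likely.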