Entropy of finding error in coffee can packing problem


I can't understand the solution I was given to this problem:

12 coffee cans have each been filled with 12 ounces of coffee. One of the cans has been filled in error and contains either more or less than 12 ounces. We have a balance on which up to 12 cans can fit on either side, and we want to use it to identify the can with the error. What is the minimum number of measurements required to find the can in error?

The solution I was given states: since each of the $12$ cans is equally likely to be the erroneous one, and that can is equally likely to be heavier or lighter, there are $24$ cases, each with probability $\frac{1}{24}$, and the expected information of the problem is $-\sum_{i=1}^{24} \frac{1}{24} \log_2\left(\frac{1}{24}\right) = 3 + \log_2(3)$ bits.

There are $3$ possible outcomes of a measurement: both sides of the balance are equal, the left side is heavier, or the right side is heavier. The information gained from one measurement is therefore at most $\log_2(3)$ bits. Requiring $n \log_2(3) \ge 3 + \log_2(3)$ gives $n \ge \frac{3}{\log_2(3)} + 1 \approx 2.89$, so $n = 3$ is the minimum number of measurements.

I don't understand this solution at all. I understand $P(\text{can} = \text{error}) = \frac{1}{12}$, and $P(\text{can} = \text{lighter}) = P(\text{error}) \cdot P(\text{lighter} \mid \text{error}) = \frac{1}{12} \cdot \frac{1}{2} = \frac{1}{24}$. But I don't follow why there are "24 cases." I see that each can has $3$ states: light, heavy, or normal. So shouldn't the entropy be $-\left(\frac{1}{24} \log_2\left(\frac{1}{24}\right) + \frac{1}{24} \log_2\left(\frac{1}{24}\right) + \frac{11}{12} \log_2\left(\frac{11}{12}\right)\right) = 0.4971502$?

This problem comes from the book: Information-Statistical Data Mining (Sy & Gupta).

Edit.

Formula for entropy: $H(X) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i)$.

If $x$ is the state of a single coffee can, then: $$ p(x) = \begin{cases} 1/24 & \text{can is light} \\ 1/24 & \text{can is heavy} \\ 11/12 & \text{can is normal} \end{cases} $$

So $$ H(X) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i) = -\left(\frac{1}{24} \log_2\left(\frac{1}{24}\right) + \frac{1}{24} \log_2\left(\frac{1}{24}\right) + \frac{11}{12} \log_2\left(\frac{11}{12}\right)\right) = 0.4971502 $$

With $12$ coffee cans, shouldn't the total entropy then be $12 \times 0.497 = 5.965$ bits?
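These numbers can be checked numerically; a quick Python sketch:

```python
import math

# Entropy of a single can's state, as computed above
p = [1/24, 1/24, 11/12]                    # light, heavy, normal
h_one = -sum(q * math.log2(q) for q in p)
print(h_one)        # ≈ 0.4971502
print(12 * h_one)   # ≈ 5.9658, but this treats the cans as independent
```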


3 Answers

Best Answer

As you correctly recalled, the definition of entropy for a discrete random variable $X$, whose possible outcomes $x_1, \dots, x_n$ occur with probabilities $P(x_1), \dots, P(x_n)$, is

$$H(X) = -\sum_{i=1}^{n} P(x_i) \log P(x_i)$$

Different definitions of entropy are obtained by varying the base of the logarithm. One of the most commonly used is base $2$, which gives the entropy in bits or "shannons". Note that the term $-\log P(x_i)$ is the so-called Shannon self-information, a measure that quantifies the "surprise" of a particular outcome, ranging from $0$ for an outcome that occurs with probability $1$ to $\infty$ for an outcome that occurs with probability $0$. In this view, entropy can be interpreted as the expected self-information of the variable.
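As a standard illustration (not specific to this problem): each outcome of a fair coin flip carries $$-\log_2\left(\tfrac{1}{2}\right) = 1 \text{ bit}$$ of self-information, while an outcome that is certain carries $-\log_2(1) = 0$ bits.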

An important point is that the summation used to determine the entropy must run over all possible outcomes. For scenarios involving more than one element (e.g., coins, bins, dice, and so on), all elements must be considered simultaneously, and not only one or part of them, to identify all possible outcomes.

In the question described by the OP, we have $12$ coffee cans: all are filled with the same quantity of coffee except one, which contains either more or less coffee than the others. Looking simultaneously at all elements of the problem, i.e. at all $12$ cans, there are $12$ possibilities for the "different" can, and in each of these cases the difference can be a larger or a smaller coffee quantity. This directly leads to $24$ cases, all equally likely, each with probability $1/24$. Applying the formula we get

$$H = -\sum_{i=1}^{24} \frac{1}{24} \log_2\left(\frac{1}{24}\right) = -\log_2\left(\frac{1}{24}\right) = \log_2(24) = \log_2(8) + \log_2(3) = 3 + \log_2(3)$$

The problem with your calculation giving $H = 0.4971502$ is that you focused on the three possibilities for a single can (normal, light, heavy) instead of considering all possibilities for the whole set of cans. As stated above, in this type of problem it is essential to look at all elements simultaneously.
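To make this explicit, here is a quick Python sketch (my own check, not from the book) that enumerates the $24$ joint outcomes and confirms the entropy:

```python
import math

# A joint outcome is a pair: (which can is off, whether it is heavy or light)
outcomes = [(can, sign) for can in range(12) for sign in ("heavy", "light")]
p = 1 / len(outcomes)                         # all 24 outcomes equally likely
H = -sum(p * math.log2(p) for _ in outcomes)
print(len(outcomes), H, 3 + math.log2(3))     # 24 4.5849625... 4.5849625...
```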

Answer

There are $24$ cases because you also know that only $1$ can is non-conforming; i.e., the cans' states are not independent of each other.
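To see the dependence concretely, here is a small Python sketch counting the joint states of the whole system:

```python
from itertools import product

# Independent cans would give 3**12 joint states (each light/normal/heavy);
# the constraint that exactly one can is off leaves only 24 of them.
states = [s for s in product("LNH", repeat=12)
          if sum(c != "N" for c in s) == 1]
print(len(states), 3 ** 12)   # 24 531441
```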

Answer

Let us first solve the simplest case: reduce the problem to $3$ cans, labeled $A, B, C$. One of them is defective.

Measurement #1: You weigh $A$ and $B$. If they are balanced, then $C$ is defective.

Measurement #2: If $A$ and $B$ are not balanced, remove $A$ from the balance and replace it with $C$. Now if the pans balance, $A$ is defective; if not, $B$ is defective.

Minimum measurements required = $1$

Maximum measurements required = $2$


Let us try $4$ cans. Label them $A, B, C, D$.

Measurement #1: Weigh $A, B$.

If $A, B$ are balanced, remove $A$ and replace it with $C$.

Measurement #2.a: If $B, C$ are balanced, $D$ is defective; otherwise $C$ is defective.

If $A, B$ are not balanced, remove $A$ and replace it with $C$.

Measurement #2.b: If $B, C$ are balanced, $A$ is defective; otherwise $B$ is defective.

Minimum measurements required: $2$

Maximum measurements required: $2$


Now return to the original problem with $12$ cans. We first do a $3$-lot measurement, analogous to the $3$-can measurement above, where each lot contains $4$ cans. The best case identifies the defective lot of $4$ cans in $1$ measurement, the worst case in $2$.

We then run the $4$-can procedure on the defective lot and identify the defective can in $2$ more measurements.

So, minimum measurements is $1 + 2 = 3$.

Maximum measurements is $2 + 2 = 4$.

Now convert this into entropy.

Note: Each measurement reveals more than one bit of information (about the cans on the balance as well as the cans left off it).
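To check the strategy end to end, here is a small Python simulation (my own sketch with hypothetical helper names, not from the book or the answer above); it runs the lot procedure against all $24$ possible scenarios and counts the weighings:

```python
import itertools

def weigh(weights, left, right):
    """Balance reading: -1 if the left pan is lighter, 0 if balanced, +1 if heavier."""
    l, r = sum(weights[i] for i in left), sum(weights[i] for i in right)
    return (l > r) - (l < r)

def find_in_four(weights, cans):
    """The 4-can procedure above; always exactly 2 weighings."""
    a, b, c, d = cans
    if weigh(weights, [a], [b]) == 0:                      # defect is c or d
        return (d if weigh(weights, [c], [b]) == 0 else c), 2
    return (a if weigh(weights, [c], [b]) == 0 else b), 2  # defect is a or b

def find_defective(weights):
    """The lot strategy: locate the defective lot of 4, then search within it."""
    lots = [list(range(0, 4)), list(range(4, 8)), list(range(8, 12))]
    if weigh(weights, lots[0], lots[1]) == 0:
        bad_lot, n = lots[2], 1          # best case: 1 lot weighing
    elif weigh(weights, lots[2], lots[1]) == 0:
        bad_lot, n = lots[0], 2
    else:
        bad_lot, n = lots[1], 2
    can, m = find_in_four(weights, bad_lot)
    return can, n + m

# Run all 24 scenarios: any of the 12 cans may be heavy (+1 oz) or light (-1 oz).
counts = []
for bad, delta in itertools.product(range(12), (1, -1)):
    weights = [12.0] * 12
    weights[bad] += delta
    found, n = find_defective(weights)
    assert found == bad
    counts.append(n)
print(min(counts), max(counts))   # 3 4
```

Every scenario identifies the correct can, using $3$ weighings when the first lot weighing balances and $4$ otherwise, matching the minimum and maximum counts above.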