Calculating expectation value based on previous expectation value - why exactly is it wrong?


I was trying to solve this question. The following is a modified shorter question:

  1. You have $n$ blocks (numbered 1 through $n$) with positive integer weights $w_1, w_2, \ldots, w_n$. You are also given integers $a_1$ through $a_n$, where $a_i$ is 1 if you like the $i$-th block and 0 otherwise.
  2. You will perform $m$ operations. In each operation:
    • the probability that you pick the $i$-th block is proportional to its weight, i.e., $p_i = \frac{w_i}{\sum_j w_j}$
    • you pick exactly one block, and increase its weight by 1 if you like it, or decrease it by 1 if you do not.
  3. Find the expected weight of each block after all the operations are finished.

My approach: repeat the following steps, $m$ times:

  1. Compute $S = \sum\limits_{i=1}^nw_i$.
  2. For each $i$, update $w_i$ to the following value: $w_i = \begin{cases} \displaystyle w_i + \frac{w_i}{S},& \text{if } a_i = 1 \\ \displaystyle w_i - \frac{w_i}{S},& \text{otherwise}\end{cases}$

Basically, this is updating the expected value of all the blocks, based on their expectation values at the previous step.

To me this method feels very intuitive, and I do not have a mathematically rigorous way of proving why this is wrong (I know it's wrong because it failed tests). Could anyone help prove concretely why this is wrong?

Moreover, this approach gives the right answer if all $a_i$ are 1 (that is, every weight always gets incremented). What difference does all $a_i$ being 1 make that produces the right answers?
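One way to pin down the discrepancy concretely is to compare the update rule above against a brute-force computation of the true expectation, enumerating every reachable weight vector with its probability. Here is a minimal Python sketch (the function names and the test instance are my own, not from the original problem):

```python
from fractions import Fraction

def iterative_update(w, a, m):
    """The update rule from the question: push expected weights forward."""
    w = [Fraction(x) for x in w]
    for _ in range(m):
        S = sum(w)
        w = [wi + wi / S if ai else wi - wi / S for wi, ai in zip(w, a)]
    return w

def exact_expectation(w, a, m):
    """True expectation: track the full distribution over weight vectors."""
    dist = {tuple(Fraction(x) for x in w): Fraction(1)}
    for _ in range(m):
        nxt = {}
        for state, p in dist.items():
            S = sum(state)
            for i, (wi, ai) in enumerate(zip(state, a)):
                if wi == 0:        # a zero-weight block is never picked
                    continue
                child = list(state)
                child[i] += 1 if ai else -1
                key = tuple(child)
                nxt[key] = nxt.get(key, Fraction(0)) + p * wi / S
        dist = nxt
    return [sum(p * s[i] for s, p in dist.items()) for i in range(len(w))]

# Already at m = 2 the two disagree, e.g. for weights (1, 3) with a = (1, 0):
# iterative_update([1, 3], [1, 0], 2) -> [45/28, 45/28]
# exact_expectation([1, 3], [1, 0], 2) -> [8/5, 8/5]
```

The brute force is exponential in $m$, so it only serves as a check on small instances, but it makes the failure of the iterative update unambiguous.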

There are 3 best solutions below


The expected value at each step depends on the whole distribution of states, not just on the expected values obtained at the previous step.

As an example, consider starting with $w_1 = w_2 = 1, a_1 = 1, a_2 = 0$, and taking the first step.
Half of the time you get $(w_1, w_2) = (2,1)$, and the other half of the time you get $(1,0)$. The expected value then becomes $(\frac{3}{2}, \frac{1}{2})$.
Clearly, at the next step, the adjustments of the weights aren't equal across all cases. We actually end up with the weighted expected value $\frac{2}{6}(3,1) + \frac{1}{6}(2,0) + \frac{1}{2}(2,0) = (\frac{7}{3}, \frac{1}{3})$, whereas the OP's approach yields $(\frac{3}{2} + \frac{3}{4}, \frac{1}{2} - \frac{1}{4}) = (\frac{9}{4}, \frac{1}{4})$.


When you do iterative updates in dynamic programming, you are essentially assuming that, although there is an $n$-ary tree of possible futures, the value you want for the next layer (the expectation, i.e., the average over all futures) can be computed from the average over the futures of the previous layer (the previous expectation).

The fundamental flaw in your approach, which we shall explore below, is that you are averaging numerators and denominators separately, and certainly $\frac{a}{x} + \frac{b}{y} \neq \frac{a + b}{x + y}$ in general. Let's see why this happens.

Suppose at stage 2 you have 2 possible states, each with some probability:

  • the weight of the concerned element $i$ is $a$, and the sum of all weights is $x$, with probability $P_1$;
  • the weight of the concerned element $i$ is $b$, and the sum of all weights is $y$, with probability $P_2$;

and you want to compute the probability that element $i$ is picked in the next stage. The correct answer is the following: $$P_{\text{select}:i,\ \text{stage}:3} = P_1 \times \frac{a}{x} + P_2 \times \frac{b}{y}$$

However, what you are computing is the average weight $E(w_i) = P_1 \times a + P_2 \times b$, divided by the average sum of weights $E\left(\sum_j w_j\right) = P_1 \times x + P_2 \times y$. So your result will be: $$P^{\text{incorrect}}_{\text{select}:i,\ \text{stage}:3} = \frac{E(w_i)}{E\left(\sum_j w_j\right)} = \frac{P_1 \times a + P_2 \times b}{P_1 \times x + P_2 \times y}$$

You can try this with an example of 2 elements and update for 2 days:

  • Element 1: weight = 1, value decreases on selection
  • Element 2: weight = 2, value increases on selection

The probability of selecting element 1 on day 2, computed correctly, is $\frac{1}{3} \times \frac{0}{2} + \frac{2}{3} \times \frac{1}{4} = \frac{2}{12}$, whereas your method gives $\frac{\frac{1}{3} \times 0 + \frac{2}{3} \times 1}{\frac{1}{3} \times 2 + \frac{2}{3} \times 4} = \frac{2}{10}$.
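The gap between $E\!\left[\frac{w_i}{S}\right]$ and $\frac{E[w_i]}{E[S]}$ in this example can be checked directly; a short Python sketch, with variable names following the notation above:

```python
from fractions import Fraction as F

# Day-1 branches for weights (1, 2): element 1 shrinks, element 2 grows.
#   pick element 1 (p = 1/3) -> element 1 has weight 0, total 2
#   pick element 2 (p = 2/3) -> element 1 has weight 1, total 4
P1, a, x = F(1, 3), F(0), F(2)
P2, b, y = F(2, 3), F(1), F(4)

correct = P1 * a / x + P2 * b / y                 # E[w_1 / S]
averaged = (P1 * a + P2 * b) / (P1 * x + P2 * y)  # E[w_1] / E[S]
# correct == 1/6 and averaged == 1/5: the two quantities genuinely differ
```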

Intuitive Understanding:

You are assuming that, at any particular stage, the weight $w_i$ and the total weight $S$ are independent random variables. This is not necessarily true. Suppose there are 2 starting objects:

  • Object A with weight $a$, which increases in value on being selected.
  • Object B with weight $b$, which decreases in value on being selected.

After the first operation, $w_A = a + 1, w_B = b$ implies a total weight of $a + b + 1$, whereas $w_A = a, w_B = b - 1$ implies a total weight of $a + b - 1$. This shows the correlation between an object's weight and the total weight; therefore, you cannot handle them separately.

However, notice that if all objects increase (or all decrease) in weight on being selected, the total weight at a given stage is deterministic, and hence independent of the tuple of object weights. This is why your method works in exactly this case.
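This special case is easy to confirm computationally: when every block's weight increases on selection, the total after $t$ operations is deterministically $S_0 + t$, and the iterative update reproduces the exact expectation. A sketch in Python (the instance is my own):

```python
from fractions import Fraction

def iterative(w, m):
    """The questioner's update when every block is liked (all a_i = 1)."""
    w = [Fraction(x) for x in w]
    for _ in range(m):
        S = sum(w)                  # deterministic: S_0, S_0 + 1, S_0 + 2, ...
        w = [wi + wi / S for wi in w]
    return w

def exact(w, m):
    """True expectation via the full distribution over weight vectors."""
    dist = {tuple(Fraction(x) for x in w): Fraction(1)}
    for _ in range(m):
        nxt = {}
        for s, p in dist.items():
            S = sum(s)
            for i, wi in enumerate(s):
                key = s[:i] + (wi + 1,) + s[i + 1:]
                nxt[key] = nxt.get(key, Fraction(0)) + p * wi / S
        dist = nxt
    return [sum(p * s[i] for s, p in dist.items()) for i in range(len(w))]

# The two methods agree exactly in the all-increase case,
# e.g. iterative([1, 2], 3) == exact([1, 2], 3)
```

The agreement holds because, with $S_t$ deterministic, $E\!\left[\frac{w_i}{S_t}\right] = \frac{E[w_i]}{S_t}$, which is exactly what the iterative update computes.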