AdaBoost definition: Why does $D_{t+1}(i)= \frac{D_t(i) \exp(-\alpha_t y_ih_t(x_i))}{Z_t}$ entail that correctly predicted examples are not updated?


Consider the algorithm for AdaBoost optimization.

[image: pseudocode of the AdaBoost algorithm]

I think I understand it fully except for a single detail. In theory,

$$\large D_{t+1}(i)= \frac{D_t(i) \exp(-\alpha_t y_ih_t(x_i))}{Z_t}$$

should emphasize the example $i \in S$ in proportion to the error $e_t$ whenever the prediction is wrong, i.e. $h_t(x_i) \neq y_i$. I can see that this is the case, since $D_t(i)$ is updated by a factor of $\exp(-\alpha_t y_ih_t(x_i))$.
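To make the update rule concrete, here is a minimal numeric sketch of one round. The five-example dataset, the uniform initial distribution, and the error rate are assumptions of mine; I also assume the usual choice $\alpha_t = \frac{1}{2}\ln\frac{1-e_t}{e_t}$:

```python
import math

# Assumed toy setup: 5 examples, h_t misclassifies only the last one,
# so under a uniform D_t the weighted error is e_t = 0.2.
D_t = [0.2] * 5
correct = [True, True, True, True, False]  # whether h_t(x_i) == y_i

e_t = sum(w for w, c in zip(D_t, correct) if not c)  # weighted error
alpha_t = 0.5 * math.log((1 - e_t) / e_t)            # assumed standard alpha_t

# Unnormalized update: y_i h_t(x_i) = +1 when correct (factor exp(-alpha_t)),
# and -1 when misclassified (factor exp(+alpha_t)).
unnorm = [w * math.exp(-alpha_t if c else alpha_t) for w, c in zip(D_t, correct)]
Z_t = sum(unnorm)                    # normalizer, so D_{t+1} sums to 1
D_next = [w / Z_t for w in unnorm]

print(D_next)  # correctly predicted weights vs. the misclassified weight
```

In this run the misclassified example's weight grows while the correctly predicted ones shrink after normalization, which is exactly the behavior my question is about.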

However, it should also be the case that example $i \in S$ is not updated (i.e., that $D_{t+1}(i) = D_t(i)$) whenever $h_t(x_i) = y_i$. In other words, we want to leave untouched the pairs $(x_i, y_i)$ that are already correctly predicted. I cannot see how this follows from the definition above, but I am sure I am missing something.

How does it follow from the definition above that pairs $(x_i, y_i)$ satisfying $h_t(x_i) = y_i$ receive the same probability under $D_t$ and $D_{t+1}$?