The log-likelihood function of logit model is
$$\mathcal{L}(\beta)=\sum_n\sum_iy_{ni}\ln P_{ni}$$
where $y_{ni}=1$ if person $n$ chose alternative $i$ and $y_{ni}=0$ otherwise, with
$$P_{ni}=\frac{e^{V_{ni}}}{\sum_j e^{V_{nj}}}$$
and the observed utility is $$V_{ni}=\beta x_{ni}.$$
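To make the setup concrete, here is a small sketch (with made-up data: 4 people, 3 alternatives, a single scalar $\beta$) of how I compute the probabilities $P_{ni}$:

```python
import numpy as np

# Hypothetical example: N = 4 people, J = 3 alternatives, one scalar
# coefficient beta on a single attribute x (all values made up).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))   # x[n, i] = attribute of alternative i for person n
beta = 0.5

V = beta * x                                          # observed utility V_ni
P = np.exp(V) / np.exp(V).sum(axis=1, keepdims=True)  # logit probabilities P_ni

# Sanity check: each person's choice probabilities sum to 1.
assert np.allclose(P.sum(axis=1), 1.0)
```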
At the maximum of the likelihood function, its derivative with respect to each of the parameters is zero
$$\frac{\partial\mathcal{L}(\beta)}{\partial\beta}=0$$
Now I try to compute the derivative myself using the chain rule:
$$\frac{\partial\mathcal{L}(\beta)}{\partial\beta}=\frac{\partial\mathcal{L}(\beta)}{\partial P_{ni}}\cdot\frac{\partial P_{ni}}{\partial e^{V_{ni}}}\cdot\frac{\partial e^{V_{ni}}}{\partial V_{ni}}\cdot\frac{\partial V_{ni}}{\partial \beta}$$
$$=\sum_n\sum_i\bigg[\bigg(\frac{y_{ni}}{P_{ni}}\bigg)\cdot\bigg(\frac{\sum_je^{V_{nj}}-e^{V_{ni}}}{(\sum_je^{V_{nj}})^2}\bigg)\cdot\bigg(e^{V_{ni}}\bigg)\cdot\bigg(x_{ni}\bigg)\bigg]$$
$$=\sum_n\sum_i\bigg[\bigg(\frac{y_{ni}}{P_{ni}}\bigg)\cdot\bigg(\frac{1-P_{ni}}{\sum_je^{V_{nj}}}\bigg)\cdot\bigg(e^{V_{ni}}\bigg)\cdot\bigg(x_{ni}\bigg)\bigg]$$
$$=\sum_n\sum_i\bigg[\bigg(\frac{y_{ni}}{P_{ni}}\bigg)\cdot\bigg({1-P_{ni}}\bigg)\cdot\bigg(\frac{e^{V_{ni}}}{\sum_je^{V_{nj}}}\bigg)\cdot\bigg(x_{ni}\bigg)\bigg]$$
$$=\sum_n\sum_i\bigg[\bigg(\frac{y_{ni}}{P_{ni}}\bigg)\cdot\bigg({1-P_{ni}}\bigg)\cdot\bigg(P_{ni}\bigg)\cdot\bigg(x_{ni}\bigg)\bigg]$$
$$=\sum_n\sum_iy_{ni}({1-P_{ni}})x_{ni}$$
This is my result, but it differs from the textbook's:
$$\text{textbook}=\sum_n\sum_i({y_{ni}-P_{ni}})x_{ni}$$
I have rechecked several times, but I still do not see how the textbook gets the $y_{ni}$ inside the parentheses. Any ideas?
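For what it's worth, I also ran a quick numerical sanity check (synthetic data, scalar $\beta$, all values made up) comparing both expressions against a finite-difference derivative of $\mathcal{L}$. The textbook's expression matches the numeric derivative; mine does not, so the error must be in my chain rule somewhere:

```python
import numpy as np

# Made-up data: N people, J alternatives, one scalar coefficient beta.
rng = np.random.default_rng(1)
N, J = 50, 3
x = rng.normal(size=(N, J))
choices = rng.integers(0, J, size=N)
y = np.eye(J)[choices]                    # y[n, i] = 1 if person n chose i

def probs(beta):
    V = beta * x
    eV = np.exp(V - V.max(axis=1, keepdims=True))   # stabilized softmax
    return eV / eV.sum(axis=1, keepdims=True)

def loglik(beta):
    return np.sum(y * np.log(probs(beta)))          # L = sum_n sum_i y_ni ln P_ni

beta = 0.3
P = probs(beta)
mine     = np.sum(y * (1 - P) * x)        # my result: sum y_ni (1 - P_ni) x_ni
textbook = np.sum((y - P) * x)            # textbook:  sum (y_ni - P_ni) x_ni

# Central finite difference of the log-likelihood at beta.
eps = 1e-6
numeric = (loglik(beta + eps) - loglik(beta - eps)) / (2 * eps)
# In this example, `numeric` agrees with `textbook` but not with `mine`.
```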