I'm reading the paper Likelihood Ratio Tests for Monotone Functions by Moulinath Banerjee and Jon A. Wellner. I got stuck at equality 2.8, where the authors used summation by parts to derive 2.8 from 2.7.
Suppose we have $\Delta_i \sim \text{Bernoulli}(\omega_i)$, and we want to obtain the $\hat{\omega_i}$ that maximize the log-likelihood
$$\Phi(\omega_1, \omega_2, ..., \omega_m) = \sum_{i=1}^m \bigg\{\Delta_i\log\omega_i + (1 - \Delta_i)\log (1-\omega_i)\bigg\}\ ,$$
under the constraint that $$0 < \omega_1 \leq \omega_2 \leq ... \leq \omega_m \leq \theta_0\ .$$
The authors claimed that the maximizer $\hat{\omega_i}$ must satisfy the equality
$$\sum_{i=1}^m \left[\frac{\Delta_i}{\hat{\omega_i}} - \frac{1 - \Delta_i}{1 - \hat{\omega_i}}\right]\hat{\omega_i} = \sum_{i=1}^m \left[\frac{\Delta_i}{\hat{\omega_i}} - \frac{1 - \Delta_i}{1 - \hat{\omega_i}}\right] \theta_0$$ has to hold.
To prove the claim, the authors started from the concavity of $\Phi$, by defining $$S_i = \sum_{j\leq i} \left[ \frac{\Delta_j}{\hat{\omega_j}} - \frac{1 - \Delta_j}{1 - \hat{\omega_j}} \right]\ , i=1, ..., m\ ,$$
then
\begin{align} \frac{d}{dt}\Phi((1-t)\hat{\omega} + t\omega)\bigg\rvert_{t=0} &= \sum_{i=1}^m \left[\frac{\Delta_i}{\hat{\omega_i}} - \frac{1 - \Delta_i}{1 - \hat{\omega_i}}\right](\omega_i - \hat{\omega_i}) \tag{2.7}\\ &= -\sum_{i=1}^m S_i\left[\omega_{i+1} - \omega_i - (\hat{\omega_{i+1}} - \hat{\omega_i})\right] \tag{2.8}\\ &\leq 0\ . \end{align}
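Before chasing the algebra, note that (2.7) and (2.8) should agree as an algebraic identity for *any* vectors $\omega$, $\hat{\omega}$, once the boundary convention $\omega_{m+1} = \hat{\omega}_{m+1} = \theta_0$ is adopted. A quick numerical experiment (my own sanity check, not from the paper) confirms this:

```python
import numpy as np

rng = np.random.default_rng(0)
m, theta0 = 6, 0.8

# random Bernoulli outcomes and candidate monotone points in (0, theta0)
Delta = rng.integers(0, 2, size=m).astype(float)
omega_hat = np.sort(rng.uniform(0.1, theta0, size=m))
omega = np.sort(rng.uniform(0.1, theta0, size=m))

b = Delta / omega_hat - (1 - Delta) / (1 - omega_hat)
S = np.cumsum(b)  # S_i = sum_{j <= i} b_j

lhs = np.sum(b * (omega - omega_hat))  # right-hand side of (2.7)

# boundary convention: omega_{m+1} = omega_hat_{m+1} = theta0
om = np.append(omega, theta0)
omh = np.append(omega_hat, theta0)
rhs = -np.sum(S * ((om[1:] - om[:-1]) - (omh[1:] - omh[:-1])))  # (2.8)

print(np.isclose(lhs, rhs))  # prints True
```

So the step from (2.7) to (2.8) is pure summation by parts plus the boundary convention; no property of the maximizer is used yet.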
What confuses me is how to derive equation (2.8) from (2.7). The authors said they used summation by parts, with the convention $\hat{\omega}_{m+1} = \omega_{m+1} = \theta_0$. Could anyone help me understand how the summation by parts works here?
Define $\sigma_N:=\sum_{n=0}^Na_nb_n$ and $B_n:=\sum_{k=0}^nb_k$. General summation by parts looks like:
$$\sigma_N=a_NB_N-\sum_{n=0}^{N-1}B_n(a_{n+1}-a_n).$$
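To see the identity in action (my own illustration, with arbitrary numbers):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 5
a = rng.normal(size=N + 1)  # a_0, ..., a_N
b = rng.normal(size=N + 1)  # b_0, ..., b_N
B = np.cumsum(b)            # B_n = b_0 + ... + b_n

sigma = np.sum(a * b)                                # sum of a_n * b_n
abel = a[N] * B[N] - np.sum(B[:-1] * np.diff(a))     # summation by parts
print(np.isclose(sigma, abel))  # prints True
```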
So, shifting the start of the index to $i=1$, let $b_i:=\frac{\Delta_i}{\hat{\omega}_i}-\frac{1-\Delta_i}{1-\hat{\omega}_i}$ and $a_i:=\omega_i-\hat{\omega}_i$, noting that the convention $\hat{\omega}_{m+1} = \omega_{m+1} = \theta_0$ gives $a_{m+1}=0$.
Are you able to see the result from here?
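In case the last step is still unclear, here is how I would fill it in (my own write-up): since $S_i = B_i$ in the notation above and $a_{m+1}=0$, the boundary term folds into the sum,
$$\begin{aligned}
\sum_{i=1}^m a_i b_i &= a_m S_m - \sum_{i=1}^{m-1} S_i (a_{i+1} - a_i)\\
&= -S_m (a_{m+1} - a_m) - \sum_{i=1}^{m-1} S_i (a_{i+1} - a_i)\\
&= -\sum_{i=1}^{m} S_i (a_{i+1} - a_i)\ ,
\end{aligned}$$
and substituting $a_i = \omega_i - \hat{\omega}_i$ gives exactly the right-hand side of (2.8).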