Kolmogrov's inequality says that, if $C$ is a convex set, and $P_C(x)$ is an operator for projecting point $x$ into the convex set $C$, if $z = P_C(x)$, then for any $y \in C$ we have $$ (z - y).(x - y) \leq 0 $$
My question is about a results in [1] (Page 17, Proof of Proposition 11): It is saying that, given a strongly convex function $R(\mathbf{w})$ and a projection onto convex set $\mathcal{K}$,
$$ \mathbf{w}_{t+1} = \Pi_{\phi_t, \mathcal{K}}( \bar{\mathbf{w}}_{t+1} ), $$ defined as: $$ \Pi_{\phi_t, \mathcal{K}}( \bar{\mathbf{w}}_{t+1} ) \triangleq \arg\min_{ \mathbf{w} \in \mathcal{K} }D_{\phi_t} ( \mathbf{w}, \tilde{\mathbf{w}}_{t+1}). $$ It says that this holds (Page 17, Proof of Proposition 11): $$ \langle \nabla R(\mathbf{w}_t) - \nabla R(\mathbf{w}_{t+1} ), \mathbf{w}_{t} - \mathbf{w}_{t+1} \rangle \leq \langle \nabla R( \tilde {\mathbf{w}}_t) - \nabla R(\tilde { \mathbf{w}}_{t+1} ), \mathbf{w}_{t} - \mathbf{w}_{t+1} \rangle $$
Question: why the above inequality holds?
[1] http://www-stat.wharton.upenn.edu/~rakhlin/courses/stat991/papers/lecture_notes.pdf
To answer this, you must write down the optimality condition,
given an optimization problem as:
$\mathbf{x^*}= \operatorname{argmin}_{\mathbf{x} \in \mathcal{K}} f(\mathbf{x}) $
The optimality condition asserts that for all $\mathbf{x} \in \mathcal{K}$
$\nabla f(\mathbf{x}^*)^\top(\mathbf{x}-\mathbf{x^*}) \ge 0 $
Writing this condition for the optimal point $\mathbf{w}_{t+1}$ and $\mathbf{w}_{t}$, noting that $\nabla f(\mathbf{w}_{t+1})=\nabla \mathcal{R}(\mathbf{w}_{t+1})-\nabla \mathcal{R}(\mathbf{\hat{w}}_{t+1})$ you get to the inequality that you mentioned.