We are working with discrete-time stochastic processes.
Let $v_k$ be an $\mathcal F_k$-predictable process, and let $X_k, \eta_k$ be $\mathcal F_k$-adapted processes. Define $V_k = v_kX_k+\eta_k$ and $\Delta X_{k+1} =X_{k+1}-X_k$. It is claimed that if one wishes to minimize $$ \text{Var}[V_{k+1}-v_{k+1}\Delta X_{k+1} \mid \mathcal F_k] $$ with respect to $v_{k+1}$, then the minimum is achieved if and only if $$ \text{Cov}[V_{k+1}-v_{k+1}\Delta X_{k+1}, \Delta X_{k+1} \mid \mathcal F_k]=0 $$ (the source is equation (3.3), p. 13, in this paper). I am unable to verify this.
I realise that since $v_{k+1}$ is $\mathcal F_k$-predictable, it is constant conditioned on $\mathcal F_k$. It therefore seems akin to minimizing $$ \text{Var}[Y-aZ \mid \mathcal G] $$ with respect to a constant $a$ for some random variables $Y, Z$ and a sigma-algebra $\mathcal G$. I have tried expanding this and the covariance to get $$ \text{Var}[Y-aZ \mid \mathcal G] = \mathbb E[(Y-aZ)^2 \mid \mathcal G]-\mathbb E[Y-aZ \mid \mathcal G]^2 $$ $$ \text{Cov}(Y-aZ, Z \mid \mathcal G) = \mathbb E[(Y-aZ)Z \mid \mathcal G]-\mathbb E[Y-aZ\mid \mathcal G]\mathbb E[Z\mid \mathcal G] $$
but I am unable to see the connection. I am not sure what I am missing. I would also imagine that there is probably a more direct way than expanding the terms. Any pointers greatly appreciated.
Using your second notation, bilinearity of the conditional covariance gives
$$\text{Var}(Y-aZ|\mathcal{G})=\text{Var}(Y|\mathcal{G})+a^2\text{Var}(Z|\mathcal{G})-2a\text{Cov}(Y,Z|\mathcal{G})$$
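The first-order condition for the minimum is
$$\frac{d}{d a}\text{Var}(Y-aZ|\mathcal{G})=2a\text{Var}(Z|\mathcal{G})-2\text{Cov}(Y,Z|\mathcal{G})=-2\text{Cov}(Y-aZ,Z|\mathcal{G})=0$$
where the middle equality uses $\text{Cov}(Y-aZ,Z|\mathcal{G})=\text{Cov}(Y,Z|\mathcal{G})-a\text{Var}(Z|\mathcal{G})$,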
so that $$a^*=\frac{\text{Cov}(Y,Z|\mathcal{G})}{\text{Var}(Z|\mathcal{G})}$$
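Also,
$$\frac{d^2}{d a^2}\text{Var}(Y-aZ|\mathcal{G})=2\text{Var}(Z|\mathcal{G})>0$$
whenever $\text{Var}(Z|\mathcal{G})>0$, so the conditional variance is a strictly convex quadratic in $a$. The vanishing of the first derivative is therefore both necessary and sufficient for the minimum, which is the "if and only if" in the claim.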
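As a quick sanity check, the argument can be tested numerically. The sketch below is only an illustration under simplifying assumptions: it takes $\mathcal{G}$ to be trivial (so conditional moments become ordinary sample moments), and the Gaussian data, the slope `1.5`, and the helper `cov` are hypothetical choices that do not come from the paper.

```python
import numpy as np

# Assumption: trivial sigma-algebra G, so Var(.|G) and Cov(.,.|G)
# reduce to ordinary (sample) variance and covariance.
rng = np.random.default_rng(0)
n = 200_000
Z = rng.normal(size=n)
Y = 1.5 * Z + rng.normal(size=n)  # hypothetical data, true slope 1.5


def cov(a, b):
    """Sample covariance of two 1-D arrays (population normalisation)."""
    return np.mean((a - a.mean()) * (b - b.mean()))


# Closed-form minimiser a* = Cov(Y, Z) / Var(Z)
a_star = cov(Y, Z) / cov(Z, Z)

# At a*, the residual Y - a*Z should be uncorrelated with Z
resid_cov = cov(Y - a_star * Z, Z)

# Brute-force check: evaluate Var(Y - aZ) on a grid centred at a*
grid = a_star + np.linspace(-1.0, 1.0, 201)
variances = np.array([cov(Y - a * Z, Y - a * Z) for a in grid])
a_grid_min = grid[np.argmin(variances)]

print(a_star, resid_cov, a_grid_min)
```

The residual covariance at `a_star` should vanish up to floating-point error, and the grid minimiser should coincide with `a_star`, consistent with the orthogonality condition characterising the minimum.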