Given the following two expressions, I want to show that Eqn (2) $-$ Eqn (1) $\geq 0$ under some conditions.
Eqn (1)
$$(Ab - ADEb)^T(Ab - ADEb)$$
Eqn (2)
$$(Ab - AD^*Eb)^T(Ab - AD^*Eb)$$
where all of the vectors/matrices are real:
- $A$ is $m \times n$
- $b$ is $n \times 1$
- $D$ and $D^*$ are both $n \times p$, but $D \neq D^*$ (note that $D^*$ is not the conjugate transpose of $D$).
- $E$ is $p \times n$
The first thing I tried was to expand both Eqn (1) and Eqn (2).
Eqn (1)
\begin{align*} (Ab - ADEb)^T(Ab - ADEb) &= b^TA^TAb - 2b^TE^TD^TA^TAb + b^TE^TD^TA^TADEb \end{align*}
Eqn (2)
\begin{align*} (Ab - AD^*Eb)^T(Ab - AD^*Eb) &= b^TA^TAb - 2b^TE^T{D^*}^TA^TAb + b^TE^T{D^*}^TA^TAD^*Eb \end{align*}
Then I take the difference: Eqn (2) - Eqn (1)
\begin{align*} \text{Eqn (2)} - \text{Eqn (1)} &= -2b^TE^T{D^*}^TA^TAb + b^TE^T{D^*}^TA^TAD^*Eb + 2b^TE^TD^TA^TAb - b^TE^TD^TA^TADEb\\ &= -2b^TE^T({D^*}^TA^T - D^TA^T)Ab + b^TE^T({D^*}^TA^TAD^* - D^TA^TAD)Eb\\ &\overset{?}{\geq} 0 \end{align*}
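The expanded difference above can be sanity-checked numerically. The following is a minimal sketch, assuming NumPy and random real matrices; the dimensions, the seed, and names such as `D_star` and `sq_norm` are illustrative choices, not part of the problem statement. It compares the difference computed directly from the two squared norms with the grouped form on the second line of the derivation.

```python
import numpy as np

# Hypothetical test data: random real matrices with arbitrary small dimensions.
rng = np.random.default_rng(0)
m, n, p = 5, 4, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal((n, 1))
D = rng.standard_normal((n, p))
D_star = rng.standard_normal((n, p))  # a different matrix, not a conjugate transpose
E = rng.standard_normal((p, n))


def sq_norm(D_mat):
    """Return (Ab - A D_mat E b)^T (Ab - A D_mat E b) as a float."""
    r = A @ b - A @ D_mat @ E @ b
    return (r.T @ r).item()


# Eqn (2) - Eqn (1), computed directly from the definitions.
direct = sq_norm(D_star) - sq_norm(D)

# The same difference using the grouped expansion.
G = A.T @ A
linear = -2 * (b.T @ E.T @ (D_star.T @ A.T - D.T @ A.T) @ A @ b).item()
quadratic = (b.T @ E.T @ (D_star.T @ G @ D_star - D.T @ G @ D) @ E @ b).item()
expanded = linear + quadratic

print(direct, expanded, np.isclose(direct, expanded))
```

Agreement on random data does not prove the identity, but it is a cheap guard against sign or transpose slips in the algebra.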
This brings me to
\begin{align*}2b^TE^T({D^*}^TA^T - D^TA^T)Ab &\overset{?}{\leq} b^TE^T({D^*}^TA^TAD^* - D^TA^TAD)Eb \end{align*}
It's clear that the LHS = RHS if $D^* = D$. My question is, how can I go about coming up with conditions for which the strict inequality, $$2b^TE^T({D^*}^TA^T - D^TA^T)Ab < b^TE^T({D^*}^TA^TAD^* - D^TA^TAD)Eb$$ would hold?
Denote the matrix variable by $X$ (so that $X = D$ or $X = D^*$) and define the vector function
$$v(X) = AXEb-Ab$$
Its squared norm $v^Tv$ is the expression in the question, since the overall sign of $v$ does not matter. Vectorizing the matrix $X$ on the RHS, with $x = {\rm vec}(X)$ and $M = b^TE^T\otimes A$, yields
$$\eqalign{ v &= {\rm vec}(AXEb) - Ab \\ &= (b^TE^T\otimes A)\,{\rm vec}(X) - Ab \\ &= Mx-Ab \\ }$$
Write down the squared magnitude of this vector and calculate its gradient.
$$\eqalign{ \mu &= v^Tv \\ d\mu &= 2v^Tdv = 2v^TM\,dx = 2(M^Tv)^Tdx \\ \frac{\partial\mu}{\partial x} &= 2M^Tv \;=\; g \qquad\big({\rm the\,gradient}\big) \\ }$$
You are interested in $\mu$ at two different values of $x$ such that
$$\eqalign{ \mu(x^*) &> \mu(x) \\ \Delta \mu \equiv \mu(x^*) - \mu(x) &> 0 \\ }$$
The difference $(x^*-x)$ can be parameterized by a direction $(s)$ and a length $(\lambda>0)$
$$\eqalign{ s = \frac{x^*-x}{\|x^*-x\|} ,\qquad \|s\| = 1 ,\qquad (x^*-x)=\lambda s }$$
Since $\mu$ is quadratic in $x$, its expansion about $x$ terminates after the second-order term, so with $g$ evaluated at $x$
$$\eqalign{ \Delta\mu &= g^T(\lambda s) + (\lambda s)^TM^TM(\lambda s) \\ &= \lambda\,(g^Ts) + \lambda^2\|Ms\|^2 \\ }$$
To first order in $\lambda$ this is $\lambda(g^Ts)$, and because $\lambda^2\|Ms\|^2 \geq 0$, a sufficient condition for $\Delta\mu > 0$ is
$$g^Ts \; > 0$$
The exact condition is $g^Ts + \lambda\|Ms\|^2 > 0$. In words, it suffices that the projection of the unit vector $s$ onto the gradient $g$ is positive. This is closely related to the concept of the directional derivative.
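The construction in this answer can also be checked numerically. The sketch below assumes NumPy, column-major vectorization for ${\rm vec}$, and random test matrices; `X`, `X_star`, and the helpers `vec`, `v`, `mu` are illustrative names. Because $\mu$ is quadratic in $x$, the difference $\Delta\mu$ splits into a first-order term $\lambda\, g^Ts$ and a second-order term $\lambda^2\|Ms\|^2$, and the code compares that split with the directly computed difference.

```python
import numpy as np

# Hypothetical test data: random real matrices with arbitrary small dimensions.
rng = np.random.default_rng(1)
m, n, p = 5, 4, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(n)
E = rng.standard_normal((p, n))
X = rng.standard_normal((n, p))       # plays the role of D
X_star = rng.standard_normal((n, p))  # plays the role of D^*


def vec(Z):
    """Stack the columns of Z into one vector (column-major order)."""
    return Z.reshape(-1, order="F")


def v(Z):
    """Residual vector v(Z) = A Z E b - A b."""
    return A @ Z @ E @ b - A @ b


def mu(Z):
    """Squared norm of the residual."""
    r = v(Z)
    return float(r @ r)


# M = (b^T E^T) kron A, of shape m x (n*p), so that v = M vec(X) - A b.
M = np.kron((E @ b).reshape(1, -1), A)
x, x_star = vec(X), vec(X_star)
vec_identity_ok = np.allclose(M @ x - A @ b, v(X))

# Gradient at x, and the direction/length parameterization of x_star - x.
g = 2 * M.T @ v(X)
lam = np.linalg.norm(x_star - x)
s = (x_star - x) / lam

delta_mu = mu(X_star) - mu(X)
first_order = lam * float(g @ s)
second_order = lam**2 * float((M @ s) @ (M @ s))

print(vec_identity_ok, delta_mu, first_order + second_order)
```

Whenever `g @ s` is positive, `delta_mu` is positive as well, because the second-order term is a squared norm and cannot be negative.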