I intend to solve for the vector $ x \in \mathbb{R}^{N \times 1} $ via the following optimization problem:
$$\begin{aligned} \arg \min_{x} \quad & \operatorname{Tr} \left( {\left( \mathbf{K} \mathbf{W} \right)}^{T} \mathbf{P} \left( \mathbf{K} \mathbf{W} \right) \right) - 2 \operatorname{Tr} \left( \mathbf{P} \mathbf{K} \mathbf{W} \right) \\ \text{subject to} \quad & x_{i}^{\min} \leq x_{i} \leq x_{i}^{\max} \; \forall i \end{aligned}$$ where $\operatorname{Tr} \left( \cdot \right)$ is the trace operator, $\mathbf{P} \in \mathbb{R}^{M \times M}$, $\mathbf{W} = \mathbf{A} \operatorname{diag} \left( \mathbf{B} x \right) \in \mathbb{R}^{D \times M}$, and $\mathbf{K} \in \mathbb{R}^{M \times D}$.
$\mathbf{A} \in \mathbb{R}^{D \times M}$ and $\mathbf{B} \in \mathbb{R}^{M \times N}$ are both positive matrices.
How do I solve it as an inequality constrained optimization problem for $x$?
The problem is given by:
$$\begin{aligned} \arg \min_{x} \quad & \operatorname{Tr} \left( {\left( K A \operatorname{diag} \left( B x \right) \right)}^{T} \left( K A \operatorname{diag} \left( B x \right) \right) \right) - 2 \operatorname{Tr} \left( P K A \operatorname{diag} \left( B x \right) \right) \\ \text{subject to} \quad & {a}_{i} \leq {x}_{i} \leq {b}_{i} \; \forall i \end{aligned}$$
The difficulty here is the $ \operatorname{diag} \left( \cdot \right) $ operator, which makes it hard to derive the gradient.
Yet:
$$ \operatorname{diag} \left( B x \right) = I \circ \left( \boldsymbol{1} {\left( B x \right)}^{T} \right) = I \circ \left( \boldsymbol{1} {x}^{T} {B}^{T} \right) $$
Where $ \circ $ is the Hadamard Product.
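This identity is easy to sanity check numerically; here is a small NumPy check with random placeholder $ B $ and $ x $:

```python
import numpy as np

rng = np.random.default_rng(1)
M_dim, N = 4, 3
B = rng.standard_normal((M_dim, N))
x = rng.standard_normal(N)

d = B @ x
lhs = np.diag(d)  # diag(Bx): vector -> diagonal matrix
# I ∘ (1 (Bx)^T): Hadamard product of the identity with the rank-one matrix
rhs = np.eye(M_dim) * (np.ones((M_dim, 1)) @ d[None, :])
print(np.allclose(lhs, rhs))  # True
```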
Now you can plug this in and use some matrix calculus to find the gradient (writing the first term as a Frobenius norm turns out to be useful).
Once you have the gradient you can solve it easily with the Projected Gradient Descent method, since the projection onto the box constraints is just element-wise clipping.
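A minimal sketch of Projected Gradient Descent in NumPy, using random placeholder data (the sizes, the PSD choice of $ P $, and the box $ [-1, 1]^N $ are all assumptions for illustration). Because $ \operatorname{diag} \left( B x \right) $ is diagonal, both traces collapse to sums over diagonal entries, which gives a cheap closed-form gradient:

```python
import numpy as np

# Placeholder problem data (sizes and values are assumptions for the sketch)
rng = np.random.default_rng(0)
M_dim, D, N = 6, 5, 4
K = rng.standard_normal((M_dim, D))
A = rng.standard_normal((D, M_dim))
C = rng.standard_normal((M_dim, M_dim))
P = C.T @ C                       # P chosen PSD so the problem is convex
B = rng.standard_normal((M_dim, N))
lo, hi = -np.ones(N), np.ones(N)  # box constraints x_i^min, x_i^max

M = K @ A
# With d = Bx:  Tr(diag(d) M^T P M diag(d)) = sum_j (M^T P M)_jj d_j^2
#               Tr(P M diag(d))             = sum_j (P M)_jj d_j
c1 = np.diag(M.T @ P @ M)  # coefficients of d_j^2
c2 = np.diag(P @ M)        # coefficients of d_j

def objective(x):
    KW = K @ (A @ np.diag(B @ x))
    return np.trace(KW.T @ P @ KW) - 2.0 * np.trace(P @ KW)

def gradient(x):
    d = B @ x
    return B.T @ (2.0 * c1 * d - 2.0 * c2)

# Step size from an upper bound on the Lipschitz constant of the gradient
step = 1.0 / (2.0 * c1.max() * np.linalg.norm(B, 2) ** 2)

# Projected Gradient Descent: gradient step, then clip back into the box
x = np.zeros(N)
for _ in range(2000):
    x = np.clip(x - step * gradient(x), lo, hi)
```

The same box-constrained problem could also be handed to an off-the-shelf solver (e.g. an L-BFGS-B style method) once the gradient above is available.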
Using properties of the trace operator you can probably derive better equivalent forms of the problem that take advantage of $ A $ and $ B $ being positive definite matrices. In that case, solving for $ y = B x $ is equivalent to solving for $ x $.
Remark:
I think the question "Given $ y = A x $ where $ A $ is a PD matrix and it is known that $ {a}_{i} \leq {x}_{i} \leq {b}_{i} $, what can be said about $ y $ (namely, how it is bounded)?" deserves its own question.
Update
Thinking about it, one could calculate the gradient of the Frobenius norm term directly (with $ A $ standing in for the product $ K A $ from the problem above):
$$ \frac{\mathrm{d} }{\mathrm{d} x} \frac{1}{2} {\left\| A \operatorname{diag} \left( B x \right) \right\|}_{F}^{2} = {B}^{T} \operatorname{diag} \left( {A}^{T} A \operatorname{diag} \left( B x \right) \right) $$ where the inner $ \operatorname{diag} \left( \cdot \right) $ maps a vector to a diagonal matrix while the outer one extracts the diagonal of a matrix as a vector.
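As a sanity check on that formula, a finite-difference comparison with random placeholder $ A $, $ B $, $ x $ (note that NumPy's `np.diag` conveniently implements both meanings of $ \operatorname{diag} $: it extracts the diagonal of a matrix and builds a diagonal matrix from a vector):

```python
import numpy as np

rng = np.random.default_rng(2)
D, M_dim, N = 4, 5, 3
A = rng.standard_normal((D, M_dim))
B = rng.standard_normal((M_dim, N))
x = rng.standard_normal(N)

def f(x):
    return 0.5 * np.linalg.norm(A @ np.diag(B @ x), "fro") ** 2

# Closed-form gradient: B^T diag(A^T A diag(Bx))
g = B.T @ np.diag(A.T @ A @ np.diag(B @ x))

# Central finite differences along each coordinate
eps = 1e-6
g_fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                 for e in np.eye(N)])
print(np.allclose(g, g_fd, rtol=1e-4, atol=1e-6))  # True
```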