I have a convex function in $X$ given by $\operatorname{Trace}(AX^TBX)$, where $A$ and $B$ are p.s.d. and all entries are real.
Now suppose a linear constraint $l(X)$ prevents the trivial zero-matrix solution when minimizing $\operatorname{Trace}(AX^TBX)$ over $X$. Would setting the first derivative (gradient) of $\operatorname{Trace}(AX^TBX) + \lambda\, l(X)$ with respect to $X$ to zero and solving for $X$ suffice to obtain the optimal $X$, given the convexity, or are there further conditions (directional derivatives and so forth) to consider?
For your convenience, the gradient is $(B \otimes A^T + B^T \otimes A)\operatorname{vec}(X) + \lambda \nabla_X l(X)$, where $\nabla_X l(X)$ does not involve $X$ (it is constant, since $l$ is linear). $\lambda$ is the Lagrange multiplier enforcing the linear constraint.
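Here is a quick numerical sanity check of that gradient (a sketch; the dimensions, the random symmetric p.s.d. choice of $A$, $B$, and the use of NumPy's row-major `flatten` as the vectorization are my assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
# Symmetric p.s.d. A, B as in the question; X an arbitrary real matrix
A = rng.standard_normal((n, n)); A = A @ A.T
B = rng.standard_normal((n, n)); B = B @ B.T
X = rng.standard_normal((n, n))

f = lambda X: np.trace(A @ X.T @ B @ X)

# Matrix form of the gradient: d/dX Tr(A X^T B X) = B X A + B^T X A^T
grad = B @ X @ A + B.T @ X @ A.T

# Kronecker form from the question, with vec taken as row-major flatten
vec_grad = (np.kron(B, A.T) + np.kron(B.T, A)) @ X.flatten()

# Central finite differences (exact for a quadratic up to roundoff)
eps = 1e-6
num = np.zeros_like(X)
for i in range(n):
    for j in range(n):
        E = np.zeros((n, n)); E[i, j] = eps
        num[i, j] = (f(X + E) - f(X - E)) / (2 * eps)

print(np.max(np.abs(num - grad)))              # tiny: matrix form agrees
print(np.max(np.abs(vec_grad - grad.flatten())))  # tiny: Kronecker form agrees
```

Both errors come out at roundoff level, so the Kronecker expression is consistent with the matrix gradient under row-major vectorization.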
Yes: for a differentiable convex function, a point is a global minimizer if and only if the gradient is $0$ there. For your constrained problem, note that setting the gradient of the Lagrangian to zero gives one equation in two unknowns ($X$ and $\lambda$); you solve it jointly with the constraint $l(X) = 0$ itself. Since the objective is convex and the constraint is affine, stationarity plus feasibility (the KKT conditions) is sufficient for global optimality, and no further directional-derivative checks are needed.
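As a minimal numerical sketch of "stationarity plus the constraint", take the hypothetical linear constraint $l(X) = \operatorname{Trace}(C^TX) - 1 = 0$ (my choice for illustration) with symmetric positive definite $A$, $B$, and solve the KKT system as one linear solve:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n)); A = A @ A.T + n * np.eye(n)  # symmetric p.d.
B = rng.standard_normal((n, n)); B = B @ B.T + n * np.eye(n)
C = rng.standard_normal((n, n))  # hypothetical constraint l(X) = Tr(C^T X) - 1

f = lambda X: np.trace(A @ X.T @ B @ X)

# Hessian of the objective in vectorized (row-major) form
H = np.kron(B, A.T) + np.kron(B.T, A)
c = C.flatten()

# KKT system: stationarity H x + lam*c = 0 solved jointly with c^T x = 1
K = np.block([[H, c[:, None]], [c[None, :], np.zeros((1, 1))]])
rhs = np.zeros(n * n + 1); rhs[-1] = 1.0
sol = np.linalg.solve(K, rhs)
X_star, lam = sol[:-1].reshape(n, n), sol[-1]

# Sanity check: no feasible perturbation improves the objective
for _ in range(5):
    D = rng.standard_normal((n, n))
    D -= (np.sum(C * D) / np.sum(C * C)) * C  # project onto Tr(C^T D) = 0
    assert f(X_star) <= f(X_star + 0.1 * D) + 1e-9
```

The point is that $\lambda$ is not free: the bordered linear system determines $X$ and $\lambda$ together, and by convexity the resulting $X$ is the global constrained minimizer.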