I am trying to take the derivative of a scalar function with respect to a matrix. The scalar function includes an augmented vector. The derivative I am interested in is:
$$\frac{df([a|(\vec{b})^T\mathbf{X}]\vec{c})}{d\mathbf{X}}$$
Here $a$ is a constant, $\vec{b}$ is of dimension $m\times 1$, $\mathbf{X}$ is of dimension $m\times r$, and $\vec{c}$ is of dimension $(r+1)\times 1$.
I can work out the derivative if the $1\times r$ vector $(\vec{b})^T\mathbf{X}$ is not augmented; however, I am unsure how to handle the augmented part.
You can rewrite $$ [a| \mathbf b^T\mathbf{X}]\mathbf c = a c_1 + \mathbf b^T\mathbf X \mathbf d, $$ where $\mathbf d$ is the column vector $\mathbf d = (c_2,c_3,\dots,c_{r+1})^T$, i.e. $\mathbf c$ with its first entry removed. Let $g$ denote the function $g(\mathbf X) = a c_1 + \mathbf b^T\mathbf X \mathbf d$. The function that you are considering is the composition $f(g(\mathbf X))$.
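With that, we can use the chain rule: $$ \frac{d}{d\mathbf X}f(g(\mathbf X)) = f'(g(\mathbf X)) \cdot \frac{dg}{d\mathbf X} $$ Notably, $f'(g(\mathbf X))$ is a scalar. The form of $\frac{dg}{d\mathbf X}$ depends on your layout convention. Since $\mathbf b^T\mathbf X\mathbf d = \sum_{i,j} b_i X_{ij} d_j$ and the constant term does not depend on $\mathbf X$, we have $\frac{\partial g}{\partial X_{ij}} = b_i d_j$. The denominator-layout form of $\frac{dg}{d\mathbf X}$ (which is what you would use for gradient descent) is therefore the $m\times r$ matrix $$ \frac{dg}{d\mathbf X} = \mathbf b \mathbf d^T. $$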
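As a sanity check, the result can be compared against central finite differences. The sketch below is only an illustration: the question leaves $f$ unspecified, so `f = tanh` is an arbitrary stand-in, and the helper names (`scalar_fn`, `analytic_grad`, `numeric_grad`) and the random test values are likewise assumptions, not part of the original problem.

```python
import numpy as np

# Assumption: f is not specified in the question; tanh is an arbitrary smooth choice.
def f(u):
    return np.tanh(u)

def f_prime(u):
    return 1.0 - np.tanh(u) ** 2

def scalar_fn(X, a, b, c):
    # Build the augmented row [a | b^T X] of length r + 1, then apply f to its product with c.
    row = np.concatenate(([a], b @ X))
    return f(row @ c)

def analytic_grad(X, a, b, c):
    # Denominator layout: f'(g(X)) * b d^T, with d = c without its first entry.
    d = c[1:]
    g = a * c[0] + b @ X @ d
    return f_prime(g) * np.outer(b, d)

def numeric_grad(X, a, b, c, eps=1e-6):
    # Central finite differences, one entry of X at a time.
    grad = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            step = np.zeros_like(X)
            step[i, j] = eps
            grad[i, j] = (scalar_fn(X + step, a, b, c)
                          - scalar_fn(X - step, a, b, c)) / (2 * eps)
    return grad

# Hypothetical test data with m = 4, r = 3.
rng = np.random.default_rng(0)
m, r = 4, 3
a = 2.0
b = rng.normal(size=m)
X = rng.normal(size=(m, r))
c = rng.normal(size=r + 1)

max_err = np.max(np.abs(analytic_grad(X, a, b, c) - numeric_grad(X, a, b, c)))
print("max abs difference:", max_err)
```

The analytic gradient has the same $m\times r$ shape as $\mathbf X$, which is what makes the denominator layout convenient for an update of the form $\mathbf X \leftarrow \mathbf X - \eta\,\frac{df}{d\mathbf X}$. Note that $a$ and $c_1$ enter only through the value of $g$ inside $f'$, not through $\mathbf b\mathbf d^T$.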