Suppose we have a function $f: \mathbb{R}^{d-1} \to \mathbb{R}$ given by
$$ f(x_1, \ldots, x_{d-1}) = \operatorname{trace} \left( A^T \mathbf{1}_{n\times 1} [ x_1, x_2, \ldots, x_{d-1}] B\right)$$
where $A,B$ are $n\times d$ and $(d-1)\times d$ matrices respectively. I'm trying to compute the gradient of this function. Currently I am just trying to expand out the entire expression inside the trace and compute the gradient that way, but it's quite messy and I'm not able to push it through. Can anyone point out a better approach? Thank you.
If you know about the Fréchet derivative and its relation to the gradient of a function, this question becomes almost a triviality. For this question, the key result is the following:

> **Theorem.** Let $T: V \to W$ be a continuous linear map between normed vector spaces. Then $T$ is Fréchet differentiable at every $x \in V$, and $dT_x = T$.
What this theorem says is that a linear transformation is its own best linear approximation (i.e., it is its own derivative). Now, note that the function $f: \mathbb{R}^{d-1} \to \mathbb{R}$ you have defined is linear (because the trace is linear, and matrix multiplication is linear in each factor). So, for every $x = (x_1, \dots, x_{d-1}) \in \mathbb{R}^{d-1}$, by the theorem above, we have \begin{equation} df_x(\cdot) = f(\cdot) \end{equation} In general, the gradient of $f$ at $x$ is the matrix of $df_x$ relative to the standard basis. So, \begin{align} \nabla f(x) &= \text{matrix of $df_x$ wrt standard basis} \\ &= \text{matrix of $f$ wrt standard basis} \\ &= \begin{bmatrix} f(e_1) & \dots & f(e_{d-1}) \end{bmatrix} \end{align}
In other words, $\nabla f(x)$ is the $1 \times (d-1)$ matrix whose $i^{th}$ entry is $f(e_i)$; I'll leave it to you to compute what $f(e_i)$ is.
As you mentioned, expanding everything out in terms of components and taking partial derivatives is a nightmare in this case. This is why, if you're not already familiar with the Fréchet derivative, I highly recommend you learn more about it... it simplifies computations like this immensely. As a reference, I would highly recommend Loomis and Sternberg's book Advanced Calculus (section 3.6 in particular) to learn about this.
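If you want to sanity-check your answer once you've computed the $f(e_i)$'s, here is a small numerical sketch (with randomly chosen $A$, $B$, and arbitrary small $n$, $d$ picked just for illustration): it builds the gradient entry-by-entry as $f(e_i)$, per the theorem, and compares it against a central finite-difference approximation of $\nabla f$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 4, 3  # arbitrary small sizes for the check
A = rng.standard_normal((n, d))        # A is n x d
B = rng.standard_normal((d - 1, d))    # B is (d-1) x d
ones = np.ones((n, 1))                 # the column vector 1_{n x 1}

def f(x):
    # x is a length-(d-1) vector; x[None, :] is the 1 x (d-1) row [x_1, ..., x_{d-1}]
    return np.trace(A.T @ ones @ x[None, :] @ B)

# Gradient via the theorem: the i-th entry of grad f is f(e_i)
I = np.eye(d - 1)
grad = np.array([f(I[i]) for i in range(d - 1)])

# Central finite-difference approximation of grad f at a random point
# (f is linear, so the gradient is the same at every point)
x0 = rng.standard_normal(d - 1)
h = 1e-6
fd = np.array([(f(x0 + h * I[i]) - f(x0 - h * I[i])) / (2 * h)
               for i in range(d - 1)])

print(np.allclose(grad, fd, atol=1e-5))
```

Since $f$ is linear, the finite-difference quotient should agree with $\big(f(e_1), \dots, f(e_{d-1})\big)$ up to floating-point error, regardless of the base point $x_0$.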