Let $C \in \mathbb{R}^{d \times d}$ be symmetric, and
$$Q = \begin{bmatrix} \vert & \vert & & \vert \\ q_1 & q_2 & \dots & q_K \\ \vert & \vert & & \vert \end{bmatrix} \in \mathbb{R}^{d\times K}$$
where $d \geq K$. Using Lagrange multipliers,
$$\begin{array}{ll} \text{maximize} & \mbox{tr} \left( Q^T C Q \right)\\ \text{subject to} & Q^T Q = I\end{array}$$
I am unfamiliar with this kind of constraint in this method, but after reading another post I believe the same simple result given there also applies here, so the Lagrangian would be:
$$\mathcal{L}(Q,\lambda)=\mathrm{tr}(Q^TCQ)-\left<\lambda,Q^TQ-I\right>$$
where $\lambda\in\mathbb{R}^{K\times K}$ and $\left<\cdot,\cdot\right>$ is the element-wise inner product (which makes sense to me, since we are actually adding as many constraints as there are elements in these matrices).
In attempting to do this I start by setting $\frac{\partial \mathcal{L}}{\partial Q}=O\in\mathbb{R}^{d\times K}$ and computing the LHS element by element; for the $(l,m)$ entry:
\begin{equation} 0=\frac{\partial \mathcal{L}}{\partial Q_{lm}}=(CQ+C^TQ)_{lm}-\underbrace{\frac{\partial}{\partial Q_{lm}}\sum_{i,j}\lambda_{i,j}(Q^TQ-I)_{ij}}_{=\lambda_{lm}\frac{\partial (Q^TQ)_{lm}}{\partial Q_{lm}}}=2(CQ)_{lm}-\lambda_{lm}\frac{\partial (q_l^Tq_m)}{\partial q_m(l)} \tag{1}\end{equation}
where in the last step I've used the definition I made at the beginning for $Q$, and $q_m(l)$ denotes the $l$-th component of the column vector $q_m$.
In trying to compute the very last term: $$\frac{\partial (q_l^Tq_m)}{\partial q_m(l)}=\frac{\partial \left[q_l(1)q_m(1)+ \ldots + q_l(d)q_m(d)\right]}{\partial q_m(l)}= \begin{cases} q_l(l)\equiv Q_{ll} & \text{if } l\neq m\\ 2q_l(l)\equiv 2Q_{ll} & \text{if } l=m \end{cases}$$
The whole equality (1) then can be written:
$$0=2(CQ)_{lm}-\lambda_{lm}Q_{ll}(1+\delta_{lm})$$
where $\delta_{lm}$ is the Kronecker delta.
The equation for the other stationarity condition of the Lagrangian, $\frac{\partial \mathcal{L}}{\partial \lambda}=O\in\mathbb{R}^{K\times K}$, again for the $(l,m)$ element:
$$ 0=\frac{\partial \mathcal L}{\partial \lambda_{lm}}= \frac{\partial }{\partial \lambda_{lm}}\sum_{i,j}\lambda_{i,j}(Q^TQ-I)_{ij}=(Q^TQ-I)_{lm}\tag{2}$$
which simply recovers the constraint $(Q^TQ)_{lm}=\delta_{lm}$.
All of this should show that the columns of $Q$ are the first $K$ eigenvectors of $C$, but I don't know how to continue from here to prove that, assuming I haven't made a mistake. I would sincerely appreciate any help.
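For what it's worth, the expected result can be sanity-checked numerically. Here is a small sketch with numpy (the sizes $d=5$, $K=2$ and the random seed are arbitrary choices, not part of the problem): over many random feasible $Q$, $\mathrm{tr}(Q^TCQ)$ never exceeds its value at the top-$K$ eigenvectors, which equals the sum of the $K$ largest eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K = 5, 2  # illustrative sizes (an assumption, not from the problem)

# Random symmetric C
A = rng.standard_normal((d, d))
C = (A + A.T) / 2

# Candidate maximizer: eigenvectors of the K largest eigenvalues
evals, evecs = np.linalg.eigh(C)        # eigenvalues in ascending order
Q_star = evecs[:, -K:]                  # top-K eigenvectors
best = np.trace(Q_star.T @ C @ Q_star)  # should equal the sum of the top-K eigenvalues

# Compare against many random feasible Q (orthonormal columns via QR)
for _ in range(1000):
    Q, _ = np.linalg.qr(rng.standard_normal((d, K)))
    assert np.trace(Q.T @ C @ Q) <= best + 1e-9

print(np.isclose(best, evals[-K:].sum()))  # True
```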
Edit:
I have rewritten the inner product as a trace of a product of matrices (after seeing this question):
$$\left<\lambda,Q^TQ-I\right>=\sum_{i,j}\lambda_{ij}(Q^TQ-I)_{ij}=\mathrm{tr}\!\left(\lambda^T(Q^TQ-I)\right) $$
and have thus managed to do the derivative without losing the matrix format (using formulas from the Matrix Cookbook):
\begin{align} O=&\frac{\partial \mathcal{L}}{\partial Q}=\frac{\partial}{\partial Q}\mathrm{tr}(Q^TCQ)-\frac{\partial}{\partial Q}\underbrace{\mathrm{tr}(\lambda^T(Q^TQ-I))}_{\mathrm{tr}(\lambda^TQ^TQ)-\mathrm{tr}(\lambda^T)}\\=&(CQ+C^TQ)-(Q(\lambda^T)^T+Q\lambda^T)=2CQ-Q(\lambda+\lambda^T) \end{align}
And this leads to:
$$CQ=Q\underbrace{\left(\frac{\lambda+\lambda^T}{2}\right)}_{:=\widetilde{\lambda}}$$
If the defined matrix $\widetilde{\lambda}=Q^TCQ$ were diagonal we would already have the result.
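The stationarity condition can also be checked numerically. In this sketch (numpy; sizes are arbitrary) $Q$ is built from $K$ eigenvectors of $C$, and indeed $CQ=Q\widetilde{\lambda}$ holds with $\widetilde{\lambda}=Q^TCQ$ diagonal:

```python
import numpy as np

rng = np.random.default_rng(1)
d, K = 6, 3  # illustrative sizes
A = rng.standard_normal((d, d))
C = (A + A.T) / 2

# Q built from K eigenvectors of C satisfies the stationarity condition
evals, evecs = np.linalg.eigh(C)  # ascending eigenvalues
Q = evecs[:, -K:]
lam_tilde = Q.T @ C @ Q           # the multiplier matrix from the derivation

print(np.allclose(C @ Q, Q @ lam_tilde))            # True: CQ = Q*lambda_tilde
print(np.allclose(lam_tilde, np.diag(evals[-K:])))  # True: diagonal here
```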
Since $C$ is real symmetric we can write $C=U \Lambda U^T$, where $\Lambda$ is a diagonal matrix of eigenvalues. Substituting $\tilde Q = U^T Q$, which still satisfies $\tilde Q^T \tilde Q = Q^T U U^T Q = Q^T Q = I$, we may as well assume $C = \Lambda = \operatorname{diag} (\lambda_1,...,\lambda_d)$, where $\lambda_1 \ge \cdots \ge \lambda_d$.
The problem is then $\max_{Q^TQ=I} \operatorname{tr}(Q^T \Lambda Q)$.
Note that $\operatorname{tr}(Q^T \Lambda Q) = \operatorname{tr}(Q^T Q Q^T \Lambda Q) = \operatorname{tr}( Q Q^T \Lambda QQ^T) = \operatorname{tr}(P^T \Lambda P)$, where $P=Q Q^T$.
Note that $P$ is an orthogonal projection onto a subspace of dimension $K$. Furthermore, any such orthogonal projection can be written in the form $Q Q^T$, where $Q^TQ = I$.
So now the problem is $\max_{P \text{ orthogonal projection}, \text{ rk } P=K} \operatorname{tr}(P^T \Lambda P)$.
Note that $\operatorname{tr}(P^T \Lambda P) = \sum_{n=1}^d \lambda_n \|P e_n\|^2$. Furthermore, note that $\|P\|_F^2 = K$ and so $\sum_{n=1}^d \|P e_n\|^2 = K$ with $0 \le \|P e_n\|^2 \le 1$. ($e_n$ is the $n$th unit vector.)
It is straightforward to check that $\max\{ \sum_{n=1}^d \lambda_n \mu_n \mid \sum_{n=1}^d \mu_n = K,\ 0 \le \mu_n \le 1 \}$ is $\lambda_1+\cdots+ \lambda_K$: the objective is linear in $\mu$, so the maximum puts full weight $\mu_n = 1$ on the $K$ largest $\lambda_n$.
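This small linear program can be verified directly, e.g. with scipy's `linprog` (a sketch with illustrative eigenvalues; `linprog` minimizes, so the objective is negated):

```python
import numpy as np
from scipy.optimize import linprog

lam = np.array([5.0, 3.0, 2.0, 1.0, -1.0])  # illustrative eigenvalues, sorted descending
d, K = len(lam), 2

# maximize sum(lam * mu)  s.t.  sum(mu) = K,  0 <= mu_n <= 1
res = linprog(-lam, A_eq=np.ones((1, d)), b_eq=[K], bounds=[(0, 1)] * d)
print(np.isclose(-res.fun, lam[:K].sum()))  # True: optimum is lam_1 + ... + lam_K
```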
Hence $\operatorname{tr}(P^T \Lambda P) \le \lambda_1+\cdots+ \lambda_K$, and by choosing the range of $P$ to be $\operatorname{span}\{e_1,...,e_K \}$ (i.e. $Q = [e_1 \ \cdots \ e_K]$) we see that the bound is attained.
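Attainment can be checked in one line of numpy (again with illustrative eigenvalues): taking $Q=[e_1\ \cdots\ e_K]$, the projection $P=QQ^T$ onto $\operatorname{span}\{e_1,...,e_K\}$ makes $\operatorname{tr}(P^T \Lambda P)$ hit the bound.

```python
import numpy as np

lam = np.array([5.0, 3.0, 2.0, 1.0, -1.0])  # illustrative eigenvalues, sorted descending
d, K = len(lam), 2
Lam = np.diag(lam)

Q = np.eye(d)[:, :K]  # columns e_1, ..., e_K
P = Q @ Q.T           # orthogonal projection onto span{e_1, ..., e_K}
print(np.isclose(np.trace(P.T @ Lam @ P), lam[:K].sum()))  # True: bound attained
```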