I wonder why many books say that the Q-matrix in the Algebraic Riccati Equations:
$$A^T X + X A - X B R^{-1} B^T X + Q = 0 $$ or
$$X = A^T X A -(A^T X B)(R + B^T X B)^{-1}(B^T X A) + Q$$
is often set to $Q = C^TC$ when finding the steady-state Kalman gain matrix $K$ in the Kalman filter state update:
$$\hat x(k+1) = \hat x(k) + K(y(k) - C\hat x(k))$$
or when finding the optimal control law:
$$u = r - Lx(k+1)$$
Why $Q = C^TC$? I think it is great that there is at least one concrete choice for the $Q$ matrix, instead of only being told that $Q$ should be greater than $0$. But what results will I get if I always set $Q=C^TC$?
The choice $Q = C^T C$ puts the cost directly on the system output: with $y = Cx$, the state penalty becomes $x^T Q x = x^T C^T C x = y^T y$. Since the output is often what we want to control, this is a simple choice that can be used as a first try.
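As a quick numerical check of this, the sketch below uses a made-up output matrix `C` and state `x` (both are illustrative assumptions, not taken from any particular system) and compares the state cost $x^T C^T C x$ with the squared output norm $y^T y$ for $y = Cx$.

```python
# Hypothetical 1-output, 2-state system: y = C x
C = [[1.0, 2.0]]
x = [3.0, -1.0]

def transpose(M):
    """Transpose a matrix stored as nested lists."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Matrix-matrix product for nested lists."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def matvec(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]

Q = matmul(transpose(C), C)   # Q = C^T C
y = matvec(C, x)              # y = C x

state_cost = sum(xi * qxi for xi, qxi in zip(x, matvec(Q, x)))  # x^T Q x
output_cost = sum(yi * yi for yi in y)                           # y^T y

print(state_cost, output_cost)  # the two values agree
```

Note that $Q = C^T C$ is only positive semidefinite when there are fewer outputs than states, as in this example, so state directions that do not show up in the output are not penalized at all.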
In practice, however, the states (and therefore the outputs) can be on very different scales, so it might not make much sense to weight the outputs like that.
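To make the scaling issue concrete, here is a small sketch assuming a hypothetical sensor that reports a position and an angle directly, so that `C` is diagonal. The matrices and the helper name `output_weight` are made up for illustration. Switching the position reading from meters to centimeters multiplies its implied weight in $C^T C$ by $10^4$, although the physical system has not changed.

```python
def output_weight(C):
    """Return Q = C^T C for a matrix given as nested lists."""
    n = len(C[0])
    return [[sum(row[i] * row[j] for row in C) for j in range(n)]
            for i in range(n)]

# Hypothetical state: x = [position in meters, angle in radians]
C_meters = [[1.0, 0.0],         # position reported in meters
            [0.0, 1.0]]         # angle reported in radians
C_centimeters = [[100.0, 0.0],  # same position, now reported in centimeters
                 [0.0, 1.0]]

Q_meters = output_weight(C_meters)
Q_centimeters = output_weight(C_centimeters)

print(Q_meters)
print(Q_centimeters)
```

The relative importance of position versus angle in the cost is therefore decided by the sensor units rather than by the control objective.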
What works best depends on your objectives, but there are various tuning rules for choosing the $Q$ matrix. See, for example, this lecture note by R. M. Murray, pages 2 and 3. The tuning rule there is based on selecting a diagonal $Q$
$$ Q = \text{diag}(\begin{bmatrix} q_{1} & q_{2} & \dots & q_{n} \end{bmatrix}) $$
Then you choose, based on your system knowledge, values that are "just good enough" for your purpose. For example (adapted from the reference linked above): say $x_1$ is a position error and $x_2$ an angular position error. You decide
$$ \begin{align} 1 \text{ cm} = \tfrac{1}{100} \text{ m position error is "still OK"} &\Rightarrow q_1 = 100^2 \newline \frac{1}{60} \text{ rad angular error is "still OK"} &\Rightarrow q_2 = 60^2 \end{align} $$
Each weight is the inverse square of the tolerated error, so that $q_i x_i^2 = 1$ when $x_i$ sits exactly at its tolerance (here with $x_1$ in meters and $x_2$ in radians). You would then choose your $Q$ matrix as
$$ Q = \text{diag}(\begin{bmatrix} q_{1} & q_{2} \end{bmatrix}) = \begin{bmatrix} q_1 & 0 \newline 0 & q_2 \end{bmatrix} = \begin{bmatrix} 10000 & 0 \newline 0 & 3600 \end{bmatrix} $$
As you can see, the weights depend strongly on the units and on the tolerances: had $x_1$ been measured in centimeters, $q_1$ would be $1$ instead of $10^4$. $Q = C^T C$ takes neither into account, so it might be a poor choice here, depending on how both $x_1$ and $x_2$ are measured.
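The tuning rule can be sketched in a few lines. The helper name `tolerance_weights` is made up for illustration, and the sketch assumes that each weight is the inverse square of the largest error you still accept (a scaling commonly known as Bryson's rule), with $x_1$ expressed in meters and $x_2$ in radians.

```python
def tolerance_weights(tolerances):
    """Diagonal Q with q_i = 1 / tol_i**2, so q_i * x_i**2 is 1 at x_i == tol_i."""
    n = len(tolerances)
    return [[1.0 / tolerances[i] ** 2 if i == j else 0.0 for j in range(n)]
            for i in range(n)]

# Assumed units: x_1 in meters, x_2 in radians.
tolerances = [0.01, 1.0 / 60.0]   # 1 cm and 1/60 rad are "still OK"
Q = tolerance_weights(tolerances)

# A state sitting exactly at its tolerance contributes a cost of about 1.
contributions = [Q[i][i] * tolerances[i] ** 2 for i in range(len(tolerances))]

for row in Q:
    print(row)
print(contributions)
```

Expressing the same 1 cm tolerance in centimeters (`tolerances[0] = 1.0`) changes the first weight by a factor of $10^4$, which is why the units should be fixed before picking numbers.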