Let us look at this simple $(1+1)$-dimensional transport equation:
$$\partial_t u+a\partial_x u = 0.$$
Solving this by the method of characteristics, we obtain the following characteristic equations:
$$\frac{dt}{1}=\frac{dx}{a}=\frac{du}{0}.$$
From the last part, we find $u = c_1$ and from $dt = dx/a$ we obtain $x-at=c_2$.
Up to this point I can explain everything that is happening. But then it is claimed that $c_1$ is a function of $c_2$, hence $u=c_1=F(x-at)$. I can apply this method, but I cannot explain why one constant should be related to the other. Is this just using the fact that a constant can be expressed as a function of another constant? That seems like too much hand-waving to me.
Think of it this way: suppose you want a function $u=u(x,t)$ that is constant in time along a path given by the curve $x=x(t)$, i.e. $$\frac{du(x(t),t)}{dt}=\frac{\partial u}{\partial t}(x(t),t)+\frac{\partial u}{\partial x}(x(t),t)\frac{dx(t)}{dt}=0\tag{*}$$ Comparing this with your PDE $\partial_t u+a\partial_x u=0$, equation $(*)$ holds for every solution of the PDE provided the path satisfies $$\frac{dx(t)}{dt}= a\tag{**}$$
Therefore the characteristic lines, along which the solution $u(x,t)$ of your PDE is constant, are $x-x_0=at$, i.e. $$u(x_0+at,t) = \text{constant}.$$ Setting $t=0$ shows that this constant is $u(x_0,0)$. It is the same at every point of one line but may differ from line to line, so it depends only on which line you are on, that is, on the single parameter $x_0$. In your notation, $c_2=x-at=x_0$ labels the line and $c_1$ is the value of $u$ on that line, which is why $c_1=F(c_2)$ for some function $F$. Writing $F(x_0)=u(x_0,0)$ and substituting $x_0=x-at$ from the characteristic curve, one has: $$u(x_0+at,t)=F(x_0)\quad\Longrightarrow\quad u(x,t)=F(x-at)$$
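If it helps to see this concretely, here is a minimal numerical sketch in Python. The profile `F`, the speed `a = 2.0`, and the helper names are all assumptions picked for illustration; any differentiable profile should behave the same way. It samples $u$ along a few lines $x=x_0+at$ (the value stays fixed on each line but changes from line to line) and estimates $\partial_t u+a\partial_x u$ with central differences.

```python
import math

a = 2.0  # transport speed (assumed value, for illustration only)

def F(s):
    # Arbitrary smooth profile (assumption); it plays the role of u(x, 0).
    return math.exp(-s * s)

def u(x, t):
    # Candidate solution: depends on (x, t) only through x - a*t.
    return F(x - a * t)

def residual(x, t, h=1e-5):
    # Central-difference estimate of u_t + a*u_x at the point (x, t).
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    return u_t + a * u_x

def values_along_characteristic(x0, times):
    # Sample u on the characteristic line x = x0 + a*t.
    return [u(x0 + a * t, t) for t in times]

if __name__ == "__main__":
    times = [0.0, 0.5, 1.0, 2.0]
    for x0 in (-1.0, 0.0, 0.7):
        # One row per characteristic: the entries in a row agree,
        # while different rows (different x0) show different values.
        print(x0, values_along_characteristic(x0, times))
    # Should be close to zero, up to finite-difference error.
    print(residual(0.3, 0.4))
```

Each printed row corresponds to one value of $c_2=x_0$, and the repeated number in that row is the corresponding $c_1$. The map from row label to row value is exactly the function $F$.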