Let $L$ be a linear differential operator and consider the PDE $$\begin{cases} u_t + Lu = 0, \quad x \in \mathbb{R}^n\\ u = f, \quad t = 0\end{cases} \tag{1}$$ It is known that we may construct a continuous semigroup $S(t)$ whose infinitesimal generator is $-L$, and thus for all $t > 0$, $u(t) = S(t)f$ for sufficiently nice $f$.
What I am confused about is how one arrives at the often-seen expression $S(t) = e^{-Lt}$. For example, on its own it is not clear what it means to apply $e^{-Lt}$ to a function $f$, and it seems that what is actually meant is to apply $S(t)$ to $f$.
I know that this is inspired by the fact that if in (1) we replace $L$ by a constant matrix $A$, then the resulting ODE has the solution $u(t) = e^{-At}f$. I also know that some meaning can be given to the exponential of an operator $A$, either by an infinite series (if $A$ is bounded) or by using the functional calculus (if $A$ is self-adjoint).
However, in neither of these two cases do I see why $S(t) = e^{-Lt}$. Is there any justification for this, or is it only a formal expression or a definition? In other words, do we define $e^{-Lt}$ as "the semigroup whose infinitesimal generator is $-L$", with the exponential having no further meaning here?
Time evolution has an exponential type of property. For example, suppose you have a state vector $x_1$ at $t=t_1$, and you want to know what the state vector will evolve to become at some $t_2 > t_1$. Then you can symbolically write $$ x_2 = S(t_2,t_1)x_1. $$ This assumes, of course, that $x_2$ is unique, which is the case for any well-posed time evolution problem. You can see that there must be an exponential type of description because, for $t_3 > t_2$, $$ x_3=S(t_3,t_2)x_2=S(t_3,t_2)S(t_2,t_1)x_1, $$ which implies that $$ S(t_3,t_2)S(t_2,t_1)=S(t_3,t_1). $$ If the system is time-independent, then $S$ depends only on the difference of its two arguments and not on the arguments themselves, which leads to a time-invariant formulation $S(t'',t')=\mathscr{S}(t''-t')$. The end result is a simple exponential property: $$ \mathscr{S}(t_b)\mathscr{S}(t_a)=\mathscr{S}(t_b+t_a). $$ In other words, time evolution has a simple exponential property when the system itself does not depend on time. As you might expect, in this case one is able to write $$ \mathscr{S}(t)=e^{tA}. $$ With suitable continuity in $t$, such a representation is generally possible for linear systems.
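To make the exponential property concrete, here is a minimal numerical sketch in the finite-dimensional case only, where $A$ is a matrix and $e^{tA}$ is given by its power series. The matrix `A`, the times `t_a` and `t_b`, and the helper names `expm_series` and `S` are illustrative assumptions, not part of the question; in the notation of (1), `A` would play the role of $-L$. It checks $\mathscr{S}(t_b)\mathscr{S}(t_a)=\mathscr{S}(t_b+t_a)$ and that $(\mathscr{S}(h)-I)/h$ approaches $A$ for small $h$.

```python
import numpy as np

def expm_series(M, terms=40):
    """Truncated power series sum_{k < terms} M^k / k!.

    Adequate for small matrices of modest norm; not a general-purpose routine.
    """
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k      # term now holds M^k / k!
        result = result + term
    return result

def S(t, A):
    """Evolution operator S(t) = exp(tA) for a constant matrix A."""
    return expm_series(t * A)

# Hypothetical 2x2 generator, chosen only for illustration.
A = np.array([[0.0, 1.0],
              [-2.0, -0.3]])
t_a, t_b = 0.4, 0.7

# Semigroup (exponential) property: S(t_b) S(t_a) = S(t_a + t_b).
semigroup_gap = np.linalg.norm(S(t_b, A) @ S(t_a, A) - S(t_a + t_b, A))

# Generator recovered as a difference quotient at t = 0.
h = 1e-6
generator_gap = np.linalg.norm((S(h, A) - np.eye(2)) / h - A)

print(semigroup_gap, generator_gap)
```

Both printed quantities should be small: the first at the level of floating-point rounding, the second of order `h`. For an unbounded operator the power series is no longer available, which is where the semigroup construction takes over.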