A locally integrable function $f$ defines a distribution from $\mathcal{D}(U)$ to $\mathbb{R}$, where $U$ is an open subset of $\mathbb{R}^n$.
Why is it acceptable to say that $L^1_{\text{loc}}$ (the space of locally integrable functions) is a subspace of $\mathcal{D}'(U)$ (the space of distributions over $U$), given that $L^1_{\text{loc}}$ contains functions, not functionals?
Right! I think one great virtue of L. Schwartz's idea of treating "distributions" as continuous linear functionals was that it made their existence, and some other basics, completely rigorous.
At the same time, in general the dual space of a topological vector space does not naturally contain the original space. In the case of distributions, the possibility to view them as generalized functions, enlarging the class of "functions", is/was very important. E.g., now we can differentiate a step function... and get a Dirac delta at the jump. And there are various theorems showing that (locally, or with other hypotheses) every distribution is a sufficiently high-order derivative of a continuous function... and similar.
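To make the step-function example concrete, here is a sketch of the computation, assuming $U = \mathbb{R}$, taking $H$ to be the Heaviside function ($H(x) = 1$ for $x > 0$ and $H(x) = 0$ otherwise), and using the usual convention $\langle T', \varphi \rangle = -\langle T, \varphi' \rangle$ for $\varphi \in \mathcal{D}(\mathbb{R})$:

$$\langle H', \varphi \rangle = -\langle H, \varphi' \rangle = -\int_0^\infty \varphi'(x)\,dx = \varphi(0) = \langle \delta, \varphi \rangle.$$

The term at infinity vanishes because $\varphi$ has compact support, so $H' = \delta$ as distributions.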
But/and, yes, we should be sure to remember that the "functional" given by an $L^1$ function is really integrate-against-it. When I teach courses introducing these ideas, I repeatedly write "(integration-against-) $L^1$ function $f$"...
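Spelled out, with $T_f$ as my own ad hoc name for the functional attached to $f \in L^1_{\text{loc}}(U)$:

$$T_f(\varphi) = \int_U f(x)\,\varphi(x)\,dx, \qquad \varphi \in \mathcal{D}(U).$$

The map $f \mapsto T_f$ is linear, and $T_f = 0$ forces $f = 0$ almost everywhere (this is the du Bois-Reymond lemma, also called the fundamental lemma of the calculus of variations). Since elements of $L^1_{\text{loc}}$ are already taken modulo almost-everywhere equality, the map is injective, and that injectivity is what justifies speaking of $L^1_{\text{loc}}$ as a subspace of $\mathcal{D}'(U)$.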
There are some relatively minor normalization and/or notational complications arising from identifying $f$ with integration-against-$f$, but these are easy to accommodate.