For $p = 1$, it's clear because for $f \in L^1(\mathbb R^n)$, one can define a tempered distribution by $T_{f}(\varphi) := \int_{\mathbb R^n} f(x) \varphi(x) \, dx$.
Apparently, one can argue that $\mathcal S(\mathbb R^n) \subset L^p(\mathbb R^n)$ implies that the opposite inclusion holds for the dual spaces. Why is that?
Regarding the "abstract nonsense" statement "$A \subset B \Rightarrow B' \subset A'$", the important thing there is that the inclusion mapping $i : A \to B$, $i(x) = x$, is continuous. Given $F \in B'$, the restriction of $F$ to $A$ is $F \circ i$. If $i$ is continuous, then $F \circ i$ will be continuous. So the restriction is in $A'$. We identify the restriction with the original functional $F$.
This identification is an abuse of notation: strictly speaking, an element of $B'$ can never be an element of $A'$ or vice versa. But it is a reasonable abuse of notation, especially in cases like this one (with $1 \le p < \infty$), in which $A$ is dense in $B$, so that the restriction uniquely determines $F$. This situation, in which $A$ is a dense subset of $B$ with a continuous inclusion mapping, is very common in functional analysis.
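As a brief sketch of why density makes the restriction determine $F$ (the second functional $G$ is introduced here only for illustration): suppose $F, G \in B'$ have the same restriction to $A$. Then $F - G$ is continuous, so its kernel is closed, and

$$
A \subset \ker(F - G) \quad\Longrightarrow\quad B = \overline{A} \subset \ker(F - G) \quad\Longrightarrow\quad F = G .
$$

So the map $F \mapsto F \circ i$ from $B'$ to $A'$ is injective, which is what justifies writing $B' \subset A'$.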
You can expand this out into a statement about norms and seminorms in this particular case, but this is really the main point.
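For what it's worth, here is a sketch of how that expansion might go, assuming $1 \le p < \infty$ and fixing an integer $N > n/p$ (the exponent $N$ and the constant $C_{N,p}$ are notation introduced for this illustration). For $\varphi \in \mathcal S(\mathbb R^n)$,

$$
\|\varphi\|_{L^p}^p
= \int_{\mathbb R^n} (1+|x|)^{-Np} \bigl( (1+|x|)^{N} |\varphi(x)| \bigr)^p \, dx
\le C_{N,p}^{\,p} \Bigl( \sup_{x \in \mathbb R^n} (1+|x|)^{N} |\varphi(x)| \Bigr)^{p},
\qquad
C_{N,p}^{\,p} := \int_{\mathbb R^n} (1+|x|)^{-Np} \, dx < \infty .
$$

The integral is finite precisely because $Np > n$. This says the inclusion $i : \mathcal S(\mathbb R^n) \to L^p(\mathbb R^n)$ is continuous. Consequently, for any $F \in (L^p(\mathbb R^n))'$,

$$
|F(\varphi)| \le \|F\| \, \|\varphi\|_{L^p} \le \|F\| \, C_{N,p} \, \sup_{x \in \mathbb R^n} (1+|x|)^{N} |\varphi(x)| ,
$$

and the right-hand side is a continuous seminorm on $\mathcal S(\mathbb R^n)$, so $F \circ i$ is a tempered distribution.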