Distributions are defined on the set of test functions, that is, infinitely differentiable functions with compact support (bump functions); call it $\mathscr{D}(\mathbb{R})$. To extend the Fourier transform to distributions, one instead takes the space of rapidly decreasing functions (Schwartz space); call it $\mathscr{S}(\mathbb{R})$.
Now it is clear that $\mathscr{D}(\mathbb{R})\subset\mathscr{S}(\mathbb{R})$, but I cannot see why $\mathscr{S}^\star(\mathbb{R})\subset\mathscr{D}^\star(\mathbb{R})$, where $X^\star$ denotes the dual of $X$.
Is it true that $\mathscr{D}(\mathbb{R})\subset\mathscr{S}(\mathbb{R})$ and $\mathscr{S}^\star(\mathbb{R})\subset\mathscr{D}^\star(\mathbb{R})$?
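In general, if a vector space $A$ is contained in another vector space $B$, then any functional defined on all of $B$ can be restricted to $A$, which gives a map $B' \to A'$. For the algebraic dual nothing more is needed to define this map. For the topological dual (which is what you have here) you need some "topological compatibility": the inclusion of $A$ into $B$ must be continuous, so that the restriction of a continuous functional is still continuous. It is sufficient, for example, for $A$ to carry the subspace topology inherited from $B$; in your case $\mathscr{D}(\mathbb{R})$ carries a finer topology than the one inherited from $\mathscr{S}(\mathbb{R})$, which is still enough. To regard the restriction map as a genuine inclusion $B' \subset A'$ you also need it to be injective, and this holds when $A$ is dense in $B$, as $\mathscr{D}(\mathbb{R})$ is in $\mathscr{S}(\mathbb{R})$.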
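Spelled out as a short sketch (the symbols $\iota$ for the inclusion $A \hookrightarrow B$ and $\rho$ for the restriction map are my own notation, not taken from the question):

$$\rho\colon B' \to A', \qquad \rho(T) = T\circ\iota = T|_A .$$

If $\iota$ is continuous, then $\rho(T)$ is a composition of continuous maps, so it really lies in $A'$. If in addition $A$ is dense in $B$ and $\rho(T)=0$, then $T$ vanishes on $A$ and hence, by continuity, on $\overline{A}=B$, so $T=0$. That is the sense in which $B'$ sits inside $A'$: each $T\in B'$ is identified with its restriction to $A$.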
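The more interesting question is whether the inclusion is strict. In the case of $\mathscr{S}$ and $\mathscr{D}$ the answer is yes. Schwartz functions are only required to decay faster than the reciprocal of any polynomial at infinity, whereas test functions vanish outside a compact set, so a functional can weight its argument by something that grows arbitrarily fast and still be defined on $\mathscr{D}$. For example, $L[f]=\int_{\mathbb{R}} e^x f(x)\,dx$ is a continuous functional on $\mathscr{D}$, but it has no continuous extension to $\mathscr{S}$: the integral already diverges for the Schwartz function $f(x)=e^{-\sqrt{1+x^2}}$, and approximating this $f$ by truncations in $\mathscr{D}$ shows that no continuous extension can exist.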
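One way to check strictness by hand is to test the regular distribution $T[f]=\int_{\mathbb{R}} e^x f(x)\,dx$ on an explicit sequence. The following is a sketch; the bump $\psi$ and the sequence $\varphi_n$ are choices made for illustration. Fix a nonzero $\psi\in\mathscr{D}(\mathbb{R})$ with $\psi\ge 0$ and support in $[0,1]$, and set

$$\varphi_n(x) = e^{-n/2}\,\psi(x-n), \qquad n = 1, 2, \dots$$

Each $\varphi_n$ is supported in $[n,n+1]$, so for all integers $k,m\ge 0$

$$\sup_x \left|x^k \varphi_n^{(m)}(x)\right| \le (n+1)^k\, e^{-n/2}\, \sup_y \left|\psi^{(m)}(y)\right| \xrightarrow[n\to\infty]{} 0,$$

which means $\varphi_n\to 0$ in $\mathscr{S}(\mathbb{R})$. On the other hand, substituting $y = x-n$,

$$T[\varphi_n] = e^{-n/2}\int_{\mathbb{R}} e^{x}\,\psi(x-n)\,dx = e^{n/2}\int_0^1 e^{y}\,\psi(y)\,dy \;\ge\; e^{n/2}\int_0^1 \psi(y)\,dy \xrightarrow[n\to\infty]{} \infty .$$

A functional that is continuous for the topology of $\mathscr{S}$ would have to send $\varphi_n$ to a sequence converging to $0$, so $T$ is an element of $\mathscr{D}'$ that is not the restriction of any element of $\mathscr{S}'$.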