On one hand, if $V_1$ is smaller than $V_2$, then $V_1^*$ is bigger than $V_2^*$, because every element of $V_2^*$ also defines a continuous linear functional on $V_1$. For example, $\mathcal D(\mathbb{R}) \subset C_c(\mathbb{R})$, and we have $C_c(\mathbb{R})^* \subset \mathcal D^*(\mathbb{R})$ in the sense that every Radon measure defines a distribution. The collection of distributions is large (it contains tempered distributions, measures, and $L^p$ functions) because its domain, the space of test functions, is very small.
On the other hand, if $V_1$ is smaller than $V_2$, then $V_1^*$ is smaller than $V_2^*$. The easiest example is $\mathbb{R}^n$: each $\mathbb{R}^n$ is isomorphic to its dual, so a bigger space has a bigger dual space. Also, the sequence $\{n\chi_{[0,1/n)}\}$ converges to $\delta_0$ in the sense of distributions. But viewed as a sequence in $(L^\infty)^*$, although it has no weak-* convergent subsequence, it has a weak-* convergent subnet whose limit is a linear functional that acts like $\delta_0$ on continuous functions but differently on other $L^\infty$ functions. So in this case, when we restrict the domain, many different linear functionals become the same.
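To make the convergence claim concrete, here is a short computation; the set $E$ and the function $f$ are illustrative choices introduced here, not part of the example above. For a bounded continuous $\varphi$,
$$\left| n\int_0^{1/n} \varphi(x)\,dx - \varphi(0) \right| \le \sup_{0 \le x < 1/n} |\varphi(x) - \varphi(0)| \longrightarrow 0,$$
so the sequence acts like $\delta_0$ in the limit. For a discontinuous $f \in L^\infty$ the limit need not exist. Taking $f = \chi_E$ with $E = \bigcup_{j \ge 0} [4^{-j}/2,\, 4^{-j})$ gives
$$n\int_0^{1/n} f(x)\,dx = \tfrac{2}{3} \ \text{ for } n = 4^k, \qquad n\int_0^{1/n} f(x)\,dx = \tfrac{1}{3} \ \text{ for } n = 2\cdot 4^k,$$
so different subnets can assign different values to $f$ even though they all agree on continuous functions.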
I can see both cases, but I cannot understand how they are connected or what causes the difference.
I think I can answer this question now.
First, given $X \subset Y$, the fact that every $y^* \in Y^*$ restricted to $X$ defines a continuous linear functional on $X$ does not show $Y^* \subset X^*$, because the restriction map might not be injective.
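A sketch of when the restriction map is injective, assuming $Y$ is a normed (or locally convex) space so that the Hahn-Banach theorem applies; the names $R$ and $X^\perp$ are notation introduced here:
$$R : Y^* \to X^*, \quad R y^* = y^*|_X, \qquad \ker R = X^\perp := \{\, y^* \in Y^* : y^*(x) = 0 \text{ for all } x \in X \,\}.$$
By Hahn-Banach, $X^\perp = \{0\}$ exactly when $X$ is dense in $Y$, so reading $Y^* \subset X^*$ as an inclusion is justified precisely in the dense case.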
Intuitively, there are two different factors which correspond to the two cases.
Finer topology $\Longrightarrow$ bigger dual space. This corresponds to the first case. When we have $$H^1 \subset L^2 \quad \text{ and } \quad (L^2)^* \subset (H^1)^*$$ this is not because $H^1$ is contained in $L^2$; it is because the topology we use on $H^1$ is finer than the topology induced by the $L^2$ norm (and $H^1$ is dense in $L^2$, so the restriction map is injective). Likewise, the space of test functions has the largest dual space because it carries a very fine topology.
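As a worked instance, assume the domain is the interval $(0,1)$ and the norm is $\|u\|_{H^1}^2 = \|u\|_{L^2}^2 + \|u'\|_{L^2}^2$ (neither is fixed above). If $\ell \in (L^2)^*$ with $|\ell(u)| \le C\|u\|_{L^2}$, then for every $u \in H^1$
$$|\ell(u)| \le C\|u\|_{L^2} \le C\|u\|_{H^1},$$
so $\ell$ is also continuous on $H^1$. The converse fails: the functional $\ell_0(u) = \int_0^1 u'(x)\,dx = u(1) - u(0)$ satisfies $|\ell_0(u)| \le \|u'\|_{L^2} \le \|u\|_{H^1}$, but for $u_n(x) = x^n$
$$\ell_0(u_n) = 1, \qquad \|u_n\|_{L^2} = \frac{1}{\sqrt{2n+1}} \longrightarrow 0,$$
so $\ell_0$ is not bounded with respect to the $L^2$ norm. The finer topology admits strictly more continuous functionals.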
Under the same norm, bigger space $\Longrightarrow$ bigger dual space. This corresponds to the second case. The simplest example is $\mathbb{R}^n$. Another is $C_0(\mathbb{R}) \subset C_b(\mathbb{R})$, both with the sup norm: for their dual spaces we have $rca(\mathbb{R})\subset rba(\mathbb{R})$.
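A hedged sketch of why this happens, assuming $X$ is a subspace of a normed space $Y$ carrying the restricted norm; $R$ and $X^\perp$ are notation introduced here. The restriction map $R : Y^* \to X^*$, $R y^* = y^*|_X$, is surjective by the Hahn-Banach extension theorem, and its kernel is the annihilator $X^\perp$ of functionals vanishing on $X$, so
$$X^* \cong Y^* / X^\perp.$$
For example, in $\mathbb{R}^n$ with $X = \mathbb{R}^m \times \{0\}$ and $m \le n$,
$$R(a_1, \dots, a_n) = (a_1, \dots, a_m),$$
which forgets the last $n - m$ coordinates. So $X^*$ is a quotient of $Y^*$, and distinct functionals on $Y$ may coincide on $X$.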