The second lemma of Strang states that for a certain choice of $V_h$, $a$, $u$ and $f$ there exists a $c>0$ such that
$$\|u-u_h\| \leq c \left(\inf_{v\in V_h} \|u-v\| + \sup_{v\in V_h\setminus\{0\}} \frac{|a(u,v)-(f,v)|}{\|v\|}\right)$$
where $V_h \not\subset H_0^1(\Omega)$.
The first term on the right-hand side is called the approximation error, the second the consistency error. What is the motivation for these names, i.e. why are the terms named as they are?
The approximation error comes from the discretization of the function space; it has nothing to do with the equation we are solving. This error is the distance from $u$ to the space $V_h$. There is no way for $u_h$ to be closer to $u$ than this number, because $u_h\in V_h$ by construction.
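For a sense of scale, here is the standard bound under assumptions that are not stated in the question: $\|\cdot\|$ is an $H^1$-type norm, $V_h$ contains the continuous piecewise linear functions on a shape-regular mesh of size $h$ that vanish on $\partial\Omega$, and $u\in H^2(\Omega)$. Then the usual interpolation estimate gives
$$\inf_{v\in V_h}\|u-v\| \leq C\,h\,|u|_{H^2(\Omega)},$$
with $C$ independent of $h$ and $u$. Neither the bilinear form nor the right-hand side appears in this bound, only the space and the regularity of $u$.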
The consistency error comes from the discretization of the equation. The original problem is $a(u,v)=f(v)$ for all $v\in H_0^1(\Omega)$. When working with the finite element space $V_h$ we actually use discrete analogs of $a$ and $f$, which should have been called $a_h$ and $f_h$ in your equation. In general, the exact solution of the continuous equation is not a solution of the discrete equation. That is, $a_h(u,v)-f_h(v)$ is not necessarily zero for all $v\in V_h$. This is what the consistency error measures; it is about the (imperfect) consistency between the continuous and discrete equations.
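To make this concrete, here is a sketch for the model problem $-\Delta u=f$ in $\Omega$, $u=0$ on $\partial\Omega$. The specific choices are my assumptions, not something stated in the question: $u\in H^2(\Omega)$, $f_h(v)=(f,v)$, and $a_h(u,v)=\sum_K\int_K\nabla u\cdot\nabla v\,dx$, where the sum runs over the mesh cells $K$. Integrating by parts on each cell and using $-\Delta u=f$ gives, for $v\in V_h$,
$$a_h(u,v)-(f,v)=\sum_K\int_{\partial K}\frac{\partial u}{\partial n_K}\,v\,ds.$$
If $v$ were continuous across cell interfaces and zero on $\partial\Omega$, the contributions from neighboring cells would cancel (the outward normals point in opposite directions), the right-hand side would vanish, and the lemma would reduce to Céa's lemma with only the approximation error left. For nonconforming $v$ the jumps across interfaces survive, and the consistency error is exactly the size of what they leave behind.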
In summary: