Now yes, I know the definition:
A sequence of estimators $\{T_n\}$ of $\tau(\theta)$ is consistent if, for every $\epsilon > 0$, $$ \lim_{n\to\infty}P\big[|T_n-\tau(\theta)|\leq\epsilon\big]=1, \quad\text{or equivalently}\quad \lim_{n\to\infty}P\big[|T_n-\tau(\theta)|>\epsilon\big]=0, $$ for every $\theta \in \Omega$.
But what is this telling me? I'm having trouble translating this into something I can grasp, remember, and understand correctly. By taking the limit of $T_n$, do we get a higher probability of getting closer to the true $\theta$? Should I look at this as some sort of convergence, $T_n \overset{p}{\to} \tau(\theta)$? Why do we use the word consistent? What is so 'consistent' about it?
I think you want to know about the concept of consistency. Note that consistency is a large-sample property of estimators. Here we consider the sequence of estimators $\{T_n\}$, where $n$ is the sample size, to estimate the parameter $\tau(\theta)$. As the sample size $n\to \infty$, the data $(x_1, x_2, \dots, x_n)$ are practically the whole population, so it is intuitively appealing to require that a good sequence of estimators $\{T_n\}$ be one whose values tend to concentrate near $\tau(\theta)$ as the sample size increases. And yes, this is exactly convergence in probability: the definition you quoted is precisely the statement that $T_n \overset{p}{\to} \tau(\theta)$. Conversely, if the values of the estimator fail to get close to $\tau(\theta)$ even as $n \to \infty$, then its performance for small sample sizes will certainly be bad as well.
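As a concrete illustration (the sample mean is not mentioned in your question, so treat this as just one example): for i.i.d. $X_1, \dots, X_n$ with mean $\mu$ and finite variance $\sigma^2$, take $T_n = \bar{X}_n$ as an estimator of $\tau(\theta) = \mu$. Chebyshev's inequality gives $$ P\big[|\bar{X}_n - \mu| > \epsilon\big] \leq \frac{\operatorname{Var}(\bar{X}_n)}{\epsilon^2} = \frac{\sigma^2}{n\epsilon^2} \xrightarrow{\,n\to\infty\,} 0, $$ so $\{\bar{X}_n\}$ is a consistent sequence of estimators of $\mu$: the probability of landing within any fixed tolerance $\epsilon$ of the target goes to $1$.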
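If a simulation helps, here is a minimal sketch (my own choices of distribution, tolerance, and sample sizes, not anything from the definition itself) that estimates $P[|T_n - \tau(\theta)| \leq \epsilon]$ by Monte Carlo for the sample mean of an exponential distribution. You should see the probability climb toward $1$ as $n$ grows, which is the "consistent" behavior the definition describes.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 2.0          # true parameter tau(theta) being estimated
epsilon = 0.1     # fixed tolerance from the definition of consistency
reps = 2000       # Monte Carlo replications per sample size

for n in (10, 100, 1000, 10000):
    # Draw `reps` independent samples of size n from Exponential(mean=mu);
    # the estimator T_n is the sample mean of each sample.
    samples = rng.exponential(scale=mu, size=(reps, n))
    T_n = samples.mean(axis=1)
    # Fraction of replications with |T_n - mu| <= epsilon estimates
    # P[|T_n - tau(theta)| <= epsilon]; it should approach 1 as n grows.
    coverage = np.mean(np.abs(T_n - mu) <= epsilon)
    print(f"n={n:>6}: P[|T_n - mu| <= {epsilon}] ~ {coverage:.3f}")
```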