I'm trying to learn about Agrawal–Kayal–Saxena (AKS for short) primality test but I have some trouble understanding its efficiency.
If I understood correctly, the test amounts to probing whether the entries of $n$-th row of Pascal's triangle divide $n$. There are $n/2$ (because of the symmetry) entries and all are larger than $n$.
Trial divison tries to divide $n$ with all the (odd) numbers below $\sqrt{n}$.
What I don't understand is how could AKS by possibly (asymptotically) faster, as it tests more and larger numbers?
That's leaves me with the assumption that the above description of AKS is horribly inefficient. If that is the case, is it possible to explain in simple terms some strategies for improving the performance?
The explanation you give is describing Lemma 2.1 in the AKS paper (equation 1 in the concepts of the Wikipedia entry). This was known in the 1600's and is exponential time. It is inefficient even compared to naive trial division.
The popular Numberphile video can be easily inferred to state that this, known computationally inefficient method, is the AKS primality test. It is not. The very clever trick in AKS is reducing both the number of entries and their size, and proving that they can be reduced sufficiently such that the test is polynomial in the size of the input.
The Wikipedia page algorithm summarizes the method, and there are some worked examples. This should be sufficient to get started, though testing the polynomials should be done with sufficient optimizations as that's where all the time is spent.
In my experience, trial division is still faster until something like $10^{17}$ or so, after which AKS is faster. The exact crossover will depend on your implementations. That crossover used a significant Bernstein theorem 4.1 improvement, while the algorithm from the v6 paper (as on Wikipedia) crosses over at about $10^{25}$. I'm sure the trial division method could be optimized some more.
There's also not any practical point other than the intellectual challenge. The BPSW test is general, polynomial, deterministic, and unconditional for all 64-bit inputs (i.e. smaller than 18,446,744,073,709,551,616). It is thousands to millions of times faster than AKS. Past that number, we find the APR-CL and ECPP tests to be substantially faster and with small exponents. They are also unconditionally correct and not probabilistic tests. There are some neat things about AKS that make it attractive for stating computational complexity, and it is mathematically fascinating. But it isn't actually used in practice.