Parsers and compilers usually use deterministic finite automata (DFAs) to parse input.
It's easy to implement a generic DFA tool that simulates any DFA table, for example to validate input.
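For reference, here's roughly what I mean by a generic DFA tool, as a minimal Python sketch (the `(state, symbol) -> state` dict encoding and the state names are my own illustration, not from any particular tool):

```python
# A minimal table-driven DFA simulator (illustrative sketch; the
# (state, symbol) -> state dict encoding and names are my own).

def run_dfa(table, start, accepting, text):
    """Return True if the DFA accepts `text`.

    A missing (state, symbol) entry means a transition to an
    implicit dead state, i.e. rejection.
    """
    state = start
    for symbol in text:
        if (state, symbol) not in table:
            return False
        state = table[(state, symbol)]
    return state in accepting

# Example: a two-state DFA for the regular language (ab)*.
table = {("s0", "a"): "s1", ("s1", "b"): "s0"}
print(run_dfa(table, "s0", {"s0"}, "abab"))  # True
print(run_dfa(table, "s0", {"s0"}, "aba"))   # False
```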
However, DFAs recognize regular languages, which are a proper subset of the context-free languages generated by BNF grammars, so there must be some extra "trick" that makes DFAs usable for BNF.
The easy answer is a special-case solution for the given grammar, for example an extra subroutine to parse matching parentheses (which a DFA alone can only do up to a bounded depth).
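To illustrate the bounded-depth limitation: a DFA can only track nesting by dedicating one state per depth, so any fixed automaton has a maximum depth baked in. A sketch of such a recognizer (my own construction, with the `depth` counter standing in for the automaton's finitely many states):

```python
# Recognizer equivalent to a DFA with states 0..MAX_DEPTH (one per
# nesting depth) plus a dead state; my own construction to show why
# a fixed automaton can only match parentheses to a bounded depth.
MAX_DEPTH = 3

def matches_bounded(text, max_depth=MAX_DEPTH):
    depth = 0  # the current DFA state
    for ch in text:
        if ch == "(":
            depth += 1
            if depth > max_depth:
                return False  # no state exists for this depth
        elif ch == ")":
            if depth == 0:
                return False  # unmatched ')': dead state
            depth -= 1
        else:
            return False  # alphabet here is just '(' and ')'
    return depth == 0  # accept only when all parentheses are closed

print(matches_bounded("(()())"))    # True: max depth 2 <= 3
print(matches_bounded("(((())))"))  # False: depth 4 has no state
```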
But advanced tools like ANTLR have a general solution for generating BNF parsers, and they still use DFAs. (I assume this is necessary to maintain high performance.)
What's the mathematical background for a general solution to this problem? I'd be satisfied with a sketch of an answer (I can look up the details in my textbook) or pointers to articles that go into the mathematical details.
They may use DFAs for lexical analysis (identifying tokens), but the "heavy lifting" is done by more powerful algorithms: for example, recursive descent, LALR(1), LL(*), and so on.
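To make that concrete, here's a minimal hand-written recursive-descent recognizer (a sketch, not what ANTLR actually generates) for the grammar S → '(' S ')' S | ε; the call stack supplies the unbounded memory that a plain DFA lacks:

```python
# Recursive-descent recognizer for S -> '(' S ')' S | epsilon
# (a hand-written sketch, not generated parser code).

def parse(text):
    pos = parse_s(text, 0)
    return pos is not None and pos == len(text)

def parse_s(text, pos):
    """Try to match S starting at `pos`; return the position after
    the match, or None on failure. The recursion depth tracks the
    nesting depth, which is exactly what a DFA cannot do."""
    if pos < len(text) and text[pos] == "(":
        inner = parse_s(text, pos + 1)          # S
        if inner is None or inner >= len(text) or text[inner] != ")":
            return None                         # missing ')'
        return parse_s(text, inner + 1)         # trailing S
    return pos  # epsilon production

print(parse("(()())"))              # True
print(parse("(" * 50 + ")" * 50))   # True: arbitrary depth
print(parse("(()"))                 # False
```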
You might find this table enlightening.