Probabilistic primality tests such as Miller–Rabin and Solovay–Strassen were created at a time when mathematicians did not know whether a deterministic polynomial-time algorithm exists.
After the 2002 AKS paper, which gave a deterministic polynomial-time algorithm, and its improved version running in $O(\log(n)^6)$, do we still need probabilistic methods for practical applications (cryptography)?
I would guess that the answer is yes, because they are faster, but I am interested to see how much faster they are for 4096-bit RSA keys (or, if I am wrong, to know whether we still need them).
You can see a graph and discussion of the runtimes of some open source solutions, including BPSW (a probable-prime test), APR-CL, ECPP, and AKS, in this answer to AKS usefulness. Reading your text, it sounds like you are asking "Did AKS have any impact on practical crypto implementations?", and the answer is "No."
Going with Charles's example, 2048 bits is 617 digits. The times I measured for random 600-digit primes (single thread on a 4770K) were:
The observed growth rates for my implementation and platform were $\log(n)^{2.4}$ for BPSW, $\log(n)^{3.6}$ for APR-CL, $\log(n)^{4.0}$ for ECPP (both versions), and $\log(n)^{6.4}$ for AKS (both the v6 and B+V improvements). AKS is important for the theory, but not useful in implementations. Clearly BPSW is much faster than the proof methods. APR-CL may have a higher exponent as the size increases (this measurement only went to 2000 digits). Of these methods, only ECPP gives a certificate.
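Since time scales roughly as $\log(n)^e$, doubling the digit count doubles $\log(n)$ and multiplies the running time by about $2^e$. A quick sketch of what those empirical exponents mean in practice:

```python
# Extrapolate relative cost when the input doubles in size, using the
# empirical exponents above: time ~ log(n)^e, so doubling the digit
# count multiplies the cost by roughly 2^e.
exponents = {"BPSW": 2.4, "APR-CL": 3.6, "ECPP": 4.0, "AKS": 6.4}
for method, e in exponents.items():
    print(f"{method}: doubling the size costs about {2 ** e:.0f}x")
# BPSW grows ~5x per doubling, while AKS grows ~84x, which is one way
# to see why AKS falls further behind as key sizes increase.
```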
As Charles points out, we really want to use pre-tests when checking primality for arbitrary numbers, such as when selecting random primes or calling a `nextprime` function. Trial-division testing is of course easy and weeds out lots of composites. A BPSW test consists of a single M-R test with base 2, followed by some variant of a Lucas test. At this size, almost all composites will fail the single M-R test, so the actual time taken for them is only about 1/3 of the time shown for primes.

Another interesting point: both ECPP implementations perform BPSW tests throughout the computation. It is very cheap compared to the cost of the other operations. So even when we use proofs, internally the code may be doing some probable-prime tests.
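For illustration, a BPSW test of the shape described above (trial division, one base-2 M-R test, then a strong Lucas test with Selfridge's parameters) can be sketched in Python. This is a simplified sketch, not the code behind the timings quoted above:

```python
import math

def _jacobi(a, n):
    # Jacobi symbol (a/n) for odd n > 0, via quadratic reciprocity
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def _mr_base2(n):
    # single Miller-Rabin round with base 2
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2; s += 1
    x = pow(2, d, n)
    if x in (1, n - 1):
        return True
    for _ in range(s - 1):
        x = x * x % n
        if x == n - 1:
            return True
    return False

def _strong_lucas(n):
    # perfect squares would make the D search below loop forever
    if math.isqrt(n) ** 2 == n:
        return False
    # Selfridge: first D in 5, -7, 9, -11, ... with Jacobi (D/n) = -1
    D = 5
    while True:
        j = _jacobi(D, n)
        if j == -1:
            break
        if j == 0 and abs(D) != n:
            return False          # n shares a factor with D
        D = -(D + 2) if D > 0 else -(D - 2)
    P, Q = 1, (1 - D) // 4
    d, s = n + 1, 0               # write n + 1 = d * 2^s, d odd
    while d % 2 == 0:
        d //= 2; s += 1
    # compute U_d, V_d (mod n) by binary double-and-add on the index
    U, V, Qk = 1, P, Q
    for bit in bin(d)[3:]:
        U, V = U * V % n, (V * V - 2 * Qk) % n       # k -> 2k
        Qk = Qk * Qk % n
        if bit == '1':                               # 2k -> 2k + 1
            U, V = (P * U + V) % n, (D * U + P * V) % n
            if U & 1: U += n                         # divide by 2 mod n
            if V & 1: V += n
            U, V = (U >> 1) % n, (V >> 1) % n
            Qk = Qk * Q % n
    if U == 0 or V == 0:          # U_d = 0 or V_d = 0
        return True
    for _ in range(s - 1):        # check V_{d*2^r} = 0 for 0 < r < s
        V = (V * V - 2 * Qk) % n
        if V == 0:
            return True
        Qk = Qk * Qk % n
    return False

def bpsw(n):
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):   # cheap trial-division pre-test
        if n % p == 0:
            return n == p
    return _mr_base2(n) and _strong_lucas(n)
```

The base-2 strong pseudoprime 2047 = 23·89 shows the division of labor: it slips past the M-R half but is caught by the Lucas half.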
For DSA, the Q size is typically 160-256 bits, so a proof is pretty easy (BLS75, ECPP, or APR-CL) and should take only a few milliseconds. I don't think many implementations actually do that, however (I'm aware of only one). For the larger sizes (1024+ bits for DSA's P or RSA's P/Q) it will start taking appreciable time, and spending a little more time on additional probable-prime tests is likely to be a better payoff (e.g. more random-base M-R tests, plus a Lucas or Frobenius test).
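Layering extra random-base M-R rounds on top of a base test takes only a few lines; a minimal sketch (`extra_mr_rounds` is a hypothetical helper name, not from any standard or library):

```python
import random

def miller_rabin(n, a):
    # one Miller-Rabin round with base a; True means "probably prime"
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2; s += 1
    x = pow(a, d, n)
    if x in (1, n - 1):
        return True
    for _ in range(s - 1):
        x = x * x % n
        if x == n - 1:
            return True
    return False

def extra_mr_rounds(n, rounds=20):
    # hypothetical helper: random-base M-R rounds layered on top of a
    # base test such as BPSW; each round, a composite survives with
    # probability below 1/4, so 20 rounds give an error bound of 4^-20
    if n < 3 or n % 2 == 0:
        return n == 2
    return all(miller_rabin(n, random.randrange(2, n - 1))
               for _ in range(rounds))
```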
Also note that there are two common algorithms for creating random proven primes via construction: Shawe-Taylor and Maurer. Both build up random proven primes using a chain of Pocklington (or BLS75 theorem 3 or 5) tests. This is fairly straightforward (FIPS 186-4 describes a Shawe-Taylor algorithm), and one can get a certificate as output as well if desired. There is a penalty in that not every possible prime can be generated, but the papers discuss this and indicate it should not be a security issue at large sizes.

Do note that I've seen broken implementations that can produce composites, so for mine I added BPSW tests on all partial results and verify the final certificate as well.

While this method is faster at large sizes than taking a random probable prime and running a generic proof on it, it is slower than using a good random probable-prime generator and adding some decent extra tests (so the resulting number has passed BPSW, a few additional random-base M-R tests, and a Frobenius test). Your implementations and needs may vary.
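The Pocklington chain idea can be shown with a minimal sketch (function names are mine, and this is far simpler than Shawe-Taylor or Maurer proper). Starting from a small proven prime $p$, each step searches for $N = 2kp + 1$ with $k < p/2$, so $p > \sqrt{N}$ and a single Pocklington condition with witness $a = 2$ proves $N$ prime, given that $p$ is:

```python
import math
import random

def pocklington_step(p):
    """From a proven prime p, find a proven prime N = 2*k*p + 1 with k < p/2.

    Pocklington: if a^(N-1) = 1 (mod N) and gcd(a^((N-1)/p) - 1, N) = 1,
    every prime factor of N is 1 (mod p); since k < p/2 gives p > sqrt(N),
    any factor would exceed sqrt(N), so N must be prime."""
    while True:
        k = random.randrange(1, p // 2)
        N = 2 * k * p + 1
        if pow(2, N - 1, N) == 1 and math.gcd(pow(2, 2 * k, N) - 1, N) == 1:
            return N

def proven_prime(bits):
    # chain Pocklington steps from a small proven prime; each step can at
    # most double the bit length, so repeat until the target is reached
    p = 65537          # 2^16 + 1, provable by trial division
    while p.bit_length() < bits:
        p = pocklington_step(p)
    return p
```

Note the "not every prime is reachable" penalty mentioned above is visible here: only primes of the form $2kp + 1$ for the chain's particular $p$ values can come out.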