http://freespace.virgin.net/hugo.elias/models/m_perlin.htm
This method involves getting a random dataset, sampling it at various resolutions, and adding together the result. I've heard it claimed that this is actually "Fractal Noise" or something diffe
http://www.itn.liu.se/~stegu/simplexnoise/simplexnoise.pdf
This PDF claims that there is widespread misinformation about what Perlin noise is, and explains a completely different method (I don't mean the discussion of simplex noise at the bottom, but his summary of classic Perlin noise).
Who is correct?
You can find out all about Perlin noise on Ken Perlin's own web page about it. He first published it in the SIGGRAPH 1985 paper "An Image Synthesizer", later described an improved version of the algorithm in "Improving Noise" at SIGGRAPH 2002, and has also posted source code for the reference implementation.
The basic idea is that at each lattice point $(i,j,k)$ you pick a pseudo-random gradient $\mathbf g_{i,j,k}$ for the noise. Each gradient determines a linear function $\mathbf g_{i,j,k}\cdot(x-i,y-j,z-k)$, which is zero at its own lattice point, and you then interpolate between these linear functions over the rest of the space. The first link you posted, which is regrettably one of the top search hits for "Perlin noise" in most search engines, instead suggests picking the value of the function at each lattice point and interpolating that (usually called value noise). So the first link is wrong about what Perlin noise is, and the second link is right.
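To make the distinction concrete, here is a minimal 2D sketch of both approaches. It is not Perlin's reference implementation: the `Lattice2D` class and its method names are hypothetical, the gradients are random unit vectors stored in a small periodic table (the reference code uses a permutation table and a fixed gradient set, in 3D), and the quintic fade curve is the one from "Improving Noise".

```python
import math
import random


def fade(t):
    # Quintic blend 6t^5 - 15t^4 + 10t^3 from "Improving Noise" (2002).
    return t * t * t * (t * (t * 6 - 15) + 10)


def lerp(a, b, t):
    return a + t * (b - a)


class Lattice2D:
    """Hypothetical helper: a periodic size x size grid of random data."""

    def __init__(self, size=16, seed=0):
        rng = random.Random(seed)
        self.size = size
        self.grad = {}   # (i, j) -> unit gradient vector (assumption: random angle)
        self.value = {}  # (i, j) -> scalar in [-1, 1]
        for i in range(size):
            for j in range(size):
                angle = rng.uniform(0.0, 2.0 * math.pi)
                self.grad[i, j] = (math.cos(angle), math.sin(angle))
                self.value[i, j] = rng.uniform(-1.0, 1.0)

    def _blend(self, corner, x, y):
        # Interpolate corner(i, j) over the four corners of the cell containing (x, y).
        i, j = math.floor(x), math.floor(y)
        u, v = fade(x - i), fade(y - j)
        bottom = lerp(corner(i, j), corner(i + 1, j), u)
        top = lerp(corner(i, j + 1), corner(i + 1, j + 1), u)
        return lerp(bottom, top, v)

    def gradient_noise(self, x, y):
        # Perlin-style: interpolate the linear functions g_ij . (x - i, y - j).
        def ramp(i, j):
            gx, gy = self.grad[i % self.size, j % self.size]
            return gx * (x - i) + gy * (y - j)
        return self._blend(ramp, x, y)

    def value_noise(self, x, y):
        # What the first link describes: interpolate stored lattice values.
        def stored(i, j):
            return self.value[i % self.size, j % self.size]
        return self._blend(stored, x, y)
```

One visible consequence: `gradient_noise` is zero at every lattice point, because each linear function vanishes at its own corner, whereas `value_noise` returns the stored random value there. Summing either function at several frequencies gives the "fractal" sum the first link describes; that summation is a separate step layered on top of the base noise.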