What's the difference between a spectrogram (spectrum over time, ${F}_f(t)$?) and a, what's it called, sound-wave (waveform) plot (velocity over time, $V(t)$), as often encountered in music production? Are those in principle the same thing, just with different parameters?
I'm reasoning thus:
For the spectrogram, intervals of length $\Delta t$ are taken from a signal and each one is Fourier transformed individually, with any crude trick, to get one spectrum per interval (is the result also called a sinogram?). The resulting spectra are plotted over time to give the spectrogram: a time axis, a frequency axis, and each point colored by the velocity at $(f, t)$. Since we can't hear below 20 Hz, intervals of roughly $1/20\,\mathrm{s}$ (one full period of the lowest audible frequency) should be enough, so we could still get a decent time resolution without severely limiting the bandwidth, for some interpretation of decent. If the intervals are too short, low frequencies will be missed.
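To make the procedure above concrete, here is a minimal sketch in Python with NumPy. It is only one possible "crude trick", not a definitive method: the Hann window, the 50% overlap, the 8 kHz sample rate and the 440 Hz test tone are all assumptions chosen for illustration, and the names `spectrogram`, `window_len` and `hop` are hypothetical.

```python
import numpy as np

def spectrogram(signal, window_len, hop):
    """Magnitude spectrogram: rows are frequency bins, columns are time frames."""
    window = np.hanning(window_len)  # assumed taper; other windows work too
    n_frames = 1 + (len(signal) - window_len) // hop
    # Cut the signal into (possibly overlapping) intervals of window_len samples.
    frames = np.stack(
        [signal[i * hop : i * hop + window_len] * window for i in range(n_frames)]
    )
    # One spectrum per interval; transpose so frequency runs along the rows.
    return np.abs(np.fft.rfft(frames, axis=1)).T

fs = 8000                          # assumed sample rate in Hz
t = np.arange(fs) / fs             # one second of samples
v = np.sin(2 * np.pi * 440 * t)    # assumed test signal: a 440 Hz tone

window_len = 400                   # Delta t = 400 / 8000 s = 50 ms
S = spectrogram(v, window_len=window_len, hop=200)
freqs = np.fft.rfftfreq(window_len, d=1 / fs)

bin_spacing_hz = freqs[1] - freqs[0]          # equals fs / window_len
peak_hz = freqs[S.mean(axis=1).argmax()]      # strongest frequency on average
```

With a 50 ms interval the frequency bins are spaced $1/\Delta t = 20$ Hz apart, which is the sense in which shorter intervals lose the low frequencies: the spacing grows as the interval shrinks.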
For the $V(t)$ graph, normally the instantaneous amplitude of the speaker signal is plotted over time. Instead of plotting that amplitude as a distance perpendicular to the time axis, we could also choose to shade each point on the time axis in proportion to it.
Now I'm thinking that, with very short intervals for the spectrogram, low frequencies will not be filtered out but will somehow appear as aliasing effects in the higher frequencies, varying over time. Can we take an infinitesimal slice to get a spectrum at a single point? And if we do so for every point instead of taking intervals, can the result be plotted so that it looks like the $V(t)$ plot? I mean, that would be like taking a single $\sin(t)$ and scaling it with some $f(t)$.
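The discrete version of that limiting case can at least be tried out numerically. The sketch below is a toy under stated assumptions (NumPy, an 8 kHz sample rate, a 5 Hz test tone, rectangular one-sample "intervals"), not a general answer: with an interval of a single sample, the discrete Fourier transform has exactly one bin, and that bin is the sample itself.

```python
import numpy as np

fs = 8000                        # assumed sample rate in Hz
t = np.arange(fs) / fs           # one second of samples
v = np.sin(2 * np.pi * 5 * t)    # assumed test signal: a slow 5 Hz tone

# Shrink the interval to one sample: every frame holds a single value.
frames = v.reshape(-1, 1)

# The DFT of a length-1 frame has one bin (0 Hz), equal to that sample.
spectra = np.fft.rfft(frames, axis=1)

# Plotting the single bin of every frame over time reproduces V(t).
same_as_waveform = np.allclose(spectra[:, 0].real, v)
```

In this discrete setting the one-sample "spectrogram" has a single row, and shading that row by its value is the shaded $V(t)$ plot described earlier.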
I guess the real insight for me would be to learn the various computational methods of transformation, so I get that this is a poor question. The basic question really is how to get from the 2D picture to a 3D heat map. I would need to do more work on that.