It is a well-known result that for stationary Gaussian sources with memory having a power spectral density $\Phi_{xx}(\omega)$, the rate distortion function in parametric form is given as $$R(\theta) = \frac{1}{2 \pi}\int_{-\pi}^{\pi} \max \left\{ 0, \frac{1}{2}\log_2 \frac{\Phi_{xx}(\omega)}{\theta} \right\} d \omega$$ $$D(\theta) = \frac{1}{2 \pi}\int_{-\pi}^{\pi} \min \left\{ \Phi_{xx}(\omega),{\theta} \right\} d \omega$$
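For concreteness, the parametric pair $(D(\theta), R(\theta))$ above can be evaluated numerically for any given spectral density. Here is a minimal sketch in Python; the AR(1) spectral density used as an example is my own assumption for illustration, not part of the question.

```python
import numpy as np

def rd_point(psd, theta, n=4096):
    """Evaluate the parametric rate-distortion point (D(theta), R(theta))
    for a stationary Gaussian source with power spectral density psd(w).
    On a uniform grid over [-pi, pi], (1/2pi) * integral equals the mean
    of the integrand, so we can avoid an explicit quadrature routine."""
    w = np.linspace(-np.pi, np.pi, n)
    phi = psd(w)
    # R(theta): (1/2) log2(phi/theta) where phi exceeds the level theta, else 0
    rate = np.mean(np.maximum(0.0, 0.5 * np.log2(phi / theta)))
    # D(theta): the spectrum is clipped at the water level theta
    dist = np.mean(np.minimum(phi, theta))
    return dist, rate

# Illustrative example (an assumption): AR(1) source x_k = a*x_{k-1} + w_k,
# whose PSD is sigma_w^2 / |1 - a e^{-iw}|^2.
a, sw2 = 0.9, 1.0
psd = lambda w: sw2 / (1.0 - 2 * a * np.cos(w) + a ** 2)

for theta in [0.1, 1.0, 5.0]:
    D, R = rd_point(psd, theta)
    print(f"theta={theta}: D={D:.4f}, R={R:.4f} bits/sample")
```

As $\theta$ grows, $D(\theta)$ increases toward the source variance while $R(\theta)$ falls to zero, tracing out the rate-distortion curve.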
My question concerns the case where the spectral density is a function of time as well, e.g., a windowed Fourier transform that changes according to where the window is placed, as for a non-stationary random process with spectral density $\Phi_{xx}(\tau, \omega)$, where $\tau$ is the center of the window. Is there a way to compute the rate distortion function in this case?
Edit: Maybe some kind of time-frequency analysis?
For parallel Gaussian sources, the best rate-distortion performance is achieved by the so-called ``reverse water-filling'' procedure: a common level $\theta$ is chosen, every subchannel whose variance exceeds $\theta$ is compressed down to distortion exactly $\theta$, subchannels below $\theta$ receive no bits at all, and consequently more bits are allocated to the higher-variance subchannels. This can also be done for dependent parallel channels.
Establish the dependence and use this idea, treating a set of $K$ consecutive blocks of data $$X_1^n,X_{n+1}^{2n},\ldots,X_{(K-1)n+1}^{Kn}$$ as if each block is a parallel channel, with $n,K$ chosen appropriately.
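The per-block allocation described above can be sketched as follows. This is a minimal illustration under the assumption that each subchannel is summarized by a single variance (e.g., an eigenvalue of the block covariance); the example variances are hypothetical.

```python
import numpy as np

def reverse_waterfill(variances, D_avg, tol=1e-10):
    """Reverse water-filling over parallel Gaussian subchannels.
    Bisect for the common water level theta so that the average
    distortion (1/K) * sum_i min(var_i, theta) equals D_avg, then
    allocate rate only to subchannels whose variance exceeds theta;
    the rest are not coded at all."""
    var = np.asarray(variances, dtype=float)
    lo, hi = 0.0, var.max()
    while hi - lo > tol:
        theta = 0.5 * (lo + hi)
        if np.minimum(var, theta).mean() > D_avg:
            hi = theta  # too much distortion: lower the water level
        else:
            lo = theta  # distortion budget not used up: raise it
    theta = 0.5 * (lo + hi)
    rates = np.maximum(0.0, 0.5 * np.log2(var / theta))  # bits per sample
    return theta, rates

# Hypothetical subchannel variances for K = 4 blocks
var = [4.0, 2.0, 1.0, 0.25]
theta, rates = reverse_waterfill(var, D_avg=0.5)
print(f"theta={theta:.4f}, rates={rates}, avg rate={rates.mean():.4f}")
```

Note how the weakest subchannel (variance $0.25 < \theta$) gets zero rate, while the allocation grows with the subchannel variance, mirroring the $\max\{0,\cdot\}$ and $\min\{\cdot,\theta\}$ terms in the parametric formulas of the question.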
See these notes, for example, for the parallel Gaussian channels case.