I am trying to implement a method for estimating the power spectral density (PSD) of a time series, following this paper: "PsyBoG: A scalable botnet detection method for large-scale DNS traffic" by Jonghoon Kwon et al. (2016).
In their method they describe the PSD estimation as follows (emphasis mine):
> First, we specify a number of segments to use as input time series for the PSD analysis. The number of segments affects the frequency domain and its range. For a high-quality PSD, the number of segments is selected from among the powers of two, and we limit the length of the number of segments to $2^{14} = 16,384$ to ensure a fast PSD analysis. Note that the size of a single segment is 1 s, and we apply the sliding time-window strategy to cover a long input trace. Fig. 3 briefly shows the concept of the segments, time window, and sliding window. **The second step in the PSD estimation is to remove the mean value of the Fourier mode from the time series.** This is a standard technique [29] that allows a more accurate PSD estimation. In the third step, we use a Hanning window that is used on half-overlapped intervals to ensure the best signal-to-noise ratio (SNR). The last step consists of operating the PSD analysis for the specified segments of the input time series.
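To make my current reading of these steps concrete, here is a minimal sketch. It is an averaged periodogram over half-overlapped, Hann-windowed segments (essentially Welch's method), which is my assumption about what the authors do, not something the paper confirms. The function name `welch_psd_sketch`, the parameters `fs` and `nperseg`, and the test signal are all hypothetical, and the "remove the mean" step is implemented as plain per-segment centering, which is exactly the part I am unsure about.

```python
import numpy as np

def welch_psd_sketch(x, fs=1.0, nperseg=256):
    """Average periodograms of half-overlapped, centered, Hann-windowed segments."""
    x = np.asarray(x, dtype=float)
    if len(x) < nperseg:
        raise ValueError("input is shorter than one segment")
    step = nperseg // 2                       # half-overlapped segments
    window = np.hanning(nperseg)              # Hann ("Hanning") window
    scale = fs * np.sum(window ** 2)          # normalisation to a density
    periodograms = []
    for start in range(0, len(x) - nperseg + 1, step):
        seg = x[start:start + nperseg]
        seg = seg - seg.mean()                # ASSUMPTION: "mean removal" = centering
        spectrum = np.fft.rfft(seg * window)
        periodograms.append(np.abs(spectrum) ** 2 / scale)
    psd = np.mean(periodograms, axis=0)
    # One-sided spectrum: double every bin except DC (and Nyquist if nperseg is even).
    if nperseg % 2 == 0:
        psd[1:-1] *= 2
    else:
        psd[1:] *= 2
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    return freqs, psd

# Hypothetical test signal: constant offset + sinusoid at 0.125 Hz + weak noise.
rng = np.random.default_rng(0)
t = np.arange(4096)
x = 5.0 + np.sin(2 * np.pi * 0.125 * t) + 0.1 * rng.standard_normal(t.size)
freqs, psd = welch_psd_sketch(x, fs=1.0, nperseg=256)
peak = freqs[np.argmax(psd)]                  # frequency of the strongest bin
```

With the centering in place, the strongest bin is the sinusoid rather than the offset at zero frequency, which is the only effect of the mean-removal step I can reproduce so far.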
The citation ([29]) links, via an article, to the following book: Discrete-Time Signal Processing (Second Edition). The book does mention removing the mean from the time series (i.e. centering the data), and it does mention averaging the Fourier coefficients obtained from each of the sliding windows, but I don't think either of those is what they meant by the bold sentence.
So my questions are as follows:
- What is meant by "the mean value of the Fourier mode"?
- How would one 'remove' this value from a time series?
Edit:
I found another bit of information on the method in [29]:
> Given the interval, the first step in PSD estimation is to remove the mean value of the Fourier mode from the time series. This is a standard technique (see [12]) that allows more accurate PSD estimation. This mean-removal subtracts the estimated static error from the signal over the interval.
([12] is the book mentioned before).
From this I conclude that the goal is to remove static noise, which I assume would appear in the Fourier transform as a constant value equal to the variance of the noise. I think that answers the first question, and I guess it gives us an estimate of the noise in our time series. But how would you remove that from the time series?
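The only removal I know how to do is plain centering in the time domain. The sketch below (with a made-up signal and a hypothetical `offset` value) shows what that does to the discrete Fourier transform: subtracting the sample mean zeroes the $k = 0$ coefficient and leaves every other coefficient untouched. Whether this is the "mean value of the Fourier mode" the authors have in mind is exactly what I am asking.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 64
offset = 3.0                                   # hypothetical static offset
x = offset + rng.standard_normal(n)            # made-up time series

X_raw = np.fft.fft(x)                          # DFT of the raw series
X_centered = np.fft.fft(x - x.mean())          # DFT after subtracting the sample mean

# The k = 0 coefficient of the raw series is n * mean(x).
dc_raw = X_raw[0].real
# After centering, the k = 0 coefficient is zero up to rounding error.
dc_centered = abs(X_centered[0])
# All coefficients with k != 0 are unchanged by the subtraction.
others_unchanged = np.allclose(X_raw[1:], X_centered[1:])
```

So centering only affects the zero-frequency bin; it does not subtract a constant level from the whole spectrum.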