I am reading a paper that describes an approach to detecting abnormal flows within a network. From what I understand, they keep a list of the $n$ most recent metrics and predict the next metric, at time $t_{n+1}$, using a weighted moving average (WMA) as their forecasting method. At time $t_{n+1}$, when the actual metric is observed, they "compare" the predicted value with the actual value using their ratio metric:
$R(v^{p}_{n+1}) = | \frac{v^{a}_{n+1}}{v^{p}_{n+1}} |$
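To make this concrete, here is a minimal Python sketch of how I read the forecasting and ratio steps. The linearly increasing weights $1, \dots, n$ are my assumption (the paper's exact WMA weights may differ), and the function names `wma_forecast` and `ratio` are my own.

```python
def wma_forecast(history):
    """Forecast the next value as a weighted moving average.

    Assumption: weights increase linearly (1 for the oldest value,
    n for the newest); the paper may weight the history differently.
    """
    n = len(history)
    weights = range(1, n + 1)
    return sum(w * v for w, v in zip(weights, history)) / sum(weights)


def ratio(actual, predicted):
    """R = |actual / predicted|; a value of 1.0 means a perfect forecast."""
    return abs(actual / predicted)


# Made-up history, purely for illustration.
history = [10, 12, 11]
predicted = wma_forecast(history)
print(predicted, ratio(12, predicted))
```

With this reading, the ratio is above 1 when the actual value exceeds the forecast and below 1 when it falls short.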
They then define an upper limit and lower limit to "demonstrate the deviation between history records and the actual" using the two functions:
$R^{u}_{a} = R(avg_{a} + 3 * std_{a})$
$R^{l}_{a} = R(avg_{a} - 3 * std_{a})$
where $avg_{a}$ and $std_{a}$ are the average value and standard deviation of the list of $n$ previous metrics.
My question is: what exactly is this "ratio" function doing, and why did they choose to compare the forecasted value with the actual value in this way? From my understanding, they eventually compare the value produced by this function with the upper and lower limits. However, during some light testing I found that this did not produce the results I expected.
As a small example, consider the list $[143,144,143,142,145,143]$, for which I want to predict the 7th value. Let's say the predicted value is $144$ and the actual value is $145$. The average in this scenario would be $143.333$ and the (population) standard deviation would be $0.9428$.
My problem with this equation starts when you compute the upper and lower bounds. The argument of the upper-bound function would be $143.333 + 3 \times 0.9428 = 146.1614$, while the argument of the lower-bound function would be $140.5046$. These make sense as upper and lower bounds until you plug them into the ratio function itself: dividing the actual value by the larger number yields a smaller result, while dividing by the smaller number yields a larger result. Suddenly the upper and lower limits are "swapped" and don't make much sense. Am I misunderstanding how this formula is supposed to work? If it does work as intended, what am I actually gaining from dividing the actual value by the predicted one, and so on?
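Here is a short Python check of the numbers above. It assumes my reading of the paper, namely that $R(x)$ in the limit definitions means $|v^{a}_{n+1} / x|$, and it uses the population standard deviation, since that is what reproduces $0.9428$. The variable names are my own.

```python
from statistics import mean, pstdev


def ratio(actual, predicted):
    """R = |actual / predicted|, as I understand the paper's ratio metric."""
    return abs(actual / predicted)


history = [143, 144, 143, 142, 145, 143]
predicted = 144  # hypothetical forecast from the example above
actual = 145

avg = mean(history)
std = pstdev(history)  # population standard deviation (about 0.9428)

r = ratio(actual, predicted)
r_upper = ratio(actual, avg + 3 * std)  # "upper" limit: divides by the larger value
r_lower = ratio(actual, avg - 3 * std)  # "lower" limit: divides by the smaller value

print(f"R = {r:.4f}, R_upper = {r_upper:.4f}, R_lower = {r_lower:.4f}")
print("upper limit is below lower limit:", r_upper < r_lower)
```

Under this interpretation the so-called upper limit comes out smaller than the lower limit, which is exactly the swap I am confused about.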
The equations can be found on page 5 of the paper.