What is the benefit of using Euler's number e to calculate Exponential Moving Average (EMA)?


I have found three use cases in computing where people use EMA, or exponential smoothing, to analyze time-series data. Use cases 1 and 2 use a very simple $\alpha$, while the last one uses a more complicated expression in its place.

When the sequence of observations begins at time $t=0$, the simplest form of exponential smoothing is given by the formulas below, where the raw data sequence is represented by $x_t$ and the output of the exponential smoothing algorithm is written as $s_{t}$:

$$ s_{0}=x_{0}\\ s_{t}=\alpha x_{t}+(1-\alpha )s_{t-1} $$

where $\alpha$ is the smoothing factor, and $0<\alpha <1$
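To make the recurrence concrete, here is a minimal C sketch of it. The function name ema and its array-based interface are my own choices for illustration, not taken from any of the sources below.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smooth x[0..n-1] into s[0..n-1].
 * Implements s_0 = x_0 and s_t = alpha * x_t + (1 - alpha) * s_{t-1}. */
void ema(const double *x, double *s, size_t n, double alpha)
{
    assert(alpha > 0.0 && alpha < 1.0);  /* smoothing factor must lie in (0, 1) */
    if (n == 0)
        return;
    s[0] = x[0];                         /* seed with the first observation */
    for (size_t t = 1; t < n; t++)
        s[t] = alpha * x[t] + (1.0 - alpha) * s[t - 1];
}
```

A larger $\alpha$ makes $s_t$ follow new observations more quickly; a smaller one makes it smoother.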

Use case 1: RFC 2988, Computing TCP's Retransmission Timer

... a TCP sender maintains two state variables, SRTT (smoothed round-trip time) and RTTVAR (round-trip time variation).

... When the first RTT measurement R is made, the host MUST set

SRTT <- R 

When a subsequent RTT measurement R’ is made, a host MUST set

RTTVAR <- (1 - beta) * RTTVAR + beta * |SRTT - R’| 
SRTT <- (1 - alpha) * SRTT + alpha * R’

The above SHOULD be computed using alpha=1/8 and beta=1/4...
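As I read the quoted rules, they could be sketched in C roughly as follows. The struct and function names are hypothetical. The initial RTTVAR <- R/2 and the requirement to update RTTVAR before SRTT come from the RFC text that is elided in the quote above.

```c
#include <assert.h>

/* Hypothetical state holder; the fields mirror the RFC's variable names. */
struct rtt_estimator {
    double srtt;        /* smoothed round-trip time */
    double rttvar;      /* round-trip time variation */
    int    has_sample;  /* 0 until the first measurement arrives */
};

/* Feed one RTT measurement r (any consistent time unit) into the estimator. */
void rtt_update(struct rtt_estimator *e, double r)
{
    const double alpha = 1.0 / 8.0;  /* recommended by the RFC */
    const double beta  = 1.0 / 4.0;  /* recommended by the RFC */
    double err;

    assert(r >= 0.0);
    if (!e->has_sample) {
        /* First measurement: SRTT <- R, RTTVAR <- R/2 (from the RFC). */
        e->srtt = r;
        e->rttvar = r / 2.0;
        e->has_sample = 1;
        return;
    }
    /* RTTVAR is updated first, using the SRTT from before this sample. */
    err = e->srtt > r ? e->srtt - r : r - e->srtt;
    e->rttvar = (1.0 - beta) * e->rttvar + beta * err;
    e->srtt   = (1.0 - alpha) * e->srtt + alpha * r;
}
```

Both updates are plain exponential smoothing with a fixed factor; the spacing between RTT samples does not enter the formula.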

Use case 2: Java Virtual Machine source code

#define DEFAULT_ALPHA_VALUE 0.7

// ...

void AbsSeq::add(double val) {
  if (_num == 0) {
    // ...
  } else {
    // otherwise, calculate both
    _davg = (1.0 - _alpha) * val + _alpha * _davg;
    // ...
  }
}

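The default value of _alpha is 0.7 in this case. Note that here _alpha multiplies the previous average _davg, so it plays the role of $1-\alpha$ in the formula above, and the new sample gets weight 0.3.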

Use case 3: How Linux calculates the system load average

static inline unsigned long
calc_load(unsigned long load, unsigned long exp, unsigned long active)
{
    unsigned long newload;

    newload = load * exp + active * (FIXED_1 - exp);
    if (active >= load)
        newload += FIXED_1-1;

    return newload / FIXED_1;
}

The smoothing factor in this case, passed as the second parameter exp, is more complicated. It multiplies the previous value load, so strictly it plays the role of $1-\alpha$. It is defined as

$$ exp=e^{-\frac{\Delta{t}}{T}} $$

where $e$ is Euler's number, $\Delta{t}$ is the sampling interval (e.g. 5 seconds), and $T$ is the time constant of the exponential moving average of the system load (e.g. 1, 5 or 15 minutes).
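To see what the time constant $T$ means in practice, here is a small sketch that reuses the arithmetic of calc_load() quoted above. It assumes the kernel's usual fixed-point constants, which are not quoted in this post: FSHIFT = 11 (so FIXED_1 = 2048) and EXP_1 = 1884, which is approximately $2048 \cdot e^{-5/60}$. The corresponding 5-minute and 15-minute constants are 2014 and 2037. The helper decay_load is hypothetical.

```c
#include <assert.h>

#define FSHIFT   11               /* bits of fractional precision (assumed kernel value) */
#define FIXED_1  (1UL << FSHIFT)  /* 1.0 in fixed point, i.e. 2048 */
#define EXP_1    1884UL           /* about FIXED_1 * e^(-5/60): 5 s samples, T = 1 min */

/* Same arithmetic as the calc_load() quoted above. */
static unsigned long
calc_load(unsigned long load, unsigned long exp, unsigned long active)
{
    unsigned long newload;

    newload = load * exp + active * (FIXED_1 - exp);
    if (active >= load)
        newload += FIXED_1 - 1;   /* round up while the load is rising */

    return newload / FIXED_1;
}

/* Hypothetical helper: apply `steps` 5-second samples with a constant
 * fixed-point `active` value, using the 1-minute factor. */
unsigned long decay_load(unsigned long load, unsigned long active, unsigned steps)
{
    /* Keep the intermediate products well below 2^32. */
    assert(load <= FIXED_1 * 512UL && active <= FIXED_1 * 512UL);
    while (steps-- > 0)
        load = calc_load(load, EXP_1, active);
    return load;
}
```

Starting from a load of 1.0 (FIXED_1) on an otherwise idle system, decay_load(FIXED_1, 0, 12) covers 12 samples of 5 seconds, i.e. exactly $T$ = 1 minute. The result comes out close to $2048/e \approx 753$, slightly lower because of the integer truncation.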

I wonder why Linux prefers such a complex expression involving Euler's number $e$ over a simple decimal constant $\alpha$ ($0<\alpha<1$).