Mathematical measure of whether something's on schedule or not?


Say I decide to jog 1 km every day. I start jogging and keep a log of the days on which I jog. After a couple of months, I notice in my logbook that although I jogged 1 km on most days, there were days when I didn't jog at all, and on some days I jogged 2 or 3 km. Now I want to quantify how closely I stuck to my intended schedule (jogging 1 km per day). Which mathematical measure would be applicable here?

I think calculating standard deviation is the key here, but I'm not sure how to calculate it in this context. Any answer with example data and calculations will be appreciated!


A standard deviation refers to a probability distribution, but of course we don't actually know the probability distribution of the number of $\text{km}$ you'll jog on a given day. We can naively estimate this distribution as follows.

Let $N$ be your sample size, meaning your log has $N$ entries (or days). Here, the $i$-th entry is the number $x_i$ of $\text{km}$ you ran on day $i$, including days when you ran no $\text{km}$ at all (in this case, $x_i=0$). To each value $v$ in your log we'll associate the probability

$$p_v=\frac1N\cdot\#\{\text{entries with value $v$ in your log}\}.$$

The expected value of the number of $\text{km}$ run on a given day will then be

$$\mu=\sum_v\,v\cdot p_v$$

and the standard deviation will be

$$\sigma=\sqrt{\sum_v^{}\,p_v\cdot {(v-\mu)}^2}$$


Example: Suppose your $2$-month log has $61$ days, with the following distribution:

  • $9$ days with $0$ $\text{km}$ jogs
  • $38$ days with $1$ $\text{km}$ jogs
  • $9$ days with $2$ $\text{km}$ jogs
  • $5$ days with $3$ $\text{km}$ jogs

Then our set of values $v$ is $\{0,1,2,3\}$ and we have

$$ \begin{array}{cccc} p_0=\frac{9}{61} & p_1=\frac{38}{61} & p_2=\frac{9}{61} & p_3=\frac{5}{61} \end{array} $$

Our mean jog has a length of

$$\mu=0\cdot\frac{9}{61}+1\cdot\frac{38}{61}+2\cdot\frac{9}{61}+3\cdot\frac{5}{61}=\frac{71}{61}\simeq 1.164$$

kilometres, and our standard deviation will hence be

\begin{align} \sigma &=\sqrt{ \frac{9}{61}\cdot{\left(0-\frac{71}{61}\right)}^2+ \frac{38}{61}\cdot{\left(1-\frac{71}{61}\right)}^2+ \frac{9}{61}\cdot{\left(2-\frac{71}{61}\right)}^2+ \frac{5}{61}\cdot{\left(3-\frac{71}{61}\right)}^2}\\ &=\sqrt{ \frac{9\cdot 71^2}{61^3}+ \frac{38\cdot 10^2}{61^3}+ \frac{9\cdot 51^2}{61^3}+ \frac{5\cdot 112^2}{61^3}}\\ &=\sqrt{ \frac{45369+3800+23409+62720}{61^3} }\\ &=\sqrt{\frac{135298}{61^3}}\simeq 0.772 \end{align}

kilometres.
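
The same numbers can be checked with a few lines of Python. This is only a minimal sketch: the function name `empirical_mean_std` and the variable `example_log` are made up for illustration, and the log is rebuilt from the day counts listed above.

```python
from collections import Counter
from math import sqrt


def empirical_mean_std(log):
    """Return (mu, sigma) for the empirical distribution of daily distances.

    `log` is a list with one entry per day (0 for a day without a jog).
    The name and signature are illustrative assumptions, not part of the answer.
    """
    n = len(log)
    # p_v = (number of entries with value v) / N
    probs = {v: count / n for v, count in Counter(log).items()}
    # mu = sum over v of v * p_v
    mu = sum(v * p for v, p in probs.items())
    # sigma = sqrt(sum over v of p_v * (v - mu)^2)
    sigma = sqrt(sum(p * (v - mu) ** 2 for v, p in probs.items()))
    return mu, sigma


# Rebuild the 61-day example: 9 days of 0 km, 38 of 1 km, 9 of 2 km, 5 of 3 km.
example_log = [0] * 9 + [1] * 38 + [2] * 9 + [3] * 5

mu, sigma = empirical_mean_std(example_log)
print(f"mu = {mu:.3f} km, sigma = {sigma:.3f} km")
```

The order of the entries does not matter here, since both formulas depend only on how often each value occurs.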


The standard deviation is a measure of how closely you follow your schedule in the following sense: the closer your values $v$ are to the mean $\mu$, the smaller the standard deviation $\sigma$. For instance, if you had run $1$ $\text{km}$ every day, then $\sigma=0$.

The thing is, the standard deviation in principle does not really care that your schedule is $1$ $\text{km}$ per day... If you had run $0$ $\text{km}$ every day, or $3$ $\text{km}$ every day, then you'd also have $\sigma = 0$.

With this in mind, what you can do is take a page from least squares and calculate how far from your scheduled $1$ $\text{km}$ your jogs are, on average. In symbols, we'll be looking at

$$\epsilon=\sqrt{\frac{1}{N}\sum_{i=1}^N\,{\left(x_i-1\right)}^2}.$$

Grouping the $x_i$ by their values $v$, it turns out that

$$\epsilon=\sqrt{\sum_v^{}\,p_v\cdot {(v-1)}^2}.$$
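
For completeness, the grouping step can be spelled out. Writing $n_v=N\cdot p_v$ for the number of entries with value $v$ (a symbol introduced here only for this remark), every entry with $x_i=v$ contributes the same term ${(v-1)}^2$, so

$$\frac{1}{N}\sum_{i=1}^N\,{\left(x_i-1\right)}^2=\frac{1}{N}\sum_v\,n_v\cdot{(v-1)}^2=\sum_v\,p_v\cdot{(v-1)}^2.$$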

Looks familiar, huh? It's like the calculation for $\sigma$, except here we're forcing $\mu=1$ (our scheduled value). Using the numbers for our previous example, we'd get

\begin{align} \epsilon &=\sqrt{ \frac{9}{61}\cdot{\left(0-1\right)}^2+ \frac{38}{61}\cdot{\left(1-1\right)}^2+ \frac{9}{61}\cdot{\left(2-1\right)}^2+ \frac{5}{61}\cdot{\left(3-1\right)}^2}\\ &=\sqrt{ \frac{9\cdot 1}{61}+ \frac{38\cdot 0}{61}+ \frac{9\cdot 1}{61}+ \frac{5\cdot 4}{61}}\\ &=\sqrt{ \frac{38}{61}} \simeq 0.789 \end{align}
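
As a quick numerical check, here is a minimal Python sketch of this calculation. The function name `schedule_rms_error` and its `target` parameter are hypothetical names chosen for illustration, and the log is again rebuilt from the day counts of the example. The quantity computed is a root-mean-square deviation from the scheduled value.

```python
from math import sqrt


def schedule_rms_error(log, target=1.0):
    """Root-mean-square deviation of the daily distances from `target`.

    `log` is a list with one entry per day; `target` is the scheduled
    distance in km. Both names are illustrative assumptions.
    """
    n = len(log)
    # epsilon = sqrt( (1/N) * sum over days of (x_i - target)^2 )
    return sqrt(sum((x - target) ** 2 for x in log) / n)


# Same 61-day example: 9 days of 0 km, 38 of 1 km, 9 of 2 km, 5 of 3 km.
example_log = [0] * 9 + [1] * 38 + [2] * 9 + [3] * 5

epsilon = schedule_rms_error(example_log)
print(f"epsilon = {epsilon:.3f} km")
```

Passing a different `target` gives the analogous measure for any other daily goal.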

Now, $\epsilon$ measures how closely you follow your schedule in the following sense: the closer your values $v$ are to your intended schedule value ($1$ $\text{km}$ in this case), the smaller $\epsilon$ is. If you had run $1$ $\text{km}$ every day, you'd still get $\epsilon=0$.

Moreover, we don't run into the same problems as we did with $\sigma$ before. If you had run $n$ $\text{km}$ every day, then you'd have $\epsilon=|n-1|$ -- you can't get better than this for these cases!
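
As a side remark, the two measures are related. Expanding ${(v-1)}^2={\left((v-\mu)+(\mu-1)\right)}^2$ and using $\sum_v p_v\cdot(v-\mu)=0$ to drop the cross term, one gets

$$\epsilon^2=\sum_v\,p_v\cdot{(v-1)}^2=\sum_v\,p_v\cdot{(v-\mu)}^2+{(\mu-1)}^2=\sigma^2+{(\mu-1)}^2.$$

In the example, $\sigma^2=\frac{2218}{61^2}$ and ${(\mu-1)}^2=\frac{100}{61^2}$, which add up to $\frac{2318}{61^2}=\frac{38}{61}$, as before. So $\epsilon$ penalises both the day-to-day spread and any drift of your average away from the scheduled $1$ $\text{km}$. For a constant log of $n$ $\text{km}$ per day we have $\sigma=0$ and $\mu=n$, which recovers $\epsilon=|n-1|$.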