I am creating an algorithm that estimates the State-of-Charge (SoC) of a battery. The table below compares my algorithm's estimates with the true SoC.
| Predicted SoC | Real SoC | Error |
|---|---|---|
| 98.30% | 98.19% | -0.10% |
| 96.43% | 96.28% | -0.14% |
| 94.53% | 95.70% | 1.16% |
| 91.89% | 93.32% | 1.42% |
| 90.57% | 93.13% | 2.56% |
| 89.62% | 93.85% | 4.22% |
| 88.76% | 92.05% | 3.29% |
| 87.56% | 91.20% | 3.63% |
| 86.01% | 89.22% | 3.20% |
| 84.63% | 89.56% | 4.92% |
| 83.20% | 89.97% | 6.76% |
| 81.39% | 86.95% | 5.56% |
| 76.12% | 80.94% | 4.82% |
| 74.65% | 80.97% | 6.32% |
How can I calculate the mean error of my algorithm, given that the samples are not equally spaced?
There are different error metrics you can use, and none of them depend on the spacing between samples: they simply average over the $n$ observations, so unequal spacing is not a problem. One option is the mean squared error: $$MSE=\frac{1}{n}\sum_{i=1}^n (y_i-\hat y_i)^2$$
Another is the mean absolute error, given by
$$MAE=\frac{1}{n}\sum_{i=1}^n|y_i-\hat y_i|$$
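As a minimal sketch, here is how you could compute these metrics (plus the plain signed mean error) directly from the values in your table, taking $y$ as the real SoC and $\hat y$ as the prediction:

```python
# Error metrics for the SoC estimates in the table above.
# Values are in percentage points; y = real SoC, y_hat = predicted SoC.
y_hat = [98.30, 96.43, 94.53, 91.89, 90.57, 89.62, 88.76,
         87.56, 86.01, 84.63, 83.20, 81.39, 76.12, 74.65]
y     = [98.19, 96.28, 95.70, 93.32, 93.13, 93.85, 92.05,
         91.20, 89.22, 89.56, 89.97, 86.95, 80.94, 80.97]

n = len(y)
residuals = [yi - yhi for yi, yhi in zip(y, y_hat)]

me  = sum(residuals) / n                  # mean (signed) error: measures bias
mae = sum(abs(r) for r in residuals) / n  # mean absolute error
mse = sum(r * r for r in residuals) / n   # mean squared error

print(f"ME  = {me:.2f} pp")
print(f"MAE = {mae:.2f} pp")
print(f"MSE = {mse:.2f} pp^2")
```

Because almost all your residuals have the same sign, ME and MAE come out nearly equal here; they would diverge if the errors were scattered around zero.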
Clearly, there is a pattern in your model's errors: it overpredicts at high SoC and underpredicts at low SoC. The residuals are therefore not symmetric around zero, so the usual normality assumption is not met; your fitted curve is systematically biased rather than merely noisy, and a signed mean error will largely reflect that bias.