I have a forecasting model $f$ that gives me an estimate $\hat{y}$. I also have the actual values $y$, so I can compare how my model would have done out of sample.
The problem is that my model gives me a point estimate -- one scalar value $\hat{y}$. However, I think of my prediction as a normally distributed random variable: sometimes the variance is small and the model really does give a value close to the true one $y$; sometimes the uncertainty increases and so does the variance.
The closest I can intuitively get to this is to take the expectation $\mu$ to be $\hat{y}$ and $$\hat{\sigma} = \langle|\mathbf{\hat{y}} - \mathbf{y} | \rangle,$$ i.e. the average absolute error, but that does not seem right.
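As a concrete illustration of that intuition, here is a minimal sketch computing the mean absolute error as the spread estimate. The arrays are made-up example data, not from any real model:

```python
import numpy as np

# Hypothetical out-of-sample forecasts and actuals (illustrative only).
y_hat = np.array([2.1, 3.4, 1.8, 4.0, 2.9])
y = np.array([2.0, 3.0, 2.2, 4.3, 2.5])

# sigma_hat = <|y_hat - y|>, the average absolute error over the sample.
sigma_hat = np.mean(np.abs(y_hat - y))
```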
One approach I quite liked for uncertainty quantification was: $$\hat{\sigma} = \sqrt{\bigg\langle (\mathbf{\hat{y}} - \langle \mathbf{y} \rangle)^{2} \bigg\rangle + \langle|\mathbf{\hat{y}} - \mathbf{y} | \rangle^{2}}$$ but I am uncertain about its mathematical basis.
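For reference, the quantity above can be computed term by term as follows; again the data are hypothetical, and the variable names (`spread_term`, `bias_term`) are just my labels for the two angle-bracket averages:

```python
import numpy as np

# Hypothetical out-of-sample forecasts and actuals (illustrative only).
y_hat = np.array([2.1, 3.4, 1.8, 4.0, 2.9])
y = np.array([2.0, 3.0, 2.2, 4.3, 2.5])

spread_term = np.mean((y_hat - np.mean(y)) ** 2)  # <(y_hat - <y>)^2>
bias_term = np.mean(np.abs(y_hat - y)) ** 2       # <|y_hat - y|>^2
sigma_hat = np.sqrt(spread_term + bias_term)
```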
A consistent estimator of the variance of a random variable $Y$, regardless of its distribution (as long as $E|Y|^2 < \infty$, $EY = \mu$, and $\operatorname{var}(Y) = \sigma^2$), is $$ \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n (y_i - \bar{y}_n)^2. $$ Hence, if your variable of interest is $\hat{Y}$, then $$ \hat{\sigma}_{\hat{Y}}^2 = \frac{1}{n}\sum_{i=1}^n (\hat{y}_i - \bar{\hat{y}}_n)^2, $$ where in linear regression with an intercept term you have $\bar{\hat{y}}_n = \bar{y}_n$.
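This estimator is a one-liner with NumPy; the divisor-$n$ (plug-in) form corresponds to `ddof=0` in `np.var`. The fitted values below are made up for illustration:

```python
import numpy as np

# Hypothetical fitted values y_hat_i (illustrative only).
y_hat = np.array([2.1, 3.4, 1.8, 4.0, 2.9])

# (1/n) * sum_i (y_hat_i - mean(y_hat))^2  -- the plug-in variance estimator.
sigma2_hat = np.mean((y_hat - np.mean(y_hat)) ** 2)

# NumPy's np.var with ddof=0 uses the same divisor n.
assert np.isclose(sigma2_hat, np.var(y_hat, ddof=0))
```

For an unbiased (divisor $n-1$) version, use `np.var(y_hat, ddof=1)`; both are consistent.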