I have a dataset of 100 observations and 5 numerical variables ($x_1, x_2, \ldots, x_5$), which I used to fit a linear model to predict some value $y$. I plugged some values into the model and got a prediction $\hat{y}$.
I would like to calculate a 95% confidence interval for this value.
Using Python I get the following model summary:
```
Adj. R-squared:     0.892
F-statistic:        27.63
Prob (F-statistic): 2.45e-18

         coef    std err        t    P>|t|    [0.025    0.975]
const  -12.71      3.713   -3.424    0.001   -20.156    -5.268
x1       0.29      0.255    1.134    0.262    -0.222     0.800
x2       0.17      0.031    5.283    0.000     0.103     0.228
x3       0.44      0.089    4.970    0.000     0.262     0.617
x4      -0.47      0.194   -2.414    0.019    -0.856    -0.079
x5       0.36      0.226    1.587    0.118    -0.094     0.811
```
I am aware that in order to calculate a 95% confidence interval for a simple linear regression the formula is $\hat{\beta} \pm t_{0.975,\,f} \cdot \hat{\sigma}_{\hat{\beta}}$, where $f$ is the degrees of freedom and $\hat{\sigma}_{\hat{\beta}}$ is the standard error. But how do I calculate this for multiple linear regression?
You cannot compute it from the output you pasted, because that only provides the standard errors of the estimated coefficients while ignoring the covariances between them. In other words, the variance of the predicted value cannot simply be obtained by combining the variances of the estimated coefficients, because these estimates are not independent.
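To see this concretely, here is a small sketch with synthetic data (all variable names are my own) showing that the variance of a prediction depends on the full covariance matrix of the coefficient estimates, not only on its diagonal:

```python
import numpy as np

rng = np.random.default_rng(0)
T, K = 100, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, K - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=T)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # OLS coefficients
resid = y - X @ beta_hat
s2 = resid @ resid / (T - K)                   # residual variance

# Full covariance matrix of beta_hat; the summary table only shows
# the square roots of its diagonal (the "std err" column).
cov_beta = s2 * np.linalg.inv(X.T @ X)

x_new = np.array([1.0, 0.5, -1.0])             # regressors for a new point
var_full = x_new @ cov_beta @ x_new            # correct: uses off-diagonal terms
var_diag_only = (x_new**2) @ np.diag(cov_beta) # wrong: ignores the covariances
```

In general `var_full` and `var_diag_only` disagree, which is why the standard errors alone are not enough.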
In a Bayesian setting, what you are looking for is called posterior predictive distribution.
Assuming an improper prior $p(\beta, \sigma^2) \propto \frac1{\sigma^2}$ on the parameters, and that your new test points are conditionally independent of your training data given $\beta$ and $\sigma^2$, you can find the distribution of new test points as described in [1]. In a nutshell, if you try to predict more than one new test point, the vector of these will follow a multivariate t distribution (since they all depend on the same estimated regression coefficients, the predicted test points will not be independent). If you are only interested in a single test point $\tilde y$, then the maths simplifies and the following holds:
$$\frac{\tilde y - \tilde X \hat\beta}{\sqrt{h}} \sim t_{(T-K)}$$
where $T$ is the number of training examples, $K$ is the number of coefficients, $\tilde X$ is the row vector of regressors for the new point, and $\hat\beta$ and $h$ are as follows: \begin{align} \hat\beta &= (X' X)^{-1} X'y \\ s^2 &= \frac{1}{T-K} \, (y-X\hat\beta)'(y-X\hat\beta)\\ h &= s^2 \, \left[1 + \tilde X(X' X)^{-1}\tilde X'\right], \end{align} where $y$ is the $T\times 1$ vector of observations and $X$ is the $T \times K$ design matrix corresponding to your training data. Note that $h$ is the predictive variance of $\tilde y$, so a 95% interval is $\tilde X \hat\beta \pm t_{0.975,\,T-K}\sqrt{h}$.
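These formulas translate directly into code. Here is a sketch with synthetic data (variable names are my own), matching your setting of 100 observations and an intercept plus 5 regressors:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
T, K = 100, 6                              # 100 observations, intercept + 5 regressors
X = np.column_stack([np.ones(T), rng.normal(size=(T, K - 1))])
y = X @ rng.normal(size=K) + rng.normal(size=T)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # OLS coefficients (X'X)^{-1} X'y
resid = y - X @ beta_hat
s2 = resid @ resid / (T - K)                   # s^2 with T - K degrees of freedom

x_new = np.r_[1.0, rng.normal(size=K - 1)]     # row vector of regressors, new point
y_pred = x_new @ beta_hat

# Predictive variance h = s^2 * [1 + x_new (X'X)^{-1} x_new']
h = s2 * (1.0 + x_new @ np.linalg.solve(X.T @ X, x_new))

t_crit = stats.t.ppf(0.975, df=T - K)
interval = (y_pred - t_crit * np.sqrt(h), y_pred + t_crit * np.sqrt(h))
print(interval)
```

If you fitted the model with statsmodels, `results.get_prediction(x_new)` should give you the same interval without the manual algebra. Dropping the `1 +` term in `h` gives instead an interval for the mean response rather than for a new observation.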
[1] Zellner, A.; Chetty, V. K., Prediction and decision problems in regression models from the Bayesian point of view, J. Am. Stat. Assoc. 60, 608-616 (1965). JSTOR.