The formula for the mean prediction of a Gaussian process is $K(X_*,X)K(X,X)^{-1}y$, where $K$ is the covariance function. See e.g. equation 2.23 (in chapter 2) of *Gaussian Processes for Machine Learning* (2006) by C. E. Rasmussen & C. K. I. Williams (there the noise term $\sigma_n^2 I$ is added to $K(X,X)$; I drop it here for simplicity).
Oversimplifying, the mean prediction at the new point, $y_*$, is a weighted average of the previously observed $y$ values, where the weights are given by $K(X_*,X)$ and normalized by $K(X,X)^{-1}$.
Now, the first part, $K(X_*,X)$, is easy to interpret: the closer the new data point lies to the previously observed data points, the greater their similarity, and hence the higher the weight and the impact on the prediction.
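To make the formula concrete, here is a minimal numpy sketch of the noise-free mean prediction with a squared-exponential kernel (the toy data, lengthscale, and function names are my own assumptions, not from the book):

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    # Squared-exponential covariance: k(a, b) = exp(-(a - b)^2 / (2 l^2))
    d2 = (A[:, None] - B[None, :]) ** 2
    return np.exp(-d2 / (2 * lengthscale ** 2))

# Toy 1-D training data (illustrative only)
X = np.array([-1.0, 0.0, 1.0])
y = np.sin(X)

# A new test input
X_star = np.array([0.5])

# Mean prediction: K(X_*, X) K(X, X)^{-1} y
K_star = rbf_kernel(X_star, X)          # similarity of x_* to each training point
K = rbf_kernel(X, X)                    # similarities among training points
weights = K_star @ np.linalg.inv(K)     # one weight per training observation
mean = weights @ y                      # weighted average of observed y
```

Note that in the noise-free case, evaluating at a training input reproduces the observed $y$ exactly, since $K(X,X)K(X,X)^{-1} = I$; the GP interpolates the data.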
But how to interpret the second part $K(X,X)^{-1}$?