Most resources on the binomial distribution assume a constant $p$, so I'm asking about a somewhat more special case.
For illustration, say we are interested in the probability that different players score in basketball when they throw from a distance. We assume independence. Say we have models $M_k$ that estimate $p_i(d)$ for player $i$ at distance $d$. If player $n$ throws from the 3 distances $d = [5, 3, 20]$, $M_1$ estimates $\hat{p} = [0.6, 0.7, 0.2]$ and $M_2$ estimates $\hat{p} = [0.5, 0.5, 0.2]$.
I would like to know whether a model is good. Given outcomes such as $y = [1, 1, 0]$, we can calculate the RMSE or the log-likelihood. These, I believe, are good metrics for comparing models, and by them $M_2$ performs worse.
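To make the comparison concrete, here is a minimal sketch (using only the toy numbers above) of how the log-likelihood and RMSE come out for the two models:

```python
import math

y = [1, 1, 0]            # observed outcomes for the 3 throws
p_m1 = [0.6, 0.7, 0.2]   # M_1's predicted probabilities
p_m2 = [0.5, 0.5, 0.2]   # M_2's predicted probabilities

def log_likelihood(p, y):
    """Sum of log P(y_i) under independent Bernoulli trials."""
    return sum(math.log(pi if yi == 1 else 1 - pi) for pi, yi in zip(p, y))

def rmse(p, y):
    """Root mean squared error between predicted probabilities and 0/1 outcomes."""
    return math.sqrt(sum((pi - yi) ** 2 for pi, yi in zip(p, y)) / len(y))

print(log_likelihood(p_m1, y), rmse(p_m1, y))  # M_1: about -1.09 and 0.31
print(log_likelihood(p_m2, y), rmse(p_m2, y))  # M_2: about -1.61 and 0.42
```

Both metrics agree that $M_2$ is worse here: it has a lower log-likelihood and a higher RMSE.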
However, is $M_1$ itself a good estimator? We know that the RMSE of individual predictions will never reach 0: if we collect more data where the true $p$ is closer to 0.5, we will see a higher RMSE by definition. Essentially, I wonder whether there is a different criterion to use.
In the standard binomial distribution, we look at the success rate. Then we know that the difference between the average of our predictions and the observed success rate will approach $0$ for the best model.
In this case, the observed success rate over the 3 throws is $2/3 \approx 0.67$, and the average predicted rates for the two models are $0.5$ and $0.4$. However, this criterion no longer rewards the model for being expressive: a model could ignore the parameter $d$ entirely and just aim for the average.
What is a good way to combine these two perspectives?
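For reference, one standard way to combine the two perspectives is the Murphy decomposition of the Brier score into a reliability term (calibration: do binned predictions match observed rates?) and a resolution term (expressiveness: do the binned outcome rates spread away from the overall base rate?). A minimal sketch, assuming predictions are grouped into equal-width bins by predicted value:

```python
from collections import defaultdict

def brier_decomposition(p, y, n_bins=10):
    """Murphy decomposition: Brier = reliability - resolution + uncertainty.
    reliability: penalty for binned mean predictions deviating from observed rates,
    resolution:  reward for binned outcome rates spreading around the base rate,
    uncertainty: base_rate * (1 - base_rate), a property of the data alone.
    """
    bins = defaultdict(list)
    for pi, yi in zip(p, y):
        bins[min(int(pi * n_bins), n_bins - 1)].append((pi, yi))
    n = len(y)
    base_rate = sum(y) / n
    reliability = resolution = 0.0
    for members in bins.values():
        nk = len(members)
        mean_p = sum(pi for pi, _ in members) / nk       # average prediction in bin
        obs_rate = sum(yi for _, yi in members) / nk      # observed success rate in bin
        reliability += nk / n * (mean_p - obs_rate) ** 2
        resolution += nk / n * (obs_rate - base_rate) ** 2
    return reliability, resolution, base_rate * (1 - base_rate)
```

A model that ignores $d$ and always predicts the base rate scores zero reliability penalty but also zero resolution, while an expressive, well-calibrated model keeps reliability low and drives resolution up. (With only 3 throws the bin estimates are meaningless; this only becomes informative with many observations per bin.)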
In the first case you seem to be rewarding a model only for behaving in a way you expect based on your personal understanding of a universe where getting closer increases your chances of making a basket. In the second case you seem to be rewarding a model only for its predictive power when compared to real-world data.
The answer to your question is no, in the sense that there's no predetermined "good" way to combine the perspectives. But it is also yes, in the sense that which perspective (or which combination of perspectives) is "good" depends on the goals of the person creating the model and of the model's intended audience.
Either perspective might be considered good depending on the particular situation. This seems to be a question about the sociology of science: if the model is intended for a particular field or a particular peer-reviewed publication, you will likely be able to tell which approach is expected from what has been published in the past.