I just noticed that in TrueSkill they use this formulation of Bayes' theorem:
$ \mathcal p(\mathbf s|\mathbf r, A) = \frac { \mathcal P(\mathbf r|\mathbf s, A) \; \mathcal p(\mathbf s) }{ \mathcal P(\mathbf r|A) } $
What puzzles me here is the mixing of probabilities ($\mathcal P$) with probability densities ($\mathcal p$) in the formulation, as in my admittedly far-from-exhaustive reading on Bayes' theorem to date I have only seen probabilities mentioned.
So I wonder if anyone can recommend something to read on the use of probability densities, or a mix of probability densities and probabilities, in Bayesian inference.
Found the answer here:
https://en.wikipedia.org/wiki/Bayes%27_theorem#Simple_form_2
It seems that when mixing discrete and continuous random variables, the discrete ones are handled with probabilities and the continuous ones with probability densities.
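To make the mixed case concrete, here is a small numeric sketch of my own (not TrueSkill's actual implementation): a continuous skill $s$ with a Gaussian prior, a discrete match result $r$, and a probit-style likelihood for $P(r = \text{win} \mid s)$, all of which are assumptions chosen purely for illustration. The posterior over $s$ comes out as a density, even though the likelihood and evidence are plain probabilities.

```python
import math

# A minimal sketch of Bayes' theorem with a continuous prior and a discrete
# observation. The Gaussian prior and probit likelihood are assumptions made
# purely for illustration (loosely TrueSkill-flavoured, not its real model).

def gauss_pdf(x, mu=0.0, sigma=1.0):
    """Prior density p(s): Gaussian over the continuous skill s."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def win_prob(x, beta=1.0):
    """Likelihood P(r = win | s): a probability, not a density (probit link)."""
    return 0.5 * (1.0 + math.erf(x / (beta * math.sqrt(2))))

# Evaluate everything on a grid and integrate numerically.
n, lo, hi = 2001, -6.0, 6.0
h = (hi - lo) / (n - 1)
grid = [lo + i * h for i in range(n)]

joint = [win_prob(x) * gauss_pdf(x) for x in grid]   # P(r|s) p(s)
evidence = sum(joint) * h                            # P(r) = integral of P(r|s) p(s) ds
posterior = [j / evidence for j in joint]            # p(s|r): a density again

post_mean = sum(x * p for x, p in zip(grid, posterior)) * h

print(round(evidence, 3))   # with a symmetric prior, P(win) comes out to 0.5
print(round(post_mean, 3))  # posterior mean shifts above the prior mean of 0
```

Note that the evidence $P(\mathbf r|A)$ is obtained by integrating the likelihood against the prior density, rather than summing, which is exactly where the discrete and continuous pieces meet.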
And I have a feeling this works simply because the ratio of probabilities over a shrinking interval converges to the ratio of probability densities (with hints of L'Hôpital's rule in there somewhere), because we can of course rearrange that equation to:
$ \frac { \mathcal p(\mathbf s|\mathbf r, A) } { \mathcal p(\mathbf s) } = \frac { \mathcal P(\mathbf r|\mathbf s, A) }{ \mathcal P(\mathbf r|A) } $
This has an elegant symmetry to it, and note that the density $\mathcal p$ is in fact the derivative of the cumulative probability $\mathcal P$ (did I mention shades of L'Hôpital?).
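The limiting argument here can be made a little more explicit (as a sketch, not a rigorous proof). Writing the posterior density as the limit of a ratio of probabilities over a shrinking interval of width $\delta$, and applying the ordinary all-probabilities Bayes' theorem inside the limit:

$ \mathcal p(\mathbf s = s|\mathbf r, A) = \lim_{\delta \to 0} \frac{ \mathcal P(s \le \mathbf s < s + \delta \,|\, \mathbf r, A) }{ \delta } = \lim_{\delta \to 0} \frac{ \mathcal P(\mathbf r \,|\, s \le \mathbf s < s + \delta, A) }{ \mathcal P(\mathbf r|A) } \cdot \frac{ \mathcal P(s \le \mathbf s < s + \delta) }{ \delta } $

As $\delta \to 0$, the last factor converges to the prior density $\mathcal p(\mathbf s = s)$ and the conditional probability converges to $\mathcal P(\mathbf r|\mathbf s = s, A)$, recovering the mixed form from the top of the post with no L'Hôpital actually required.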