Suppose I'm trying to predict a number $v_p \in \mathbb{R}$ and, thanks to sampling, I know that the prediction $v_p=a$ is correct with probability $P(v_p)=P(a)$. In other words, a fraction $P(a)$ of the time, the true value of the current state $s_p$ is actually $a$. Now, for $s_p$, the value $v_g$ can be any element of $\Omega = \{ a, b, c, d, e \}$ with $a, b, c, d, e \in \mathbb{R}$.
I want to define a distance measure $d$ between $v_p=a$ and any other value $v_g$ that takes into account the probability $P(a)$ that $a$ is the true value (e.g. $d(e) = f(a,e,P(a))$). Note that the values are real numbers, so we can compare them using the usual (Euclidean) distance $|v_p - v_g|$.
What I'm trying to do is:
- have a distance of $0$ when $P(v_p)=r:=\frac{1}{\operatorname{Card}(\Omega)}$. If the probability of truthfulness is close to random, then I want to ignore the measure, because it makes no sense to compare other values to $v_p=a$ if we don't know any better than random whether $a$ is indeed the true value.
- decrease the distance if $v_g$ is close to $v_p$ and if $v_p$ has a good probability of being the true value (high $P(v_p)$).
- (not sure that's a good idea) increase the distance if $P(v_p)$ is low but not close to chance $r$ (because close to chance it must be $0$).
- have smooth functions
What I've already tried is the classic:
$$ d(v_p, v_g) = \frac{|v_p-v_g|}{\max(\Omega)-\min(\Omega)} \times \frac{P(v_p)-r}{1-r} $$
The similarity is then $1-d(v_p, v_g)$.
With this definition:
- for $P(v_p)$ near $r$, the result is close to $0$
- for $P(v_p)$ near $1$, the result is close to the normalized distance between the two numbers
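To make the formula above concrete, here is a minimal Python sketch of it. The function name `scaled_distance` and the numeric values chosen for $\Omega$ are hypothetical and only serve as an illustration; the sketch assumes $\Omega$ is a finite list of reals.

```python
def scaled_distance(v_p, v_g, p, omega):
    """Normalized |v_p - v_g|, weighted by how far p lies above chance level r.

    v_p, v_g : predicted value and comparison value (both taken from omega)
    p        : P(v_p), the probability that v_p is the true value
    omega    : the finite set of possible values
    """
    r = 1.0 / len(omega)              # chance level, r = 1 / Card(omega)
    span = max(omega) - min(omega)    # range used for normalization
    if span == 0:
        return 0.0                    # all values coincide, nothing to compare
    normalized = abs(v_p - v_g) / span
    weight = (p - r) / (1.0 - r)      # 0 at p = r, 1 at p = 1
    return normalized * weight


# Hypothetical values standing in for a, b, c, d, e.
omega = [1.0, 2.0, 3.0, 4.0, 5.0]
a, e = omega[0], omega[-1]
for p in (0.2, 0.5, 0.8, 1.0):
    print(f"P(a) = {p:.1f} -> d(a, e) = {scaled_distance(a, e, p, omega):.3f}")
```

One thing this makes visible: for $P(v_p) < r$ the weight is negative, so the result is no longer a distance in the usual sense.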
I also tried to obtain a smoother curve by replacing the rate $\frac{P(v_p)-r}{1-r}$ with an adjusted rate $\exp(k \times (P(v_p)-r))$, with $k$ being an adjustable parameter, or even with a sigmoid whose inflection point is at $r$: $\frac{1}{1 + \exp(-k \times (P(v_p) - r))}$.
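As a quick sanity check, the three candidate rates can be evaluated side by side. This is only a sketch: the function names and the value $k = 10$ are hypothetical, and $r = \frac{1}{5}$ follows from the five-element $\Omega$.

```python
import math


def linear_weight(p, r):
    """Original rate: (p - r) / (1 - r)."""
    return (p - r) / (1.0 - r)


def exp_weight(p, r, k):
    """Exponential rate: exp(k * (p - r))."""
    return math.exp(k * (p - r))


def sigmoid_weight(p, r, k):
    """Sigmoid rate with inflection point at r: 1 / (1 + exp(-k * (p - r)))."""
    return 1.0 / (1.0 + math.exp(-k * (p - r)))


r = 1.0 / 5   # chance level for Card(Omega) = 5
k = 10.0      # hypothetical steepness parameter
for p in (r, 0.5, 0.8, 1.0):
    print(
        f"p = {p:.1f}: linear = {linear_weight(p, r):.3f}, "
        f"exp = {exp_weight(p, r, k):.3f}, "
        f"sigmoid = {sigmoid_weight(p, r, k):.3f}"
    )
```

At $P(v_p) = r$ the linear rate is $0$, but the exponential rate equals $1$ and the sigmoid equals $0.5$, so as written neither of the two smoother variants gives a distance of $0$ at chance level without an additional shift or rescaling.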
I know the probability distribution, but I don't think this is a purely probabilistic model, since I have to take into account the actual distance between these real numbers (which are not really probabilistic events).
But somehow, this doesn't seem right. For example, for $P(a)=0.8$, I end up keeping only a fraction of the distance $|v_p-v_g|$ (namely $\frac{0.8-r}{1-r}=0.75$ with $r=\frac{1}{5}$), just because $v_p$ has an 80% chance of being the true value. Even if the formula is dimensionally consistent, I have the impression that, semantically speaking, I'm combining two different kinds of quantity that don't belong together.
How can I model my distance measure? Are there any missing values or probabilities that could help me?