I am working on a machine translation task and there are two kinds of probabilities:
- $p_1 = Pr(\epsilon | \phi)$
- $p_2 = Pr(\phi | \epsilon)$
where $\epsilon$ and $\phi$ are phrases in the source and target languages.
Sometimes these probabilities differ hugely. One possible reason is that the system made a mistake and produced a garbage translation, which is better discarded.
I would like to construct a metric that equals $0$ if $p_1 = p_2$ and approaches $1$ as $p_1$ and $p_2$ grow further apart. It would also be nice to have a hyperparameter that tunes how fast it approaches $1$.
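To make the requirement concrete, here is a minimal sketch of one function with the shape I have in mind. It assumes that "different" is measured by the ratio of the two probabilities rather than their absolute difference, which may or may not be the right choice. The name `divergence` and the hyperparameter `k` are placeholders of mine, not something from an existing library.

```python
def divergence(p1: float, p2: float, k: float = 1.0) -> float:
    """Hypothetical score in [0, 1]: 0 when p1 == p2, near 1 when they differ a lot.

    Computes 1 - (min(p1, p2) / max(p1, p2)) ** k, which is the same as
    1 - exp(-k * |log p1 - log p2|) for positive inputs.
    Larger k (assumed hyperparameter) makes the score approach 1 faster.
    """
    if not (0.0 <= p1 <= 1.0 and 0.0 <= p2 <= 1.0):
        raise ValueError("p1 and p2 must be probabilities in [0, 1]")
    if k <= 0:
        raise ValueError("k must be positive")

    larger = max(p1, p2)
    if larger == 0.0:
        # Both probabilities are zero, so they are equal.
        return 0.0
    return 1.0 - (min(p1, p2) / larger) ** k
```

With `k=1`, `divergence(0.5, 0.25)` gives `0.5`; with `k=2` the same pair gives `0.75`, so `k` controls how quickly the score saturates.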
Please help.